Helpful References
Manning, C.D. & Schütze, H. (1999). Foundations of Statistical Natural
Language Processing.
Jurafsky, D. & Martin, J.H. (2008). Speech and Language Processing: An
Introduction to Natural Language Processing, Computational Linguistics
and Speech Recognition. 2nd Edition.
Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory.
Manning, C.D., Raghavan, P. & Schütze, H. (2008). Introduction to
Information Retrieval. Cambridge University Press.
2. Bags of Words
Goldsmith, J.H. (2007). Probability for linguists.
Baroni, M. (2006). Distributions in texts.
Gilquin, G., & Gries, S. T. (2009). Corpora and experimental methods: A
state-of-the-art review. Corpus Linguistics and Linguistic Theory, 5(1),
1–26.
3. Zipf's Law
Zipf, G.K. (1949). Human Behavior and the Principle of Least Effort: An
Introduction to Human Ecology. Addison-Wesley: Cambridge, MA.
Mandelbrot, B. (1953). An informational theory of the statistical structure
of language. Communication Theory.
Ferrer i Cancho, R., & Solé, R. V. (2003). Least effort and the origins of
scaling in human language. Proceedings of the National Academy of
Sciences, 100(3), 788–791.
Lü, L., Zhang, Z.K., & Zhou, T. (2010). Zipf's Law Leads to Heaps' Law:
Analyzing Their Relation in Finite-Size Systems. PLoS ONE 5(12):
e14139.
Barabási, A.-L. (2005). The origin of bursts and heavy tails in human
dynamics. Nature, 435, 207–211.
5. Distributional Approaches I
Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for
automatic indexing. Communications of the ACM, 18(11), 613–620.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem:
The latent semantic analysis theory of acquisition, induction, and
representation of knowledge. Psychological Review, 104, 211–240.
Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic
spaces from lexical co-occurrence. Behavior Research Methods,
Instruments, & Computers: a Journal of the Psychonomic Society,
28(2), 203–208.
6. Distributional Approaches II
Bullinaria, J. A., & Levy, J. P. (2007). Extracting semantic representations
from word co-occurrence statistics: a computational study. Behavior
Research Methods, 39(3), 510–526.
Recchia, G., & Jones, M. N. (2009). More data trumps smarter algorithms:
Comparing pointwise mutual information with latent semantic analysis.
Behavior Research Methods, 41(3), 647–656.
Andrews, M., Vigliocco, G., & Vinson, D. (2009). Integrating experiential
and distributional data to learn semantic representations.
Psychological Review, 116(3), 463–498.
Lapesa, G., Evert, S., & Schulte im Walde, S. (2014). Contrasting
Syntagmatic and Paradigmatic Relations: Insights from Distributional
Semantic Models, 1–11.
9. Topic Models
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation.
Journal of Machine Learning Research, 3(4-5), 993–1022.
Madsen, R. E., Kauchak, D., & Elkan, C. (2005). Modeling word burstiness
using the Dirichlet distribution. Proceedings of the 22nd International
Conference on Machine Learning (pp. 545–552).
Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in
semantic representation. Psychological Review, 114(2), 211–244.
10. Networks
Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random
networks. Science, 286(5439), 509–512.
Newman, M. (2005). Power laws, Pareto distributions and Zipf's law.
Contemporary Physics, 46(5), 323–351.
Steyvers, M., & Tenenbaum, J. B. (2005). The large-scale structure of
semantic networks: statistical analyses and a model of semantic
growth. Cognitive Science, 29(1), 41–78.