A comparison of collocations and word associations in Estonian from the perspective of parts of speech


  • Ene Vainik, Institute of the Estonian Language, Tallinn, Estonia
  • Maria Tuulik, Institute of the Estonian Language, Tallinn, Estonia
  • Kristina Koppel, Institute of the Estonian Language, Tallinn, Estonia




Keywords: collocations, associations, parts of speech, lexicography, Estonian language


The paper presents a comparative study of collocational and associative structures in Estonian with respect to the role of parts of speech. Lists of collocations and associations for the same set of nouns, verbs and adjectives, drawn from the respective dictionaries, are analysed to identify both where they coincide and where they differ. The results show a moderate overlap, which is largest among the adjectival associates and collocates. Overall, nouns prevail among both the associated and the collocated items. The coinciding sets of relations are tentatively explained by the influence of grammatical relations, i.e. the patterns of local grammar that bind collocations together and motivate associations. The results are discussed with respect to the possible reasons for the mismatch between associations and collocations, and in relation to the application of these findings in lexicography and second language acquisition.







Supporting Agencies
Estonian Research Council

How to Cite

Vainik, E., Tuulik, M., & Koppel, K. (2020). A comparison of collocations and word associations in Estonian from the perspective of parts of speech. Slovenščina 2.0: Empirical, Applied and Interdisciplinary Research, 8(2), 139–167. https://doi.org/10.4312/slo2.0.2020.2.139-167