In the Search of Lexicographically Relevant Collocation: The Example of Grammatical Relations Containing Adverbs
This paper presents the results of an analysis of grammatical relations that focused on identifying not only collocations relevant for lexicographic purposes, but also problematic areas that need further investigation at both the lexicographic and the grammatical level. In the initial study, collocation candidates for a wide selection of grammatical relations were automatically extracted from the Gigafida reference corpus of Slovene for a heterogeneous sample of 333 lemmas. A group of linguists then annotated the relevance of the collocation candidates, examining both the collocations and their examples of use, and their answers were analysed for agreement. The findings showed that relations such as adjective + noun, noun + noun in gerund, and some verb + preposition + noun relations exhibited high agreement and large shares of approved collocation candidates. On the other hand, grammatical relations containing adverbs proved to be among those where disagreement or uncertainty among the linguist annotators was the highest. Consequently, it was decided that these adverbial relations should be analysed first, as a sample set for testing our bottom-up approach to determining which collocation candidates are lexicographically relevant.
Further analysis has shown that the decision on the relevance of collocation candidates for dictionary purposes needs to be made separately for each relation and for the groups of adverbs within it. One semantically less relevant group proved to be adverbs functioning as intensifiers or having a semantically less relevant role of a participle. Even more problematic is the group of numeral adverbs (once, twice…), whose members have different levels of semantic relevance (e.g. četrtič doktorirati 'to receive a PhD for the fourth time' versus stokrat povedati 'to say something a hundred times') and thus cannot be delimited at the group level within a particular grammatical relation.
The data from the analyses described in this paper will enable further detailed analyses, in particular a description of each grammatical relation from the perspective of its collocationality. In addition, bad collocation candidates that result from errors in morphosyntactic annotation will help improve the sketch grammar and, in turn, the quality of the automatic extraction output. Furthermore, we intend to use the existing findings to improve the results for grammatical relations that were initially excluded from the automatic extraction procedure due to a high percentage of noise.
Copyright (c) 2019 Eva Pori, Iztok Kosem
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors confirm that they are the authors of the submitted article, which will be published (in print and online) in the journal Journal for Foreign Languages by Znanstvena založba Filozofske fakultete Univerze v Ljubljani (University of Ljubljana, Faculty of Arts, Aškerčeva 2, 1000 Ljubljana, Slovenia). The author's name will appear in the article in the journal. All decisions regarding layout and distribution of the work are in the hands of the publisher.
- Authors guarantee that the work is their own original creation and does not infringe any statutory or common-law copyright or any proprietary right of any third party. In case of claims by third parties, authors commit themselves to defending the interests of the publisher and shall cover any potential costs.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.