Corpus Pattern Analysis for Learner Lexicography: A Pilot-study of Lithuanian Verbs
Jolanta Kovalevskaitė
Vytautas Magnus University, Lithuania
Laima Jancaitė
Vytautas Magnus University, Lithuania
Published 2019-10-18


Lithuanian language
corpus linguistics
corpus pattern analysis (CPA)
learner dictionaries

How to Cite

Kovalevskaitė J. and Jancaitė L. (2019) “Corpus Pattern Analysis for Learner Lexicography: A Pilot-study of Lithuanian Verbs”, Taikomoji kalbotyra, (12), pp. 124-154. doi: 10.15388/TK.2019.17235.


This paper presents a pilot study that applies the framework of Corpus Pattern Analysis (CPA, Hanks 2004) to some Lithuanian verbs which form part of the basic vocabulary. CPA draws on the insights of corpus-driven language analysis and on the contextual and functional theory of meaning: the meaning of a word is associated with a specific lexical and grammatical environment, i.e. with corpus patterns that represent an interconnection of lexical and grammatical elements. CPA is one of several corpus-driven methods; it differs from pattern grammar (Hunston, Francis 2000) in that it not only uses typical grammatical categories (e.g. word classes) but also introduces semantic values (e.g. semantic types) to distinguish the different senses of a word. Semantic types are often the main separator of meanings, especially when two verb senses are associated with the same grammatical pattern. For learners’ dictionaries, CPA could provide more detailed usage data, which could lead to a better understanding of meaning differences, important for both language reception and language production. After introducing the CPA methodology, we present the CPA analysis of two Lithuanian verbs, namely the inductive procedure followed to observe and define meaning-related patterning. We also discuss problematic issues in applying CPA, as identified in this study and as mentioned by other CPA practitioners. First, observing and defining corpus patterns is a challenging task for lexicographers, especially because of the pattern/meaning division and the generalizations related to semantic types. The second problematic aspect is the automation of pattern recognition. The third issue relates to foreign language learners as a target group: the meaning-related patterning observed in the data has to be presented in a learner dictionary in a user-friendly way.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
