Prompting Metalinguistic Awareness in Large Language Models: ChatGPT and Bias Effects on the Grammar of Italian and Italian Varieties
Articles
Angelapia Massaro
Università di Siena, Italy
https://orcid.org/0000-0003-0708-8159
Giuseppe Samo
Beijing Language and Culture University / University of Geneva
Published 2023-12-20
https://doi.org/10.15388/Verb.42

Keywords

cartography
quantitative syntax
ChatGPT
Topics
Italian varieties

How to Cite

Massaro, A. and Samo, G. (2023) “Prompting Metalinguistic Awareness in Large Language Models: ChatGPT and Bias Effects on the Grammar of Italian and Italian Varieties”, Verbum, 14, pp. 1–11. doi:10.15388/Verb.42.

Abstract

We explore ChatGPT’s handling of left-peripheral phenomena in Italian and Italian varieties through prompt engineering to investigate 1) forms of syntactic bias in the model and 2) the model’s metalinguistic awareness in relation to reorderings of canonical clauses (e.g., Topics) and certain grammatical categories (object clitics). A further question concerns the content of the model’s sources of training data: how are minor languages included in the model’s training? The results of our investigation show that 1) the model seems to be biased against reorderings, labelling them as archaic even though they are not; 2) the model seems to have difficulties with coindexed elements such as clitics and their anaphoric status, labelling them as ‘not referring to any element in the phrase’; and 3) major languages still seem to be dominant, overshadowing the positive effects of including minor languages in the model’s training.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
