Kazakov, Dimitar; Aljohani, Thamer; Karina Anom, Andari; Anastasova, Maria; Carrasco Coquillat, David; Dekova, Rositsa; Nascimento Fernandes, Ingrid; Ferroni, Sofia; Longhin, Marco; Margova, Ruslana; Milković, Lidija; Qomariyah, Nurul; Roy, Reshmi; Sorge, Gaia; Wang, Jiabao; Yarahmadi, Hediye; Longobardi, Giuseppe (2026). Large language models as assistants for the parametric comparison method. In The 16th International Conference on the Evolution of Language (Evolang 2026), Plovdiv, 7–10 April 2026, pp. 204–210. Electronic publication.
Large language models as assistants for the parametric comparison method
Giuseppe Longobardi
2026-01-01
Abstract
The Parametric Comparison Method (PCM) offers a principled way to encode syntactic variation across languages in terms of binary parameters that can subsequently be used for phylogenetic reconstruction. Collecting the relevant data, however, requires trained linguists to elicit grammaticality judgements from native speakers—a process that is slow, expensive, and prone to cross-linguistic inconsistency. We explore the use of large language models (LLMs) as preliminary assistants in this workflow, generating examples or counter-examples for parameter questions. Early experiments have shown that such models have the potential to accelerate knowledge elicitation but also to expose ambiguities in the definitions of parameter manifestations, helping PCM evolve into a more explicit and replicable scientific framework. We introduce a prototype software platform that automates this process, as illustrated on Eastern and Western Armenian. This study is designed as a registered report aiming to quantify the potential benefits of such automation by manually evaluating the accuracy of the LLM output and recording the proportion of parameter manifestation definitions that required editing, if any were encountered.
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.


