
Large language models as assistants for the parametric comparison method / Kazakov, Dimitar; Aljohani, Thamer; Karina Anom, Andari; Anastasova, Maria; Carrasco Coquillat, David; Dekova, Rositsa; Nascimento Fernandes, Ingrid; Ferroni, Sofia; Longhin, Marco; Margova, Ruslana; Milković, Lidija; Qomariyah, Nurul; Roy, Reshmi; Sorge, Gaia; Wang, Jiabao; Yarahmadi, Hediye; Longobardi, Giuseppe. - ELECTRONIC. - (2026), pp. 204-210. (The 16th International Conference on the Evolution of Language - Evolang 2026, Plovdiv, 07-10/04/2026).

Large language models as assistants for the parametric comparison method

Giuseppe Longobardi
2026-01-01

Abstract

The Parametric Comparison Method (PCM) offers a principled way to encode syntactic variation across languages in terms of binary parameters that can subsequently be used for phylogenetic reconstruction. Collecting the relevant data, however, requires trained linguists to elicit grammaticality judgements from native speakers, a process that is slow, expensive, and prone to cross-linguistic inconsistency. We explore the use of large language models (LLMs) as preliminary assistants in this workflow, generating examples or counter-examples for parameter questions. Early experiments have shown that such models have the potential to accelerate knowledge elicitation but also to expose ambiguities in the definitions of parameter manifestations, helping PCM evolve into a more explicit and replicable scientific framework. We introduce a prototype software platform that automates this process, as illustrated on Eastern and Western Armenian. This study is designed as a registered report aiming to quantify the potential benefits of such automation by manually evaluating the accuracy of the LLM output and recording the proportion of parameter manifestation definitions that required editing, if any were encountered.
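To make the idea of binary parameter vectors concrete, the sketch below computes a toy distance between two languages encoded as 0/1 parameter settings, with `None` marking parameters that are unset or inapplicable. This is an illustrative Hamming-style comparison under our own simplifying assumptions, not the PCM's actual distance measure, and the example vectors are hypothetical, not real parameter settings for the Armenian varieties.

```python
def parameter_distance(a, b):
    """Toy distance between two binary parameter vectors.

    Each entry is 1, 0, or None (parameter unset / inapplicable).
    Only positions where both languages have a value are compared;
    the result is the fraction of those positions that disagree.
    Simplified stand-in, NOT the PCM's actual metric.
    """
    comparable = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
    if not comparable:
        return 0.0
    return sum(x != y for x, y in comparable) / len(comparable)

# Hypothetical settings for two varieties (illustrative only)
variety_a = [1, 0, 1, None, 1]
variety_b = [1, 1, 1, None, 0]
print(parameter_distance(variety_a, variety_b))  # 0.5 (2 of 4 comparable differ)
```

Such pairwise distances over all language pairs could then feed a standard distance-based tree-building step, which is the general shape of the phylogenetic reconstruction the abstract refers to.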

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3133318