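Distanza intertestuale e lingua fonte: premesse teoriche, compilazione di un corpus e procedure di analisi [Intertextual distance and source language: theoretical premises, corpus compilation and analysis procedures]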

Ondelli
Member of the Collaboration Group
2017-01-01

Abstract

This chapter outlines the theoretical background for applying computational linguistic methods to test the translation universals hypothesis. Starting from the assumption that both the translation process and the source language affect the linguistic features of translations, we use Labbé’s method for calculating intertextual distance to check whether it can distinguish translated from non-translated texts and whether it succeeds in grouping together texts translated from the same language within a corpus of translations. In addition to compiling a balanced corpus of newspaper articles (both originally written in Italian and translated from several languages), ad hoc procedures are necessary to offset the impact of differing text lengths and contents on intertextual distance values. Selecting text chunks of equal length and different types of tokens (grammar words, multi-word units, etc.), together with POS-tagging procedures to identify additional useful linguistic features, provides a promising approach for evaluating different methods (cosine similarity, machine learning, stylometry) of calculating the intertextual distance between translated and non-translated texts.
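
As an illustration only, the measure named in the abstract can be sketched in a few lines of Python. The sketch assumes the commonly cited basic form of Labbé's relative intertextual distance (the larger text's frequencies are scaled down to the size of the smaller one, and absolute frequency differences are summed and divided by twice the smaller text's length), without any low-frequency cutoff. The function names `chunk` and `labbe_distance` and the whitespace tokenization are assumptions made for this example and are not taken from the chapter.

```python
from collections import Counter


def chunk(tokens, size):
    """Split a token list into consecutive chunks of exactly `size` tokens.

    Any remainder shorter than `size` is dropped, so all chunks are equal in length.
    """
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def labbe_distance(tokens_a, tokens_b):
    """Relative intertextual distance in [0, 1] (basic form, no frequency cutoff).

    0 means identical frequency profiles; 1 means no shared vocabulary.
    """
    # Make A the shorter text, so that B is the one scaled down.
    if len(tokens_a) > len(tokens_b):
        tokens_a, tokens_b = tokens_b, tokens_a
    n_a, n_b = len(tokens_a), len(tokens_b)
    if n_a == 0:
        raise ValueError("both texts must contain at least one token")

    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    scale = n_a / n_b  # reduces B's frequencies to the size of A

    # Sum absolute differences over the union of both vocabularies.
    vocabulary = freq_a.keys() | freq_b.keys()
    total = sum(abs(freq_a[w] - freq_b[w] * scale) for w in vocabulary)
    return total / (2 * n_a)
```

For example, `labbe_distance("il gatto dorme".split(), "il cane dorme".split())` compares two three-token texts that differ in one word and yields one third. Comparing chunks produced by `chunk` with a fixed `size` keeps the scaling factor at 1, which is one simple way to limit the effect of text length mentioned in the abstract.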
2017
978-88-8303-912-6
978-88-8303-913-3
Files in this record:

Testi_corpora_confronti metadati.pdf
Access: open access
Description: title page and table of contents of the volume
Type: publisher's version
License: Creative Commons
Size: 84.62 kB
Format: Adobe PDF

Distanza intertestuale e lingua fonte premesse teoriche, compilazione di un corpus e procedure di analisi.pdf
Access: open access
Description: text of the contribution
Type: publisher's version
License: Creative Commons
Size: 6.15 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2914686