“Finding a Job” in a Corpus of 19th-20th-Century Fiction: Procedures, Aspects, and Problems of Data Creation, Extraction, and Representation

Floriana Carlotta Sciumbata; Paolo Nadalutti; Luca Tringali
2021-01-01

Abstract

This paper presents the methods, tools, and preliminary results of a study that aims to compile a list of job titles and analyze them in a corpus of 100 Italian canonical and non-canonical works of prose fiction published between 1825 and 1923. Job titles are interesting because they reflect socio-economic changes and provide important information on how literary settings and genres changed. After a short introduction to job titles, we discuss the tools and methods used to create the word list, extract the data, and represent them, from both a linguistic and a programming point of view. Some preliminary results are then presented and discussed with a statistical approach: the data do not suggest significant patterns over time; rather, job titles appear more or less consistent from a chronological point of view. Finally, some advantages and limitations are examined. The goal of this study is to develop a set of easily reproducible tools and methods for building complex lexical lists, finding their items, and representing the data for corpora of any genre and size in a simple and effective way.
Files in this item:
2021_Sciumbata-2_et-al RITT.pdf (article; publisher's version; open access; Creative Commons license; 2.73 MB, Adobe PDF)
Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3051359