Chronological analysis of textual data and curve clustering: preliminary results based on wavelets

TREVISANI, MATILDE
2012-01-01

Abstract

In textual analysis, many corpora consist of texts that have a chronological order. The temporal evolution of (key) words helps to highlight the distinctive features of such a chronological corpus. In a typical bag-of-words approach, data are organized in word-type × time-point contingency tables. Such discrete data can be thought of as continuous objects represented by functional relationships. The aims of this study are to identify a specific sequential pattern for each word, treated as a functional object, and to determine prototype patterns representing clusters of words that portray a similar evolution. We propose the application of a flexible wavelet-based model for curve clustering to a corpus of end-of-year addresses delivered by the ten Presidents of the Italian Republic in the period 1949-2011.
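
As a rough illustration of the data structure described above, the following minimal sketch builds a word-type × time-point contingency table from a toy corpus and computes Haar wavelet coefficients for each word's curve. The corpus, the variable names, and the choice of the Haar basis are assumptions made for illustration only; this is not the clustering model proposed in the study, just one way such word curves might be represented before clustering.

```python
from collections import Counter
from math import sqrt

# Toy corpus (hypothetical): one short text per time point.
corpus = {
    1949: "peace work peace nation",
    1950: "peace work nation nation",
    1951: "europe work nation europe",
    1952: "europe europe work europe",
}

years = sorted(corpus)
counts = {year: Counter(corpus[year].split()) for year in years}
vocab = sorted({word for c in counts.values() for word in c})

# Word-type x time-point contingency table: one row (curve) per word.
table = {word: [counts[year][word] for year in years] for word in vocab}


def haar(signal):
    """Full orthonormal Haar decomposition of a sequence of length 2^k.

    Returns the coarsest approximation coefficient followed by the
    detail coefficients, ordered from coarse to fine.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Finer details are computed first, so prepend coarser ones.
        details = [(a - b) / sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / sqrt(2) for a, b in pairs]
    return approx + details


# Wavelet representation of each word's temporal curve (assumed Haar basis).
coeffs = {word: haar(row) for word, row in table.items()}

for word in vocab:
    print(word, table[word], [round(c, 3) for c in coeffs[word]])
```

Words whose coefficient vectors are close to each other have similar temporal profiles, so a clustering procedure could operate on `coeffs` rather than on the raw counts. In practice the counts would usually be normalized by the length of each text first.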
Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2554024