Shaping the history of words
TREVISANI, MATILDE
2013-01-01
Abstract
In textual analysis, many corpora consist of texts in chronological order, and in many cases this temporal dimension is crucial to understanding their inner structure. In a typical bag-of-words approach, data are organized in contingency tables whose rows report the frequency of each word over time-points (shown in columns). These discrete data (temporal patterns of frequencies) may be viewed as continuous objects represented by functional relationships. This study aims to identify a specific sequential pattern for each word as a functional object and to group these word patterns into clusters. A model-based clustering procedure is proposed, with specific reference to a corpus of end-of-year messages delivered by the ten Presidents of the Italian Republic, covering the period from 1949 to 2011.
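As a rough illustration of the word-by-time contingency table described in the abstract, the sketch below builds one from a chronologically ordered list of texts. The function name `word_by_time_table`, the whitespace tokenization, and the three-document toy corpus are assumptions made for this example; they are not taken from the paper, and the sketch does not cover the functional representation or the model-based clustering step.

```python
from collections import Counter


def word_by_time_table(texts_by_time):
    """Build a word x time-point frequency table.

    texts_by_time: list of (time_label, text) pairs, assumed to be
    already sorted chronologically.
    Returns (time_labels, table), where table maps each word to a list
    of counts with one entry per time-point.
    """
    time_labels = [label for label, _ in texts_by_time]
    # Naive tokenization (lowercase + whitespace split); an assumption
    # of this sketch, not the preprocessing used in the study.
    counts = [Counter(text.lower().split()) for _, text in texts_by_time]
    vocabulary = sorted(set().union(*counts))
    # Each row is one word's discrete temporal pattern of frequencies.
    table = {word: [c[word] for c in counts] for word in vocabulary}
    return time_labels, table


# Toy corpus invented for illustration only.
toy_corpus = [
    (1949, "peace and work"),
    (1950, "work work and freedom"),
    (1951, "freedom and peace"),
]
labels, table = word_by_time_table(toy_corpus)
for word, row in table.items():
    print(word, row)
```

Each row of `table` is the discrete sequence that the abstract proposes to treat as a functional object before clustering.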