A Longitudinal Principal Component Analysis: An application to a multivariate time series of local finance indicators
TREVISANI, MATILDE
2008-01-01
Abstract
Principal component (PC) analysis is a well-known descriptive technique for dimensionality reduction that is customarily applied to cross-sectional data. In this paper we address dimensionality reduction when data are longitudinal. That is, we aim to find time-invariant latent dimensions, if they exist, in longitudinal multivariate data. Time invariance here means that the construction, and hence the interpretation, of such latent factors (or PCs) is constant over time. Drawing stable dimensions serves at least two purposes: (i) it allows a consistent comparison of unit profiles across time, as measured with respect to the same set of PCs; (ii) it allows detection of a possible temporal pattern in the construction, and hence the interpretation, of such latent dimensions. Indeed, when PCs are used to perform a cluster analysis of unit profiles on lower-dimensional data, time-invariant PCs allow, by (i), a consistent comparison of groupings across time. To this aim we propose a two-step procedure. First, a variable selection method is developed to extract a common subset of elementary variables from the overall longitudinal multivariate data. It is known that the more redundant variables are eliminated at the outset, the more effective any dimensionality reduction method is. For this purpose we use various selection criteria (Al-Kandari et al. (2005)), such as the cluster criterion, the multiple correlation criterion and the best prediction criterion, together with efficiency measures (e.g. the Procrustes discrepancy), while supporting analytical results with purpose-built graphical devices.
Second, we perform a longitudinal PC analysis consisting of: a PC analysis on the overall longitudinal set of selected elementary variables; suitable rotations of the extracted PCs to enhance clarity of interpretation; and a specially designed multiple regression analysis of the derived longitudinal PCs on the selected elementary variables to infer the time-invariant latent dimension weights. This novel approach is applied to high-dimensional longitudinal data consisting of elementary financial indicators observed over the set of municipalities of an Italian region from 2001 to 2006. The results show the existence of four latent dimensions in the surveyed local finance and a slowly changing interpretation of these over the observed period. Finally, well-identified clusters of municipalities, as measured with respect to such dimensions, are compared consistently across time.
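The second step of the procedure can be illustrated with a minimal sketch. It is not the author's implementation: the data are synthetic, the sizes `T`, `n`, `p`, `k` are hypothetical, varimax is assumed as the rotation (the abstract does not name one), and an ordinary least-squares fit stands in for the specially designed regression, whose details the abstract does not give.

```python
import numpy as np

def varimax(L, tol=1e-8, max_iter=500):
    """Orthogonal varimax rotation of a (p, k) loading matrix."""
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        LR = L @ R
        grad = L.T @ (LR**3 - LR @ np.diag((LR**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):  # criterion stopped improving
            break
        d = d_new
    return L @ R, R

rng = np.random.default_rng(0)
T, n, p, k = 6, 50, 8, 2  # years, units, variables, retained PCs (hypothetical)

# Synthetic panel: k latent factors drive p indicators in every year.
factors = rng.normal(size=(T, n, k))
mixing = rng.normal(size=(k, p))
X = factors @ mixing + 0.3 * rng.normal(size=(T, n, p))

# Pool all years into one (T*n, p) matrix and standardize the columns.
Z = X.reshape(T * n, p)
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

# PC analysis on the pooled data via SVD; keep the first k directions.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
V = Vt[:k].T                      # (p, k) unrotated weights

# Rotate the retained directions to ease interpretation.
rotated, R = varimax(V)
scores = Z @ rotated              # (T*n, k) rotated PC scores

# Regress the rotated scores on the variables to recover the weights.
weights, *_ = np.linalg.lstsq(Z, scores, rcond=None)

# Scores of every unit in every year, all measured on the same axes.
scores_by_year = scores.reshape(T, n, k)
```

In this simplified setting the regression reproduces the rotated weights exactly, because the scores are exact linear combinations of the pooled variables. The sketch therefore only shows the mechanics of measuring every year's units on one common set of axes.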