Some Issues in Estimation of Extremes in the Analysis of Network Data
DE STEFANO, DOMENICO
2013-01-01
Abstract
Practical applications of extreme value modeling can be found in many fields. For instance, in network analysis the shape and the extremal behavior of random variables describing certain network characteristics are of interest. In some cases it is relevant to assess how the tail of such distributions is shaped and what the cut-off value is above which a particular distribution, typically power-law-like, can be fitted. The shape of the degree distribution is of great importance in network dynamics: if it follows a power law, a peculiar tie-formation mechanism (known as preferential attachment) takes place in the network, whereas if the tail is exponential, a simpler random generating mechanism occurs. Applications of extreme value theory can be found in the network analysis literature, where the common practice is to fit a power law distribution to the empirical degree distribution using the method proposed by Clauset et al. (2009). This practice differs considerably from the usual methods proposed in the statistical literature (Coles, 2001), especially with regard to threshold selection: the most traditional methods employ graphical diagnostics to select the threshold. In this work we aim to compare the results obtained through different methods on some network case studies in order to assess the sensitivity of substantive conclusions to the method employed. In particular, we plan to compare different distributional assumptions (Generalized Pareto Distribution versus Power Law versus Generalized Extreme Value distribution) and different techniques for choosing the fraction of observations used to estimate tail properties (graphical methods versus automatic procedures, especially the one proposed by Clauset et al. (2009)). We find that employing graphical methods often leads to very different results with respect to the automatic threshold choice of Clauset et al. (2009).
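As an illustration of what an automatic threshold choice in the spirit of Clauset et al. (2009) can look like, the following is a minimal sketch and not the procedure used in the paper. It assumes a continuous power law (degree data are discrete, so this is only an approximation), estimates the exponent by maximum likelihood for each candidate threshold, and keeps the threshold with the smallest Kolmogorov-Smirnov distance between the empirical and fitted tails. The function names, the `min_tail` parameter, and the synthetic sample are hypothetical choices made for this example.

```python
import math
import random


def fit_alpha(tail, xmin):
    """Continuous power-law MLE of the exponent for observations x >= xmin."""
    log_sum = sum(math.log(x / xmin) for x in tail)
    return 1.0 + len(tail) / log_sum


def ks_distance(sorted_tail, xmin, alpha):
    """Largest gap between the empirical and the fitted CDF on the tail."""
    n = len(sorted_tail)
    d = 0.0
    for i, x in enumerate(sorted_tail):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare with the empirical CDF just before and just after x.
        d = max(d, abs(model - i / n), abs(model - (i + 1) / n))
    return d


def select_threshold(data, min_tail=50):
    """Scan candidate thresholds and return (xmin, alpha, ks) with minimal ks.

    min_tail is an assumption of this sketch: candidates leaving fewer
    observations in the tail are not considered.
    """
    xs = sorted(data)
    best = None
    for start in range(len(xs) - min_tail + 1):
        xmin = xs[start]
        if start > 0 and xs[start - 1] == xmin:
            continue  # tied value: this candidate was already evaluated
        tail = xs[start:]
        if tail[-1] == xmin:
            continue  # degenerate tail, the MLE is undefined
        alpha = fit_alpha(tail, xmin)
        d = ks_distance(tail, xmin, alpha)
        if best is None or d < best[2]:
            best = (xmin, alpha, d)
    return best


# Synthetic sample: exact power law with xmin = 1 and exponent 2.5,
# drawn by inverse-transform sampling (illustrative data only).
random.seed(0)
sample = [(1.0 - random.random()) ** (-1.0 / 1.5) for _ in range(1000)]
xmin_hat, alpha_hat, ks = select_threshold(sample)
print(f"xmin={xmin_hat:.3f} alpha={alpha_hat:.3f} KS={ks:.4f}")
```

A graphical approach would instead inspect, for example, a mean residual life plot or the stability of parameter estimates across thresholds and pick the threshold by eye, which is one reason the two approaches can select different tail fractions.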