Reducing Data Dimension for Cluster Detection

Clustering high-dimensional data is often challenging, both because of the computational burden of running any technique and because clusters generally become harder to interpret as the data dimension increases. This work discusses a method for finding low-dimensional representations of high-dimensional data, specifically conceived to preserve possible clusters in the data. It is based on the critical bandwidth, a nonparametric statistic for testing unimodality that is related to kernel density estimation. Some useful properties of this statistic are highlighted, and an adjustment is suggested for using it as a basis for reducing dimensionality. The method is illustrated with simulated and real data examples.
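
To make the central statistic concrete, the following is a minimal sketch of the classical univariate critical bandwidth (due to Silverman): the smallest bandwidth at which a Gaussian kernel density estimate has at most k modes. It is not the adjusted statistic or the dimension-reduction procedure of the paper, which this record does not describe. The function names `count_modes` and `critical_bandwidth`, the grid-based mode count, and the bisection settings are illustrative assumptions.

```python
import numpy as np


def count_modes(x, h, grid_size=512):
    """Count local maxima of a Gaussian KDE with bandwidth h, evaluated on a grid.

    Grid evaluation is an approximation (assumption of this sketch): modes
    closer together than the grid spacing may be merged.
    """
    x = np.asarray(x, dtype=float)
    grid = np.linspace(x.min() - 3.0 * h, x.max() + 3.0 * h, grid_size)
    z = (grid[:, None] - x[None, :]) / h
    # Unnormalised density: constant factors do not change the number of modes.
    dens = np.exp(-0.5 * z ** 2).sum(axis=1)
    signs = np.sign(np.diff(dens))
    signs = signs[signs != 0]  # ignore flat stretches
    # A mode is a switch from increasing to decreasing.
    return int(np.sum((signs[:-1] > 0) & (signs[1:] < 0)))


def critical_bandwidth(x, k=1, n_iter=50):
    """Approximate the smallest h for which the Gaussian KDE has at most k modes.

    Relies on the fact that, for the Gaussian kernel, the number of modes
    does not increase as h grows, so bisection applies.
    """
    x = np.asarray(x, dtype=float)
    data_range = x.max() - x.min()
    if data_range == 0.0:
        return 0.0  # all observations coincide: one mode at any bandwidth
    # At h equal to the data range, every kernel is concave over the whole
    # sample interval, so the estimate is unimodal and h is a valid upper bound.
    lo, hi = 0.0, data_range
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if count_modes(x, mid) <= k:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    one_group = rng.normal(0.0, 1.0, 200)
    two_groups = np.concatenate([rng.normal(-3.0, 1.0, 100),
                                 rng.normal(3.0, 1.0, 100)])
    print("unimodal sample:", critical_bandwidth(one_group))
    print("bimodal sample: ", critical_bandwidth(two_groups))
```

A sample with well-separated groups needs much more smoothing before its density estimate becomes unimodal, so a large critical bandwidth is evidence against unimodality. This is the property that makes the statistic a natural candidate for judging whether a low-dimensional representation retains cluster structure.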

Reducing Data Dimension for Cluster Detection / Torelli, N., Menardi, G. - In: JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION. - ISSN 0094-9655. - PRINT. - 83:11(2013), pp. 2047-2063. [10.1080/00949655.2012.679032]

Torelli, Nicola; Menardi, G.

2013-01-01

Year: 2013
Ahead-of-print date: 2 May 2012
Publication status: Published
Journal: JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
DOI: https://dx.doi.org/10.1080/00949655.2012.679032
URL: https://www.tandfonline.com/doi/abs/10.1080/00949655.2012.679032
Appears in categories: 1.1 Journal article

Files in this item:

File	Size	Format
Torelli_Reducing Data Dimension for Cluster Detection.pdf (closed access; description: article; type: publisher's version; license: publisher's copyright)	5.28 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2488333