A methodology for dealing with spatial big data

SCHOIER, GABRIELLA; BORRUSO, GIUSEPPE

2017-01-01

Abstract

Spatial data mining (SDM) is the extraction of knowledge from spatial data. Location-based services have recently made it possible to gather large volumes of geo-referenced data, i.e., spatial big data (SBD). Spatial datasets often exceed what current computing systems can manage with reasonable effort; data-intensive computing and data mining techniques are therefore useful tools for analysing them. In this paper, we present an approach to clustering high-dimensional data that allows flexible statistical modelling of phenomena characterised by unobserved heterogeneity. Numerous clustering algorithms have been developed for large databases; density-based algorithms in particular are suited to handling huge amounts of data in large spatial databases. We present the Modified Density-Based Spatial Clustering of Applications with Noise (MDBSCAN) algorithm and compare it with the classical k-means approach. Both algorithms are applied to synthetic datasets and to a dataset of satellite images.

Year: 2017
Date ahead of print: 3 Mar 2017
Publication status: Published
Journal: INTERNATIONAL JOURNAL OF BUSINESS INTELLIGENCE AND DATA MINING
DOI: https://dx.doi.org/10.1504/IJBIDM.2017.082705
URL: http://www.inderscience.com/info/inarticle.php?artid=82705
URL: http://www.inderscience.com/info/filter.php?aid=82705
Publication type: 1.1 Journal article

Files in this item:

File	Access	Type	Licence	Size	Format
2017_IJBIDM_1705_FPV.pdf (main article)	Closed access	Publisher's version	Digital Rights Management not defined	321.3 kB	Adobe PDF
2914562_2017_IJBIDM_1705_FPV-PostPrint.pdf	Open access	Final post-review draft (post-print)	Digital Rights Management not defined	328.56 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2914562
