A methodology for dealing with spatial big data
Schoier, Gabriella; Borruso, Giuseppe
2017-01-01
Abstract
Spatial data mining (SDM) is the extraction of knowledge from spatial data. Recently, location-based services have enabled the collection of large amounts of geo-referenced data, i.e., spatial big data (SBD). Spatial datasets often exceed the capacity of current computing systems to manage them with reasonable effort; therefore, data-intensive computing and data mining techniques are useful tools for analysis. In this paper, we present an approach to clustering high-dimensional data that allows flexible statistical modelling of phenomena characterised by unobserved heterogeneity. Numerous clustering algorithms have been developed for large databases; density-based algorithms in particular can handle very large amounts of data in large spatial databases. We present the Modified Density-Based Spatial Clustering of Applications with Noise (MDBSCAN) algorithm and compare it to the classical k-means approach. Both applications use synthetic datasets and a dataset of satellite images.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
2017_IJBIDM_1705_FPV.pdf (main article) | Closed access | Publisher's version | Digital Rights Management not defined | 321.3 kB | Adobe PDF
2914562_2017_IJBIDM_1705_FPV-PostPrint.pdf | Open access | Final post-refereeing draft (post-print) | Digital Rights Management not defined | 328.56 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.