Complex Mining from Uncertain Big Data in Distributed Environments: Problems, Definitions and Two Effective and Efficient Algorithms

CUZZOCREA, Alfredo Massimiliano
2017

Abstract

Big data are everywhere. They are high-veracity, high-velocity, high-value, and/or high-variety data with volumes beyond the ability of commonly used software to manage, query, and process within a tolerable elapsed time. Big data analytics incorporates techniques from a broad range of fields, including cloud computing, data mining, machine learning, mathematics, and statistics. Data mining aims to extract implicit, previously unknown, and potentially useful information from data. At the same time, uncertain big data management is an active and well-recognized research area in which a considerable number of proposals converge. This is due to several reasons, chiefly emerging big data trends and the explosion of Cloud computing paradigms. Within this wide research context, a leading role is played by the problem of extracting useful knowledge from big data, with the uncertain big data setting being a critical case. In our research, we focus specifically on two well-known, distinct, first-class data mining problems over uncertain big data, namely frequent itemset mining from uncertain big data and constrained mining from uncertain big data. We recognize that these sub-problems converge into a general problem that we name “complex mining from uncertain big data”, for which a plethora of real-life applications and systems can be found. Inspired by these research challenges, in this chapter we provide the following contributions: (i) a comprehensive overview of the state-of-the-art literature on complex mining from uncertain big data; (ii) an algorithm for supporting tree-based mining of uncertain big data in distributed environments; (iii) a MapReduce-based algorithm for supporting constrained mining over uncertain big (transactional) data in Cloud environments.
ISBN: 9781498768078

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11368/2898006