As technology advances, streams of data can be produced in many applications such as social networks, sensor networks, bioinformatics, and chemical informatics. These kinds of streaming data share a property in common.namely, they can be modeled in terms of graph-structured data. Here, the data streams generated by graph data sources in these applications are graph streams. To extract implicit, previously unknown, and potentially useful frequent patterns from these streams, efficient data mining algorithms are in demand. Many existing algorithms capture important streaming data and assume that the captured data can fit into main memory. However, problems arise when such an assumption does not hold (e.g., when the available memory is limited). In this paper, we propose a data structure called DSMatrix for capturing important data from the streams.especially, dense graph streams.onto the disk when the memory space is limited. In addition, we also propose two stream mining algorithms that use DSMatrix to mine frequent patterns. The tree-based horizontal mining algorithm applies an e.ective frequency counting approach to avoid recursive construction of sub-trees as in many tree-based mining. The vertical mining algorithm makes good use of the information captured in the DSMatrix for mining.

Frequent pattern mining from dense graph streams

CUZZOCREA, Alfredo Massimiliano;
2014-01-01

Abstract

As technology advances, streams of data can be produced in many applications such as social networks, sensor networks, bioinformatics, and chemical informatics. These kinds of streaming data share a property in common.namely, they can be modeled in terms of graph-structured data. Here, the data streams generated by graph data sources in these applications are graph streams. To extract implicit, previously unknown, and potentially useful frequent patterns from these streams, efficient data mining algorithms are in demand. Many existing algorithms capture important streaming data and assume that the captured data can fit into main memory. However, problems arise when such an assumption does not hold (e.g., when the available memory is limited). In this paper, we propose a data structure called DSMatrix for capturing important data from the streams.especially, dense graph streams.onto the disk when the memory space is limited. In addition, we also propose two stream mining algorithms that use DSMatrix to mine frequent patterns. The tree-based horizontal mining algorithm applies an e.ective frequency counting approach to avoid recursive construction of sub-trees as in many tree-based mining. The vertical mining algorithm makes good use of the information captured in the DSMatrix for mining.
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11368/2896354
 Avviso

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 10
  • ???jsp.display-item.citation.isi??? ND
social impact