Frequent subgraph mining from streams of uncertain data

Leung, Carson K.; Cuzzocrea, Alfredo Massimiliano

doi:10.1145/2790798.2790799

In the current era of big data, high volumes of high-value data (such as social network data) can be generated at high velocity. The quality and accuracy of these data depend on their veracity, that is, on how uncertain the data are. A collection of these uncertain data can be viewed as a big, interlinked, dynamic graph structure. Embedded in these big data is implicit, previously unknown, and potentially useful knowledge. Hence, efficient and effective knowledge discovery algorithms for mining frequent subgraphs from these dynamic, streaming, graph-structured data are in demand. Most existing algorithms mine frequent subgraphs from streams of precise data. However, in many real-life scientific and engineering applications, the data are uncertain. Hence, in this paper, we propose algorithms that use limited memory space for mining frequent subgraphs from streams of uncertain data. Evaluation results show that our algorithms are effective at this task.