Frequent pattern mining aims to discover implicit, previously unknown, and potentially useful knowledge in the form of sets of frequently co-occurring items, events, or objects. To mine frequent patterns from probabilistic datasets of uncertain data, where each item in a transaction is usually associated with an existential probability expressing the likelihood of its presence in that transaction, the UF-growth algorithm captures important information about uncertain data in a UF-tree structure so that expected support can be computed for each pattern. A pattern is considered frequent if its expected support meets or exceeds the user-specified threshold. However, a challenge is that the UF-tree can be large. To handle this challenge, several algorithms use smaller trees such that upper bounds to expected support can be computed. In this paper, we examine these upper bounds, and determine which ones provide tighter upper bounds to expected support for frequent pattern mining of uncertain big data.
Computing theoretically-sound upper bounds to expected support for frequent pattern mining problems over uncertain big data
CUZZOCREA, Alfredo Massimiliano;
2016-01-01
Abstract
Frequent pattern mining aims to discover implicit, previously unknown, and potentially useful knowledge in the form of sets of frequently co-occurring items, events, or objects. To mine frequent patterns from probabilistic datasets of uncertain data, where each item in a transaction is usually associated with an existential probability expressing the likelihood of its presence in that transaction, the UF-growth algorithm captures important information about uncertain data in a UF-tree structure so that expected support can be computed for each pattern. A pattern is considered frequent if its expected support meets or exceeds the user-specified threshold. However, a challenge is that the UF-tree can be large. To handle this challenge, several algorithms use smaller trees such that upper bounds to expected support can be computed. In this paper, we examine these upper bounds, and determine which ones provide tighter upper bounds to expected support for frequent pattern mining of uncertain big data.File | Dimensione | Formato | |
---|---|---|---|
Cuzzocrea.pdf
Accesso chiuso
Tipologia:
Documento in Versione Editoriale
Licenza:
Digital Rights Management non definito
Dimensione
403.92 kB
Formato
Adobe PDF
|
403.92 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.