Increasing attention has been paid of late to the problem of detecting and explaining “deviant” process instances, i.e. instances diverging from normal/desired outcomes (e.g., frauds, faults, SLA violations), based on log data. Current solutions allow to discriminate between deviant and normal instances, by combining the extraction of (sequence-based) behavioral patterns with standard classifier-induction methods. However, there is no general consensus on which kind of patterns are the most suitable for such a task, while mixing multiple pattern families together will produce a cumbersome redundant representation of log data that may well confuse the learner. We here propose an ensemble-learning approach to this deviance mining tasks, where multiple base learners are trained on different feature-based views of the given log (obtained each by using a distinguished family of patterns). The final model, induced through a stacking procedure, can implicitly reason on heterogeneous kinds of structural features, by leveraging the predictions of the base models. To make the discovered models more effective, the approach leverages resampling techniques and exploits non-structural process data. The approach was implemented and tested on real-life logs, where it reached compelling performances w.r.t. state-of-the-art methods.
A multi-view learning approach to the discovery of deviant process instances
CUZZOCREA, Alfredo Massimiliano;
2015-01-01
Abstract
Increasing attention has been paid of late to the problem of detecting and explaining “deviant” process instances, i.e. instances diverging from normal/desired outcomes (e.g., frauds, faults, SLA violations), based on log data. Current solutions allow to discriminate between deviant and normal instances, by combining the extraction of (sequence-based) behavioral patterns with standard classifier-induction methods. However, there is no general consensus on which kind of patterns are the most suitable for such a task, while mixing multiple pattern families together will produce a cumbersome redundant representation of log data that may well confuse the learner. We here propose an ensemble-learning approach to this deviance mining tasks, where multiple base learners are trained on different feature-based views of the given log (obtained each by using a distinguished family of patterns). The final model, induced through a stacking procedure, can implicitly reason on heterogeneous kinds of structural features, by leveraging the predictions of the base models. To make the discovered models more effective, the approach leverages resampling techniques and exploits non-structural process data. The approach was implemented and tested on real-life logs, where it reached compelling performances w.r.t. state-of-the-art methods.Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.