Selecting Optimal Trace Clustering Pipelines with Meta-learning

Trace clustering has been extensively used to discover aspects of the data in event logs. Process Mining techniques guide the identification of sub-logs by grouping traces with similar behaviors, producing more understandable models and improving conformance indicators. Nevertheless, little attention has been paid to the relationships among event log properties, the pipeline of encoding and clustering algorithms, and the quality of the resulting outcome. The present study contributes to the understanding of these relationships and provides automatic selection of a suitable combination of algorithms for clustering a given event log. We propose a Meta-Learning framework to recommend the most suitable pipeline for trace clustering, which encompasses the encoding method, the clustering algorithm, and its hyperparameters. Our experiments were conducted using a thousand event logs, four encoding techniques, and three clustering methods. Results indicate that our framework sheds light on the trace clustering problem and can assist users in choosing the best pipeline for their environment.

Tavares G. M.; Barbon Junior S.; Damiani E.; Ceravolo P.

2022-01-01

Year	2022
Series title	LECTURE NOTES IN COMPUTER SCIENCE
ISBN	978-3-031-21685-5; 978-3-031-21686-2
Appears in types	2.1 Contribution in Volume (Chapter, Essay)

Files in this item:

File	Access	Type	License	Size	Format
(Lecture Notes in Computer Science, 13653) João Carlos Xavier-Junior, Ricardo Araújo Rios - Intelligent Systems_ 11th Brazilian Conference, BRACIS 2022, Campinas, Brazil, November 28 – December 1, 202-172-186.pdf	Closed access	Publisher's version	Publisher copyright	408.03 kB	Adobe PDF
(Lecture+Notes+in+Computer+Science,+13653)+João+Carlos+Xavier-Junior,+Ricardo+Araújo+Rios+-+Intelligent+Systems_+11th+Brazilian+Conference,+BRACIS+2022,+Campin.pdf	Open access from 20/11/2023	Final post-refereeing draft (post-print)	Digital Rights Management not defined	1.04 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3055526
