Adaptive ensemble of self-adjusting nearest neighbor subspaces for multi-label drifting data streams

Barbon Junior S.;
2022-01-01

Abstract

Multi-label data streams are sequences of multi-label instances arriving over time to a multi-label classifier. The properties of the stream may change continuously due to concept drift, so algorithms must constantly adapt to new data distributions. In this paper, we propose a novel ensemble method for multi-label drifting streams named Adaptive Ensemble of Self-Adjusting Nearest Neighbor Subspaces (AESAKNNS). It combines a self-adjusting kNN base classifier with the advantages of ensembles to adapt to concept drift in the multi-label environment. To promote diverse knowledge within the ensemble, each base classifier is given a unique subset of features and samples to train on. The samples are distributed to the classifiers probabilistically, following a Poisson distribution as in online bagging. Alongside these mechanisms, a collection of ADWIN detectors monitors each classifier for the occurrence of concept drift on its subspace. Upon detection, the algorithm automatically trains additional classifiers in the background to attempt to capture new concepts on new subspaces of features. Dynamic classifier selection then chooses the most accurate classifiers from the active and background ensembles to replace the current ensemble. Our experimental study compares the proposed approach with 30 other classifiers, including problem transformation, algorithm adaptation, kNNs, and ensembles, on 30 diverse multi-label datasets and 12 performance metrics. Results, validated using non-parametric statistical analysis, support the superior performance of AESAKNNS and highlight the contribution of its components to the performance of the ensemble.
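The two diversity mechanisms named in the abstract, random feature subspaces and Poisson-weighted sample distribution as in online bagging, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `SubspaceKNN` and `SubspaceBaggingEnsemble`, the fixed `k`, the window size, the majority-vote rule, and all default parameters are assumptions made for illustration. The self-adjusting behaviour of the kNN, the ADWIN detectors, the background ensemble, and the dynamic classifier selection are deliberately left out, and the randomly drawn subspaces are not guaranteed to be unique.

```python
import math
import random
from collections import deque


def poisson(lam, rng):
    """Draw k ~ Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class SubspaceKNN:
    """Hypothetical base learner: plain kNN over a sliding window,
    restricted to a fixed subset of feature indices."""

    def __init__(self, features, k=3, window=200):
        self.features = features
        self.k = k
        self.window = deque(maxlen=window)  # oldest instances are forgotten

    def _project(self, x):
        return [x[i] for i in self.features]

    def learn(self, x, y):
        self.window.append((self._project(x), y))

    def predict(self, x, n_labels):
        if not self.window:
            return [0] * n_labels
        q = self._project(x)
        nearest = sorted(self.window, key=lambda item: math.dist(q, item[0]))[: self.k]
        # A label is predicted when more than half of the neighbours carry it.
        return [int(2 * sum(y[j] for _, y in nearest) > len(nearest))
                for j in range(n_labels)]


class SubspaceBaggingEnsemble:
    """Illustrative ensemble: random feature subspaces plus online bagging."""

    def __init__(self, n_features, n_labels, n_models=5, subspace_size=2,
                 lam=1.0, seed=0):
        self.rng = random.Random(seed)
        self.n_labels = n_labels
        self.lam = lam
        # Each model gets a random feature subset (assumed sampling scheme).
        self.models = [
            SubspaceKNN(sorted(self.rng.sample(range(n_features), subspace_size)))
            for _ in range(n_models)
        ]

    def learn(self, x, y):
        for model in self.models:
            # Online bagging: each model sees the instance Poisson(lam) times.
            for _ in range(poisson(self.lam, self.rng)):
                model.learn(x, y)

    def predict(self, x):
        votes = [m.predict(x, self.n_labels) for m in self.models]
        # Per-label majority vote across the ensemble members.
        return [int(2 * sum(v[j] for v in votes) > len(votes))
                for j in range(self.n_labels)]
```

In a prequential (test-then-train) loop, `predict(x)` would be called on each arriving instance before `learn(x, y)`; a drift detector per model would then consume the resulting per-model errors, which is the part this sketch does not cover.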
Files in this record:

1-s2.0-S0925231222000984-main.pdf
Access: Closed
Type: Publisher's version
License: Publisher copyright
Size: 3.31 MB
Format: Adobe PDF

3014631_1-s2.0-S0925231222000984-main-Post_print.pdf
Access: Open Access since 04/02/2024
Type: Final post-refereeing draft (post-print)
License: Creative Commons
Size: 3.97 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3014631