This work starts from the empirical observation that k nearest neighbours (KNN) consistently outperforms state-of-the-art techniques for regression, including geometric semantic genetic programming (GSGP). However, KNN is a memorization, and not a learning, method, i.e. it evaluates unseen data on the basis of training observations, and not by running a learned model. This paper takes a first step towards the objective of defining a learning method able to equal KNN, by defining a new semantic mutation, called random vectors-based mutation (RVM). GP using RVM, called RVMGP, obtains results that are comparable to KNN, but still needs training data to evaluate unseen instances. A comparative analysis sheds some light on the reason why RVMGP outperforms GSGP, revealing that RVMGP is able to explore the semantic space more uniformly. This finding opens a question for the future: is it possible to define a new genetic operator, that explores the semantic space as uniformly as RVM does, but that still allows us to evaluate unseen instances without using training data?

Is k Nearest Neighbours Regression Better Than GP?

Manzoni, Luca;
2020-01-01

Abstract

This work starts from the empirical observation that k nearest neighbours (KNN) consistently outperforms state-of-the-art techniques for regression, including geometric semantic genetic programming (GSGP). However, KNN is a memorization, and not a learning, method, i.e. it evaluates unseen data on the basis of training observations, and not by running a learned model. This paper takes a first step towards the objective of defining a learning method able to equal KNN, by defining a new semantic mutation, called random vectors-based mutation (RVM). GP using RVM, called RVMGP, obtains results that are comparable to KNN, but still needs training data to evaluate unseen instances. A comparative analysis sheds some light on the reason why RVMGP outperforms GSGP, revealing that RVMGP is able to explore the semantic space more uniformly. This finding opens a question for the future: is it possible to define a new genetic operator, that explores the semantic space as uniformly as RVM does, but that still allows us to evaluate unseen instances without using training data?
File in questo prodotto:
File Dimensione Formato  
cover. index. contributo.pdf

Accesso chiuso

Tipologia: Documento in Versione Editoriale
Licenza: Copyright Editore
Dimensione 3.73 MB
Formato Adobe PDF
3.73 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
post print.pdf

Open Access dal 10/04/2021

Descrizione: final version at link:https://link.springer.com/chapter/10.1007/978-3-030-44094-7_16
Tipologia: Bozza finale post-referaggio (post-print)
Licenza: Digital Rights Management non definito
Dimensione 4.03 MB
Formato Adobe PDF
4.03 MB Adobe PDF Visualizza/Apri
11368_2962860_print.pdf

accesso aperto

Tipologia: Bozza finale post-referaggio (post-print)
Licenza: Digital Rights Management non definito
Dimensione 3.85 MB
Formato Adobe PDF
3.85 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11368/2962860
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact