Is k Nearest Neighbours Regression Better Than GP?
Manzoni, Luca
2020-01-01
Abstract
This work starts from the empirical observation that k nearest neighbours (KNN) consistently outperforms state-of-the-art regression techniques, including geometric semantic genetic programming (GSGP). However, KNN is a memorization method, not a learning method: it evaluates unseen data on the basis of training observations rather than by running a learned model. This paper takes a first step towards the objective of defining a learning method able to equal KNN, by defining a new semantic mutation, called random vectors-based mutation (RVM). GP using RVM, called RVMGP, obtains results that are comparable to KNN, but still needs training data to evaluate unseen instances. A comparative analysis sheds some light on why RVMGP outperforms GSGP, revealing that RVMGP is able to explore the semantic space more uniformly. This finding opens a question for the future: is it possible to define a new genetic operator that explores the semantic space as uniformly as RVM does, but still allows us to evaluate unseen instances without using training data?

File | Description | Type | License | Size | Format | Access
---|---|---|---|---|---|---
cover. index. contributo.pdf | — | Publisher's version (Documento in Versione Editoriale) | Publisher copyright | 3.73 MB | Adobe PDF | Closed access (request a copy)
post print.pdf | Final version at: https://link.springer.com/chapter/10.1007/978-3-030-44094-7_16 | Final post-refereeing draft (post-print) | DRM not defined | 4.03 MB | Adobe PDF | Open Access from 10/04/2021
11368_2962860_print.pdf | — | Final post-refereeing draft (post-print) | DRM not defined | 3.85 MB | Adobe PDF | Open access
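The abstract's distinction between memorization and learning is concrete in KNN regression: the method keeps the entire training set and answers each query by averaging the targets of the k closest training points, so no model is ever fitted. A minimal sketch of this idea (illustrative only — not the experimental setup or parameters used in the paper):

```python
import math

def knn_regress(X_train, y_train, x_query, k=3):
    """Predict the target for x_query as the mean target of its
    k nearest training points (Euclidean distance).

    Nothing is fitted: the stored training set itself plays the
    role of the 'model', which is why KNN is described in the
    abstract as memorization rather than learning.
    """
    dists = [(math.dist(x, x_query), y) for x, y in zip(X_train, y_train)]
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Toy data: y = 2x. The query at x = 2.0 is answered directly from
# the stored observations, not by evaluating a learned function.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 2.0, 4.0, 6.0, 8.0]
pred = knn_regress(X, y, [2.0], k=3)  # mean of the targets at x = 1, 2, 3
```

This dependence on the stored training data at prediction time is exactly what RVMGP still shares with KNN, and what the paper's closing question asks whether a future genetic operator could avoid.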