Symbolic regression of discontinuous and multivariate functions by Hyper-Volume Error Separation (HVES)

FILLON, Cyril; BARTOLI, Alberto
2007-01-01

Abstract

Symbolic regression aims to discover mathematical expressions, in symbolic form, that fit a given sample of data points. Genetic programming (GP) is a powerful tool for this class of problems, but its effectiveness is still severely limited when the data sample requires different expressions in different regions of the input space, i.e., when the approximating function should be discontinuous. In this paper we present a new GP-based approach for symbolic regression of discontinuous functions in multivariate data-sets. We identify the portions of the input space that require different approximating functions by means of a new algorithm that we call hyper-volume error separation (HVES). To this end, we run a preliminary GP evolution and partition the input space based on the error exhibited by the best individual across the data-set. We then partition the data-set according to the partition of the input space and use each resulting subset to drive an independent, preliminary GP evolution. The populations resulting from these preliminary evolutions are finally merged and evolved again. We compare our approach to standard GP search and to a GP search for discontinuous functions in univariate data-sets. Our results show that coupling HVES with GP is an effective approach that provides significant accuracy improvements while requiring fewer computational resources.
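The overall flow described in the abstract (preliminary fit, error-based partition of the input space, one independent fit per region) can be pictured with a toy sketch. This is not the authors' HVES algorithm, whose partitioning rule the abstract does not specify. As loudly labeled assumptions: GP is replaced by a trivial constant model, the partition is a single axis-aligned split chosen with a regression-stump heuristic on the signed error, and the final "merge and evolve again" step is replaced by simply combining the two regional models. All function names (`fit_constant`, `best_split`, `piecewise_fit`) are hypothetical.

```python
from statistics import mean


def fit_constant(ys):
    """Stand-in for a GP run: the 'best individual' is just the mean target."""
    c = mean(ys)
    return lambda x: c


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, residuals):
    """Pick the (dimension, threshold) whose two sides have the most
    homogeneous residuals. A regression-stump heuristic, NOT the HVES rule."""
    best = None
    for d in range(len(X[0])):
        values = sorted(set(x[d] for x in X))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for x, r in zip(X, residuals) if x[d] < t]
            right = [r for x, r in zip(X, residuals) if x[d] >= t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, d, t)
    if best is None:
        raise ValueError("need at least two distinct values along some dimension")
    return best[1], best[2]


def piecewise_fit(X, ys):
    # 1. Preliminary fit on the whole data-set.
    prelim = fit_constant(ys)
    # 2. Signed error of the preliminary model at every data point.
    residuals = [y - prelim(x) for x, y in zip(X, ys)]
    # 3. Partition the input space based on that error.
    d, t = best_split(X, residuals)
    # 4. One independent fit per region of the data-set.
    left = fit_constant([y for x, y in zip(X, ys) if x[d] < t])
    right = fit_constant([y for x, y in zip(X, ys) if x[d] >= t])
    # 5. Combine the regional models. The paper instead merges the GP
    #    populations and evolves them again; that step is not modeled here.
    return d, t, lambda x: left(x) if x[d] < t else right(x)
```

Calling `piecewise_fit(X, ys)` on a list of input tuples `X` and targets `ys` returns the split dimension, the threshold, and a piecewise predictor. On data with a single jump along one input variable, the split should land on that jump, which is the behaviour the abstract attributes to the error-based partition.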
2007
ISBN: 978-1-4244-1339-3
ISBN: 978-1-4244-1340-9

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/1744059