GP for Continuous Control: Teacher or Learner? The Case of Simulated Modular Soft Robots

Medvet, Eric; Nadizar, Giorgia
2024-01-01

Abstract

We consider the problem of optimizing a controller for agents whose observation and action spaces are continuous, i.e., where the controller is a multivariate real function f: R^n → R^m. We use genetic programming (GP) to solve this optimization problem. Namely, we employ a multi-tree-based GP variant, where a candidate solution is an array of m trees, each encoding a real-valued function of the agent observation. We compare this form of optimization against the more common one where the controller is a multi-layer perceptron, with a predefined topology, whose weights are optimized through (neuro)evolution (NE). Moreover, we consider an evolutionary algorithm, GraphEA, that directly evolves graphs, each having n input nodes and m output nodes. We apply these three approaches to the case of simulated modular soft robots, where a robot is an aggregation of identical soft modules, each employing a controller that processes the local observation and produces the local action. We find that, in our scenario, multi-tree-based GP is competitive with NE and tends to produce different behaviors. We then experimentally investigate the possibility of optimizing a controller using another, pre-optimized controller as a teacher, i.e., we realize a form of offline imitation learning. We consider all the teacher-learner pairs resulting from the three evolutionary algorithms and find that NE is a better learner than GP and GraphEA. However, controllers obtained through offline imitation learning are far less effective than those obtained through direct evolution. We hypothesize that this gap in effectiveness may be explained by the possibility, given by direct evolution, of exploring a larger portion of the observation-action space during the simulations.
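
To make the multi-tree representation and the imitation setting more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical tuple-based tree encoding, a small illustrative operator set (`+`, `-`, `*`, `tanh`), and mean squared error as the measure of disagreement between learner and teacher; none of these choices is stated in the abstract.

```python
import math

# Hypothetical tree encoding (an assumption, not taken from the paper):
#   a number            -> constant
#   ("x", i)            -> i-th component of the observation
#   (op, child, ...)    -> operator applied to the evaluated children
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "tanh": math.tanh,
}


def eval_tree(tree, obs):
    """Evaluate one tree on an observation in R^n, returning a single real."""
    if isinstance(tree, (int, float)):
        return float(tree)
    head = tree[0]
    if head == "x":
        return obs[tree[1]]
    args = [eval_tree(child, obs) for child in tree[1:]]
    return OPS[head](*args)


def make_controller(trees):
    """Wrap an array of m trees as a controller f: R^n -> R^m."""
    def controller(obs):
        return [eval_tree(tree, obs) for tree in trees]
    return controller


def imitation_error(learner, teacher, observations):
    """Mean squared difference between learner and teacher actions
    over a fixed set of observations (assumed loss, for illustration)."""
    total, count = 0.0, 0
    for obs in observations:
        for a, b in zip(learner(obs), teacher(obs)):
            total += (a - b) ** 2
            count += 1
    return total / count


# Illustrative candidate with n = 3 inputs and m = 2 outputs.
trees = [
    ("tanh", ("+", ("x", 0), ("x", 1))),  # first action component
    ("*", 0.5, ("x", 2)),                 # second action component
]
controller = make_controller(trees)

# A stand-in teacher; in the paper the teacher is a pre-optimized controller.
def teacher(obs):
    return [0.0, obs[2]]

dataset = [[0.0, 0.0, 2.0], [0.0, 0.0, 4.0]]
error = imitation_error(controller, teacher, dataset)
```

In this sketch the dataset of observations is fixed in advance, which mirrors the offline flavor of imitation learning described above: the learner is only ever compared with the teacher on observations that were already collected.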
Files in this record:

2023-GPTP-GPForControl-TeachingLearning-VSRCase.pdf
Type: Final post-refereeing draft (post-print)
License: Publisher copyright
Access: under embargo until 18/02/2025
Size: 585.4 kB
Format: Adobe PDF

978-981-99-8413-8_11.pdf
Type: Publisher's version
License: Publisher copyright
Access: closed
Size: 663.87 kB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3070998