Combined First- and Second-Order Directions for Deep Neural Networks Training

Angeles Martinez Calomardo (first author); Mahsa Yousefi (last author)
2025-01-01

Abstract

In this work, we consider a novel stochastic optimization algorithm for solving the unconstrained, nonlinear, and non-convex optimization problems that arise in the training of deep neural networks. The new algorithm combines first- and second-order information: at each step, the computed search direction is a linear combination of a variance-reduced gradient and a stochastic limited-memory quasi-Newton direction. We report computational experiments showing the performance of the proposed optimizer in the training of a modern deep residual neural network for image classification tasks. The numerical results show that the proposed algorithm exhibits performance comparable or superior to that of the state-of-the-art Adam optimizer, without the agonizing pain of tuning its many hyperparameters.
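As a rough illustration of the search direction described above, the following Python sketch mixes a variance-reduced gradient with a limited-memory quasi-Newton direction on a toy least-squares problem. It is not the authors' implementation: the SVRG-style estimator, the L-BFGS two-loop recursion, the curvature pairs built from full gradients at snapshots, and the names `theta`, `lr`, and `memory` are all assumptions chosen for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for plain Python lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def grad_i(w, a, b):
    """Gradient of the single-sample loss 0.5 * (a.w - b)^2."""
    r = dot(a, w) - b
    return [r * ai for ai in a]

def full_grad(w, data):
    """Average of the single-sample gradients over the whole data set."""
    g = [0.0] * len(w)
    for a, b in data:
        g = axpy(1.0 / len(data), grad_i(w, a, b), g)
    return g

def loss(w, data):
    return sum(0.5 * (dot(a, w) - b) ** 2 for a, b in data) / len(data)

def svrg_grad(w, w_snap, mu_snap, sample):
    """Assumed SVRG-style estimate: g_i(w) - g_i(w_snap) + full_grad(w_snap)."""
    a, b = sample
    diff = axpy(-1.0, grad_i(w_snap, a, b), grad_i(w, a, b))
    return axpy(1.0, diff, mu_snap)

def lbfgs_apply(v, pairs):
    """L-BFGS two-loop recursion: approximate inverse Hessian times v."""
    q = list(v)
    alphas = []
    for s, y in reversed(pairs):          # newest pair first
        a = dot(s, q) / dot(y, s)
        alphas.append(a)
        q = axpy(-a, y, q)
    if pairs:                             # initial scaling from the newest pair
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
        q = [gamma * qi for qi in q]
    for (s, y), a in zip(pairs, reversed(alphas)):  # oldest pair first
        beta = dot(y, q) / dot(y, s)
        q = axpy(a - beta, s, q)
    return q

def combined_direction(v, pairs, theta):
    """d = -(theta * v + (1 - theta) * H v); theta is a hypothetical mixing weight."""
    hv = lbfgs_apply(v, pairs)
    return [-(theta * vi + (1.0 - theta) * hi) for vi, hi in zip(v, hv)]

def train(data, dim, epochs=30, lr=0.1, theta=0.5, memory=5, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs, prev = [], None
    for _ in range(epochs):
        # Snapshot point and its full gradient (used for variance reduction).
        w_snap, mu = list(w), full_grad(w, data)
        if prev is not None:
            # Curvature pair from consecutive snapshots (an assumption of this sketch).
            s = axpy(-1.0, prev[0], w_snap)
            y = axpy(-1.0, prev[1], mu)
            if dot(s, y) > 1e-12:         # keep only pairs with positive curvature
                pairs = (pairs + [(s, y)])[-memory:]
        prev = (w_snap, mu)
        for _ in range(len(data)):
            v = svrg_grad(w, w_snap, mu, rng.choice(data))
            w = axpy(lr, combined_direction(v, pairs, theta), w)
    return w
```

For example, `train([([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0)], dim=2)` moves the weights toward the least-squares solution `[1.0, 2.0]`. In this sketch, `theta=1.0` reduces to a plain variance-reduced gradient step, while `theta=0.0` uses only the quasi-Newton direction.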
Year: 2025
ISBN: 9783031812408, 9783031812415
Files in this item:
Numerical Computations_ Theory and Algorithms - 978-3-031-81241-5.pdf (publisher's version, publisher copyright, closed access; Adobe PDF, 1.58 MB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3104059