Combined First- and Second-Order Directions for Deep Neural Networks Training
Angeles Martinez Calomardo (first author); Mahsa Yousefi (last author)
2025-01-01
Abstract
In this work, we consider a novel stochastic optimization algorithm for the unconstrained, nonlinear, and non-convex optimization problems that arise in the training of deep neural networks. The algorithm combines first- and second-order information: at each step, the computed search direction is a linear combination of a variance-reduced gradient and a stochastic limited-memory quasi-Newton direction. We report computational experiments on the training of a modern deep residual neural network for image classification tasks. The numerical results show that the proposed algorithm performs comparably to or better than the state-of-the-art Adam optimizer, without the painful tuning of many hyperparameters.

File | Size | Format |
---|---|---|
Numerical Computations_ Theory and Algorithms - 978-3-031-81241-5.pdf (closed access; type: publisher's version; license: publisher's copyright) | 1.58 MB | Adobe PDF |
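
The search direction described in the abstract, a linear combination of a variance-reduced gradient and a limited-memory quasi-Newton direction, can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the SVRG-style estimator in `svrg_gradient`, the standard L-BFGS two-loop recursion in `lbfgs_direction`, the convex mixing weight `beta`, and all function names are hypothetical.

```python
# Illustrative sketch only. The SVRG-style estimator, the two-loop L-BFGS
# recursion, and the mixing weight `beta` are assumptions, not the paper's
# actual algorithm.

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha, x, y):
    """Return alpha * x + y for plain Python lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def svrg_gradient(grad_batch, grad_batch_snapshot, full_grad_snapshot):
    """SVRG-style estimate: g_B(w) - g_B(w_snap) + full_grad(w_snap)."""
    return [g - gs + mu for g, gs, mu
            in zip(grad_batch, grad_batch_snapshot, full_grad_snapshot)]


def lbfgs_direction(grad, pairs):
    """Return -H * grad via the two-loop recursion.

    `pairs` is a list of curvature pairs (s, y), ordered oldest to newest,
    each assumed to satisfy dot(s, y) > 0.
    """
    q = list(grad)
    alphas = []
    for s, y in reversed(pairs):              # newest to oldest
        rho = 1.0 / dot(y, s)
        a = rho * dot(s, q)
        alphas.append((a, rho))
        q = axpy(-a, y, q)
    # Initial Hessian approximation: scaled identity from the newest pair.
    gamma = dot(pairs[-1][0], pairs[-1][1]) / dot(pairs[-1][1], pairs[-1][1]) if pairs else 1.0
    r = [gamma * qi for qi in q]
    for (s, y), (a, rho) in zip(pairs, reversed(alphas)):   # oldest to newest
        b = rho * dot(y, r)
        r = axpy(a - b, s, r)
    return [-ri for ri in r]


def combined_direction(vr_grad, pairs, beta=0.5):
    """Mix steepest descent and quasi-Newton: beta * (-v) + (1 - beta) * (-H v).

    `beta` in [0, 1] is a hypothetical weight chosen for this sketch.
    """
    qn = lbfgs_direction(vr_grad, pairs)
    return [beta * (-v) + (1.0 - beta) * d for v, d in zip(vr_grad, qn)]
```

With an empty list of curvature pairs, `lbfgs_direction` falls back to the negative gradient, so `combined_direction` reduces to a variance-reduced gradient step for any `beta`. Setting `beta=0.0` gives a pure quasi-Newton direction, and `beta=1.0` gives a pure first-order direction.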
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.