An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA

Romina Soledad Molina (Writing – Original Draft Preparation); Iván René Morales (Visualization); Sergio Carrato (Writing – Review & Editing); Giovanni Ramponi (Supervision)

2023-01-01

Abstract

Machine learning models have demonstrated discriminative and representative learning capabilities over a wide range of applications, albeit at the cost of high computational complexity. Due to their parallel processing capabilities, reconfigurability, and low power consumption, Systems on Chip based on a Field Programmable Gate Array (SoC/FPGA) have been adopted to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, which calls for optimal use of their computation and storage resources during ML-based inference. Consequently, mapping a Deep Neural Network (DNN) architecture to an SoC/FPGA requires compression strategies to obtain a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This paper presents an efficient end-to-end workflow for deploying DNNs on an SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques.
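To illustrate the hyperparameter-tuning idea at the core of the workflow, the sketch below runs a minimal Bayesian-optimization loop over a single compression hyperparameter (pruning sparsity). This is not the paper's implementation: the closed-form objective is a stand-in for what would, in practice, be the accuracy/footprint score of a compressed DNN evaluated on the target SoC/FPGA, and the Gaussian-process surrogate with a UCB acquisition is just one common way to realize Bayesian optimization.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Placeholder objective: accuracy degrades as pruning sparsity s grows,
# while a denser model incurs a larger resource-footprint penalty.
# In the real workflow this would train/evaluate a compressed DNN.
def objective(s):
    accuracy = 0.95 - 0.4 * s**2
    footprint_penalty = 0.3 * (1.0 - s)
    return accuracy - footprint_penalty

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(3, 1))          # initial random evaluations
y = np.array([objective(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
candidates = np.linspace(0.0, 1.0, 201).reshape(-1, 1)

for _ in range(10):
    gp.fit(X, y)                                 # fit surrogate to observations
    mu, sigma = gp.predict(candidates, return_std=True)
    ucb = mu + 1.96 * sigma                      # upper-confidence-bound acquisition
    x_next = candidates[np.argmax(ucb)]          # most promising candidate
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

best = X[np.argmax(y), 0]
print(f"best sparsity ~ {best:.2f}, score = {y.max():.3f}")
```

The same loop generalizes to several hyperparameters at once (bit-widths for quantization, layer sizes, sparsity levels), with each evaluation replaced by training a candidate model and measuring its accuracy and resource usage.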
Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3067459