
An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA

Romina Soledad Molina (Writing – Original Draft Preparation); Iván René Morales (Visualization); Sergio Carrato (Writing – Review & Editing); Giovanni Ramponi (Supervision)
2024-01-01

Abstract

Machine learning (ML) models have demonstrated discriminative and representation learning capabilities over a wide range of applications, albeit at the cost of high computational complexity. Thanks to their parallel processing capabilities, reconfigurability, and low power consumption, Systems on Chip based on a Field Programmable Gate Array (SoC/FPGA) have been used to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, so the computation and storage operations involved in ML-based inference must make optimal use of the available hardware. Consequently, mapping a Deep Neural Network (DNN) architecture to an SoC/FPGA requires compression strategies that yield a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This paper presents an efficient end-to-end workflow for deploying DNNs on an SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques.
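The core idea of the workflow, jointly searching DNN hyperparameters and compression settings with Bayesian optimization, can be sketched in a few lines. The Python fragment below is a minimal illustration and not the paper's implementation: it assumes scikit-optimize for the Gaussian-process search, and the train_and_evaluate helper, the search knobs (quantization bit width, pruning sparsity, layer-width multiplier), and the accuracy/footprint weighting are hypothetical stand-ins.

from skopt import gp_minimize
from skopt.space import Integer, Real

def objective(params):
    # Hypothetical helper (not from the paper): trains a candidate
    # compressed DNN and returns test accuracy and the on-chip memory
    # footprint of its weights, in kilobytes.
    bit_width, sparsity, width_mult = params
    acc, kbytes = train_and_evaluate(bit_width, sparsity, width_mult)
    # Scalar to minimize: misclassification plus a footprint penalty
    # (the 0.01 weight is an illustrative assumption).
    return (1.0 - acc) + 0.01 * kbytes

search_space = [
    Integer(2, 16, name="bit_width"),    # fixed-point quantization width
    Real(0.0, 0.9, name="sparsity"),     # fraction of weights pruned
    Real(0.25, 1.0, name="width_mult"),  # layer-width scaling factor
]

# Gaussian-process Bayesian optimization over the joint search space.
result = gp_minimize(objective, search_space, n_calls=30, random_state=0)
print("best configuration:", result.x, "best score:", result.fun)

In such a setup each evaluation is a full train-compress-measure cycle, so the sample efficiency of Bayesian optimization matters far more than in a conventional hyperparameter search.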
Year: 2024
Published: 14 Dec 2023
Files in this item:
File: An_End-to-End_Workflow_to_Efficiently_Compress_and_Deploy_DNN_Classifiers_on_SoC_FPGA.pdf
Access: Closed access
Type: Publisher's version
License: Publisher copyright
Size: 1.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11368/3067459
Citations
  • PMC: not available
  • Scopus: 1
  • Web of Science: 1