An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA
Romina Soledad Molina (Writing – Original Draft Preparation); Iván René Morales (Visualization); Sergio Carrato (Writing – Review & Editing); Giovanni Ramponi (Supervision)
2024-01-01
Abstract
Machine learning (ML) models have demonstrated discriminative and representation-learning capabilities over a wide range of applications, albeit at the cost of high computational complexity. Owing to their parallel processing capabilities, reconfigurability, and low power consumption, Systems on Chip based on a Field Programmable Gate Array (SoC/FPGA) have been used to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, which requires optimal use of the technology for the computation and storage operations involved in ML-based inference. Consequently, mapping a Deep Neural Network (DNN) architecture to a SoC/FPGA requires compression strategies to obtain a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This paper presents an efficient end-to-end workflow for deploying DNNs on a SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques.
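As a hedged illustration of the abstract's core idea (not the paper's actual toolchain, which the record does not specify), the sketch below couples Bayesian hyperparameter optimization, via Optuna's TPE sampler, with a simulated post-training weight quantization step, and scores each trial on a composite of accuracy and a footprint penalty as a stand-in for the accuracy/memory/latency trade-off. The dataset, model, bit-width range, and penalty weight are all illustrative assumptions.

```python
# Minimal sketch: Bayesian (TPE) hyperparameter search over a small Keras MLP,
# with a fake-quantization pass standing in for the compression ensemble.
import numpy as np
import optuna
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Small stand-in dataset (8x8 digit images flattened to 64 features).
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X / 16.0, y, test_size=0.2, random_state=0)

def fake_quantize(model, bits):
    """Round every weight to a symmetric `bits`-bit grid (simulated compression)."""
    for layer in model.layers:
        quantized = []
        for w in layer.get_weights():
            scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1) + 1e-12
            quantized.append(np.round(w / scale) * scale)
        layer.set_weights(quantized)

def objective(trial):
    # Hyperparameters explored by the Bayesian sampler (illustrative ranges).
    units = trial.suggest_int("units", 8, 128, log=True)
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    bits = trial.suggest_int("weight_bits", 2, 8)

    model = keras.Sequential([
        keras.layers.Input(shape=(64,)),
        keras.layers.Dense(units, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(X_tr, y_tr, epochs=10, batch_size=64, verbose=0)

    fake_quantize(model, bits)  # compress first, then score the compressed model
    _, acc = model.evaluate(X_te, y_te, verbose=0)

    # Composite score: reward accuracy, penalize footprint (bits x parameters).
    return acc - 1e-7 * bits * model.count_params()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=25)
print("best:", study.best_params, study.best_value)
```

In the paper's actual setting the objective would also reflect on-device inference time and FPGA resource utilization; parameter count times bit width is only a crude software-side proxy for memory footprint.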
File | Type | License | Size | Format | Access
---|---|---|---|---|---
An_End-to-End_Workflow_to_Efficiently_Compress_and_Deploy_DNN_Classifiers_on_SoC_FPGA.pdf | Publisher's version | Publisher copyright | 1.3 MB | Adobe PDF | Closed access