An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA
Romina Soledad Molina (Writing – Original Draft Preparation); Iván René Morales (Visualization); Sergio Carrato (Writing – Review & Editing); Giovanni Ramponi (Supervision)
2023-01-01
Abstract
Machine learning models have demonstrated discriminative and representative learning capabilities over a wide range of applications, albeit at the cost of high computational complexity. Due to their parallel processing capabilities, reconfigurability, and low power consumption, Systems on Chip based on a Field Programmable Gate Array (SoC/FPGA) have been adopted to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, so the computation and storage operations involved in ML-based inference must make optimal use of the available technology. Consequently, mapping a Deep Neural Network (DNN) architecture to a SoC/FPGA requires compression strategies to obtain a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This paper presents an efficient end-to-end workflow for deploying DNNs on an SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques.
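The trade-off the abstract describes between effectiveness and memory footprint can be illustrated with a toy sketch (not the paper's actual implementation): uniform post-training quantization of a weight tensor to n bits, reporting the reconstruction error against the reduction relative to a float32 baseline. The weight distribution and bit-widths here are illustrative assumptions only.

```python
# Toy illustration of the accuracy/footprint trade-off driving DNN
# compression for SoC/FPGA deployment: uniform n-bit quantization.
import random

def quantize(weights, n_bits):
    """Uniformly quantize weights to 2**n_bits levels over their range."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** n_bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
w = [random.gauss(0.0, 1.0) for _ in range(1024)]  # stand-in weight tensor

for n_bits in (8, 4, 2):
    err = mse(w, quantize(w, n_bits))
    footprint = n_bits / 32  # fraction of the float32 baseline
    print(f"{n_bits}-bit: MSE={err:.5f}, footprint={footprint:.2%} of float32")
```

Lower bit-widths shrink the footprint linearly while the quantization error grows, which is why the workflow searches for the operating point rather than fixing one a priori.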