An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA
Romina Soledad Molina (Writing – Original Draft Preparation); Iván René Morales (Visualization); Sergio Carrato (Writing – Review & Editing); Giovanni Ramponi (Supervision)
2023-01-01
Abstract
Machine learning models have demonstrated discriminative and representative learning capabilities over a wide range of applications, albeit at the cost of high computational complexity. Due to their parallel processing capabilities, reconfigurability, and low power consumption, Systems on Chip based on a Field Programmable Gate Array (SoC/FPGA) have been adopted to address this challenge. Nevertheless, SoC/FPGA devices are resource-constrained, so the computation and storage operations involved in ML-based inference must make optimal use of the available technology. Consequently, mapping a Deep Neural Network (DNN) architecture to a SoC/FPGA requires compression strategies to obtain a hardware design with a good compromise between effectiveness, memory footprint, and inference time. This paper presents an efficient end-to-end workflow for deploying DNNs on an SoC/FPGA by integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques.
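The trade-off the abstract describes between effectiveness and memory footprint can be illustrated with a toy sketch (not the paper's actual implementation): uniform post-training quantization of a weight tensor to n bits, reporting the reconstruction error against the reduction relative to a float32 baseline. The weight distribution and bit-widths here are illustrative assumptions only.

```python
# Toy illustration of the accuracy/footprint trade-off driving DNN
# compression for SoC/FPGA deployment: uniform n-bit quantization.
import random

def quantize(weights, n_bits):
    """Uniformly quantize weights to 2**n_bits levels over their range."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** n_bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
w = [random.gauss(0.0, 1.0) for _ in range(1024)]  # stand-in weight tensor

for n_bits in (8, 4, 2):
    err = mse(w, quantize(w, n_bits))
    footprint = n_bits / 32  # fraction of the float32 baseline
    print(f"{n_bits}-bit: MSE={err:.5f}, footprint={footprint:.2%} of float32")
```

Lower bit-widths shrink the footprint linearly while the quantization error grows, which is why the workflow searches for the operating point rather than fixing one a priori.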