
Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs

Romina Molina
2022-01-01

Abstract

We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks, we investigate binning and quantization techniques to reduce the memory occupation of the learned model, and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. The results of experiments conducted on publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly impact accuracy. Moreover, the reduced space requirements allow the model and the logic to be replicated on the FPGA device so that several inference tasks can be executed in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The experimental results show that our FPGA solution achieves state-of-the-art performance and consumes between 9x and 19.8x less energy than an equivalent multi-threaded CPU implementation.
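The binning and quantization idea mentioned in the abstract can be illustrated with a minimal sketch. The assumption here (function names and parameters are illustrative, not taken from the paper) is the standard approach for tree ensembles: per feature, the distinct split thresholds used across the ensemble are reduced to a small set of bin edges, and both thresholds and incoming feature values are mapped to small integer bin ids, so node comparisons become compact integer compares and the stored model shrinks. Binning is lossy when the edge set is capped, which is why the paper evaluates its impact on accuracy.

```python
import numpy as np

def build_bins(thresholds, n_bins=255):
    """Reduce a feature's split thresholds to at most n_bins edges.

    With n_bins <= 255 the resulting bin ids (0..n_bins) fit in uint8,
    so each stored threshold occupies a single byte instead of a float.
    """
    uniq = np.unique(np.asarray(thresholds, dtype=np.float64))
    if len(uniq) > n_bins:
        # Uniformly subsample the sorted set of distinct thresholds (lossy).
        idx = np.linspace(0, len(uniq) - 1, n_bins).round().astype(int)
        uniq = uniq[idx]
    return uniq  # ascending bin edges

def quantize(values, edges):
    """Map raw feature values to integer bin ids.

    bin id = number of edges <= value, so for any value v and edge t:
    v <= t implies quantize(v) <= quantize(t), and the original
    float comparison is replaced by a small-integer comparison.
    """
    return np.searchsorted(edges, np.asarray(values), side="right").astype(np.uint8)
```

At inference time each feature value is binned once with `quantize`, after which every tree traversal works purely on uint8 ids; this is the kind of representation that makes it feasible to replicate the model in FPGA block RAM, as the abstract describes.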
Files for this record:
  • Editorial version (description: cover, index, chapter) — Adobe PDF, 481.82 kB, closed access, publisher copyright.
  • Final post-refereeing draft (post-print) — Adobe PDF, 899.48 kB, open access since 06/04/2023, Digital Rights Management not defined.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3034438
Citations
  • Scopus: 3
  • Web of Science: 2