A Feature Distillation Network to Enable Object Detection on an FPGA Platform in Poor Visibility Conditions / Bhattacharya, J., Molina, R., Crespo, M.L., Carini, A., Marsi, S., Ramponi, G. - In: ELECTRONICS. - ISSN 2079-9292. - 15:11(2026), pp. 1-26. [10.3390/electronics15112454]
A Feature Distillation Network to Enable Object Detection on an FPGA Platform in Poor Visibility Conditions
Bhattacharya, Jhilik; Molina, Romina; Carini, Alberto; Marsi, Stefano; Ramponi, Giovanni
2026-01-01
Abstract
In this paper, we propose and evaluate a feature distillation technique for object detection under poor visibility conditions, and we analyze its impact when deployed on an FPGA platform. Through extensive experiments, we show how different detection architectures generalize across scenes, and we infer that scale-permuted feature extraction is the most suitable choice for detection tasks in unconstrained environments, with an 11–12% gain. The experiments also show that image enhancement often fails to provide significant detection gains. We therefore introduce joint training of a scale-permuted student network that learns dehazed features from a dual teacher network without an explicit dehazing step. Using attention transfer, the student learns to replicate not only the teacher outputs but also the teacher's decision-making process. The overall goal is a real-time system capable of providing driving assistance in challenging scenarios, and the FPGA implementation of a scale-permuted network presented here is the first of its kind. To implement the model effectively on an FPGA, we employ a high-level synthesis approach and model compression techniques, obtaining a deployment with a good trade-off between quality and memory footprint. We develop two distilled models using the joint feature distillation technique and show that they perform better in poor visibility scenes than other detectors of similar size and, in some cases, larger ones. On the Cityscapes Hazy dataset, our 8.5 M model shows an mAP gain of almost 1% over YOLOv10-M with 15 M parameters. On night images from the BDD dataset, our 8.5 M model shows an mAP gain of approximately 4% over YOLO26-S with 9.5 M parameters. We further perform cross-domain testing on the DriveIndia dataset to show that our models generalize well beyond the distillation distribution and can be used for generic driving scenarios.
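The attention transfer mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used activation-based formulation (channel-wise sum of squared activations, L2-normalized, compared with an L2 distance); the abstract does not state the exact loss, so this is not necessarily the authors' formulation. The function names, the nested-list feature layout, and the `weight_a` parameter for combining the two teachers are all hypothetical and chosen only for illustration.

```python
import math


def attention_map(features):
    """Collapse a C x H x W feature map (nested lists) into a unit-norm spatial vector."""
    height, width = len(features[0]), len(features[0][0])
    # Sum of squared activations over channels at each spatial location.
    flat = [
        sum(channel[r][c] ** 2 for channel in features)
        for r in range(height)
        for c in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0  # guard against all-zero maps
    return [v / norm for v in flat]


def attention_transfer_loss(student, teacher):
    """L2 distance between the normalized student and teacher attention maps.

    Channel counts may differ; spatial sizes are assumed to match.
    """
    s, t = attention_map(student), attention_map(teacher)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def dual_teacher_loss(student, teacher_a, teacher_b, weight_a=0.5):
    """Hypothetical weighted sum over two teachers (the weighting is an assumption)."""
    return (weight_a * attention_transfer_loss(student, teacher_a)
            + (1.0 - weight_a) * attention_transfer_loss(student, teacher_b))


# Toy 2 x 2 feature maps: the student has two channels, each teacher has one.
student = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]
teacher_same = [[[5.0, 0.0], [0.0, 0.0]]]   # attends to the same location
teacher_other = [[[0.0, 0.0], [0.0, 3.0]]]  # attends to a different location
```

Because the maps are normalized, the loss depends on where a network attends rather than on the magnitude of its activations, which is why a student with fewer or more channels than a teacher can still be compared against it. In a real training setup, this term would typically be added to the detection loss at one or more intermediate layers.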


