RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

CEM-YOLO: multi-branch residual feature fusion and convolutional maxpooling downsampling for real-time vehicle detection in night scenarios

Liu, Li-Juan; Jia, Rushi; Karimi, Hamid Reza

2025-01-01

Abstract

Real-time vehicle detection at night faces significant challenges: ineffective low-light feature extraction, low computational efficiency hampering real-time performance, subpar detection of small and occluded vehicles, and limited cross-scenario generalization. To address these issues, this paper presents CEM-YOLO, an enhanced YOLO algorithm leveraging deep learning for nighttime vehicle detection. Two novel modules, Convolutional Maxpooling Downsampling and Multi-branch Residual Feature Fusion, are introduced to mitigate model complexity, reduce feature redundancy, and safeguard input features. Additionally, the Efficient Multi-Scale Attention Module is integrated into the Neck network’s detection layers. Extensive experiments and ablation studies on benchmark datasets demonstrate that CEM-YOLO excels in nighttime scenarios, achieving an optimal speed-accuracy balance for real-time applications.

Publication year: 2025
Journal title: Signal, Image and Video Processing
Keywords: Convolutional Maxpooling Downsampling; Deep learning; MobileViT; Multi-branch Residual Feature Fusion; Real-time vehicle detection
Appears in types: 01.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1310790
