A real-time surface defect detection model based on adaptive feature information selection and fusion
Karimi, Hamid Reza
2026-01-01
Abstract
In contemporary computer vision, You Only Look Once (YOLO) has become a benchmark for object detection, widely used in domains ranging from intelligent manufacturing (such as industrial quality control and automated inspection) to real-time video surveillance. For example, detecting surface defects on steel products or electronic components on production lines relies on such algorithms to maintain quality and safety. Despite YOLO's excellent speed and accuracy in many tasks, it still struggles under certain challenging conditions, notably high-dynamic-range scenes, complex backgrounds, and the detection of small or subtle objects. These conditions are common in practice, for instance on shiny metal surfaces with uneven lighting or in busy surveillance scenes, where conventional YOLO models fail to capture fine details reliably. To overcome these limitations, we propose an improved YOLO-based framework featuring a novel Dynamic Cross-Scale Feature Fusion Module (Dy-CCFM) and a Dual-path Downsampling Convolution Module (DDConv). These modules enhance multi-scale feature representation and preserve detail under extreme lighting and background clutter, which is crucial for monitoring in complex environments. Additionally, we employ the Minimum Point Distance Intersection over Union (MPDIoU) as an optimized loss function for bounding box regression, significantly improving the localization of small objects. Thanks to these innovations, the model achieves a mean Average Precision (mAP) of 75.1% on the challenging Northeastern University surface defect (NEU-DET) dataset, while the smallest variant is only 1.6M in size. Compared to YOLOv8, our approach improves mAP by 2.1% while also delivering a higher inference speed (FPS), and it surpasses the Detection Transformer (DETR) by 5.0% mAP. The model further demonstrates excellent generalization on the Google Cloud 10 Defect Detection (GC10-DET) dataset.
This enhanced detection algorithm not only improves performance but also offers significant practical value in intelligent manufacturing and automated inspection systems, intelligent video surveillance, and autonomous vehicles, where reliable real-time detection of small defects or targets is critical.
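The MPDIoU loss used for bounding box regression can be sketched as follows. This is a minimal illustration of the published MPDIoU formulation (standard IoU penalized by the squared distances between corresponding top-left and bottom-right corners, normalized by the squared image dimensions), not the authors' exact implementation; the box format (x1, y1, x2, y2) and the helper name `mpdiou_loss` are assumptions for illustration.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """MPDIoU loss sketch for axis-aligned boxes given as (x1, y1, x2, y2).

    Returns 1 - MPDIoU, where MPDIoU = IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2),
    d1/d2 being the distances between the top-left / bottom-right corners
    of the predicted and ground-truth boxes, and (w, h) the image size.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Intersection area (zero if the boxes do not overlap).
    ix1, iy1 = max(px1, gx1), max(py1, gy1)
    ix2, iy2 = min(px2, gx2), min(py2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    # Union area and plain IoU.
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner-point distances, normalized by the squared image diagonal terms.
    d1 = (px1 - gx1) ** 2 + (py1 - gy1) ** 2  # top-left corners
    d2 = (px2 - gx2) ** 2 + (py2 - gy2) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1 / norm - d2 / norm)
```

Note that, unlike plain IoU loss, the corner-distance terms keep the gradient informative even when the predicted and ground-truth boxes do not overlap at all, which is what helps with the localization of small objects.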


