STSI-YOLOv5: Improved YOLOv5 Combining Slicing Training and Slicing Inference for Dense Workpiece Detection in Industrial Scenes
Quan, Hao
2025-01-01
Abstract
Deep learning-based object detection models have achieved significant progress on natural scene tasks. In industrial environments, however, challenges such as densely stacked workpieces, blurred boundaries, and complex backgrounds make it difficult for existing general-purpose detection models to perform effectively on dense workpiece detection. To address this issue, we propose a dense workpiece detection framework for industrial scenarios, called improved YOLOv5 combining slicing training and slicing inference (STSI-YOLOv5). Specifically, we design a selective channel attention module (SCAM), integrated into the neck of YOLOv5, which enhances feature aggregation by dynamically selecting relevant channel information and suppressing redundant channels, thereby improving detection performance against complex backgrounds. In addition, STSI-YOLOv5 introduces the slicing training and slicing inference (STSI) strategy, which divides the input image into uniformly sized patches during both training and inference to strengthen the representation of local features, helping the model determine workpiece boundaries more accurately and improving localization in dense scenes. Correspondingly, for slicing inference in this setting, we propose a novel postprocessing method, area-based non-maximum suppression (A-NMS), which ranks detection boxes by area before suppression, selecting boxes that better match the target characteristics. Finally, the proposed method was evaluated on the WPCD dataset, with results showing its superiority over existing state-of-the-art methods.
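The STSI strategy described above tiles each image into uniformly sized patches for both training and inference. A minimal sketch of such tiling is shown below; the patch size (640) and overlap ratio (0.2) are illustrative assumptions, not values taken from the paper, and the edge-alignment rule (shifting the last patch back so every tile keeps the same size) is likewise an assumption.

```python
def slice_coords(size, patch, step):
    """1-D start coordinates so that every patch has identical size
    and the image edge is still covered (step < patch gives overlap)."""
    starts = list(range(0, max(size - patch, 0) + 1, step))
    if starts[-1] + patch < size:
        # Shift the final patch back to the edge to keep a uniform size.
        starts.append(size - patch)
    return starts

def slice_image(img_w, img_h, patch=640, overlap=0.2):
    """Return (x1, y1, x2, y2) windows tiling the image.
    patch/overlap values are assumptions for illustration only."""
    step = int(patch * (1 - overlap))
    tiles = []
    for y in slice_coords(img_h, patch, step):
        for x in slice_coords(img_w, patch, step):
            tiles.append((x, y, x + patch, y + patch))
    return tiles
```

For a 1000x640 image this yields two overlapping 640x640 tiles along the width; every tile has the full patch size, so the network always sees consistently sized inputs.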

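The A-NMS postprocessing ranks detection boxes by area before suppression, rather than by confidence score as in standard NMS. The sketch below illustrates that idea only; the abstract does not specify the ranking direction or the suppression criterion, so descending area order and a plain IoU threshold are assumptions here, not the paper's exact procedure.

```python
import numpy as np

def area_nms(boxes, iou_thresh=0.5):
    """Area-ranked NMS sketch. boxes: [[x1, y1, x2, y2], ...].

    Assumptions (not specified in the abstract): boxes are ranked by
    descending area, and overlapping boxes are suppressed by IoU.
    """
    boxes = np.asarray(boxes, dtype=float)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = areas.argsort()[::-1]  # largest-area box first (assumption)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IoU of the top-ranked box with the remaining candidates.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]
    return keep
```

With two heavily overlapping boxes and one distant box, the smaller overlapping box is suppressed while the other two survive, which matches the abstract's goal of keeping boxes whose extent matches the target.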

