Zoom-Anomaly: Multimodal vision-language fusion industrial anomaly detection with synthetic data
Li, Jiaqi; Karimi, Hamid Reza
2026-01-01
Abstract
This paper proposes a multimodal cross-domain anomaly detection framework that leverages synthetic anomalies as auxiliary data. To overcome the cold-start challenge overlooked by existing methods, we design the Synthetic Anomaly Module (SyAM), which embeds potential anomaly patterns into the high-frequency regions of normal samples during forward diffusion, enabling realistic anomaly generation while preserving low-frequency distributions. For detection, a pyramid graph network extracts multi-scale topological features, and a zooming mechanism establishes cross-scale correlations to enhance anomaly localization. Detection is performed via vision-text matching. Experimental results show that the proposed model, Zoom-Anomaly, achieves high accuracy when trained solely on synthetic anomalies and demonstrates robust performance on the MVTec AD, VisA, and PV_actual AD datasets, confirming its effectiveness in real-world industrial environments.
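The frequency-domain idea attributed to SyAM in the abstract (adding anomaly content only to high frequencies so that the low-frequency distribution of the normal sample is preserved) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it omits the forward diffusion process entirely, and the function name `inject_high_frequency`, the radial `cutoff`, and the `strength` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def inject_high_frequency(normal, pattern, cutoff=0.25, strength=1.0):
    """Add `pattern` to `normal` only above a radial frequency cutoff.

    Hypothetical illustration, not the paper's SyAM: both inputs are 2-D
    grayscale arrays of equal shape, and `cutoff` is in cycles per pixel.
    """
    if normal.ndim != 2 or normal.shape != pattern.shape:
        raise ValueError("normal and pattern must be 2-D arrays of equal shape")

    h, w = normal.shape
    # Frequency coordinates of each FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Boolean mask selecting the high-frequency bins only.
    high_mask = radius > cutoff

    spec_normal = np.fft.fft2(normal)
    spec_pattern = np.fft.fft2(pattern)

    # Low-frequency bins of the normal sample are left untouched;
    # the anomaly pattern contributes only where the mask is True.
    mixed = spec_normal + strength * high_mask * spec_pattern

    # The mask is symmetric in frequency, so the result is real up to
    # floating-point rounding; discard the negligible imaginary part.
    return np.real(np.fft.ifft2(mixed))
```

Because the mask excludes the zero-frequency bin, the mean intensity of the output equals that of the normal input, which is one simple way to check that low-frequency statistics are preserved under these assumptions.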


