Architecture-aware minimization (A2M): how to find flat minima in neural architecture search

M. Gambella; M. Roveri; F. Pittorino
2025-01-01

Abstract

Neural architecture search (NAS) has become an essential tool for designing effective and efficient neural networks. In this paper, we investigate the geometric properties of neural architecture spaces commonly used in differentiable NAS methods, specifically NAS-Bench-201 and differentiable architecture search (DARTS). By introducing notions of flatness in architecture space, such as neighborhoods and accuracy barriers along paths, we reveal locality and flatness characteristics analogous to the well-known properties of neural network loss landscapes in weight space. In particular, we uncover the detailed geometric structure of the architecture search landscape: there are no barriers between well-performing architectures, which cluster together in flat regions, while suboptimal architectures remain isolated, separated by higher barriers. Building on these insights, we propose architecture-aware minimization (A2M), a novel analytically derived algorithmic framework that, for the first time, explicitly biases the gradient of differentiable NAS methods towards flat minima in architecture space. A2M consistently improves generalization over state-of-the-art DARTS-based algorithms on benchmark datasets including CIFAR-10, CIFAR-100, and ImageNet-16-120, across both the NAS-Bench-201 and DARTS search spaces. Notably, A2M increases test accuracy, on average across different differentiable NAS methods, by +3.60% on CIFAR-10, +4.60% on CIFAR-100, and +3.64% on ImageNet-16-120, while finding architectures with low accuracy barriers. A2M can be easily integrated into existing differentiable NAS frameworks, offering a versatile tool for future research and applications in automated machine learning. We will open-source our code at https://github.com/AI-Tech-Research-Lab/AsquaredM.
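The abstract describes A2M as biasing the gradient of differentiable NAS towards flat minima in architecture space, in the spirit of sharpness-aware minimization (SAM). As a rough illustration of that idea only (this is not the authors' A2M implementation; the function name, the SAM-style update rule, and the hyperparameters rho and lr are assumptions made for this sketch), a flatness-biased step on the DARTS architecture parameters alpha might look like:

```python
# Hypothetical sketch: a SAM-style update applied to DARTS
# architecture parameters (alpha), illustrating a flatness bias
# in architecture space. Names and update rule are assumptions,
# not the released A2M code.
import torch

def sam_step_on_alpha(alpha, arch_loss_fn, rho=0.05, lr=3e-4):
    """One flatness-biased update of the architecture parameters.

    alpha        : tensor of architecture mixing weights (requires_grad=True)
    arch_loss_fn : callable mapping alpha -> validation loss (scalar tensor)
    rho          : radius of the adversarial perturbation in alpha-space
    lr           : learning rate for the final descent step
    """
    # 1) Gradient of the search loss at the current alpha.
    loss = arch_loss_fn(alpha)
    grad = torch.autograd.grad(loss, alpha)[0]

    # 2) Ascend to the worst-case alpha inside an L2 ball of radius rho.
    eps = rho * grad / (grad.norm() + 1e-12)
    alpha_adv = (alpha + eps).detach().requires_grad_(True)

    # 3) The gradient at the perturbed point gives a flatness-aware
    #    direction; descend from the original alpha along it.
    loss_adv = arch_loss_fn(alpha_adv)
    grad_adv = torch.autograd.grad(loss_adv, alpha_adv)[0]
    with torch.no_grad():
        alpha -= lr * grad_adv
    return loss_adv.item()

# Toy usage with a quadratic stand-in for the validation loss:
# alpha = torch.zeros(8, requires_grad=True)
# sam_step_on_alpha(alpha, lambda a: (a - 1).pow(2).sum())
```

In an actual differentiable NAS loop, arch_loss_fn would evaluate the supernet's validation loss as a function of alpha; the sketch keeps it abstract so the perturb-then-descend structure is visible.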
Neural Architecture Search (NAS), Differentiable NAS, Sharpness-Aware Minimization, Loss landscapes, Flatness
Files in this record:

File: Gambella_2025_Mach._Learn.__Sci._Technol._6_035016 (1).pdf
Description: Publisher's version
Access: Restricted
Size: 3.73 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1295436
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science (ISI): 0