Architecture-aware minimization (A2M): how to find flat minima in neural architecture search

M. Gambella; M. Roveri; F. Pittorino
2025-01-01

Abstract

Neural architecture search (NAS) has become an essential tool for designing effective and efficient neural networks. In this paper, we investigate the geometric properties of neural architecture spaces commonly used in differentiable NAS methods, specifically NAS-Bench-201 and differentiable architecture search (DARTS). By introducing notions of flatness in architecture space, such as neighborhoods and accuracy barriers along paths, we reveal locality and flatness characteristics analogous to the well-known properties of neural network loss landscapes in weight space. In particular, we uncover the detailed geometric structure of the architecture search landscape: there are no barriers between well-performing architectures, which cluster together in flat regions, while suboptimal architectures remain isolated, separated by higher barriers. Building on these insights, we propose architecture-aware minimization (A2M), a novel analytically derived algorithmic framework that, for the first time, explicitly biases the gradient of differentiable NAS methods towards flat minima in architecture space. A2M consistently improves generalization over state-of-the-art DARTS-based algorithms on benchmark datasets including CIFAR-10, CIFAR-100, and ImageNet-16-120, across both the NAS-Bench-201 and DARTS search spaces. Notably, A2M increases test accuracy, on average across different differentiable NAS methods, by +3.60% on CIFAR-10, +4.60% on CIFAR-100, and +3.64% on ImageNet-16-120, while finding architectures with low accuracy barriers. A2M can be easily integrated into existing differentiable NAS frameworks, offering a versatile tool for future research and applications in automated machine learning. We will open-source our code at https://github.com/AI-Tech-Research-Lab/AsquaredM.
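The abstract describes A2M as biasing the gradient of differentiable NAS towards flat minima in architecture space, in the spirit of sharpness-aware minimization (SAM). As a rough illustration of that idea only (this is not the authors' A2M implementation; the function name, the SAM-style update rule, and the hyperparameters rho and lr are assumptions made for this sketch), a flatness-biased step on the DARTS architecture parameters alpha might look like:

```python
# Hypothetical sketch: a SAM-style update applied to DARTS
# architecture parameters (alpha), illustrating a flatness bias
# in architecture space. Names and update rule are assumptions,
# not the released A2M code.
import torch

def sam_step_on_alpha(alpha, arch_loss_fn, rho=0.05, lr=3e-4):
    """One flatness-biased update of the architecture parameters.

    alpha        : tensor of architecture mixing weights (requires_grad=True)
    arch_loss_fn : callable mapping alpha -> validation loss (scalar tensor)
    rho          : radius of the adversarial perturbation in alpha-space
    lr           : learning rate for the final descent step
    """
    # 1) Gradient of the search loss at the current alpha.
    loss = arch_loss_fn(alpha)
    grad = torch.autograd.grad(loss, alpha)[0]

    # 2) Ascend to the worst-case alpha inside an L2 ball of radius rho.
    eps = rho * grad / (grad.norm() + 1e-12)
    alpha_adv = (alpha + eps).detach().requires_grad_(True)

    # 3) The gradient at the perturbed point gives a flatness-aware
    #    direction; descend from the original alpha along it.
    loss_adv = arch_loss_fn(alpha_adv)
    grad_adv = torch.autograd.grad(loss_adv, alpha_adv)[0]
    with torch.no_grad():
        alpha -= lr * grad_adv
    return loss_adv.item()

# Toy usage with a quadratic stand-in for the validation loss:
# alpha = torch.zeros(8, requires_grad=True)
# sam_step_on_alpha(alpha, lambda a: (a - 1).pow(2).sum())
```

In an actual differentiable NAS loop, arch_loss_fn would evaluate the supernet's validation loss as a function of alpha; the sketch keeps it abstract so the perturb-then-descend structure is visible.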
Neural Architecture Search (NAS), Differentiable NAS, Sharpness-Aware Minimization, Loss landscapes, Flatness
Files in this record:

File: Gambella_2025_Mach._Learn.__Sci._Technol._6_035016 (1).pdf
Description: Publisher's version
Access: Restricted
Size: 3.73 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1295436
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science (ISI): 0