Moyogi: A Memory-Centric Accelerator for Low-Latency Random Forest Inference on Embedded Devices
Verosimile, Alessandro; Peverelli, Francesco; Santambrogio, Marco D.
2025-01-01
Abstract
The convergence of Artificial Intelligence (AI) and the Internet of Things (IoT) is driving the need for real-time, low-latency architectures that can be entrusted with the inference of complex Machine Learning (ML) models in critical applications like autonomous vehicles and smart healthcare. While traditional cloud-based solutions introduce latency due to the need to transmit data to and from centralized servers, edge computing offers lower response times by processing data locally. In this context, Random Forests (RFs) are highly suited for building hardware accelerators on resource-constrained edge devices due to their inherent parallelism. Nevertheless, maintaining low latency as the size of the RF grows remains a challenge for state-of-the-art (SoA) approaches. To address this challenge, this paper proposes Moyogi, a hardware-software codesign framework for memory-centric RF inference that optimizes the architecture for the target ML model, employing RFs with Decision Trees (DTs) of multiple depths and exploring several architectural variations to find the best-performing configuration. We propose a resource estimation model based on the most relevant architectural features to enable effective Design Space Exploration. Moyogi achieves a geomean latency reduction of 3.88x on RFs trained on relevant IoT datasets, compared to the best-performing SoA memory-centric architecture.
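
The abstract names two technical ingredients: a memory-centric architecture tailored to the target RF, and a resource estimation model that drives Design Space Exploration (DSE). As a rough illustration of how such a flow can be organized, the Python sketch below enumerates hypothetical architectural configurations, filters them with a simple analytical resource model, and keeps the lowest-latency feasible design. All names, parameters, and cost coefficients here are illustrative assumptions and do not reflect Moyogi's actual models or architecture.

```python
# Minimal DSE sketch, assuming a hypothetical linear resource model and a
# hypothetical latency model. None of these names or coefficients come from
# the paper; they only illustrate the enumerate-estimate-prune structure.
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Config:
    trees_per_engine: int   # how many DTs share one processing engine
    max_tree_depth: int     # deepest DT the engine must support
    comparators: int        # parallel node comparators per engine


def estimate_luts(cfg: Config) -> int:
    # Assumed cost model: resources grow with comparator count and with the
    # node storage needed for the supported tree depth.
    nodes = 2 ** cfg.max_tree_depth - 1
    return 120 * cfg.comparators + 3 * nodes * cfg.trees_per_engine


def estimate_latency_cycles(cfg: Config) -> int:
    # Assumed latency model: each tree takes depth traversal steps, and trees
    # mapped to the same engine are processed in turn.
    return cfg.max_tree_depth * cfg.trees_per_engine


def explore(lut_budget: int) -> Optional[Config]:
    best = None
    for t, d, c in product(range(1, 9), range(4, 13), (1, 2, 4, 8)):
        cfg = Config(t, d, c)
        if estimate_luts(cfg) > lut_budget:
            continue  # infeasible on the target device
        if best is None or estimate_latency_cycles(cfg) < estimate_latency_cycles(best):
            best = cfg
    return best


if __name__ == "__main__":
    print(explore(lut_budget=20_000))
```

A real flow would also model on-chip memory usage and the mapping of DT nodes to memories, but the overall pattern of enumerating candidate designs, estimating their cost, and pruning infeasible ones is the same.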
| File | Size | Format |
|---|---|---|
| MoyogiValidated.pdf: Post-Print (DRAFT or Author's Accepted Manuscript, AAM), restricted access | 1.4 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.