Moyogi: A Memory-Centric Accelerator for Low-Latency Random Forest Inference on Embedded Devices

Verosimile, Alessandro; Peverelli, Francesco; Santambrogio, Marco D.
2025-01-01

Abstract

The convergence of Artificial Intelligence (AI) and the Internet of Things (IoT) is driving the need for real-time, low-latency architectures that can be trusted to run the inference of complex Machine Learning (ML) models in critical applications such as autonomous vehicles and smart healthcare. While traditional cloud-based solutions introduce latency due to the need to transmit data to and from centralized servers, edge computing offers lower response times by processing data locally. In this context, Random Forests (RFs) are well suited to hardware acceleration on resource-constrained edge devices thanks to their inherent parallelism. Nevertheless, maintaining low latency as the size of the RF grows remains a critical challenge for state-of-the-art (SoA) approaches. To address this challenge, this paper proposes Moyogi, a hardware-software co-design framework for memory-centric RF inference that optimizes the architecture for the target ML model, employing RFs with Decision Trees (DTs) of multiple depths and exploring several architectural variations to find the best-performing configuration. We propose a resource estimation model based on the most relevant architectural features to enable effective Design Space Exploration. Moyogi achieves a geomean latency reduction of 3.88x on RFs trained on relevant IoT datasets, compared to the best-performing SoA memory-centric architecture.
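To make the abstract's central idea concrete, the sketch below shows software-level Random Forest inference over Decision Trees of different depths, with the final class chosen by majority vote. This is purely an illustrative sketch, not the paper's memory-centric hardware design: the `Node` layout, field names, and the voting scheme are assumptions for exposition.

```python
# Illustrative sketch (not Moyogi's hardware implementation): inference
# over a Random Forest whose Decision Trees may have different depths.
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    feature: int = 0              # index of the input feature compared here
    threshold: float = 0.0        # split threshold for that feature
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None   # set only on leaf nodes


def predict_tree(node: Node, x: List[float]) -> int:
    # Walk from root to leaf; trees of different depths simply perform
    # a different number of comparisons before reaching a leaf.
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def predict_forest(trees: List[Node], x: List[float]) -> int:
    # Each tree votes once; the majority class wins. In hardware, all
    # trees can evaluate in parallel, which is the parallelism the
    # abstract refers to.
    votes = Counter(predict_tree(t, x) for t in trees)
    return votes.most_common(1)[0][0]


# Two toy trees of depth 1 and depth 2, mimicking an RF with DTs of
# multiple depths.
shallow = Node(feature=0, threshold=0.5,
               left=Node(label=0), right=Node(label=1))
deep = Node(feature=1, threshold=0.3,
            left=Node(label=0),
            right=Node(feature=0, threshold=0.7,
                       left=Node(label=1), right=Node(label=1)))

print(predict_forest([shallow, deep], [0.9, 0.6]))  # both trees vote 1
```

In a memory-centric accelerator the per-node comparisons would instead be realized by the on-chip memory layout, but the control flow above captures what each tree computes.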
2025
Proceedings - 2025 IEEE 33rd Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2025
Artificial Intelligence of Things
Embedded Systems
HW-SW co-design
Low-latency Architectures
Machine Learning
Random Forest
Files in this product:
MoyogiValidated.pdf — Post-Print (DRAFT or Author's Accepted Manuscript, AAM), 1.4 MB, Adobe PDF, restricted access.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1298486
Citations
  • PMC: N/A
  • Scopus: 0
  • Web of Science (ISI): 0