Moyogi: A Memory-Centric Accelerator for Low-Latency Random Forest Inference on Embedded Devices
Verosimile, Alessandro; Peverelli, Francesco; Santambrogio, Marco D.
2025-01-01
Abstract
The convergence of Artificial Intelligence (AI) and the Internet of Things (IoT) is driving the need for real-time, low-latency architectures that can be entrusted with the inference of complex Machine Learning (ML) models in critical applications like autonomous vehicles and smart healthcare. While traditional cloud-based solutions introduce latency due to the need to transmit data to and from centralized servers, edge computing offers lower response times by processing data locally. In this context, Random Forests (RFs) are highly suited for building hardware accelerators on resource-constrained edge devices due to their inherent parallelism. Nevertheless, maintaining low latency as the size of the RF grows remains a challenge for state-of-the-art (SoA) approaches. To address this challenge, this paper proposes Moyogi, a hardware-software codesign framework for memory-centric RF inference that optimizes the architecture for the target ML model, employing RFs with Decision Trees (DTs) of multiple depths and exploring several architectural variations to find the best-performing configuration. We propose a resource estimation model based on the most relevant architectural features to enable effective Design Space Exploration. Moyogi achieves a geomean latency reduction of 3.88x on RFs trained on relevant IoT datasets, compared to the best-performing SoA memory-centric architecture.
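
The abstract names two technical ingredients: a memory-centric architecture tailored to the target RF, and a resource estimation model that drives Design Space Exploration (DSE). As a rough illustration of how such a flow can be organized, the Python sketch below enumerates hypothetical architectural configurations, filters them with a simple analytical resource model, and keeps the lowest-latency feasible design. All names, parameters, and cost coefficients here are illustrative assumptions and do not reflect Moyogi's actual models or architecture.

```python
# Minimal DSE sketch, assuming a hypothetical linear resource model and a
# hypothetical latency model. None of these names or coefficients come from
# the paper; they only illustrate the enumerate-estimate-prune structure.
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Config:
    trees_per_engine: int   # how many DTs share one processing engine
    max_tree_depth: int     # deepest DT the engine must support
    comparators: int        # parallel node comparators per engine


def estimate_luts(cfg: Config) -> int:
    # Assumed cost model: resources grow with comparator count and with the
    # node storage needed for the supported tree depth.
    nodes = 2 ** cfg.max_tree_depth - 1
    return 120 * cfg.comparators + 3 * nodes * cfg.trees_per_engine


def estimate_latency_cycles(cfg: Config) -> int:
    # Assumed latency model: each tree takes depth traversal steps, and trees
    # mapped to the same engine are processed in turn.
    return cfg.max_tree_depth * cfg.trees_per_engine


def explore(lut_budget: int) -> Optional[Config]:
    best = None
    for t, d, c in product(range(1, 9), range(4, 13), (1, 2, 4, 8)):
        cfg = Config(t, d, c)
        if estimate_luts(cfg) > lut_budget:
            continue  # infeasible on the target device
        if best is None or estimate_latency_cycles(cfg) < estimate_latency_cycles(best):
            best = cfg
    return best


if __name__ == "__main__":
    print(explore(lut_budget=20_000))
```

A real flow would also model on-chip memory usage and the mapping of DT nodes to memories, but the overall pattern of enumerating candidate designs, estimating their cost, and pruning infeasible ones is the same.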
| File | Size | Format |
|---|---|---|
| MoyogiValidated.pdf: Post-Print (DRAFT or Author's Accepted Manuscript, AAM), restricted access | 1.4 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.