Towards Real-Time Online Egocentric Action Recognition on Smart Eyewear
Santambrogio, Riccardo;Corti, Greta;Mentasti, Simone;Matteucci, Matteo
2025-01-01
Abstract
Recently, augmented reality and wearable devices, such as smart eyewear systems, have gained significant attention due to advances in computer vision and the proliferation of compact wearable cameras. This has spurred interest in egocentric vision, which offers a unique first-person perspective for recognizing human actions and understanding behavior. However, existing approaches to egocentric action recognition often rely on complex architectures with high computational demands, such as large transformers, which are unsuitable for real-time use on wearable devices with limited processing power. This work develops a lightweight, real-time egocentric action recognition system tailored to resource-constrained environments. We evaluate the recent LaViLa model for online adaptation and explore the lightweight MiniROAD model, originally designed for exocentric Online Action Detection, on egocentric data. By creating a focused dataset, EgoClip Office, we optimize the model for our specific application. Our approach is validated on an NVIDIA Jetson platform, demonstrating the feasibility of achieving real-time performance on low-power embedded devices.

| File | Access | Size | Format | |
|---|---|---|---|---|
| 978-3-031-91989-3_18.pdf | Open access: publisher's version | 514.07 kB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.