
Towards Real-Time Online Egocentric Action Recognition on Smart Eyewear

Santambrogio, Riccardo; Corti, Greta; Mentasti, Simone; Matteucci, Matteo
2025-01-01

Abstract

Recently, augmented reality and wearable devices, such as smart eyewear systems, have gained significant attention due to advancements in computer vision technology and the proliferation of compact wearable cameras. This has led to an increased interest in egocentric vision, which offers a unique perspective for recognizing human actions and understanding behavior from a first-person view. However, existing approaches for egocentric action recognition often rely on complex architectures with high computational demands, such as large transformers, which are unsuitable for real-time applications on wearable devices with limited processing power. This work aims to develop a lightweight, real-time egocentric action recognition system tailored for resource-constrained environments. We evaluate the recent LaViLa model for online adaptation and explore the use of the lightweight MiniROAD model, initially designed for exocentric Online Action Detection, on egocentric data. By creating a focused dataset, EgoClip Office, we can optimize the model for our specific application. Our approach is validated on an Nvidia Jetson platform, demonstrating the feasibility of achieving real-time performance on low-power embedded devices.
Publication year: 2025
Computer Vision – ECCV 2024 Workshops. ECCV 2024
ISBN: 9783031919886; 9783031919893
Files in this product:
File: 978-3-031-91989-3_18.pdf (Publisher’s version, open access)
Size: 514.07 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1292488
Citations:
  • Scopus: 0