A Robust Monocular Person-Following System on Embedded Platforms
Bardaro, Gianluca; Matteucci, Matteo
2025-01-01
Abstract
We present a robust monocular person-following system tailored for deployment on resource-constrained mobile robots. Starting from a state-of-the-art baseline, which introduced a width-based tracking module and a global CNN descriptor with online ridge regression for re-identification, we adapt and optimize the entire pipeline for embedded hardware. An OAK-D Pro camera running a YOLO detector on board provides real-time person detection, while a Coral Edge TPU executes a custom, lightweight convolutional neural network (CNN) for person re-identification. The CNN is designed for the Edge TPU and quantized using quantization-aware training, preserving accuracy while significantly improving inference speed. We detail the system pipeline and the modifications introduced for efficient embedded deployment. Experimental results on a mobile robot demonstrate that our system achieves person-following performance comparable to the baseline (mAP@50-95 of 96.3% versus 93.3% for the baseline) across varied scenarios while running at 25 FPS on purely edge hardware. This work validates that high-performance person-following is feasible on low-cost platforms without a GPU by leveraging specialized hardware and model optimization.


