A Multimodal Hybrid Late-Cascade Fusion Network for Enhanced 3D Object Detection

C. Sgaravatti;R. Basla;R. Pieroni;M. Corno;S. M. Savaresi;L. Magri;G. Boracchi
2025-01-01

Abstract

We present a new method for detecting 3D objects from multimodal inputs, leveraging both LiDAR and RGB cameras in a hybrid late-cascade scheme that combines an RGB detection network and a 3D LiDAR detector. We exploit late fusion principles to reduce LiDAR false positives, matching LiDAR detections with RGB ones by projecting the LiDAR bounding boxes onto the image. We rely on cascade fusion principles to recover LiDAR false negatives, leveraging epipolar constraints and frustums generated by RGB detections from separate views. Our solution can be plugged on top of any underlying single-modal detectors, enabling a flexible training process that can either take advantage of pre-trained LiDAR and RGB detectors or train the two branches separately. We evaluate our method on the KITTI object detection benchmark, showing significant performance improvements, especially for the detection of Pedestrians and Cyclists. Code is available at: https://github.com/CarloSgaravatti/HybridLateCascadeFusion.
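
The late-fusion step described in the abstract (projecting LiDAR boxes onto the image and matching them with RGB detections) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the IoU-based matching rule, and the 0.5 threshold are assumptions chosen for illustration, and the sketch assumes the 3D box corners are already expressed in camera coordinates with all corners in front of the camera.

```python
import numpy as np

def project_box_to_image(corners_3d, P):
    """Project (8, 3) box corners (camera coordinates, z > 0) with a 3x4
    projection matrix P and return the enclosing 2D box [x1, y1, x2, y2]."""
    homo = np.hstack([corners_3d, np.ones((corners_3d.shape[0], 1))])
    uvw = homo @ P.T                      # (8, 3) homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]         # perspective division
    x1, y1 = uv.min(axis=0)
    x2, y2 = uv.max(axis=0)
    return np.array([x1, y1, x2, y2])

def iou_2d(a, b):
    """Intersection over union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def filter_lidar_detections(lidar_corners, rgb_boxes, P, iou_thr=0.5):
    """Return indices of LiDAR detections whose projection overlaps at least
    one RGB detection. The threshold and matching rule are hypothetical."""
    keep = []
    for i, corners in enumerate(lidar_corners):
        proj = project_box_to_image(corners, P)
        if any(iou_2d(proj, rb) >= iou_thr for rb in rgb_boxes):
            keep.append(i)
    return keep
```

LiDAR detections that find no overlapping RGB detection are treated as likely false positives and dropped. A real pipeline would also need to handle boxes partially behind the camera or outside the image, which this sketch ignores.
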
2025
Computer Vision – ECCV 2024 Workshops
978-3-031-91767-7
3D Object Detection, Multimodal, Autonomous Driving
Files in this record:
2024_06_ECCV_WRKSP_Autonomous_Vehicles.pdf (pre-print / pre-refereeing version, Adobe PDF, 4.34 MB, restricted access)


Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1288886