STF: A Unified Framework for Joint Pixel-Level Segmentation and Tracking of Tissues in Endoscopic Surgery

Li Y.;Cruciani L.;Ferrigno G.;De Momi E.
2026-01-01

Abstract

Endoscopic minimally invasive surgery relies on precise tissue video segmentation to avoid complications such as vascular bleeding or nerve injury. However, existing video segmentation methods often fail to maintain long-term robustness because of target loss and challenging conditions (e.g., occlusion, motion blur), which limits their applicability in prolonged surgical procedures. To address these limitations, we propose the Unified Framework for Joint Pixel-Level Segmentation and Tracking (STF), which integrates a synergistic segmentation-guided tracking pipeline with an adaptive re-detection mechanism. First, a deep learning-based segmentation network precisely localizes the target tissue. A cost-efficient Hough Voting Network then tracks the segmented region, while a Bayesian refinement module improves compatibility between segmentation and tracking. If tracking reliability drops, an evaluation module triggers re-segmentation, ensuring continuous and stable long-term performance. Extensive experiments confirm that STF achieves superior accuracy and temporal consistency over segmentation networks in long-term surgical video segmentation, particularly under extreme conditions. This automated methodology significantly improves robustness and re-detection capability for sustained tissue analysis, markedly reducing the dependency on manual intervention that is prevalent in many model-based tracking solutions.
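The control flow described in the abstract (segment, track, refine, and re-segment when reliability drops) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `run_pipeline`, its callable parameters (`segment`, `track`, `refine`, `reliability`), and the `threshold` value are hypothetical names chosen for illustration, and the toy stand-ins below do not implement a Hough Voting Network or Bayesian refinement.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_pipeline(
    frames: Iterable[Any],
    segment: Callable[[Any], Any],           # hypothetical: frame -> mask
    track: Callable[[Any, Any], Any],        # hypothetical: (frame, previous mask) -> mask
    refine: Callable[[Any, Any], Any],       # hypothetical: (frame, tracked mask) -> mask
    reliability: Callable[[Any, Any], float],  # hypothetical: (frame, mask) -> score
    threshold: float = 0.5,                  # assumed cut-off, not taken from the paper
) -> Tuple[List[Any], List[int]]:
    """Return one mask per frame and the indices where segmentation was run."""
    masks: List[Any] = []
    segmented_at: List[int] = []
    prev_mask = None
    for i, frame in enumerate(frames):
        if prev_mask is None:
            # First frame: no mask to propagate, so localize the target by segmentation.
            mask = segment(frame)
            segmented_at.append(i)
        else:
            # Later frames: propagate the previous mask, then refine the result.
            mask = refine(frame, track(frame, prev_mask))
            if reliability(frame, mask) < threshold:
                # Tracking judged unreliable: fall back to re-segmentation.
                mask = segment(frame)
                segmented_at.append(i)
        masks.append(mask)
        prev_mask = mask
    return masks, segmented_at


# Toy stand-ins: frames are integers, masks are labelled tuples,
# and tracking is declared unreliable on frame 2 only.
masks, segmented_at = run_pipeline(
    frames=[0, 1, 2, 3],
    segment=lambda f: ("seg", f),
    track=lambda f, prev: ("trk", f),
    refine=lambda f, m: m,
    reliability=lambda f, m: 0.0 if f == 2 else 1.0,
)
print(masks)
print(segmented_at)
```

In this toy run, segmentation is invoked on the first frame and again on the frame where the reliability score falls below the assumed threshold; all other frames reuse the tracked mask.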
Files in this record:
STF_A_Unified_Framework_for_Joint_Pixel-Level_Segmentation_and_Tracking_of_Tissues_in_Endoscopic_Surgery.pdf
Access: open access
Version: Post-Print (draft or Author's Accepted Manuscript, AAM)
Size: 18.23 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1307967