RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Large language model for interpreting the Paris classification of colorectal polyps

Massimi, Davide; Carlini, Luca; Mori, Yuichi; Di Stefano, Luca; Antonelli, Giulio; Rizkala, Tommy; Spadaccini, Marco; de Sire, Roberto; Alfarone, Ludovico; Lena, Chiara; D'Aprano, Alessandro; Parasa, Sravanthi; Bisschops, Raf; von Renteln, Daniel; O'Reilly, Susanne Margaret; Savevski, Victor; Sharma, Prateek; Rex, Douglas K.; Bretthauer, Michael; De Momi, Elena; Hassan, Cesare; Repici, Alessandro

2025-01-01

Abstract

Background and study aims: Reporting of colorectal polyp morphology using the Paris classification is often inaccurate. Multimodal large language models (M-LLMs) may support morphological assessment. This study aimed to evaluate the accuracy of an M-LLM (GPT-4o) in classifying colorectal polyp morphology compared with expert and non-expert endoscopists.

Patients and methods: We used the SUN dataset of colonoscopy videos from 100 unique colorectal polyps, each labeled with the validated Paris classification. An M-LLM (GPT-4o) classified five representative frames per lesion. Three expert and three non-expert endoscopists, blinded to one another, performed the same task. The primary outcome was accuracy in differentiating non-polypoid (IIa/IIc) from polypoid (Is/Ip/Isp) lesions. The secondary outcome was accuracy in differentiating sessile (Is) from pedunculated (Ip/Isp) lesions. Given the exploratory design, no multiplicity correction was applied; point estimates are presented with 95% confidence intervals (CIs), and P values are interpreted descriptively.

Results: M-LLM accuracy for differentiating non-polypoid from polypoid lesions was 73% (95% CI 63%-81%), comparable to experts (75%, 65%-83%; P = 0.84) and non-experts (77%, 68%-85%; P = 0.52), with similar sensitivity and specificity. Accuracy for differentiating sessile from pedunculated lesions was 55% (95% CI 42%-67%), lower than experts (76%; P = 0.02) and non-experts (77%; P = 0.01), primarily due to poor specificity (12% vs. experts 82% and non-experts 88%; P < 0.01 for both comparisons).

Conclusions: M-LLMs performed comparably to endoscopists in distinguishing non-polypoid from polypoid lesions but failed to reliably identify pedunculated morphology.
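To illustrate how an accuracy estimate with a 95% confidence interval of the kind reported in the abstract can be obtained, the following is a minimal sketch using the exact (Clopper-Pearson) binomial interval. The abstract does not state which interval method the authors used, so the choice of Clopper-Pearson is an assumption, as is the count of 73 correct classifications out of 100 lesions (inferred from the reported 73% accuracy on 100 polyps). The function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for a proportion k/n.

    Assumption: this is one common interval method, not necessarily
    the one used in the study.
    """

    def solve(f, target: float) -> float:
        # f decreases monotonically in p on [0, 1]; bisect for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Assumed counts: 73 correct out of 100 lesions (inferred from the abstract).
correct, total = 73, 100
low, high = clopper_pearson(correct, total)
print(f"accuracy = {correct / total:.0%}, 95% CI = {low:.1%} to {high:.1%}")
```

Under these assumptions the interval should land close to the range reported for the M-LLM on the primary outcome; small differences would be expected if the authors used another method such as the Wilson score interval.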


Publication year: 2025

Journal title: ENDOSCOPY INTERNATIONAL OPEN

Keywords:
CRC screening
Colorectal cancer
Diagnosis and imaging (inc chromoendoscopy, NBI, iSCAN, FICE, CLE...)
Endoscopy Lower GI Tract
Polyps / adenomas / ..
Tissue diagnosis

Appears in types: 01.1 Journal article

Files in this record:

File	Size	Format
a-2703-0209.pdf (open access)	1.37 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1315974
