Hip prosthesis failure prediction through radiological deep sequence learning

Corti A.; Lindemann A.; Corino V.
2025-01-01

Abstract

Background: Existing deep learning studies on the automated detection of hip prosthesis failure consider only the last available radiographic image. Longitudinal data, however, are expected to improve prediction by combining temporal and spatial information. This study aims to develop artificial intelligence models that predict hip implant failure from multiple successive plain radiographs.

Methods: A cohort of 224 patients was used for model development and a balanced cohort of 14 patients for external validation. A sequence of two or three anteroposterior radiographs per patient was used to track the prosthesis over time. The models combined a convolutional neural network (CNN) with a recurrent section. Three CNNs were considered: a pretrained autoencoder, a pretrained RadImageNet DenseNet and a pretrained custom DenseNet. The recurrent section was implemented as either a single Gated Recurrent Unit (GRU) layer or a Long Short-Term Memory (LSTM) block.

Results: With 3 images as input, the model achieved a positive predictive value (PPV) of 0.966 and an F1 score of 0.933 on the validation set. Among the 2-image models, using the postoperative and the last image gave a PPV of 0.933 and an F1 score of 0.918, whereas using the postoperative and the second-to-last image gave a PPV of 0.882 and an F1 score of 0.923. On the external validation set, the 3-image model reached an accuracy of 0.786.

Conclusion: This study demonstrated the potential of the developed models, which use a series of plain radiographs, to predict hip prosthesis failure.
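The reported metrics are linked by simple arithmetic, which the following minimal Python sketch illustrates. It is not taken from the study: the function names are hypothetical, and the recall values are only implied by the rounded PPV and F1 figures in the abstract (recall itself is not reported there), so they should be read as approximations. The last lines check that 11 correct predictions out of 14 patients is the count consistent with an external accuracy of 0.786.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision (PPV) and recall."""
    return 2 * precision * recall / (precision + recall)


def recall_from_ppv_f1(ppv, f1):
    """Invert F1 = 2PR / (P + R) to get R = F1 * P / (2P - F1)."""
    return f1 * ppv / (2 * ppv - f1)


# (PPV, F1) pairs as reported in the abstract; the labels are ours.
reported = {
    "3 images": (0.966, 0.933),
    "postoperative + last": (0.933, 0.918),
    "postoperative + second-to-last": (0.882, 0.923),
}

# Recall implied by each pair (approximate: the inputs are rounded).
implied_recall = {
    name: recall_from_ppv_f1(ppv, f1) for name, (ppv, f1) in reported.items()
}

for name, rec in implied_recall.items():
    print(f"{name}: implied recall ~ {rec:.3f}")

# External validation: 14 patients, so accuracy is k / 14 for an integer k.
external_accuracy = 11 / 14
print(f"11/14 = {external_accuracy:.3f}")
```

Under these assumptions, the second 2-image configuration trades a lower PPV for a higher implied recall, which is why its F1 score is slightly higher than that of the first.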
Artificial intelligence
Hip replacement
Image classification
Temporal dependency
Files in this record:
2025Masciulli.pdf (open access)
Description: 2025Masciulli_IntJMedInf, publisher's version
Size: 1.07 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1282197