Integrating Biological and Radiological Data in a Structured Repository: a Data Model Applied to the COSMOS Case Study

Garau, Noemi; Baroni, Guido; Paganelli, Chiara
2022-01-01

Abstract

Integrating information from biological samples with digital data, such as medical images, has gained prominence with the advent of precision medicine. Research in this field faces an ever-increasing amount of data to manage and, as a consequence, the need to structure these data in a functional and standardized fashion to promote and facilitate cooperation among institutions. Inspired by the Minimum Information About BIobank data Sharing (MIABIS), we propose an extended data model that aims to standardize data collections in which both biological and digital samples are involved. The proposed model places strong emphasis on the cause-effect relationships among factors, as these are frequently encountered in clinical workflows. To test the data model in a realistic context, we use as a case study the Continuous Observation of SMOking Subjects (COSMOS) dataset, which comprises 10 consecutive years of lung cancer screening and follow-up on more than 5000 subjects. The structure of the COSMOS database, implemented to facilitate data retrieval, is presented along with a description of the data that we hope to share in a public repository for lung cancer screening research.
Keywords: Lung cancer screening; Radiology workflow; Standardization; Structured reporting
Files in this item:
11311-1233648_Garau.pdf (publisher's version, open access, Adobe PDF, 2.5 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1233648