The Opportunity of Data-Driven Services for Viral Genomic Surveillance

Bernasconi, Anna
2023-01-01

Abstract

The recent COVID-19 pandemic has posed novel challenges to the big data and knowledge management community. The unprecedented availability of viral genomes in public databases has made possible the data-driven exploration of viral evolution, especially that of SARS-CoV-2, the virus responsible for the disease. Properties of data and knowledge in the genomic and virological domain may fuel data science methods for the identification and possible prediction of critical phenomena, such as the emergence of variants with increased transmissibility or virulence and of recombinant strains. A number of tools have been produced to explore variant trends or to suggest hypotheses on the evolutionary mechanisms of the virus. In this perspective, we elaborate on plausible directions for this field of research, which still apply to SARS-CoV-2 but may become even more relevant in the context of new outbreaks (e.g., monkeypox, malaria, diphtheria). Specifically, we point to 1) data-driven identification of mutations or variants with potential impact; and 2) data-driven identification of recombination events, which create opportunities for a virus to overcome selective pressure and adapt to new environments and hosts (e.g., livestock or humans). These directions can be framed within genomic surveillance, which tracks viruses through their genomes as these are collected, sequenced, and submitted to public databases by laboratories around the world. When successful, genomic surveillance substantially supports the understanding of novel viral pathogens and of the threat they pose in terms of prevalence, infectivity, and transmissibility; the resulting services can be of great utility to healthcare decision-makers. Here, we outline current trends, challenges, and future directions of data-driven services for genomic surveillance.
2023
Proceedings of the 17th IEEE International Conference on Service-Oriented System Engineering
979-8-3503-2239-2
big data services
virology
pathogen evolution
genomic surveillance
big data analytics
Files in this record:
IEEE_SOSE_2023.pdf: open access, pre-print (pre-refereeing), 164.53 kB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1251139