The analysis of metagenomic short reads derived from drinking water systems has traditionally focused on prokaryotes (i.e., bacteria and archaea), neglecting eukaryotes even though such (micro)organisms can affect water quality. This limitation arises because traditional prokaryote-centric bioinformatic tools are poorly suited to eukaryotes, whose genomes are more complex and whose abundance is typically lower than that of prokaryotes. Notably, while established bioinformatic pipelines exist for the analysis of prokaryotic genomes, researchers have only recently started to develop tools suitable for identifying and analyzing eukaryotic populations in mixed communities; as a result, there is currently no validated and widely accepted bioinformatic workflow for the analysis of eukaryotes in metagenomes. In this study we propose a workflow tailored to the identification and analysis of eukaryotic genomes from metagenomic surveys of drinking water systems. To build it, we compared the performance of several tools for eukaryotic contig identification, contig binning, and gene prediction on both synthetically generated metagenomes and a representative drinking water system case study consisting of samples collected over more than 6 months. Our findings indicate that the performance of eukaryote-centric tools differs between real-world and synthetic data. Nevertheless, an ensemble approach combining reference-dependent and reference-independent tools was found to benefit eukaryotic identification and analysis and was therefore included in the workflow. For example, several k-mer-based contig classifiers (EukRep, Tiara, Whokaryote) and a reference-based one (CAT) were selected for the identification of eukaryotic contigs, leveraging the benefits of both approaches.
Overall, the workflow proposed in this study will enable a more systematic metagenomic characterization of eukaryotic populations in the drinking water microbiome and thus help to better understand their ecological role in drinking water systems.
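
The abstract does not state how the calls of the individual contig classifiers are combined, so the following is only a minimal sketch of one possible ensemble rule, assuming a simple vote threshold. The function name, the `min_votes` parameter, the label strings, and the toy contig calls are all hypothetical and do not reflect the actual output formats of EukRep, Tiara, Whokaryote, or CAT.

```python
from collections import Counter


def ensemble_eukaryotic_contigs(calls_by_tool, min_votes=2):
    """Return contig IDs labelled eukaryotic by at least `min_votes` tools.

    `calls_by_tool` maps a tool name to a dict of contig ID -> label.
    The label "eukaryote" is an assumed, simplified convention.
    """
    votes = Counter()
    for calls in calls_by_tool.values():
        for contig_id, label in calls.items():
            if label == "eukaryote":
                votes[contig_id] += 1
    return sorted(c for c, n in votes.items() if n >= min_votes)


# Hypothetical per-tool calls for three contigs (illustrative only).
calls_by_tool = {
    "EukRep": {"contig_1": "eukaryote", "contig_2": "prokaryote", "contig_3": "eukaryote"},
    "Tiara": {"contig_1": "eukaryote", "contig_2": "prokaryote", "contig_3": "prokaryote"},
    "Whokaryote": {"contig_1": "eukaryote", "contig_2": "eukaryote", "contig_3": "prokaryote"},
    "CAT": {"contig_1": "eukaryote", "contig_2": "prokaryote", "contig_3": "unclassified"},
}

selected = ensemble_eukaryotic_contigs(calls_by_tool, min_votes=2)
print(selected)
```

A lower threshold favors sensitivity (more contigs retained for binning), while a higher one favors precision; which trade-off suits a given drinking water dataset is a choice this sketch leaves open.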

Establishing a metagenomic workflow for eukaryotic analysis in drinking water system

Gabrielli M.; Antonelli M.
2022-01-01

2022
Proc. of 2022 Association of Environmental Engineering and Science Professors (AEESP) Research and Education Conference
Description: Abstract in conference proceedings; pre-print (pre-refereeing) version
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1227737