MuSERA: Multiple sample enriched region assessment

Jalili, Vahid;Matteucci, Matteo;Masseroli, Marco
2017-01-01

Abstract

Enriched region (ER) identification is a fundamental step in several next-generation sequencing (NGS) experiment types. Although NGS experimental protocols recommend producing replicate samples for each evaluated condition, and replicate consistency is usually assessed, typical pipelines for ER identification do not consider the available replicates. This may alter genome-wide descriptions of ERs, reduce the significance of subsequent analyses on detected ERs and ultimately preclude biological discoveries that evidence in replicates could support. MuSERA is a broadly useful stand-alone tool for both interactive and batch analysis of combined evidence from ERs in multiple ChIP-seq or DNase-seq replicates. Besides rigorously combining sample replicates to increase the statistical significance of detected ERs, it provides quantitative evaluations and graphical features to assess the biological relevance of each determined ER set within its genomic context; these include genomic annotation of determined ERs, nearest ER distance distribution, global correlation assessment of ERs and an integrated genome browser. We review the MuSERA rationale and implementation, and illustrate how sets of significant ERs are expanded by applying MuSERA to replicates for several types of NGS data, including ChIP-seq of transcription factors or histone marks and DNase-seq hypersensitive sites. We show that MuSERA can determine a new, enhanced set of ERs for each sample by locally combining evidence across replicates, and demonstrate how its easy-to-use interactive graphical displays and quantitative evaluations support thorough inspection of the obtained results and evaluation of their biological content, facilitating their understanding and biological interpretation. MuSERA is freely available at http://www.bioinformatics.deib.polimi.it/MuSERA/.
Keywords: ChIP-seq and DNase-seq data analysis; Combined evidence in replicates; Genomic data visualization; Integrated genome browser; Next-generation sequencing; Chromatin Immunoprecipitation; Genome; Genomics; Software; High-Throughput Nucleotide Sequencing; Information Systems; Molecular Biology
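
The abstract states that evidence from overlapping ERs in replicates is combined locally to increase statistical significance, but it does not say which statistical test is used. As an illustration only, the sketch below assumes Fisher's combined probability test, one standard way to merge independent p-values. The tuple layout `(chrom, start, end, p_value)` and the function names `fisher_combine`, `overlaps` and `combine_replicates` are hypothetical and are not part of MuSERA.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    Assumption: Fisher's method is used here for illustration; the abstract
    does not name the test that MuSERA applies.
    """
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom the chi-square survival function has the
    # closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def overlaps(a, b):
    """True if two ERs lie on the same chromosome and their intervals intersect."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def combine_replicates(sample, replicate):
    """Re-score each ER of `sample` using overlapping ERs of `replicate`.

    Simplification (hypothetical): when several replicate ERs overlap, only
    the most significant one is combined with the sample ER.
    """
    rescored = []
    for er in sample:
        supporting = [other[3] for other in replicate if overlaps(er, other)]
        p_values = [er[3]] + ([min(supporting)] if supporting else [])
        rescored.append((er[0], er[1], er[2], fisher_combine(p_values)))
    return rescored


sample = [("chr1", 100, 200, 0.01), ("chr1", 500, 600, 0.01)]
replicate = [("chr1", 150, 250, 0.01), ("chr2", 150, 250, 0.001)]
for er in combine_replicates(sample, replicate):
    print(er)
```

In this toy input the first ER is supported by an overlapping ER in the replicate, so its combined p-value is smaller than either individual value, while the second ER has no support and keeps its original p-value. An ER that is only weakly significant in each replicate on its own can therefore pass a significance threshold once the replicates are considered together.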
Files in this item:

Brief Bioinform-2016-Jalili-bib_bbw029.pdf (restricted access): Pre-print (or pre-refereeing), Adobe PDF, 1.98 MB
MuSERA_preprint.pdf (open access): Pre-print (or pre-refereeing), Adobe PDF, 191.26 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1039989
Citations
  • PMC 5
  • Scopus 8
  • ISI 6