GeneWebEx: Gene annotation Web Extraction, aggregation, and updating from web-interfaced biomolecular databanks

MASSEROLI, MARCO;PINCIROLI, FRANCESCO
2005-01-01

Abstract

Numerous genomic annotations are stored in different Web-accessible databanks, which scientists need to mine with user-defined queries and in batch mode so that the diverse extracted data can be integrated in an orderly way within user-customizable working environments. To date, however, most accessible databanks can be queried for only one gene or protein at a time, and the retrieved data are generally available only as HTML pages. We developed GeneWebEx to mine data of interest from the HTML pages of Web-interfaced databanks and to organize the extracted data for further analysis. GeneWebEx uses user-defined templates to identify the data to extract, then aggregates and structures them in a database designed to hold extractions from distinct biomolecular databanks. In addition, a template-based module enables automatic updating of the extracted data. In validation tests, GeneWebEx allowed us to gather relevant annotations from various sources efficiently and to query them comprehensively to highlight significant biological characteristics.
Files in this record:
A26_IJSEKE_2005_15(3)_511-526.pdf (restricted access)
Post-print (draft or Author's Accepted Manuscript, AAM)
Size: 560.7 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/244043