RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Implementing a transcription factor interaction prediction system using the genometric query language

Perna, Stefano;Canakoglu, Arif;Pinoli, Pietro;Ceri, Stefano;Wong, Limsoon

2018-01-01

Abstract

Novel technologies and growing interest have greatly increased the amount of data available for genomics and transcriptomics studies, in terms of both volume and content. Biology relies increasingly on computational methods to process, investigate, and extract knowledge from these data. In this work, we present the TICA web server (available at http://www.gmql.eu/tica/), a fast and compact tool developed to support data-driven knowledge discovery in transcription factor interaction prediction. To infer physically interacting transcription factors, TICA combines two components: the GenoMetric Query Language, a novel query tool (based on the Apache Hadoop and Spark technologies) specialized in the integration and management of large, heterogeneous genomic datasets, and a statistical method for robust detection of co-locations across interval-based data. Notably, TICA allows investigators to upload and analyze their own ChIP-seq experiment datasets, comparing them either against ENCODE data or against each other, with computation time that increases linearly with dataset size and density. Using ENCODE data from three well-studied cell lines as reference, we show that TICA predictions are supported by existing biological knowledge, making the web server a reliable and efficient tool for interaction screening and data-driven hypothesis generation.

Year of publication: 2018
Book title: Data Mining for Systems Biology
Series title: Methods in Molecular Biology
ISBN (International Standard Book Number): 978-1-4939-8560-9; 978-1-4939-8561-6
Keywords: Biostatistics; ChIP-seq analysis; Data integration; Gene regulation; Genomic computing; Transcription factor interaction; Binding Sites; Cell Line, Tumor; Humans; Transcription Factors; Transcription Initiation Site; Internet; Programming Languages; Molecular Biology; Genetics
Appears in types: 02.1 Contribution in Volume

Files in this record:

File	Size	Format
Tica.pdf (open access)	694.25 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1078126
