Protein-protein interaction associated disorders revealed via data integration

Canakoglu, Arif; Masseroli, Marco
2012-01-01

Abstract

Large amounts of protein-protein interaction (PPI) data are produced by new high-throughput experimental and computational techniques and are collected in different databases. These data generally do not contain phenotypic, or even functional or structural, information about the interactors, which in many cases is available in other databases. Thus, to achieve broad coverage, it is necessary to combine the data from different databases. For this purpose, we are developing a framework to create and maintain a data warehouse based on a conceptual data model. On the integrated data, we applied an automatic association inference method based on the transitive closure concept. In particular, by leveraging IntAct and MINT PPI data, Entrez protein-encoding gene data and OMIM genetic disorder data, we inferred associations between proteins and genetic disorders and their phenotypes. In our data warehouse, 46,154 human PPIs involving 12,178 distinct human proteins were integrated. These human proteins are encoded by 11,232 different human genes. By applying the transitive closure concept, we identified 1,130 gene networks and found 1,136 human PPIs associated with 628 genetic disorders. The interactions between proteins that the transitive closure method associates with a specific disease can help researchers focus on the protein interactions relevant to that disease. This may help reveal diseases that arise from malfunctioning protein interactions, after which a treatment strategy such as synthetic protein engineering could possibly be applied. This hypothesis underlines the importance of integrating PPI data with genetic disorder data.
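The inference idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the identifiers (`P1`, `G1`, `D1`), the helper names `compose` and `ppi_disorders`, and the toy data are all hypothetical. The sketch also assumes that a PPI counts as associated with a disorder when both interactors are linked to that disorder through their encoding genes; the abstract does not state the exact rule.

```python
from collections import defaultdict


def compose(first, second):
    """Compose two relations stored as dict[key, set[value]].

    If a -> b is in `first` and b -> c is in `second`, the result holds a -> c.
    This is the transitive step: protein -> gene -> disorder.
    """
    result = defaultdict(set)
    for a, bs in first.items():
        for b in bs:
            result[a] |= second.get(b, set())
    # Drop keys that ended up with no associated values.
    return {a: cs for a, cs in result.items() if cs}


def ppi_disorders(ppis, protein_to_disorders):
    """Return the disorders shared by both interactors of each PPI.

    Assumption (not from the source): a PPI is tied to a disorder only
    when both of its proteins are linked to that disorder.
    """
    shared = {}
    for p1, p2 in ppis:
        common = protein_to_disorders.get(p1, set()) & protein_to_disorders.get(p2, set())
        if common:
            shared[(p1, p2)] = common
    return shared


# Hypothetical toy data standing in for PPI, gene and disorder sources.
protein_to_genes = {"P1": {"G1"}, "P2": {"G2"}, "P3": {"G3"}}
gene_to_disorders = {"G1": {"D1"}, "G2": {"D1", "D2"}}
ppis = [("P1", "P2"), ("P2", "P3")]

protein_to_disorders = compose(protein_to_genes, gene_to_disorders)
ppi_to_disorders = ppi_disorders(ppis, protein_to_disorders)
print(protein_to_disorders)
print(ppi_to_disorders)
```

In this toy example only the `P1`-`P2` interaction is kept, because `P3` is encoded by a gene with no known disorder. A different association rule (for instance, requiring only one interactor to be linked to the disorder) would change `ppi_disorders` but not the composition step.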
2012
ECCB 2012: 11th European Conference on Computational Biology proceedings
INF

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/657778