Authors: Jyh-Jong Tsay; Bo-Liang Wu and Hou-Ji Dai
Affiliation: National Chung-Cheng University, Taiwan
Keyword(s): Pathway Database Integration, Data Matching, KEGG, MetaCyc.
Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools; Bioinformatics; Biomedical Engineering; Biostatistics and Stochastic Models; Data Mining and Machine Learning; Databases and Data Management; Web Services in Bioinformatics
Abstract:
Most biological databases provide cross links that point to data records describing the same object in other databases. However, as more and more databases become available, manually creating and maintaining cross links becomes very time-consuming, if not impossible. Existing databases provide only a small portion of all possible links. In this paper, we present BioDBLink, a database cross-link server that automatically collects and generates cross links among biological databases. The core of BioDBLink is a data matching technique that identifies and matches data records or elements describing the same object among pathway databases. Experiments on a data set collected from several pathway, enzyme, and compound databases show that our approach is able to identify most of the cross links provided by current databases, discover a large number of missing links, and detect inconsistencies and duplicate errors.