In Silico Proteomic Functional Re-annotation of Escherichia coli K-12 using Dynamic Biological Data Fusion Strategy  

Ramesh Kumar Gopal, Subazini Thankaswamy Kosalai, Rajadurai Chinnasamy Perumal, Palani Kannan Kandavel
Bioinformatics Lab, AU-KBC Research Centre, M.I.T Campus of Anna University, Chromepet, Chennai 600044, India
Computational Molecular Biology, 2014, Vol. 4, No. 4, doi: 10.5376/cmb.2014.04.0004
Received: 15 Apr., 2014; Accepted: 10 May, 2014; Published: 09 Jul., 2014
© 2014 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Kumar et al., 2014, In silico Proteomic Functional Re-annotation of Escherichia coli K-12 Using Dynamic Biological Data Fusion Strategy, Computational Molecular Biology, Vol.4, No.4 34-43 (doi: 10.5376/cmb.2014.04.0004)


Escherichia coli, one of the most widely used model organisms, was initially annotated in 1997 and re-annotated in 2007. Despite years of intensive research on the E. coli genome, complete and accurate functional information for this organism is still not available. About 40% of E. coli protein sequences have been annotated as hypothetical proteins because of a lack of information. Such sequences require advanced computational strategies to derive clues about their biological role. Here, we re-annotated the complete genome of E. coli K-12 using a "dynamic biological data fusion" method, a computational strategy that combines heterogeneous biological data sources to maximize knowledge sharing and generates the intersection of the data sets. The functional re-annotation results reported in this paper provide high-quality data on the complete proteome of E. coli K-12. We updated all protein-coding genes from the previous annotation work and assigned new or more precise functions wherever possible. About 29% of the E. coli protein sequences previously annotated as unclear or unknown (hypothetical; without functions) have now been assigned clear, known functions. The analysis also led to the revision of protein sequences found to be false positives or poorly annotated. Information from this work is made available as a database, "REC-DB", intended as a repository of accurate and updated functional information. Availability: REC-DB is publicly available at

Keywords: E. coli; Re-annotation; Hypothetical proteins; Confidence level; Phylogenetics; Motif
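
The abstract describes data fusion as combining heterogeneous data sources and generating the intersection of the data sets. As a rough illustration only, and not the authors' actual pipeline, that idea might be sketched in Python as below. The function name `fuse_annotations`, the input layout, the source names, the protein IDs, and the "support" count used as a stand-in for a confidence level are all assumptions made for this example.

```python
from typing import Dict, Set

# Hypothetical input layout: source name -> {protein ID -> set of function terms}.
Annotations = Dict[str, Dict[str, Set[str]]]


def fuse_annotations(sources: Annotations) -> Dict[str, Dict[str, object]]:
    """Intersect function terms per protein across the sources that report it.

    "support" counts how many sources contributed a non-empty term set; it is
    an illustrative proxy for a confidence level, not the paper's definition.
    """
    # Collect every protein ID mentioned by at least one source.
    protein_ids: Set[str] = set()
    for per_protein in sources.values():
        protein_ids.update(per_protein)

    fused: Dict[str, Dict[str, object]] = {}
    for pid in sorted(protein_ids):
        # Keep only the non-empty term sets reported for this protein.
        term_sets = [s[pid] for s in sources.values() if s.get(pid)]
        if not term_sets:
            fused[pid] = {"consensus": set(), "support": 0}
            continue
        # Terms that every contributing source agrees on.
        consensus = set.intersection(*term_sets)
        fused[pid] = {"consensus": consensus, "support": len(term_sets)}
    return fused


if __name__ == "__main__":
    # Placeholder data: neither the IDs nor the terms come from the paper.
    demo = {
        "homology": {"protein_A": {"kinase", "transferase"}},
        "motif": {"protein_A": {"kinase"}},
    }
    print(fuse_annotations(demo))
```

An empty consensus with a support count above zero marks a protein whose sources disagree, which is the kind of entry a re-annotation effort would flag for closer review.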