Authors:
P. Boba 1; P. Weil 2; F. Hoffgaard 2 and K. Hamacher 2
Affiliations:
1 Technische Universität Dresden, Germany; 2 Technische Universität Darmstadt, Germany
Keyword(s):
Sequence analysis, Human immunodeficiency virus, HIV, Co-evolution, Mutual information, Data mining, Machine learning.
Related Ontology Subjects/Areas/Topics:
Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Sequence Analysis; Structural Bioinformatics
Abstract:
Proteins, as molecular phenotypes, need to maintain their stability, fold, and functionality throughout their individual and collective evolution. These properties are maintained by selective pressure that reveals itself in sequence data sets. Small adaptive changes are usually possible, but in general the conservation of structure and function implies the co-evolution of amino acids within the molecule. We analyze the two most important enzymes in the progression of viral infection by the human immunodeficiency virus (HIV), namely the reverse transcriptase and the protease, within an information-theoretical framework to gain insight into the selective pressure acting locally and globally on the enzymes. To this end, we computed mutual information within the proteins and between the proteins for some 40,000 sequences. We discuss the results of intra- and inter-protein co-evolution of residues in these enzymes and finally annotate important structural-evolutionary correlations. In particular, we focus on the reverse transcriptase and on a small signal indicating a potential co-evolution between the protease and the reverse transcriptase. We convinced ourselves that our sampling is sufficiently large and that no normalization scheme needs to be applied. We conclude with a short outlook on potential implications for drug resistance development.
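As an illustration of the quantity the abstract refers to, the following is a minimal sketch of mutual information between two columns of a sequence alignment, using plain frequency counts and no normalization or finite-sample correction. It is not the authors' implementation: the function name `column_mutual_information` and the toy alignment are hypothetical, and the sketch assumes that each row of the alignment is one sequence and that all rows have equal length.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Hypothetical helper: probabilities are estimated as raw frequencies
    over the sequences, with no normalization or bias correction.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must cover the same number of sequences")
    n = len(col_i)
    count_i = Counter(col_i)                # residue counts at position i
    count_j = Counter(col_j)                # residue counts at position j
    count_ij = Counter(zip(col_i, col_j))   # joint counts of residue pairs
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with counts
        mi += (n_ab / n) * log2(n_ab * n / (count_i[a] * count_j[b]))
    return mi


# Toy alignment (made up for illustration): one sequence per row.
alignment = ["AKL", "AKL", "GRL", "GRL"]
columns = list(zip(*alignment))  # transpose rows into per-position columns

mi_coupled = column_mutual_information(columns[0], columns[1])
mi_conserved = column_mutual_information(columns[0], columns[2])
```

In this toy case the first two positions always change together, so they share one bit of information, while the fully conserved third position shares none with any other. For an inter-protein comparison, one would presumably pair a column from each protein across the same set of isolates; that pairing is an assumption here, since the abstract does not describe it.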