A Sequence-Motif Based Approach to Protein Function Prediction via Deep-CNN Architecture

Vikash Kumar, Ashish Ranjan, Deng Cao, Gopalakrishnan Krishnasamy, Akshay Deepak

2023

Abstract

Determining protein function from the study of protein sub-sequences is a complex problem. Little literature addresses it directly, and a broad survey of the literature shows that existing approaches are biased toward full-length protein sequences. In this paper, a CNN-based architecture is introduced to detect motif information in a sub-sequence and predict its function. The functional inferences for sub-sequences are then used to facilitate the functional annotation of the full-length protein sequence. The results suggest that sub-sequence based protein studies merit further exploration. Compared with ProtVecGen-Plus, a (multi-segment + LSTM) approach, the proposed method shows an improvement of +1.24% and +4.66% for the biological process (BP) and molecular function (MF) subontologies, respectively. The proposed method also outperformed the hybrid ProtVecGen-Plus + MLDA by a margin of +3.45% on the MF dataset, while ranking second on the BP dataset. Overall, the proposed method produced better results for significantly large protein sequences (sequence length > 500 amino acids).
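The abstract describes a two-stage idea: predict function for sub-sequences, then use those predictions to annotate the full-length sequence. The sketch below illustrates only that general flow and is not the authors' implementation. The abstract does not state how sub-sequences are extracted or how their predictions are combined, so the sliding window, the averaging rule, the function names, and the term labels are all assumptions made for illustration. The CNN itself is omitted; its per-segment scores are taken as given.

```python
from typing import Dict, List


def split_into_segments(sequence: str, size: int, stride: int) -> List[str]:
    """Slide a fixed-size window over the sequence (assumed segmentation scheme).

    A final window anchored at the end of the sequence is added when the
    stride would otherwise leave trailing residues uncovered.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(sequence) <= size:
        return [sequence]
    last_start = len(sequence) - size
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + size] for s in starts]


def aggregate_scores(segment_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-term scores over all segments (assumed aggregation rule).

    A term that a segment does not score is treated as 0.0 for that segment.
    """
    if not segment_scores:
        return {}
    terms = set().union(*segment_scores)
    count = len(segment_scores)
    return {
        term: sum(scores.get(term, 0.0) for scores in segment_scores) / count
        for term in terms
    }


# Hypothetical usage: the sequence and the term labels are placeholders,
# and the scores stand in for the output of a sub-sequence classifier.
segments = split_into_segments("ABCDEFGHIJ", size=4, stride=3)
per_segment = [{"term_A": 0.8, "term_B": 0.2}, {"term_A": 0.4}]
protein_level = aggregate_scores(per_segment)
```

Averaging is only one possible choice here; taking the maximum score per term would instead favour a function that is strongly supported by a single motif-bearing segment.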


Paper Citation


in Harvard Style

Kumar V., Ranjan A., Cao D., Krishnasamy G. and Deepak A. (2023). A Sequence-Motif Based Approach to Protein Function Prediction via Deep-CNN Architecture. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, ISBN 978-989-758-623-1, pages 243-251. DOI: 10.5220/0011647800003393


in Bibtex Style

@conference{icaart23,
author={Vikash Kumar and Ashish Ranjan and Deng Cao and Gopalakrishnan Krishnasamy and Akshay Deepak},
title={A Sequence-Motif Based Approach to Protein Function Prediction via Deep-CNN Architecture},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2023},
pages={243-251},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011647800003393},
isbn={978-989-758-623-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - A Sequence-Motif Based Approach to Protein Function Prediction via Deep-CNN Architecture
SN - 978-989-758-623-1
AU - Kumar V.
AU - Ranjan A.
AU - Cao D.
AU - Krishnasamy G.
AU - Deepak A.
PY - 2023
SP - 243
EP - 251
DO - 10.5220/0011647800003393