Authors: Hao Wan, Carolina Ruiz, and Joseph Beck

Affiliation: Worcester Polytechnic Institute, United States

ISBN: 978-989-758-012-3

Keyword(s): Sequence Classification, Feature Generation, Mutated Subsequences.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools ; Bioinformatics ; Biomedical Engineering ; Data Mining and Machine Learning ; Pattern Recognition, Clustering and Classification ; Sequence Analysis

Abstract: In this paper, we present a new feature generation algorithm for sequence data sets, called Mutated Subsequence Generation (MSG). Given a data set of sequences, the MSG algorithm generates features from these sequences by incorporating mutative positions in subsequences. We compare this algorithm with other sequence-based feature generation algorithms, including position-based, k-grams, and k-gapped pairs. Our experiments show that the MSG algorithm outperforms these other algorithms in domains in which the presence, not the specific location, of sequential patterns discriminates among classes in a data set.
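
The three baseline feature generators named in the abstract can be illustrated with a minimal Python sketch. It follows common textbook definitions rather than the paper's exact formulations, and it does not implement MSG itself. The function names, the feature encoding, and the reading of a k-gapped pair as two symbols separated by exactly k positions are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative baselines only; names and encodings are hypothetical,
# not taken from the paper.

def position_features(seq):
    """Position-based: one binary feature per (position, symbol)."""
    return {f"pos{i}={s}": 1 for i, s in enumerate(seq)}

def kgram_features(seq, k):
    """k-grams: count every contiguous substring of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kgapped_pair_features(seq, k):
    """k-gapped pairs (assumed definition): count symbol pairs
    separated by exactly k intervening positions."""
    return Counter(
        (seq[i], seq[i + k + 1]) for i in range(len(seq) - k - 1)
    )

if __name__ == "__main__":
    s = "ACGTAC"
    print(position_features(s))
    print(kgram_features(s, 2))
    print(kgapped_pair_features(s, 1))
```

Note that k-grams and k-gapped pairs record which patterns occur regardless of where, while position-based features tie each symbol to a fixed location, which is the distinction the abstract draws when it contrasts presence with specific location.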

CC BY-NC-ND 4.0

Paper citation in several formats:
Wan, H.; Ruiz, C. and Beck, J. (2014). A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014) ISBN 978-989-758-012-3, pages 68-79. DOI: 10.5220/0004808200680079

@conference{bioinformatics14,
author={Hao Wan and Carolina Ruiz and Joseph Beck},
title={A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)},
year={2014},
pages={68-79},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004808200680079},
isbn={978-989-758-012-3},
}

TY - CONF

JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)
TI - A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation
SN - 978-989-758-012-3
AU - Wan, H.
AU - Ruiz, C.
AU - Beck, J.
PY - 2014
SP - 68
EP - 79
DO - 10.5220/0004808200680079
