Authors: David Campos; Sérgio Matos and José Luis Oliveira

Affiliation: University of Aveiro, Portugal

ISBN: 978-989-8425-28-7

Keyword(s): Natural Language Processing, Text Mining, Machine Learning, Named Entity Recognition, Gene/Protein Names.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; BioInformatics & Pattern Discovery; Computational Intelligence; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems

Abstract: With the overwhelming amount of publicly available data in the biomedical field, the tasks traditionally performed by expert database annotators have rapidly become difficult and very expensive. This situation has led to the development of computerized systems that extract information in a structured manner. The first step in such systems is the identification of named entities (e.g. gene/protein names), a task called Named Entity Recognition (NER). Much of the current research on this problem is based on Machine Learning (ML) techniques, which demand a careful and sensitive definition of the several methods used. This article presents a NER system that uses Conditional Random Fields (CRFs) as the machine learning technique and combines the best techniques recently described in the literature. The proposed system uses biomedical knowledge and a large set of orthographic and morphological features. An F-measure of 0.7936 was obtained on the BioCreative II Gene Mention corpus, a significantly better performance than similar baseline systems.
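As a rough illustration of what "orthographic and morphological features" can look like for a CRF-based tagger, the sketch below builds a feature dictionary for one token and computes an F-measure from match counts. The abstract does not list the paper's actual features, so every name here (`token_features`, `word_shape`, `f_measure`) and every individual feature is a hypothetical example, not the authors' implementation.

```python
import re


def word_shape(tok):
    """Collapse characters into classes: A (upper), a (lower), 0 (digit)."""
    shape = re.sub(r"[A-Z]", "A", tok)
    shape = re.sub(r"[a-z]", "a", shape)
    return re.sub(r"[0-9]", "0", shape)


def token_features(tokens, i):
    """Hypothetical feature set for tokens[i]; not the paper's own list."""
    tok = tokens[i]
    feats = {
        "lower": tok.lower(),
        # Orthographic cues: capitalization, digits, symbols.
        "init_cap": tok[:1].isupper(),
        "all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "shape": word_shape(tok),
        # Morphological cues: short prefixes and suffixes.
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
    }
    # Context cues: neighbouring tokens, or sentence-boundary flags.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def f_measure(tp, fp, fn):
    """Balanced F-measure (F1) from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = ["IL-2", "gene", "expression"]
    print(token_features(sentence, 0))
    print(f_measure(tp=8, fp=2, fn=2))
```

One such dictionary per token, paired with a label sequence (for example a B/I/O encoding of gene mentions), is the kind of input a CRF toolkit typically expects; which toolkit and label scheme the paper used is not stated in the abstract.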

CC BY-NC-ND 4.0

Paper citation in several formats:
Campos, D.; Matos, S. and Oliveira, J. (2010). RECOGNITION OF GENE/PROTEIN NAMES USING CONDITIONAL RANDOM FIELDS. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010), ISBN 978-989-8425-28-7, pages 275-280. DOI: 10.5220/0003096902750280

@conference{kdir10,
author={David Campos and Sérgio Matos and José Luis Oliveira},
title={RECOGNITION OF GENE/PROTEIN NAMES USING CONDITIONAL RANDOM FIELDS},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={275--280},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003096902750280},
isbn={978-989-8425-28-7},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - RECOGNITION OF GENE/PROTEIN NAMES USING CONDITIONAL RANDOM FIELDS
SN - 978-989-8425-28-7
AU - Campos, D.
AU - Matos, S.
AU - Oliveira, J.
PY - 2010
SP - 275
EP - 280
DO - 10.5220/0003096902750280
