One Size Does Not Fit All: An Ensemble Approach Towards Information Extraction from Adverse Drug Event Narratives

Susmitha Wunnava, Xiao Qin, Tabassum Kakar, Xiangnan Kong, Elke A. Rundensteiner, Sanjay K. Sahoo, Suranjan De

2018

Abstract

Recognizing named entities in Adverse Drug Reactions narratives is a fundamental step towards extracting valuable patient information from unstructured text into a structured, and thus actionable, format. This in turn unlocks advanced data analytics for intelligent pharmacovigilance. Yet existing biomedical named entity recognition (NER) tools are limited in their ability to identify certain entity types in these domain-specific narratives, and they differ significantly from one another in accuracy. To address these challenges, we propose an ensemble approach that integrates a rich variety of named entity recognizers to produce the final result. First, one critical problem faced by NER in the biomedical context is that the data is highly skewed. That is, only 1% of words belong to a certain medical entity type, such as the reason for medication usage, compared to all other non-reason words. To overcome this class imbalance problem, we propose a balanced, under-sampled bagging strategy that depends on the imbalance level. Second, we present an approach based on an ensemble of heterogeneous recognizers that leverages a novel ensemble combiner. Our experimental results show that for biomedical text datasets: (i) a balanced learning environment along with an Ensemble of Heterogeneous Classifiers consistently improves performance over individual base learners, and (ii) stacking-based ensemble combiner methods outperform simple Majority Voting by 0.30 F-measure.
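The two ideas named in the abstract, balanced under-sampled bagging and a Majority Voting baseline, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `balanced_bags` and `majority_vote`, the rule that the number of bags equals the rounded-up imbalance ratio, and the `"REASON"`/`"O"` labels are all assumptions made for illustration. The stacking-based combiner that the paper reports as stronger is not shown here.

```python
import math
import random
from collections import Counter


def balanced_bags(labels, minority_label, seed=0):
    """Return index bags that each pair all minority samples with an
    equally sized random sample of majority samples."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority:
        raise ValueError("no samples carry the minority label")
    # Assumption: the bag count follows the imbalance level, here taken
    # as the majority/minority ratio rounded up.
    n_bags = max(1, math.ceil(len(majority) / len(minority)))
    bags = []
    for _ in range(n_bags):
        # Under-sample the majority class down to the minority size.
        sampled = rng.sample(majority, min(len(minority), len(majority)))
        bags.append(sorted(minority + sampled))
    return bags


def majority_vote(predictions):
    """Combine label sequences from several recognizers, token by token,
    by picking the most frequent label at each position."""
    return [
        Counter(token_labels).most_common(1)[0][0]
        for token_labels in zip(*predictions)
    ]
```

With 99 non-reason tokens and a single reason token, this sketch yields 99 bags of two tokens each, so every base learner trained on a bag sees a balanced class distribution. A stacking combiner would replace `majority_vote` with a learned model over the base recognizers' outputs.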


Paper Citation


in Harvard Style

Wunnava S., Qin X., Kakar T., Kong X., Rundensteiner E., Sahoo S. and De S. (2018). One Size Does Not Fit All: An Ensemble Approach Towards Information Extraction from Adverse Drug Event Narratives. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF; ISBN 978-989-758-281-3, SciTePress, pages 176-188. DOI: 10.5220/0006600201760188


in Bibtex Style

@conference{healthinf18,
author={Susmitha Wunnava and Xiao Qin and Tabassum Kakar and Xiangnan Kong and Elke A. Rundensteiner and Sanjay K. Sahoo and Suranjan De},
title={One Size Does Not Fit All: An Ensemble Approach Towards Information Extraction from Adverse Drug Event Narratives},
booktitle={Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF},
year={2018},
pages={176-188},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006600201760188},
isbn={978-989-758-281-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 5: HEALTHINF
TI - One Size Does Not Fit All: An Ensemble Approach Towards Information Extraction from Adverse Drug Event Narratives
SN - 978-989-758-281-3
AU - Wunnava S.
AU - Qin X.
AU - Kakar T.
AU - Kong X.
AU - Rundensteiner E.
AU - Sahoo S.
AU - De S.
PY - 2018
SP - 176
EP - 188
DO - 10.5220/0006600201760188
PB - SciTePress