Comparison of Two-Criterion Evolutionary Filtering Techniques in Cardiovascular Predictive Modelling
Christina Brester, Jussi Kauhanen, Tomi-Pekka Tuomainen, Eugene Semenkin, Mikko Kolehmainen
2016
Abstract
In this paper we compare a number of two-criterion filtering techniques for feature selection in cardiovascular predictive modelling. We design two-objective schemes based on different combinations of four criteria that describe the quality of reduced feature sets. To find attribute subsets that meet these criteria optimally, we apply a cooperative multi-objective genetic algorithm. It includes various search strategies running in parallel, which removes the need for additional experiments to choose the most effective heuristic for the problem at hand. We evaluated the filtering techniques in combination with an SVM model on KIHD (Kuopio Ischemic Heart Disease Risk Factor Study), a population-based epidemiological database. The dataset contains a large number of variables describing various characteristics of the study participants. These baseline measures were collected at the beginning of the study. In addition, all major cardiovascular events that occurred among the participants over an average of 27 years of follow-up were collected from the national health registries. We found that the filtering technique based on intra- and inter-class distances reduced the feature set roughly 11-fold (from 433 to 38 features) without detriment to the predictive ability of the SVM model. This suggests that the number of clinical tests needed to collect data relevant to predicting cardiovascular diseases could be reduced.
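As an illustration of the two-criterion idea, the following minimal sketch scores a candidate feature subset by an intra-class and an inter-class distance and compares two subsets by Pareto dominance. The abstract does not give the exact formulas, so the centroid-based definitions used here are an assumption, and the names `intra_inter`, `dominates`, `project` and `centroid` are hypothetical. The search itself (the cooperative multi-objective genetic algorithm) is not shown.

```python
from math import dist
from statistics import fmean


def centroid(rows):
    """Mean point of a non-empty list of equal-length tuples."""
    return tuple(fmean(col) for col in zip(*rows))


def project(rows, mask):
    """Keep only the features whose mask entry is truthy."""
    return [tuple(v for v, keep in zip(row, mask) if keep) for row in rows]


def intra_inter(rows, labels, mask):
    """Return (intra, inter) for the feature subset encoded by mask.

    Assumed definitions (not taken from the paper):
      intra: mean distance from each sample to its own class centroid
      inter: class-size-weighted mean distance from each class centroid
             to the overall centroid
    """
    reduced = project(rows, mask)
    overall = centroid(reduced)
    intra = 0.0
    inter = 0.0
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(reduced, labels) if y == cls]
        c = centroid(members)
        intra += sum(dist(r, c) for r in members)
        inter += len(members) * dist(c, overall)
    n = len(reduced)
    return intra / n, inter / n


def dominates(a, b):
    """True if score a Pareto-dominates score b.

    Scores are (intra, inter); intra is minimised, inter is maximised.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [(0, 5), (0, 6), (10, 5), (10, 6)]
labels = [0, 0, 1, 1]
score_f0 = intra_inter(rows, labels, (1, 0))
score_f1 = intra_inter(rows, labels, (0, 1))
print(score_f0, score_f1, dominates(score_f0, score_f1))
```

In a filter of this kind, a multi-objective search would keep the non-dominated masks and pass the selected features to the classifier; because the scores depend only on the data and not on the classifier, no SVM training is needed during the search.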
Paper Citation
in Harvard Style
Brester C., Kauhanen J., Tuomainen T., Semenkin E. and Kolehmainen M. (2016). Comparison of Two-Criterion Evolutionary Filtering Techniques in Cardiovascular Predictive Modelling. In Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-989-758-198-4, pages 140-145. DOI: 10.5220/0005971101400145
in Bibtex Style
@conference{icinco16,
author={Christina Brester and Jussi Kauhanen and Tomi-Pekka Tuomainen and Eugene Semenkin and Mikko Kolehmainen},
title={Comparison of Two-Criterion Evolutionary Filtering Techniques in Cardiovascular Predictive Modelling},
booktitle={Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2016},
pages={140-145},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005971101400145},
isbn={978-989-758-198-4},
}
in EndNote Style
TY  - CONF 
JO  - Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI  - Comparison of Two-Criterion Evolutionary Filtering Techniques in Cardiovascular Predictive Modelling
SN  - 978-989-758-198-4
AU  - Brester C. 
AU  - Kauhanen J. 
AU  - Tuomainen T. 
AU  - Semenkin E. 
AU  - Kolehmainen M. 
PY  - 2016
SP  - 140
EP  - 145
DO  - 10.5220/0005971101400145