Authors:
Ligia Ferreira de Carvalho Gonçalves 1; Caio Davi Rabelo Fiorini 1; Daniel Rocha Franca 2; Marta Dias Moreira Noronha 3; Mark Song 3 and Luis Enrique Zárate Galvez 3
Affiliations:
1 Data Science and Artificial Intelligence, Pontifícia Universidade Católica de Minas Gerais, Rua Claudio Manuel, Belo Horizonte, Brazil
2 Computer Science, Pontifícia Universidade Católica de Minas Gerais, Rua Claudio Manuel, Belo Horizonte, Brazil
3 Institute of Exact Sciences and Computer Science, Pontifícia Universidade Católica de Minas Gerais, Rua Claudio Manuel, Belo Horizonte, Brazil
Keyword(s):
Stroke, Machine Learning, Genetic Algorithm, Decision Tree, Middle-Aged, Rules.
Abstract:
Data mining and machine learning techniques are widely used to extract knowledge from medical databases, notably to improve diagnostic systems. Decision trees are supervised white-box machine learning models that are simple and easy to interpret. In this work, we propose using these techniques to describe the profile of middle-aged adults (aged 40-59) diagnosed with stroke, a disease that was one of the main causes of death in Brazil in previous years. A genetic algorithm was applied to select the best features, and the Decision Tree algorithm was then applied to the database provided by the 2019 National Health Survey to obtain the most comprehensive rules and identify the most relevant attributes for describing the profile of these individuals. The results indicate that the rules generated for middle-aged adults mainly concern routine habits, such as work or salt consumption.
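The two-stage idea in the abstract (a genetic algorithm choosing a feature subset, then a decision tree turned into readable rules) can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the National Health Survey data: the attribute names, the synthetic binary dataset, the fitness penalty, and all GA settings below are assumptions made only for illustration.

```python
import itertools
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(rows, labels, features, depth):
    """Grow a small tree on binary features; a leaf is the majority class."""
    majority = int(2 * sum(labels) >= len(labels))
    if depth == 0 or not features or gini(labels) == 0.0:
        return majority
    def split_cost(f):
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        return len(left) * gini(left) + len(right) * gini(right)
    best = min(features, key=split_cost)  # lowest weighted impurity
    rest = [f for f in features if f != best]
    branches = []
    for value in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        if not idx:
            branches.append(majority)
        else:
            branches.append(build_tree([rows[i] for i in idx],
                                       [labels[i] for i in idx], rest, depth - 1))
    return (best, branches[0], branches[1])

def predict(tree, row):
    """Follow the branches until a leaf (an int class label) is reached."""
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if row[f] == 1 else left
    return tree

def extract_rules(tree, names, path=()):
    """List every root-to-leaf path as a (condition, predicted class) pair."""
    if not isinstance(tree, tuple):
        return [(" AND ".join(path) or "TRUE", tree)]
    f, left, right = tree
    return (extract_rules(left, names, path + (f"{names[f]}=0",)) +
            extract_rules(right, names, path + (f"{names[f]}=1",)))

def fitness(mask, rows, labels, depth=3):
    """Training accuracy of a tree on the masked features, minus a size penalty."""
    features = [i for i, bit in enumerate(mask) if bit]
    if not features:
        return 0.0
    tree = build_tree(rows, labels, features, depth)
    acc = sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)
    return acc - 0.01 * len(features)  # assumed penalty favouring small subsets

def select_features(rows, labels, n_features, rng, pop_size=20, generations=15):
    """Genetic algorithm over bit masks: elitism, tournaments, crossover, mutation."""
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    score = lambda m: fitness(m, rows, labels)
    for _ in range(generations):
        children = [max(pop, key=score)[:]]  # keep the best mask unchanged
        while len(children) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament of size 3
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = [bit ^ (rng.random() < 0.05) for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = children
    return max(pop, key=score)

# Hypothetical binary attributes; the label depends only on the first two.
names = ["works", "high_salt", "noise_a", "noise_b", "noise_c", "noise_d"]
rows = [list(bits) for bits in itertools.product((0, 1), repeat=len(names))]
labels = [int(r[1] == 1 and r[0] == 0) for r in rows]

best = select_features(rows, labels, len(names), random.Random(0))
chosen = [i for i, bit in enumerate(best) if bit]
tree = build_tree(rows, labels, chosen, depth=3)
for condition, outcome in extract_rules(tree, names):
    print(f"IF {condition} THEN class={outcome}")
```

Scoring each candidate subset by the accuracy of the tree it produces is a wrapper-style design: the selection step optimizes directly for the model that will later supply the rules, at the cost of training many trees. A real study would score subsets on held-out data rather than on the training set used here.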