Authors:
Sérgio Mosquim Júnior 1,2 and Juliana de Oliveira 1
Affiliations:
1 São Paulo State University, Brazil; 2 Uppsala University, Sweden
Keyword(s):
Data Mining, Breast Cancer, Decision Trees, Artificial Neural Networks.
Abstract:
Breast cancer has the second-highest incidence among all cancer types and is the fifth leading cause of
cancer-related death among women. In Brazil, breast cancer mortality rates have been rising. Cancer
classification is intricate, particularly when differentiating subtypes. In this context, data mining becomes a
fundamental tool for analyzing genotypic data, improving diagnostics, treatment and patient care. Because the
high dimensionality of the data is problematic, methods to reduce it must be applied. Hence, the present study
analyzes two data mining methods: decision trees and artificial neural networks. Weka® and MATLAB® were
used to implement these two methodologies. The decision trees identified genes important for the
classification. The optimal artificial neural network architecture consists of two layers, one with 99 neurons
and the other with 5. Both data mining techniques were able to classify the data with high accuracy.
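
To make the reported architecture concrete, the following is a minimal sketch of a feed-forward network with two layers of 99 and 5 neurons. It is illustrative only: the study used MATLAB®, not Python; treating the two layers as hidden layers followed by a softmax output is an assumption; and the number of input features (N_FEATURES), the number of classes (N_CLASSES), the tanh activation and the random, untrained weights are all hypothetical choices not stated in the abstract.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply one dense layer: activation(W x + b)."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def build_network(n_features, n_classes, hidden=(99, 5), seed=0):
    """Build layers for n_features -> 99 -> 5 -> n_classes (untrained)."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """Forward pass: tanh on the hidden layers, softmax on the output layer."""
    *hidden_layers, output_layer = network
    for layer in hidden_layers:
        x = dense(x, layer, math.tanh)
    logits = dense(x, output_layer, lambda v: v)
    return softmax(logits)


# Hypothetical sizes: neither value is given in the abstract.
N_FEATURES = 20  # e.g. genes kept after dimensionality reduction (assumed)
N_CLASSES = 3    # e.g. number of subtypes (assumed)

network = build_network(N_FEATURES, N_CLASSES)
sample = [0.1] * N_FEATURES           # placeholder expression profile
probs = forward(network, sample)      # class probabilities from random weights
layer_sizes = [len(biases) for _, biases in network]
print(layer_sizes)
```

With untrained weights the probabilities carry no meaning; the sketch only shows how a profile would pass through the 99-neuron and 5-neuron layers before classification.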