Pedro Ferreira, Inês Dutra, Nuno A. Fonseca, Ryan Woods, Elizabeth Burnside


Breast screening is the regular examination of a woman’s breasts to detect breast cancer at an early stage. The sole exam approved for this purpose is mammography, which, despite the existence of more advanced technologies, is considered the cheapest and most efficient method for detecting cancer at a preclinical stage. Using machine learning techniques, we investigate how attributes obtained from mammograms relate to malignancy. In particular, this study focuses on how mass density can influence malignancy, using a data set of 348 patients that contains, among other information, biopsy results. To this end, we applied different learning algorithms to the data set using the WEKA tools and performed significance tests on the results. The conclusions are threefold: (1) automatic classification of a mammogram can match or exceed the results annotated by specialists, which can help doctors quickly concentrate on specific mammograms for a more thorough study; (2) mass density seems to be a good indicator of malignancy, as previous studies suggested; (3) we can obtain classifiers that predict mass density with a quality as good as that of a specialist blind to biopsy.
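The comparison between a classifier and a specialist, both scored against biopsy outcomes, can be illustrated with a minimal sketch. The study itself used the WEKA tools, and the abstract does not name the significance test applied, so the Python code, the choice of McNemar's exact test, and every function and variable name below are assumptions made for illustration. The labels are invented toy values, not data from the 348-patient set.

```python
from math import comb


def confusion_counts(predicted, actual):
    """Return (tp, fp, tn, fn) for binary labels where 1 means malignant."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    return tp, fp, tn, fn


def accuracy(predicted, actual):
    """Fraction of cases where the prediction agrees with the biopsy outcome."""
    tp, _fp, tn, _fn = confusion_counts(predicted, actual)
    return (tp + tn) / len(actual)


def mcnemar_exact_p(pred_a, pred_b, actual):
    """Two-sided exact McNemar p-value for two raters judged on the same cases.

    Only discordant cases count: those where exactly one rater is correct.
    """
    a_only = sum(1 for x, y, t in zip(pred_a, pred_b, actual) if x == t and y != t)
    b_only = sum(1 for x, y, t in zip(pred_a, pred_b, actual) if x != t and y == t)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the two raters never disagree in correctness
    k = min(a_only, b_only)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy labels (1 = malignant, 0 = benign); NOT taken from the study.
biopsy = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
specialist = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]
classifier = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    print("specialist accuracy:", accuracy(specialist, biopsy))
    print("classifier accuracy:", accuracy(classifier, biopsy))
    print("McNemar exact p:", mcnemar_exact_p(specialist, classifier, biopsy))
```

With so few toy cases the test cannot separate the two raters. The sketch only shows the shape of the comparison: both sets of labels are scored against the same biopsy outcomes, and the test looks only at cases where exactly one of them is correct.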



Paper Citation

in Harvard Style

Ferreira P., Dutra I., Fonseca N. A., Woods R. and Burnside E. (2011). STUDYING THE RELEVANCE OF BREAST IMAGING FEATURES. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2011) ISBN 978-989-8425-34-8, pages 337-342. DOI: 10.5220/0003172903370342
