Assessment of the Relationship Between Attribute Coding and the Interpretability of Machine Learning Models: An Analysis in the Context of Children and Adolescents with Depression

Ludmila Nascimento, Marcelo Balbino, Maycoln Teodoro, Cristiane Nobre

2024

Abstract

Depression is a global public health challenge that affects approximately 300 million people. Artificial Intelligence and Machine Learning have revolutionized the healthcare sector, allowing the development of models to diagnose depression. Tabular data, common in healthcare, requires preprocessing, including encoding categorical attributes as numeric values, because many Machine Learning algorithms support only numeric data. This study investigates different encoding methods for nominal (non-ordinal) categorical attributes in a dataset on children and adolescents with Major Depressive Disorder (MDD). The comparison showed that the XGBoost algorithm combined with the Hash Encoding, Customized One Hot, Frequency, and Dummy encoding techniques proved most effective on the analyzed dataset. However, not all of these encodings are interpretable. These results highlight the importance of choosing an appropriate encoding method, both for the accuracy of Machine Learning models and for their interpretability in healthcare.
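
To make the encoding families named in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The attribute values (`public`, `private`, `other`), the function names, and the bucket count are hypothetical and chosen only for illustration. The paper's "Customized One Hot" variant is not described in the abstract and is not reproduced here, and the hash variant is simplified to a single bucket index per value.

```python
import hashlib
from collections import Counter


def one_hot(values):
    """One binary column per category (columns in sorted category order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def dummy(values):
    """Like one-hot, but the first category (sorted order) is dropped and
    acts as the all-zeros reference level."""
    categories = sorted(set(values))[1:]
    return [[1 if v == c else 0 for c in categories] for v in values]


def frequency(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def hash_bucket(values, n_buckets=4):
    """Simplified hash encoding: map each category to a bucket index.
    Different categories may collide in the same bucket."""
    def bucket(v):
        digest = hashlib.md5(str(v).encode("utf-8")).hexdigest()
        return int(digest, 16) % n_buckets
    return [bucket(v) for v in values]


# Hypothetical nominal attribute with three categories.
column = ["public", "private", "public", "other"]

one_hot_rows = one_hot(column)
dummy_rows = dummy(column)
frequency_values = frequency(column)
bucket_values = hash_bucket(column)
```

In this sketch, one-hot and dummy columns each map back to a single named category. A hash bucket may mix several categories, and a frequency value is shared by any categories that occur equally often (here `private` and `other`), which is one plausible reason such encodings can be harder to interpret.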



Paper Citation


in Harvard Style

Nascimento L., Balbino M., Teodoro M. and Nobre C. (2024). Assessment of the Relationship Between Attribute Coding and the Interpretability of Machine Learning Models: An Analysis in the Context of Children and Adolescents with Depression. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: HEALTHINF; ISBN 978-989-758-688-0, SciTePress, pages 482-489. DOI: 10.5220/0012386200003657


in Bibtex Style

@conference{healthinf24,
author={Ludmila Nascimento and Marcelo Balbino and Maycoln Teodoro and Cristiane Nobre},
title={Assessment of the Relationship Between Attribute Coding and the Interpretability of Machine Learning Models: An Analysis in the Context of Children and Adolescents with Depression},
booktitle={Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: HEALTHINF},
year={2024},
pages={482-489},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012386200003657},
isbn={978-989-758-688-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: HEALTHINF
TI - Assessment of the Relationship Between Attribute Coding and the Interpretability of Machine Learning Models: An Analysis in the Context of Children and Adolescents with Depression
SN - 978-989-758-688-0
AU - Nascimento L.
AU - Balbino M.
AU - Teodoro M.
AU - Nobre C.
PY - 2024
SP - 482
EP - 489
DO - 10.5220/0012386200003657
PB - SciTePress