Authors:
Jonathan Tang 1; McClain Kressman 2; Harsha Lakshmankumar 2; Belle Aduaka 2; Ava Jakusovszky 1; Paul Anderson 1 and Jean Davidson 2
Affiliations:
1 Department of Computer Science, California Polytechnic University, San Luis Obispo, California, U.S.A.;
2 Department of Biological Sciences, California Polytechnic University, San Luis Obispo, California, U.S.A.
Keyword(s):
Underspecification, Deep Learning, Neural Network, Breast Cancer, Subtype Classification.
Abstract:
In fields such as biomedicine, neural networks may encounter a problem known as underspecification, in which a model learns a solution that performs poorly and inconsistently when deployed in broader real-world scenarios. A current barrier to studying this problem in biomedical research is the lack of tools designed to uncover underspecification and measure its degree. For this reason, we have developed the Predicting Underspecification Monitoring Pipeline (PUMP). We demonstrate the utility of PUMP in the predictive modeling of breast cancer subtypes. In addition to providing methods to measure, monitor, and predict underspecification, we explore methods that reduce the production of underspecified models by using biological insight to rank candidate models.