Authors:
J. Vince Pulido 1; Sana Syed 2 and Donald E. Brown 3
Affiliations:
1 Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, U.S.A.
2 School of Medicine, University of Virginia, Charlottesville, VA, U.S.A.
3 School of Data Science, University of Virginia, Charlottesville, VA, U.S.A.
Keyword(s):
Machine Learning, Semi-supervised Learning, Histopathology.
Abstract:
One of the greatest obstacles to the adoption of deep neural networks in new medical applications is that training these models requires a large number of manually labeled samples. To circumvent the laborious annotation process, some researchers have turned to semi-supervised learning techniques, in which models learn from a large body of unlabeled data along with a smaller set of labeled data. However, these techniques have not been fully examined in the histology setting, where there is a high degree of noise. This work investigates an extension of the semi-supervised method MixMatch, which we call CoMixMatch, that applies semi-supervised co-teaching and a contrastive unlabeled loss. More specifically, we study how these models perform in a highly noisy, open-set histology setting. The findings motivate the development of semi-supervised methods to reduce the annotation costs commonly encountered in medical data applications.