Authors: Gabriel Oliveira; Lucas David; Rafael Padilha; Ana Paula da Silva; Francine de Paula; Lucas Infante; Lucio Jorge; Patricia Xavier and Zanoni Dias
        
        
Affiliation: Institute of Computing, University of Campinas, Campinas, SP, Brazil
        
        
        
        
        
Keywords: Deep Learning, Dataset Bias, Model Interpretability, Medical Image Diagnosis, Retinal OCT Analysis.
        
        
            
                
                
            
        
        
            
Abstract: Deep learning classifiers can achieve high accuracy in many medical imaging analysis problems. However, when evaluated on images from outside the training distribution (e.g., from new patients or generated by different medical equipment), their performance often degrades, suggesting that they might have learned specific characteristics and biases of the training set and cannot generalize to real-world scenarios. In this work, we discuss how Transfer Learning, the standard training technique employed in most visual medical tasks in the literature, coupled with small and poorly collected datasets, can induce the model to capture such biases and data collection artifacts. We use the classification of eye diseases from retinal OCT images as the backdrop for our discussion, evaluating several well-established convolutional neural network architectures for this problem. Our experiments show that models can achieve high accuracy on this problem, yet when we interpret their decisions and learned features, they often attend to regions of the images unrelated to diseases.