Authors: Frederik Timme; Jochen Kerdels and Gabriele Peters

Affiliation: Chair of Human-Computer Interaction, Faculty of Mathematics and Computer Science, University of Hagen, Universitätsstraße 47, 58097 Hagen, Germany

Keyword(s): Convolutional Neural Networks, Performance Evaluation, Transformations, Data Augmentation.

Abstract: Convolutional Neural Networks (CNNs) have become the dominant and arguably most successful approach for the task of image classification since the release of AlexNet in 2012. Despite their excellent performance, CNNs continue to suffer from a still poorly understood lack of robustness when confronted with adversarial attacks or particular forms of handcrafted datasets. Here we investigate how the recognition performance of three widely used CNN architectures (AlexNet, VGG19 and ResNeXt) changes in response to certain input data transformations. A total of 10,000 images from the ILSVRC2012 validation dataset were systematically manipulated by means of common transformations (translation, rotation, color change, background replacement) as well as methods like image collages and jigsaw-like puzzles. The effects of both single and combined transformations are investigated. Our results show that three of these input image manipulations (rotation, collage, and puzzle) can cause a significant drop in classification accuracy in all evaluated architectures. In general, the more recent VGG19 and ResNeXt displayed higher robustness than AlexNet in our experiments, indicating that some progress has been made to harden the CNN approach against malicious or unforeseen input.
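
As a rough illustration of the jigsaw-like puzzle manipulation named in the abstract, the following minimal sketch cuts an image into a regular grid of tiles and reassembles them in random order. The abstract does not state the tile count or the exact procedure used, so the function name `jigsaw_shuffle`, the `grid` parameter, and the representation of an image as a list of pixel rows are assumptions made only for this example.

```python
import random


def jigsaw_shuffle(image, grid, seed=None):
    """Return a copy of `image` with its grid x grid tiles randomly permuted.

    `image` is a list of equal-length pixel rows (hypothetical representation);
    its height and width must both be divisible by `grid`.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image height and width must be divisible by grid")
    tile_h, tile_w = height // grid, width // grid

    # Cut the image into tiles, collected in row-major order.
    tiles = [
        [row[c * tile_w:(c + 1) * tile_w]
         for row in image[r * tile_h:(r + 1) * tile_h]]
        for r in range(grid)
        for c in range(grid)
    ]

    # Permute the tiles; a fixed seed makes the shuffle reproducible.
    random.Random(seed).shuffle(tiles)

    # Reassemble the shuffled tiles into full-width pixel rows.
    shuffled = []
    for r in range(grid):
        row_tiles = tiles[r * grid:(r + 1) * grid]
        for y in range(tile_h):
            shuffled.append([px for tile in row_tiles for px in tile[y]])
    return shuffled
```

In an evaluation of the kind described, each manipulated image would keep its original class label, and accuracy on the shuffled images would be compared with accuracy on the unmodified ones.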