Authors: Jos van de Wolfshaar; Marco Wiering and Lambert Schomaker
        
Affiliation: University of Groningen, Netherlands
        
        
        
        
        
Keyword(s): Reinforcement Learning, Deep Learning, Learning Vector Quantization, Nearest Prototype Classification, Deep Reinforcement Learning, Actor-Critic.
        
        
            
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Evolutionary Computing; Health Engineering and Technology Applications; Human-Computer Interaction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Methodologies and Methods; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems; Theory and Methods
            
        
        
            
Abstract: We introduce a novel actor-critic approach for deep reinforcement learning that is based on learning vector quantization. We replace the softmax operator of the policy with a more general and more flexible operator that is similar to the robust soft learning vector quantization algorithm. We compare our approach to the default A3C architecture on three Atari 2600 games and a simple game called Catch. We show that the proposed algorithm outperforms the softmax architecture on Catch. On the Atari games, the results are mixed: no single model performs best across all three games.
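
The idea of replacing the policy's softmax with a prototype-based operator can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `lvq_policy`, the use of negative squared Euclidean distance as the similarity, the `temperature` parameter, and the number of prototypes per action are all assumptions made for illustration. Each action owns a few prototype vectors, and an action's probability is the share of similarity mass held by its prototypes, loosely in the spirit of robust soft learning vector quantization.

```python
import numpy as np

def lvq_policy(features, prototypes, prototype_actions, num_actions, temperature=1.0):
    """Hypothetical prototype-based policy output (illustrative only).

    features:          (dim,) feature vector from the network body
    prototypes:        (num_prototypes, dim) prototype vectors
    prototype_actions: (num_prototypes,) action index owned by each prototype
    """
    # Squared Euclidean distance from the features to every prototype.
    sq_dist = np.sum((prototypes - features) ** 2, axis=1)
    # Assumed similarity: closer prototypes get larger (less negative) scores.
    scores = -sq_dist / temperature
    scores -= scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    # Sum the weights of all prototypes that belong to the same action.
    probs = np.zeros(num_actions)
    np.add.at(probs, prototype_actions, weights)
    return probs / probs.sum()

# Toy example: 3 actions, 2 prototypes per action, 4-dimensional features.
rng = np.random.default_rng(0)
num_actions, protos_per_action, dim = 3, 2, 4
prototypes = rng.normal(size=(num_actions * protos_per_action, dim))
prototype_actions = np.repeat(np.arange(num_actions), protos_per_action)
features = rng.normal(size=dim)
probs = lvq_policy(features, prototypes, prototype_actions, num_actions)
```

With a single prototype per action, this sketch reduces to a softmax over negative squared distances; allowing several prototypes per action is one way such an operator can be more flexible than a plain softmax layer.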