Authors: Olivier Teytaud and Sylvain Gelly

Affiliation: TAO (Inria, Univ. Paris-Sud, UMR CNRS-8623), France

Keyword(s): Evolutionary computation and control, Optimization algorithms.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Evolutionary Computation and Control; Formal Methods; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Optimization Algorithms; Planning and Scheduling; Simulation and Modeling; Symbolic Systems

Abstract: Many stochastic dynamic programming tasks in continuous action spaces are tackled through discretization. Here we avoid discretization; approximate dynamic programming (ADP) then involves (i) many learning tasks, performed here by Support Vector Machines, for Bellman-function regression, and (ii) many non-linear optimization tasks for action selection, for which we compare many algorithms. We include discretizations of the domain as particular non-linear programming tools in our experiments, so that we also compare optimization approaches with discretization methods. We conclude that robustness is strongly required in the non-linear optimizations in ADP. Experimental results show that (i) discretization is sometimes inefficient, but some specific discretization is very efficient for "bang-bang" problems; (ii) simple evolutionary tools outperform quasi-random search in a stable manner; (iii) gradient-based techniques are much less stable; (iv) for most high-dimensional "less unsmooth" problems, Covariance Matrix Adaptation ranks first.
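
The action-selection step mentioned in the abstract (a non-linear optimization over a continuous action, solved once per visited state) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy objective `q_value`, its transition and cost, the bounds, and the budgets. Plain pseudo-random sampling stands in for quasi-random search, and a basic (1+1) evolution strategy stands in for the "simple evolutionary tools".

```python
import random


def q_value(state, action):
    """Hypothetical one-step objective to minimize (not from the paper)."""
    next_state = state + action         # assumed deterministic transition
    immediate_cost = 0.1 * action ** 2  # assumed control cost
    value_next = next_state ** 2        # stand-in for a learned Bellman-function regressor
    return immediate_cost + value_next


def random_search(objective, low, high, budget, rng):
    """Baseline: sample actions uniformly in [low, high] and keep the best."""
    candidates = (rng.uniform(low, high) for _ in range(budget))
    return min(candidates, key=objective)


def one_plus_one_es(objective, low, high, budget, rng, sigma=1.0):
    """(1+1) evolution strategy with a one-fifth-rule style step-size update."""
    parent = 0.5 * (low + high)  # start from the middle of the action range
    parent_cost = objective(parent)
    for _ in range(budget):
        # Gaussian mutation, clipped to the admissible action range.
        child = min(high, max(low, parent + sigma * rng.gauss(0.0, 1.0)))
        child_cost = objective(child)
        if child_cost < parent_cost:
            parent, parent_cost = child, child_cost
            sigma *= 1.5           # success: enlarge the step
        else:
            sigma *= 1.5 ** -0.25  # failure: shrink the step
    return parent


if __name__ == "__main__":
    rng = random.Random(0)
    state = 1.1

    def objective(action):
        return q_value(state, action)

    print("random search:", random_search(objective, -2.0, 2.0, 200, rng))
    print("(1+1)-ES:     ", one_plus_one_es(objective, -2.0, 2.0, 200, rng))
```

For this toy objective the minimizer at `state = 1.1` is `action = -1`, so both routines can be checked against a known answer. In an actual ADP loop the objective would call the learned regressor, and each evaluation would be noisier and more expensive, which is where the robustness of the optimizer starts to matter.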