Author: David M. W. Powers
                    
                        
                    
                    
                
        
        
            Affiliation: Beijing University of Technology and Flinders University, China
                
        
        
        
        
        
            Keyword(s): Kappa, Correlation, Informedness, Markedness, ROC, AUC, Boosting, Bagging, AdaBoost, MultiBoost, Machine Learning, Signal Processing, Classifier Fusion, Language Technology, Concept Learning.
        
        
            
            Related Ontology Subjects/Areas/Topics: Hybrid Learning Systems; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Perception and Awareness; Robotics and Automation; Sensors Fusion; Signal Processing, Sensors, Systems Modeling and Control; Signal Reconstruction; Vision, Recognition and Reconstruction
            
        
        
            
                Abstract: 
                There has been considerable interest in boosting and bagging, including the combination of the adaptive techniques of AdaBoost with the random selection with replacement techniques of Bagging. At the same time, there has been a revisiting of the way we evaluate, with chance-corrected measures like Kappa, Informedness, Correlation or ROC AUC being advocated. This leads to the question of whether learning algorithms can do better by optimizing an appropriate chance-corrected measure. Indeed, it is possible for a weak learner to optimize Accuracy to the detriment of the more realistic chance-corrected measures, and when this happens the booster can give up too early. This phenomenon is known to occur with conventional Accuracy-based AdaBoost, and the MultiBoost algorithm has been developed to overcome such problems using restart techniques based on bagging. This paper thus complements the theoretical work showing the necessity of using chance-corrected measures for evaluation, with empirical work showing how use of a chance-corrected measure can improve boosting. We show that the early surrender problem occurs in MultiBoost too, in multiclass situations, so that chance-corrected AdaBook and Multibook can beat standard MultiBoost or AdaBoost, and we further identify which chance-corrected measures to use when.
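
The gap between Accuracy and chance-corrected measures can be illustrated with a small sketch. It is not taken from the paper: the function name `binary_metrics` and the confusion-matrix counts are hypothetical, and Kappa is assumed here to mean Cohen's Kappa. Informedness is computed as Recall + Inverse Recall - 1 for the two-class case.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, Informedness and Cohen's Kappa from binary confusion counts.

    Illustrative helper (hypothetical name), not the paper's implementation.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n

    # Recall on the positive class and on the negative class (Inverse Recall).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    inverse_recall = tn / (tn + fp) if (tn + fp) else 0.0
    informedness = recall + inverse_recall - 1.0

    # Agreement expected by chance, from the marginals of labels and predictions.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected) if expected < 1.0 else 0.0

    return {"accuracy": accuracy, "informedness": informedness, "kappa": kappa}


# Made-up data: 10 positives and 90 negatives.
# A learner that always predicts the majority (negative) class:
majority = binary_metrics(tp=0, fn=10, fp=0, tn=90)

# A learner that finds most positives at the cost of some false alarms:
informed = binary_metrics(tp=8, fn=2, fp=18, tn=72)

print(majority)
print(informed)
```

With these assumed counts, the majority-class learner has the higher Accuracy but zero Informedness and zero Kappa, while the second learner has lower Accuracy but clearly positive values for both chance-corrected measures. A booster that judges weak learners by Accuracy alone would prefer the first.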