Author: Ulli Waltinger
        
        
Affiliation: Text Technology, Bielefeld University, Germany
        
        
        
        
        
Keyword(s): Machine learning, Support vector machine, Sentiment analysis, Polarity identification, Subjectivity resources.
        
        
            
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Soft Computing; Symbolic Systems; Web Mining
            
        
        
            
Abstract: This paper presents an empirical study on machine learning-based sentiment analysis. Though polarity classification has been extensively studied at different document-structure levels (e.g. document, sentence, word), little work has investigated feature selection methods and subjectivity resources. We systematically analyze four different English subjectivity resources for the task of sentiment polarity identification. While the results show that dictionary size clearly correlates with polarity-based feature coverage, it does not correlate with classification accuracy. Polarity-based feature selection that requires a minimum number of prior polarity features, combined with SVM-based machine learning, exhibits the best performance (acc=84.1, f1=83.9) compared with the classical approaches to polarity identification. Based on the findings of the English-based experimental setup, a new German subjectivity resource is proposed for the task of German-based sentiment analysis. The experiments show that this resource adapts well to the new domain, reaching f1=85.9.