Author: Halim Sayoud
        
        
            Affiliation: USTHB University, Algeria
        
        
        
        
        
             Keyword(s):
            Visual Analytics, Authorship Attribution, Pattern Recognition, Natural Language Processing, Religious Books.
        
        
            
                Related Ontology Subjects/Areas/Topics: Abstract Data Visualization; Computer Vision, Visualization and Computer Graphics; Data Management and Knowledge Representation; General Data Visualization; Information and Scientific Visualization; Text and Document Visualization; Visual Analytical Reasoning; Visual Data Analysis and Knowledge Discovery; Visual Representation and Interaction
            
        
        
            
                Abstract: 
                In this paper, we present a visual-analytics-based investigation of the authorship of the holy Quran with regard to the author of the Hadith (the Prophet). The task can be seen as authorship discrimination between two religious books: the Quran and the Hadith. The first is the Divine book written by Allah (God), as claimed by the Prophet Muhammad, whereas the second is a collection of certified statements of the Prophet.
Two visual analytics clustering methods are employed: Hierarchical Clustering and Fuzzy C-means Clustering. Seven types of NLP features are combined and normalized by PCA reduction before the classification process. The visual analytics yield interesting results in 2D and 3D representations. In summary, both experiments show two main clusters, a Quran cluster and a Hadith cluster, and the arrangement of the resulting clusters corresponds to a clear authorship distinction between the two religious books.
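The pipeline described in the abstract (feature combination, PCA reduction, then hierarchical and fuzzy C-means clustering) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the data are synthetic stand-ins for the seven NLP feature types, the helper names `pca_reduce` and `fuzzy_c_means` are hypothetical, and the Ward linkage and fuzzifier `m = 2` are choices made here because the abstract does not specify them.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pca_reduce(X, n_components=2):
    """Standardize each feature, then project onto the top principal components."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal axes of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy C-means; returns (centers, memberships of shape (n, c))."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in data (an assumption): 20 text segments per "book",
# each described by 7 numeric features, with the two groups well separated.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 7))
group_b = rng.normal(loc=4.0, scale=1.0, size=(20, 7))
features = np.vstack([group_a, group_b])

points_2d = pca_reduce(features, n_components=2)

# Hierarchical clustering, cut into two clusters (Ward linkage is an assumption).
tree = linkage(points_2d, method="ward")
hier_labels = fcluster(tree, t=2, criterion="maxclust")

# Fuzzy C-means: soft memberships, hardened by taking the largest one.
centers, memberships = fuzzy_c_means(points_2d, n_clusters=2)
fuzzy_labels = memberships.argmax(axis=1)
```

The reduced `points_2d` array is what a 2D scatter plot would display; passing `n_components=3` would give the coordinates for a 3D view. With real texts, each row would instead hold the features extracted from one text segment.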