Learning Interpretable and Statistically Significant Knowledge from Unlabeled Corpora of Social Text Messages: A Novel Methodology of Descriptive Text Mining

Giacomo Frisoni, Gianluca Moro, Antonella Carbonaro

Abstract

Although knowledge learning models have advanced rapidly in recent years, explaining a phenomenon from text documents, a task known as descriptive text mining, remains a difficult and poorly addressed problem. The need to work with unlabeled data and to provide explainable, unsupervised, and domain-independent solutions further increases the complexity of this task. Existing techniques solve the problem only partially and have several limitations. In this paper, we propose a novel methodology for descriptive text mining that offers accurate explanations in unsupervised settings and quantifies the results according to their statistical significance. Given the strong growth of patient communities on social platforms such as Facebook, we demonstrate the effectiveness of the contribution using short social posts related to Esophageal Achalasia as a typical case study. Specifically, the methodology produces useful explanations of the experiences of patients and caregivers. Starting directly from unlabeled patients' posts, we derive correct scientific correlations among symptoms, drugs, treatments, foods, and so on.
