Learning Interpretable and Statistically Significant Knowledge from Unlabeled Corpora of Social Text Messages: A Novel Methodology of Descriptive Text Mining

Giacomo Frisoni; Gianluca Moro; Antonella Carbonaro

doi:10.5220/0009892001210132

2020

Abstract

Although knowledge learning models have evolved rapidly over the last few years, explaining a phenomenon from text documents, a task called descriptive text mining, remains a difficult and poorly addressed problem. The need for explainable, unsupervised, and domain-independent solutions that work with unlabeled data further increases the complexity of this task. Existing techniques solve the problem only partially and have several limitations. In this paper, we propose a novel methodology of descriptive text mining, capable of offering accurate explanations in unsupervised settings and of quantifying the results based on their statistical significance. Given the strong growth of patient communities on social platforms such as Facebook, we demonstrate the effectiveness of the contribution by taking short social posts related to Esophageal Achalasia as a typical case study. Specifically, the methodology produces useful explanations about the experiences of patients and caregivers. Starting directly from the unlabeled patients’ posts, we derive correct scientific correlations among symptoms, drugs, treatments, foods and so on.

Paper Citation

in Harvard Style

Frisoni G., Moro G. and Carbonaro A. (2020). Learning Interpretable and Statistically Significant Knowledge from Unlabeled Corpora of Social Text Messages: A Novel Methodology of Descriptive Text Mining. In Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-440-4, pages 121-132. DOI: 10.5220/0009892001210132

in Bibtex Style

@conference{data20,
author={Giacomo Frisoni and Gianluca Moro and Antonella Carbonaro},
title={Learning Interpretable and Statistically Significant Knowledge from Unlabeled Corpora of Social Text Messages: A Novel Methodology of Descriptive Text Mining},
booktitle={Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2020},
pages={121-132},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009892001210132},
isbn={978-989-758-440-4},
}

in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Learning Interpretable and Statistically Significant Knowledge from Unlabeled Corpora of Social Text Messages: A Novel Methodology of Descriptive Text Mining
SN - 978-989-758-440-4
AU - Frisoni G.
AU - Moro G.
AU - Carbonaro A.
PY - 2020
SP - 121
EP - 132
DO - 10.5220/0009892001210132