Authors: Faiz Ali Shah, Kairit Sirts and Dietmar Pfahl

Affiliation: Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia

ISBN: 978-989-758-320-9

Keyword(s): App Review Classification, Convolutional Neural Networks, Linguistic Resources, Bag of Words.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Knowledge Management and Information Sharing ; Knowledge-Based Systems ; Requirements Engineering ; Symbolic Systems

Abstract: User reviews submitted to app marketplaces contain information that falls into different categories, e.g., feature evaluation, feature request, and bug report. This information is valuable to developers for improving the quality of mobile applications. However, due to the large volume of reviews received every day, manual classification of user reviews into these categories is not feasible. Therefore, developing automatic classification methods using machine learning approaches is desirable. In this study, we compare the simplest textual machine learning classifier, which uses only lexical features (the so-called Bag-of-Words, or BoW, approach), with the more complex models used in previous works that adopt rich linguistic features. We find that the performance of the simple BoW model is very competitive and has the advantage of not requiring any external linguistic tools to extract the features. Moreover, we experiment with deep-learning-based Convolutional Neural Network (CNN) models that have recently achieved state-of-the-art results in many classification tasks. We find that, on average, the CNN models do not perform better than the simple BoW model; it is possible that a larger training set would have been necessary for the CNN model to gain an advantage.

Paper citation:
Ali Shah, F.; Sirts, K. and Pfahl, D. (2018). Simple App Review Classification with Only Lexical Features. In Proceedings of the 13th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-320-9, pages 112-119. DOI: 10.5220/0006855901460153

