Authors: Amin Omidvar 1; Hossein Pourmodheji 1; Aijun An 1 and Gordon Edall 2

Affiliations: 1 Department of Electrical Engineering and Computer Science, York University, Canada; 2 The Globe and Mail, Canada

ISBN: 978-989-758-395-7

ISSN: 2184-433X

Keyword(s): Headline Quality, Deep Learning, NLP.

Abstract: Today, most news readers read the online version of news articles rather than traditional paper-based newspapers. News publishers also rely heavily on the income generated from subscriptions and website visits, so online user engagement is a very important issue for online newspapers. Much effort has been spent on writing interesting headlines that catch the attention of online users. At the same time, headlines should not be misleading (e.g., clickbait); otherwise, readers are disappointed when they read the content. In this paper, we propose four indicators that determine the quality of published news headlines based on their click count and dwell time, both obtained through website log analysis. We then use the soft target distribution of the calculated quality indicators to train our proposed deep learning model, which can predict the quality of unpublished news headlines. The proposed model processes the latent features of both the headline and the body of the article to predict headline quality, and it also considers the semantic relation between headline and body. To evaluate our model, we use a real dataset from a major Canadian newspaper. Results show that our proposed model outperforms other state-of-the-art NLP models.
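The abstract does not define the four indicators or how the soft targets are computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the four indicators correspond to the four combinations of high/low click count and high/low dwell time, and that soft membership comes from a logistic function of each signal relative to a reference value. The names `soft_targets`, `soft_cross_entropy`, `click_ref`, `dwell_ref`, and `temperature` are all hypothetical.

```python
import math


def soft_targets(clicks, dwell, click_ref, dwell_ref, temperature=1.0):
    """Return a soft distribution over four assumed headline-quality classes.

    ASSUMPTION: the four classes are the high/low click x high/low dwell
    quadrants; the paper's actual indicator definitions may differ.
    """
    # Log-ratio to a reference value (e.g. a site-wide median, also assumed):
    # positive means "above reference", negative means "below".
    c = math.log((clicks + 1) / (click_ref + 1))
    d = math.log((dwell + 1) / (dwell_ref + 1))
    # Logistic squashing gives a soft probability of "high" for each signal.
    p_click = 1.0 / (1.0 + math.exp(-c / temperature))
    p_dwell = 1.0 / (1.0 + math.exp(-d / temperature))
    # Treating the two signals as independent, the four products sum to 1.
    return {
        "high_click_high_dwell": p_click * p_dwell,
        "high_click_low_dwell": p_click * (1.0 - p_dwell),
        "low_click_high_dwell": (1.0 - p_click) * p_dwell,
        "low_click_low_dwell": (1.0 - p_click) * (1.0 - p_dwell),
    }


def soft_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of a predicted distribution against a soft target."""
    return -sum(t * math.log(predicted[k] + eps) for k, t in target.items())


# A headline with many clicks but short dwell time (clickbait-like).
example = soft_targets(clicks=1000, dwell=5, click_ref=100, dwell_ref=30)
uniform = {k: 0.25 for k in example}
loss_vs_uniform = soft_cross_entropy(example, uniform)
```

Training against such a distribution, rather than a single hard label, lets a headline that sits near a class boundary contribute partial weight to each neighbouring class. A model's predicted distribution would take the place of `uniform` in the loss call.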

