Authors: Max Kölbl ; Yuki Kyogoku ; J. Nathanael Philipp ; Michael Richter ; Clemens Rietdorf and Tariq Yousef

Affiliation: Institute of Computer Science, NLP Group, Universität Leipzig, Augustusplatz 10, 04109 Leipzig, Germany

Keyword(s): Keyword Extraction, Information Theory, Topic Model, Recurrent Neural Network.

Abstract: This paper reports the results of a study on automatic keyword extraction in German. We employed two general types of methods: (A) unsupervised methods based on information theory (Shannon, 1948), namely (i) a bigram model, (ii) a probabilistic parser model (Hale, 2001) and (iii) an innovative model which utilises topics as extra-sentential contexts for the calculation of the information content of words; and (B) a supervised method employing a recurrent neural network (RNN). As baselines, we employed TextRank and the TF-IDF ranking function. The topic model (A)(iii) clearly outperformed all other models, including TextRank and TF-IDF. In contrast, the RNN performed poorly. We take the results as first evidence that (i) information content can be employed for keyword extraction tasks and thus has a clear correspondence to the semantics of natural languages, and (ii) that, as a cognitive principle, the information content of words is determined from extra-sentential contexts, that is to say, from the discourse of words.
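To make the idea behind method (A)(i) concrete, the following is a minimal sketch of how the information content (surprisal) of a word could be estimated from a bigram model and used to rank keyword candidates. It is not the authors' implementation: the function names `bigram_surprisal` and `top_keywords`, the add-one smoothing, and the ranking by mean surprisal are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_surprisal(sentences):
    """Return, per sentence, a list of (word, surprisal) pairs.

    Surprisal is -log2 P(w_i | w_{i-1}), estimated from bigram counts
    with add-one smoothing (an assumption of this sketch).
    """
    unigrams = Counter()
    bigrams = Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent  # "<s>" marks the sentence start
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    scores = []
    for sent in sentences:
        tokens = ["<s>"] + sent
        scores.append([
            (w, -math.log2((bigrams[(prev, w)] + 1) / (unigrams[prev] + vocab_size)))
            for prev, w in zip(tokens, tokens[1:])
        ])
    return scores


def top_keywords(sentences, k=3):
    """Rank word types by mean surprisal and return the k highest."""
    totals, counts = Counter(), Counter()
    for sent in bigram_surprisal(sentences):
        for w, s in sent:
            totals[w] += s
            counts[w] += 1
    mean = {w: totals[w] / counts[w] for w in totals}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(mean, key=lambda w: (-mean[w], w))[:k]


if __name__ == "__main__":
    corpus = [["der", "hund", "bellt"], ["der", "hund", "ruht"]]
    print(top_keywords(corpus, k=2))
```

In this toy corpus the predictable words ("der", "hund") receive lower surprisal than the words that vary between sentences, which is the intuition the abstract appeals to. The topic-based model (A)(iii) conditions on extra-sentential context instead of the preceding word and is not covered by this sketch.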

CC BY-NC-ND 4.0


Paper citation in several formats:
Kölbl, M.; Kyogoku, Y.; Philipp, J.; Richter, M.; Rietdorf, C. and Yousef, T. (2020). Keyword Extraction in German: Information-theory vs. Deep Learning. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI; ISBN 978-989-758-395-7; ISSN 2184-433X, SciTePress, pages 459-464. DOI: 10.5220/0009374704590464

@conference{nlpinai20,
author={Max Kölbl and Yuki Kyogoku and J. Nathanael Philipp and Michael Richter and Clemens Rietdorf and Tariq Yousef},
title={Keyword Extraction in German: Information-theory vs. Deep Learning},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2020},
pages={459-464},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009374704590464},
isbn={978-989-758-395-7},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Keyword Extraction in German: Information-theory vs. Deep Learning
SN - 978-989-758-395-7
IS - 2184-433X
AU - Kölbl, M.
AU - Kyogoku, Y.
AU - Philipp, J.
AU - Richter, M.
AU - Rietdorf, C.
AU - Yousef, T.
PY - 2020
SP - 459
EP - 464
DO - 10.5220/0009374704590464
PB - SciTePress