Unsupervised Keyword Extraction Algorithm Based on Bert Model

Shouhao Zhang

2022

Abstract

Traditional word segmentation and unsupervised keyword extraction methods ignore contextual semantics, which limits the quality of the candidate keywords they extract. To address this problem, this paper proposes an algorithm based on the Bidirectional Encoder Representations from Transformers (BERT) model. In the word segmentation stage, the BERT model segments the text to obtain candidate keywords; the text is then input into BERT to extract a vector for each candidate keyword. The word vectors are reconstructed from the hidden layers: the outputs of the last four layers of the network are summed, and the resulting vectors are averaged to obtain the word vectors of the candidate keywords. Finally, each candidate is scored by its similarity to a sentence vector that incorporates context semantics, and the scores give the keyword ranking.
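The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-layer hidden states have already been obtained from a BERT encoder and are passed in as nested lists (layers, then tokens, then dimensions). The function names, the use of the mean token vector as the sentence vector, and the choice of cosine similarity as the similarity score are all assumptions made for illustration.

```python
import math


def sum_last_four(hidden_layers):
    """Element-wise sum of the last four layers, giving one vector per token."""
    last_four = hidden_layers[-4:]
    n_tokens = len(last_four[0])
    dim = len(last_four[0][0])
    return [
        [sum(layer[t][d] for layer in last_four) for d in range(dim)]
        for t in range(n_tokens)
    ]


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_keywords(hidden_layers, candidate_spans):
    """Rank candidates by similarity to the sentence vector.

    candidate_spans maps each candidate keyword to a (start, end) token
    range, end exclusive. The sentence vector is assumed to be the mean
    of all token vectors (an illustrative choice, not from the paper).
    """
    token_vecs = sum_last_four(hidden_layers)
    sentence_vec = mean_vector(token_vecs)
    scores = {
        word: cosine(mean_vector(token_vecs[start:end]), sentence_vec)
        for word, (start, end) in candidate_spans.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: 5 layers, 3 tokens, 2 dimensions. The first layer is ignored
# because only the last four layers are summed.
toy_layers = [[[100, 100], [100, 100], [100, 100]]] + [
    [[1, 0], [0, 1], [1, 0]] for _ in range(4)
]
toy_ranking = rank_keywords(toy_layers, {"a": (0, 1), "b": (1, 2)})
```

In this toy input, candidate "a" points in the same direction as two of the three tokens, so it scores higher against the sentence vector than "b" does.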


Paper Citation


in Harvard Style

Zhang S. (2022). Unsupervised Keyword Extraction Algorithm Based on Bert Model. In Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC; ISBN 978-989-758-622-4, SciTePress, pages 306-309. DOI: 10.5220/0011923800003612


in Bibtex Style

@conference{isaic22,
author={Shouhao Zhang},
title={Unsupervised Keyword Extraction Algorithm Based on Bert Model},
booktitle={Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC},
year={2022},
pages={306-309},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011923800003612},
isbn={978-989-758-622-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC
TI - Unsupervised Keyword Extraction Algorithm Based on Bert Model
SN - 978-989-758-622-4
AU - Zhang S.
PY - 2022
SP - 306
EP - 309
DO - 10.5220/0011923800003612
PB - SciTePress