Author: Bengt Dahlqvist

Affiliation: Department of Linguistics and Philology, Uppsala University, P.O. Box 635, 751 26 Uppsala, Sweden

Keyword(s): Text Mining, Medieval Texts, Miracle Stories, Old Swedish, Stop Words, Word Similarity, Spelling Variations, Key Words.

Abstract: A text corpus of one hundred and one Marian miracle stories in Old Swedish, written between c. 1272 and 1430, has been digitally compiled from three transcribed sources from the 19th century. Highly specialized knowledge is needed to interpret these texts, since the medieval variant of Swedish differs significantly from the modern form of the language: vocabulary, spelling, and grammar all show substantial differences from modern Swedish. To advance the understanding of these texts, automated tools for textual processing are needed. This paper presents a preliminary investigation of a number of strategies, such as frequency list analysis and methods for identifying spelling variations, for producing stop word lists and exposing the key words of the texts. These strategies can help in understanding the texts, identify different word forms of the same word, ease lexicon lookup, and serve as a starting point for lemmatisation.
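
The two strategies named in the abstract, frequency list analysis and detection of spelling variations, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, and the toy input string are all assumptions made for illustration, and the sample tokens are not taken from the corpus.

```python
from collections import Counter
from difflib import SequenceMatcher
import re


def frequency_list(text):
    """Count lowercased word tokens in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def stop_word_candidates(freqs, top_n):
    """Treat the top_n most frequent word forms as stop word candidates."""
    return [word for word, _ in freqs.most_common(top_n)]


def spelling_variants(words, threshold=0.8):
    """Pair distinct word forms whose similarity ratio meets the threshold.

    The ratio and threshold are illustrative assumptions, not values
    reported in the paper.
    """
    words = sorted(set(words))
    pairs = []
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((a, b))
    return pairs


# Toy input, made up for illustration only (not corpus data).
sample = "oc han sagdhe oc hon saghde oc han"
freqs = frequency_list(sample)
candidates = stop_word_candidates(freqs, top_n=1)
variants = spelling_variants(freqs.keys())
print(candidates)
print(variants)
```

In a setting like this, the most frequent forms would be reviewed by hand before being accepted as stop words, and the candidate variant pairs would only be a starting point for lexicon lookup or lemmatisation.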

CC BY-NC-ND 4.0

Paper citation in several formats:
Dahlqvist, B. (2020). Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI; ISBN 978-989-758-395-7; ISSN 2184-433X, SciTePress, pages 452-458. DOI: 10.5220/0009372204520458

@conference{nlpinai20,
author={Bengt Dahlqvist},
title={Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2020},
pages={452-458},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009372204520458},
isbn={978-989-758-395-7},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish
SN - 978-989-758-395-7
IS - 2184-433X
AU - Dahlqvist, B.
PY - 2020
SP - 452
EP - 458
DO - 10.5220/0009372204520458
PB - SciTePress