Reconstruction of Implied Semantic Relations in Russian Wiktionary

Serge Klimenkov, Evgenij Tsopa, Alexey Pismak, Alexander Yarkeev

Abstract

There have been several attempts to retrieve semantic relations from the free online Wiktionary for the Russian language. Previous works combine automatic parsing of a wiki snapshot with expert assistance. Our main goal is to create a machine-readable lexical ontology from Russian Wiktionary that stays as close as possible to its online state. This article presents an approach to the automatic creation of explicit and implicit semantic relations between words (lexemes) and meanings (senses), so that each relation points exactly from one sense to another. Explicit semantic relations are comparatively easy to construct. For example, if a lexeme contains a single sense, then all relations that point to the lexeme will point to this single sense. Reconstruction of implicit relations relies on logical inference from the explicit relations already created. Several algorithms for reconstructing implicit semantic links were developed and tested on Russian Wiktionary. More than 550,000 online pages were parsed, containing about 250,000 Russian lexemes with about 500,000 senses, but only about 20% of these senses were linked with at least one external lexeme. About 47% of the explicitly existing links were resolved as “sense-to-sense” relations, and about 28% of new implicit “sense-to-sense” links were reconstructed. 53% of lexeme references could not be resolved to an exact sense.
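The single-sense rule for explicit relations mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `LexemeLink` type, the `resolve_single_sense` function, the sense identifiers and the toy English data are all hypothetical and only show the idea that a link to a lexeme with exactly one sense can be redirected to that sense, while links to lexemes with several senses stay unresolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexemeLink:
    """A relation from a sense to a whole lexeme (hypothetical model)."""
    source_sense: str   # identifier of the sense the link starts from
    relation: str       # e.g. "synonym", "hypernym"
    target_lexeme: str  # lexeme the link points to, sense not yet known


def resolve_single_sense(links, senses_by_lexeme):
    """Split links into sense-to-sense relations and unresolved links.

    A link is resolved only when its target lexeme has exactly one sense.
    """
    resolved, unresolved = [], []
    for link in links:
        senses = senses_by_lexeme.get(link.target_lexeme, [])
        if len(senses) == 1:
            # Only one candidate sense exists, so the target is unambiguous.
            resolved.append((link.source_sense, link.relation, senses[0]))
        else:
            # Zero or several senses: this rule alone cannot decide.
            unresolved.append(link)
    return resolved, unresolved


# Toy data (assumed for illustration, not taken from Wiktionary).
senses_by_lexeme = {
    "feline": ["feline#1"],
    "bank": ["bank#1", "bank#2"],
}
links = [
    LexemeLink("cat#1", "hypernym", "feline"),
    LexemeLink("shore#1", "synonym", "bank"),
]

resolved, unresolved = resolve_single_sense(links, senses_by_lexeme)
```

In this toy run the link to "feline" becomes a sense-to-sense relation, while the link to "bank" remains a lexeme reference; links of the second kind are the ones that the implicit-relation algorithms would then try to resolve.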



Paper Citation


in Harvard Style

Klimenkov S., Tsopa E., Pismak A. and Yarkeev A. (2016). Reconstruction of Implied Semantic Relations in Russian Wiktionary. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2016) ISBN 978-989-758-203-5, pages 74-80. DOI: 10.5220/0006038900740080


in Bibtex Style

@conference{keod16,
author={Serge Klimenkov and Evgenij Tsopa and Alexey Pismak and Alexander Yarkeev},
title={Reconstruction of Implied Semantic Relations in Russian Wiktionary},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2016)},
year={2016},
pages={74-80},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006038900740080},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2016)
TI - Reconstruction of Implied Semantic Relations in Russian Wiktionary
SN - 978-989-758-203-5
AU - Klimenkov S.
AU - Tsopa E.
AU - Pismak A.
AU - Yarkeev A.
PY - 2016
SP - 74
EP - 80
DO - 10.5220/0006038900740080