Moving Other Way: Exploring Word Mover Distance Extensions

Ilya Smirnov, Ivan Yamshchikov

2022

Abstract

The word mover’s distance (WMD) is a popular semantic similarity metric for pairs of documents. The metric is interpretable and reflects document similarity well, but some aspects of it can be improved. This position paper studies several possible extensions of WMD. We introduce regularizations of WMD based on word matching and on the frequency of words in the corpus as a weighting factor. We also calculate WMD in word vector spaces with non-Euclidean geometry and compare it with the metric in Euclidean space. We validate the extensions on six document classification datasets. Some of the proposed extensions achieve a lower k-nearest neighbor classification error than WMD.
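
To make the underlying metric concrete, the following is a minimal sketch of the relaxed WMD lower bound, in which each word sends all of its mass to the nearest word of the other document. It is not the exact transport problem and not one of the extensions studied in the paper. The toy two-dimensional vectors, the function names, and the `dist` argument (a hypothetical hook for swapping in a non-Euclidean distance) are assumptions made for illustration only.

```python
import math
from collections import Counter


def nbow(tokens):
    """Normalized bag-of-words weights: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(src, dst, vectors, dist):
    """Cost when every source word ships all its mass to its nearest target word."""
    return sum(
        weight * min(dist(vectors[word], vectors[other]) for other in dst)
        for word, weight in src.items()
    )


def relaxed_wmd(doc_a, doc_b, vectors, dist=euclidean):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(doc_a), nbow(doc_b)
    return max(
        one_sided_cost(a, b, vectors, dist),
        one_sided_cost(b, a, vectors, dist),
    )


# Hypothetical toy embeddings; real use would load pretrained word vectors.
TOY_VECTORS = {
    "obama": (0.0, 0.0),
    "president": (1.0, 0.0),
    "speaks": (0.0, 1.0),
    "greets": (0.0, 2.0),
}

doc_1 = ["obama", "speaks"]
doc_2 = ["president", "greets"]
score = relaxed_wmd(doc_1, doc_2, TOY_VECTORS)
```

A k-nearest neighbor classifier of the kind used for validation would rank training documents by such a distance and take a majority vote over the k closest labels. Passing a different `dist` function is one way to experiment with other geometries under these assumptions.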

Paper Citation


in Harvard Style

Smirnov I. and Yamshchikov I. (2022). Moving Other Way: Exploring Word Mover Distance Extensions. In Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS, ISBN 978-989-758-565-4, pages 92-97. DOI: 10.5220/0011096900003197


in Bibtex Style

@conference{complexis22,
author={Ilya Smirnov and Ivan Yamshchikov},
title={Moving Other Way: Exploring Word Mover Distance Extensions},
booktitle={Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS},
year={2022},
pages={92-97},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011096900003197},
isbn={978-989-758-565-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS
TI - Moving Other Way: Exploring Word Mover Distance Extensions
SN - 978-989-758-565-4
AU - Smirnov I.
AU - Yamshchikov I.
PY - 2022
SP - 92
EP - 97
DO - 10.5220/0011096900003197