Classification of Products in Retail using Partially Abbreviated Product Names Only

Oliver Allweyer, Christian Schorr, Rolf Krieger, Andreas Mohr

2020

Abstract

The management of product data in ERP systems is a major challenge for most retail companies. The reason lies in the large amount of data and its complexity. Some companies maintain millions of product data records, and sometimes more than one thousand new records are created daily. Because data entry and maintenance processes involve considerable manual effort, the costs of data management, in both time and money, are high. In many systems, the product name and product category must be specified before the remaining product data can be entered manually. Based on the product category, many default values are proposed to simplify the manual data entry process. Consequently, correct classification is essential for error-free and efficient data entry. In this paper, we show how to classify products automatically and compare different machine learning approaches to this end. To minimize the effort of manual data entry, and because the length of the product name field is severely limited, the classification algorithms use only shortened product names. In particular, we analyse the benefits of different pre-processing strategies and compare the quality of classification models on different hierarchy levels. Our results show that, even in this special case, machine learning can considerably simplify the process of data input.
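As a rough illustration of the idea of classifying products from shortened names alone, the following sketch trains a small character n-gram Naive Bayes classifier. It is not the method evaluated in the paper: the abstract does not name the algorithms or pre-processing steps used, so the choice of classifier, the trigram features, the helper names (`char_ngrams`, `NgramNaiveBayes`), and the toy product names and categories are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Pre-process a product name and split it into character n-grams.

    Assumed pre-processing: lower-casing, replacing punctuation with
    spaces, and collapsing repeated whitespace.
    """
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    cleaned = " ".join(cleaned.split())
    padded = f" {cleaned} "  # pad so word boundaries become features
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-grams (add-one smoothing)."""

    def fit(self, names, labels):
        self.class_counts = Counter(labels)
        self.ngram_counts = defaultdict(Counter)
        self.vocab = set()
        for name, label in zip(names, labels):
            grams = char_ngrams(name)
            self.ngram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, name):
        grams = char_ngrams(name)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior plus smoothed log likelihood of every n-gram
            score = math.log(count / total)
            denom = sum(self.ngram_counts[label].values()) + len(self.vocab)
            for gram in grams:
                score += math.log((self.ngram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: abbreviated product names and hypothetical categories.
train_names = [
    "Choc. Bar Milk 100g",
    "Choc Hazeln. 200g",
    "Min. Water 1.5L",
    "Sparkl. Water 0.5L",
]
train_labels = ["Confectionery", "Confectionery", "Beverages", "Beverages"]

model = NgramNaiveBayes().fit(train_names, train_labels)
prediction = model.predict("Choc. Bar Dark 100g")
print(prediction)
```

Character n-grams are used here because abbreviated names often truncate words inconsistently, so sub-word fragments can still overlap where whole words do not. A classifier for a deeper hierarchy level could be trained the same way with finer-grained labels.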


Paper Citation


in Harvard Style

Allweyer O., Schorr C., Krieger R. and Mohr A. (2020). Classification of Products in Retail using Partially Abbreviated Product Names Only. In Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-440-4, pages 67-77. DOI: 10.5220/0009821400670077


in Bibtex Style

@conference{data20,
author={Oliver Allweyer and Christian Schorr and Rolf Krieger and Andreas Mohr},
title={Classification of Products in Retail using Partially Abbreviated Product Names Only},
booktitle={Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2020},
pages={67-77},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009821400670077},
isbn={978-989-758-440-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Classification of Products in Retail using Partially Abbreviated Product Names Only
SN - 978-989-758-440-4
AU - Allweyer O.
AU - Schorr C.
AU - Krieger R.
AU - Mohr A.
PY - 2020
SP - 67
EP - 77
DO - 10.5220/0009821400670077