Improving Public Sector Efficiency using Advanced Text Mining in the Procurement Process

Topics: Big Data Applications; Big Data Search and Mining; Business Intelligence; Data Management for Analytics; Data Mining; Data Science; Feature Selection; Open Data; Pattern Recognition; Predictive Modeling; Semi-Structured and Unstructured Data; Support Vector Machines; Text Analytics; Transparency in Research Data

Authors: Nikola Modrušan 1 ; Kornelije Rabuzin 2 and Leo Mršić 3

Affiliations: 1 Faculty for Information Studies in Novo Mesto, Ljubljanska cesta 31A, Novo Mesto, Slovenia ; 2 Faculty of Organization and Informatics, University of Zagreb, Pavlinska 2, Varaždin, Croatia ; 3 Algebra University College, Ilica 242, Zagreb, Croatia

Keyword(s): Text Mining, Natural Language Processing, Rule Extraction, Automatic Extraction, Data Mining, Knowledge Discovery, Fraud Detection, Corruption Indices, Public Procurement, Big Data.

Abstract: The analysis of the Public Procurement Processes (PPP) and the detection of suspicious or corrupt procedures is an important topic, especially for improving the process’s transparency and for protecting public financial interests. Creating a quality model as a foundation for a quality analysis largely depends on the quality and volume of the data that is analyzed. It is important to find a way to identify anomalies before they occur and to prevent any harm to the public interest. For this reason, we focused our research on an early phase of the PPP, the preparation of the tender documentation. During this phase, it is important to collect documents, detect and extract quality content from them, and analyze this content for any possible manipulation of the PPP’s outcome. The part of the documentation that defines the rules and restrictions for the PPP is usually within a specific section of the documents, often called “technical and professional ability.” In previous studies, the authors extracted and processed these sections and used the extracted content to develop a prediction model for indicating fraudulent activities. As the criteria and conditions can also be found in other parts of the PPP’s documentation, the idea of this research is to detect additional content and to investigate its impact on the outcome of the prediction model. Therefore, our goal was to determine a list of relevant terms and to develop a data science model for finding and extracting these terms in order to improve the prediction of suspicious tenders. An evaluation was conducted based on an initial prediction model trained with the extracted content as additional input parameters. The training results show a significant improvement in the output metrics. This study presents a methodology for detecting the content needed to predict suspicious procurement procedures, for measuring the relevance of extracted terms, and for storing the most important information in a relational structure in a database.
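
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of finding listed terms in tender text and storing the hits in a relational table. The term list, the table name `term_hits`, and both function names are hypothetical and not taken from the paper. Plain occurrence counts stand in for whatever relevance measure the authors actually use.

```python
import re
import sqlite3

# Hypothetical term list; the paper's actual terms are not given on this page.
RELEVANT_TERMS = ["certificate", "reference", "annual turnover"]


def count_terms(text, terms):
    """Return {term: count} for case-insensitive whole-phrase matches in text.

    Terms with zero matches are left out of the result.
    """
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if hits:
            counts[term] = hits
    return counts


def store_counts(conn, tender_id, counts):
    """Persist per-tender term counts in a simple relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS term_hits ("
        "tender_id TEXT, term TEXT, hits INTEGER, "
        "PRIMARY KEY (tender_id, term))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO term_hits VALUES (?, ?, ?)",
        [(tender_id, term, hits) for term, hits in counts.items()],
    )
    conn.commit()


if __name__ == "__main__":
    sample = "The bidder must provide a certificate and a Reference."
    connection = sqlite3.connect(":memory:")
    store_counts(connection, "T-1", count_terms(sample, RELEVANT_TERMS))
    print(connection.execute("SELECT * FROM term_hits").fetchall())
```

The stored counts could then be joined to other tender attributes and passed to a classifier as extra input features, which is the role the abstract describes for the extracted content.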

CC BY-NC-ND 4.0


Paper citation in several formats:
Modrušan, N.; Rabuzin, K. and Mršić, L. (2020). Improving Public Sector Efficiency using Advanced Text Mining in the Procurement Process. In Proceedings of the 9th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-440-4; ISSN 2184-285X, SciTePress, pages 200-206. DOI: 10.5220/0009823102000206

@conference{data20,
author={Nikola Modrušan and Kornelije Rabuzin and Leo Mršić},
title={Improving Public Sector Efficiency using Advanced Text Mining in the Procurement Process},
booktitle={Proceedings of the 9th International Conference on Data Science, Technology and Applications - DATA},
year={2020},
pages={200-206},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009823102000206},
isbn={978-989-758-440-4},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 9th International Conference on Data Science, Technology and Applications - DATA
TI - Improving Public Sector Efficiency using Advanced Text Mining in the Procurement Process
SN - 978-989-758-440-4
IS - 2184-285X
AU - Modrušan, N.
AU - Rabuzin, K.
AU - Mršić, L.
PY - 2020
SP - 200
EP - 206
DO - 10.5220/0009823102000206
PB - SciTePress