
Author: Mohamed Hosni

Affiliation: MOSI Research Team, ENSAM, University Moulay Ismail of Meknes, Meknes, Morocco

Keyword(s): Categorical Data, Encoder, Software Effort Estimation, Ensemble Effort Estimation.

Abstract: Planning, controlling, and monitoring a software project rely primarily on estimates of the software development effort. These estimates are usually produced during the early stages of the software life cycle. At this phase, most of the available information about the software product is categorical in nature, and only a few numerical data points are available. Therefore, building an accurate effort estimator begins with determining how to process the categorical data that characterizes the software project. This paper aims to shed light on the ways in which categorical data can be treated in software development effort estimation (SDEE) datasets through encoding techniques. Four encoders were used in this study: the one-hot encoder, label encoder, count encoder, and target encoder. Four well-known machine learning (ML) estimators and a homogeneous ensemble were utilized. The empirical analysis was conducted using four datasets. The datasets generated by means of the one-hot encoder appeared to be suitable for the ML estimators, as they resulted in more accurate estimates. The ensemble, which combined four variants of the same technique, each trained on a dataset generated by a different encoder, performed as well as or better than the single ML estimation technique. The overall results are promising and pave the way for a new approach to handling categorical data in SDEE datasets.
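To make the four encoders named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the paper's implementation, which this page does not describe. The function names, the toy `language` column, and the `effort` values are hypothetical and chosen only for illustration. The target encoder shown is the naive per-category mean, with no smoothing.

```python
from collections import Counter, defaultdict


def one_hot_encode(values):
    """One binary indicator column per category (columns in sorted order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def label_encode(values):
    """Replace each category with an integer index (sorted order here)."""
    index = {c: i for i, c in enumerate(sorted(set(values)))}
    return [index[v] for v in values]


def count_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it.

    Naive version (assumption): no smoothing, and the means are computed
    on the same rows they are applied to.
    """
    sums = defaultdict(float)
    counts = Counter(values)
    for v, t in zip(values, targets):
        sums[v] += t
    return [sums[v] / counts[v] for v in values]


# Hypothetical toy data: one categorical feature and the effort target.
language = ["Java", "C", "Java", "Python"]
effort = [100.0, 40.0, 120.0, 60.0]

one_hot = one_hot_encode(language)
labels = label_encode(language)
counts = count_encode(language)
target_means = target_encode(language, effort)
```

In practice, the count and target statistics would normally be fitted on the training split only and then applied to the test split, so that information about the test targets does not leak into the features.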

CC BY-NC-ND 4.0


Paper citation in several formats:
Hosni, M. (2023). Encoding Techniques for Handling Categorical Data in Machine Learning-Based Software Development Effort Estimation. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR; ISBN 978-989-758-671-2; ISSN 2184-3228, SciTePress, pages 460-467. DOI: 10.5220/0012259400003598

@conference{kdir23,
author={Mohamed Hosni},
title={Encoding Techniques for Handling Categorical Data in Machine Learning-Based Software Development Effort Estimation},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR},
year={2023},
pages={460-467},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012259400003598},
isbn={978-989-758-671-2},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR
TI - Encoding Techniques for Handling Categorical Data in Machine Learning-Based Software Development Effort Estimation
SN - 978-989-758-671-2
IS - 2184-3228
AU - Hosni, M.
PY - 2023
SP - 460
EP - 467
DO - 10.5220/0012259400003598
PB - SciTePress