Authors: Taoxin Peng and Florian Hanke

Affiliation: Edinburgh Napier University, United Kingdom

Keyword(s): Synthetic, Data Generator, Data Mining, Decision Trees, Classification, Pattern.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Data Engineering ; Data Mining ; Databases and Data Security ; Databases and Information Systems Integration ; Enterprise Information Systems ; Large Scale Databases ; Performance Evaluation and Benchmarking ; Sensor Networks ; Signal Processing ; Soft Computing

Abstract: It is popular to use real-world data to evaluate or teach data mining techniques. However, using real-world data for such purposes has some disadvantages. Firstly, real-world data in most domains is difficult to obtain for several reasons, such as budgetary, technical or ethical constraints. Secondly, the use of many real-world data sets is restricted, or, in the case of data mining, those data sets either do not contain specific patterns that are easy to mine for teaching purposes, or the data needs special preparation and the algorithm needs very specific settings in order to find patterns in it. A possible solution is the generation of synthetic, “meaningful data” (data with intrinsic patterns). This paper presents a framework for such a data generator, which is able to generate datasets with intrinsic patterns, such as decision trees. A preliminary run of the prototype demonstrates that the generation of such “meaningful data” is possible. The proposed approach could also be extended to generate synthetic data with other intrinsic patterns.
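
The idea of data with an intrinsic decision-tree pattern can be illustrated with a minimal sketch. This is not the authors' framework, whose design is not described on this page; it only assumes that a target tree is given up front and that each synthetic record is labelled by walking that tree, so a tree learner run on the output has a known pattern to recover. The attribute names, thresholds, value ranges, and the helpers `classify` and `generate` are all hypothetical.

```python
import random

# Hypothetical target tree: internal nodes compare a numeric attribute with a
# threshold, leaves carry a class label. All names and values are illustrative.
TREE = {
    "attr": "age", "threshold": 30,
    "left": {"label": "low"},                 # age <= 30
    "right": {                                # age > 30
        "attr": "income", "threshold": 50000,
        "left": {"label": "medium"},          # income <= 50000
        "right": {"label": "high"},           # income > 50000
    },
}

# Assumed value range (inclusive) for each attribute.
RANGES = {"age": (18, 80), "income": (10000, 100000)}


def classify(tree, record):
    """Walk the tree from the root until a leaf is reached; return its label."""
    node = tree
    while "label" not in node:
        goes_left = record[node["attr"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["label"]


def generate(tree, ranges, n, seed=0):
    """Draw n random records and label each one according to the target tree."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    rows = []
    for _ in range(n):
        record = {name: rng.randint(lo, hi) for name, (lo, hi) in ranges.items()}
        record["class"] = classify(tree, record)
        rows.append(record)
    return rows


if __name__ == "__main__":
    for row in generate(TREE, RANGES, 5):
        print(row)
```

Because every label is derived from the tree, the generated set is noise-free; a more realistic generator would probably also need to control class balance and inject noise, which this sketch leaves out.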

CC BY-NC-ND 4.0

Paper citation in several formats:
Peng, T. and Hanke, F. (2016). Towards a Synthetic Data Generator for Matching Decision Trees. In Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-187-8; ISSN 2184-4992, SciTePress, pages 135-141. DOI: 10.5220/0005829001350141

@conference{iceis16,
author={Taoxin Peng and Florian Hanke},
title={Towards a Synthetic Data Generator for Matching Decision Trees},
booktitle={Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2016},
pages={135-141},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005829001350141},
isbn={978-989-758-187-8},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Towards a Synthetic Data Generator for Matching Decision Trees
SN - 978-989-758-187-8
IS - 2184-4992
AU - Peng, T.
AU - Hanke, F.
PY - 2016
SP - 135
EP - 141
DO - 10.5220/0005829001350141
PB - SciTePress