Authors: Alexander Wurl (1); Andreas Falkner (1); Alois Haselböck (1) and Alexandra Mazak (2)
Affiliations: (1) Siemens AG Österreich, Austria; (2) TU Wien, Austria
Keyword(s): Data Integration, Signifier, Data Quality.
Related Ontology Subjects/Areas/Topics: Data Engineering; Data Management and Quality; Data Management for Analytics; Information Quality
Abstract:
In Rail Automation, planning future projects requires the integration of business-critical data from heterogeneous
data sources. As a consequence, the quality of the integrated data is crucial for the optimal utilization of
the production capacity. Unfortunately, current integration approaches mostly neglect the uncertainties and inconsistencies
that arise when integrating railway-specific data. To address these shortcomings, we propose
a semi-automatic process for data import, where the user resolves ambiguous data classifications. The task
of finding the correct data warehouse classification of source values in a proprietary, often semi-structured
format is supported by the notion of a signifier, which is a natural extension of composite primary keys. In a
case study from the domain of asset management in Rail Automation, we show that this approach facilitates
high-quality data integration while minimizing user interaction.
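
The abstract gives no implementation details. As a rough illustration of the idea only, the following sketch treats a signifier as a composite key in which every component accepts a set of aliases, and refers a source record to the user whenever it matches no signifier or more than one. All names (`Signifier`, `classify`, the sample classifications and aliases) are hypothetical and are not taken from the paper.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Signifier:
    """Hypothetical signifier: a composite key whose components accept aliases."""
    target: str                               # data warehouse classification (assumed name)
    components: tuple[frozenset[str], ...]    # one alias set per key component

    def matches(self, values: tuple[str, ...]) -> bool:
        # A record matches if every source value is an accepted alias
        # of the corresponding key component (case-insensitive).
        if len(values) != len(self.components):
            return False
        return all(v.strip().lower() in aliases
                   for v, aliases in zip(values, self.components))


def classify(values, signifiers):
    """Return (classification, candidates).

    A unique match is accepted automatically; zero or several matches
    yield None plus the candidate list to be resolved by the user.
    """
    candidates = [s.target for s in signifiers if s.matches(values)]
    if len(candidates) == 1:
        return candidates[0], []
    return None, candidates


# Illustrative data only, not from the case study.
SIGNIFIERS = [
    Signifier("Signal/LED", (frozenset({"signal", "sig"}), frozenset({"led"}))),
    Signifier("Signal/Bulb", (frozenset({"signal", "sig"}), frozenset({"bulb", "lamp"}))),
    Signifier("Signal/Generic", (frozenset({"signal"}), frozenset({"lamp", "generic"}))),
]

unique = classify(("Sig", "LED"), SIGNIFIERS)
ambiguous = classify(("Signal", "Lamp"), SIGNIFIERS)
unknown = classify(("Point", "X"), SIGNIFIERS)
```

In this sketch, only the second and third records would require user interaction: the second matches two signifiers, and the third matches none.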