Data Quality Threat Mitigation Strategies for Multi-Sourced Linked Data

Ali Obaidi, Adrienne Chen-Young

2025

Abstract

Federal agencies link data from multiple sources to generate statistical data products essential to informing policy and decision making (National Academies, 2017). The ability to integrate and link data is accompanied by the challenges of harmonizing heterogeneous data, disambiguating similar data, and ensuring that the quality of data from all sources can be reconciled at levels that provide value and utility commensurate with the integration effort. Given the significant resources and effort needed to consistently maintain high quality, multi-sourced, linked data in a government ecosystem, this paper proposes steps that can be taken to mitigate threats to data quality at the earliest stage of the statistical analysis data lifecycle: data collection. This paper examines the threats to data quality that are identified in the Federal Committee on Statistical Methodology’s (FCSM) Data Quality Framework (Dworak-Fisher, 2020), utilizes the U.S. Geological Survey’s (USGS) Science Data Lifecycle Model (SDLM) (Faundeen, 2013) to isolate data quality threats that occur before integration processing, and presents mitigation strategies that can be taken to safeguard the utility, objectivity, and integrity of multi-sourced statistical data products.

Paper Citation


in Harvard Style

Obaidi A. and Chen-Young A. (2025). Data Quality Threat Mitigation Strategies for Multi-Sourced Linked Data. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 72-81. DOI: 10.5220/0013462900003967


in Bibtex Style

@conference{data25,
author={Ali Obaidi and Adrienne Chen-Young},
title={Data Quality Threat Mitigation Strategies for Multi-Sourced Linked Data},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={72-81},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013462900003967},
isbn={978-989-758-758-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Data Quality Threat Mitigation Strategies for Multi-Sourced Linked Data
SN - 978-989-758-758-0
AU - Obaidi A.
AU - Chen-Young A.
PY - 2025
SP - 72
EP - 81
DO - 10.5220/0013462900003967
PB - SciTePress