Copyright Infringement in Generative AI Input Data Acquisition
Ruyu Yan
2025
Abstract
The rapid development of generative artificial intelligence (GenAI) increases the risk of copyright infringement during data acquisition and use. This study examines infringement risks at GenAI's input stage, focusing on the legal conflicts that arise in data collection, processing, and output. It identifies substantial violations of economic rights, such as the rights of reproduction and adaptation. Under China's Copyright Law, statutory licensing is inapplicable because of non-compliant subject qualifications and behavioral discrepancies. Fair use defenses fail because of commercial intent and excessive scope of use. The three-step test, the four-factor analysis, and the transformative use doctrine consistently point to non-exemption. Because algorithmic opacity creates liability asymmetries, the study argues for a fault presumption mechanism with a reversed burden of proof. To counter enforcement deficiencies, it proposes novel remedies such as dynamic compensation models and algorithmic injunctions. It concludes with institutional recommendations: enforcing enhanced robots.txt compliance, creating open-licensed data repositories, and developing international compliance frameworks to balance technological innovation with copyright protection.
Paper Citation
in Harvard Style
Yan R. (2025). Copyright Infringement in Generative AI Input Data Acquisition. In Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS; ISBN 978-989-758-785-6, SciTePress, pages 577-582. DOI: 10.5220/0014390300004859
in Bibtex Style
@conference{icplss25,
author={Ruyu Yan},
title={Copyright Infringement in Generative AI Input Data Acquisition},
booktitle={Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS},
year={2025},
pages={577-582},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0014390300004859},
isbn={978-989-758-785-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS
TI - Copyright Infringement in Generative AI Input Data Acquisition
SN - 978-989-758-785-6
AU - Yan R.
PY - 2025
SP - 577
EP - 582
DO - 10.5220/0014390300004859
PB - SciTePress