Copyright Infringement in Generative AI Input Data Acquisition

Ruyu Yan

2025

Abstract

The rapid development of generative artificial intelligence (GenAI) increases the risk of copyright infringement during data acquisition and use. This study examines infringement risks at GenAI's input stage, focusing on the legal conflicts that arise in data collection, processing, and output. It finds substantial violations of economic rights, notably the rights of reproduction and adaptation. Under China's Copyright Law, statutory licensing is inapplicable because the subject qualifications are not met and the conduct does not match the acts that such licensing covers. Fair use defenses fail because of commercial intent and excessive scope of use. The three-step test, the four-factor analysis, and the transformative use doctrine all lead to the same conclusion: the use is not exempt. Because algorithmic opacity creates liability asymmetries, the study argues for a fault presumption mechanism with a reversed burden of proof. To counter enforcement deficiencies, it proposes novel remedies such as dynamic compensation models and algorithmic injunctions. It concludes with institutional recommendations: enforcing enhanced robots.txt compliance, creating open-licensed data repositories, and developing international compliance frameworks to balance technological innovation with copyright protection.

Paper Citation


in Harvard Style

Yan R. (2025). Copyright Infringement in Generative AI Input Data Acquisition. In Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS; ISBN 978-989-758-785-6, SciTePress, pages 577-582. DOI: 10.5220/0014390300004859


in Bibtex Style

@conference{icplss25,
author={Ruyu Yan},
title={Copyright Infringement in Generative AI Input Data Acquisition},
booktitle={Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS},
year={2025},
pages={577-582},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0014390300004859},
isbn={978-989-758-785-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 1st International Conference on Politics, Law, and Social Science - Volume 1: ICPLSS
TI - Copyright Infringement in Generative AI Input Data Acquisition
SN - 978-989-758-785-6
AU - Yan R.
PY - 2025
SP - 577
EP - 582
DO - 10.5220/0014390300004859
PB - SciTePress