Authors:
Alexander Bernier and Adrian Thorogood
Affiliation:
Centre of Genomics and Policy, McGill University Faculty of Medicine, Dr. Penfield, Montréal, Canada
Keyword(s):
Big Data, Bioinformatics, Data Commons, Data Licensing, Intellectual Property, Interoperability, License Standardization, Machine Learning, Open Science, Software Licensing, Technology Law.
Abstract:
Efficient machine learning in bioinformatics requires a large volume of data from different sources. Bioinformatics is shifting from a paradigm of siloed analysis of individual datasets by researchers to the aggregation and analysis of disparate sets of health and biomedical data from across academic, healthcare and commercial settings. Data-generating organizations must therefore select legal terms for dataset release that promote compatibility with other datasets. In releasing bioinformatic data for open use, care must be taken that the terms of the selected licenses ensure maximum interoperability. The following technical elements should inform the choice of license: license hybridity; waivers of liability, warranties and guarantees; commercial/non-commercial use; attribution and copyleft; granular permission and bilateral or multilateral licensing. Licenses are compared to inform optimal license selection and enable data integration and analysis; consideration is given to an eventual standard license for open sharing of bioinformatic data.