Vagaries of data licensing understand-a-thon and write-a-thon



While many of us are releasing open code under OSI-approved licenses, the “secret sauce” in modelling studies is often the data (and assumptions). My impression is that the licensing landscape is less clear there, and it’s less obvious what the trade-offs are between different licenses, how compatible they are with each other, and what restrictions they impose.

There are many questions that seem small but take time to understand and answer. For example: what is the difference between “database” and “data”? Under what conditions does the user of data licensed under a share-alike license have to make changes available?

Goal of this do-a-thon: work through the pros and cons of different licensing models and write up a short how-to / decision helper article (e.g. for the wiki)

Some related forum posts on the topic:

Some existing info on the wiki:

UPDATE: It is unlikely I will have time to lead this do-a-thon but I’m leaving it up in case others are interested and would like to organise around this topic!


Hello Stefan. Those requiring background, see Ball (2014). HTH, Robbie.

Ball, Alex (17 July 2014). How to license research data. Edinburgh, United Kingdom: Digital Curation Centre (DCC).


I just read this blog post about data attribution.
I liked that it helps to distinguish between attribution and (scientific) citation.

@ldodds (2018-06-04). How Do We Attribute Data?


Moreover, those holding copyright may well be a subset of those being attributed. Small contributions of an editorial nature do not attract copyright. This may or may not include data curation. It is also one reason why a git blame report needs further interpretation when used in court (Hemel and Coughlan 2017). Cheers, Robbie

Hemel, Armijn and Shane Martin Coughlan (2017). “Making sense of git in a legal context”. International Free and Open Source Software Law Review. 9 (1): 19–33. ISSN 1877-6922. doi:10.5033/ifosslr.v9i1.121.