[LKD-EU] Data Documentation


#1

Hello everybody,

I am currently working on the LKD-EU project (Langfristige Planung und kurzfristige Optimierung des Elektrizitätssystems in Deutschland im europäischen Kontext).

English title: Long-term planning and short-term optimization of the German electricity system within the European framework: Further development of methods and models to analyze the electricity system including the heat and gas sector

We published a data set, "Electricity, Heat and Gas Sector Data for Modelling the German System", and the DIW documentation is available here: http://bit.ly/2zIP53C

Your comments and feedback are welcome!


#2

That is an impressive data set of the German energy system.

While the electricity sector (generation, transmission) is well known from OPSD, this is the first complete data set available for the natural gas system.

As far as I can see, the gas data was compiled by georeferencing and vectorizing network topology maps from the 16 TSOs and additional maps (such as those from ENTSOG).

As much as I like having data available for validation, publishing data without an open license is a tragedy for everybody:

"As this data documentation aims for highest levels of transparency and traceability, we only use open data sources. The sources include a limited number of publications by different institutions, organizations, associations, exchanges, and companies which are publicly available (Table 1). We do not consider commercial data sets (e.g. on power plants), information only available under non-disclosure agreements (e.g. on network data), and references for individual infrastructure objects. " [1]

[1] Kunz et al. 2017, Data Documentation: Electricity, Heat, and Gas Sector Data for Modeling the German System


#3

Hello Philipp. While I agree with @ludwig.huelk, I am going to be a little more conciliatory. Table 1 (page 5–6), for instance, describes the data sources you used for the electricity sector. They range from NASA MERRA-2 satellite data in the public domain (courtesy of the US government) to OpenStreetMap, which is under the copyleft ODbL. The legal status of other sources, say those obtained from the OPSD database, is “gray” at best. So it is not possible to relicense the entire dataset under one overarching compliant license.

In theory, it would be possible to split up the spreadsheet into individual spreadsheets or CSV files, each with its own license (or no license) and suitable metadata. That is probably where things could be heading, but the question of internal consistency remains open. Internal consistency was a design goal for the OPSD project (hopefully they will secure more funding).
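As a rough illustration of such a split, here is a minimal Python sketch that writes each sheet to its own CSV file and records a per-file license in a small JSON index. It is only a sketch under stated assumptions: the function name `split_with_licenses`, the sheet names, the license labels, and the `metadata.json` layout are all hypothetical and are not taken from the LKD-EU data set or from any existing metadata standard.

```python
import csv
import json
import tempfile
from pathlib import Path


def split_with_licenses(sheets, licenses, out_dir):
    """Write each sheet to its own CSV and record its license in a JSON index.

    sheets:   {sheet_name: list of rows, each row a list of cell values}
    licenses: {sheet_name: license label}; a missing entry is stored as null,
              meaning "no license stated" (hypothetical convention)
    Returns the path of the JSON index file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    resources = []
    for name, rows in sheets.items():
        csv_path = out_dir / f"{name}.csv"
        with csv_path.open("w", newline="", encoding="utf-8") as fh:
            csv.writer(fh).writerows(rows)
        resources.append({
            "name": name,
            "path": csv_path.name,
            "license": licenses.get(name),  # None becomes null in JSON
        })
    index_path = out_dir / "metadata.json"
    index_path.write_text(
        json.dumps({"resources": resources}, indent=2), encoding="utf-8"
    )
    return index_path


if __name__ == "__main__":
    # Hypothetical sheets and values, for illustration only.
    sheets = {
        "wind_capacity_factors": [["hour", "cf"], [1, 0.31]],
        "grid_lines": [["id", "voltage_kv"], ["l1", 380]],
        "power_plants": [["name", "mw"], ["plant_a", 500]],
    }
    licenses = {
        "wind_capacity_factors": "public-domain",  # e.g. derived from MERRA-2
        "grid_lines": "ODbL-1.0",                  # e.g. derived from OpenStreetMap
        # power_plants: unclear status, deliberately left without a license
    }
    with tempfile.TemporaryDirectory() as tmp:
        print(split_with_licenses(sheets, licenses, tmp).read_text(encoding="utf-8"))
```

A split like this keeps the licensing question local to each file, but it does nothing for internal consistency across files.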

The larger problem we collectively face will not be solved until the European Commission mandates the use of permissive licenses or public domain dedications and we and/or the Commission develop an ontology and related standards for the energy sector.

In the background is the question as to whether much or most of this data is indeed copyrightable. Hopefully the Commission can clarify this matter in due course.

Regardless, DIW could release the PDF documentation under a Creative Commons license. Perhaps you should explore that option with your board? I know it is not common practice for institutional reports to be openly licensed.

The full references are repeated below, because I just databased them.

Like Ludwig said, it looks like very useful work. Best, Robbie.

References

Kunz, Friedrich, Jens Weibezahn, Philipp Hauser, Sina Heidari, Wolf-Peter Schill, Björn Felten, Mario Kendziorski, Matthias Zech, Jan Zepter, Christian von Hirschhausen, Dominik Möst, and Christoph Weber (27 December 2017). Reference data set: electricity, heat, and gas sector data for modeling the German system (Version 1.0.0). Berlin, Germany: DIW Berlin (Deutsches Institut für Wirtschaftsforschung). doi:10.5281/zenodo.1044463.

Kunz, Friedrich, Mario Kendziorski, Wolf-Peter Schill, Jens Weibezahn, Jan Zepter, Christian von Hirschhausen, Philipp Hauser, Matthias Zech, Dominik Möst, Sina Heidari, Björn Felten, and Christoph Weber (December 2017). Electricity, heat, and gas sector data for modeling the German system. Berlin, Germany: DIW Berlin (Deutsches Institut für Wirtschaftsforschung). ISSN 1861-1532.


#4

That’s a good point, @robbie.morrison.
Using OSM data requires at least appropriate attribution. That is missing entirely. I only see: “Source: own illustration”

This data set must represent tens of hours of work just for digitizing the maps and harmonizing the sources. That’s my only concern: people are spending so much effort, time, and money to create something as useful as this, but the copyright issues prevent everybody else from using it and contributing to it.


#5

Dear @ludwig.huelk and @robbie.morrison,

Thank you for the quick response and your suggestions. I’ll take your feedback to our next project meeting and we will discuss your concerns. As this project runs until March 2019, we may have some time to release an updated version.


#6

Dear All,

We are well aware of the copyright problems. The © notice in the front matter of the DIW document is more or less standard procedure at the institute and refers only to the text itself. We wanted to publish the data set (on Zenodo) completely openly, but the copyleft issues @robbie.morrison pointed out prevented us from selecting a specific license for republishing the whole data set. So we wanted to publish without any license at all (“grey”), but the repository does not offer this option, which is why we chose “other (open)” to be as unspecific as possible. We are currently planning to publish the documentation under a CC-BY 4.0 license. This will take a while, though, but as @Ph_Hauser mentioned, there is still some time left in the project to improve this matter.

Best,
Jens


#7

Hello @jens.weibezahn. I really appreciate the work that DIW and others are doing on dataset licensing. Collections of datasets, such as your Zenodo files (one is a multi-sheet spreadsheet and the other a zip of shapefiles and other formats), represent a special area of law: copyright in compilation in the context of collective works. (Remixing to produce derivative works is a separate matter and not considered here.)

The exact same issue exists for large software packages and distros (such as Ubuntu). Moglen and Choudhary (2017) explain the issues in the context of US copyright law. There is a huge debate in the open source law/tech world as to how best to manage FOSS license compliance for packages, package repositories, and distros. This effort reflects FOSS entering mainstream use (for example, automotive-grade Linux is used by Japanese and Chinese vehicle manufacturers, although I have not heard German firms mentioned in this context).

European database rights may also apply. But the underlying directive (as transposed into member state law) seems to be predicated on the concept of a server: perhaps the Zenodo site fulfills that function? In any case, a database right can be waived with a simple statement to that effect, but it would (I guess) need to apply to the entire site.

As I mentioned offline to @Ph_Hauser, I am going to follow this up on a legal mailing list and report back if I uncover anything useful. With best wishes, Robbie.

References

Moglen, Eben, and Mishi Choudhary (27 March 2017). Free software distributions and ancillary rights. New York, USA: Software Freedom Law Center.