Do-a-thon: Development of a distributed data architecture for sharing data in energy systems analysis

Dear all,

I also would like to propose a do-a-thon:

Proposal for:
Do-a-thon
Session Title
Development of a distributed data architecture for sharing data in energy systems analysis.
Session Description
In this session we would like to discuss what a distributed architecture of databases for energy systems modelling could look like. We have a nationally funded project for the development and demonstration of a distributed data sharing architecture in energy systems modelling. To be successful with such an approach, we need to be inclusive, so we want to reach out to the community to discuss our initial thoughts on how this could work. We want to learn your perspective and jointly improve our vision of what this could look like, to develop something for the benefit of all.
Would you like to be responsible for this Session?
Yes. Unfortunately, I can only do this on Wednesday due to other appointments.
Do you need any special infrastructure for this Session?
No
Do you have any recommendations who could be part of this Session?
Anyone interested in sharing data and making data easier to discover and more accessible.
…


We created a short agenda for the workshop:
13:00 Welcome and short introduction of the participants

13:15 Introduction to the LOD-GEOSS Project: Motivation and Targets
Carsten Hoyer-Klick

13:30 Introduction to the Databus and the concept of the distributed database architecture
Sebastian Hellmann

13:45 Discussion of the general concept

14:00 Use cases of the distributed architecture
Patrick Kuckertz

14:30 Options of participation

15:00 End of Workshop

We have an etherpad:
https://etherpad.wikimedia.org/p/OpenMod_2020_D.1

Content of the etherpad (aren't etherpads deleted after some time?) pasted here for backup:

Etherpad for Session D.1 A distributed data architecture for sharing energy

If you are interested in further information on the project, leave your e-mail here or drop me a message at carsten.hoyer-klick@dlr.de; we will set up a newsletter from the project.

Session: D.1 A distributed data architecture for sharing energy

Moderation:

Carsten Hoyer-Klick (DLR)

Participants:

Ludwig Hülk (RLI) #OpenScience

Christian Hofmann (RLI)

Simon Worthington (TIB)

Notes:

  • 13:20 Introduction to distributed data infrastructures and Linked Open Data (LOD)

  • Data in peer-reviewed papers is commonly 2-3 years old

  • Goals:

  • unified description of data

  • proper licensing

  • flexible, extensible, networked data infrastructure

  • interfaces to GEOSS

  • interfaces to models

Sebastian Hellmann:

presentation URL: https://tinyurl.com/dbpedia-openmod-2020

Use open licenses!

licenses in machine readable format:

https://dalicc.net/license-library

Databus project to reference open data by providing and storing metadata (traceability)

https://databus.dbpedia.org

Saves data cleansing effort because mappings are saved and shared online

Open Data Initiatives:

- reegle

- OEP

- OPSD

- IPCC

- GEOSS: data standards for geo data

- MESOR project for open solar data

https://dbpedia.org

WSDL

Open EI Open Energy Information Szenario B, OPSD -> Open data Initiatives

Patrick Kuckertz:

Group activity:

Copy this template for each group and fill out:

— --- — --- TEMPLATE

Identification of problematic use cases with regard to currently used data infrastructures

A) Please state in general the most problematic data-handling use cases you are confronted with in your everyday work. (max. five sentences each)

B) Please select the use case with the highest priority from your list above and describe and/or sketch it in detail. (use the back side if necessary)

  1.   Which use case did you select?
    
  2.   Who is the primary actor of the process?
    
  3.   Which other stakeholders / systems are involved?
    
  4.   What triggers the process (pre-conditions)?
    
  5.   What are the goals of the process (post-conditions)?
    
  6.   What information is exchanged between the actors / systems?
    
  7.   What are the individual process steps and their chronological order?
    

— --- — --- TEMPLATE

— --- — --- Anonymous1

Identification of problematic use cases with regard to currently used data infrastructures

A) Please state in general the most problematic data-handling use cases you are confronted with in your everyday work. (max. five sentences each)

B) Please select the use case with the highest priority from your list above and describe and/or sketch it in detail. (use the back side if necessary)

  1.   Which use case did you select?
    
  2.   Who is the primary actor of the process?
    
  3.   Which other stakeholders / systems are involved?
    
  4.   What triggers the process (pre-conditions)?
    
  5.   What are the goals of the process (post-conditions)?
    
  6.   What information is exchanged between the actors / systems?
    
  7.   What are the individual process steps and their chronological order?
    

— --- — --- TEMPLATE

Demo Log Output

slides: http://tinyurl.com/dbpedia-openmod-2020

Command:

bin/DatabusClient -f ttl -c gz -s query.query

##################################

LOG

========================================================

TASK:

convert file(s) from query:

PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>

PREFIX dataid-cv: <http://dataid.dbpedia.org/ns/cv#>

PREFIX dct: <http://purl.org/dc/terms/>

PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT DISTINCT ?file WHERE {

     ?dataset dataid:version <https://databus.dbpedia.org/kurzum/mastr/bnetza-mastr/01.04.00> .

    ?dataset dcat:distribution ?distribution .

    ?distribution dcat:downloadURL ?file .

    ?distribution <http://dataid.dbpedia.org/ns/core#formatExtension> 'csv'^^<http://www.w3.org/2001/XMLSchema#string> . 

    ?distribution <http://dataid.dbpedia.org/ns/cv#type> 'hydro'^^<http://www.w3.org/2001/XMLSchema#string> .

}

to destination:

/home/shellmann/IdeaProjects/databus-client/files
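The query in the task above could also be sent to a SPARQL endpoint directly, without the client. The following is only a minimal Python sketch: the endpoint URL `https://databus.dbpedia.org/sparql` and the helper names `build_query` and `fetch_files` are assumptions for illustration and do not come from the log. The prefix IRIs are written in angle brackets, as SPARQL requires.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL (not taken from the log); check the Databus docs.
SPARQL_ENDPOINT = "https://databus.dbpedia.org/sparql"

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Same graph pattern as in the log; doubled braces survive str.format().
QUERY_TEMPLATE = """\
PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dataid-cv: <http://dataid.dbpedia.org/ns/cv#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT DISTINCT ?file WHERE {{
    ?dataset dataid:version <{version}> .
    ?dataset dcat:distribution ?distribution .
    ?distribution dcat:downloadURL ?file .
    ?distribution dataid:formatExtension '{extension}'^^<{xsd_string}> .
    ?distribution dataid-cv:type '{kind}'^^<{xsd_string}> .
}}
"""


def build_query(version, extension, kind):
    """Fill the query template for one dataset version (hypothetical helper)."""
    return QUERY_TEMPLATE.format(
        version=version, extension=extension, kind=kind, xsd_string=XSD_STRING
    )


def fetch_files(query, endpoint=SPARQL_ENDPOINT):
    """Send the query via HTTP GET and return the ?file bindings as a list."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        results = json.load(response)
    return [b["file"]["value"] for b in results["results"]["bindings"]]
```

Calling `fetch_files(build_query("https://databus.dbpedia.org/kurzum/mastr/bnetza-mastr/01.04.00", "csv", "hydro"))` would return the download URLs, provided the assumed endpoint is reachable and still serves this dataset.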

========================================================

DOWNLOAD TOOL:


Files to download:

http://dbpedia-mappings.tib.eu/databus-repo/kurzum/mastr/bnetza-mastr/01.04.00/bnetza-mastr_rli_type=hydro.csv.bz2 --> already exists in Cache
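The "already exists in Cache" message and the input file path in the conversion step of this log suggest that the cache mirrors the host and path of the download URL. A minimal Python sketch of such a check follows; the function names `cache_path` and `needs_download` are hypothetical, and this is not the Databus client's actual code.

```python
from pathlib import Path
from urllib.parse import urlparse


def cache_path(download_url, cache_dir):
    """Map a download URL to <cache_dir>/<host>/<path>, as the log suggests."""
    parsed = urlparse(download_url)
    return Path(cache_dir) / parsed.netloc / parsed.path.lstrip("/")


def needs_download(download_url, cache_dir):
    """Return True if the file is not yet present in the cache directory."""
    return not cache_path(download_url, cache_dir).exists()
```

With this layout, a second run for the same dataset version finds the file on disk and can skip the download.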

========================================================

CONVERSION TOOL:

input file: /home/shellmann/IdeaProjects/databus-client/target/databus.tmp/cache_dir/dbpedia-mappings.tib.eu/databus-repo/kurzum/mastr/bnetza-mastr/01.04.00/bnetza-mastr_rli_type=hydro.csv.bz2

output file: /home/shellmann/IdeaProjects/databus-client/files/kurzum/mastr/bnetza-mastr/01.04.00/bnetza-mastr_rli_type=hydro.ttl.gz

MappingInfoFile: https://raw.githubusercontent.com/dbpedia/format-mappings/master/tarql/1.ttl#this

Used CSVOptions:

Delimiter: ;

EscapeCharacter:null

QuoteCharacter: null

Encoding: null

http://data.rli.de/ontology/SME922277414628 @rdf:type http://data.rli.de/ontology/bnetza-mastr_type

http://data.rli.de/ontology/SME922277414628 @http://dbpedia.org/ontology/Name “Wasserkraftanlage Kettwig”

http://data.rli.de/ontology/SME922277414628 @http://data.rli.de/ontology/hydro_einheitart “Stromerzeugungseinheit”

http://data.rli.de/ontology/SME922277414628 @http://data.rli.de/ontology/Id “250”^^http://www.w3.org/2001/XMLSchema#integer

http://data.rli.de/ontology/SME968170947832 @rdf:type http://data.rli.de/ontology/bnetza-mastr_type

http://data.rli.de/ontology/SME968170947832 @http://dbpedia.org/ontology/Name “Schliffgesmühle”

…
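To illustrate what the conversion step does conceptually, here is a minimal Python sketch that turns semicolon-delimited CSV rows into N-Triples lines. It is not the client's implementation: the log points to a tarql mapping file, whereas this sketch simply uses each column header as a predicate under one namespace. The column name `unit_id`, the sample row layout, and the function names are assumptions; the namespace and the values are taken from the log.

```python
import csv
import io

BASE = "http://data.rli.de/ontology/"  # namespace seen in the log output
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample; the real CSV header is not shown in the log.
SAMPLE = (
    "unit_id;Name;hydro_einheitart\n"
    "SME922277414628;Wasserkraftanlage Kettwig;Stromerzeugungseinheit\n"
)


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text, id_column, type_iri, delimiter=";"):
    """Yield one rdf:type triple per row plus one triple per non-empty cell.

    Assumes column headers are usable as IRI local names (no spaces).
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        subject = f"<{BASE}{row[id_column]}>"
        yield f"{subject} <{RDF_TYPE}> <{type_iri}> ."
        for column, value in row.items():
            # Skip the id column, surplus cells, and empty or missing values.
            if column is None or column == id_column or not value:
                continue
            yield f'{subject} <{BASE}{column}> "{escape_literal(value)}" .'
```

Unlike the log, where `Name` is mapped to the DBpedia ontology, this sketch keeps all predicates in the RLI namespace; a shared mapping file is what lets the real tool reuse such decisions across users.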


This document is in the public domain and licensed under the open CC0 license.

By using it, all contributors agree to these terms.

Please make sure that no copyrighted material is inserted.

https://creativecommons.org/publicdomain/zero/1.0/deed.de

