Open
Challenge Description
When
- starting a new project, or
- moving an existing project into another domain, or
- maintaining your project for a long period of time,
you need to find interoperable, relevant, and trustworthy datasets. Today, this is a manual task. Automating it requires a discovery mechanism, and dataset discovery on the Web is still an unsolved problem.
Example cases:
- Setting up a new route planner.
- Moving digital twin software from one city to another.
- Creating a dashboard of a certain indicator, adding more data when it becomes available.
Impact and Importance
Automating data discovery should reduce the costs for:
- setting up a new project
- bringing the project into another context
- maintaining the project over time
Desired Solution
- A language to express the criteria a dataset must meet to enter your project, based on: the shape or schema used (e.g., SHACL), the provenance (e.g., only datasets that originate from X or Y), the geo-temporal extent, usage conditions, etc.
- A data model for a Web-based storage system or data catalog, so that the criteria can be evaluated against it.
- An algorithm to evaluate the criteria (first item) over the data model (second item).
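To make the three parts concrete, here is a minimal Python sketch of how they could fit together. It is only an illustration under strong assumptions: `Criteria`, `DatasetRecord`, `BBox`, `matches`, and `discover` are hypothetical names, the catalog is an in-memory list rather than a Web-based store, shape conformance is reduced to comparing declared shape IRIs (no actual SHACL validation), and usage conditions are reduced to a set of permitted action names.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class BBox:
    """Hypothetical bounding box in degrees (west, south, east, north)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.east < other.west or other.east < self.west
                    or self.north < other.south or other.north < self.south)


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry (the 'data model' part)."""
    identifier: str
    conforms_to: set[str]        # IRIs of shapes/schemas the dataset declares
    publisher: str               # provenance reduced to a publisher IRI
    bbox: BBox                   # spatial extent
    start: date                  # temporal extent, inclusive
    end: date
    permitted_actions: set[str]  # usage conditions reduced to action names


@dataclass
class Criteria:
    """Hypothetical criteria (the 'language' part); None means 'no constraint'."""
    required_shape: Optional[str] = None
    trusted_publishers: Optional[set[str]] = None
    area: Optional[BBox] = None
    period: Optional[tuple[date, date]] = None
    required_actions: set[str] = field(default_factory=set)


def matches(c: Criteria, d: DatasetRecord) -> bool:
    """Evaluate one set of criteria against one catalog entry."""
    if c.required_shape is not None and c.required_shape not in d.conforms_to:
        return False
    if c.trusted_publishers is not None and d.publisher not in c.trusted_publishers:
        return False
    if c.area is not None and not c.area.intersects(d.bbox):
        return False
    if c.period is not None:
        wanted_start, wanted_end = c.period
        if d.end < wanted_start or wanted_end < d.start:
            return False
    # Every action the project needs must be permitted by the dataset.
    return c.required_actions <= d.permitted_actions


def discover(c: Criteria, catalog: list[DatasetRecord]) -> list[str]:
    """The 'algorithm' part: identifiers of all entries that satisfy the criteria."""
    return [d.identifier for d in catalog if matches(c, d)]


# Example data; all IRIs and identifiers are made up.
SHAPE = "https://example.org/shapes/Timetable"
catalog = [
    DatasetRecord("dataset-a", {SHAPE}, "https://example.org/org/x",
                  BBox(3.6, 51.0, 3.8, 51.1), date(2024, 1, 1), date(2024, 12, 31),
                  {"use", "distribute"}),
    DatasetRecord("dataset-b", {SHAPE}, "https://example.org/org/z",
                  BBox(10.6, 59.8, 10.9, 60.0), date(2024, 1, 1), date(2024, 12, 31),
                  {"use"}),
]
criteria = Criteria(
    required_shape=SHAPE,
    trusted_publishers={"https://example.org/org/x", "https://example.org/org/y"},
    area=BBox(3.5, 50.9, 3.9, 51.2),
    period=(date(2024, 6, 1), date(2024, 6, 30)),
    required_actions={"use"},
)
print(discover(criteria, catalog))
```

In this sketch only `dataset-a` is selected, because `dataset-b` comes from a publisher outside the trusted set. A real solution would need to decide how criteria are serialized, how shapes are actually validated, and how the evaluation runs over distributed catalogs; none of that is settled here.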
Acceptance Criteria
- A specification of the language is available, with examples of how to express which datasets are relevant to your application
- A data model specification is available
- A reference implementation of the algorithm can be tested
References and Resources
- Describing data catalogs and their datasets, including e.g. geo-temporal aspects: DCAT
- Describing policies: ODRL
- For pipelines and provenance, see Pipelining workflows across participants #1
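As a rough illustration of what these vocabularies already offer, the sketch below builds a dataset description as JSON-LD in Python, using DCAT and Dublin Core terms for the catalog metadata and an ODRL policy for the usage conditions. The dataset, publisher, and shape IRIs are made up, and the `summarize` helper is hypothetical; it only shows how the fields relevant to discovery could be read back out.

```python
import json

# A DCAT dataset description with an attached ODRL policy, written as JSON-LD.
# All example.org IRIs are placeholders.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "odrl": "http://www.w3.org/ns/odrl/2/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
    },
    "@id": "https://example.org/datasets/a",
    "@type": "dcat:Dataset",
    "dct:title": "Example timetable dataset",
    "dct:publisher": {"@id": "https://example.org/org/x"},
    "dct:conformsTo": {"@id": "https://example.org/shapes/Timetable"},
    "dct:temporal": {
        "@type": "dct:PeriodOfTime",
        "dcat:startDate": {"@type": "xsd:date", "@value": "2024-01-01"},
        "dcat:endDate": {"@type": "xsd:date", "@value": "2024-12-31"},
    },
    "odrl:hasPolicy": {
        "@type": "odrl:Set",
        "odrl:permission": [{"odrl:action": {"@id": "odrl:use"}}],
    },
}


def summarize(doc: dict) -> dict:
    """Hypothetical helper: pull out the fields a discovery step might inspect."""
    period = doc["dct:temporal"]
    permissions = doc["odrl:hasPolicy"]["odrl:permission"]
    return {
        "id": doc["@id"],
        "publisher": doc["dct:publisher"]["@id"],
        "shape": doc["dct:conformsTo"]["@id"],
        "start": period["dcat:startDate"]["@value"],
        "end": period["dcat:endDate"]["@value"],
        "actions": [p["odrl:action"]["@id"] for p in permissions],
    }


print(json.dumps(summarize(dataset), indent=2))
```

This reads the JSON-LD as a plain dictionary in one fixed layout. A real implementation would more likely parse it with an RDF library so that equivalent serializations are handled uniformly.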