Data discovery #2

@pietercolpaert

Challenge Description

When

  • starting a new project, or
  • moving an existing project into another domain, or
  • maintaining your project for a long period of time,

you need to find interoperable, relevant and trustworthy datasets. Today, this is a manual task. Automating it requires a discovery mechanism, which is still an unsolved problem on the Web.

Example cases:

  1. Setting up a new route planner.
  2. Moving digital twin software from one city to another.
  3. Creating a dashboard of a certain indicator, adding more data when it becomes available.

Impact and Importance

Automating data discovery should reduce the costs for:

  1. setting up a new project
  2. bringing the project into another context
  3. maintaining the project over time

Desired Solution

  1. A language to express the criteria for a dataset to enter your project, based on: the shape or schema used (e.g., SHACL), the provenance (e.g., only datasets that originate from X or Y), geo-temporal extent, usage conditions, etc.
  2. A data model for a Web-based storage system or data catalog, so that the criteria can be evaluated.
  3. An algorithm to evaluate the criteria (1) over the data model (2).
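
To make the three parts concrete, here is a minimal Python sketch. It is not a proposed design: `DatasetRecord`, `Criteria`, `matches` and `discover` are hypothetical names, the criteria "language" is reduced to a plain data structure, and the URLs, publishers and bounding boxes are made-up illustrative values. Shape conformance is only checked against identifiers declared in the catalog; a real solution would presumably validate the data itself (e.g., with SHACL).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# (min_lon, min_lat, max_lon, max_lat)
BBox = tuple[float, float, float, float]


@dataclass(frozen=True)
class DatasetRecord:
    """Part 2 (data model): a hypothetical catalog entry."""
    url: str
    conforms_to: frozenset[str]  # declared shape/schema identifiers
    publisher: str               # provenance
    bbox: BBox                   # spatial extent
    start: date                  # temporal extent
    end: date
    license: str                 # usage conditions


@dataclass(frozen=True)
class Criteria:
    """Part 1 (language): hypothetical criteria, reduced to a struct."""
    required_shape: str
    trusted_publishers: frozenset[str]
    area: BBox
    valid_on: date
    allowed_licenses: frozenset[str]


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches(record: DatasetRecord, criteria: Criteria) -> bool:
    """Part 3 (algorithm): a record is accepted only if every criterion holds."""
    return (
        criteria.required_shape in record.conforms_to
        and record.publisher in criteria.trusted_publishers
        and bbox_intersects(record.bbox, criteria.area)
        and record.start <= criteria.valid_on <= record.end
        and record.license in criteria.allowed_licenses
    )


def discover(catalog: list[DatasetRecord], criteria: Criteria) -> list[DatasetRecord]:
    """Return the catalog entries that satisfy the criteria."""
    return [record for record in catalog if matches(record, criteria)]


# Illustrative data only; none of these datasets or identifiers exist.
CATALOG = [
    DatasetRecord(
        url="https://example.org/datasets/city-a-transit",
        conforms_to=frozenset({"https://example.org/shapes/TransitStop"}),
        publisher="https://example.org/publishers/city-a",
        bbox=(3.58, 50.98, 3.85, 51.19),
        start=date(2024, 1, 1),
        end=date(2025, 12, 31),
        license="CC0-1.0",
    ),
    DatasetRecord(
        url="https://example.org/datasets/city-b-transit",
        conforms_to=frozenset({"https://example.org/shapes/TransitStop"}),
        publisher="https://example.org/publishers/city-b",
        bbox=(4.24, 50.76, 4.48, 50.91),
        start=date(2024, 1, 1),
        end=date(2025, 12, 31),
        license="CC0-1.0",
    ),
]

CRITERIA = Criteria(
    required_shape="https://example.org/shapes/TransitStop",
    trusted_publishers=frozenset({
        "https://example.org/publishers/city-a",
        "https://example.org/publishers/city-b",
    }),
    area=(3.6, 51.0, 3.8, 51.1),
    valid_on=date(2025, 6, 1),
    allowed_licenses=frozenset({"CC0-1.0", "CC-BY-4.0"}),
)

if __name__ == "__main__":
    for record in discover(CATALOG, CRITERIA):
        print(record.url)
```

In this sketch, moving a project to another city (example case 2) amounts to changing `area` in the criteria and re-running `discover`, which is the kind of cost reduction the challenge aims for.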

Acceptance Criteria

  1. A specification of the language is available, with examples of how to express which datasets are relevant to your application
  2. A data model specification is available
  3. A reference implementation of the algorithm can be tested

References and Resources
