Partial ISO dates are badly read #1754

@paulgirard

Context:

Use some valid but partial ISO dates:

date
2024
2024-05

with a date schema:

{
  "fields": [{
    "name": "date",
    "type": "date"
  }]
}

Validation passes.

Then read the data with frictionless, using read_rows or any other read method.

Problem:

The values are interpreted as dates, but when they are cast the Python code builds a complete date by filling the missing components from the current date.
If you read your data on 2025-08-08, the example data will be yielded as:

  • 2024-08-08
  • 2024-05-08

As already mentioned in #1715, it is not possible to use partial ISO dates with frictionless. Specifying an ISO format does not help either, above all in use cases like mine where the date precision varies within the dataset.

Two workarounds:

  • only use validation, and do not read the data with frictionless
  • do not use the date type but the string type, with an imperfect regexp pattern:
{
      "name": "date",
      "type": "string",
      "pattern": "[0-9]{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01]))?)?"
}

The pattern is not perfect, as it does not cover:

  • calendar inconsistencies like 2025-02-31 or 2022-04-31
  • far-away dates, such as years before the common era or in the distant future (although the pattern could easily be extended for that)
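
One way to catch the calendar inconsistencies that the pattern misses, while keeping the original precision, is to parse the string value after reading it. The following is a sketch under assumptions: `PartialDate` and `parse_partial_iso` are hypothetical names, not part of frictionless, and only four-digit years from 0001 to 9999 are handled.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Year, optional month, optional day (the day requires the month).
PARTIAL_ISO = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


class PartialDate(NamedTuple):
    """A date that remembers which components were actually given."""

    year: int
    month: Optional[int] = None
    day: Optional[int] = None


def parse_partial_iso(text: str) -> PartialDate:
    """Parse YYYY, YYYY-MM or YYYY-MM-DD without inventing missing parts."""
    match = PARTIAL_ISO.fullmatch(text)
    if match is None:
        raise ValueError(f"not a partial ISO date: {text!r}")
    year, month, day = (
        int(group) if group is not None else None for group in match.groups()
    )
    # Let datetime.date reject impossible values such as 2025-02-31.
    # Missing components are replaced by 1 for this check only.
    date(year, 1 if month is None else month, 1 if day is None else day)
    return PartialDate(year, month, day)


print(parse_partial_iso("2024"))     # month and day stay None
print(parse_partial_iso("2024-05"))  # day stays None
```

Both malformed strings and impossible calendar dates raise `ValueError`, so the same helper can be used as an extra validation step on the string column.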

Relates to #1696
