Incorrect time_range in catalog causes preprocessor to hang #797

@emaroon

Description

Bug Severity

  • 1 = Minor problem that does not affect total framework functionality (e.g., computation error in a POD, problem with logging output, or an issue on a single system).
  • 2 = Major problem that affects overall functionality, but that does not occur for all users (e.g., problems installing the framework with a specific Conda version, a framework option that causes one or more PODs to fail, or missing/incompatible Python modules).
  • 3 = Catastrophic problem that occurs frequently for multiple users and/or on multiple systems (e.g., framework consistently fails to install on multiple systems, or one or more PODs consistently fail after running successfully).

Describe the bug
The preprocessor hangs indefinitely on a variable if the catalog's time_range does not have the expected format. No error is raised; the run simply never finishes.

Steps To Reproduce
I used the catalog tool included with the MDTF to create a CMIP-style catalog. The time_range values created by that tool have a format like:
1850-01-15 13:00:00.000007-2014-12-15 12:00:00

The preprocessor appears unable to handle this format, and the MDTF hangs indefinitely. If I manually overwrite the catalog so that time_range follows the expected format:
1850-01-15-2014-12-15
then everything works.
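
As a workaround, the manual overwrite described above could be scripted. The following is only a sketch: `normalize_time_range` is a hypothetical helper of mine, not part of the MDTF or of the catalog tool, and it assumes every time_range value contains exactly two YYYY-MM-DD dates (start, then end).

```python
import re

# Matches a YYYY-MM-DD date that is not embedded in a longer run of digits.
# The lookarounds keep fractional seconds such as ".000007" from being
# mistaken for the start of a date.
_DATE = re.compile(r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)")


def normalize_time_range(value: str) -> str:
    """Reduce a time_range string to the form YYYY-MM-DD-YYYY-MM-DD.

    Hypothetical helper: drops any time-of-day component and keeps only
    the start and end dates, in the order they appear.
    """
    dates = _DATE.findall(value)
    if len(dates) != 2:
        raise ValueError(f"expected two dates in time_range, got {value!r}")
    return "-".join(dates)


if __name__ == "__main__":
    raw = "1850-01-15 13:00:00.000007-2014-12-15 12:00:00"
    print(normalize_time_range(raw))
```

Applying this to each time_range entry in the catalog file before running the MDTF should be equivalent to the manual edit, under the assumption stated above.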

I traced the hang to line 1221 of src/preprocessor.py:

    if not var.is_static:
        if "chunk_freq" in cat_subset.df:
            cat_subset.esmcat._df = self.check_multichunk(cat_subset.df, date_range, var.log)
        cat_subset.esmcat._df = self.check_group_daterange(cat_subset.df, date_range, var.log)  # I think the hang is here
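
Since the hang produces no error, one possible guard would be to validate each time_range before the date-range check runs, so that a malformed value fails fast. This is a rough sketch under my own assumptions, not the framework's code: `parse_time_range` is a hypothetical function, and the only format it accepts is the YYYY-MM-DD-YYYY-MM-DD form that worked for me.

```python
from datetime import datetime


def parse_time_range(value: str) -> tuple[datetime, datetime]:
    """Parse 'YYYY-MM-DD-YYYY-MM-DD' into (start, end) or raise ValueError.

    Hypothetical helper: rejects anything that is not exactly two
    ten-character dates joined by a single hyphen.
    """
    # Two 10-character dates plus one separator gives 21 characters.
    if len(value) != 21 or value[10] != "-":
        raise ValueError(f"unexpected time_range format: {value!r}")
    # strptime raises ValueError itself if either half is not a valid date.
    start = datetime.strptime(value[:10], "%Y-%m-%d")
    end = datetime.strptime(value[11:], "%Y-%m-%d")
    if end < start:
        raise ValueError(f"time_range ends before it starts: {value!r}")
    return start, end


if __name__ == "__main__":
    print(parse_time_range("1850-01-15-2014-12-15"))
    try:
        parse_time_range("1850-01-15 13:00:00.000007-2014-12-15 12:00:00")
    except ValueError as err:
        print(f"rejected: {err}")
```

With a check like this, the catalog generated by the tool would be rejected with a clear message instead of causing a silent hang.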

Environment
Describe the system environment:

Log information and/or terminal output
The code hangs after this warning, and no error is raised:

Querying /glade/work/emaroon/mdtf/MDTF-diagnostics/diagnostics/natl_ocean/CMIP_CESM_historical_001.json for variable vsf for case CESM2_historical_r1i1p1f1.
WARNING: /glade/work/emaroon/conda-envs/_MDTF_base/lib/python3.12/site-packages/intake_esm/_search.py:50: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
mask = df[column].str.contains(value, regex=True, case=True, flags=0)
