FlorianDeconinck
Collaborator

Main change

This PR proposes a new tool to check NetCDFs against one another, using a series of guesses to match compute domains. The tool describes its logic as it goes and fails if it can't resolve the differences.

Options for name mapping & halo size.
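
As an illustration of the kind of guess involved (strict compute domain versus domain plus halo), here is a minimal sketch. It is not the code from this PR: the function name `guess_domain_slices` and the per-axis rule (an axis matches if the sizes are equal, or if one side is larger by exactly twice the halo) are assumptions made for the example.

```python
from typing import Optional, Tuple

Slices = Tuple[slice, ...]


def guess_domain_slices(
    shape_a: Tuple[int, ...], shape_b: Tuple[int, ...], halo: int
) -> Optional[Tuple[Slices, Slices]]:
    """Hypothetical helper: return slices bringing A and B onto a common
    compute domain, or None when no guess fits."""
    if len(shape_a) != len(shape_b):
        return None  # different rank: this sketch does not try to reconcile it
    slices_a, slices_b = [], []
    for size_a, size_b in zip(shape_a, shape_b):
        if size_a == size_b:
            # Same extent: compare the full axis on both sides
            slices_a.append(slice(None))
            slices_b.append(slice(None))
        elif halo > 0 and size_a == size_b + 2 * halo:
            # A carries a halo on both ends of this axis: trim it
            slices_a.append(slice(halo, size_a - halo))
            slices_b.append(slice(None))
        elif halo > 0 and size_b == size_a + 2 * halo:
            # B carries the halo instead
            slices_a.append(slice(None))
            slices_b.append(slice(halo, size_b - halo))
        else:
            return None  # sizes cannot be explained by the halo: give up
    return tuple(slices_a), tuple(slices_b)
```

With `--halo 3`, a field of shape `(18, 18, 5)` in A and `(12, 12, 5)` in B would be compared on A's interior `[3:15, 3:15, :]`, while shapes that differ by anything other than twice the halo make the helper return `None`.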

After (re)installing NDSL, it can be called with

best_guess_diff /path/to/A.nc4 /path/to/B.nc4 --halo 3 --name_mapping mapper.yaml

where --name_mapping points to a YAML file containing a single dictionary that maps variable names in A to their names in B.
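
To make the mapping concrete, here is a small sketch of how such a dictionary could be applied once loaded (for example with PyYAML's `yaml.safe_load`). The helper `pair_variables`, the fallback to the same name when a variable is not in the mapping, and the variable names in the usage note are all assumptions for illustration, not the behavior of the tool in this PR.

```python
from typing import Dict, Iterable, List, Tuple


def pair_variables(
    names_a: Iterable[str],
    names_b: Iterable[str],
    name_mapping: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Hypothetical helper: pair each variable of A with its counterpart in B.

    Returns the matched (name_in_A, name_in_B) pairs and the names of A
    that have no counterpart in B.
    """
    available_b = set(names_b)
    pairs: List[Tuple[str, str]] = []
    unmatched: List[str] = []
    for name_a in names_a:
        # Assumption: a variable missing from the mapping keeps its name
        name_b = name_mapping.get(name_a, name_a)
        if name_b in available_b:
            pairs.append((name_a, name_b))
        else:
            unmatched.append(name_a)
    return pairs, unmatched
```

For instance, a `mapper.yaml` containing the single line `u: ucomp` would load as `{"u": "ucomp"}`; calling `pair_variables(["u", "pt"], ["ucomp", "pt"], {"u": "ucomp"})` then pairs `u` with `ucomp` and `pt` with `pt`.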

Minor QOL

  • Debugger now knows how to handle FieldBundle
  • quantity.to_netcdf looks up the rank itself when no rank is given
  • ndsl_log is imported early to avoid circular dependencies

Collaborator

oelbert left a comment

Small changes, but also: how many of these did you run into when diffing NetCDFs? Do you want to add checks for extra dimensions while we're at it?

def get_parser():
    parser = argparse.ArgumentParser(
        "Attempt to diff two NetCDFs with similar data."
        "Differences that can be reconcialed are strict domain vs halo, variable name mapping."

Suggested change
- "Differences that can be reconcialed are strict domain vs halo, variable name mapping."
+ "Differences that can be reconciled are strict domain vs halo, variable name mapping."

    parser = argparse.ArgumentParser(
        "Attempt to diff two NetCDFs with similar data."
        "Differences that can be reconcialed are strict domain vs halo, variable name mapping."
        "They program will report on assumptions taken."

Suggested change
- "They program will report on assumptions taken."
+ "The program will report on assumptions taken."

    parser.add_argument(
        "netcdf_A",
        type=str,
        help="path of NetCDFs, named A in the logs.",

Suggested change
- help="path of NetCDFs, named A in the logs.",
+ help="path of first NetCDF file, named A in the logs.",

    parser.add_argument(
        "netcdf_B",
        type=str,
        help="path of NetCDFs, named B in the logs.",

Suggested change
- help="path of NetCDFs, named B in the logs.",
+ help="path of second NetCDF file, named B in the logs.",
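
For reference, a parser with the four suggestions above applied might look like the sketch below. Only the two positional arguments and the description text come from the snippets under review; the `--halo` and `--name_mapping` flags are taken from the command line in the PR description, and their types, defaults and help strings are assumptions. The sketch also passes the text as `description=` (the first positional parameter of `argparse.ArgumentParser` is `prog`) and adds a space between the adjacent string literals, since Python concatenates them without a separator.

```python
import argparse


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description=(
            "Attempt to diff two NetCDFs with similar data. "
            "Differences that can be reconciled are strict domain vs halo, "
            "variable name mapping. "
            "The program will report on assumptions taken."
        )
    )
    parser.add_argument(
        "netcdf_A",
        type=str,
        help="path of first NetCDF file, named A in the logs.",
    )
    parser.add_argument(
        "netcdf_B",
        type=str,
        help="path of second NetCDF file, named B in the logs.",
    )
    # Assumed type, default and help text: the PR snippets do not show these
    parser.add_argument(
        "--halo",
        type=int,
        default=0,
        help="halo size to consider when matching compute domains.",
    )
    parser.add_argument(
        "--name_mapping",
        type=str,
        default=None,
        help="path of a YAML file mapping variable names in A to names in B.",
    )
    return parser
```

Parsing the command line from the PR description with this sketch, `get_parser().parse_args(["/path/to/A.nc4", "/path/to/B.nc4", "--halo", "3", "--name_mapping", "mapper.yaml"])`, yields `halo == 3` and `name_mapping == "mapper.yaml"`.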

xr.Dataset(dataset).to_netcdf(f"best_guest_diff_{pathlib.Path(netcdf_A).stem}.nc4")


def entry_point():

too good to call it main()? 😂
