
Conversation

navidcy
Collaborator

@navidcy navidcy commented Sep 7, 2025

It's time to let go of the cosima_cookbook!

@navidcy navidcy added 🛸 updating An existing notebook needs to be updated 📔 tutorial labels Sep 7, 2025

@navidcy
Collaborator Author

navidcy commented Sep 7, 2025

The last cell fails

df = get_detailed_variable_info(catalog, experiment, 'tau_x')

But if all else is good, I suggest we merge and we fix that tomorrow?

@charles-turner-1
Collaborator

Just coming into Melbourne now so I can't check, but the last cell looks like a relatively straightforward fix. I'll take a proper look later/first thing tomorrow.

@navidcy
Collaborator Author

navidcy commented Sep 7, 2025

@charles-turner-1 whenever you find some time, have a look, and if you fix the last cell, feel free to approve and merge!

@@ -0,0 +1,7778 @@
{
Collaborator

@charles-turner-1 charles-turner-1 Sep 8, 2025


This should no longer be true; it should now work out of the box.



@@ -0,0 +1,7778 @@
{
Collaborator

@charles-turner-1 charles-turner-1 Sep 8, 2025


Line #3.        path=".*output000.*"

This line can be deleted (I've checked & it's not doing anything)



@@ -0,0 +1,7778 @@
{
Collaborator

@charles-turner-1 charles-turner-1 Sep 8, 2025


    if columns_with_iterables:
        df = df.explode(columns_with_iterables, ignore_index=True)
    df.index = df[variable_column_name]
Line #29.        df.index = df[variable_column_name]

I think that this will separately explode on each column, which is probably not what we want. It looks like @adele-morrison wrote this - are you able to confirm for me?
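
For reference, here is a minimal sketch of how `DataFrame.explode` treats a list of columns, using a made-up two-row frame (the column names and values are hypothetical, not taken from the catalog). In pandas 1.3 and later, passing several columns explodes them together, row by row, and raises a `ValueError` if the list lengths within a row do not match. Exploding one column at a time gives a different result:

```python
import pandas as pd

# Hypothetical stand-in for a catalog dataframe with two list-valued columns.
df = pd.DataFrame(
    {
        "variable": [["tau_x", "tau_y"], ["temp"]],
        "units": [["N/m^2", "N/m^2"], ["K"]],
        "path": ["file_a.nc", "file_b.nc"],
    }
)

# Exploding both columns in one call keeps the i-th variable paired with the i-th unit.
together = df.explode(["variable", "units"], ignore_index=True)

# Exploding one column at a time instead yields every variable/unit combination per row.
separately = df.explode("variable", ignore_index=True).explode("units", ignore_index=True)

print(len(together))    # 3 rows
print(len(separately))  # 5 rows
```

Whether the paired behaviour is the intended one here depends on whether the iterable columns in the catalog are meant to line up element by element, which this sketch cannot confirm.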



Collaborator


Turns out I wrote this - @navidcy, are you happy for me to push the changes to this branch?

@charles-turner-1
Collaborator

def get_detailed_variable_info(intake_catalog, experiment_name : str, variable : str | None = None) -> "pd.DataFrame":
    """
    Get detailed information about all the variables available in an experiment contained within the catalog.

    If a specific variable is passed, then the returned dataframe will be filtered to include only information
    about that variable

    Returns a pandas dataframe, reorganised to use the variable as the index.

    Parameters:
    -----------
    intake_catalog: 
        The variable holding the intake catalog. If you have opened the catalog using
        `cat = intake.cat.access_nri`, then `intake_catalog=cat`, etc.
    experiment_name: str
        The name of the experiment you are interested in. Eg. `experiment = "01deg_jra55v13_ryf9091"`
    variable: str | None
        If you want detailed information about just a single variable, then pass it here. For 
        example, if you only want information about potential temperature, pass `variable='pot_temp'`
    """

    expt_ds = intake_catalog[experiment_name]
    columns_with_iterables = list(expt_ds.esmcat.columns_with_iterables)
    variable_column_name = expt_ds.esmcat.aggregation_control.variable_column_name
    
+    for col in columns_with_iterables:
+        expt_ds.df[col] = expt_ds.df[col].astype("object").map(lambda x: list(map(str, x)))

    df = expt_ds.df.copy()
    if columns_with_iterables:
        df = df.explode(columns_with_iterables, ignore_index=True)
        
    df.index = df[variable_column_name]
    df.drop(columns=variable_column_name, inplace=True)
    if variable is not None:
        df = df.loc[variable]

    return df

fixes the failing cell, but I'm not sure it gives the intended results.
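
One way to check the result is to run the same steps on a small hand-built frame. The sketch below does not use the intake catalog: the column names, the values, and the `normalise_and_explode` helper are all hypothetical, and it assumes the iterable columns hold tuples of equal length within each row.

```python
import pandas as pd


def normalise_and_explode(df, iterable_columns, variable_column):
    """Toy version of the steps above, applied to a copy of a plain DataFrame."""
    out = df.copy()
    for col in iterable_columns:
        # Cast each iterable to a list of strings so explode sees uniform list values.
        out[col] = out[col].astype("object").map(lambda x: list(map(str, x)))
    if iterable_columns:
        out = out.explode(iterable_columns, ignore_index=True)
    # Use the variable name as the index, dropping it from the columns.
    return out.set_index(variable_column)


# Hypothetical data: two files, the first holding two variables.
toy = pd.DataFrame(
    {
        "variable": [("tau_x", "tau_y"), ("temp",)],
        "variable_units": [("N/m^2", "N/m^2"), ("K",)],
        "filename": ["ocean_month.nc", "ocean.nc"],
    }
)

info = normalise_and_explode(toy, ["variable", "variable_units"], "variable")
print(info.loc["tau_x"])
```

Working on a copy leaves the input frame untouched, whereas the snippet above assigns into `expt_ds.df` before copying it; that difference may or may not matter for the notebook.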

@charles-turner-1 charles-turner-1 merged commit 7af03d6 into main Sep 8, 2025
3 checks passed
@charles-turner-1 charles-turner-1 deleted the ncc/remove-cosima_cookbook_refs branch September 8, 2025 03:10