Skip to content

Conversation

james-bruten-mo
Copy link
Contributor

@james-bruten-mo james-bruten-mo commented Oct 15, 2024

This PR is to add the main part of the upgrade macro script for applying macros to LFRic. As described in #37 this work will be done in 2 commits, of which this is the first. This code will read through versions.py files, searching for newly added macros, and combine these macros where necessary. The 2nd part will add the rose commands to apply the upgrade macros as well as adding unit tests. Some development history and testing examples can also be seen in lfric_apps:#139 including a description of the overall approach on the details page. A PR for the working practices also exists.

This PR adds:

  • a main and parse args section
  • introduces the ApplyMacros class which is where the majority of the code is. This contains:
    • a section for sourcing required lfric_core and jules working copies (as these contain required metadata)
    • a section for reading and combining new upgrade macros (the majority)
  • some initial functions outside of the above class which provide some general functionality

The script can be tested by getting the following 2 branches:

  • fcm:lfric_apps.x_br/pkg/jamesbruten/vn1.1_macro_pkg
  • fcm:lfric.x_br/dev/jamesbruten/core1.1_core_macro_branch

And then running apply_macros.py vn1.1_t139 -a /path/to/apps -c /path/to/core. This upgrade macro is present in lfric_atm, transport, gungho and components/driver. After running the script, it should be removed from gungho and driver, combined with the existing macro in atm and transport and added to any metadata that imports any of those 4 (most of them in fact!).

@james-bruten-mo james-bruten-mo requested review from jennyhickson and t00sa and removed request for r-sharp November 11, 2024 09:23
Copy link
Contributor

@jennyhickson jennyhickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of trivial comments but generally looking good.

I realised towards the end that I was finding "meta_dir" as the main thing being used as a reference not very intuitive and had to remind myself what that is refering to. Possibly because its both the path to a versions.py and a path to rose-meta.conf and also used as a general key for each app/science section (because those are really one and the same). I don't have an easy suggestion, but wonder if it could be named differently to help with that?

Copy link
Contributor Author

@james-bruten-mo james-bruten-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jenny. Addressed all your inline comments and retested to make sure it still works (as far as it goes in part 1). Only thing I haven't changed is the meta_dir variable as I'm not sure what to do. Happy to have a sit down and come up with something together if think changing would be good.

- str, stdout of find command looking for versions.py files
"""

command = f"find {path} -name versions.py"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've taken the opportunity to learn how os.walk works! It's remarkably easy and definitely a better solution than a find command. Cheers

# skip blank lines
continue
if match_python_import(line):
started_imports = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not very robust so I've changed how this works

def write_python_imports(self, meta_dir):
"""
Write out all required python module imports at the top of a versions.py
file. Read imports from self.python_imports
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the doctring and modified the code as above. The import statements are now written at the top of the current import statements

"Jules sources."
)

def read_meta_imports(self, meta_dir, flag="import"):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have set this to just return a list, rather than trying to be too clever about which flag it's got. It's probably marginally less efficient but not really noticeable and a lot more readable. Changed the calls as required in this point - the other bits are in part 2.

Copy link
Contributor

@jennyhickson jennyhickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. One more comment and over to Sam!

Copy link
Contributor

@t00sa t00sa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've added some comments in line and I've got a couple of general questions, mostly related to error handling and reporting.

What is the intended behaviour if the script fails during execution? Is it acceptable for it to generate empty and/or partial version.py files and to delegate the process of cleaning up and fixing to the user? Or should the script always fail safe and leave the versions.py files intact?

Is it worth add a top-level exception handler which catches and formats RuntimeErrors? If theses can be triggered by the user, then it would be better to generate a user-friendly message rather than displaying a traceback.

I've suggested using the logging module rather than printing things out. My feeling is that it is better to use logging calls than print because it allows the amount of output to be tuned at runtime, but this is partly because I prefer commands that don't produce a lot of output.

# return the found path
for line in result.stdout.split("\n"):
if line.startswith("Working Copy Root Path"):
return line.split(":", 1)[1].strip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth raising an exception after the loop in case the working copy doesn't match? Or is safe to assume that fcm will always provide a valid string if the return code is zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

file_raw = f.readlines()
file_parsed = []
for line in file_raw:
if len(line.strip().strip("\n")) > 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed this and I don't think the double strip is necessary:

>>> s = "  abc \n"
>>> s
'  abc \n'
>>> s.strip()
'abc'
>>> 

I also wonder if it might be clearer to say:

Suggested change
if len(line.strip().strip("\n")) > 0:
if line.strip() != "":

If you're just filtering non-blank lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, done

Comment on lines +405 to +409
ticket_details = re.search(r"Upgrade .* (#\d+) by (\S+.*)", macro)
try:
self.ticket_number = ticket_details.group(1)
self.author = ticket_details.group(2).rstrip('".')
self.author = self.author.strip("<>")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about doing all the work in the regular expression match?

Suggested change
ticket_details = re.search(r"Upgrade .* (#\d+) by (\S+.*)", macro)
try:
self.ticket_number = ticket_details.group(1)
self.author = ticket_details.group(2).rstrip('".')
self.author = self.author.strip("<>")
ticket_details = re.search(r"Upgrade .* (#\d+) by <?([^\.]+)>?\.?", macro)
try:
self.ticket_number = ticket_details.group(1)
self.author = ticket_details.group(2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


filepath = os.path.join(meta_dir, "versions.py")

with open(filepath, "w") as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would benefit from some error handling. Best option would be to write to a temporary file in the same directory and rename once the file has been written but a simple check to make sure the versions file isn't empty might be sufficient...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having done as we discussed (I think), it fails as the target file already exists. Which makes sense but I don't know if I've not done things as expected. So instead of os.link, I've just gone with os.rename() to remove the temp file

Comment on lines 493 to 497
except AttributeError:
raise RuntimeError(
"Couldn't find an after tag in the macro:\n"
f"{macro}"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should raise from the previous exception:

Suggested change
except AttributeError:
raise RuntimeError(
"Couldn't find an after tag in the macro:\n"
f"{macro}"
)
except AttributeError as exc:
raise RuntimeError(
"Couldn't find an after tag in the macro:\n"
f"{macro}"
) from exc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done that elsewhere, clearly missed. Updated now

Comment on lines 643 to 645
with open(fpath, "w") as f:
for line in versions_file:
f.write(line.strip("\n") + "\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with the previous write, worth thinking about writing this to a temporary copy and renaming once the write is successful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, as above. Have also modified the one with append below

except KeyError:
# Jules Shared directories will produce a key error - these are
# guaranteed to not import anything
imports = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it clearer to return an empty list immediately?

Suggested change
imports = []
return []

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah definitely

Comment on lines +744 to +747
print(
"[INFO] Pre-processing macros in",
self.parse_application_section(meta_dir),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general point, have you considered using logging module functions instead of printing to stdout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't, but I think I prefer printing, particularly as this will regularly be used by people not familiar with it. My feeling is that for those, it's better to give some positive feedback that it's actually doing something - particularly as the 2nd PR for this introduces the stuff that takes a bit of time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there are only 4/5 print statements so we can easily come back to this later

Comment on lines 862 to 869
# if the class name doesn't conform to the expected vnXX.Y_tTTTT naming
# convention raise an error
class_name = args.tag.replace(".", "")
if not re.match(CLASS_NAME_REGEX, class_name):
raise RuntimeError(
f"The tag '{args.tag}' does not conform to the "
"'vnXX.Y_tTTTT' naming scheme. Please modify and rerun."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to move this into a function that could be passed in to paser.add_argument() in parse_args() to allow all the command line option error handling to be managed by argpse, e.g.

def check_tag(opt):
    class_name = opt.replace(".", "")
    if not re.match(CLASS_NAME_REGEX, class_name):
        raise argparse.ArgumentTypeError(
            f"The tag '{opt}' does not conform to the "
            "'vnXX.Y_tTTTT' naming scheme. Please modify and rerun."
        )
    return opt

...

    parser.add_argument(
        "tag",
        type=check_tag,
        metavar="after-tag",
        help="The After Tag of the upgrade macro being upgraded to.",
    )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's neat!

macro_object.preprocess_macros()

# Clean up temporary directories
for repo, directory in macro_object.temp_dirs.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using tempfile.TemporaryDirectory() to create a top-level directory and letting the library handle the clean-up when it goes out of scope?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but there is a possibility that the script gets the temporary copy of core and updates apps in there, in which case I don't want it to get deleted, so think I need to do this manually?

Copy link
Contributor Author

@james-bruten-mo james-bruten-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sam. Have addressed all your inline comments. I've left a note about using the logging module. I think having the prints is useful as otherwise it can feel like it's hanging. Happy to discuss that further though - I am probably not using the logging module perfectly either.

Only thing I haven't done is implement a custom error handling. There are a number of raise Exceptions with custom error messages - is it these you want changing? Or is it to catch a runtime error that I haven't got an exception for? In which case surely we want the stack trace?

# return the found path
for line in result.stdout.split("\n"):
if line.startswith("Working Copy Root Path"):
return line.split(":", 1)[1].strip()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

file_raw = f.readlines()
file_parsed = []
for line in file_raw:
if len(line.strip().strip("\n")) > 0:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, done


filepath = os.path.join(meta_dir, "versions.py")

with open(filepath, "w") as f:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having done as we discussed (I think), it fails as the target file already exists. Which makes sense but I don't know if I've not done things as expected. So instead of os.link, I've just gone with os.rename() to remove the temp file

Comment on lines 493 to 497
except AttributeError:
raise RuntimeError(
"Couldn't find an after tag in the macro:\n"
f"{macro}"
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done that elsewhere, clearly missed. Updated now

Comment on lines 583 to 590
# First grep for lines starting with import=
# No imports if it doesn't return anything
result = run_command(f"grep -E '^ *{flag}=' {meta_file}", shell=True)
if not result.stdout:
return []

imports = []
with open(meta_file, "r") as f:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have just deleted the grep - I don't think we need to be that clever about it. These aren't large files, nor is it a performance dependent script

macro_object.preprocess_macros()

# Clean up temporary directories
for repo, directory in macro_object.temp_dirs.items():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but there is a possibility that the script gets the temporary copy of core and updates apps in there, in which case I don't want it to get deleted, so think I need to do this manually?

Comment on lines +405 to +409
ticket_details = re.search(r"Upgrade .* (#\d+) by (\S+.*)", macro)
try:
self.ticket_number = ticket_details.group(1)
self.author = ticket_details.group(2).rstrip('".')
self.author = self.author.strip("<>")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 643 to 645
with open(fpath, "w") as f:
for line in versions_file:
f.write(line.strip("\n") + "\n")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, as above. Have also modified the one with append below

Comment on lines +744 to +747
print(
"[INFO] Pre-processing macros in",
self.parse_application_section(meta_dir),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't, but I think I prefer printing, particularly as this will regularly be used by people not familiar with it. My feeling is that for those, it's better to give some positive feedback that it's actually doing something - particularly as the 2nd PR for this introduces the stuff that takes a bit of time!

Comment on lines +744 to +747
print(
"[INFO] Pre-processing macros in",
self.parse_application_section(meta_dir),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there are only 4/5 print statements so we can easily come back to this later

Copy link
Contributor

@t00sa t00sa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there. I've added a few suggested changes.

james-bruten-mo and others added 3 commits December 2, 2024 12:05
Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
Co-authored-by: Sam Clarke-Green <74185251+t00sa@users.noreply.github.com>
@james-bruten-mo
Copy link
Contributor Author

Almost there. I've added a few suggested changes.

Thanks Sam, have merged in all those suggestions directly

Copy link
Contributor

@t00sa t00sa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Approved.

@t00sa
Copy link
Contributor

t00sa commented Dec 2, 2024

Over @jennyhickson for approval

@jennyhickson jennyhickson merged commit ce9a8aa into main Dec 2, 2024
1 check passed
@james-bruten-mo james-bruten-mo deleted the apps_upgrade_macros_1 branch September 1, 2025 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants