Skip to content

Conversation

ajjackson
Copy link
Contributor

@ajjackson ajjackson commented Jan 17, 2025

Description of work

Fixes #38084

Summary of work

  • Add an input parameter to Abins/Abins2D which explicitly sets the cache file directory. The default value is the Default Save Directory so if the user does nothing files will continue to save in the same place. But they are more likely to notice it!
  • Use temporary directories throughout test suite to avoid collisions and simplify cleanup
    • By refactoring this into the "check" function, some boilerplate is removed from loader tests.
  • Remove existing (more verbose and spooky) cleanup mechanism from Abins tests

Purpose of work

Abins uses the Default Save Directory. This has a few drawbacks:

  • Sometimes it is set to a bad location, and not easy to change (mostly addressed in Detect unwriteable Abins cache, raise better error #38606)
  • Users may be unaware that they are creating cache files that are never cleaned up
  • Parallel tests and workflows may have unintended interactions through a common cache file
  • It is not obvious from the code which functions will interact with a cache

Further detail of work

  • Cache directory was chosen rather than cache filename to avoid a likely future API-break: I am considering migrating all the caching to use joblib (already a Mantid dependency) and this caches to multiple files in a directory structure.

To test:

The changes are covered by unit tests and system tests.

For fun, you can try setting the Default Save Directory to somewhere unwriteable and running the tests. Abins tests used to fail in this scenario, but now will pass because they are all directed to temporary cache directories.

  • Add release note

Reviewer

Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.

Code Review

  • Is the code of an acceptable quality?
  • Does the code conform to the coding standards?
  • Are the unit tests small and test the class in isolation?
  • If there is GUI work does it follow the GUI standards?
  • If there are changes in the release notes then do they describe the changes appropriately?
  • Do the release notes conform to the release notes guide?

Functional Tests

  • Do changes function as described? Add comments below that describe the tests performed?
  • Do the changes handle unexpected situations, e.g. bad input?
  • Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

Gatekeeper

If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.

@ajjackson ajjackson added the ISIS: Spectroscopy Issue and pull requests relating to Muons, Indirect and Inelastic at ISIS label Jan 17, 2025
@ajjackson ajjackson added this to the Release 6.13 milestone Jan 17, 2025
It doesn't seem to do anything useful
Add an input parameter "CacheDirectory" which is validated as being
writeable. This gives a nice user experience, with default value
appearing when the field is emptied.

The actual caching mechanics still need to be connected to this
parameter.
Other tests (which create cache in different ways) still fail, this is
expected.
We are also starting to pass the cache directory to Abins explicitly,
although it is not used yet.
- Temporarily raise an error if the default cache_directory (None) is
  passed to abins.io.IO
- This helped identify where tests were using default dirs. Now there
  are updated to use a (somewhat inconsistent) mixture of temporary
  directories and the user default directory
- With explicit cache paths we can remove the need to patch IO class
  and simplify tests
- Using TemporaryDirectory.cleanup() simplifies post-test clean-up
@ajjackson
Copy link
Contributor Author

Looks like the doctest does need updating

@ajjackson
Copy link
Contributor Author

The doctest failure is related to calling the Algorithm from Python without explicitly including the new CacheDirectory argument, which is apparently mandatory. I have provided a default value and would prefer that it works as an optional argument; is there a way to make this work like the other optional parameters?

@ajjackson
Copy link
Contributor Author

ajjackson commented Jan 20, 2025

Have pushed a partial fix to see if it works. For some reason on my machine doc-tests take forever to run and give no output; cannot tell if frozen or just very slow.

Update: just slow, the doctest progress can be monitored with

tail -f build/docs/doctest/output.txt

Abins2D will still break.

I'm not really happy with this "fix" because there _should_ be a
default argument.
As with Abins-v1, we fix the doctest by
- providing a CacheDirectory argument to the Algorithm calls
- setting a corresponding cleanup path

The problems with this approach are:
- it's ugly and distracting
- it _should_ be unnecessary, there is a default value and it works
when calling from GUI
- /tmp probably doesn't work on Windows
@ajjackson
Copy link
Contributor Author

Doctest fixed but could use some further input: as stated in commit message

As with Abins-v1, we fix the doctest by

  • providing a CacheDirectory argument to the Algorithm calls
  • setting a corresponding cleanup path

The problems with this approach are:

  • it's ugly and distracting
  • it should be unnecessary, there is a default value and it works
    when calling from GUI
  • /tmp probably doesn't work on Windows

@ajjackson ajjackson marked this pull request as ready for review January 22, 2025 10:05
@adriazalvarez adriazalvarez self-assigned this Feb 4, 2025
@adriazalvarez
Copy link
Contributor

@ajjackson Changes are looking good, I just have a more general question about the cache files.
Do the users normally want to check the .hdf files after the output workspaces have been processed? It seems is the job of the user to delete the cache files once abins is done.
Then, we could set up the temporary files in temporary directories (probably python's tempfile wiill assign the correct temporary folder for every system) and delete them when the algo is done with them (as in the tests)?
Let me know if I'm missing something.

@ajjackson
Copy link
Contributor Author

ajjackson commented Feb 5, 2025

The idea behind the cache file is that the user might want to run Abins a few times with slightly different settings (e.g. different ISIS instrument, different configuration of that instrument, or difference cross-section weighting). The cache can be used to avoid unnecessary re-computation in that case. As the cache file is named after the input file, they can do this with some persistence across a few input systems and across Mantid sessions.

I think in practice:

  • this is less necessary than it used to be because the performance of Abins has improved a lot. But it might still be helpful for some more heroic cases.
  • Most Abins users don't know where the file goes so they hang around longer than necessary.
  • Most of that benefit could be had by maintaining the cache for a single Mantid session and cleaning up when Mantid closes.
  • But in the cases where the user benefits the most, it might be good to keep them.

Destroying the cache when the Abins Algorithm finishes is generally "too soon". If they could be reliably cleaned up when Mantid closes that would be great, but I'm not sure of the machinery for that.

One middle-ground option could be to set the default cache directory to a system /tmp directory (or whatever the Windows equivalent is) so the cache persists until the next reboot, while "power users" have the option to nominate a user directory and keep them.

@adriazalvarez
Copy link
Contributor

adriazalvarez commented Feb 6, 2025

I see, thank you for the detailed explanation.
Yes, maybe investigating how taxing is to create the cache files and how users are likely to implement scripts which would require reforming many cache files is outside the scope of this pr, but it could be interesting to know how to handle this differently.
I have added a couple comments.
Also as a general comment, maybe dragging the cache_directory parameter around few methods is a bit unsatisfactory. I've seen in other parts of mantid that temporary settings could be stored on mantid config files with a custom entry. We could write the temporary setting on the config file when calling the algorithm (or not write if it's already the same) and then retrieve locally in the methods that needs it. Would it be very inconvenient to do something like this ? Let me know what you think.

@ajjackson
Copy link
Contributor Author

Also as a general comment, maybe dragging the cache_directory parameter around few methods is a bit unsatisfactory. I've seen in other parts of mantid that temporary settings could be stored on mantid config files with a custom entry. We could write the temporary setting on the config file when calling the algorithm (or not write if it's already the same) and then retrieve locally in the methods that needs it. Would it be very inconvenient to do something like this ? Let me know what you think.

This is closer to how it worked before, using the Default Save Directory which is a global Mantid setting.

From a maintenance standpoint I have found this "magical" caching without explicit arguments to be quite confusing and awkward to debug. It isn't very obvious from the function names and signatures what will be cached, and unit tests could run into race conditions by working on the same cache file. This can also be an issue when running multiple parallel instances in an automated workflow such as https://github.com/ajjackson/abins-qpt-convergence/tree/main

@ajjackson
Copy link
Contributor Author

Yes, maybe investigating how taxing is to create the cache files and how users are likely to implement scripts which would require reforming many cache files is outside the scope of this pr, but it could be interesting to know how to handle this differently.

I'm very keen to try the joblib approach: it is a bit like Python's standard library memoization tool but a) plays more nicely with complex objects especially numpy arrays; b) uses disk rather than memory. By memoizing private "pure" functions that take data as input and don't look at any quasi-global state, we should be able to drop a lot of the existing (and somewhat wasteful) logic that determines when the cache should be invalidated. Instead the cache is simply used whenever a function is called with exactly the same arguments (by hashing the data) as previously.

One new problem this would create is deciding what the new cache cleanup strategy should be, and it wouldn't be as easy to save a copy of the cache for use later. There is a reduce_size method that takes age, bytes and item-count limits. The cache can also be set up with an age limit and bytes limit.

@adriazalvarez
Copy link
Contributor

I'm very keen to try the joblib approach: it is a bit like Python's standard library memoization tool but a) plays more nicely with complex objects especially numpy arrays; b) uses disk rather than memory. By memoizing private "pure" functions that take data as input and don't look at any quasi-global state, we should be able to drop a lot of the existing (and somewhat wasteful) logic that determines when the cache should be invalidated. Instead the cache is simply used whenever a function is called with exactly the same arguments (by hashing the data) as previously.

I'm not opposed to this, but you may have to talk to Sarah or Tom if you want to integrate this into the developer package of Mantid.

@ajjackson
Copy link
Contributor Author

It's already in the mantid dependencies, for quickBayes

It has been a bit of a headache getting these to work nicely, but it
may have been something to do with spacing between RST lines.

There seems to be a little API change at work here: the default cache
director is now always the default save directory. Previously, it
seems that command-line runs of Mantid would sometimes use the current
directory from which Mantid was launched.

That may be a headscratcher or even breaking for certain user scripts.
In such workflows it would be highly recommended to set an explicit
directory.
Use magic * to share test setup/cleanup logic across tests.
Move cleanup to top of file and use comments so it is easier to see
what is going on.
Copy link
Contributor

@adriazalvarez adriazalvarez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes.
Test are passing, and the handling of cache directories is more flexible and clear for users of Abins.

@SilkeSchomann SilkeSchomann self-assigned this Feb 20, 2025
@SilkeSchomann SilkeSchomann merged commit 5d49a91 into mantidproject:main Feb 20, 2025
10 checks passed
@peterfpeterson peterfpeterson mentioned this pull request Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ISIS: Spectroscopy Issue and pull requests relating to Muons, Indirect and Inelastic at ISIS
Projects
Status: Merged
Development

Successfully merging this pull request may close these issues.

Abins: explicit cache path and validation
3 participants