Restore reverted change: Abins refactor replacing SData classes with Euphonic features #39669

ajjackson · 2025-07-15T10:01:45Z

Description of work

This reinstates #37350 , which was reverted in #39665

This was done in order to stop breakage of package build tests on non-Linux platforms.
It seems this is caused by an incompatibility between multiprocessing "spawn" and the systemtest code, which also spawns Python subprocesses. This mechanism is not supposed to be nested. On Linux the default mechanism is "fork" which seems to work in parallel systemtest, but from Python 3.14 this default will also change to "spawn" and start failing.

Summary of work

The branch from Refactor Abins, replace SData with Euphonic data structures #37350 has been rebased back onto main
The conditions for hanging test suites were investigated
Multiprocessing code is removed from Abins and replaced by serial map operations.
The performance impact seems to be small, due to optimisations in Euphonic which came after the parallel operations were originally introduced to the branch.

Purpose of work

See #37350

Further detail of work

It's not great that multiprocessing is incompatible with the systemtest suite. In this case we can manage without, but there is a good chance we will want to use it in future. Multithreading is probably ok, but these are not equivalent tools.

To test:

Everything is covered by CI. In addition to the default stuff we need to check build_packages_from_branch on non-Linux platforms as this was breaking.

This shouldn't require a release note because #37350 didn't.

Reviewer

Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.

Code Review

Is the code of an acceptable quality?
Does the code conform to the coding standards?
Are the unit tests small and test the class in isolation?
If there is GUI work does it follow the GUI standards?
If there are changes in the release notes then do they describe the changes appropriately?
Do the release notes conform to the release notes guide?

Functional Tests

Do changes function as described? Add comments below that describe the tests performed?
Do the changes handle unexpected situations, e.g. bad input?
Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

Gatekeeper

If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.

To begin migration from SData to euphonic data types, allow existing SData to produce a Spectrum1DCollection. Then we can start modifying code that _uses_ SData to use this datatype instead. So far we modify the _atom_number_s method which creates output workspaces from the Abins calculation. Spectrum2DCollection is not implemented yet, so initially we leave the old implementation in place for 2D cases.

- Although Spectrum1DCollection provides a .select() function, we ended up using a chain of filters here because it can't handle numerical comparisons. Perhaps this would be a bit cleaner if the select() method accepted our filter functions. - The Spectrum1DCollection group_by() method throws an error if the key is in the top level of metadata (i.e. because it applies to all rows.) It would be good to clean that upstream too.

This is not very thoroughly tested here, and should be moved upstream once everything looks happy on Abins side. - Wrap Spectrum1DCollection to deal with group_by() issue where metadata is the same for all items (and so not in line_data) - Implement Spectrum2DCollection - Have SData.get_spectrum_collection() dispatch to 1D or 2D as appropriate - Tweak workspace writers to use either SpectrumNCollection class

As the relevant information is available in spectra, this allows us to drop a redundant parameter and reduce the number of objects passed around

- Now the Algorithm classes never deal with SData at all - SData still used internally and cached

Move the threshold-check logic to module-level functions, and leave existing method as a wrapper. Then we should be able to delete the class later... Also remove some useless parameters/logic that make the return value optional. We can just ignore the return value if we don't care about it.

Bin checks on Spectrum2DCollection need adapting to interpret 3-D z_data array correctly. This implementation is a bit inefficient as it needlessly instantiates a Spectrum2D; leave more verbose/efficient implementation for upstream.

This functionality is also available in Euphonic, and it would make sense to remove this entirely from Abins and use it there. However - this will need a bit of tinkering to work with Spectrum2DCollection, and doing it upstream would avoid a lot of duplication - it would be better to apply kinematic constraints (and instrument broadening) after spectra have been filtered and summed into output groups So let's defer that for now

- Push SData a bit deeper in the chain of functions - Stop using SDataByAngle, instead use angle metadata in SpectrumCollection objects - Patch SpectrumCollection so that group_by() and select() work as expected (i.e. run succesfully) when some keys are at top level of metadata (i.e. common to all lines). - Apply broadening to SpectrumCollection objects. For now we keep using Abins implementation of broadening so that tests pass; this should be replaced with Euphonic implementation later.

This is a significant step towards removing SData entirely. For compatibility at this point, Spectrum objects mostly store the bin centres rather than bin edges. When SData is gone it would nicer to move to storing bin edges. (Centres can be computed from edges, but edges can only be guessed from centres...)

Officially the Spectrum metadata should not include float because they can't be reliably compared for equality

Continue dissolving SData: apply DW to output spectra, drop now-unused method from class

- create new autoconvolution routine that works with spectrum collection - integrate downsampling into autoconvolution loop over atoms - this could save a bit of memory, or benefit from parallism over atoms - not clear if benefit is large at this point, but it doesn't add much complexity either. YAGNI? - apply q2 order corrections, converting 1D initial spectra to 2D maps for quantum orders with isotropic Debye-Waller correction (i.e. all orders above fundamental) - drop _broaden_sdata as this is no longer needed

The Spectrum1DCollection/Spectrum2DCollection classes allow arbitrary metadata, so there is no need for a technical distinction between the class that collects over atoms and quantum orders and one that collects over angles.

For now, fallback via SData if no isotroptic calculation

- Rather than go through the data transformation on arrays of zeros, we can return an almost-empty SpectrumCollection immediately. - That finally eliminates SData from this module! There is definitely room for optimisation as autoconvolution has become more expensive, but the order-1 stuff seems to be running in similar time. It would be a good idea to do some profiling later... - As part of rebasing this, restored some missing units to spowdersemiempiricalcalculator and fixed AbinsSDataTest. Sorry, it should be on another commit but this is a big job and would benefit from some squashing later.

Finally, the cathartic part of the refactor where I get to delete 500 lines of code at once

This was breaking a doctest

Remove innermost loop of Spectrum creation by assembling y_data array directly.

The .from_spectra() constructor has a lot of overhead as it checks the types and numerical agreement of axes; if we are comparing Collections this should only be checked once. If this tweak works well it should move upstream to euphonic. Here we refer to a private method on the 1D collection from the 2D collection: really this can live on a parent class.

Not seeing a big impact on test timings here, but this _should_ scale better as we add more atoms.

- The euphonic SpectumCollection select() method doesn't work when all lines in the collection match the criterion, because such metadata is moved into the top-level metadata rather than the per-line section which is checked. The key is therefore treated as missing. - Compounding this, the data container cannot be entirely empty, so it allows an error to be raised if no matches are found. - More reasonably, the `from_spectra()` constructor cannot operate on an empty list, as this means no (common) axis information woudl be available. In this commit we avoid calling this when no data is available.

Spectrum.x_data (etc.) properties are quite expensive to call as they trigger a call to ureg to build Units and call Quantity.to() to ensure output unit is consistent with the stored unit name. Try to avoid calling it inside a loop! In practice this means using private attributes a bit more; not ideal, but we pin the Euphonic version and I maintain the package so for now it is safe. Avoid calling `* ureg` where we know we have an array and can build Quantity directly. This saves on some type-checking. Implement an "unsafe" constructor for Spectrum1DCollection from a series of Spectrum1D when we already know that the axes will be consistent. Again this saves a lot of time spent on consistency checks.

This iterator makes it more practical to iterate over the content of a spectrum collection without (slowly) creating new Spectrum objects Profiling shows a dramatic impact in the places changed so far.

Currently multiprocessing interacts poorly with systemtests on platforms other than linux: we end up with nested multiprocessing and the tests hang indefinitely.

ajjackson · 2025-07-17T15:52:43Z

Aborted systemtest, as expected tweaking the multiprocessing backend caused it to hang on Linux as well.

ajjackson · 2025-07-18T16:40:42Z

Performance check: full run of Abins for TOSCA using gamma-point phonon modes for a 250-atom amorphous Ge cell (calculated with a MLIP foundation model). Including enumerated quantum order 2 and convolution up to order 10.

Running on Macbook Pro M3 (i.e. 12-core ARM)

Nightly from May (i.e. pre-refactor): 50s
Today's main (i.e. post-revert): 56s [This should be equivalent to the first one...]
Parallel map implementation (i.e. this branch before removing parallelism): 55s
Serial map implementation (i.e. this branch with parallelism removed): 56s

Basically the places parallelism has been implemented so far are no longer a bottleneck due to other performance improvements in Euphonic. There is likely a good performance gain to be had by parallelising other parts of Abins, but it's ok to strip them out here for simplicity and stability.

ajjackson · 2025-07-18T16:43:45Z

Windows build_packages_from_branch is running at https://builds.mantidproject.org/job/build_packages_from_branch/1208/

MohamedAlmaki

Thanks for spotting this tricky bug. The changes remove the multiprocessing code that caused the hanging issue previously. Could you please revert the changes in run-tests then do a full build_packages_from_branch to make sure it passes in all operating systems before merging into main

This reverts commit 36b203a.

ajjackson · 2025-07-21T16:05:53Z

Running here:
https://builds.mantidproject.org/job/build_packages_from_branch/1210/
https://builds.mantidproject.org/job/build_packages_from_branch/1211/
https://builds.mantidproject.org/job/build_packages_from_branch/1212/
https://builds.mantidproject.org/job/build_packages_from_branch/1213/

MohamedAlmaki · 2025-07-21T16:16:36Z

Replaced with one build instead of running separate ones
https://builds.mantidproject.org/job/build_packages_from_branch/1214/

MohamedAlmaki · 2025-07-22T16:03:57Z

I have rebuild the packaging because it failed in the first run: https://builds.mantidproject.org/job/build_packages_from_branch/1219/

MohamedAlmaki

Build passed successfully. Thanks for the fix

ajjackson · 2025-07-24T07:40:40Z

It wasn't easy making sense of the log output on the failed run! Looks like 3/4 platforms built and tested fine and the other one barely started?

cailafinn

Just a couple of code comments on tests and imports:

scripts/test/Abins/AbinsCalculateSPowderTest.py

Testing/SystemTests/tests/framework/AbinsTest.py

scripts/abins/spowdersemiempiricalcalculator.py

For cleaner and safer (parallel) systemtests, run each Abins systemtest case with its own cache directory and clean up using tempfile mechanisms.

cailafinn

This looks good to me now, thanks!

…Euphonic features (mantidproject#39669) ### Description of work This reinstates mantidproject#37350 , which was reverted in mantidproject#39665 This was done in order to stop breakage of package build tests on non-Linux platforms. It seems this is caused by an incompatibility between multiprocessing "spawn" and the systemtest code, which also spawns Python subprocesses. This mechanism is not supposed to be nested. On Linux the default mechanism is "fork" which seems to work in parallel systemtest, but from Python 3.14 this default will also change to "spawn" and start failing. #### Summary of work - The branch from mantidproject#37350 has been rebased back onto `main` - The conditions for hanging test suites were investigated - Multiprocessing code is removed from Abins and replaced by serial `map` operations. - The performance impact seems to be small, due to optimisations in Euphonic which came after the parallel operations were originally introduced to the branch. #### Purpose of work See mantidproject#37350 #### Further detail of work It's not great that multiprocessing is incompatible with the systemtest suite. In this case we can manage without, but there is a good chance we will want to use it in future. Multithreading is probably ok, but these are not equivalent tools. ### To test: Everything is covered by CI. In addition to the default stuff we need to check `build_packages_from_branch` on non-Linux platforms as this was breaking. This shouldn't require a release note because mantidproject#37350 didn't. --- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

This gets the new pre-commit configuration and dependency changes. As a result, it also grabs a bunch of work for abins to get past merge conflicts * #39968 * #39665 * #39502 * #39669 * #39739 * #39881 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mohammed Almakki <44416400+MohamedAlmaki@users.noreply.github.com> Co-authored-by: Adam J. Jackson <a.j.jackson@physics.org>

ajjackson added 30 commits July 15, 2025 10:00

More Abins spectra->workspaces tidy-up

e770366

Rework abins "masses table" to work with spectrum

dda362b

As the relevant information is available in spectra, this allows us to drop a redundant parameter and reduce the number of objects passed around

Change return type of spowdersemiempiricalcalculator

b580c91

- Now the Algorithm classes never deal with SData at all - SData still used internally and cached

Move threshold check, fix 2D systemtest

1749d1b

Bin checks on Spectrum2DCollection need adapting to interpret 3-D z_data array correctly. This implementation is a bit inefficient as it needlessly instantiates a Spectrum2D; leave more verbose/efficient implementation for upstream.

Continue pushing SData further back into nested function heirarchy

6637dda

Format mass metadata to string

38381a3

Officially the Spectrum metadata should not include float because they can't be reliably compared for equality

Tidy up import

3882397

Apply Debye-Waller factor to SpectrumCollection, drop SData method

7313cd0

Continue dissolving SData: apply DW to output spectra, drop now-unused method from class

Drop SData autoconvolution method, port test to replacement function

b592350

Minor cleanup

e389534

Remove unused SDataByAngle class and tests

debbaa4

The Spectrum1DCollection/Spectrum2DCollection classes allow arbitrary metadata, so there is no need for a technical distinction between the class that collects over atoms and quantum orders and one that collects over angles.

calculate_s_isotropic directly into Spectra

6a007a1

For now, fallback via SData if no isotroptic calculation

Tidy up comments and references to SData

28c592f

Remove Abins SData class

c4fc3c8

Finally, the cathartic part of the refactor where I get to delete 500 lines of code at once

Correct axis-order for dummy 2D spectra

3cfb728

This was breaking a doctest

Slightly improve performance of spectrum broadening

ef2217c

Remove innermost loop of Spectrum creation by assembling y_data array directly.

Extend broadening efficiency tweak to 2D

3bcf152

Not seeing a big impact on test timings here, but this _should_ scale better as we add more atoms.

Implement iter_metadata for spectrum collections, speed up loops

dc5f175

This iterator makes it more practical to iterate over the content of a spectrum collection without (slowly) creating new Spectrum objects Profiling shows a dramatic impact in the places changed so far.

Remove parallelism from AbINS

eb652ea

Currently multiprocessing interacts poorly with systemtests on platforms other than linux: we end up with nested multiprocessing and the tests hang indefinitely.

ajjackson marked this pull request as ready for review July 21, 2025 08:01

Tidy up imports

0c4b417

sf1919 added the ISIS: Spectroscopy Issue and pull requests relating to Muons, Indirect and Inelastic at ISIS label Jul 21, 2025

sf1919 added this to ISIS core workstream v6.14.0 Jul 21, 2025

sf1919 moved this to In review in ISIS core workstream v6.14.0 Jul 21, 2025

MohamedAlmaki self-assigned this Jul 21, 2025

MohamedAlmaki requested changes Jul 21, 2025

View reviewed changes

Revert "For debugging, tweak package build tests to Abins-only"

06dec9f

This reverts commit 36b203a.

MohamedAlmaki previously approved these changes Jul 22, 2025

View reviewed changes

sf1919 added this to the Release 6.14 milestone Jul 23, 2025

thomashampson assigned cailafinn Jul 24, 2025

cailafinn requested changes Jul 24, 2025

View reviewed changes

scripts/test/Abins/AbinsCalculateSPowderTest.py Outdated Show resolved Hide resolved

Testing/SystemTests/tests/framework/AbinsTest.py Outdated Show resolved Hide resolved

scripts/abins/spowdersemiempiricalcalculator.py Outdated Show resolved Hide resolved

Move imports to top level

eb8e10d

ajjackson dismissed MohamedAlmaki’s stale review via eb8e10d July 25, 2025 09:22

Rework Abins systemtests to use temporary cache directory

e388e11

For cleaner and safer (parallel) systemtests, run each Abins systemtest case with its own cache directory and clean up using tempfile mechanisms.

ajjackson mentioned this pull request Jul 25, 2025

Abins systemtests cleanup #39739

Merged

cailafinn approved these changes Jul 28, 2025

View reviewed changes

cailafinn merged commit d1992e9 into mantidproject:main Jul 28, 2025
10 checks passed

github-project-automation bot moved this from In review to Done in ISIS core workstream v6.14.0 Jul 28, 2025

peterfpeterson mentioned this pull request Sep 24, 2025

pre-commit and dependency changes #39976

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Restore reverted change: Abins refactor replacing SData classes with Euphonic features #39669

Restore reverted change: Abins refactor replacing SData classes with Euphonic features #39669

Uh oh!

ajjackson commented Jul 15, 2025 •

edited

Loading

Uh oh!

ajjackson commented Jul 17, 2025

Uh oh!

ajjackson commented Jul 18, 2025

Uh oh!

ajjackson commented Jul 18, 2025

Uh oh!

MohamedAlmaki left a comment •

edited

Loading

Uh oh!

ajjackson commented Jul 21, 2025

Uh oh!

MohamedAlmaki commented Jul 21, 2025

Uh oh!

MohamedAlmaki commented Jul 22, 2025

Uh oh!

MohamedAlmaki left a comment

Uh oh!

ajjackson commented Jul 24, 2025

Uh oh!

cailafinn left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cailafinn left a comment

Uh oh!

Uh oh!

Uh oh!

Restore reverted change: Abins refactor replacing SData classes with Euphonic features #39669

Restore reverted change: Abins refactor replacing SData classes with Euphonic features #39669

Uh oh!

Conversation

ajjackson commented Jul 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description of work

Summary of work

Purpose of work

Further detail of work

To test:

Reviewer

Code Review

Functional Tests

Gatekeeper

Uh oh!

ajjackson commented Jul 17, 2025

Uh oh!

ajjackson commented Jul 18, 2025

Uh oh!

ajjackson commented Jul 18, 2025

Uh oh!

MohamedAlmaki left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ajjackson commented Jul 21, 2025

Uh oh!

MohamedAlmaki commented Jul 21, 2025

Uh oh!

MohamedAlmaki commented Jul 22, 2025

Uh oh!

MohamedAlmaki left a comment

Choose a reason for hiding this comment

Uh oh!

ajjackson commented Jul 24, 2025

Uh oh!

cailafinn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cailafinn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ajjackson commented Jul 15, 2025 •

edited

Loading

MohamedAlmaki left a comment •

edited

Loading