cphyc
Member

@cphyc cphyc commented Jul 17, 2025

PR Summary

The check on grid size causes the entire index to be built.
This causes a massive slowdown for, e.g., RAMSES datasets, where finding the smallest cell size requires reading the entire AMR structure.

The fix? Remove this check and live on the edge.

Note: this makes #5060 more or less useless.
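
To make the cost concrete, here is a minimal toy sketch of the pattern described above. It does not use yt: `ToyIndex`, `ToyDataset`, `EagerSphere`, and `LazySphere` are hypothetical stand-ins, and the counter only illustrates when a lazily built index would get triggered.

```python
from functools import cached_property


class ToyIndex:
    """Stand-in for an expensive AMR index (hypothetical, not yt's class)."""

    def __init__(self, cell_sizes):
        self.cell_sizes = list(cell_sizes)

    def get_smallest_dx(self):
        return min(self.cell_sizes)


class ToyDataset:
    """Builds its index lazily and counts how often that happens."""

    def __init__(self, cell_sizes):
        self._cell_sizes = cell_sizes
        self.index_builds = 0

    @cached_property
    def index(self):
        # In a real AMR dataset this step would read the whole hierarchy.
        self.index_builds += 1
        return ToyIndex(self._cell_sizes)


class EagerSphere:
    """Mimics the old behaviour: the size check touches the index."""

    def __init__(self, ds, radius):
        if radius < ds.index.get_smallest_dx():
            raise ValueError("sphere smaller than the smallest cell")
        self.ds, self.radius = ds, radius


class LazySphere:
    """Mimics the new behaviour: no check, so no index is built."""

    def __init__(self, ds, radius):
        self.ds, self.radius = ds, radius


eager_ds = ToyDataset([0.5, 0.25, 0.125])
EagerSphere(eager_ds, radius=1.0)

lazy_ds = ToyDataset([0.5, 0.25, 0.125])
LazySphere(lazy_ds, radius=1.0)

print(eager_ds.index_builds, lazy_ds.index_builds)
```

In this toy model, only the object that performs the check pays for the index at construction time.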

@cphyc cphyc added the discussion and workshop-2025 labels Jul 17, 2025
@neutrinoceros
Member

Looks like many test cases actually rely on the implied indexing...

@cphyc
Member Author

cphyc commented Jul 31, 2025

@neutrinoceros yes... I've gone ahead and triggered the creation of the index manually. This isn't super nice, but I don't have the knowledge to touch the volume renderer. Maybe @chrishavlin would want to have a stab at it?

@cphyc cphyc marked this pull request as ready for review July 31, 2025 13:29
@cphyc cphyc added the enhancement label Jul 31, 2025
@@ -57,12 +57,6 @@ def __init__(
         super().__init__(center, ds, field_parameters, data_source)
         # Unpack the radius, if necessary
         radius = fix_length(radius, self.ds)
-        if radius < self.index.get_smallest_dx():
Member

Isn't this where we should be adding a self.ds.index call for backward compatibility?

Member Author

I don't think so. We do not want the index to be computed unless strictly necessary (this is a costly operation). If we add self.ds.index here, anytime you build a sphere, you'll have an index created along the way, which you don't with ds.region for example.

Note that I agree this is the current behaviour, but removing it is one of the motivations behind this PR :)

Member

I agree with @cphyc. Taking this out makes these objects significantly lighter.

Member

Of course, but the fact that we needed to modify tests indicates there's a very high chance it also breaks user code, doesn't it?
I agree we should aim at delaying this expensive operation, but I don't think this PR is the right place to do it.

Member Author

@cphyc cphyc Jul 31, 2025

To me, the issue is that we have some public APIs that require the index to be built — ds.field_info being one of them — yet they don't all trigger the creation of the index.
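
As a rough illustration of the inconsistency described here, the toy sketch below contrasts an accessor that silently depends on the index having been built with one that triggers the build itself. Everything in it is hypothetical (`ToyDataset`, `fragile_field_info`, `robust_field_info`); it is not yt's implementation, the exact failure mode is an assumption, and it is not taken from any linked PR.

```python
class ToyDataset:
    """Hypothetical dataset whose field info is filled in by the index build."""

    def __init__(self):
        self._index = None
        self._field_info = None

    @property
    def index(self):
        if self._index is None:
            # Side effect of the (expensive) build: field info becomes available.
            self._index = object()
            self._field_info = {"density": "g/cm**3"}
        return self._index

    @property
    def fragile_field_info(self):
        # Works only if something else already touched ``index``.
        if self._field_info is None:
            raise RuntimeError("field info not set up; access ds.index first")
        return self._field_info

    @property
    def robust_field_info(self):
        # Triggers the index itself, so callers need no prior side effect.
        self.index
        return self._field_info


ds = ToyDataset()
try:
    ds.fragile_field_info
    fragile_ok = True
except RuntimeError:
    fragile_ok = False

robust = ToyDataset().robust_field_info
print(fragile_ok, sorted(robust))
```

With the second style, code that reads the field info would keep working regardless of whether an earlier sphere construction happened to build the index.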

Member

I agree. This is the root problem. I still don't think we should knowingly break existing user code before we form a better plan.

Member

I think these tests are not reflecting real usage. One would never really access the field info container in this way. That said, the brittleness of field_info needing other actions before it can be accessed has always bugged me. PR #5250 is a potential fix for this that would allow us to remove the ds.index calls in these tests.

Member

> One would never really access the field info container in this way.

End users certainly wouldn't, but can we say this with confidence about downstream apps and libraries?

Member

No guarantees, but it's hard to imagine. That this previously worked was a side-effect of suboptimal functionality. This is a potentially huge performance improvement. I'm in favor of merging this whether or not it breaks things downstream.

Member

@brittonsmith brittonsmith left a comment

I'm fine with seeing this go. Data objects should be as light as possible upon creation. If we wanted to retain functionality of this nature, I think it would be better to do it when fields are queried. I'm not convinced we need to do that and I certainly don't think it should happen here.
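
For readers curious what "doing it when fields are queried" might look like, here is a small toy sketch. It only illustrates the alternative mentioned above, which the reviewer is not convinced is needed. `ToyDataset`, `smallest_dx`, and `CheckedOnQuerySphere` are hypothetical names and not part of yt.

```python
class ToyDataset:
    """Hypothetical dataset; smallest_dx() stands in for an index-backed lookup."""

    def __init__(self, smallest_cell):
        self._smallest_cell = smallest_cell
        self.expensive_calls = 0

    def smallest_dx(self):
        self.expensive_calls += 1
        return self._smallest_cell


class CheckedOnQuerySphere:
    """Defers the radius check until a field is first requested."""

    def __init__(self, ds, radius):
        self.ds, self.radius = ds, radius
        self._validated = False

    def __getitem__(self, field):
        if not self._validated:
            if self.radius < self.ds.smallest_dx():
                raise ValueError("sphere radius is below the smallest cell size")
            self._validated = True
        return f"{field} within r={self.radius}"


ds = ToyDataset(smallest_cell=0.125)
tiny = CheckedOnQuerySphere(ds, radius=0.01)  # construction stays cheap
calls_after_init = ds.expensive_calls

try:
    tiny["density"]
    query_failed = False
except ValueError:
    query_failed = True
```

In this sketch the object is free to create, and the expensive lookup happens only once data is actually requested.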

@brittonsmith brittonsmith merged commit 4849e84 into yt-project:main Aug 28, 2025
12 of 13 checks passed
@neutrinoceros neutrinoceros added this to the 4.5.0 milestone Aug 28, 2025
Labels
discussion, enhancement, workshop-2025

5 participants