Convert field_info into a property so it can be created on first access. #5250
Conversation
self.create_field_info()
return self._field_info

@field_info.setter
Is this necessary?
Immutability has a lot of value when dealing with parallelism, so I would advise against introducing mutability where it's not needed.
We are already doing this in far more critical places, e.g., the index. In fact, I would argue that we want some measure of this here.
But what for? Unless there's a test exercising it, I don't understand the point.
Sorry, I replied without really thinking this through. We don't do this with index, but we probably do want to allow this to be settable.
Specifically, without this, we can't do what's in the tests in PR #5211.
I think I'm even more confused now. I don't see the connection with #5211.
I'm referring to how the tests in that PR manually alter entries in field_info. You wouldn't be able to do that without having the setter.
I've come around on not having a setter for field_info. When it wasn't a property we allowed it to be set by anyone, but maybe it's better to be safe since we can.
Just to tie this off, the setter is necessary if we want to avoid modifying all the places in the frontends that set up the field_info on their own, which I believe we do. I also think we should not worry about immutability since we didn't have that before anyway. There's no reason to impose this restriction now. This change doesn't modify existing behavior and only allows us to refer to ds.field_info without having to think about whether it has been created yet. I don't see a reason not to do this.
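To make the setter argument concrete, here is a minimal sketch, not yt's actual code. The classes `ReadOnlyDataset` and `SettableDataset` and the helper `frontend_setup` are hypothetical stand-ins, and a plain dict takes the place of the real field info container. It shows that code assigning `ds.field_info` directly keeps working only when the property has a setter.

```python
class ReadOnlyDataset:
    """Hypothetical dataset whose field_info property has no setter."""

    def __init__(self):
        self._field_info = None

    @property
    def field_info(self):
        # Create the container lazily on first access.
        if self._field_info is None:
            self._field_info = {}
        return self._field_info


class SettableDataset:
    """Same lazy getter, plus a setter so callers may assign directly."""

    def __init__(self):
        self._field_info = None

    @property
    def field_info(self):
        if self._field_info is None:
            self._field_info = {}
        return self._field_info

    @field_info.setter
    def field_info(self, value):
        self._field_info = value


def frontend_setup(ds):
    # Mimics a frontend that builds its own container and assigns it
    # (an assumption about how such code looks, for illustration only).
    ds.field_info = {("gas", "density"): "g/cm**3"}
    return ds.field_info
```

With the read-only variant, `frontend_setup` raises `AttributeError` at the assignment, which is why dropping the setter would mean touching every such call site.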
Whoops, I removed the bug label before I read your argument for it. Sounds reasonable to me now.
yt/data_objects/static_output.py
Outdated
def create_field_info(self):
    self.index
I'm going to admit that this isn't a great thing to do as instantiating the index called create_field_info. What I think would be better is renaming this _create_field_info to remove the expectation that anyone ever call it. I would then change line 661 above to simply self.index.
I've refactored this so that we only ever do one pass through create_field_info.
So I looked into the
I prefer the first, though -- it touches fewer things and also operates exactly how a property is supposed to operate. I think we should have the setter.
I think that's overall a positive change, but I would just suggest we roll back 458ab76, since this is no longer required.
One issue this PR raises is the fact that we rely heavily on side effects in the code. This seems to be mostly a consequence of having a lazy approach, so I wonder whether we could (should?) refactor to remove these side effects as much as possible, or take a step back and think about the possible states the dataset and index objects can have (see #5252).
what milestone y'all want this in? it's got a bug label so i'm assuming we'd want this to ship with 4.4.2?
I'll have a look through the tests and see if there are other places where we do that. This is a very good point you raise about side effects and that dataset objects can be in different states. It's something we should probably try to be more explicit about.
Yes, I think this was technically a bug, although one we happily lived with for quite some time. I think we can go with 4.4.2. Thanks @chrishavlin!
Sorry, I triggered the merge button by mistake while playing in the CLI... However, we can just open an issue to keep a log that we need to remove these unnecessary calls to
@cphyc, no problem, happy to do a separate PR for that.
@meeseeksdev backport to yt-4.4.x
Owee, I'm MrMeeseeks, Look at me. There seems to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! Remember to remove the If these instructions are inaccurate, feel free to suggest an improvement. |
… can be created on first access.
@chrishavlin did you intend to open a manual backport PR? I don't see one linked yet; however, there seems to be a reference to it from your fork.
oh! ya, i messed up the cherry picking somehow though (that branch is showing changes to
(I'll have another go in the next ~10 mins)
this is a common (and easy to make) mistake. Make sure you run
…y so it can be created on first access. (cherry picked from commit 82ac140)
… can be created on first access. (cherry picked from commit 82ac140)
The field info container is notably brittle in that either a dataset's index or field_list must be accessed/created first before it can come into existence. This PR makes Dataset.field_info into a property allowing it to be created on first access. Additionally, we now poke the index at the start of create_field_info to allow this to be run before the field_list is created. I argue this should have been considered a bug given the naming of the method without an underscore implying the user was free to call it.
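The behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in yt/data_objects/static_output.py: `SketchDataset`, `_build_field_info`, and the `create_calls` counter are hypothetical, and a dict stands in for the field info container. It shows `field_info` being created on first access, `create_field_info` poking the index first so that `field_list` exists, and the container being built in a single pass.

```python
class SketchDataset:
    """Hypothetical stand-in for a dataset with lazily built state."""

    def __init__(self):
        self._index = None
        self._field_info = None
        self.field_list = None
        self.create_calls = 0  # counts passes that build field info

    @property
    def index(self):
        if self._index is None:
            self._index = object()  # placeholder for a real index object
            # Building the index is what populates field_list ...
            self.field_list = [("gas", "density")]
            # ... and, as discussed in the thread, also creates field info.
            self._build_field_info()
        return self._index

    def _build_field_info(self):
        self.create_calls += 1
        self._field_info = {field: {} for field in self.field_list}

    def create_field_info(self):
        # Poke the index first so field_list is guaranteed to exist.
        self.index
        if self._field_info is None:
            self._build_field_info()

    @property
    def field_info(self):
        # Created on first access; later accesses reuse the same object.
        if self._field_info is None:
            self.create_field_info()
        return self._field_info
```

In this sketch a caller can touch `ds.field_info` before anything else, and calling `create_field_info()` afterwards does not rebuild the container.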