-
-
Notifications
You must be signed in to change notification settings - Fork 38
APE26: Removing data storage (representations) from coordinate frames #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pinging @eerovaher @eteq @StuartLittlefair @Cadair @ayshih @taldcroft as other "stakeholders" who may want to take a look at this and review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am convinced that there should be exactly one type for coordinate data. This would prevent users from getting confused about which type they should be using and simplifies things for both astropy
and downstream developers because we'll only have to support one type of coordinate data as input for our functions.
This leaves two possibilities: the single type is a class, which is what this APE proposes, or the single type is a Protocol, which is what Implement BaseCoordinateFrame.frame property
(astropy/astropy#16356) (that the APE mentions) proposed. Implementing a Protocol
would be backwards-compatible and wouldn't cause disruption for downstream packages. In every other regard this APE is a better idea.
Yes, yes and a thousand times yes. The devil will be in the implementation details but this is a thoroughly good idea. |
Does anyone else have thoughts before we ask the @astropy/coordinators to review this? cc @eteq @Cadair @ayshih @taldcroft! |
I have been traveling the last several weeks and haven't had a chance to review this in detail, but I will try to do so this week. I am fairly sure I will have Opinions ™️ (but hopefully positive ones?) |
@eteq OK cool. It'd be good to get your feedback ASAP, because @jeffjennings is starting to work through some of the changes we discuss here (in draft stages, but still, it is somewhat in progress). |
I have two main thoughts about some effects from this change, I think they can both be managed, but probably worth having a think about.
|
Thanks for pinging me about this change. If I'm following the end result will be that coordinate frames will no longer have Reading old files with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I've left several specific inline comments. I also have several bigger picture concerns. Note that I am not dead-set against this - I get the motivation for this change and can see how it might be an improvement. But I want to be sure we address these concerns before going down this road:
- This is a major change in communicating the purpose of SkyCoord. Previously (i.e. in APE5) SkyCoord is highlighted as a "convenience class" - primarily this is 1) it has a much smarter and easier to use initializer, and 2) it carries the frame attributes along with the data, regardless of whether that frame attribute is in the current frame. This makes subtly different behavior for certain transformations depending on whether you give the to frame the same frame attributes or not. I don't really understand how this proposal will change that - i.e., if there are no frame classes with data, how do you say "do the transform that only uses the frame attributes in the to-frame, but doesn't carry them over from the from-frame"? I can thinks of ways to do this, but this was one of the prime reasons for data and frames being tied together in the first place, to keep that case simpler and not dependent on SkyCoord.
- We should get someone in Sunpy look at this before proceeding. hopefully this won't be a big change for them since I think mostly they define their own frames and transforms, which should be backwards compatible with these changes. But it may be they use the low-level classes directly more often and hence this would be a huge negative impact on them.
- Relatedly, I'm concerned about the number of examples out in the wild that use the low-level classes directly. There are various use cases where we have encouraged people to use the low-level classes with data because they are much cheaper for initialization. If we are going to remove that option they will be impacted whenever the removal finally happens.
- I think it's absolutely critical we highlight the change in philosophy from what APE5 said. This sub-package is sufficiently complex that documentation (either in APEs or otherwise) are critical to helping people understand the big-picture. This might be as simple as just making sure https://docs.astropy.org/en/stable/coordinates/index.html#astropy-coordinates-overview and related documentation spots are up-to-date, but still, that should be explicitly in the implementation plan IMHO since it's a non-trivial effort to update all the docs for this.
APE26.rst
Outdated
different usage modes (with and without data), each exhibiting different behavior. For | ||
instance, some methods such as ``separation`` work fine with frames with data, while | ||
doing this on coordinate frames without data results in an error. This kind of | ||
multi-modal structure is considered an anti-pattern because it forces any code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as with above I think "is considered an anti-pattern" is a bit strong. It makes code working with frame classes be in one place but explicitly have to check for the presence or absence of data. Which for someone trying to understand the code might be a benefit.
I think that this is really a tradeoff between two use cases:
- Someone trying to understand coordinates-related functionality (not just astropy maintainers - this includes e.g. sunpy folks)
- An astropy maintainer who is really groking the layout of the package, how this stuff all works
I think this change benefits 2 over 1. Which doesn't mean we shouldn't do it, but it leaves out the consequence that it makes the bigger picture harder to understand. If we want to go ahead despite that, well, we can, but we need to recognize that choice is being made and be explicit about it here.
(I do, however, agree with the points immediately below that the end-user is the more important thing - this is more like "new maintainer" vs "old/more knowledgable maintainer")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Erik - I've reworked the text to reflect this in 46a6d82 (including explicitly noting the design choice we propose that departs from APE 5, as you suggested).
APE26.rst
Outdated
overlap leads to confusion where beginner users end up creating ``BaseCoordinateFrame`` | ||
objects such as ``ICRS``, when the docs are clear that these are for more advanced users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is an end-user gain: it's a feature not a bug that SkyCoord and ICRS both can be instantiated with data. SkyCoord is supposed to be a convenience class, with ICRS the simpler underlying layer - that's the whole principle here. I don't really understand how the proposed change makes things any clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see how this is a feature, but I think the improved user clarity is motivated earlier on in the text - currently, a user can initialize an ICRS
object or a SkyCoord
object, which "represent the same thing, but return different objects with similar but not identical APIs". The proposed change gives only way to have an object hold both data and frame information - SkyCoord
- preventing user confusion about subtle differences between SkyCoord
and a with-data frame like the current ICRS
(including which one to use, and why their API differs in perhaps unexpected ways).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see that... But I'm a bit concerned that this still doesn't address the fact that there was a reason for this (i.e., they are intentionally different APIs because the objects behave differently when transformed and instantiated)
APE26.rst
Outdated
2. Switching ``SkyCoord`` to use the data-less frame classes, and enabling automatic | ||
conversion of the with-data frames into ``SkyCoord`` objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused how this doesn't create a huge mess: does this mean all of the coordinate conversion process during the transition phase is duplicated between the "with data" classes and the "new SkyCoord" classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There wouldn't be duplication because the coordination conversion machinery would be in the new (data-less) frame classes, and the legacy (with-data) frame classes would subclass the new (data-less) classes.
Name changes
Co-authored-by: Adrian Price-Whelan <adrianmpw@gmail.com>
Co-authored-by: Adrian Price-Whelan <adrianmpw@gmail.com>
Co-authored-by: Nathaniel Starkman <nstarman@users.noreply.github.com>
Hi @ayshih and @Cadair, we're starting to finalize this APE and are wondering if you anticipate any major concerns with respect to SunPy (as well as any suggestions you have, e.g. for addressing Stuart's point 2 above). The first phase of the APE wouldn't introduce any breaking changes, the second phase (timeline flexible) would add deprecation warnings, and the final phase (several major astropy versions away) would require changes to SunPy. Thanks! |
Just to add a comment to the discussion between @ayshih and @nstarman. I think we should not underestimate the benefits to improving the simplicity of the coordinates code base. Personally I have always tried to avoid work which involves the internals of SkyCoord because the existing code base is so complex. If the proposed APE improves this, and I believe it will, then I believe it offers us significant advantages in terms of maintainability, the ability to add new features, and the ability to encourage others to contribute. As an unrelated comment, there is an upcoming funding opportunity in the UK to provide support to existing scientific software packages, so I may be able to provide meaningful support to implementation of this APE, if approved. |
Signed-off-by: Nathaniel Starkman <nstarman@users.noreply.github.com>
docs: add verbiage about Coordinate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! My main concern is the downstream effect, which drove some of my comments below. Overall, thank you for the well thought out proposal and links.
Therefore, we propose implementing this APE through 4 steps (and | ||
substeps). | ||
|
||
1. Splitting the frame classes into two hierarchies: ones with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these steps, I would like to know more about what is your proposed timeline and exactly how break-y things will get at each step. And what would we do about helping to notify or patch affected downstream packages. So maybe even list each step as a sub-section if needed.
Even nicer is actual example usage code (not the implementation but usage of the implementation) and what the expected output would look like, for an example use case that is very common. Doesn't have to be fancy, just have to reflect what most of us use, which is usually something common like SkyCoord(ra_arr, dec_arr, unit="deg")
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even nicer is actual example usage code
A good idea. Something like:
Step 1:
# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr) # the same. not yet deprecated.
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS()) # the same. not yet deprecated.
sc = SkyCoord(rep, frame=ICRS()) # the same. not yet deprecated.
# New API
frame = ICRSFrame()
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(ra_arr, dec_arr, frame=frame)
Step 2:
# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr) # the same. not yet deprecated.
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS()) # the same. not yet deprecated.
sc = SkyCoord(rep, frame=ICRS()) # the same. not yet deprecated.
# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c.data, c.frame) # (just showing the strict constructor)
sc = SkyCoord(c) # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame) # flexible
Step 3:
# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr) # the same. not yet deprecated.
c = ICRS(rep) # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS()) # redirects to SkyCoord(ra_arr, dec_arr, frame=ICRSFrame())
sc = SkyCoord(rep, frame=ICRS()) # redirects to SkyCoord(ra_arr, dec_arr, frame=ICRSFrame())
# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c) # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame) # flexible
Step 4:
# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
# Deprecated API
c = ICRS(rep)
c = ICRS(ra_arr, dec_arr)
sc = SkyCoord(c)
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS()) # ICRS -> ICRSFrame
sc = SkyCoord(rep, frame=ICRS()) # ICRS -> ICRSFrame
# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c) # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame) # flexible
End result:
# Unchanging API
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c) # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame) # flexible
change the frame classes such that they only store metadata and | ||
never coordinate data, superseding this aspect of the implementation | ||
in `APE 5 <https://github.com/astropy/astropy-APEs/blob/main/APE5.rst>`_. | ||
This would make the implementation more modular and performant, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would make the implementation more modular and performant, | |
This would make the implementation more modular, |
You shouldn't mention performance here unless you can back this claim with measurements. The other reasons you are listing here should be convincing enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other reasons you are listing here should be convincing enough.
Agreed, but also...
you can back this claim with measurements
But we can know this even without. Number of times an object is constructed, going through non-fast-path parsing/validation routes, slotting (easy with dataclases), etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either back up your claims with actual measurements or don't bring up performance at all. I wouldn't be surprised if the new API allows for faster implementation, but it is not at all obvious to me how much faster it would be. I also suspect that it would be possible to improve performance with the current API (although I don't expect it to be easy because the current API is over-complicated), in which case the performance argument would be weaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan to only try an implementation for this once the APE is accepted? For similar APEs/PEPs/etc there is sometimes a prototype implementation, which would allow this to be quantified.
APE26.rst
Outdated
duplication (not just in the code proper, but also in the tests, | ||
leading to missed combinations). Restructuring the ``Frame`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement in the parenthesis is the opposite of what it should be. Having duplicated test code does not lead to missing coverage. Not having duplicated tests is what has caused problems to go unnoticed.
It is also worth remarking that having duplicated code sometimes means quadrupled (not duplicated) tests because we have to test both BaseCoordinateFrame
and SkyCoord
methods with both BaseCoordinateFrame
and SkyCoord
arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having duplicated tests is what has caused problems to go unnoticed.
👍 😆.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to be correct and add your point in 9332ae6.
APE26.rst
Outdated
given method, if one looks in ``SkyCoord``, one will likely find | ||
that it does not exist, but instead is defined on | ||
``BaseCoordinateFrame`` and gets dynamically called via | ||
``SkyCoord.__getattr__``. Indeed, the construction of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some doubts whether or not potential new contributors manage to find SkyCoord.__getattr__()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! We can change the wording to be
one will likely find that it does not exist, and might struggle to find that instead it is defined on ``BaseCoordinateFrame`` and gets dynamically called via ``SkyCoord.__getattr__``
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Nathaniel's text in cd5471b.
APE26.rst
Outdated
fragile. For example, if users manipulate the internal workings | ||
of ``SkyCoord`` (which is discouraged but possible), the | ||
coordinate data can become decoupled from the caching that | ||
``SkyCoord`` performs for speed. With these proposed changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not worried about users who mess with SkyCoord
internals. The problems they will cause will be their own fault. But it might be worth considering downstream developers who subclass SkyCoord
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a better example to highlight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the text accordingly in 062755f.
objects). We also introduce a new class, ``Coordinate``, which | ||
is akin to ``SkyCoord`` (containing both frame and data), but | ||
without extra features like keeping frame attributes not | ||
associated with the current frame, caching and flexible input | ||
parsing. In this way ``Coordinate`` operates very similarly to | ||
the current ``BaseCoordinateFrame`` objects when they have | ||
data, and is meant to be their direct replacement in the new | ||
framework as well as a more lightweight and performant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the new Coordinate
class avoid code duplication and user confusion that this APE tries to solve? Why can't we just use SkyCoord
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the new
Coordinate
class avoid code duplication and user confusion that this APE tries to solve?
Common base class avoids the code duplication.
Avoiding user confusion is partly handled in the main docstring of each, in particular that Coordinate
only accepts data, frame
versus the long list of inputs for SkyCoord
. Also the common base class should help avoid confusion as objects with a common base class have expectations about common behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I supposed to learn that from reading this paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in part. See the lower pseudo-code that demonstrates how simple Coordinate
will be as a subclass of the common base class. That and the assumption that for changes large enough to merit an APE there would need to be a reasonable amount of explanation. For example, even a See Also
is very useful for helping users decide when to use a Coordinate
versus a SkyCoord
. Combined with further docstring and RTD changes which are the natural course of a PR, and I'm not so worried.
|
||
Branches and pull requests | ||
-------------------------- | ||
No direct progress on these changes has yet occurred. Discussion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much effort would it be to do a prototype implementation? Are the proposed changes here actually going to amount to a lot of lines of code if we exclude changes to tests, docs, and deprecation warnings/layers?
Thank you all for your comments. At this point we will leave this PR open and complete a prototype implementation, and return when that is ready and can be demoed. In the interim, we will not address further comments left here. Signed, |
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: Eero Vaher <eero.vaher@gmail.com>
Co-authored-by: Eero Vaher <eero.vaher@gmail.com>
In the light of this development, CoCo will hold off on further reviews for now. Thank you. |
Possible updates from Astropy Coordination Meeting 2025:
|
Coordinate frames in
astropy.coordinates
currently store metadata used to construct the frame (i.e. for transforming between frames) and may also store coordinate data itself. This duplicates functionality withSkyCoord
, which acts as a container for both coordinate data and reference frame information. We propose to change the frame classes such that they only store metadata and never coordinate data. This would make the implementation more modular, remove ambiguity for users from having nearly duplicate functionality with slightly different APIs, and better satisfy the principle of Separation of Concerns.Authors of this APE (alphabetical): Jeff Jennings @jeffjennings, Adrian Price-Whelan @adrn, Nathaniel Starkman @nstarman, Marten van Kerkwijk @mhvk