Skip to content

Conversation

jeffjennings
Copy link

@jeffjennings jeffjennings commented Nov 4, 2024

Coordinate frames in astropy.coordinates currently store metadata used to construct the frame (i.e. for transforming between frames) and may also store coordinate data itself. This duplicates functionality with SkyCoord, which acts as a container for both coordinate data and reference frame information. We propose to change the frame classes such that they only store metadata and never coordinate data. This would make the implementation more modular, remove ambiguity for users from having nearly duplicate functionality with slightly different APIs, and better satisfy the principle of Separation of Concerns.

Authors of this APE (alphabetical): Jeff Jennings @jeffjennings, Adrian Price-Whelan @adrn, Nathaniel Starkman @nstarman, Marten van Kerkwijk @mhvk

@adrn
Copy link
Member

adrn commented Nov 4, 2024

Pinging @eerovaher @eteq @StuartLittlefair @Cadair @ayshih @taldcroft as other "stakeholders" who may want to take a look at this and review!

Copy link
Member

@eerovaher eerovaher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am convinced that there should be exactly one type for coordinate data. This would prevent users from getting confused about which type they should be using and simplifies things for both astropy and downstream developers because we'll only have to support one type of coordinate data as input for our functions.

This leaves two possibilities: the single type is a class, which is what this APE proposes, or the single type is a Protocol, which is what Implement BaseCoordinateFrame.frame property (astropy/astropy#16356) (that the APE mentions) proposed. Implementing a Protocol would be backwards-compatible and wouldn't cause disruption for downstream packages. In every other regard this APE is a better idea.

@StuartLittlefair
Copy link

Yes, yes and a thousand times yes.

The devil will be in the implementation details but this is a thoroughly good idea.

@adrn
Copy link
Member

adrn commented Nov 26, 2024

Does anyone else have thoughts before we ask the @astropy/coordinators to review this? cc @eteq @Cadair @ayshih @taldcroft!

@eteq
Copy link
Member

eteq commented Dec 2, 2024

I have been traveling the last several weeks and haven't had a chance to review this in detail, but I will try to do so this week. I am fairly sure I will have Opinions ™️ (but hopefully positive ones?)

@adrn
Copy link
Member

adrn commented Dec 5, 2024

@eteq OK cool. It'd be good to get your feedback ASAP, because @jeffjennings is starting to work through some of the changes we discuss here (in draft stages, but still, it is somewhat in progress).

@Cadair
Copy link
Member

Cadair commented Dec 19, 2024

I have two main thoughts about some effects from this change, I think they can both be managed, but probably worth having a think about.

  1. The asdf serialisation is going to substantially different from now, managing that transition could be complex. - @braingram
  2. Many SunPy frames have an observer attribute which is a realised frame holding the position of the observer. This could obviously become a SkyCoord, but I worry that adds some more complexity, and will for a start make the repr even more obnoxious than it is currently.

@braingram
Copy link

1. The asdf serialisation is going to substantially different from now, managing that transition could be complex. - @braingram

Thanks for pinging me about this change. If I'm following the end result will be that coordinate frames will no longer have data.

Reading old files with data when coordinate frames no longer support data

At that point when a user opens an ASDF file with a serialized coordinate frame (which might contain data) the converter that handles the deserialization could see that the file contains data and generate a SkyCoord instead (possibly with a warning that the file should be updated).

Writing files when coordinate frames no longer support data

This should be fine. Technically we could keep the same schema and just never write data (it's not required) but it might be a good opportunity to include some other schema cleanup (at the expense of some schema churn since many schemas reference the base coordinate frame schema).

Implementation details

The details about how this is implemented may impact ASDF serialization. Specifically the first item under implementation:

  1. Splitting the frame classes into two hierarchies: ones with and without data.

The converter(s) would need to be updated to support these new classes. This isn't a problem but is work that wouldn't be needed if instead the existing classes were updated (deprecating and then removing the data feature).

  1. Deprecating the legacy with-data frame classes.

At this point asdf-astropy could be updated to not use the legacy classes.

Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've left several specific inline comments. I also have several bigger picture concerns. Note that I am not dead-set against this - I get the motivation for this change and can see how it might be an improvement. But I want to be sure we address these concerns before going down this road:

  • This is a major change in communicating the purpose of SkyCoord. Previously (i.e. in APE5) SkyCoord is highlighted as a "convenience class" - primarily this is 1) it has a much smarter and easier to use initializer, and 2) it carries the frame attributes along with the data, regardless of whether that frame attribute is in the current frame. This makes subtly different behavior for certain transformations depending on whether you give the to frame the same frame attributes or not. I don't really understand how this proposal will change that - i.e., if there are no frame classes with data, how do you say "do the transform that only uses the frame attributes in the to-frame, but doesn't carry them over from the from-frame"? I can thinks of ways to do this, but this was one of the prime reasons for data and frames being tied together in the first place, to keep that case simpler and not dependent on SkyCoord.
  • We should get someone in Sunpy look at this before proceeding. hopefully this won't be a big change for them since I think mostly they define their own frames and transforms, which should be backwards compatible with these changes. But it may be they use the low-level classes directly more often and hence this would be a huge negative impact on them.
  • Relatedly, I'm concerned about the number of examples out in the wild that use the low-level classes directly. There are various use cases where we have encouraged people to use the low-level classes with data because they are much cheaper for initialization. If we are going to remove that option they will be impacted whenever the removal finally happens.
  • I think it's absolutely critical we highlight the change in philosophy from what APE5 said. This sub-package is sufficiently complex that documentation (either in APEs or otherwise) are critical to helping people understand the big-picture. This might be as simple as just making sure https://docs.astropy.org/en/stable/coordinates/index.html#astropy-coordinates-overview and related documentation spots are up-to-date, but still, that should be explicitly in the implementation plan IMHO since it's a non-trivial effort to update all the docs for this.

APE26.rst Outdated
different usage modes (with and without data), each exhibiting different behavior. For
instance, some methods such as ``separation`` work fine with frames with data, while
doing this on coordinate frames without data results in an error. This kind of
multi-modal structure is considered an anti-pattern because it forces any code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with above I think "is considered an anti-pattern" is a bit strong. It makes code working with frame classes be in one place but explicitly have to check for the presence or absence of data. Which for someone trying to understand the code might be a benefit.

I think that this is really a tradeoff between two use cases:

  1. Someone trying to understand coordinates-related functionality (not just astropy maintainers - this includes e.g. sunpy folks)
  2. An astropy maintainer who is really groking the layout of the package, how this stuff all works

I think this change benefits 2 over 1. Which doesn't mean we shouldn't do it, but it leaves out the consequence that it makes the bigger picture harder to understand. If we want to go ahead despite that, well, we can, but we need to recognize that choice is being made and be explicit about it here.

(I do, however, agree with the points immediately below that the end-user is the more important thing - this is more like "new maintainer" vs "old/more knowledgable maintainer")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Erik - I've reworked the text to reflect this in 46a6d82 (including explicitly noting the design choice we propose that departs from APE 5, as you suggested).

APE26.rst Outdated
Comment on lines 86 to 87
overlap leads to confusion where beginner users end up creating ``BaseCoordinateFrame``
objects such as ``ICRS``, when the docs are clear that these are for more advanced users
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is an end-user gain: it's a feature not a bug that SkyCoord and ICRS both can be instantiated with data. SkyCoord is supposed to be a convenience class, with ICRS the simpler underlying layer - that's the whole principle here. I don't really understand how the proposed change makes things any clearer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this is a feature, but I think the improved user clarity is motivated earlier on in the text - currently, a user can initialize an ICRS object or a SkyCoord object, which "represent the same thing, but return different objects with similar but not identical APIs". The proposed change gives only way to have an object hold both data and frame information - SkyCoord - preventing user confusion about subtle differences between SkyCoord and a with-data frame like the current ICRS (including which one to use, and why their API differs in perhaps unexpected ways).

Copy link
Member

@eteq eteq Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see that... But I'm a bit concerned that this still doesn't address the fact that there was a reason for this (i.e., they are intentionally different APIs because the objects behave differently when transformed and instantiated)

APE26.rst Outdated
Comment on lines 158 to 159
2. Switching ``SkyCoord`` to use the data-less frame classes, and enabling automatic
conversion of the with-data frames into ``SkyCoord`` objects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how this doesn't create a huge mess: does this mean all of the coordinate conversion process during the transition phase is duplicated between the "with data" classes and the "new SkyCoord" classes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wouldn't be duplication because the coordination conversion machinery would be in the new (data-less) frame classes, and the legacy (with-data) frame classes would subclass the new (data-less) classes.

jeffjennings and others added 2 commits January 29, 2025 10:25
Co-authored-by: Adrian Price-Whelan <adrianmpw@gmail.com>
Co-authored-by: Adrian Price-Whelan <adrianmpw@gmail.com>
jeffjennings and others added 2 commits January 29, 2025 10:27
Co-authored-by: Nathaniel Starkman <nstarman@users.noreply.github.com>
@jeffjennings
Copy link
Author

jeffjennings commented Feb 7, 2025

Hi @ayshih and @Cadair, we're starting to finalize this APE and are wondering if you anticipate any major concerns with respect to SunPy (as well as any suggestions you have, e.g. for addressing Stuart's point 2 above). The first phase of the APE wouldn't introduce any breaking changes, the second phase (timeline flexible) would add deprecation warnings, and the final phase (several major astropy versions away) would require changes to SunPy. Thanks!

@StuartLittlefair
Copy link

Just to add a comment to the discussion between @ayshih and @nstarman. I think we should not underestimate the benefits to improving the simplicity of the coordinates code base. Personally I have always tried to avoid work which involves the internals of SkyCoord because the existing code base is so complex. If the proposed APE improves this, and I believe it will, then I believe it offers us significant advantages in terms of maintainability, the ability to add new features, and the ability to encourage others to contribute.

As an unrelated comment, there is an upcoming funding opportunity in the UK to provide support to existing scientific software packages, so I may be able to provide meaningful support to implementation of this APE, if approved.

nstarman and others added 5 commits April 24, 2025 19:19
Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! My main concern is the downstream effect, which drove some of my comments below. Overall, thank you for the well thought out proposal and links.

Therefore, we propose implementing this APE through 4 steps (and
substeps).

1. Splitting the frame classes into two hierarchies: ones with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these steps, I would like to know more about what is your proposed timeline and exactly how break-y things will get at each step. And what would we do about helping to notify or patch affected downstream packages. So maybe even list each step as a sub-section if needed.

Even nicer is actual example usage code (not the implementation but usage of the implementation) and what the expected output would look like, for an example use case that is very common. Doesn't have to be fancy, just have to reflect what most of us use, which is usually something common like SkyCoord(ra_arr, dec_arr, unit="deg").

Copy link
Member

@nstarman nstarman May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even nicer is actual example usage code

A good idea. Something like:

Step 1:

# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr)  # the same. not yet deprecated.
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS())  # the same. not yet deprecated.
sc = SkyCoord(rep, frame=ICRS())  # the same. not yet deprecated.

# New API
frame = ICRSFrame()
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(ra_arr, dec_arr, frame=frame)

Step 2:

# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr)  # the same. not yet deprecated.
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS())  # the same. not yet deprecated.
sc = SkyCoord(rep, frame=ICRS())  # the same. not yet deprecated.

# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c.data, c.frame)  # (just showing the strict constructor)
sc = SkyCoord(c)  # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame)  # flexible

Step 3:

# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(c)
# Flexible inputs
c = ICRS(ra_arr, dec_arr)  # the same. not yet deprecated.
c = ICRS(rep)  # the same. not yet deprecated.
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS())  # redirects to SkyCoord(ra_arr, dec_arr, frame=ICRSFrame()) 
sc = SkyCoord(rep, frame=ICRS())  # redirects to SkyCoord(ra_arr, dec_arr, frame=ICRSFrame()) 

# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c)  # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame)  # flexible

Step 4:

# Unchanging API
# Normal type hierarchy
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)

# Deprecated API
c = ICRS(rep)
c = ICRS(ra_arr, dec_arr)
sc = SkyCoord(c)
sc = SkyCoord(ra_arr, dec_arr, frame=ICRS())  # ICRS -> ICRSFrame 
sc = SkyCoord(rep, frame=ICRS())  # ICRS -> ICRSFrame

# New API
frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c)  # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame)  # flexible

End result:

# Unchanging API
ra, dec = ..., ...
rep = SphericalRepresentation(ra, dec)

frame = ICRSFrame()
c = Coordinate(rep, frame)
sc = SkyCoord(rep, frame=frame)
sc = SkyCoord(c)  # flexible
sc = SkyCoord(ra_arr, dec_arr, frame=frame) # flexible

change the frame classes such that they only store metadata and
never coordinate data, superseding this aspect of the implementation
in `APE 5 <https://github.com/astropy/astropy-APEs/blob/main/APE5.rst>`_.
This would make the implementation more modular and performant,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This would make the implementation more modular and performant,
This would make the implementation more modular,

You shouldn't mention performance here unless you can back this claim with measurements. The other reasons you are listing here should be convincing enough.

Copy link
Member

@nstarman nstarman May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other reasons you are listing here should be convincing enough.

Agreed, but also...

you can back this claim with measurements

But we can know this even without. Number of times an object is constructed, going through non-fast-path parsing/validation routes, slotting (easy with dataclases), etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either back up your claims with actual measurements or don't bring up performance at all. I wouldn't be surprised if the new API allows for faster implementation, but it is not at all obvious to me how much faster it would be. I also suspect that it would be possible to improve performance with the current API (although I don't expect it to be easy because the current API is over-complicated), in which case the performance argument would be weaker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to only try an implementation for this once the APE is accepted? For similar APEs/PEPs/etc there is sometimes a prototype implementation, which would allow this to be quantified.

APE26.rst Outdated
Comment on lines 81 to 82
duplication (not just in the code proper, but also in the tests,
leading to missed combinations). Restructuring the ``Frame``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statement in the parenthesis is the opposite of what it should be. Having duplicated test code does not lead to missing coverage. Not having duplicated tests is what has caused problems to go unnoticed.

It is also worth remarking that having duplicated code sometimes means quadrupled (not duplicated) tests because we have to test both BaseCoordinateFrame and SkyCoord methods with both BaseCoordinateFrame and SkyCoord arguments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not having duplicated tests is what has caused problems to go unnoticed.

👍 😆.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to be correct and add your point in 9332ae6.

APE26.rst Outdated
Comment on lines 86 to 89
given method, if one looks in ``SkyCoord``, one will likely find
that it does not exist, but instead is defined on
``BaseCoordinateFrame`` and gets dynamically called via
``SkyCoord.__getattr__``. Indeed, the construction of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some doubts whether or not potential new contributors manage to find SkyCoord.__getattr__().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! We can change the wording to be

one will likely find that it does not exist, and might struggle to find that instead it is defined on  ``BaseCoordinateFrame`` and gets dynamically called via  ``SkyCoord.__getattr__``

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to Nathaniel's text in cd5471b.

APE26.rst Outdated
Comment on lines 128 to 131
fragile. For example, if users manipulate the internal workings
of ``SkyCoord`` (which is discouraged but possible), the
coordinate data can become decoupled from the caching that
``SkyCoord`` performs for speed. With these proposed changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not worried about users who mess with SkyCoord internals. The problems they will cause will be their own fault. But it might be worth considering downstream developers who subclass SkyCoord.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a better example to highlight.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the text accordingly in 062755f.

Comment on lines +153 to +160
objects). We also introduce a new class, ``Coordinate``, which
is akin to ``SkyCoord`` (containing both frame and data), but
without extra features like keeping frame attributes not
associated with the current frame, caching and flexible input
parsing. In this way ``Coordinate`` operates very similarly to
the current ``BaseCoordinateFrame`` objects when they have
data, and is meant to be their direct replacement in the new
framework as well as a more lightweight and performant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the new Coordinate class avoid code duplication and user confusion that this APE tries to solve? Why can't we just use SkyCoord?

Copy link
Member

@nstarman nstarman May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the new Coordinate class avoid code duplication and user confusion that this APE tries to solve?

Common base class avoids the code duplication.
Avoiding user confusion is partly handled in the main docstring of each, in particular that Coordinate only accepts data, frame versus the long list of inputs for SkyCoord. Also the common base class should help avoid confusion as objects with a common base class have expectations about common behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I supposed to learn that from reading this paragraph?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in part. See the lower pseudo-code that demonstrates how simple Coordinate will be as a subclass of the common base class. That and the assumption that for changes large enough to merit an APE there would need to be a reasonable amount of explanation. For example, even a See Also is very useful for helping users decide when to use a Coordinate versus a SkyCoord. Combined with further docstring and RTD changes which are the natural course of a PR, and I'm not so worried.


Branches and pull requests
--------------------------
No direct progress on these changes has yet occurred. Discussion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much effort would it be to do a prototype implementation? Are the proposed changes here actually going to amount to a lot of lines of code if we exclude changes to tests, docs, and deprecation warnings/layers?

@jeffjennings
Copy link
Author

jeffjennings commented May 7, 2025

Thank you all for your comments. At this point we will leave this PR open and complete a prototype implementation, and return when that is ready and can be demoed. In the interim, we will not address further comments left here.

Signed,
APE 26 authors (@adrn, @jeffjennings, @mhvk, @nstarman)

jeffjennings and others added 15 commits May 8, 2025 15:18
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: P. L. Lim <2090236+pllim@users.noreply.github.com>
Co-authored-by: Eero Vaher <eero.vaher@gmail.com>
Co-authored-by: Eero Vaher <eero.vaher@gmail.com>
@pllim
Copy link
Member

pllim commented May 9, 2025

and return when that is ready and can be demoed

In the light of this development, CoCo will hold off on further reviews for now. Thank you.

@pllim
Copy link
Member

pllim commented Jul 8, 2025

Possible updates from Astropy Coordination Meeting 2025:

  • ICRS(ra, dec) returns a frame now; proposal is to keep it around but returning a SkyCoord in the future (on the same deprecation timescale) cc @keflavich
  • Clarify more explicitly that your planned deprecation is at least 2 years, not just 1 year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.