PHEP 9999: PyHC standardization for Python time objects #32

Closed
wants to merge 6 commits into from

nabobalis

@nabobalis nabobalis commented Jul 30, 2024

This PR proposes a new process PHEP to PyHC.

PHEP 9999 aims to build consensus within PyHC to standardize time objects on astropy.time.Time. It covers why we should do this and any potential roadblocks.

Note that there are no implementation details about how to program the transition. My goal with this PHEP is to build community support for the idea and what it entails.

It is still pretty rough in places, but I hope it's in a decent enough place for reviews and comments.

@nabobalis nabobalis marked this pull request as ready for review July 30, 2024 01:22

A dedicated implementation is most likely out of scope for a PHEP, as each package will require bespoke changes after an audit.

However, there will be a few common strategies that can be summarized below:
Contributor

Adding the strategies for making this effort is really helpful, so thank you for adding it!

Contributor

@namurphy namurphy left a comment

Thank you for doing this! While I anticipate that there will be a couple of areas that we need to iron out, this draft does a great job of laying out the issues, challenges, and path forward.

With regards to PlasmaPy, we mostly have dealt with time as Quantity objects (e.g., [0, 1, 2, 3] * u.s), so we wouldn't have much to change yet. In the future, I'm hoping that PlasmaPy will have functionality for reading in laboratory plasma data sets, in which case I'm personally happy to base it on astropy.time.

I'm personally curious about SpacePy's TickTock class since there seems to be a lot of functionality built on that.

Thank you again!

@jklenzing
Contributor

Thanks for putting this together. I have a few thoughts / questions, which may be largely related to my use cases not currently using custom time objects.

  • What is the rationale for moving to astropy over datetime for objects? This is not particularly clear to me in the writeup (which focuses on this being preferable to writing custom time objects), and it would potentially add a new dependency across the ecosystem. What about a "pythonic" rewording to use astropy instead of building custom time objects?
  • How does this affect the usage of DatetimeIndex? Currently, most of the pysat ecosystem loads data files as pandas or xarray data objects linked to a DatetimeIndex, usually at 1Hz though sometimes higher.

@aburrell

Another issue is overhead on operational systems. Datetime requires no additional dependencies and this is a concern for Python projects that are used in an operational environment.

@aburrell

Another thing that may be useful to do is create a list of all the PyHC packages (not just the core) and see how many of them would need to change. Then talk to those developers (on either side) and their user base.

@nabobalis nabobalis changed the title from "PHEP 999: PyHC standardization around astropy.time.Time" to "PHEP 9999: PyHC standardization around astropy.time.Time" on Jul 31, 2024
@nabobalis
Author

  • What is the rationale for moving to astropy over datetime for objects? This is not particularly clear to me in the writeup (which focuses on this being preferable to writing custom time objects), and it would potentially add a new dependency across the ecosystem.

I added a section about astropy time. See if you are ok with it.

I see no problem with adding astropy as a new dependency across the ecosystem.
It is easy to install, not very large, and is used on government machines in operational cases.

What about a "pythonic" rewording to use astropy instead of building custom time objects?

This sounds good, I have added this as well.

  • How does this affect the usage of DatetimeIndex? Currently, most of the pysat ecosystem loads data files as pandas or xarray data objects linked to a DatetimeIndex, usually at 1Hz though sometimes higher.

That is an open question. Ideally if this PHEP is accepted, I would want to fund a pandas developer to add astropy time support to pandas via a funding proposal call.

@nabobalis
Author

nabobalis commented Jul 31, 2024

Another issue is overhead on operational systems. Datetime requires no additional dependencies and this is a concern for Python projects that are used in an operational environment.

My understanding is that astropy already sees use in several operational environments. But I will make a note of this.

I do not know any of the requirements or rules around operational environments, so I will need to learn what they are to make a better decision on this topic.

@nabobalis
Author

Another thing that may be useful to do is create a list of all the PyHC packages (not just the core) and see how many of them would need to change. Then talk to those developers (on either side) and their user base.

I will have a look, I suspect the answer would be everything but a few sunpy released libraries.

@sapols
Contributor

sapols commented Jul 31, 2024

Just wanted to chime in with my thanks, @nabobalis! The idea is presented well here, so I’ll save proofreading remarks for later. Overall I think it’d be a huge win for PyHC if we adopted this. It seems the first obvious step will be asking the package maintainers whether they’re willing to do this or not. If they don’t all comment here I’ll be sure to bring it up at the next telecon (and/or PyHC Core tag-up).

@jklenzing
Contributor

Thanks @nabobalis, that is more clear. I am still hesitant to enforce dropping datetime support if the pandas solution is not yet implemented, but I would be more comfortable with this as a "recommendation" rather than a "prescription". This would give packages the space to evaluate use cases and decide if they need something more complex. I may try to run some tests to see how much adding astropy affects things.

@eelcodoornbos

Astropy.time is pretty powerful and comprehensive, but therefore also difficult to understand and learn, especially for those without any prior knowledge on astronomical time scales.

I wonder how many PyHC projects would actually need the unique astropy.time functionality (time scale conversion, 2x64-bit precision) and whether or not pandas.Timestamp would be a better default for most projects. I do think that the Python built-in datetime module is a pretty poor choice for scientific computing.

I've been happily using pandas.Timestamp objects set to UTC in my projects (e.g. https://gitlab.com/KNMI-OSS/spaceweather/swxtools), and only convert back and forth to astropy.time.Time objects when absolutely needed, for example to convert data that is provided with a different time scale than UTC, such as GPS time for some satellite data.
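To make the GPS-to-UTC step concrete, here is a minimal standard-library sketch. It assumes the GPS epoch of 1980-01-06 00:00:00 UTC and a fixed GPS minus UTC offset of 18 s, which holds only for times on or after 2017-01-01 and until any future leap second is announced. The function name `gps_to_utc` is hypothetical; a library such as astropy.time carries the full leap-second table instead of a hard-coded constant.

```python
from datetime import datetime, timedelta, timezone

# GPS time counts seconds from this epoch and does not insert leap seconds.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
# Assumption: GPS - UTC = 18 s, valid from 2017-01-01 until the next leap second.
GPS_MINUS_UTC = timedelta(seconds=18)
OFFSET_VALID_FROM = datetime(2017, 1, 1, tzinfo=timezone.utc)


def gps_to_utc(gps_seconds: float) -> datetime:
    """Convert GPS seconds since the GPS epoch to UTC (hypothetical helper)."""
    utc = GPS_EPOCH + timedelta(seconds=gps_seconds) - GPS_MINUS_UTC
    if utc < OFFSET_VALID_FROM:
        # Earlier dates need a different offset, so refuse rather than guess.
        raise ValueError("fixed 18 s offset is only valid from 2017-01-01 onwards")
    return utc


print(gps_to_utc(1_400_000_000).isoformat())
```

The guard on earlier dates is the design point: with a single constant, any timestamp before the last leap second would silently come out wrong by one or more seconds.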

In my experience, most date/time manipulations are in the form of applying time deltas and conversions to/from string representations, which is where pandas.Timestamp is very easy to use. I think their use leads to easy to comprehend (and therefore easy to maintain) code.

Counterarguments and alternative views are welcome of course!

@nabobalis
Author

Thanks @nabobalis, that is more clear. I am still hesitant to enforce dropping datetime support if the pandas solution is not yet implemented, but I would be more comfortable with this as a "recommendation" rather than a "prescription". This would give packages the space to evaluate use cases and decide if they need something more complex. I may try to run some tests to see how much adding astropy affects things.

Sorry for the late reply. This PHEP (which I need to rewrite, since it isn't very clear) is about working out whether there is community consensus for the idea. If so, we would start by adding support for astropy.time to pandas (via a ROSES call). We won't be able to enforce this until those foundational blocks are in place; doing so earlier would be unfair.

@nabobalis
Author

Astropy.time is pretty powerful and comprehensive, but therefore also difficult to understand and learn, especially for those without any prior knowledge on astronomical time scales.

I have to say I fundamentally disagree, for most users it has a very similar API as datetime, there is no jump in complexity if you just need UTC. It has been deemed simple enough to teach at two PyHC summer schools now.

I wonder how many PyHC projects would actually need the unique astropy.time functionality (time scale conversion, 2x64-bit precision) and whether or not pandas.Timestamp would be a better default for most projects. I do think that the Python built-in datetime module is a pretty poor choice for scientific computing.

The goal of this PHEP is to standardize what time objects are used to ensure that users have to deal with one type of datetime object and for developers of packages to be aware what they should be using.

I've been happily using pandas.Timestamp objects set to UTC in my projects (e.g. gitlab.com/KNMI-OSS/spaceweather/swxtools), and only convert back and forth to astropy.time.Time objects when absolutely needed, for example to convert data that is provided with a different time scale than UTC, such as GPS time for some satellite data.

But in this case, you already have to use astropy to convert formats like this, so using it from the very start removes the need to convert between pandas and astropy time objects. This PHEP would mean you no longer have to do that in the future.

Hopefully if we can agree to do this, we will submit a proposal to fund a pandas developer to add astropy time support into pandas so we can have the best of both worlds.

@eelcodoornbos

I have to say I fundamentally disagree, for most users it has a very similar API as datetime, there is no jump in complexity if you just need UTC. It has been deemed simple enough to teach at two PyHC summer schools now.

If you omit the complexity, astropy.time is of course easy to teach and learn. But this is even more true for Pandas.Timestamp, which has an even more flexible and intuitive interface, in my opinion, and which Python developers from different backgrounds will already know and love.

The astropy.time docs start out by listing time scales that only experienced users will be familiar with. There are also some opinionated choices in the implementation, for example, on the definitions and distinctions between time formats and time scales, which users taking advantage of the advanced features will have to learn, but which are not always straightforward. For instance, I'm myself still puzzled by GPS time being defined as a format, not as a time scale within astropy.time. This makes conversions between GPS and UTC timestamps with astropy.time ugly, even though it looks at first glance that this would be easy.

Hopefully if we can agree to do this, we will submit a proposal to fund a pandas developer to add astropy time support into pandas so we can have the best of both worlds.

That would be nice, but I wonder how this would look. I think it would be good for the discussion if the outcome of that work can be further specified.

If it means that pandas gets some configuration option so that astropy.time objects can then be used in pandas "behind-the-scenes" when manipulating Timestamps, DatetimeIndex, etc, while all the methods for manipulation are the same as for the current pandas objects (but with the addition of time scale conversion methods and inherent higher precision), that would be great.

Or would the proposal be for a future version of pandas (or some sort of pandas plug-in module) to adopt the astropy date/time manipulation methods? That would be more tricky to implement and more confusing to users, I think.

Even then, I think complexity/performance vs added value trade-offs need to be investigated as well. Astropy.time uses double the memory (double 64-bit timestamps), to accommodate the higher precision over long time spans. What are the implications for performance and hardware requirements of software that, for example, just processes 1-sec cadence satellite data, for which the single 64-bit timestamps are usually more than enough?
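The trade-off can be put in rough numbers with standard-library arithmetic. The sketch below assumes 1 Hz sampling over a 365-day year, 8 bytes per sample for a single 64-bit timestamp (pandas stores nanoseconds in an int64) and 16 bytes per sample for a pair of float64 values (astropy.time keeps two per instant). It ignores all container overhead, and the Julian Date used for the resolution estimate is only a representative recent value.

```python
import math

SAMPLES_PER_YEAR = 365 * 86400  # assumption: 1 Hz cadence, 365-day year

# Raw storage for one year of timestamps, ignoring container overhead.
single_bytes = SAMPLES_PER_YEAR * 8    # one 64-bit value per sample
double_bytes = SAMPLES_PER_YEAR * 16   # two 64-bit values per sample

# Span covered by signed 64-bit nanoseconds on either side of the epoch.
int64_span_years = 2**63 / 1e9 / (365.25 * 86400)

# Resolution of a Julian Date held in ONE float64 (representative recent JD).
jd = 2460000.5
single_float_resolution_us = math.ulp(jd) * 86400 * 1e6

print(f"single 64-bit: {single_bytes / 1e6:.1f} MB per year")
print(f"double 64-bit: {double_bytes / 1e6:.1f} MB per year")
print(f"int64 ns span: about {int64_span_years:.0f} years each way")
print(f"one float64 JD resolves about {single_float_resolution_us:.0f} microseconds")
```

Under these assumptions the memory cost doubles from roughly a quarter of a gigabyte to half a gigabyte per year of 1 Hz data, which gives a starting point for the performance question rather than an answer to it.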

I personally don't think I would prefer to use an astropy.time option once implemented in pandas, except in some rare cases.

@rstoneback

  • While AstroPy and SpacePy may have time support already, if it is going to be a standard the time functionality should be in its own independent package.
  • There was no community-wide selection of Astropy over SpacePy or over potentially creating a new time package. For example, AstroPy uses two 64-bit numbers to support precise times over long timescales. Not everyone needs this. The most common case could simply be support for leap seconds.
  • As already noted, pysat uses pandas and xarray. The only viable mechanism for pysat to use an updated time is to have it integrated into pandas' DatetimeIndex.
  • Operational and other systems can have long timescales, much longer than NEP or other PyHC support timelines. Thus, a standard time package would need to support older Python and associated packages longer than 'normal'. This is easiest to do if the time support is an independent package.
  • We'd need funding at the start, not just for a potential future integration into Pandas. NASA's funding for open source science is an ongoing concern. Not only do programs like B.20 lack sufficient funding for community efforts but I don't think NASA has ever presented evidence demonstrating it has an appropriate selection process.
  • To get Pandas to agree to incorporate science time we'd need not only funding but a good argument as to why anyone outside of astronomy should care about leap seconds etc. Trying to bring Pandas in after a time package is developed is not likely to go well unless the package does everything Pandas wants. Do we know yet what it would take to get Pandas, or datetime, or anyone at a non-science institution to say yes to science time?

@dstansby
Contributor

dstansby commented Aug 22, 2024

if it is going to be a standard the time functionality should be in its own independent package.

Can you elaborate on this a bit more? I would have thought that depending on a third-party library with a wide existing maintenance team (e.g., astropy, pandas) that still accepts contributions and suggestions would be much more efficient in time and money than developing new time package n+1.

a standard time package would need to support older Python and associated packages longer than 'normal'.

Currently #29 adopts the same recommendations as SPEC 0, so as long as whatever package is standardised around follows SPEC 0 this shouldn't be an issue? Either way, it seems like as long as a package is compatible with PHEP 3 if/when it's merged it should be fine because it will essentially have the same support policy as PyHC recommendations.

@Cadair

Cadair commented Aug 22, 2024

I haven't caught up on the whole thread, but I want to say that the limitation of not being able to use astropy's Time in pandas indexes (and therefore xarray) is a technical limitation that can be overcome. Obviously for the whole ecosystem to adopt Time everywhere this would have to be done, but if that's the direction the community wants to head in then I think it would be easy enough to use some funding to pay the right people to make that happen. (One option would be companies like Quansight, who I believe have done similar work in pandas before on research grants).

@rstoneback

if it is going to be a standard the time functionality should be in its own independent package.

Can you elaborate on this a bit more? I would have thought that depending on a third-party library with a wide existing maintenance team (e.g., astropy, pandas) that still accepts contributions and suggestions would be much more efficient in time and money than developing new time package n+1.

Sure!
I'm not saying that we have to develop a new time package, but that if code from astropy or spacepy is going to be labeled a PyHC standard and spread to the wider Python community then that time code should be spun off into its own package.

  • Developer focus. The focus would be on time and time alone, not split between all of these other functions in the overall package.
  • Suppose there is a bug in the time code. It is easier for an independent package to release a new version than if it is contained within a larger more complicated package. Most likely, time bugfixes would have to wait on the release schedule of the whole package.
  • Developing a standard is different than a higher level package. There is a saying I've heard for Python, "Python core is where packages go to die." Standards need to change more slowly since every API change impacts all of the software above. Dealing with standards changes takes away from developer time for higher level packages that could go into features.
  • Standards are meant to make it easier on higher level packages. This includes having longer support cycles. It is easier to support 5-10 years of Python packages in something focused, like a time package, than across a whole series of functions like the rest of astropy or spacepy or whatever.
  • Eating 'own dogfood'. Easier as a developer to know if a package works well for integration by others if that developer has to integrate it themselves. The pysat ecosystem was split into a bunch of packages because we hope that people will choose to build upon pysat. We know it works well for that purpose because we do it ourselves.

I will note that I am generally opposed to the whole notion there should only be one software package for a given feature. If that was really an effective way to go then we'd see that throughout the free market. Mostly though there are always multiple software packages for a given problem. How a problem is solved is as important as solving the problem.

a standard time package would need to support older Python and associated packages longer than 'normal'.

Currently #29 adopts the same recommendations as SPEC 0, so as long as whatever package is standardised around follows SPEC 0 this shouldn't be an issue? Either way, it seems like as long as a package is compatible with PHEP 3 if/when it's merged it should be fine because it will essentially have the same support policy as PyHC recommendations.

pysat is trying to support Python as far back as 3.6 for operational users. It isn't easy. Satellite missions last 5-10 years or more and generally there isn't enough funding or available developer time within the mission to upgrade things part way through. Standards need to go out of their way to make things easier for users.

@rstoneback

I haven't caught up on the whole thread, but I want to say that the limitation of not being able to use astropy's Time in pandas indexes (and therefore xarray) is a technical limitation that can be overcome.

I'd say the primary issue is a community one. The last official stance I heard was that datetime would never accept science time as leap seconds for the future aren't already known. Like, nobody can say yet how many leap seconds will be needed in 2030. We heard on the last call that may be changing. Nevertheless, even with a technical solution if packages like pandas/datetime/whomever aren't interested in the feature the technicals don't matter.
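The limitation in datetime can be shown directly with the standard library. This sketch uses the leap second inserted at the end of 2016-12-31 (23:59:60 UTC) and relies on nothing beyond the built-in datetime module.

```python
from datetime import datetime, timezone

# The built-in type only accepts seconds in 0..59, so 23:59:60 cannot be stored.
try:
    datetime(2016, 12, 31, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_error = None
except ValueError as exc:
    leap_second_error = str(exc)

# Arithmetic across the leap second: 2 SI seconds elapsed, datetime reports 1.
before = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
elapsed = (after - before).total_seconds()

print("leap second rejected:", leap_second_error)
print("elapsed seconds according to datetime:", elapsed)
```

Whether a one-second discrepancy at such instants matters is exactly the per-project judgment being debated here; the sketch only shows where the discrepancy comes from.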

Obviously for the whole ecosystem to adopt Time everywhere this would have to be done, but if that's the direction the community wants to head in then I think it would be easy enough to use some funding to pay the right people to make that happen. (One option would be companies like Quansight, who I believe have done similar work in pandas before on research grants).

If pandas etc. is willing to accept the feature then sure, the integration is a technical one. Pandas has features like moving ahead x business days etc. This is harder to do when the number of seconds per day isn't fixed. The scope of the integration can't really be set until we have a better understanding of details.

My preference would be for PyHC to be the primary on any possible integration rather than turn it over.

I've been in space science going on 20 years now. I've been on more proposals than I can count as well as review committees for science, software, instrumentation, satellite missions, etc. for both NSF and NASA. That experience has shown me to never count on anything related to government funding.

@nabobalis
Author

If you omit the complexity, astropy.time is of course easy to teach and learn. But this is even more true for Pandas.Timestamp, which has an even more flexible and intuitive interface, in my opinion, and which Python developers from different backgrounds will already know and love.

That is true and by adding support for astropy.time as a pandas Index, we would allow the best of both worlds here.

The astropy.time docs start out by listing time scales that only experienced users will be familiar with. There are also some opinionated choices in the implementation, for example, on the definitions and distinctions between time formats and time scales, which users taking advantage of the advanced features will have to learn, but which are not always straightforward.

If the main problem we currently have is that the documentation of astropy.time could do with tidying up and improvements, I would say we are in a good place. We can contribute upstream to fix these problems.

For instance, I'm myself still puzzled by GPS time being defined as a format, not as a time scale within astropy.time. This makes conversions between GPS and UTC timestamps with astropy.time ugly, even though it looks at first glance that this would be easy.

This is important feedback which we can use to open issues and improve the user experience with astropy.time. We are working on open-source software, so we have to be willing to work with upstream and improve libraries in ways that would benefit a wider community. Otherwise, why do we bother with any of this?

That would be nice, but I wonder how this would look. I think it would be good for the discussion if the outcome of that work can be further specified.

Agreed.

If it means that pandas gets some configuration option so that astropy.time objects can then be used in pandas "behind-the-scenes" when manipulating Timestamps, DatetimeIndex, etc, while all the methods for manipulation are the same as for the current pandas objects (but with the addition of time scale conversion methods and inherent higher precision), that would be great.

This would be the main goal.

Or would the proposal be for a future version of pandas (or some sort of pandas plug-in module) to adopt the astropy date/time manipulation methods? That would be more tricky to implement and more confusing to users, I think.

This can be worked on after we have the first step down.

Even then, I think complexity/performance vs added value trade-offs need to be investigated as well. Astropy.time uses double the memory (double 64-bit timestamps), to accommodate the higher precision over long time spans. What are the implications for performance and hardware requirements of software that, for example, just processes 1-sec cadence satellite data, for which the single 64-bit timestamps are usually more than enough?

I would state (without evidence) that computers are fast enough that this shouldn't matter.
If the worst comes to it, we can always improve the speed of astropy.time underneath.

@nabobalis
Author

nabobalis commented Aug 23, 2024

  • While AstroPy and SpacePy may have time support already, if it is going to be a standard the time functionality should be in its own independent package.

The answer for me here is to depend on astropy. There is no need for another package.
astropy provides wheels for almost every platform and is a very small package.

  • There was no community-wide selection of Astropy over SpacePy or over potentially creating a new time package. For example, AstroPy uses two 64-bit numbers to support precise times over long timescales. Not everyone needs this. The most common case could simply be support for leap seconds.

The goal isn't just about the features of astropy.time, it's about centralizing around one library that handles the time handling for us.

  • As already noted, pysat uses pandas and xarray. The only viable mechanism for pysat to use an updated time is to have it integrated into pandas' DatetimeIndex.

Yes and I want to add astropy.time support into pandas.

  • Operational and other systems can have long timescales, much longer than NEP or other PyHC support timelines. Thus, a standard time package would need to support older Python and associated packages longer than 'normal'. This is easiest to do if the time support is an independent package.

That is a fair point, but operational libraries may not follow these standards, and in a world where we have levels of PyHC "status", they would be left outside of that grading. The same goes for PHEP 3 on Python versions.

  • We'd need funding at the start, not just for a potential future integration into Pandas. NASA's funding for open source science is an ongoing concern. Not only do programs like B.20 lack sufficient funding for community efforts but I don't think NASA has ever presented evidence demonstrating it has an appropriate selection process.

We would not need to fund a community with B.20, we just need to fund a qualified developer.

  • To get Pandas to agree to incorporate science time we'd need not only funding but a good argument as to why anyone outside of astronomy should care about leap seconds etc. Trying to bring Pandas in after a time package is developed is not likely to go well unless the package does everything Pandas wants. Do we know yet what it would take to get Pandas, or datetime, or anyone at a non-science institution to say yes to science time?

It isn't about getting other people to use astropy.time within pandas who do not need it.
It would be working with pandas to add opt-in support for astropy.time.
It is possible that they won't accept it into pandas directly but only as a plugin, similar to the way that "CFTimeIndex" works in xarray. That would mean creating a small plugin library that depends on pandas and astropy.

@nabobalis
Author

Sure! I'm not saying that we have to develop a new time package, but that if code from astropy or spacepy is going to be labeled a PyHC standard and spread to the wider Python community then that time code should be spun off into its own package.

By wider community do you mean the PyHC community or Scientific Python?

The goal of this PHEP is to reduce the different ways that this community handles time.
The future goal is to work this into units and coordinates. There is no reason to spread this work outside of PyHC.

  • Developer focus. The focus would be on time and time alone, not split between all of these other functions in the overall package.

This would be important if you have like 1 developer on a package, astropy does not. They have dedicated people to maintain each subsection of the library.

  • Suppose there is a bug in the time code. It is easier for an independent package to release a new version than if it is contained within a larger more complicated package. Most likely, time bugfixes would have to wait on the release schedule of the whole package.

This is true for any package, so I don't see how this is a problem. If we discover that numpy does not handle multi-threaded Windows tasks, we have to wait for numpy to patch that.

Dealing with standards changes takes away from developer time for higher level packages that could go into features.

Standards enable developers to spend less time on busy work and enable them to work on features.

  • Standards are meant to make it easier on higher level packages. This includes having longer support cycles. It is easier to support 5-10 years of Python packages in something focused, like a time package, than across a whole series of functions like the rest of astropy or spacepy or whatever.

The Python ecosystem does not support that kind of timescale. SPEC 0 reduces this down and numpy and all major packages follow that schedule.

I will note that I am generally opposed to the whole notion there should only be one software package for a given feature. If that was really an effective way to go then we'd see that throughout the free market. Mostly though there are always multiple software packages for a given problem. How a problem is solved is as important as solving the problem.

The free market does not have one way to do the same thing because they are competing to make more money. The reason that Microsoft runs a search engine, isn't because they think Google does a bad job of it, they do it to make more money for themselves.

We are part of a large open-source community; we are not doing this for the free market. We want to enable other developers within our community to avoid re-coding the same software again and again. We see centralization within the wider open-source community because it's pretty much understood that a group focusing on one larger task is easier than splitting everything up into similar but separate packages. There might be a million Linux distros but they all use the same Linux kernel, and they almost all use the same init system. The wider community there has decided that it isn't worth it to try and duplicate these systems.

I want to see if there is a desire for the same within this community and if not, then PyHC should consider having no standards.

pysat is trying to support Python as far back as 3.6 for operational users. It isn't easy. Satellite missions last 5-10 years or more and generally there isn't enough funding or available developer time within the mission to upgrade things part way through. Standards need to go out of their way to make things easier for users.

And pysat can, if there is enough dev effort to support older versions. However, we are trying to bring the PyHC community up to standards within the Python ecosystem such that packages can work together in order to reduce duplicated effort.

Packages that can't (or do not want to) meet the standards defined in PHEP 4 are still listed and advertised, but they might not get the higher levels of support from PyHC or the badges next to their name. I think that is more than fair.

@eelcodoornbos

I have to agree with @rstoneback here. I’m not sure at all that interoperability would be simplified by recommending astropy in its current form.

I don’t think it has been discussed enough that astropy is a very opinionated library. It requires developers to work in a certain way, e.g. make use of its units and reference frame/scale definitions stored in its objects, its non-standard pretty printing of its objects, etc. This has nice advantages if you stick to working only with astropy objects but actually complicates interoperability with pandas (and other libraries) quite a lot.

Its time component supports leap seconds and high accuracy, but it does not support time zones or the broad range of input/output and calendar related options of pd.Timestamp.

Its coordinate handling might work very well for astronomers and seems to be robustly implemented, but I personally find its rich objects very clunky for making conversions in Earth (thermosphere-ionosphere) satellite data processing where input/output is stored in Pandas dataframes. You have to do a lot of seemingly unnecessary extra work to juggle coordinate components, naming of pandas columns, etc.

This is not astropy’s or pandas’ fault, just a consequence of the different philosophies involved. I think these should be reconciled first before recommending adoption, even for new developments.

I think adding some flexible to_pandas() and from_pandas() methods (or similar) to astropy, while using pandas as a recommended library for exchange of tabular data in the PyHC ecosystem might be a better place to start, and much more likely to succeed than starting now already to adopt astropy while proposing to modify pandas to adopt its functionality.

@nabobalis nabobalis changed the title PHEP 9999: PyHC standardization around astropy.time.Time PHEP 9999: PyHC standardization for Python time objects Sep 16, 2024
@nabobalis
Author

nabobalis commented Sep 16, 2024

Hello everyone, I have revised the PHEP more in line with what was discussed in the last PyHC core tag up.

It drops the requirement to use astropy.time but encourages libraries to accept astropy.time.Time objects as possible inputs in the future. For libraries that interface with pandas and similar libraries, that will only be possible once work is done to enable it.

For libraries using datetime, there will be no changes for them to do.

@nabobalis
Author

I disagree. Recommendations quickly become requirements.

As a community we should be setting ourselves recommendations that might involve doing some work.

  1. First, we haven't even established there is a problem. SpacePy is the only PyHC package to create time support. Other packages use datetime, which is actually fine. In practice, there are only a few leap seconds per year. So not supporting leap seconds affects about 6E-8 of the total samples in a year.
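
For reference, the 6E-8 figure quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes two leap seconds in one year (an illustrative value chosen to match the quoted number, not a measured rate) and one-second samples. It also shows that the standard library `datetime` cannot represent the inserted second itself.

```python
from datetime import datetime

# Assumption for illustration: two leap seconds inserted in one year.
LEAP_SECONDS_PER_YEAR = 2
SECONDS_PER_YEAR = 365.25 * 86400  # Julian year in seconds

# Fraction of one-second samples that would fall on a leap second.
fraction = LEAP_SECONDS_PER_YEAR / SECONDS_PER_YEAR
print(f"{fraction:.1e}")

# datetime only accepts seconds in 0..59, so the leap second
# 2016-12-31T23:59:60 UTC cannot be constructed.
try:
    datetime(2016, 12, 31, 23, 59, 60)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
print(leap_second_representable)
```

Whether that fraction matters depends on the use case, which is the point under debate here.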

The goal isn't to solve a problem per se but to bring the community towards a common set of tooling that we can expand upon and ideally base our interoperability on in the future.

While AstroPy also has time support, it is not in PyHC.

Is that a problem?

  1. We haven't established requirements. We need requirements to sort out what we need, what success looks like, what resources are required, and how to approach a proposal.

This community is a great place to establish these requirements especially around getting astropy integrated with pandas and xarray.

  1. Ignoring lessons from the history of software development is ill advised. The reason there is usually more than one software package for a problem is because not everyone has the same requirements. Several pysat developers have conveyed requirements for our use case, requirements which have been brushed aside by the PHEP.

The lack of support for astropy objects within pandas and xarray is a known problem. Not only do sunpy developers want it fixed; astropy and xarray developers do as well.

There is a desire from the larger Python ecosystem to integrate, but the main stumbling block has been defining the specific requirements. If people are willing to help, we can work on those and bring these communities together instead of being separated.

  1. There have been claims about future AstroPy support but these claims aren't from AstroPy itself.

The claims made about astropy support were focused on tutorials or examples or even maybe a code review. The astropy maintainers have offered some of these before for PyHC at the summer school or at meetings. But yes, there are no claims from astropy themselves, though they have been very welcoming.

  1. If the requirements for a scientific time are already satisfied with a pre-existing package, then why do we need to require developers to use it? I'd say it is overwhelmingly likely packages will use it on their own. If developers do decide to invest the time and energy to create a new package then I think it is highly likely they have good reasons to do so.

Some developers are not aware of the larger ecosystem; this PHEP is partly informational as well as being a recommendation.

If developers within PyHC do decide that astropy.time isn't what they need, then there should be a conversation as to why, what can be changed to prevent another time library from being created and have changes contributed to astropy.

Either this is an open-source community angled at improving not only our own projects but the broader ecosystem or we are just out for ourselves and can drop many of PyHC's core objectives.

I think our time would be better spent determining what requirements our community really needs and then addressing those. Getting science time into broader Python packages, like Pandas and Xarray, is more useful than requiring AstroPy or SpacePy. There are a whole range of commercial companies that could benefit from science time if it was supported by packages they already use. On the other hand, AstroPy/SpacePy already exists. Scientists can already use it if they want.

I agree, we should be getting astropy.time integrated within the broader ecosystem but we need to demonstrate that there is a willing community who needs that and would adopt that.

@nabobalis
Author

  1. What is the size of astropy + all of its dependencies?

The astropy wheel is 6-10MB depending on the platform, its core dependency is numpy.

  1. What is the performance difference between datetime or datetime64 operations and astropy.time.Time operations?

  2. What is the difference in memory required to store an array of times? For example, an array of datetime64s vs. the equivalent in astropy.time.Time?

astropy's performance is going to be slower and I would expect it to use more memory as well. The hope is that a GSoC project can be put together next year to benchmark astropy so areas of improvement can be identified.

  1. How will package maintainers know they must test their package with a new version of astropy.time.Time? What happens if the astropy package version changes but there are no changes in astropy.time.Time (for the case when the package maintainer does not assume that their package will be installed as part of a PyHC bundle, in which case AstroPy will be available)?

Libraries should be testing with their dev/git versions of their upstream dependencies.

  1. The issue of roundoff error ("Since internally Time uses floating point numbers, round-off errors can cause two times to be not strictly equal even if mathematically they should be.") should be studied. I generally avoid representing time as floats if possible.

We can bring this up with the astropy time maintainers. I would hope that we can add missing features to astropy time if this community is behind an initiative like this.
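
To make the round-off concern concrete, here is a minimal standard-library sketch. It assumes time is stored as a single 64-bit float of seconds since the Unix epoch, which is a simplification: astropy documents that `Time` keeps a pair of floats per value to reduce this effect, so the sketch illustrates the general issue rather than astropy's actual behavior.

```python
import math

# Seconds since the Unix epoch, stored in a single 64-bit float.
t = 1.7e9  # roughly late 2023

# Spacing between adjacent representable floats near t.
spacing = math.ulp(t)

# Adding one nanosecond is far below that spacing, so it is lost.
lost = (t + 1e-9) == t

# Integer nanoseconds keep the same increment exactly.
t_ns = 1_700_000_000 * 10**9
kept = (t_ns + 1) - t_ns == 1

print(spacing, lost, kept)
```

Integer counts avoid the rounding but fix the resolution in advance, so the trade-off depends on the requirements.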

  1. Mandating that PyHC core packages be rewritten by replacing any use of datetime or custom time objects with astropy.time.Time seems unnecessary. Most standards I am familiar with do not mandate implementation; they mandate representations at the interface level. What is the benefit of mandating how code works internally?

Overall I actually want to make an interface argument instead, i.e., everyone should accept and return astropy times, and leave the internals up to the library. I can rephrase the PHEP more if required.
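
As a rough illustration of what an interface-level convention could look like, the sketch below normalizes several time-like inputs at a function boundary and leaves internals alone. Everything here is hypothetical: `to_datetime` and `StandInTime` are made-up names, and the sketch normalizes to the standard library `datetime` only so that it runs without third-party packages. It relies on duck typing against two methods that do exist upstream, `pandas.Timestamp.to_pydatetime()` and `astropy.time.Time.to_datetime()`. A library following this PHEP might normalize to `astropy.time.Time` instead.

```python
from datetime import datetime


def to_datetime(value):
    """Normalize a time-like input to a standard library datetime.

    Hypothetical helper for illustration; not part of any PyHC package.
    """
    # pandas.Timestamp provides to_pydatetime().
    if hasattr(value, "to_pydatetime"):
        return value.to_pydatetime()
    if isinstance(value, datetime):
        return value
    # astropy.time.Time provides to_datetime().
    if hasattr(value, "to_datetime"):
        return value.to_datetime()
    if isinstance(value, str):
        return datetime.fromisoformat(value)
    raise TypeError(f"Unsupported time input: {type(value).__name__}")


class StandInTime:
    """Stand-in for an astropy-like object so the sketch runs without astropy."""

    def __init__(self, iso):
        self._iso = iso

    def to_datetime(self):
        return datetime.fromisoformat(self._iso)


expected = datetime(2024, 9, 16, 12, 0, 0)
results = [
    to_datetime(expected),
    to_datetime("2024-09-16T12:00:00"),
    to_datetime(StandInTime("2024-09-16T12:00:00")),
]
print(all(r == expected for r in results))
```

Note that converting to `datetime` drops anything it cannot hold, such as leap seconds and sub-microsecond precision, which is one reason the choice of common type matters.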

  1. Could you add examples to the PEP that justify, "This meant that all PyHC packages had to create their own implementations of time." Do all PyHC packages have their own implementation of time, and what is meant by "implementation of time"?

I removed this line as I was overzealous in what I wrote.

@rstoneback

rstoneback commented Sep 17, 2024

These goals are made significantly easier/more fully realized when packages can interoperate. Thus, the bringing up of finding standards for time, coords, etc. Also, thinking about these kinds of things now can help set up for the future of PyHC, and growth.

Again, per my point #1, a problem hasn't actually been established yet. Further, it also hasn't been established that packages can't interoperate.

I am thinking about the future. Overly specific standards hinder development. Wes McKinney, when giving early presentations on pandas, stated he had people telling him not to make pandas since numpy already exists. If he was in a group that specified people use numpy then Pandas wouldn't exist.

How can anyone claim that astropy time is the way to go when requirements (point #2) have yet to be established?

This PHEP needs to start back over at zero.

@rstoneback

rstoneback commented Sep 18, 2024

I disagree. Recommendations quickly become requirements.

As a community we should be setting ourselves recommendations that might involve doing some work.

I have problems with the PHEP due to the inappropriate construction. I also have problems doing labor for free, especially given NASA's repeated claims to "pay what it costs." That is not the same as an unwillingness to work.

  1. First, we haven't even established there is a problem. SpacePy is the only PyHC package to create time support. Other packages use datetime, which is actually fine. In practice, there are only a few leap seconds per year. So not supporting leap seconds affects about 6E-8 of the total samples in a year.

The goal isn't to solve a problem per se but to bring the community towards a common set of tooling that we can expand upon and ideally base our interoperability on in the future.

If there isn't a problem then there is no reason to force new standards on the community.

While AstroPy also has time support, it is not in PyHC.

Is that a problem?

No, but packages outside of our community aren't a PyHC problem that require us to impose additional rules.

  1. We haven't established requirements. We need requirements to sort out what we need, what success looks like, what resources are required, and how to approach a proposal.

This community is a great place to establish these requirements especially around getting astropy integrated with pandas and xarray.

The PHEP has been written before getting requirements. Requirements need to be collected first.

  1. Ignoring lessons from the history of software development is ill advised. The reason there is usually more than one software package for a problem is because not everyone has the same requirements. Several pysat developers have conveyed requirements for our use case, requirements which have been brushed aside by the PHEP.

The lack of support for astropy objects within pandas and xarray is a known problem. Not only do sunpy developers want it fixed; astropy and xarray developers do as well.

This response does not address the substance of the criticism.

There is a desire from the larger Python ecosystem to integrate, but the main stumbling block has been defining the specific requirements. If people are willing to help, we can work on those and bring these communities together instead of being separated.

  1. There have been claims about future AstroPy support but these claims aren't from AstroPy itself.

The claims made about astropy support were focused on tutorials or examples or even maybe a code review. The astropy maintainers have offered some of these before for PyHC at the summer school or at meetings. But yes, there are no claims from astropy themselves, though they have been very welcoming.

There are a large range of requirements within this community. These requirements are different than what astronomers experience. Unless astropy is committed to supporting a broader range of requirements then mandating astropy support will only cause problems.

  1. If the requirements for a scientific time are already satisfied with a pre-existing package, then why do we need to require developers to use it? I'd say it is overwhelmingly likely packages will use it on their own. If developers do decide to invest the time and energy to create a new package then I think it is highly likely they have good reasons to do so.

Some developers are not aware of the larger ecosystem; this PHEP is partly informational as well as being a recommendation.

Then the focus should be on advertising.

If developers within PyHC do decide that astropy.time isn't what they need, then there should be a conversation as to why, what can be changed to prevent another time library from being created and have changes contributed to astropy.

Either this is an open-source community angled at improving not only our own projects but the broader ecosystem or we are just out for ourselves and can drop many of PyHC's core objectives.

We shouldn't adopt new standards just to adopt new standards. To benefit the community, not only now but into the future, the standards we do adopt need to be very well thought out.

The PHEP so far has failed to demonstrate that the PHEP is required. Thus it isn't justified to claim that opposition to the PHEP is being "just out for ourselves" or contrary to core open source principles.

I think our time would be better spent determining what requirements our community really needs and then addressing those. Getting science time into broader Python packages, like Pandas and Xarray, is more useful than requiring AstroPy or SpacePy. There are a whole range of commercial companies that could benefit from science time if it was supported by packages they already use. On the other hand, AstroPy/SpacePy already exists. Scientists can already use it if they want.

I agree, we should be getting astropy.time integrated within the broader ecosystem but we need to demonstrate that there is a willing community who needs that and would adopt that.

If we are going to get science time into the broader community then we should create an independent package to do that. If our requirements are best served by SpacePy, or by AstroPy, then that would be the place to start. An independent package is the setup that is most likely to serve the diverse range of requirements in the community.

@jibarnum

These goals are made significantly easier/more fully realized when packages can interoperate. Thus, the bringing up of finding standards for time, coords, etc. Also, thinking about these kinds of things now can help set up for the future of PyHC, and growth.

Again, per my point #1, a problem hasn't actually been established yet. Further, it also hasn't been established that packages can't interoperate.

I am thinking about the future. Overly specific standards hinder development. Wes McKinney, when giving early presentations on pandas, stated he had people telling him not to make pandas since numpy already exists. If he was in a group that specified people use numpy then Pandas wouldn't exist.

How can anyone claim that astropy time is the way to go when requirements (point #2) have yet to be established?

This PHEP needs to start back over at zero.

Luckily, @nabobalis did do a very recent re-write of the PHEP that changes things from "you must use astropy's time" to the following:

This PHEP recommends that all projects across the PyHC ecosystem use the standard library datetime module or if the project has more complex requirements, they should use astropy.time.Time instead of creating their own time objects.

Eventually, we want to encourage that all PyHC libraries allow astropy.time.Time as valid time inputs to their libraries. This has some roadblocks currently which will not allow this to happen in the near future, but hopefully with the PyHC community behind this PHEP, we can push for better astropy integration with the broader Scientific Python ecosystem.

Any existing projects that have their own time object are strongly encouraged to replace their custom time objects with astropy.time.Time.

So, no longer are we saying you're required to use astropy time if that doesn't make sense for you. I think this should be sufficient for allowing a couple different options, while discouraging the creation of several new time objects. It also acknowledges the current challenges with integrating astropy more fully.

PHEP: 9999
Title: PyHC standardization for Python time objects
Author: Nabil Freij <[email protected]> <https://orcid.org/0000-0002-6253-082X>
Discussions-To: https://github.com/heliophysicsPy/standards/pull/X
Author

Suggested change
Discussions-To: https://github.com/heliophysicsPy/standards/pull/X
Discussions-To: https://github.com/heliophysicsPy/standards/pull/32

@rstoneback

These goals are made significantly easier/more fully realized when packages can interoperate. Thus, the bringing up of finding standards for time, coords, etc. Also, thinking about these kinds of things now can help set up for the future of PyHC, and growth.

Again, per my point #1, a problem hasn't actually been established yet. Further, it also hasn't been established that packages can't interoperate.
I am thinking about the future. Overly specific standards hinder development. Wes McKinney, when giving early presentations on pandas, stated he had people telling him not to make pandas since numpy already exists. If he was in a group that specified people use numpy then Pandas wouldn't exist.
How can anyone claim that astropy time is the way to go when requirements (point #2) have yet to be established?
This PHEP needs to start back over at zero.

Luckily, @nabobalis did do a very recent re-write of the PHEP that changes things from "you must use astropy's time" to the following:

This PHEP recommends that all projects across the PyHC ecosystem use the standard library datetime module or if the project has more complex requirements, they should use astropy.time.Time instead of creating their own time objects.
Eventually, we want to encourage that all PyHC libraries allow astropy.time.Time as valid time inputs to their libraries. This has some roadblocks currently which will not allow this to happen in the near future, but hopefully with the PyHC community behind this PHEP, we can push for better astropy integration with the broader Scientific Python ecosystem.
Any existing projects that have their own time object are strongly encouraged to replace their custom time objects with astropy.time.Time.

So, no longer are we saying you're required to use astropy time if that doesn't make sense for you. I think this should be sufficient for allowing a couple different options, while discouraging the creation of several new time objects. It also acknowledges the current challenges with integrating astropy more fully.

The PHEP has failed to demonstrate that astropy.time satisfies the requirements for any PyHC packages. So why is it recommended? Plus, the 'strongly recommended' language is sufficient to ensure that no package trying to create time support that meets different requirements will be able to get funding.

Why should we discourage the creation of new time objects if we don't know the existing ones are appropriate? Further, the community has yet to demonstrate why there should be only one of a software package. This is especially relevant for a NASA community as NASA always produces more than one. And again, it has not been established that astropy satisfies PyHC requirements.

The response doesn't actually address the contents of my arguments.

  • Requirements are required before making decisions
  • Overly restrictive standards prevent progress
  • A problem that needs to be solved hasn't actually been established

Ignoring requirements when creating standards is certain to produce standards that don't meet community needs.

@eelcodoornbos

Just a note that with the current wording:

... use the standard library datetime module or if the project has more complex requirements, they should use astropy.time.Time ...

all packages that accept or return pandas or xarray objects with time-based information (e.g. most models, time series observations, etc.) will simply not be able to comply, since these objects will contain pd.Timestamp and/or np.datetime64 objects.

I think this is a good illustration of @rstoneback's point regarding the need for agreed-upon requirements as a starting point.

@nabobalis
Author

Why should we discourage the creation of new time objects if we don't know the existing ones are appropriate? Further, the community has yet to demonstrate why there should be only one of a software package. This is especially relevant for a NASA community as NASA always produces more than one. And again, it has not been established that astropy satisfies PyHC requirements.

If someone is motivated to create a new general scientific time package, that should not come under PyHC nor be funded by any NASA Heliophysics B.X call.

all packages that accept or return pandas or xarray objects with time-based information (e.g. most models, time series observations, etc.) will simply not be able to comply, since these objects will contain pd.Timestamp and/or np.datetime64 objects.

I think this is a good illustration of @rstoneback's point regarding the need for agreed-upon requirements as a starting point.

This is why now the PHEP says if you can use datetime, you do not need to change anything since it will work for pandas and xarray.

Fundamentally we have two separate ecosystems, one that is able to use datetime and integrate with pandas and xarray, the other is based around astropy time. The requirements needed for sunpy are only satisfied by astropy time, this PHEP is aimed at crossing this gap in a way that brings the PyHC community onboard so we can move forward together.

@eelcodoornbos

This is why now the PHEP says if you can use datetime, you do not need to change anything since it will work for pandas and xarray.

This seems to ignore the point that pandas and numpy have their own built-in custom time objects. Like astropy.Time objects, pd.Timestamp accepts standard library datetime as input on initialisation, and/or can mimic the input of the standard datetime object initialization (among other ways). The pd.Timestamp object has a method to convert to standard library datetime for output. But it stores the time information in its own custom time object, with added functionality (for additional conversion/parsing/formatting options, use in indices, array operations, null/NaT values, etc). The same holds (with fewer convenience methods) for numpy.

So prescribing only standard library datetime and astropy.Time would, in my reading of this text, disallow passing numpy, Pandas and xArray objects containing their own custom time objects (since these will not be in the prescribed set of datetime/astropy.Time), and thereby severely restrict the PyHC community. It is very clear to me that this is not the intent, but it is, in my view, the result of the way it is currently phrased.

@nabobalis
Author

nabobalis commented Sep 25, 2024

This is why now the PHEP says if you can use datetime, you do not need to change anything since it will work for pandas and xarray.

This seems to ignore the point that pandas and numpy have their own built-in custom time objects. Like astropy.Time objects, pd.Timestamp accepts standard library datetime as input on initialisation, and/or can mimic the input of the standard datetime object initialization (among other ways). The pd.Timestamp object has a method to convert to standard library datetime for output. But it stores the time information in its own custom time object, with added functionality (for additional conversion/parsing/formatting options, use in indices, array operations, null/NaT values, etc). The same holds (with fewer convenience methods) for numpy.

So prescribing only standard library datetime and astropy.Time would, in my reading of this text, disallow passing numpy, Pandas and xArray objects containing their own custom time objects (since these will not be in the prescribed set of datetime/astropy.Time), and thereby severely restrict the PyHC community. It is very clear to me that this is not the intent, but it is, in my view, the result of the way it is currently phrased.

Yes, you are correct that I don't mention the pandas objects. I had assumed that would be implicit under the datetime recommendation as I consider them one and the same even if they are not.
I am not suggesting the non-use of these objects if you use pandas/xarray, I will have to make that clearer in the PHEP.

I would be very interested in the direct use of numpy datetime64 in PyHC libraries. That was not one I had expected originally.

@jibarnum

Again, per my point #1, a problem hasn't actually been established yet. Further, it also hasn't been established that packages can't interoperate.
I am thinking about the future. Overly specific standards hinder development. Wes McKinney, when giving early presentations on pandas, stated he had people telling him not to make pandas since numpy already exists. If he was in a group that specified people use numpy then Pandas wouldn't exist.
How can anyone claim that astropy time is the way to go when requirements (point #2) have yet to be established?

I'm thinking towards the future as well. In my mind, should there be a ton of different time objects created, that could lead to issues with interoperability and duplication of effort. So it made sense to me to set some guidelines, and get people working on the same page early on. Although not all use it, astropy was one of the widely-used packages in PyHC (enough to end up in PyHC summer schools), hence a focus on it. But, it's clear that that's not sufficient in and of itself, which led to @nabobalis revising the PHEP. Should a couple other packages make sense to include (e.g. the discussion happening above re numpy/pandas time objects), then we can all certainly discuss and Nabil can get them included in the PHEP.

The PHEP has failed to demonstrate that astropy.time satisfies the requirements for any PyHC packages. So why is it recommended?

Several PyHC packages use astropy.time now, not sure what you mean by it doesn't satisfy the requirement of any PyHC package. Can you elaborate?

Plus, the 'strongly recommended' language is sufficient to ensure that no package trying to create time support that meets different requirements will be able to get funding. Why should we discourage the creation of new time objects if we don't know the existing ones are appropriate?

Hmmmm. My feelings on this are that we should, where we can, try to make current systems work better if they aren't currently meeting the needs of the community. I.e., are there ways we can make the currently-existing software work better (e.g. astropy, datetime), rather than spinning up a new tool? And if there isn't a good way to modify what exists now, I think that would warrant a community discussion around that before proposing for funding to do something totally different. If that discussion showed we truly needed something different, then we could pivot there and create PHEPs to supersede current ideas.

Further, the community has yet to demonstrate why there should be only one of a software package. This is especially relevant for a NASA community as NASA always produces more than one.

Well, with the way the PHEP is now, it's no longer the intention to only use astropy.time. If it were sufficiently modified in the future to meet everyone's needs, that'd be a different story, but we aren't at that point.

@rweigel
Contributor

rweigel commented Sep 25, 2024

I'd like to see rstoneback's point #1 addressed:

  1. First, we haven't even established there is a problem. SpacePy is the only PyHC package to create time support. Other packages use datetime, which is actually fine. In practice, there are only a few leap seconds per year. So not supporting leap seconds affects about 6E-8 of the total samples in a year.

The fundamental issue with this PHEP is that it proposes a solution, but the problem it addresses is not concrete or specific. Maybe try framing it as "People do A (with code examples), and this creates problem B, but if they follow the standard, problem B is obviated or ameliorated because of reasons x, y, and z." If the problem is duplication, provide links to lines of code.

We are debating about an unknown, unagreed-upon, or hypothetical problem. I motion that this PHEP discussion is tabled until the concrete-and-specific problem question is answered and there are specific examples of interoperability issues related to time that this PHEP would address. The next step is to get agreement that a claimed problem is a problem. Finally, we should debate proposed solutions. We are not making progress because we started with the last step and the claimed problem is vague.

My experience with PyHC packages is that time interoperability could be improved without an AstroPy mandate. (Unfortunately, I have not had time to document them and provide research that justifies my claim.) I also think it is important that we all attempt experiments with many PyHC packages and develop our own opinions about interoperability issues.

@rweigel
Contributor

rweigel commented Sep 25, 2024

Another reason for tabling is that continuing debates when it is clear there is little agreement can harm future collaboration or reduce participation. Let's find and start with points of agreement and then debate implementation issues.

@jibarnum

Another reason for tabling is that continuing debates when it is clear there is little agreement can harm future collaboration or reduce participation. Let's find and start with points of agreement and then debate implementation issues.

Let’s discuss at Monday’s tag up!


For example, the reason that Astropy's time framework was created was to add support for astronomical formats (e.g., Julian Date (JD), Modified JD (MJD)) and precise timing (e.g., a nanosecond over a Hubble Time).
None of which is possible when using the [datetime](https://docs.python.org/3/library/datetime.html) module.
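
To illustrate the gap, the following standard-library sketch shows what a user has to hand-roll with `datetime` alone. The `to_mjd` helper is hypothetical and deliberately naive: it ignores leap seconds and time scales (UTC, TT, and so on), which are exactly the details a dedicated time framework handles.

```python
from datetime import datetime, timedelta

# MJD counts days from 1858-11-17T00:00; JD = MJD + 2400000.5.
MJD_EPOCH = datetime(1858, 11, 17)


def to_mjd(dt):
    """Naive MJD for a datetime; ignores leap seconds and time scales."""
    return (dt - MJD_EPOCH).total_seconds() / 86400.0


mjd = to_mjd(datetime(2000, 1, 1, 12, 0, 0))
jd = mjd + 2400000.5
print(mjd, jd)

# datetime resolves to microseconds, not nanoseconds.
finest_step = datetime.resolution
print(finest_step)
```

The microsecond resolution is a hard limit of the `datetime` type, so nanosecond timing needs a different representation regardless of how the conversion is written.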

Contributor

Suggested change
# Use case
Several independent researchers are attempting to match coronal features during an eruption observed by the Atmospheric Imaging Assembly (AIA) on the Solar Dynamics Observatory (SDO) to in situ measurements observed by Parker Solar Probe (PSP). Among these researchers are summer student interns who are using PyHC packages for the first time. This study requires careful comparisons of the times of the remote and in situ observations. In particular, the researchers must make numerous time comparisons, while taking into account transit time and spatial differences.
The researchers decide to use SunPy (which makes use of `astropy.time.Time`) to read in and analyze remote observations from SDO/AIA. They then decide to either use SpacePy (and its `TickTock` class) or PySPEDAS (which uses `datetime64`?) to read in and analyze in situ observations from PSP. However, operations like `Time(...) - TickTock(...)` result in an exception(?).
Currently, no functionality exists(?) among PyHC packages to convert between or directly compare `TickTock`, `Time`, and `datetime64` objects. This lack of interoperability has the following consequences:
- Students and researchers would need to learn how to use and interact with multiple different time objects, which increases the onboarding time for student interns.
- Each researcher would need to write or acquire functionality that would convert the different time classes into a common class.(?)
- The effort required to perform this analysis would be increased.
- User experience would be degraded.
If the ecosystem were to standardize on a common time framework (or at least one with a common API), the consequences would be:
- Cross-disciplinary researchers would only need to learn a single API for dealing with time.
- Time delta operations would be greatly simplified: subtracting a time object from another would produce a consistent time delta object.
- The effort needed to perform this analysis would decrease.
- The likelihood of unnecessary software duplication would be reduced.
- User experience when working with multiple PyHC packages would be improved.
- The onboarding time for students would be reduced, and the suitability of this topic as a research project for student interns would be increased.
- `astropy.time.TimeDelta` objects can be converted to `astropy.units.Quantity` objects using the `.to` method, which would enable multiplication with a velocity to get a distance.

Would it make sense to spell out some use cases to show more about why we're hoping to standardize around time? I attempted to draft one here, with question marks about the parts I'm uncertain about.

Hi @namurphy - I suggest trying the experiment.

It seems that we have a proposed standard and are looking for hypothetical problems to justify it. All the packages understand datetime, and the hypothetical does not indicate exactly where datetime is insufficient. I expect one could come up with a concrete example that the standard would have fixed, but it will likely be an uncommon use case - if so, we should debate if the burden of the standard on everyone is worth simplifying an uncommon use case.

I'd really like to see the participants try to use other packages together and study the packages' source code. Five ideas for standards or cross-cutting projects will probably result, and I doubt this one involving `astropy.time.Time` will be one of them. I am very reluctant to propose a standard based on hypothetical or contrived problems with claims that may prove wrong if one tries to solve the problem. There should be a PHEP for time, but there should not be speculation to justify it. Also, the implications should be researched (again, without speculation).

Much more research is needed. I don't see the rush to have this voted on.

I've started with an ISO 8601 timestamp of interest and used SpacePy, SunPy, PySPEDAS, and SpiceyPy to do a coordinate transform. There may have been a time standard that would have simplified things for me - if I could have bypassed the custom time objects and just passed an ISO 8601 string to the transform functions, things would have been much easier. In my problem, there was no need to expose the package's time abstraction to the user. Several packages require the user to understand custom time and/or coordinate objects (or don't allow dimensionless vectors) to do something conceptually simple - pass a timestamp and a vector and get a new vector. I don't know if there is an actual proposal here - what is there to say beyond "make simple things easy to do" or "don't require the scientist to understand a software abstraction if unnecessary?"

As a result of this experience, I submitted an issue requesting a keyword in the hapiclient function to request the output as datetimes instead of the default of the raw strings served by the HAPI server. There is no reason for the general user to learn and use a separate function, hapitime2datetime, to convert from the raw ISO 8601 strings to datetimes. In the next major release, we'll make this the default behavior because this is what most want.
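
For illustration only, here is a stdlib sketch of the kind of conversion involved. It is not the hapiclient implementation: `iso8601_to_datetime` is a hypothetical helper, the timestamps are invented, and only the calendar-date form of ISO 8601 with an optional trailing `Z` is handled.

```python
from datetime import datetime, timezone


def iso8601_to_datetime(stamp: str) -> datetime:
    """Hypothetical helper: parse an ISO 8601 calendar timestamp as UTC."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(stamp)
    # Treat timestamps without an offset as UTC (an assumption of this sketch).
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


# Invented raw strings of the sort a server might return.
raw = ["2024-07-30T01:22:00Z", "2024-07-30T01:22:30.500Z"]
times = [iso8601_to_datetime(s) for s in raw]

print(times[1] - times[0])
```

Doing this once inside the reader means the user only ever sees datetimes and never has to call a separate conversion function.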

@jvandegriff

I've been behind on this, but am trying to catch up.

This proposal to switch to a new, standardized approach to time represents a very low-level change to most if not all of the libraries in PyHC. The fundamental ways that astronomers and heliophysicists think about time are different. This adds a large new Python library dependency to basically all PyHC projects. Some of the core projects are not supportive of this large change.

With this much headwind, there is the potential for frustration and discord if this gets passed through quickly. Some people have pointed out that there is a lack of clarity on the exact problems that this is solving. We at least need to back up and address those concerns. The general principles of reducing duplication, leveraging existing codebases, etc., are of course well accepted. However, because so many existing PyHC libraries are now funded at a maintenance-only level, it is possibly a very scary situation to suggest that they now all try to find funding to adapt to a new approach, or be considered deprecated. So a plan to adapt time formats across existing libraries, or a plan to consolidate over time to a set of accepted standards, seems like it could get more traction with a larger number of people.

@rebeccaringuette

Some very good points here. I suggest that the time currently scheduled in the fall meeting for voting on this PHEP be reassigned to a hackathon where the group works out which interoperability problems between our software packages this PHEP would resolve.

@rebeccaringuette

It seems we should also add some text about the time methods being interoperable with (rather than equal to) the options in this PHEP, which seem to be `astropy.time`, `np.datetime64`, and one or two others.

@sapols
Contributor

sapols commented Sep 30, 2024

I'll quickly note that we just decided at the recent PyHC core tag-up to not hold a vote on this PHEP at the Fall meeting, and we'll use that time to discuss things as a community instead.

@nabobalis nabobalis closed this Nov 13, 2024
@nabobalis
Author

nabobalis commented Nov 13, 2024

At the Fall 2024 meeting, it was decided to tier interoperability: the base tier is that all PyHC packages are installable in the same environment and work without issues, whereas the top tier is that PyHC will attempt to provide converter functions between libraries.

Therefore, I have closed this PR as it now serves no purpose.

@rebeccaringuette
Copy link

https://youtu.be/9FzCWLOHUes?feature=shared
could not resist…
