Clarification of time coordinates, especially leap seconds, define utc and tai calendars and leap_seconds in units_metadata #542
Comments
In discussion 304, @ChrisBarker-NOAA has given his support to this proposal (thanks, Chris). He writes:
Please could anyone who wants to comment on this proposal do so here in this issue, rather than in discussion 304. Thanks. |
@ChrisBarker-NOAA has also made some comments on the PR (#541). I'm copying them here, because discussion of "substantive" points in a PR is awkward to follow subsequently. It's easier to have a single record in the issue. Marking typos etc. in a PR is fine, because they don't need discussion or reply.
I agree that "date/time" isn't ideal because "/" means "or", but I don't have a strong view on what we should write. We used "date/time" because it appears like that elsewhere in the convention document, especially chapter 7. If there is a consensus on a preferred way to write it, or a different term to use, we could change it throughout the document.
UTC and TAI have a complicated history, as described by Wikipedia. My understanding is that, to summarise it simply, TAI began on 1958-1-1, with the modern definition of a second in terms of the caesium atomic clock. In 1972 UTC was rebased on TAI, in such a way that they were treated as coincident at 1958-1-1, with 10 leap seconds having been added by 1972. Hence it's convenient to regard UTC as beginning in 1958 as well as TAI. There is a sentence of explanation elsewhere in the CF text, which Chris discovered later. I will put something at the point where this remark was made as well.
That's fine, thanks. I will insert it. The time zone definitions are plus/minus numbers of hours (and minutes), not names - no automatic transitions are implied by them!
OK, thanks.
In practice I'm sure it's OK if data-writers produce data for the future which they know will be correct because of advance warning. The checker will give an error if it finds a date which is in the future when the checker is run, but the future becomes the past at the rate of 1 second per second, and the same file will not give an error once that has happened! Should this be a recommendation not to write future UTC, rather than a prohibition? Thanks for these comments, Chris. I have resolved them in the PR. |
Dear Chris I have made changes (in the PR, html and pdf) following your suggestions. Two of them were more complicated than I had expected. Here are the new versions of various paragraphs: In 4.4.1 UDUNITS defines a The default time zone offset is zero. In a time zone with zero offset, time (approximately) equals mean solar time for 0 For example, In 4.4.2 In the real world, the international basis of civil timekeeping is Coordinated Universal Time (UTC). Leap seconds are adjustments occasionally made in UTC, in order to keep it close to mean solar time at 0 Do they look OK? Cheers Jonathan |
These look great -- thanks! Where are we at with:
I vote for either "datetime" or "date-time" -- but yes, it should be the same everywhere, so if this is too much churn, we can leave it as is. Maybe wait to see if anyone else has a preference? |
Dear @chris-little Thanks for reviewing the PR. I am glad you found it clear. You commented
Thanks for this point. I have qualified "midnight" with "at 0 Best wishes Jonathan |
Enough support has been expressed for this proposal to be accepted, and more than three weeks have passed without any further concern being raised. It would be really good to have this enhancement in CF 1.12, since we've been needing a solution to this issue for years. However, we're keen that there should be a consensus. Would anyone else like to comment? |
I'd still like to see "date/time" replaced with either "date-time" or "datetime" -- anyone else have a preference? One data point: apparently SQL uses "DATETIME" -- for what that's worth. OH, two: python uses As for "midnight":
So midnight is clearly defined -- though it still could be confusing (midnight at the beginning or end of the day?). Why not "zero hours" or "0" or, at least, "midnight at the beginning of the day"? I'm not sure the "at 0 degrees_east" is needed. |
No strong preference, but I agree that either datetime or date-time reads more smoothly than date/time. I'd lean marginally towards datetime because it's pythonic. |
(Sorry to be late commenting; I've been too swamped to keep up on this issue until now.) I think the text does an admirable job of sorting out all the complicated details; kudos to the authors. The one thing that is missing is that I think it needs a high-level overview and summary at the beginning. I would venture that a majority of readers are going to get a short ways into this section, become overwhelmed, and skip over the rest of it. Most users just want to know what, if anything, they should do about leap seconds, so we should provide that guidance up front. I would suggest something along these lines (perhaps phrased a little more formally), if others think it summarizes things as accurately as it can while glossing over all of the details.
|
@sethmcg And perhaps to lay it on a bit thicker, add at the end something along the lines of: |
Slight wordsmithing: Consequently, even though time labels allow data to be correctly ordered, any calculations of durations may be inaccurate by a few seconds. (Is "labels" the right term? I don't think I see it in the other text.) Related NOTE: Over on: https://github.com/orgs/cf-convention/discussions/383 We are discussing the use of floating point types in time variables -- I think that if you want to be accurate to the leap-seconds, you really should be using integer seconds (or less) as your time unit -- using a floating point type makes even less sense when you care about the precision that much. In terms of this issue, that means we should have the examples use appropriate units / data types (if data types are part of the example). Looking at the PR, I see:
perfect -- seconds is a good unit to use. (can we spell it "seconds" or is that fixed elsewhere?) All good -- but do we want to add anything about appropriate unit/data type combinations? -- e.g. "days since" with float is not going to get you second precision for very long. And "days since" with double will for a huge range, but it also has variable precision, depending on where you are on the timescale -- so while calculating to leap-second precision, you may be off by a nanosecond or two -- but I suppose that's all the usual caveats with floating point types. (I thought I saw a "days since" in there somewhere, but can't find it now -- so I guess all good?) But a recommendation may be good: Maybe add something to the conformance doc under: === 4.4.1 Time Coordinate Units Recommendations: Or is there somewhere else a recommendation could be added? |
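To put rough numbers on the point about unit and data type combinations, here is a minimal sketch, not taken from the PR, that measures the spacing between adjacent representable values for a "days since" coordinate stored as 32-bit and 64-bit floats. The coordinate value of 20000 days is a hypothetical choice, and `math.ulp` assumes Python 3.9 or later.

```python
import math
import struct

SECONDS_PER_DAY = 86400


def float32_spacing(value):
    """Gap between `value` (rounded to float32) and the next larger float32."""
    # Reinterpret the float32 bit pattern as an unsigned integer; for a
    # positive finite value, adding 1 gives the next representable float32.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    rounded = struct.unpack("<f", struct.pack("<I", bits))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - rounded


# Hypothetical coordinate: roughly 55 years after the reference date.
days = 20000.0

float32_step_s = float32_spacing(days) * SECONDS_PER_DAY
float64_step_s = math.ulp(days) * SECONDS_PER_DAY

print(f"float32 step: {float32_step_s} s")
print(f"float64 step: {float64_step_s:.3e} s")
```

At this point on the time axis, a 32-bit "days since" coordinate cannot distinguish instants closer together than about 169 seconds, whereas a 64-bit one resolves about 0.3 microseconds. Integer seconds avoid the position-dependent spacing altogether.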
@ChrisBarker-NOAA I'm happy with your word-smithing. I used |
Dear all Thanks for your careful reading of and comments on the draft. David and I have considered these comments, and I have made consequent changes in PR #541. The updated conventions document can be seen as HTML and PDF. Below is a description of the changes. Please let us know any further concerns or suggestions for improvement. Best wishes Jonathan
|
Thanks Jonathan, Here are first a couple of comments of more technical nature that I think are uncontroversial:
And here are two further comments of technical nature, but I do not have a concrete suggestion how to fix them:
|
Dear Jonathan, Thank you for these new changes. I'm happy with all of them, with the exception of the new paragraph in 4.4.2 about leap seconds: I don't think we should be so explicit about the numbers of leap seconds by which TAI and UTC differ. UTC is currently 37 seconds (not 27) behind TAI. 27 seconds have been added to UTC since 1972, in increments of 1 second; but 10 seconds were also added to UTC over the period 1958-1971, in various increments of (much) less than 1 second (these were not called leap seconds at the time, but "rubber seconds"). The current difference between TAI and UTC is likely to change, so I don't think we should hardwire it into the conventions. Similarly, the 2035 no-more-leap-seconds date is already a bit vague (as you noted), and doesn't preclude the possibility of 60 leap seconds being applied at one instant in the future (i.e. when we've drifted by 1 minute). I propose a new version of this paragraph ( Leap seconds are adjustments made at irregular and unpredictable intervals in Coordinated Universal Time (UTC), the international basis of civil timekeeping in the real world. In response to slight variations in the Earth’s rotation speed, positive or negative leap seconds are inserted in order to keep UTC close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see Section 4.4.1, "Time Coordinate Units"). When a positive leap second is introduced at the end of a minute, that minute contains 61 seconds. Thanks, |
(Whilst writing mine, I missed Lars's last post, which also picked up on the 27/37 seconds discrepancy!) |
I think it's OK as is, but I generally agree with David and Lars -- it could be trimmed down, we only need so much detail, e.g.: Clearly stating that leap seconds are a thing, and how each calendar handles them (or doesn't), is enough -- folks can go find the details if it matters to them. Similarly for DST -- just note that it's a thing to keep in mind, that's really all we need. -CHB |
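As an illustration of the 61-second minute, the sketch below compares an elapsed time computed with every day assumed to have 86400 seconds against one that counts inserted leap seconds. It is not taken from the PR: the function names are hypothetical, and the table is deliberately partial, listing only the three most recent positive leap seconds (end of 2012-06-30, 2015-06-30 and 2016-12-31).

```python
from datetime import datetime

# Partial, illustrative table: the instant (00:00:00 UTC) immediately after
# each of the three most recent positive leap seconds. A real application
# would load the complete, current list rather than hardwiring it.
LEAP_SECOND_ENDS = [
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]


def naive_elapsed_seconds(start, end):
    """Elapsed seconds assuming every day has exactly 86400 seconds."""
    return (end - start).total_seconds()


def utc_elapsed_seconds(start, end):
    """Elapsed seconds including leap seconds inserted in (start, end]."""
    leaps = sum(1 for instant in LEAP_SECOND_ENDS if start < instant <= end)
    return naive_elapsed_seconds(start, end) + leaps


start = datetime(2016, 12, 31, 23, 59, 0)
end = datetime(2017, 1, 1, 0, 0, 0)
print(naive_elapsed_seconds(start, end))  # 60.0
print(utc_elapsed_seconds(start, end))    # 61.0
```

Python's `datetime` has no notion of leap seconds, so the first figure is what software that ignores them would report for the last minute of 2016, while the second is the true number of seconds in that minute.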
I agree with @ChrisBarker-NOAA and @larsbarring. Where we can get away with saying less, I think we should. For instance, we say that adding 1 leap second makes a minute 61 seconds long. Do we really need to also say that subtracting a second makes it 59 seconds long, especially when that case has never actually happened? I think it's safe to leave that unsaid, and that it will make the document easier to absorb and understand if we do. Likewise, I think we can just say "UTC" without an in-line explanation of what it is. Maybe it would make sense to link it to the Wikipedia page on UTC? |
@JonathanGregory @davidhassell Just so that everyone is clear about what was agreed by the BIPM/IERS/ITU global conference (CGPM). Leap seconds, positive or negative, have not been abolished, but the criteria for declaring one will be (much) looser. The current criterion is (UT1-UTC) ~ 0.9s. The CGPM: So you might want to fine tune your wording to hedge your bets in 2026 and to avoid this issue in ten years' time, or a hundred! ;-) |
Dear all Thanks for reading this again and for further comments. I have updated the PR #541. By some good magic of Antonio @cofinoa, the corresponding PDF and HTML have been generated automatically by GitHub, also the conformance document PDF and HTML.
Best wishes Jonathan |
Looks good to me! Thanks for synthesizing all our comments. |
Agreed -- looking good: "Z is the time zone offset. This is an interval of time, specified in one of the formats described below. Time zone names or acronyms are not allowed." Folks do often get confused about "time zone" vs "time zone offset" time zone offset is simple and clear -- whereas a "time zone" is a designation of a region that follows certain rules for determining the offset -- CF doesn't deal with those at all. I think this captures that OK, unless we want to nail the point home: "Z is the time zone offset. This is an interval of time, specified in one of the formats described below. It is simply the offset from UTC, and not the regional time zone -- thus names or acronyms are not allowed." or not .... -CHB |
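The distinction between a time zone offset and a named time zone can be sketched in a few lines. The accepted shapes below (such as "+5", "-03", "+05:30" and "-0800") are an assumption made for illustration only; they are not the definitive list of formats in the CF text or in UDUNITS, and `parse_offset` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Assumed offset shapes, for illustration only: a sign, one or two hour
# digits, and optionally two minute digits with or without a colon.
_OFFSET = re.compile(r"^([+-])(\d{1,2})(?::?(\d{2}))?$")


def parse_offset(text):
    """Return the offset as a timedelta; reject names such as 'EST'."""
    match = _OFFSET.match(text.strip())
    if match is None:
        raise ValueError(f"not a numeric time zone offset: {text!r}")
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2))
    minutes = int(match.group(3) or 0)
    return sign * timedelta(hours=hours, minutes=minutes)


print(parse_offset("+05:30"))
try:
    parse_offset("EST")
except ValueError as err:
    print(err)
```

Because the result is a fixed interval of time, nothing in it can imply a daylight-saving transition or any other regional rule.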
Thanks, @sethmcg and @ChrisBarker-NOAA. For Chris Barker's most recent comment, I have changed the time zone description to read
The PR #541 HTML and PDF have been updated. If no further concerns are raised, this change can be accepted in three weeks, on 26th November. @chris-little should be added to the list of contributors to the CF convention. Thanks, Chris. |
I'm not suggesting a blocker at this point, but I just noticed something: Is the UDUNITS time format ISO 8601? (It's close, but is it exactly?) If so, maybe we should say that in the doc somewhere. The references I see to ISO 8601 at this point are:
It seems a bit odd to talk about the differences without having stated the similarities first. |
Lars suggested the first point of comparison in ch04, and I added the second, having looked at the ISO 8601 definitions. We don't need these comparisons for the CF standard, and we should remove them if they're not helpful. I included them for the sake of anyone who is familiar with ISO 8601, to prevent them from making an assumption about what we mean, or to point out that we already know it's different (to forestall concerns or objections that we've made a mistake). I haven't done a thorough comparison. |
@JonathanGregory @ChrisBarker-NOAA I suggest minimising the references to ISO 8601 and making sure that they are purely informative, as there are superseded versions, and the latest is behind a paywall. ISO is just about to issue Amendment 1 to ISO 8601-1, and the whole standard is slated for full review. Part 3 (semantics) is in the pipeline too, and they are working on lots of overwhelming details on how to do calendrical calculations, possibly in conjunction with IETF people. The standard is probably too flexible with too many options, so that is why I would prefer references to a very restrictive profile of it, such as IETF RFC 3339, W3C's Webtime or even the US National Library profile. The relevant ISO committee apparently met two weeks ago. HTH |
Good reasons all. It's not a big deal, but I think we can remove the ISO 8601 references altogether, and simply define what a datetime string means in CF by itself. For instance: I think we can remove this altogether: "Note that this interpretation of omitted time, which is an aspect of UDUNITS syntax, is different from <<ISO_8601>>, in which omitted time implies lack of precision." -- It's already clear that YYYY-MM-DD means: YYYY-MM-DDT00:00:00 and: "Note that the CF meaning of "calendar" refers to datetimes, whereas the <<ISO_8601>> definition refers only to dates." Can simply be -- "Note that the CF meaning of "calendar" refers to datetimes, not only dates" (I presume that's because with TAI vs UTC, we do have different times in different calendars....) This isn't a deal breaker for me -- folks are familiar enough with the ISO standard (though not its intricate details), so it may be more clear to provide the contrast. Note that I can't find any documentation in the UDUNITS pages about how datetime strings are formatted -- maybe it's there, but I can't find it quickly. If it is there, we could put a link in ch04. If not, we should do something about that -- but that's another topic. |
I've deleted the references to ISO 8601 as suggested by @ChrisBarker-NOAA.
That's right.
I couldn't find it either. I don't think it's there. That's why I described it briefly. Since we take the UDUNITS syntax as definitive, it would be helpful if UDUNITS documented it. |
and thus: #562 If I get a chance, I'll add a note to that specifically about datetime strings... |
I am delighted to say that we have agreed to make these changes concerning leap seconds, after years of complicated and difficult debate. It's probably not the last word, but it's a step forward, I think. Many thanks to all who contributed recently on this issue. |
Whoo Hoo! Great work everyone! |
Amazing. I'll take a step forward over the last word, any day :) |
Summary
This proposal aims to reorganise and clarify the existing text, mostly in section 4.4, about time coordinates, with no change in meaning. It includes a new subsection on leap seconds and their implications for the CF standard calendar, with examples and a diagram, and defines a new use of the units_metadata attribute to remove ambiguity in the interpretation of leap seconds in the standard calendar. It introduces two new CF calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data.
Benefits
Several previous lengthy but inconclusive CF discussions have shown that the treatment of leap seconds is unclear and unsatisfactory. In this proposal we hope to provide an acceptable solution to these difficulties.
Moderator
None yet
Associated pull request
#541
Detailed Proposal
A huge amount of hard thought has been spent on previous long discussions about CF calendars and leap seconds (including #148, discuss issue #297, Discussion #304). The last of these went quiet in April.
Since then, we (@davidhassell and @JonathanGregory) have been working on a proposal, on which we'd now like to invite comments. If you are interested, please look at our modified text, especially section 4.4 on time coordinates. You can find this in any of the following:
Revise section 4, include new subsection on leap seconds, define utc and tai calendars and leap_seconds in units_metadata #541. Since the changes to ch04.adoc are large, you have to click "Load diff" to see them.
The conventions document incorporating the changes as HTML or PDF.
The main changes are these:
Reorganisation and clarification of the existing text, with no change in its meaning. We have put the text about units into its own subsection, including writing down the format of the reference date/time and time zone, which wasn't shown except by an example. We have put the detailed text and examples concerning the none and paleoclimate calendars into their own subsections as well, so that the subsection on calendars is limited to giving the definition of each calendar.
Opening statements defining date/times and time coordinates, and an explanation in the subsection on calendars of how they relate to time intervals. These points have been contentious in the past, so we feel it's best to state plainly how they should be understood in CF (according to this proposal).
A new subsection on leap seconds, which explains in detail their implications for the CF standard calendar. Difficulties arise because that calendar is, and has always been, used in practice both for data that truly does not have UTC leap seconds in its time axis (e.g. a model which uses the real-world Gregorian calendar with every day having 86400 seconds) and for data which does, or should, have leap seconds but they are ignored in the time coordinates (e.g. observational data recorded with UTC time). Rather than deprecating or prohibiting one or other of these variants, we propose a new convention for the units_metadata attribute to distinguish them, so that they can be handled correctly by the data-user. The units_metadata attribute was recently added to CF to handle the difficulty of degrees_celsius being used in two different ways that require different treatment by data-users, after a very long and difficult discussion. We are hoping that it can work the same magic with leap seconds.
A worked example and a diagram for leap seconds. The diagram was inspired by the graph posted by @ChrisBarker-NOAA. We've also produced a table illustrating how a selection of date/times and coordinates are related across many CF calendars, inspired by Lars's table. We propose to put this in an appendix to the convention, if this proposal is accepted. Thanks, Lars and Chris, for the ideas.
Two new calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data. The latter has been requested in previous discussions. The former hasn't explicitly been requested, but many comments imply that it would be preferred to standard for some purposes.
Previous discussions on these matters have evoked disagreements on principle which turned out to be irreconcilable by discussion in the issue, and no conclusion was reached. To avoid that outcome, we'd like to try a different method with the present proposal. If you find something in this proposal which you feel you couldn't possibly accept, even with modification, please say so in this issue. If anyone feels like that, we will convene a group to discuss the disagreements by video meeting, like we've done with a couple of other difficult issues. The group would be charged with reaching a resolution soon enough for some version of this proposal to be accepted for the next release, probably with a deadline in November. If that can't be done, we'll have to start again when someone has a new idea in future.
On the other hand, any suggestions, comments or concerns on clarity, presentation and details of the convention can probably be resolved by discussion in the usual way on this issue. We look forward to hearing what you think!
@JonathanGregory and @davidhassell