Fix common bug in parsing of UTC datetimes; remove dateutil #373

Merged (1 commit) on Dec 17, 2024

probberechts (Contributor) commented Dec 14, 2024

This PR addresses a common bug related to parsing UTC datetimes, fixes the timezone of Opta F24 data and removes the dateutil dependency.

The following conventions are used:

  • All datetimes are timezone-aware
  • If the local timezone is known, the local timezone is used
  • If the local timezone is not known, UTC is used

astimezone -> replace

Often, datetimes are specified as a string and use the UTC timezone. In kloppy, we want to parse these as timezone-aware datetimes. Until now, the string was usually parsed and converted to a timezone-aware datetime as follows:

>>> from datetime import timezone
>>> from dateutil.parser import parse
>>> parse("2024-12-14 12:00").astimezone(timezone.utc)
datetime.datetime(2024, 12, 14, 11, 0, tzinfo=datetime.timezone.utc)  # on a system with local timezone UTC+1

The correct implementation is:

>>> parse("2024-12-14 12:00").replace(tzinfo=timezone.utc)
datetime.datetime(2024, 12, 14, 12, 0, tzinfo=datetime.timezone.utc)

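The former treats the naive datetime as system-local time and converts it to UTC, which shifts the clock time. The latter keeps the clock time and only sets the timezone (i.e., makes the datetime timezone-aware).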
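To make the difference reproducible on any machine, here is a minimal sketch that simulates a UTC+1 system with an explicit fixed offset instead of relying on the system's local timezone. The variable names are illustrative and not taken from kloppy.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime parsed from a string that is known to be in UTC.
naive = datetime.strptime("2024-12-14 12:00", "%Y-%m-%d %H:%M")

# astimezone() on a naive datetime assumes system-local time. To keep the
# result independent of the machine, emulate a UTC+1 system explicitly.
utc_plus_one = timezone(timedelta(hours=1))
converted = naive.replace(tzinfo=utc_plus_one).astimezone(timezone.utc)

# replace() only attaches the timezone; the clock time is untouched.
labelled = naive.replace(tzinfo=timezone.utc)

print(converted.isoformat())  # 2024-12-14T11:00:00+00:00 (shifted by one hour)
print(labelled.isoformat())   # 2024-12-14T12:00:00+00:00 (as intended)
```

The one-hour gap between the two results is exactly the simulated local offset, which is why the bug only shows up on machines whose local timezone is not UTC.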

Opta F24 timezone

According to the documentation, timestamps in the F24 files use the timezone "Europe/London".
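As an illustration of what a "Europe/London" timestamp means in practice, the sketch below attaches that timezone to naive timestamps in winter and in summer. It assumes Python 3.9+ with the standard-library zoneinfo module and an available tz database, which is not necessarily what the PR uses; the function name parse_f24_timestamp and the format string are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")

def parse_f24_timestamp(value, fmt="%Y-%m-%dT%H:%M:%S"):
    """Hypothetical helper: parse a naive string as Europe/London local time."""
    # With zoneinfo, replace() resolves the correct offset for the given date.
    return datetime.strptime(value, fmt).replace(tzinfo=LONDON)

winter = parse_f24_timestamp("2024-12-14T15:00:00")  # GMT
summer = parse_f24_timestamp("2024-08-17T15:00:00")  # BST

print(winter.isoformat())                           # 2024-12-14T15:00:00+00:00
print(summer.isoformat())                           # 2024-08-17T15:00:00+01:00
print(summer.astimezone(timezone.utc).isoformat())  # 2024-08-17T14:00:00+00:00
```

Treating these timestamps as UTC would be right in winter but off by one hour during British Summer Time.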

Remove dateutil

We know the exact format of the date string in advance. Hence, we don't need dateutil's parser module.
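A minimal sketch of parsing with a fixed format using only the standard library. The helper name parse_utc and the format string are assumptions for illustration; the actual formats depend on each data provider.

```python
from datetime import datetime, timezone

def parse_utc(value, fmt="%Y-%m-%dT%H:%M:%S.%f"):
    """Hypothetical helper: parse a naive string that is known to be in UTC."""
    # strptime() raises ValueError if the string does not match the format,
    # instead of guessing like a generic parser would.
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)

ts = parse_utc("2024-12-14T12:00:00.250")
print(ts.isoformat())  # 2024-12-14T12:00:00.250000+00:00
```

An explicit format also fails loudly on unexpected input, which a permissive parser may silently misinterpret.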

probberechts (Contributor, Author) commented Dec 14, 2024

Is there a one-minute difference between Europe/London and UTC? Or did I make a mistake?

import datetime
from pytz import timezone

naive_datetime = datetime.datetime.now()
local_datetime = naive_datetime.replace(tzinfo=timezone('Europe/London'))
utc_datetime = local_datetime.astimezone(timezone('UTC'))
print("Naive datetime:", naive_datetime)
print("Local datetime:", local_datetime)
print("UTC datetime:  ", utc_datetime)

This is the output:

Naive datetime: 2024-12-14 21:53:04.383850
Local datetime: 2024-12-14 21:53:04.383850-00:01
UTC datetime:   2024-12-14 21:54:04.383850+00:00

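Update: I've figured it out. With pytz, one should use localize instead of replace: replace attaches the zone's default offset (local mean time, -00:01 for Europe/London), while localize picks the offset that is valid for the given date. See https://groups.google.com/g/django-users/c/rXalwEztfr0/m/QAd5bIJubwAJ. I'll update the code later.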

@probberechts probberechts marked this pull request as ready for review December 16, 2024 22:35
@probberechts probberechts added this to the 3.16.0 milestone Dec 17, 2024
@probberechts probberechts merged commit b0f56e1 into PySport:master Dec 17, 2024
19 checks passed
@probberechts probberechts deleted the fix/astimezone branch December 18, 2024 10:27