Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify and speed up timestamp parsing #3063

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from
Open

Conversation

akx
Copy link
Contributor

@akx akx commented Nov 15, 2023

As mentioned in #2972 (comment):

_parse_timestamp_with_tzinfo() already attempts to do fromtimestamp parsing; not much point in doing that work twice for timestamp-esque strings (and failing in the cases described in #2972).

In the subsequent commits, this PR also cleans up remaining seemingly Useless Use of tzlocal.

@codecov-commenter
Copy link

codecov-commenter commented Nov 15, 2023

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (develop@976b894). Learn more about missing BASE report.
Report is 177 commits behind head on develop.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3063   +/-   ##
==========================================
  Coverage           ?   93.06%           
==========================================
  Files              ?       66           
  Lines              ?    14472           
  Branches           ?        0           
==========================================
  Hits               ?    13468           
  Misses             ?     1004           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@akx akx marked this pull request as draft November 15, 2023 07:23
@akx
Copy link
Contributor Author

akx commented Nov 15, 2023

@nateprewitt I have some tests you seem to have added for #1987 failing with this (hence a draft), but you might have some context to help here: Why do we need tzlocal or tzwinlocal in the first place?

parse_timestamp doesn't necessarily guarantee the returned datetime should have any tzinfo attached, though the tests seem to always assume it's UTC, and practically, parsing any timestamp off the wire does prescribe a timezone in some way (GMT for RFC822, Z for ISO8601).

Might I be right in guessing the tzlocal/tzwinlocal stuff could actually be all removed in favor of _epoch_seconds_to_datetime's arithmetics?

@akx akx changed the title Use _epoch_seconds_to_datetime in _parse_timestamp_with_tzinfo Simplify timestamp parsing Nov 15, 2023
@akx

This comment was marked as outdated.

@akx
Copy link
Contributor Author

akx commented Nov 28, 2023

(Gentle review nudge, @nateprewitt? 😄)

@akx
Copy link
Contributor Author

akx commented Jan 10, 2024

Another nudge, @nateprewitt... I noticed you'd said

we still have some concerns around the original windows issue that lead to that PR. It's likely safe to clean up as you've suggested but we'll need to do some manual testing first.

in #3062 – anything I can do to help things along?

@akx
Copy link
Contributor Author

akx commented Feb 16, 2024

Sorry for Yet Another Ping, @nateprewitt 😅

@akx
Copy link
Contributor Author

akx commented Jun 22, 2024

Another ping... @nateprewitt?

@akx akx force-pushed the epoch-seconds branch 2 times, most recently from 0631762 to a5866ba Compare November 21, 2024 14:01
@akx
Copy link
Contributor Author

akx commented Nov 21, 2024

Rebased post conflicts from #3206...

@akx akx changed the title Simplify timestamp parsing Simplify and speed up timestamp parsing Feb 6, 2025
@akx
Copy link
Contributor Author

akx commented Feb 6, 2025

I decided to just add the optimization commit I'd mentioned in a previous comment in here, because why not – after all this PR has been brewing for over a year so might as well.

Benchmark results for parsing timestamps found in the test suite comparing 05538ce and this PR, showing a mean OPS improvement of 16.75×:

Name (time in ms) OPS Min Max Mean StdDev Median IQR Outliers Rounds
test_benchmark_parse_timestamp[develop] 44.7751 20.0351 25.8432 22.3338 1.7029 22.2602 2.9707 86;0 211
test_benchmark_parse_timestamp[this PR] 750.3208 1.1025 1.6340 1.3328 0.1388 1.2426 0.2904 929;0 2541

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants