Skip to content

Use uncertainties.ufloat for uncertain durations #116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

rlskoeser
Copy link
Member

@rlskoeser rlskoeser commented Mar 13, 2025

preliminary implementation to add support for uncertain values

  • adds dependency on uncertainties package so we can use ufloat object
  • adds a preliminary Udelta object for a timedelta with uncertain duration, which uses ufloat for days
  • updates Undate.duration to calculate and return Udelta when there is uncertainty about the duration

Summary by CodeRabbit

  • New Features

    • Enhanced date duration calculations now provide statistical estimates for ambiguous dates, delivering both a nominal value and uncertainty measures.
    • Improved handling of date intervals involving variable month lengths and leap year considerations for more precise outcomes.
  • Chores

    • Updated dependency management to support the enhanced uncertainty computations.

Copy link
Contributor

coderabbitai bot commented Mar 13, 2025

Walkthrough

The pull request updates the project to support uncertainty in date calculations. The changes add the "uncertainties" package as a dependency, improve documentation for constants related to leap years, and introduce a new Udelta class to represent an uncertain time delta. The Undate.duration method is modified to return either a precise Timedelta or an uncertain Udelta, and corresponding tests have been updated to verify these behaviors.

Changes

File(s) Change Summary
pyproject.toml Added "uncertainties" package to the dependencies in the [project] section.
src/undate/converters/calendars/gregorian.py Updated comment for NON_LEAP_YEAR and added new LEAP_YEAR constant with documentation.
src/undate/date.py
src/undate/undate.py
Introduced new Udelta class in date.py; updated Undate.duration to return a union of `Timedelta
src/undate/interval.py Added a comment in the duration method regarding a future handling of February’s leap day.
tests/test_date.py
tests/test_undate.py
Added new test class TestUdelta with appropriate tests and updated existing tests to reflect the modified behavior of uncertain durations.

Sequence Diagram(s)

sequenceDiagram
    participant U as Undate Instance
    participant D as duration() Method
    participant C as Calculation Logic
    participant R as Return Value

    U->>D: Call duration()
    D->>C: Evaluate date precision and unknown year conditions
    alt Multiple possible month durations
        C-->>D: Provide list of possible day counts
        D->>R: Create and return Udelta (uncertain duration)
    else Single valid duration
        C-->>D: Calculate exact day count
        D->>R: Create and return Timedelta (certain duration)
    end
    D-->>U: Return computed duration
Loading

Suggested reviewers

  • ColeDCrawford
  • robcast
  • jdamerow

Poem

I'm just a bunny in a code-filled glade,
Hopping through changes, unafraid.
Adding uncertainty with a playful twist,
Leaping through logic, you get the gist!
(_/), ( •_•), / >💻 – a bunny’s cheer in every byte!

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Mar 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.27%. Comparing base (759d0c7) to head (34c31eb).
Report is 21 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #116      +/-   ##
===========================================
+ Coverage    96.50%   97.27%   +0.76%     
===========================================
  Files           20       21       +1     
  Lines          658      697      +39     
===========================================
+ Hits           635      678      +43     
+ Misses          23       19       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rlskoeser
Copy link
Member Author

@coderabbitai review

Copy link
Contributor

coderabbitai bot commented Apr 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
pyproject.toml (1)

12-12: Appropriate dependency addition for new functionality.

Adding the uncertainties package as a dependency is necessary for the implementation of ufloat for uncertain durations. Consider specifying a minimum version to ensure compatibility with your implementation.

-dependencies = ["lark[interegular]", "numpy", "convertdate", "strenum; python_version < '3.11'", "uncertainties"]
+dependencies = ["lark[interegular]", "numpy", "convertdate", "strenum; python_version < '3.11'", "uncertainties>=3.1.7"]
src/undate/date.py (1)

34-53: Well-implemented uncertain timedelta class.

The Udelta class is a good implementation of an uncertain duration. The use of @dataclass is appropriate, and the calculation of midpoint and half-difference is mathematically sound for representing uncertainty.

A few suggestions for potential improvements:

  1. Consider adding more timedelta-like functionality as noted in your TODO comment
  2. Add validation to ensure that at least two day values are provided
  3. Consider implementing comparison methods (__eq__, __lt__, etc.)
@dataclass
class Udelta:
    """An uncertain timedelta, for durations where the number of days is uncertain.
    Initialize with a list of possible day durations as integers, which are used
    to calculate a value for duration in :attr:`days` as an
    instance of :class:`uncertainties.ufloat`.
    """

    # NOTE: we will probably need other timedelta-like logic here besides days...

    #: number of days, as an instance of :class:`uncertainties.ufloat`
    days: ufloat

    def __init__(self, *days: int):
+        if len(days) < 2:
+            raise ValueError("At least two possible day values are required to create uncertainty")
        min_days = min(days)
        max_days = max(days)
        half_diff = (max_days - min_days) / 2
        midpoint = min_days + half_diff
        self.days = ufloat(midpoint, half_diff)
+
+    def __eq__(self, other):
+        if isinstance(other, Udelta):
+            return self.days == other.days
+        return NotImplemented
src/undate/converters/calendars/gregorian.py (1)

43-44: Consider implementing uncertainty for maximum days.

The TODO comment raises a valid question about returning a ufloat for the maximum possible days when month and year are unknown. This would be consistent with the new Udelta approach.

Consider implementing this enhancement to return a ufloat with uncertainty range for the maximum day when month/year are unknown:

# if year and month are unknown, return maximum possible
# TODO: should this return a ufloat?
-max_day = 31
+from uncertainties import ufloat
+# Create ufloat with average of 28-31 days and half-difference of (31-28)/2
+max_day = ufloat(29.5, 1.5)
tests/test_date.py (1)

84-94: Add broader boundary tests
The newly introduced TestUdelta class tests uncertain durations for a limited scenario (28–29 days). Consider adding edge cases (e.g., no possible days, a single possible day) to verify more exotic usage or error handling.

src/undate/undate.py (3)

441-470: Partial unknown year logic
This block correctly attempts leap and non-leap year combinations when the year is unknown. However, consider extending support for partially known years (e.g., 19XX) and non-Gregorian calendars.

Would you like me to open an issue to track more comprehensive handling for partially known years or other calendar systems?


472-474: Potential empty set risk
If possible_max_days is ever empty, accessing [0] will raise an IndexError. Confirm that possible_max_days always has at least one element for these cases.

- max_day = list(possible_max_days)[0]
+ if not possible_max_days:
+     # handle gracefully, e.g. return ONE_DAY or raise an error
+ max_day = list(possible_max_days)[0]

481-482: FIXME comment on calendar dependence
A flat 31-day maximum may not be correct across all calendars.

Would you like me to open a new issue or propose a patch to account for different calendar logic (e.g., Hijri, Hebrew) when determining the maximum month length?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca9fb7b and 34c31eb.

📒 Files selected for processing (7)
  • pyproject.toml (1 hunks)
  • src/undate/converters/calendars/gregorian.py (2 hunks)
  • src/undate/date.py (2 hunks)
  • src/undate/interval.py (1 hunks)
  • src/undate/undate.py (3 hunks)
  • tests/test_date.py (2 hunks)
  • tests/test_undate.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
tests/test_undate.py (1)
src/undate/date.py (3)
  • Timedelta (11-31)
  • Udelta (35-52)
  • days (29-31)
tests/test_date.py (1)
src/undate/date.py (5)
  • Date (63-142)
  • DatePrecision (145-170)
  • Timedelta (11-31)
  • Udelta (35-52)
  • days (29-31)
🔇 Additional comments (12)
src/undate/interval.py (1)

107-111: Well-documented future enhancement plan.

This comment correctly identifies a case where uncertain durations should be considered - when intervals span the end of February in Gregorian calendars. The implementation of the Udelta class in other files enables this future enhancement.

src/undate/date.py (1)

2-2: Good import choices for the new functionality.

The imports for dataclasses and uncertainties.ufloat are appropriate for implementing the new uncertain duration class.

Also applies to: 8-8

src/undate/converters/calendars/gregorian.py (1)

16-19: Good documentation improvement for leap year constants.

Clarifying that these are "arbitrary" known leap/non-leap years improves the code documentation. The addition of LEAP_YEAR will be useful for calculations involving February 29.

tests/test_date.py (2)

2-2: New import for ufloat
This import is necessary for testing the new uncertain duration functionality.


4-4: Added Udelta import
The import aligns with the new uncertain timedelta tests.

src/undate/undate.py (3)

21-21: Import statement for Udelta
This import is consistent with returning either Timedelta or Udelta in the duration method.


423-423: Union return type for duration
Using Timedelta | Udelta requires Python 3.10+ or use of from __future__ import annotations. Confirm that your environment meets these requirements, or adjust if Python < 3.10 is needed.


429-430: Updated docstring
Accurately reflects the new behavior returning Udelta for uncertain durations.

tests/test_undate.py (4)

7-7: New import of Udelta
The test suite now checks uncertain durations. This import is appropriate.


386-392: Check partial unknown month (“1X”)
Returning a Udelta for typical end-of-year months (October–December) is correctly validated.


393-399: Test unknown month (“XX”)
Validates a 28–31 day range for unknown months by returning a Udelta with a midpoint and deviation. Looks good.


400-406: Uncertain February duration
Correctly handles the 28–29 day range for an unknown year. This test accurately ensures the Udelta behavior.

@taylor-arnold
Copy link
Collaborator

Mostly looking at the test suite to see how the implementation works, rather than the internals since I don't have a great understanding of the entire code structure. Mostly this works great. However, the one concern I have is that that uncertainties.core.Variable class that is being used here is defined as a mean +/- a standard deviation, whereas the usage internally within undate is as a range from low to high. The difference is not particularly important in most cases, but it does lead to some strange/incorrect math. For example, what is the difference between the numbers of days in any month and the month of Feburary:

from undate.undate import Undate

month_duration = Undate(month="XX").duration()
feb_duration = Undate(month=2).duration()
month_duration.days - feb_duration.days
1.0+/-1.5811388300841898    # = math.sqrt(1.5**2 + 0.5**2)

@rlskoeser
Copy link
Member Author

@taylor-arnold thank you for looking and testing! I hadn't gotten that far with trying out how uncertainties.ufloat works for different calculations. I was trying to avoid unnecessary wheel-reinvention, but it seems like it's not a good fit here.

Any thoughts on how to proceed? Would it be better to create a simple uncertain range (or set?) class here? I did wonder about the fact that in this case we know it's more likely to be 28 than 29 - but I'm not sure how much that matters.

If you have capacity to propose expected behavior for some of these uncertain duration calculations I could start building out unit tests and use that to drive the implementation.

@taylor-arnold
Copy link
Collaborator

Yes, I definitely agree that it's great to not rebuild the wheel if possible. Though, it's probably not too much of a push to make a minimum object type that records minimum and maximum and allows for them to be added/substracted would not be too difficult.

There's also this package that seems like it does just that. Could use their structures, or if it's too heavy-weight, maybe just incorporate the examples as test cases? https://github.com/AlexandreDecan/portion

@rlskoeser
Copy link
Member Author

@taylor-arnold that library looks interesting! Only one additional dependency and has potential applications for other use cases — EDTF supports sets of dates and I hadn't yet wrapped my head around how one might implement or use that.

I'll revisit this swapping out uncertainties for portions and see how it goes. Appreciate the feedback & suggestion.

@rlskoeser
Copy link
Member Author

@all-contributors please add @taylor-arnold for review and ideas

Copy link
Contributor

@rlskoeser

I've put up a pull request to add @taylor-arnold! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants