Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

329 refactor joined timeseries class to allow adding fields #360

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

mgdenno
Copy link
Contributor

@mgdenno mgdenno commented Jan 3, 2025

This PR adds the ability to add calculated fields (used to be called user defined fields, but that isn't really accurate anymore). Notably:

  1. Adds RowLevelCalculatedFields and TimeseriesAwareCalculatedFields which are hopefully descriptive enough names.
  2. Adds a User Guide page to describe what they are and how to use them.
  3. Adds hvplot dependency to poerty
  4. Adds add_calculated_fields() methods to joined_timeseries and metrics "tables"
  5. Updates tests

@mgdenno mgdenno linked an issue Jan 3, 2025 that may be closed by this pull request
@mgdenno mgdenno requested a review from samlamont January 3, 2025 16:04
Copy link
Collaborator

@samlamont samlamont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just had a few questions/comments. (the reference to the added notebook in the user guide index is broken)

Another thought on the user guide is to add some content describing the difference between using .add_calculated_fields() vs. execute_udfs=True. It seems like executing the UDF's from the script makes it easier for the user to add their own custom functions?

We could also list the currently available RCF's and TCF's in the User Guide?

@@ -38,6 +38,7 @@ arch = "^7.0.0"
pandera = {extras = ["pyspark"], version = "^0.20.4"}
netcdf4 = "1.6.5"
bokeh = "^3.5.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we'll remove bokeh as a dependency in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably. I think it will still be installed but as a dep to holoviews, not directly.

PercentileEventDetection = PercentileEventDetection

# print all properties such as PercentileEventDetection
def __repr__(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is a convenience function, correct? Just to be able to visualize the object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Should remove that. I thought it would be nice if printing the object showed what "methods" it had, but it didn't work as I wanted.

@@ -28,6 +28,8 @@ Before starting, make sure you have installed TEEHR and its dependencies as desc

:doc:`Grouping and Filtering </user_guide/notebooks/06_grouping_and_filtering>` :download:`(download notebook) </user_guide/notebooks/06_grouping_and_filtering.ipynb>`

:doc:`Adding calculated fields </user_guide/notebooks/08_adding_calculated_fields>` :download:`(download notebook) </user_guide/notebooks/08_adding_calculated_fields.ipynb>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the notebook reference should be 08_add_calculated_fields (or change name of notebook?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

@@ -40,4 +42,5 @@ Before starting, make sure you have installed TEEHR and its dependencies as desc
Read an Evaluation from S3 </user_guide/notebooks/07_read_from_s3>
Joining Timeseries </user_guide/tutorials/joining_timeseries>
Grouping and Filtering </user_guide/notebooks/06_grouping_and_filtering>
Adding calculated fields </user_guide/notebooks/08_adding_calculated_fields>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@mgdenno
Copy link
Contributor Author

mgdenno commented Jan 6, 2025

Another thought on the user guide is to add some content describing the difference between using .add_calculated_fields() vs. execute_udfs=True. It seems like executing the UDF's from the script makes it easier for the user to add their own custom functions?

Yeah, that probably makes sense to do. execute_udfs is really just a call to execute a script in the evaluation. This concept is probably confusing, esp. now that we have these other .add_calculated_field() methods. I think this is probably confusing...

We could also list the currently available RCF's and TCF's in the User Guide?

Good idea. Maybe just in the notebook page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor joined_timeseries class to allow adding fields
2 participants