As a developer, I want to optimize the retrieval of feature groups #273
Comments
Original Redmine Comment: Not a priority, more like a nice-to-have. For example, if an evaluation contains feature groups (A), (A,B), (A,B,C), (B,C), then B should not be retrieved more than once, likewise A and C.
Original Redmine Comment: Caching retrievals in general would not be a good solution to this (too much memory). A better solution would be to de-duplicate retrievals on pool creation, for example by linking together the suppliers of the atomic pools. Effectively, we want to hold in memory only those time-series that will be required in more than one context, and nothing more. Achieving this de-duplication will need some work around the `poolfactory`.
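A minimal sketch of the idea in the comment above, in Python for brevity rather than in the project's own code. All names here (`SharedRetriever`, `fake_store`) are hypothetical and not part of the project. It assumes the feature groups are known up front, counts how many group contexts need each feature, retrieves each feature once, and releases the data as soon as the last context has consumed it, so only time-series needed in a later context stay in memory.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Set, Tuple

TimeSeries = List[float]  # hypothetical stand-in for real time-series data


class SharedRetriever:
    """Retrieves each feature at most once and holds its data only while
    another group context still needs it (hypothetical illustration)."""

    def __init__(self, groups: Iterable[Tuple[str, ...]],
                 retrieve: Callable[[str], TimeSeries]) -> None:
        # Number of group contexts that still need each feature.
        self._remaining = Counter(f for g in groups for f in set(g))
        self._retrieve = retrieve
        self._held: Dict[str, TimeSeries] = {}

    def get(self, feature: str) -> TimeSeries:
        """Returns the data for one feature on behalf of one group context."""
        if self._remaining[feature] <= 0:
            raise KeyError(f"No remaining context for feature {feature!r}")
        if feature in self._held:
            data = self._held[feature]
        else:
            data = self._retrieve(feature)
        self._remaining[feature] -= 1
        if self._remaining[feature] > 0:
            self._held[feature] = data      # needed again later: keep it
        else:
            self._held.pop(feature, None)   # last use: release the memory
        return data

    def held_features(self) -> Set[str]:
        """Returns the features whose data is currently held in memory."""
        return set(self._held)


# The feature groups from the example in the first comment.
groups = [("A",), ("A", "B"), ("A", "B", "C"), ("B", "C")]
calls: List[str] = []


def fake_store(feature: str) -> TimeSeries:
    """Hypothetical data store that records each retrieval."""
    calls.append(feature)
    return [1.0, 2.0]


retriever = SharedRetriever(groups, fake_store)
for group in groups:
    for feature in group:
        retriever.get(feature)

print(calls)  # each of A, B and C is retrieved exactly once
```

The reference count is what separates this from general caching: a feature that appears in only one context is never stored, and a shared feature is dropped after its final use.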
Original Redmine Comment: I don't think this needs to happen before feature pooling is deployed. It's pure optimization, and it might even be premature optimization, in the sense that overlapping combinations of singleton features and feature groups might not be common in practice. Bottom line: without this optimization, you can expect overlapping retrievals, and hence more effort than the minimum needed, whenever a feature is referenced in more than one `FeatureGroup` context or in both a singleton feature and a `FeatureGroup` context. No big deal.
Author Name: James (James)
Original Redmine Issue: 95971, https://vlab.noaa.gov/redmine/issues/95971
Original Date: 2021-09-08
Given an evaluation that contains multiple instances of the same feature in different contexts (e.g., feature groups)
When that evaluation proceeds
Then it should not retrieve the same time-series data more than once from an underlying data store
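The acceptance criterion above could be checked along these lines. This is a sketch under stated assumptions, written in Python for brevity: `CountingStore`, `plan_retrievals` and `evaluate` are hypothetical names, and `evaluate` naively holds all data in memory, which is only acceptable here because the point is the check, not the retrieval strategy.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Group = Tuple[str, ...]


class CountingStore:
    """Hypothetical data store that counts reads per feature."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def read(self, feature: str) -> List[float]:
        self.counts[feature] += 1
        return [0.0]  # placeholder time-series values


def plan_retrievals(groups: Sequence[Group]) -> List[str]:
    """Returns each distinct feature once, in first-seen order."""
    return list(dict.fromkeys(f for g in groups for f in g))


def evaluate(groups: Sequence[Group],
             store: CountingStore) -> Dict[Group, List[List[float]]]:
    """Reads each distinct feature once, then assembles data per group."""
    data = {f: store.read(f) for f in plan_retrievals(groups)}
    return {g: [data[f] for f in g] for g in groups}


# Given an evaluation with the same feature in several feature groups
groups = [("A",), ("A", "B"), ("A", "B", "C"), ("B", "C")]
store = CountingStore()

# When that evaluation proceeds
pools = evaluate(groups, store)

# Then no feature is read from the store more than once
assert all(count == 1 for count in store.counts.values())
```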
Redmine related issue(s): 110326