I'm just summing up the results of a mini investigation I did, which puts most of my concerns about our handling of floating point values to rest (i.e., it seems we do not have the problem I thought we might). I did file one issue related to this here, but this comment is just noting that some other aspects of our handling of floating point numbers work well. It might be worth adding some tests related to this, though, to ensure that currently working functionality keeps working if we change the implementation.
The thing I was worried about was handling of floating point issues in quantile levels in hubValidations (and then later in downstream analyses using hubEnsembles). To investigate, note that in R we get:
```r
> a <- 0.09999999999999998
> a
[1] 0.1
> print(a, digits=22)
[1] 0.09999999999999997779554
> print(as.character(a))
[1] "0.1"
> b <- round(a, 1)
> b
[1] 0.1
> print(b, digits=22)
[1] 0.1000000000000000055511
> print(as.character(b))
[1] "0.1"
```
Based on this, I was worried that hubValidations checks that work by converting values to characters would accept both of these representations of the quantile level (`output_type_id`) 0.1, and that this could have downstream implications. For example, if both representations ended up in the same data frame, we'd get the following error in hubEnsembles when ensembling two model outputs that used different floating point representations of the same quantile level:
```
Error in `validate_output_type_ids()`:
✖ `model_outputs` contains 2 invalid distributions.
ℹ Within each group defined by a combination of task id variables and output type, all models must provide the same set of output type ids
```
However, all seems to be fine, because arrow's cast functionality is more careful than plain `as.character`:
This means that validations should not pass when checking equality of 0.09999999999999997779554 with 0.1 (and I confirmed that they do not pass, as expected). One lingering question is whether the value conversions used here are fully "safe" across platforms, or whether different computing environments might treat them differently.