-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#61 Add infinity support to Dates and Numerics #63
Conversation
JaCoCo code coverage report - scala 2.12.12
|
@@ -681,4 +695,27 @@ object TypeParser { | |||
} | |||
} | |||
|
|||
sealed trait InfinitySupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why sealed trait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could and perhaps should be in its own file even.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason is that this is a library and there are not supposed to be used outside. It is supposed to be only used with our defined parsers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code of this trait is written in such a way, that outside usage doesn't really causes issue, if somebody finds it useful - unlikely 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO the "hiding" is important, when the class/trait have special handling needs or has access to more sensitive parts of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved out
@@ -1 +1 @@ | |||
sbt.version=1.6.2 | |||
sbt.version=1.10.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not. Is it an issue? There was a warning when debuging JaCoCo
@@ -681,4 +695,27 @@ object TypeParser { | |||
} | |||
} | |||
|
|||
sealed trait InfinitySupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could and perhaps should be in its own file even.
Here is a problematic use case. Config: {
"pattern": "yyyyMMdd",
"minus_infinity_symbol": "0",
"minus_infinity_value": "00010101",
"plus_infinity_symbol": "99999999",
"plus_infinity_value": "99991231"
} When I believe this is because this implementation tries to replace all '0' with value In order to have pure solution that covers this and similar use cases maybe we should do it like this (pseudocode):
where It is more complicated, but covers edge cases like this. AlternativelyYou can convert the input column to a string first, then do the replacement, and then do a conversion In addition (optional, just thinking)Maybe we can consider specifying Say, this can solve weird cases like this: {
"pattern": "yyyy-MM",
"minus_infinity_symbol": "0",
"minus_infinity_value": "0001-01-15",
"plus_infinity_symbol": "999999",
"plus_infinity_value": "9999-12-31"
} |
My thoughts were that an integer or numeric columns in general would never come with zeros in the front (i.e. As per @yruslan we will quickly confirm if this is need feature and think of a solution or if this is good enough. |
I'm okay with the current implementation. |
Tbh, I think the argument doesn't make sense from a user's perspective. Why should a minus infinity symbol of 0 be standardized to a date 1000-01-01 and not 0001-01-01, or even 0000-01-01 for that matter. But we can tell users that the minimal supported date is 1000-01-01, so I'm also ok with the functionality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Closes #61
Adds support or infinity replacements on Dates, Timestamps and Numeric fields.
It works before the "spark built-in" infinity replacements that use
∞
and has no dependency onallowInfinity
metadata field.New metadata fields are now supported are: