Add FILLNULL command in PPL (#3032) #3075
Thank you for the PR! Could you add some integration tests? They live here.
I think we should add an ExplainTest to validate that you get an Eval physical plan when using FILLNULL.
@MaxKsyunz I have added some integration tests.
@jduo I have added an explain test.
docs/user/ppl/cmd/fillnull.rst
os> source=accounts | fields email, host | fillnull using email = '<not found>', host = '<no host>' ;
fetched rows / total rows = 4/4
+-----------------------+------------+
| email                 | host       |
|-----------------------+------------|
| [email protected]     | pyrami.com |
| [email protected]     | netagy.com |
| <not found>           |            |
| [email protected]     | boink.com  |
+-----------------------+------------+
This is missing `<no host>` from the right column, right? Same for the table above, it looks like one `<not found>` is missing?
os> source=accounts | fields email, host | fillnull using email = '<not found>', host = '<no host>' ;
fetched rows / total rows = 4/4
+-----------------------+------------+
| email                 | host       |
|-----------------------+------------|
| [email protected]     | pyrami.com |
| [email protected]     | netagy.com |
| <not found>           | <no host>  |
| [email protected]     | boink.com  |
+-----------------------+------------+
Updated the test.
In Example 1, it looks like `<not found>` is still missing from the `host` column..?
The test has been changed to use `employer` rather than `host`. `null` values are replaced with `<no employer>`.
Looks good to me. Thanks.
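As a reading aid for the thread above, here is a minimal sketch of the behaviour the corrected table documents: each field named after `using` has its own replacement, applied only where the value is null. It models a single row as a `Map` and is not the plugin's implementation; the class and method names (`FillNullUsingSketch`, `fillNullUsing`) are hypothetical, and the sample row with both fields null is an assumption mirroring the suggested table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: models FILLNULL's "using" form on one row held as a map. */
final class FillNullUsingSketch {

  /**
   * Returns a copy of {@code row} in which every field listed in
   * {@code replacements} takes its replacement value, but only where the
   * field is currently null. The input row is left unmodified.
   */
  static Map<String, Object> fillNullUsing(
      Map<String, Object> row, Map<String, Object> replacements) {
    Map<String, Object> filled = new LinkedHashMap<>(row);
    for (Map.Entry<String, Object> e : replacements.entrySet()) {
      // Non-null values pass through; fields absent from the row are ignored.
      if (filled.containsKey(e.getKey()) && filled.get(e.getKey()) == null) {
        filled.put(e.getKey(), e.getValue());
      }
    }
    return filled;
  }

  public static void main(String[] args) {
    // Assumed sample row: both email and host are null.
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("email", null);
    row.put("host", null);

    Map<String, Object> replacements = new LinkedHashMap<>();
    replacements.put("email", "<not found>");
    replacements.put("host", "<no host>");

    // Prints {email=<not found>, host=<no host>}
    System.out.println(fillNullUsing(row, replacements));
  }
}
```

With both fields null in the same row, both replacements appear, which is the case the reviewer pointed out was missing from the documented output.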
* Add FILLNULL command in PPL Signed-off-by: Norman Jordan <[email protected]>
Signed-off-by: Norman Jordan <[email protected]>
a1f584f to 240d0d6
Signed-off-by: Norman Jordan <[email protected]>
240d0d6 to 7d17615
@@ -558,6 +559,29 @@ public LogicalPlan visitAD(AD node, AnalysisContext context) {
    return new LogicalAD(child, options);
  }

  /** Build {@link LogicalAD} for fillnull command. */
This should link to LogicalEval instead of LogicalAD.
Updated.
@@ -0,0 +1,67 @@
=============
fillnull
Need to update category.json in doctest to get this to run with the documentation test.
Updated.
    JSONObject result =
        executeQuery(
            String.format(
                "source=%s | fields str2, num0 | fillnull with -1 in num0", TEST_INDEX_CALCS));
I'd show both an unchanged column and a filled column in the test (and in the doctests and the PR description as well).
There is an integration test that updates one field, but not the other (`testFillNullWithOtherField`).
There is a test in the docs as well (the first query in `fillnull.rst`).
The description for this PR has been updated with a sample query.
      }
    ]
  }
}
Add eol please
Updated.
    JSONObject result =
        executeQuery(
            String.format(
                "source=%s | fields str2, num0 | fillnull using num0 = -1", TEST_INDEX_CALCS));
Please add more interesting tests there:
source=%s | fields str2, num0 | fillnull using num0 = num1
source=%s | fields str2, num0 | fillnull using num0 = cos(num1)
Added tests like this.
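To make the requested cases concrete, the sketch below models a replacement that is an expression over other fields of the same row (`num0 = num1`, `num0 = cos(num1)`) rather than a literal. It is only an illustration of the intended semantics under the assumption that the replacement is evaluated against the row being filled; `FillNullExpressionSketch` and `fillNull` are hypothetical names, not part of the plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: FILLNULL where the replacement is computed from the row. */
final class FillNullExpressionSketch {

  /**
   * Returns a copy of {@code row} in which {@code field} is replaced by
   * {@code replacement.apply(row)} when, and only when, it is null.
   */
  static Map<String, Object> fillNull(
      Map<String, Object> row,
      String field,
      Function<Map<String, Object>, Object> replacement) {
    Map<String, Object> filled = new LinkedHashMap<>(row);
    if (filled.containsKey(field) && filled.get(field) == null) {
      filled.put(field, replacement.apply(row));
    }
    return filled;
  }

  public static void main(String[] args) {
    // Assumed sample row: num0 is null, num1 holds 0.0.
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("str2", "one");
    row.put("num0", null);
    row.put("num1", 0.0);

    // Models: fillnull using num0 = num1
    System.out.println(fillNull(row, "num0", r -> r.get("num1")));

    // Models: fillnull using num0 = cos(num1); a null num1 stays null here.
    System.out.println(
        fillNull(
            row,
            "num0",
            r -> r.get("num1") == null ? null : Math.cos((Double) r.get("num1"))));
  }
}
```

Passing the replacement as a function keeps the null check in one place, so the same helper covers literals, field references, and function calls.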
You need to list the new file in the table of contents (see James's PR for an example).
Updated.
@RequiredArgsConstructor
@AllArgsConstructor
public class FillNull extends UnresolvedPlan {
Do you mind adding a doc?
Added a Javadoc comment.
Signed-off-by: Norman Jordan <[email protected]>
d326da9 to a0f9c2c
@normanj-bitquill please let's align the docs with the existing PPL fillnull doc in Spark.
Syntax
============
fillnull "with" <expression> <field> ["," <field> ]...
@normanj-bitquill please review spark's ppl fillnull to make sure we are aligned with the syntax
@@ -35,6 +35,7 @@ NEW_FIELD: 'NEW_FIELD';
KMEANS: 'KMEANS';
AD: 'AD';
ML: 'ML';
FILLNULL: 'FILLNULL';
Add FILLNULL to `keywordsCanBeId` here.
Description
Adds the FILLNULL command for PPL. FILLNULL will replace NULL values in specified fields.
Related Issues
Resolves #3032
Based on this PR for Spark: opensearch-project/opensearch-spark#723
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
An example query using fillnull.