[FEATURE]Add fillnull command to PPL #3032

Open
YANG-DB opened this issue Sep 16, 2024 · 2 comments · May be fixed by #3075
Labels
enhancement New feature or request PPL Piped processing language

Comments

@YANG-DB
Member

YANG-DB commented Sep 16, 2024

Description:
We propose adding a fillnull command to OpenSearch's Piped Processing Language (PPL) to provide a convenient way to handle null or missing values in query results. This feature would be similar to the fillnull command in Splunk's SPL, enhancing PPL's data cleaning and preparation capabilities.

Proposed Functionality:

  1. The 'fillnull' command should allow users to replace null values with a specified value.
  2. It should support filling nulls for specific fields or all fields.
  3. The command should allow different fill values for different fields.
  4. It should support conditional filling based on other field values or expressions.

Example Usage:

... | fillnull value=0

This would replace all null values in all fields with 0.

... | fillnull value=N/A field1, field2

This would replace null values in field1 and field2 with "N/A".
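To make the intended semantics of the first two examples concrete, here is a minimal Python sketch over a list of dict rows. It is only a model of the proposal, not the PPL implementation: the `fillnull` function name, its signature, and the choice to treat a missing key in a named field as null are all assumptions made for illustration.

```python
from typing import Any, Optional, Sequence

Row = dict[str, Any]


def fillnull(rows: Sequence[Row], value: Any,
             fields: Optional[Sequence[str]] = None) -> list[Row]:
    """Hypothetical model of `fillnull value=<value> [field, ...]`.

    With fields=None, every field present in a row is considered.
    Assumption: a named field that is absent from a row counts as null.
    """
    filled_rows = []
    for row in rows:
        targets = list(row.keys()) if fields is None else fields
        filled = dict(row)  # copy, so the input rows are left untouched
        for name in targets:
            if filled.get(name) is None:
                filled[name] = value
        filled_rows.append(filled)
    return filled_rows


rows = [
    {"field1": None, "field2": "a", "field3": None},
    {"field1": 3, "field2": None, "field3": None},
]

# ... | fillnull value=0
all_fields = fillnull(rows, 0)

# ... | fillnull value=N/A field1, field2
some_fields = fillnull(rows, "N/A", ["field1", "field2"])
```

In this sketch `field3` stays null in `some_fields` because it is not listed, which matches the description above.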

... | fillnull field1=0 field2="Unknown" field3=false

This would fill null values in different fields with different values.
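A rough Python model of the per-field form, assuming rows are plain dicts. The `fillnull_per_field` name and the mapping argument are hypothetical; the point is only that each field gets its own replacement and that non-null values such as `0` or `false` must not be overwritten.

```python
from typing import Any, Mapping, Sequence

Row = dict[str, Any]


def fillnull_per_field(rows: Sequence[Row],
                       fill_values: Mapping[str, Any]) -> list[Row]:
    """Hypothetical model of `fillnull field1=<v1> field2=<v2> ...`."""
    filled_rows = []
    for row in rows:
        filled = dict(row)  # copy, so the input rows are left untouched
        for name, replacement in fill_values.items():
            # Compare against None explicitly: 0, "" and False are real values.
            if filled.get(name) is None:
                filled[name] = replacement
        filled_rows.append(filled)
    return filled_rows


rows = [
    {"field1": None, "field2": None, "field3": None, "field4": None},
    {"field1": 0, "field2": "known", "field3": True, "field4": None},
]

# ... | fillnull field1=0 field2="Unknown" field3=false
result = fillnull_per_field(
    rows, {"field1": 0, "field2": "Unknown", "field3": False}
)
```

`field4` is not named in the command, so it remains null in both rows.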

... | eval new_field = if(field1 == "category1", field2, null) | fillnull value=0 new_field

This example uses eval to create a new field (or overwrite an existing one) based on a condition, and then uses fillnull to handle the resulting null values.

...
| eval field1 = if(field1 == "category1", field1, null), field2 = if(field2 == "category2", field2, null)
| fillnull field1=0 field2="Unknown"

This example uses multiple eval expressions to apply a different condition to each field, followed by fillnull with a separate fill value per field.
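The conditional pattern is a two-stage pipeline: eval produces nulls where the condition fails, and fillnull then replaces them. Below is a small Python sketch of the single-field variant (`new_field` derived from `field1` and `field2`). Both helper functions are hypothetical stand-ins for the PPL commands, written only to show the data flow.

```python
from typing import Any, Callable, Sequence

Row = dict[str, Any]


def eval_if(rows: Sequence[Row], target: str,
            condition: Callable[[Row], bool], then_field: str) -> list[Row]:
    """Hypothetical model of `eval target = if(<condition>, then_field, null)`."""
    out = []
    for row in rows:
        new_row = dict(row)
        new_row[target] = row.get(then_field) if condition(row) else None
        out.append(new_row)
    return out


def fillnull(rows: Sequence[Row], value: Any,
             fields: Sequence[str]) -> list[Row]:
    """Hypothetical model of `fillnull value=<value> field, ...`."""
    out = []
    for row in rows:
        new_row = dict(row)
        for name in fields:
            if new_row.get(name) is None:
                new_row[name] = value
        out.append(new_row)
    return out


rows = [
    {"field1": "category1", "field2": 7},
    {"field1": "category2", "field2": 9},
]

# ... | eval new_field = if(field1 == "category1", field2, null)
step1 = eval_if(rows, "new_field",
                lambda r: r.get("field1") == "category1", "field2")

# ... | fillnull value=0 new_field
result = fillnull(step1, 0, ["new_field"])
```

Only the second row fails the condition, so only its `new_field` ends up with the fill value.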


Implementation Considerations:

  1. Ensure compatibility with existing PPL commands and syntax
  2. Optimize performance for large datasets with many null values
  3. Provide clear documentation and examples for users
  4. Consider type-checking or type-conversion for filled values
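For consideration 4, one possible shape of a type check is sketched below in Python. Everything here is an assumption for discussion: the `schema` mapping from field name to Python type, the `validate_fill` helper, and the decision to reject mismatches rather than convert them. The actual command could just as well coerce values instead.

```python
from typing import Any, Mapping


def validate_fill(schema: Mapping[str, type],
                  fill_values: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the fills look compatible.

    Hypothetical helper: `schema` maps field names to expected Python types.
    """
    errors = []
    for name, value in fill_values.items():
        expected = schema.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
        elif isinstance(value, bool) and expected is not bool:
            # bool is a subclass of int in Python, so check it separately.
            errors.append(f"{name}: expected {expected.__name__}, got bool")
        elif not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


schema = {"field1": int, "field2": str, "field3": bool}

ok = validate_fill(schema, {"field1": 0, "field2": "Unknown", "field3": False})
mismatch = validate_fill(schema, {"field1": "N/A"})
```

Under these assumptions, `fillnull value=N/A` applied to a numeric field would be flagged before execution rather than silently producing a mixed-type column.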
@YANG-DB YANG-DB added enhancement New feature or request untriaged PPL Piped processing language labels Sep 16, 2024
@YANG-DB YANG-DB moved this to Todo in PPL Commands Sep 16, 2024
@dblock dblock removed the untriaged label Oct 7, 2024
@dblock
Member

dblock commented Oct 7, 2024

[Catch All Triage - 1, 2, 3, 4]

@YANG-DB YANG-DB moved this from Todo to InReview in PPL Commands Oct 8, 2024
@YANG-DB YANG-DB moved this from InReview to Todo in PPL Commands Oct 9, 2024
@normanj-bitquill
Contributor

I can start development of this.

normanj-bitquill added a commit to Bit-Quill/opensearch-project-sql that referenced this issue Oct 16, 2024
normanj-bitquill added a commit to Bit-Quill/opensearch-project-sql that referenced this issue Oct 16, 2024
* Add FILLNULL command in PPL

Signed-off-by: Norman Jordan <[email protected]>
@normanj-bitquill normanj-bitquill linked a pull request Oct 16, 2024 that will close this issue
@YANG-DB YANG-DB moved this from Todo to In Progress in PPL Commands Oct 17, 2024
normanj-bitquill added a commit to Bit-Quill/opensearch-project-sql that referenced this issue Oct 30, 2024
* Add FILLNULL command in PPL

Signed-off-by: Norman Jordan <[email protected]>
Projects
Status: In Progress
3 participants