[BUG] Row based output incorrect when using satisfies check and assertion with upper bound < 1 #519

Closed
arsenalgunnershubert777 opened this issue Oct 30, 2023 · 3 comments
Labels
bug Something isn't working

Comments


arsenalgunnershubert777 commented Oct 30, 2023

Describe the bug
When using a satisfies check, the row-based output column is unexpectedly affected by the assertion that is passed in. This occurs specifically when the assertion has an upper bound of < 1.

To Reproduce
Steps to reproduce the behavior:

  1. Create a custom check using satisfies, with some SQL column condition.
  2. Pass in an assertion function with bounds where the upper bound is < 1.
  3. Run the check on an input dataframe where some rows pass and some rows fail the column condition.
  4. The row-based output returned by rowLevelResultsAsDataFrame shows every row as false (failed).

Code:

Check(CheckLevel.Error, id.value)
  .satisfies(
    sqlColumnCondition,
    "name",
    (d: Double) => d > 0 && d < 1.0
  )

Output row based dataframe:

+-----+------+------+
|index|values|result|
+-----+------+------+
|    1|  blue| false|
|    2| green| false|
|    3|  blue| false|
|    4|   red| false|
|    5|purple| false|
+-----+------+------+
  5. However, if the assertion is adjusted so that the upper bound is < 1.1 (instead of < 1), the row-based results look correct.

Code:

Check(CheckLevel.Error, id.value)
  .satisfies(
    sqlColumnCondition,
    "name",
    (d: Double) => d > 0 && d < 1.1
  )

Output row based dataframe (this is correct behavior):

+-----+------+------+
|index|values|result|
+-----+------+------+
|    1|  blue|  true|
|    2| green|  true|
|    3|  blue|  true|
|    4|   red| false|
|    5|purple| false|
+-----+------+------+

Expected behavior
The row-based output should show which rows passed and which rows failed based on the columnCondition alone, and shouldn’t be affected by the assertion. It shouldn’t show every row as false when some rows passed the columnCondition. The correct example is the one shown directly above.
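
To make the expectation concrete, here is a minimal plain-Scala sketch (no Spark or Deequ involved) of how the row-level result and the check-level result would relate. The `Row` case class, the object name, and the column condition `values in (blue, green)` are assumptions made for illustration; the real sqlColumnCondition is not shown in this issue.

```scala
object ExpectedRowLevel {
  // Hypothetical stand-in for the five input rows shown in the tables above.
  final case class Row(index: Int, values: String)

  val rows: Seq[Row] = Seq(
    Row(1, "blue"), Row(2, "green"), Row(3, "blue"), Row(4, "red"), Row(5, "purple")
  )

  // Assumed column condition, chosen so that rows 1-3 pass and rows 4-5 fail.
  def columnCondition(r: Row): Boolean = Set("blue", "green").contains(r.values)

  // The same assertion as in the failing example.
  val assertion: Double => Boolean = d => d > 0 && d < 1.0

  // Row-level result: depends on the column condition only.
  val rowResults: Seq[Boolean] = rows.map(columnCondition)

  // Check-level result: the assertion applied to the fraction of passing rows.
  val compliance: Double = rowResults.count(identity).toDouble / rows.size
  val checkPasses: Boolean = assertion(compliance)
}
```

Under these assumptions the row results are true, true, true, false, false, and the assertion is only consulted for the aggregate fraction (3 of 5 rows), never for an individual row.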

Screenshots
N/A

Additional context
This row output issue may be due to a line in constraintResultToColumn in VerificationResult. I'm not sure whether that line is needed for some other functionality.
Also, the overall verification result check status (Success or Error) seems to be working correctly.
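
One reading that is consistent with both outputs above: if the assertion were applied to each row's own pass/fail indicator (1.0 or 0.0) rather than to the aggregate, the two bounds would behave exactly as observed. The sketch below is plain Scala that models this hypothesis only; it is not Deequ's code, and the 1.0/0.0 encoding and all names are assumptions.

```scala
object SuspectedCause {
  // Assumed per-row encoding of the column condition for the five rows above:
  // 1.0 where the condition holds, 0.0 where it does not.
  val rowIndicators: Seq[Double] = Seq(1.0, 1.0, 1.0, 0.0, 0.0)

  // The two assertions from the examples in this issue.
  val upperBoundOne: Double => Boolean = d => d > 0 && d < 1.0
  val upperBoundOnePointOne: Double => Boolean = d => d > 0 && d < 1.1

  // Hypothesis: the assertion is evaluated once per row on the indicator value.
  val withUpperBoundOne: Seq[Boolean] = rowIndicators.map(upperBoundOne)
  val withUpperBoundOnePointOne: Seq[Boolean] = rowIndicators.map(upperBoundOnePointOne)
}
```

With `d < 1.0`, a passing row's 1.0 fails the upper bound and a failing row's 0.0 fails the lower bound, so every row comes out false. With `d < 1.1`, the 1.0 rows pass and the 0.0 rows still fail, which matches the second table.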
Thanks for the help!

arsenalgunnershubert777 added the bug (Something isn't working) label Oct 30, 2023

Sat30 commented Mar 14, 2024

  • The result column's value is not based on the assertion.
    The result column's value depends on sqlCondition.

Author

arsenalgunnershubert777 commented Mar 19, 2024

Hi @Sat30, thanks for the response. Can you clarify what you mean by those bullet points?
Yes, the row-level result should depend on sqlCondition only, but changing the assertionFunction affects the result when it shouldn't.

Contributor

rdsharma26 commented Apr 3, 2024

@arsenalgunnershubert777

Thank you so much for reporting this issue. It has been fixed as part of PR #553.
We will be releasing this to Maven as part of our next release cycle.
