Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix row level bug when composing outcome #594

Merged

Conversation

rdsharma26
Copy link
Contributor

@rdsharma26 rdsharma26 commented Dec 16, 2024

Description of changes:

  • When a check fails due to a precondition failure, the row level results are not evaluated correctly.
  • For example, let's say a check has a completeness constraint which passes, and a minimum constraint which fails due to a precondition failure.
  • The row level results will be the results for just the completeness constraint. There will be no results generated for the minimum constraint, and therefore the row level results will be incorrect.
  • We fix this by adding a default outcome for when the row level result column is not provided by the analyzer.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

- When a check fails due to a precondition failure, the row level results are not evaluated correctly.
- For example, let's say a check has a completeness constraint which passes, and a minimum constraint which fails due to a precondition failure.
- The row level results will be the results for just the completeness constraint. There will be no results generated for the minimum constraint, and therefore the row level results will be incorrect.
- We fix this by adding a default outcome for when the row level result column is not provided by the analyzer.
@rdsharma26 rdsharma26 force-pushed the column-type-failure-row-level-bug branch from 0e8c6b1 to ddc0942 Compare December 16, 2024 22:14
@@ -144,7 +145,7 @@ object VerificationResult {
val constraint = constraintResult.constraint
constraint match {
case asserted: RowLevelAssertedConstraint =>
constraintResult.metric.flatMap(metricToColumn).map(asserted.assertion(_))
constraintResult.metric.flatMap(metricToColumn).map(asserted.assertion(_)).orElse(Some(lit(false)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably need to look at the code again, but could you remind me the difference between RowLevelAssertedConstraint and RowLevelConstraint / why we don't need to add it for the latter? Are RowLevelConstraint for checks like Completeness, Uniqueness where row level results are boolean and RowLevelAssertedConstraint are for checks like Minimum, Sum where we need to compare the row-level value to a given assertion?

@eycho-am
Copy link
Contributor

One clarifying question, but otherwise lgtm

Skipped RowLevelGroupedConstraint because only UniqueValueRatio/Uniqueness use it, and they don't use preconditions.
@rdsharma26
Copy link
Contributor Author

Addressed feedback in followup commit.

@rdsharma26 rdsharma26 merged commit 82537bd into awslabs:master Dec 17, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants