Data validity metrics #206

Open
vandanavk opened this issue May 31, 2024 · 0 comments


Is your feature request related to a problem? Please describe.
The current list of analyzers gives us no way to check data validity, that is, the presence of nulls and zeroes in the data.

Describe the solution you'd like
We would like to be able to determine the percentage of rows that have a null value or a zero value in a particular column.
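
To make the requested metric concrete, here is a minimal plain-Python sketch of the computation. It is not deequ or PyDeequ API: the function name `validity_percentages`, the `null_pct` / `zero_pct` keys, and the choice to represent rows as dictionaries are all assumptions made for illustration. It also assumes that "zero" means a numeric value equal to 0 and that a missing key counts as null.

```python
from typing import Any, Dict, Iterable


def validity_percentages(rows: Iterable[Dict[str, Any]], column: str) -> Dict[str, float]:
    """Return the percentage of rows whose `column` is null and whose `column` is zero.

    Hypothetical helper for illustration only; not part of deequ or PyDeequ.
    """
    total = 0
    nulls = 0
    zeroes = 0
    for row in rows:
        total += 1
        value = row.get(column)  # a missing key is treated as null (assumption)
        if value is None:
            nulls += 1
        elif (
            isinstance(value, (int, float))
            and not isinstance(value, bool)  # do not count False as a zero
            and value == 0
        ):
            zeroes += 1

    if total == 0:
        # No rows: report 0.0 rather than dividing by zero (assumption).
        return {"null_pct": 0.0, "zero_pct": 0.0}

    return {
        "null_pct": 100.0 * nulls / total,
        "zero_pct": 100.0 * zeroes / total,
    }


# Example: one null and two zeroes out of four rows.
sample = [{"x": 1}, {"x": 0}, {"x": None}, {"x": 0.0}]
print(validity_percentages(sample, "x"))  # {'null_pct': 25.0, 'zero_pct': 50.0}
```

An analyzer built into the library would presumably compute the same two ratios in a single pass over a Spark DataFrame column instead of over Python dictionaries.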

Describe alternatives you've considered
We will probably have to implement this in Python in our own fork, but it would be great to have this capability in deequ (Scala).

Additional context
This is similar to Tecton's data quality metrics on resultant feature values.
