Data contracts #2265
karolisg started this conversation in Feature Requests
Current implementation
The current dozer version (v0.3.0) supports dynamic schema generation in protobuf and OpenAPI formats. The schema is generated based on input sources and transformations.
Contracts features
Different articles define data contracts slightly differently, but essentially a data contract needs to settle 4 key agreements between producers and consumers.
To ensure that, the contract needs to provide:
Schema and semantics
Validations (for schema and semantics)
protobuf automatically generates schema-level validations, while for JSON the validation is done on the consumer side, so dozer needs to make sure that the JSON schema contains all possible information (like min and max values for UInt and Int types).
Semantics validations are used for defining business logic and validation of data. They can be differentiated into several categories:
Value integrity - find unacceptable, illogical data
Outliers - find data which doesn't fit the provided condition (usually an average, mean or something similar, plus thresholds)
Referential integrity - find incomplete data. This check is similar to a foreign key between two or more tables.
Event order and state transitions - find inconsistent data. Data might not be sent in the correct order, or data might be missing in a particular state
Validation errors have several handling solutions:
Alerts only - Only alert the user if validation failed
All or none - Make the API unavailable
Ignore failed records
Dead letter queue - Send the record to another system, which would allow the user to fix the data
Users, roles and ACLs
Data contracts should have clear visibility and rules on data access. This can be separated into several aspects:
users - list of users who have access to the data source
roles - list of roles with rules of data access
ACL - list of rules
SLA
This part of the contract is used to define data quality.
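As a rough illustration of the semantic validation categories and the failure-handling options listed under Contracts features, here is a minimal Python sketch. Nothing in it is part of dozer: the record fields (amount, customer_id, order_id, state), the OnFailure enum, the check factories and apply_contract are all hypothetical names chosen for the example, and the thresholds and allowed state transitions are made-up values.

```python
from dataclasses import dataclass, field
from enum import Enum
from statistics import mean


class OnFailure(Enum):
    """Hypothetical names for the four handling solutions."""
    ALERT_ONLY = "alert_only"          # keep the record, only report the errors
    ALL_OR_NONE = "all_or_none"        # make the API unavailable on first failure
    IGNORE = "ignore"                  # drop failed records
    DEAD_LETTER = "dead_letter_queue"  # divert failed records so they can be fixed


# Every check takes a record (a dict) and returns an error message or None.
def value_integrity(rec):
    # Value integrity: a negative amount is illogical.
    return "amount must be >= 0" if rec["amount"] < 0 else None


def make_outlier_check(history, threshold):
    # Outliers: flag amounts further than `threshold` from the historical mean.
    avg = mean(history)

    def check(rec):
        return "amount is an outlier" if abs(rec["amount"] - avg) > threshold else None

    return check


def make_reference_check(known_customers):
    # Referential integrity: customer_id must exist, like a foreign key.
    def check(rec):
        return None if rec["customer_id"] in known_customers else "unknown customer_id"

    return check


def make_transition_check(allowed):
    # Event order and state transitions: remember the last valid state per order.
    last_state = {}

    def check(rec):
        prev = last_state.get(rec["order_id"])
        if rec["state"] not in allowed.get(prev, set()):
            return f"invalid transition {prev} -> {rec['state']}"
        last_state[rec["order_id"]] = rec["state"]
        return None

    return check


@dataclass
class Result:
    accepted: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)
    alerts: list = field(default_factory=list)
    available: bool = True


def apply_contract(records, checks, on_failure):
    result = Result()
    for rec in records:
        errors = [msg for check in checks if (msg := check(rec)) is not None]
        if not errors:
            result.accepted.append(rec)
        elif on_failure is OnFailure.ALERT_ONLY:
            result.alerts.extend(errors)
            result.accepted.append(rec)
        elif on_failure is OnFailure.ALL_OR_NONE:
            result.accepted.clear()
            result.available = False
            break
        elif on_failure is OnFailure.DEAD_LETTER:
            result.dead_letters.append((rec, errors))
        # OnFailure.IGNORE: the failed record is simply dropped.
    return result


def build_checks():
    # The transition check is stateful, so build a fresh set for every run.
    return [
        value_integrity,
        make_outlier_check(history=[90, 100, 110], threshold=50),
        make_reference_check(known_customers={"c1", "c2"}),
        make_transition_check({None: {"created"}, "created": {"paid"}, "paid": {"shipped"}}),
    ]


records = [
    {"order_id": 1, "customer_id": "c1", "amount": 100, "state": "created"},
    {"order_id": 1, "customer_id": "c1", "amount": 110, "state": "paid"},
    {"order_id": 2, "customer_id": "c9", "amount": 95, "state": "created"},  # unknown customer
    {"order_id": 3, "customer_id": "c2", "amount": -5, "state": "created"},  # negative amount
]
result = apply_contract(records, build_checks(), OnFailure.DEAD_LETTER)
```

With the dead letter queue mode, the two valid records end up in result.accepted and the two failing ones in result.dead_letters together with their error messages. Switching the mode to OnFailure.ALL_OR_NONE instead leaves result.available set to False, which corresponds to making the API unavailable.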
Potential improvements
Examples
open data contract standard full example
datacontract
Resources:
https://atlan.com/data-contracts/#how-to-handle-contract-validation-failures
https://medium.com/profitoptics/data-contract-101-568a9adbf9a9
https://sherinthomas.medium.com/data-contracts-what-is-it-and-why-should-you-care-bc19be951ec2
https://mlops.community/an-engineers-guide-to-data-contracts-pt-1/