v1.0.0
We're happy to announce that the Cloud Data Quality (CloudDQ) project has reached its stable v1.0.0 release. This release includes the following changes:
- Officially deprecating support for the CLI flags `--dbt_path` and `--dbt_profiles_dir`. Please migrate to the CLI flags `--gcp_project_id`, `--gcp_bq_dataset_id`, `--gcp_service_account_key_path` (if using exported SA keys), and `--gcp_impersonation_credentials` (if using SA impersonation) instead. If you are still using `--dbt_path` and `--dbt_profiles_dir`, existing pipelines will break, and you are advised to migrate to the native connection configuration flags described above as soon as possible.
- Officially graduating the CLI flags `--enable_experimental_dataplex_gcs_validation` and `--enable_experimental_bigquery_entity_uris` from experimental status. These flags respectively allow validation of Dataplex GCS Assets via BigQuery External Tables, and referencing BigQuery tables directly via entity_uri without first registering them as Dataplex Assets. If used, the experimental flags will not throw an error, but they are redundant and can be removed because their behaviors are now enabled by default.
- Officially making the CLI flag `--target_bigquery_summary_table` a required argument. Users are advised to consume Data Quality summary results only from the target table of their choice, instead of relying on the dq_summary table or any intermediate data stored in the BigQuery dataset specified in `--gcp_bq_dataset_id`. The `--target_bigquery_summary_table` cannot be the same table as the dq_summary table automatically created in the BigQuery dataset specified in `--gcp_bq_dataset_id`.
- Exposes a new CLI flag `--num_threads` for tuning performance. This flag allows increasing the number of concurrent BigQuery jobs used to calculate data quality summary data.
- Exposes a new CLI flag `--intermediate_table_expiration_hours` for tuning the storage expiration of intermediate entity-level data quality summary calculation data.
- Bug fixes to allow more BigQuery data types, such as GEOGRAPHY and RECORD, when using Dataplex and BigQuery entity_uri.
- Bug fixes to allow case-insensitive entity-ids.
- Bug fixes to improve error messages when parsing invalid YAML configurations.
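
To help with the flag migration described above, existing pipeline invocations can be checked before upgrading. The following is a minimal sketch, not part of CloudDQ: the helper names `parse_flags` and `check_invocation` are hypothetical, only the `--flag=value` and bare `--flag` forms are handled, and the dq_summary check assumes the target table is written as `project.dataset.table` with `--gcp_bq_dataset_id` given as a plain dataset ID.

```python
from typing import Dict, List

# Flag names are taken from the release notes above; everything else is illustrative.
REMOVED_FLAGS = {"--dbt_path", "--dbt_profiles_dir"}
REDUNDANT_FLAGS = {
    "--enable_experimental_dataplex_gcs_validation",
    "--enable_experimental_bigquery_entity_uris",
}
REQUIRED_FLAGS = {"--target_bigquery_summary_table"}


def parse_flags(args: List[str]) -> Dict[str, str]:
    """Collect `--flag=value` and bare `--flag` tokens; positional tokens are ignored."""
    flags: Dict[str, str] = {}
    for token in args:
        if token.startswith("--"):
            name, _, value = token.partition("=")
            flags[name] = value
    return flags


def check_invocation(args: List[str]) -> List[str]:
    """Return human-readable migration problems for a pre-v1.0.0 argument list."""
    flags = parse_flags(args)
    present = set(flags)
    problems: List[str] = []

    for name in sorted(REMOVED_FLAGS & present):
        problems.append(f"{name} is no longer supported; use the --gcp_* connection flags")
    for name in sorted(REDUNDANT_FLAGS & present):
        problems.append(f"{name} is redundant and can be removed")
    for name in sorted(REQUIRED_FLAGS - present):
        problems.append(f"{name} is required")

    # Assumption: the target is `project.dataset.table` and the dataset flag is a plain ID.
    target = flags.get("--target_bigquery_summary_table", "")
    dataset = flags.get("--gcp_bq_dataset_id", "")
    if target and dataset and target.split(".")[-2:] == [dataset, "dq_summary"]:
        problems.append(
            "--target_bigquery_summary_table must not be the dq_summary table "
            "in the --gcp_bq_dataset_id dataset"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical legacy invocation; all values are placeholders.
    legacy_args = [
        "ALL",
        "configs",
        "--dbt_path=dbt",
        "--gcp_bq_dataset_id=dq_dataset",
        "--target_bigquery_summary_table=my_project.dq_dataset.dq_summary",
    ]
    for problem in check_invocation(legacy_args):
        print(problem)
```

An empty result only means that none of the changes listed in these notes were detected; it does not validate the rest of the invocation.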
What's Changed
- fixed entity-id-uuid bug by @AmandeepSinghCS in #164
- fixed geography type by @AmandeepSinghCS in #163
- add RECORD type by @thinhha in #165
- added a flag for dbt intermediate table expiration hours by @AmandeepSinghCS in #158
- Dbt threads cli flag by @AmandeepSinghCS in #161
- removed dbt flags and updated dbt runner by @AmandeepSinghCS in #166
- updated user manual for intermediate_table_expiration_hours and num_t… by @AmandeepSinghCS in #168
- Update docs for v1.0.0 by @thinhha in #167
- Target table req arg by @AmandeepSinghCS in #169
- validate configs before loading into cache by @thinhha in #170
Full Changelog: v0.5.3...v1.0.0