Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Adjust transformation deletes to handle cascading deletes (navapbc#103)
Note this is a duplicate of #2000 - just want to pull it into this repo first Updates the transformation code to handle a case where a parent record (ie. opportunity or opportunity_summary) is deleted, AND the child records (everything else) is marked to be deleted as well. Also added a new way to set metrics that handles adding more specific prefixed ones (eg. `total_records_processed` and `opportunity.total_records_processed`) - will expand more on this later. Imagine a scenario an opportunity with a summary (synopsis) and a few applicant types gets deleted. The update process for loading from Oracle will mark all of our staging table records for those as `is_deleted=True`. When we go to process, we'll first process the opportunity, and delete it uneventfully, however we have cascade-deletes setup. This means that all of the children (the opportunity summary, and assistance listing tables among many others) also need to be deleted. SQLAlchemy handles this for us. However, this means when we then start processing the synopsis record that was marked as deleted - we would error and say "I can't delete something that doesn't exist". To work around this, we're okay with these orphan deletes, and we just assume we already took care of it. To further test this, I loaded a subset of the prod data locally (~2500 opportunities, 35k records total). I then marked all of the data is `is_deleted=True, transformed_at=null` and ran it again. It went through the opportunities deleting them. When it got to the other tables, it didn't have to do very much as they all hit the new case. The metrics produced look like: ``` total_records_processed=37002 total_records_deleted=2453 total_delete_orphans_skipped=34549 total_error_count=0 opportunity.total_records_processed=2453 opportunity.total_records_deleted=2453 assistance_listing.total_records_processed=3814 assistance_listing.total_delete_orphans_skipped=3814 opportunity_summary.total_records_processed=3827 opportunity_summary.total_delete_orphans_skipped=3827 applicant_type.total_records_processed=17547 applicant_type.total_delete_orphans_skipped=17547 funding_category.total_records_processed=4947 funding_category.total_delete_orphans_skipped=4947 funding_instrument.total_records_processed=4414 funding_instrument.total_delete_orphans_skipped=4414 ``` And as a sanity check, running again processes nothing. --------- Co-authored-by: nava-platform-bot <[email protected]>
- Loading branch information