Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Commit

Permalink
Adjust transformation deletes to handle cascading deletes (#103)
Browse files Browse the repository at this point in the history
Note this is a duplicate of
HHS#2000 - just want to pull
it into this repo first

Updates the transformation code to handle a case where a parent record
(ie. opportunity or opportunity_summary) is deleted, AND the child
records (everything else) is marked to be deleted as well.

Also added a new way to set metrics that handles adding more specific
prefixed ones (eg. `total_records_processed` and
`opportunity.total_records_processed`) - will expand more on this later.

Imagine a scenario an opportunity with a summary (synopsis) and a few
applicant types gets deleted. The update process for loading from Oracle
will mark all of our staging table records for those as
`is_deleted=True`. When we go to process, we'll first process the
opportunity, and delete it uneventfully, however we have cascade-deletes
setup. This means that all of the children (the opportunity summary, and
assistance listing tables among many others) also need to be deleted.
SQLAlchemy handles this for us.

However, this means when we then start processing the synopsis record
that was marked as deleted - we would error and say "I can't delete
something that doesn't exist". To work around this, we're okay with
these orphan deletes, and we just assume we already took care of it.

To further test this, I loaded a subset of the prod data locally (~2500
opportunities, 35k records total). I then marked all of the data is
`is_deleted=True, transformed_at=null` and ran it again. It went through
the opportunities deleting them. When it got to the other tables, it
didn't have to do very much as they all hit the new case. The metrics
produced look like:

```
total_records_processed=37002
total_records_deleted=2453
total_delete_orphans_skipped=34549
total_error_count=0

opportunity.total_records_processed=2453
opportunity.total_records_deleted=2453

assistance_listing.total_records_processed=3814
assistance_listing.total_delete_orphans_skipped=3814

opportunity_summary.total_records_processed=3827
opportunity_summary.total_delete_orphans_skipped=3827

applicant_type.total_records_processed=17547
applicant_type.total_delete_orphans_skipped=17547

funding_category.total_records_processed=4947
funding_category.total_delete_orphans_skipped=4947

funding_instrument.total_records_processed=4414
funding_instrument.total_delete_orphans_skipped=4414
```
And as a sanity check, running again processes nothing.

---------

Co-authored-by: nava-platform-bot <[email protected]>
  • Loading branch information
2 people authored and acouch committed Sep 18, 2024
1 parent c3dcdec commit 8c16376
Show file tree
Hide file tree
Showing 3 changed files with 345 additions and 139 deletions.
Loading

0 comments on commit 8c16376

Please sign in to comment.