Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[Issue #104] Refactor/restructure the transformation code #112

Merged
merged 52 commits into from
Aug 2, 2024

Conversation

chouinar
Copy link
Collaborator

Summary

Fixes #104

Time to review: 10 mins

Changes proposed

Restructured the transformation code

  • Split each chunk of the transformation logic into separate "Subtask" classes
  • Made a constants file
  • A few duplicated pieces of implementation were pulled into functions that the subtasks are derived from

Some additional logging

Created a Subtask class for breaking up a task into multiple steps for organizational reasons

Added configuration to the transformation step for enabling/disabling different parts of the process (not used yet - but lets us build things out and not worry about breaking non-local environments).

Context for reviewers

This looks far larger than it actually is, most of the actual changes are very small, I made all of the changes without adjusting the tests (outside of a few small bits of cleanup) and then refactored the tests as well. This does not aim to change the meaningful behavior of the transformation logic, but instead make it a lot easier to parse. Now when we add new transformations, that's conceptually simpler as its adding another one of the subtasks rather than adding to the massive mess of functions it was before.

There are a few small logging / metric related changes from the Subtask just so we can have very granular metrics of how long each part of the task takes.

Additional information

I ran locally with a full snapshot of the production data and didn't see anything of note different from prior runs. Still takes ~10 minutes.

Base automatically changed from chouinar/1983-cascade-deletes to main July 8, 2024 15:27
@chouinar chouinar merged commit f89158d into main Aug 2, 2024
8 checks passed
@chouinar chouinar deleted the chouinar/104-cleanup-transformations branch August 2, 2024 15:29
acouch pushed a commit that referenced this pull request Sep 18, 2024
Fixes HHS#2054

Restructured the transformation code
- Split each chunk of the transformation logic into separate "Subtask"
classes
- Made a constants file
- A few duplicated pieces of implementation were pulled into functions
that the subtasks are derived from

Some additional logging

Created a Subtask class for breaking up a task into multiple steps for
organizational reasons

Added configuration to the transformation step for enabling/disabling
different parts of the process (not used yet - but lets us build things
out and not worry about breaking non-local environments).

This looks far larger than it actually is, most of the actual changes
are very small, I made all of the changes without adjusting the tests
(outside of a few small bits of cleanup) and then refactored the tests
as well. This does not aim to change the meaningful behavior of the
transformation logic, but instead make it a lot easier to parse. Now
when we add new transformations, that's conceptually simpler as its
adding another one of the subtasks rather than adding to the massive
mess of functions it was before.

There are a few small logging / metric related changes from the Subtask
just so we can have very granular metrics of how long each part of the
task takes.

I ran locally with a full snapshot of the production data and didn't see
anything of note different from prior runs. Still takes ~10 minutes.

---------

Co-authored-by: nava-platform-bot <[email protected]>
acouch pushed a commit that referenced this pull request Sep 18, 2024
Fixes HHS#2054

Restructured the transformation code
- Split each chunk of the transformation logic into separate "Subtask"
classes
- Made a constants file
- A few duplicated pieces of implementation were pulled into functions
that the subtasks are derived from

Some additional logging

Created a Subtask class for breaking up a task into multiple steps for
organizational reasons

Added configuration to the transformation step for enabling/disabling
different parts of the process (not used yet - but lets us build things
out and not worry about breaking non-local environments).

This looks far larger than it actually is, most of the actual changes
are very small, I made all of the changes without adjusting the tests
(outside of a few small bits of cleanup) and then refactored the tests
as well. This does not aim to change the meaningful behavior of the
transformation logic, but instead make it a lot easier to parse. Now
when we add new transformations, that's conceptually simpler as its
adding another one of the subtasks rather than adding to the massive
mess of functions it was before.

There are a few small logging / metric related changes from the Subtask
just so we can have very granular metrics of how long each part of the
task takes.

I ran locally with a full snapshot of the production data and didn't see
anything of note different from prior runs. Still takes ~10 minutes.

---------

Co-authored-by: nava-platform-bot <[email protected]>
acouch pushed a commit to HHS/simpler-grants-gov that referenced this pull request Sep 18, 2024
Fixes #2054

Restructured the transformation code
- Split each chunk of the transformation logic into separate "Subtask"
classes
- Made a constants file
- A few duplicated pieces of implementation were pulled into functions
that the subtasks are derived from

Some additional logging

Created a Subtask class for breaking up a task into multiple steps for
organizational reasons

Added configuration to the transformation step for enabling/disabling
different parts of the process (not used yet - but lets us build things
out and not worry about breaking non-local environments).

This looks far larger than it actually is, most of the actual changes
are very small, I made all of the changes without adjusting the tests
(outside of a few small bits of cleanup) and then refactored the tests
as well. This does not aim to change the meaningful behavior of the
transformation logic, but instead make it a lot easier to parse. Now
when we add new transformations, that's conceptually simpler as its
adding another one of the subtasks rather than adding to the massive
mess of functions it was before.

There are a few small logging / metric related changes from the Subtask
just so we can have very granular metrics of how long each part of the
task takes.

I ran locally with a full snapshot of the production data and didn't see
anything of note different from prior runs. Still takes ~10 minutes.

---------

Co-authored-by: nava-platform-bot <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Cleanup / refactor transformation work to be less boilerplatey
3 participants