Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write to Both DynamoDB and RDS #993

Merged
merged 86 commits into from
Dec 6, 2024
Merged

Write to Both DynamoDB and RDS #993

merged 86 commits into from
Dec 6, 2024

Conversation

Douglas-Hong
Copy link
Contributor

@Douglas-Hong Douglas-Hong commented Jun 3, 2024

Summary

  • Make postgres DB on RDS with Drizzle ORM.
  • Write to both DDB and RDS as first part of migration.
  • Updates patch notes to inform about migration and feedback form.

Migration Procedures

  • Start saving to both DBs.
  • Copy to new DB, while still writing to both.
    • This needs to be done while few people are using AA. I can set up a chron job to do this at an ungodly hour.
    • Then I'll check in the morning to see if it worked.
  • Switch read to new DB.
  • Remove old DB.

Action Items

  • Auth/user relations.
  • Schedules/courses relations.
  • Custom events relations.
  • Make APIs for new DB.
    • Read/write changes happen here. Don't change API.
  • Migration script(s).
    • NOTE: This needs to mangle duplicate schedule names because the (schedule, username) pair is unique.
    • Since we're going to start writing to both first, the copy should ignore schedules that already exist in RDS
  • Migration procedures
  • Provision dev and prod databases.
    • Just one dev database for non-database PRs.
    • PRs that change the DB just dev locally or will be manually provisioned.
  • Check migration and dual-writing locally.
  • CI/CD pipelines (ci: pipeline for RDS migration #1029)
    • Only migrate when merging into main?
  • Batch copy. Each batch takes 5 minutes, which is good enough.
  • Read from RDS in different branch (Read from RDS #1060).
  • Execute migration

Test Plan

  • Save a schedule.
  • See that its data (schedule, courses, custom events) is in pgAdmin or Drizzle Studio.
  • Do the same in deployment.

Related Issues

Future Followup

  • Google auth #1027
  • Only update changed schedule(s) instead of all of them.

@Douglas-Hong Douglas-Hong added the no deploy skip staging label Jun 3, 2024
@github-actions github-actions bot added the Stale label Jul 4, 2024
@MinhxNguyen7 MinhxNguyen7 self-assigned this Oct 24, 2024
@MinhxNguyen7
Copy link
Member

I'm going to look at and finish up this PR

@github-actions github-actions bot removed the Stale label Oct 25, 2024
@MinhxNguyen7
Copy link
Member

Now that I'm actually writing the migration scripts, I've realized that this is more work than I gave it credit for.

I'm going to try to get this done in the next week or so, but I don't like my chances

@MinhxNguyen7
Copy link
Member

@ecxyzzy I'm gonna need to work with you on this. We've been on this rodeo together, so it's not too crazy, but we need to review the plan, review the migration script, and help with the deployment.

@MinhxNguyen7
Copy link
Member

MinhxNguyen7 commented Dec 2, 2024

The staging instance isn't working, reporting "DB_URL must be defined" despite #1058 adding the dev DB to the staging deploy.
image
Works locally as well.

@MinhxNguyen7
Copy link
Member

image
The env is hardcoded in 2 places
Why do you do this to me @ap0nia?

@ap0nia
Copy link
Collaborator

ap0nia commented Dec 2, 2024

I really disliked that too... As far as I can tell, there isn't a good way to fix this because the declarations are used in completely unrelated parts of the deployment pipeline.

The environment variables in the GitHub workflow manually load all the repository secrets into the current environment prior to any scripts running, whereas the CDK script validates environment variables when it actually runs. Neither of these steps can be done in the other one...

A possible strategy might be to upload a base64 encoded env file and decode it to get all the environment variables. This makes the repository secrets more opaque and the deployment scripts more complicated...

@MinhxNguyen7
Copy link
Member

@ecxyzzy thanks a ton.

Do you mind talking about what the problem was?

@ecxyzzy
Copy link
Member

ecxyzzy commented Dec 2, 2024

I think the main issue was that at some point USERDATA_TABLE_NAME stopped getting injected into the Lambda env, since CDK is responsible for that. Other fixes I did included separating the environment variables needed by the backend at runtime and the environment variables needed at build time.

@MinhxNguyen7
Copy link
Member

MinhxNguyen7 commented Dec 3, 2024

Everything works locally, but, in deployment, writes to the DB are timing out. Seems like there is no regression, though.
image

@MinhxNguyen7
Copy link
Member

Above problem has been fixed. Apparently the problem was the lambda dies after the response is sent, and we don't await the RDS write.

@MinhxNguyen7 MinhxNguyen7 marked this pull request as ready for review December 6, 2024 03:14
@github-actions github-actions bot requested a review from KevinWu098 December 6, 2024 03:14
@MinhxNguyen7 MinhxNguyen7 removed the request for review from KevinWu098 December 6, 2024 03:18
@MinhxNguyen7 MinhxNguyen7 merged commit d264887 into main Dec 6, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants