Replace custom ODK submission media upload with official external storage (S3) #1894

Merged: spwoodcock merged 13 commits into development from feat/central-s3-media on Feb 18, 2025

spwoodcock
Member

@spwoodcock spwoodcock commented Nov 18, 2024

What type of PR is this? (check all applicable)

  • 🍕 Feature
  • 🐛 Bug Fix
  • 📝 Documentation
  • 🧑‍💻 Refactor
  • ✅ Test
  • 🤖 Build or CI
  • ❓ Other (please specify)

Related Issue

Related to #1875 and #1701; will impact #1706.

Describe this PR

  • Configure Central so submission media automatically syncs to S3 (24hr schedule).
  • Optionally trigger the sync on a more frequent interval (perhaps whenever submissions are requested).
    • Updated the crontab on the bundled ODK instances to do this every 15 minutes instead of 24hrs.
  • Implement some additional async functions in osm-fieldwork OdkCentralAsync to handle getting all pre-signed URLs for media attached to a submission.
  • Replace custom logic in backend API with logic from osm-fieldwork to get the submission media pre-signed URLs via S3.
  • Return the pre-signed URLs to the frontend for display.
  • Deploy ODK Central on prod with env vars and S3 bucket set up
  • Remove logic from FMTM that handled upload to the fmtm-data S3 bucket (replaced by ODK logic)
  • Remove photo uploads added to the fmtm-data bucket on prod (edit: this wasn't necessary)
  • Log onto servers and trigger ODK S3 upload manually (plus restart failures)
    • Connect to container: docker exec -it fmtm-central-1
    • Upload files: node lib/bin/s3.js upload-pending
    • Reset failed uploads: node lib/bin/s3.js reset-failed-to-pending
    • Dev server
    • Stage server
    • Prod server
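
The manual upload steps in the list above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: it assumes the container name `fmtm-central-1` from the description and that `node lib/bin/s3.js` resolves from the container's default working directory. The helper names `s3_command` and `run_s3_command` are hypothetical.

```python
import subprocess

# Container name taken from the PR description; adjust per server (assumption).
CONTAINER = "fmtm-central-1"

# The two s3.js subcommands listed in the PR description.
S3_SUBCOMMANDS = ("upload-pending", "reset-failed-to-pending")


def s3_command(subcommand: str, container: str = CONTAINER) -> list[str]:
    """Build the `docker exec` argv for one s3.js subcommand (hypothetical helper)."""
    if subcommand not in S3_SUBCOMMANDS:
        raise ValueError(f"unsupported s3.js subcommand: {subcommand}")
    return ["docker", "exec", container, "node", "lib/bin/s3.js", subcommand]


def run_s3_command(subcommand: str, container: str = CONTAINER) -> str:
    """Run the command and return its stdout; raises if docker exits non-zero."""
    result = subprocess.run(
        s3_command(subcommand, container),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `run_s3_command("reset-failed-to-pending")` followed by `run_s3_command("upload-pending")` would mirror the order of a retry after failures; building the argv as a list avoids shell quoting issues.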

Review Guide

Notes for the reviewer. How to test this change?

Checklist before requesting a review

[optional] What gif best describes this PR or how it makes you feel?

@spwoodcock spwoodcock added the ODK (Any requests for optimizing ODK) and dependency:osm-fieldwork (Requires updates in osm-fieldwork) labels Nov 18, 2024
@spwoodcock spwoodcock requested a review from Sujanadh November 18, 2024 02:50
@github-actions github-actions bot added the docs (Improvements or additions to documentation), enhancement (New feature or request), backend (Related to backend code), devops (Related to deployment or configuration), and contrib (External contributions, or not related to core functionality) labels Nov 18, 2024
@spwoodcock spwoodcock force-pushed the feat/central-s3-media branch from ac99aa0 to df2d5e4 Compare November 21, 2024 12:09
@spwoodcock
Member Author

spwoodcock commented Jan 10, 2025

Work in progress on this osm-fieldwork branch https://github.com/hotosm/osm-fieldwork/tree/feat/submission-attachment-urls

  • Finish writing tests for osm-fieldwork functionality.
  • Publish a new osm-fieldwork release.

Here we need to be able to get all pre-signed URLs for submission photos from ODK Central.

Then we can integrate this into FMTM and replace the existing custom code that uploads to S3.

@spwoodcock
Member Author

spwoodcock commented Feb 17, 2025

Should be working now after latest pushes 🙏

Let's merge and test this on dev!

The tests fail as we need to build new ODK images (latest version), but to do that we need to merge first 😅

@Sujanadh
Collaborator

I will test and merge then

@Sujanadh
Collaborator

I am getting image contents (bytes) instead of a URL.

@Sujanadh
Collaborator

There is another issue as well: I can't create a project. Publishing the form fails with:
Multiple Form Fields cannot be saved to a single property
I think this is because we are saving two fields' values to the same property (save_to: geometry), the existing feature and new_feature (from your recent change), and ODK doesn't allow that.

@spwoodcock
Member Author

> I am getting image contents (bytes) instead of a URL.

But have the images actually been uploaded to S3?

See the PR description for the command that needs to be run in ODK first.

Although it would probably be nice to return an informative error from the osm-fieldwork method. Or do something with the blob if the photo isn't transferred to S3 yet
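
One way to surface that case is to inspect the attachment response and separate a redirect to a pre-signed URL from raw file bytes. The sketch below is written under assumptions and is not the osm-fieldwork implementation: it assumes Central answers with a 3xx redirect carrying the pre-signed URL in the `Location` header once a blob is in S3, and with the file bytes (HTTP 200) otherwise. `AttachmentResult` and `classify_attachment_response` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AttachmentResult:
    """Either a pre-signed URL (blob in S3) or raw bytes (blob still in the DB)."""

    url: Optional[str] = None
    content: Optional[bytes] = None

    @property
    def in_s3(self) -> bool:
        return self.url is not None


def classify_attachment_response(
    status: int, headers: Mapping[str, str], body: bytes
) -> AttachmentResult:
    """Classify a non-followed attachment response (hypothetical helper)."""
    # Header names are case-insensitive, so normalise before lookup.
    normalised = {key.lower(): value for key, value in headers.items()}
    if 300 <= status < 400 and "location" in normalised:
        return AttachmentResult(url=normalised["location"])
    if status == 200:
        return AttachmentResult(content=body)
    raise ValueError(f"unexpected attachment response status: {status}")
```

A caller could then return `result.url` when `result.in_s3` is true, and otherwise raise an informative error or fall back to serving `result.content`, which are the two options mentioned above.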

@spwoodcock
Member Author

spwoodcock commented Feb 18, 2025

> There is another issue as well: I can't create a project. Publishing the form fails with:
> Multiple Form Fields cannot be saved to a single property
> I think this is because we are saving two fields' values to the same property (save_to: geometry), the existing feature and new_feature (from your recent change), and ODK doesn't allow that.

Please raise as a separate issue 🙏

An interesting one - I managed to publish forms after that change, which suggests it's extra validation introduced in the Central upgrade.

Easy to remove that save_to though

@Sujanadh
Collaborator

>   • node lib/bin/s3.js reset-failed-to-pending

So I need to run that command to upload images after a submission? I did that:

Uploading 0 blobs...
[2025-02-18T08:24:57.269Z] Upload completed.

@Sujanadh
Collaborator

> > There is another issue as well: I can't create a project. Publishing the form fails with:
> > Multiple Form Fields cannot be saved to a single property
> > I think this is because we are saving two fields' values to the same property (save_to: geometry), the existing feature and new_feature (from your recent change), and ODK doesn't allow that.
>
> Please raise as a separate issue 🙏
>
> An interesting one - I managed to publish forms after that change, which suggests it's extra validation introduced in the Central upgrade.
>
> Easy to remove that save_to though

Yes, it might be new validation; we can remove that.

@spwoodcock
Member Author

>   • node lib/bin/s3.js reset-failed-to-pending
>
> So I need to run that command to upload images after a submission? I did that:
>
> Uploading 0 blobs...
> [2025-02-18T08:24:57.269Z] Upload completed.

Hmm that's strange. Could be a few reasons. More info here: https://docs.getodk.org/central-install-digital-ocean/#using-s3-compatible-storage

Perhaps already uploaded? You can check with
node lib/bin/s3.js count-blobs uploaded
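
If the upload is scripted, the number of blobs it attempted can be read back from the command output. A small sketch, assuming the log line format quoted in this thread (`Uploading 0 blobs...`); `blobs_attempted` is a hypothetical helper, and the format may differ between Central versions.

```python
import re
from typing import Optional

# Matches the progress line printed by the upload command, as quoted in this thread.
UPLOAD_LINE = re.compile(r"Uploading (\d+) blobs")


def blobs_attempted(output: str) -> Optional[int]:
    """Return the blob count from the upload output, or None if the line is absent."""
    match = UPLOAD_LINE.search(output)
    return int(match.group(1)) if match else None
```

A result of 0 only says that nothing was picked up for upload; it does not by itself distinguish "already uploaded" from "never marked as pending", which is why the count-blobs check above is still useful.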

@spwoodcock
Member Author

spwoodcock commented Feb 18, 2025

Where are you testing this?

On the dev Central instance?

This sort of PR needs to be tested on the local ODK instance, as it includes changes to Central that won't be present on dev until this is merged.

The error you posted suggests the server doesn't have an S3 bucket configured to upload correctly (which makes sense)

@Sujanadh
Collaborator

> node lib/bin/s3.js count-blobs uploaded

No, I tested on my local ODK instance.
I checked the bucket as well; fmtm-odk-media is present there.

@spwoodcock
Member Author

> > node lib/bin/s3.js count-blobs uploaded
>
> No, I tested on my local ODK instance.
> I checked the bucket as well; fmtm-odk-media is present there.

Extra strange then!

  • The submission included a photo
  • The photo is verified to be present in ODK (downloaded submission zip)
  • The file blob was downloaded via API too
  • ODK containers are rebuilt and up to date (including pg_rowlocks extension)
  • Command to check pending uploads to S3 shows no files?
  • ODK DB inspected for blobs table. Blob is present and status is not 'uploaded'

@Sujanadh
Collaborator

Sujanadh commented Feb 18, 2025

Sorry, I may not have understood what you meant by "Dev ODK instance". Do you mean the dev server's ODK Central docker service, or the dev ODK configuration used during project creation (odk.dev.hotosm.org)?

@spwoodcock
Member Author

> Sorry, I may not have understood what you meant by "Dev ODK instance". Do you mean the dev server's ODK Central docker service, or the dev ODK configuration used during project creation (odk.dev.hotosm.org)?

Dev = odk.dev.fmtm.hotosm.org (remote)

Local dev (or just local) = odk.fmtm.localhost (on your machine)

The remote dev instance isn't configured for S3 until this PR is merged.

Testing on local should work though 😃

@Sujanadh
Collaborator

> > Sorry, I may not have understood what you meant by "Dev ODK instance". Do you mean the dev server's ODK Central docker service, or the dev ODK configuration used during project creation (odk.dev.hotosm.org)?
>
> Dev = odk.dev.fmtm.hotosm.org (remote)
>
> Local dev (or just local) = odk.fmtm.localhost (on your machine)
>
> The remote dev instance isn't configured for S3 until this PR is merged.
>
> Testing on local should work though 😃

Ah, I was using the odk.dev server then; I will test using the local instance.

@spwoodcock
Member Author

spwoodcock commented Feb 18, 2025

It's trickier to do that - I was considering arranging a call tomorrow to discuss using the ODK tunnel config I set up (for submissions to local ODK) 👍

For now, we may as well merge this and test it on the remote dev (easier / will save you time)

@Sujanadh
Collaborator

Yeah, let's merge this; then it will be easier to test on dev.

@spwoodcock spwoodcock merged commit 898e8b6 into development Feb 18, 2025
8 of 9 checks passed
@spwoodcock spwoodcock deleted the feat/central-s3-media branch February 18, 2025 09:27
Labels
backend (Related to backend code); contrib (External contributions, or not related to core functionality); dependency:osm-fieldwork (Requires updates in osm-fieldwork); devops (Related to deployment or configuration); docs (Improvements or additions to documentation); enhancement (New feature or request); frontend (Related to frontend code); ODK (Any requests for optimizing ODK)
3 participants