Bug/memory loss #126

jonaraphael · 2024-11-20T18:35:52Z

Feel free to merge MAIN back into this branch before running your review.

github-actions · 2024-11-26T04:41:18Z

🍹 `preview` on cerulean-cloud-images/test

Pulumi report

Previewing update (test):
@ previewing update........

@ previewing update...........
pulumi:pulumi:Stack cerulean-cloud-images-test running
@ previewing update....
gcp:container:Registry cerulean-cloud-images-test-registry
@ previewing update.....
docker:index:Image cerulean-cloud-images-test-cr-orchestrator-image Building your image for linux/amd64 architecture.
~ docker:index:Image cerulean-cloud-images-test-cr-orchestrator-image update [diff: ~build]; Building your image for linux/amd64 architecture.
docker:index:Image cerulean-cloud-images-test-cr-offset-tile-image Building your image for linux/amd64 architecture.
~ docker:index:Image cerulean-cloud-images-test-cr-offset-tile-image update [diff: ~build]; Building your image for linux/amd64 architecture.
docker:index:Image cerulean-cloud-images-test-cr-tipg-image Building your image for linux/amd64 architecture.
~ docker:index:Image cerulean-cloud-images-test-cr-tipg-image update [diff: ~build]; Building your image for linux/amd64 architecture.
pulumi:pulumi:Stack cerulean-cloud-images-test
Diagnostics:
docker:index:Image (cerulean-cloud-images-test-cr-tipg-image):
Building your image for linux/amd64 architecture.
To ensure you are building for the correct platform, consider explicitly setting the `platform` field on ImageBuildOptions.

docker:index:Image (cerulean-cloud-images-test-cr-orchestrator-image):
Building your image for linux/amd64 architecture.
To ensure you are building for the correct platform, consider explicitly setting the `platform` field on ImageBuildOptions.

docker:index:Image (cerulean-cloud-images-test-cr-offset-tile-image):
Building your image for linux/amd64 architecture.
To ensure you are building for the correct platform, consider explicitly setting the `platform` field on ImageBuildOptions.

Resources:
~ 3 to update
2 unchanged

github-actions · 2024-11-26T04:43:29Z

🍹 `preview` on cerulean-cloud/test

Pulumi report

   Previewing update (test):

@ previewing update.....
   pulumi:pulumi:Stack cerulean-cloud-test running 
   pulumi:providers:docker cerulean-cloud-images-test-gcr  
@ previewing update....
   gcp:storage:Bucket cerulean-cloud-test-bucket-cf-ais  
   gcp:sql:DatabaseInstance cerulean-cloud-test-database-instance  
   gcp:cloudtasks:Queue cerulean-cloud-test-queue-cloud-tasks-ais-analysis  
   gcp:serviceaccount:Account cerulean-cloud-test-cf-ais  
-- gcp:storage:BucketObject cerulean-cloud-test-source-cf-ais delete original 
+- gcp:storage:BucketObject cerulean-cloud-test-source-cf-ais replace [diff: ~detectMd5hash,name,source]
++ gcp:storage:BucketObject cerulean-cloud-test-source-cf-ais create replacement [diff: ~detectMd5hash,name,source]
   gcp:sql:User cerulean-cloud-test-database-users  
   gcp:projects:IAMMember cerulean-cloud-test-cf-ais-iam  
   gcp:sql:Database cerulean-cloud-test-database  
~  gcp:cloudfunctions:Function cerulean-cloud-test-cf-ais update [diff: ~environmentVariables,secretEnvironmentVariables,sourceArchiveObject]
   gcp:cloudfunctions:FunctionIamMember cerulean-cloud-test-cf-ais-invoker  
@ previewing update........
   gcp:serviceaccount:Account cerulean-cloud-test-cr-offset-tile  
   docker:index:RemoteImage cerulean-cloud-images-test-remote-offset  
   docker:index:RemoteImage cerulean-cloud-images-test-remote-orchestrator  
   docker:index:RemoteImage cerulean-cloud-images-test-remote-tipg  
   gcp:projects:IAMMember cerulean-cloud-test-cr-offset-tile-cloudSqlClient  
   gcp:projects:IAMMember cerulean-cloud-test-cr-offset-tile-secretmanagerSecretAccessor  
   gcp:secretmanager:SecretIamMember cerulean-cloud-test-cr-offset-tile-secret-accessor-binding  
   pulumi:pulumi:Stack cerulean-cloud-test running warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
~  gcp:cloudrun:Service cerulean-cloud-test-cr-offset-tiles update [diff: ~metadata,template]
   gcp:cloudrun:IamPolicy cerulean-cloud-test-cr-noauth-iam-policy-offset  
@ previewing update.....
   aws:iam:Role cerulean-cloud-test-lambda-titiler-role  
   aws:s3:Bucket cerulean-cloud-test-titiler-lambda-archive  
   aws:iam:Policy cerulean-cloud-test-lambda-titiler-policy  
   aws:sns:Topic cerulean-cloud-test-lambda-APIAbuseAlert  
   aws:apigatewayv2:Api cerulean-cloud-test-lambda-titiler-api  
   gcp:storage:Bucket cerulean-cloud-test-bucket-cf-sr  
   gcp:serviceaccount:Account cerulean-cloud-test-cr-orchestrator  
   gcp:cloudtasks:Queue cerulean-cloud-test-queue-cr-orchestrator  
   gcp:serviceaccount:Account cerulean-cloud-test-cf-sr  
   gcp:serviceaccount:Account cerulean-cloud-test-cr-tipg  
   aws:iam:Role cerulean-cloud-test-lambda-sentinel1-iam  
   aws:iam:RolePolicyAttachment cerulean-cloud-test-lambda-titiler-attachment2  
   aws:iam:RolePolicyAttachment cerulean-cloud-test-lambda-titiler-attachment  
   aws:sns:TopicSubscription cerulean-cloud-test-lambda-titiler-email-support  
   aws:sns:TopicSubscription cerulean-cloud-test-lambda-titiler-email-jona  
   aws:sns:TopicSubscription cerulean-cloud-test-lambda-titiler-email-aemon  
   aws:sns:TopicSubscription cerulean-cloud-test-lambda-titiler-email-jason  
-- gcp:storage:BucketObject cerulean-cloud-test-source-cf-sr delete original 
+- gcp:storage:BucketObject cerulean-cloud-test-source-cf-sr replace [diff: ~detectMd5hash,name,source]
++ gcp:storage:BucketObject cerulean-cloud-test-source-cf-sr create replacement [diff: ~detectMd5hash,name,source]
-- gcp:storage:BucketObject cerulean-cloud-test-source-cf-historical-run delete original 
+- gcp:storage:BucketObject cerulean-cloud-test-source-cf-historical-run replace [diff: ~detectMd5hash,name,source]
++ gcp:storage:BucketObject cerulean-cloud-test-source-cf-historical-run create replacement [diff: ~detectMd5hash,name,source]
   gcp:projects:IAMMember cerulean-cloud-test-cr-orchestrator-cloudTasksEnqueuer  
   gcp:projects:IAMMember cerulean-cloud-test-cr-orchestrator-cloudSqlClient  
   gcp:projects:IAMMember cerulean-cloud-test-cr-orchestrator-secretmanagerSecretAccessor  
   gcp:projects:IAMMember cerulean-cloud-test-cf-sr-iam  
   aws:iam:RolePolicyAttachment cerulean-cloud-test-lambda-sentinel1-basic-execution  
   gcp:projects:IAMMember cerulean-cloud-test-cr-tipg-cloudSqlClient  
   gcp:secretmanager:SecretIamMember cerulean-cloud-test-cr-orchestrator-secret-accessor-binding  
   gcp:projects:IAMMember cerulean-cloud-test-cr-tipg-secretmanagerSecretAccessor  
   gcp:secretmanager:SecretIamMember cerulean-cloud-test-cr-tipg-secret-accessor-binding  
   pulumi:pulumi:Stack cerulean-cloud-test running warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   pulumi:pulumi:Stack cerulean-cloud-test running warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
@ previewing update....
~  gcp:cloudrun:Service cerulean-cloud-test-cr-tipg update [diff: ~metadata]
~  gcp:cloudrun:Service cerulean-cloud-test-cr-orchestrator update [diff: ~metadata,template]
   gcp:cloudrun:IamPolicy cerulean-cloud-test-cr-noauth-iam-policy-tipg  
   gcp:cloudrun:IamPolicy cerulean-cloud-test-cr-noauth-iam-policy-orchestrator  
~  gcp:cloudfunctions:Function cerulean-cloud-test-cf-sr update [diff: ~secretEnvironmentVariables,sourceArchiveObject]
~  gcp:cloudfunctions:Function cerulean-cloud-test-cf-historical-run update [diff: ~secretEnvironmentVariables,sourceArchiveObject]
   aws:lambda:Function cerulean-cloud-test-lambda-sentinel1-sub  
   gcp:cloudfunctions:FunctionIamMember cerulean-cloud-test-cf-sr-invoker  
   gcp:cloudfunctions:FunctionIamMember cerulean-cloud-test-cf-historical-run-invoker  
   aws:sns:TopicSubscription cerulean-cloud-test-sentinel1-subscription  
   aws:lambda:Permission cerulean-cloud-test-lambda-sentinel1-permission  
@ previewing update.........................................................................................................................
~  aws:s3:BucketObject cerulean-cloud-test-titiler-lambda-archive update [diff: ~source]
~  aws:lambda:Function cerulean-cloud-test-lambda-titiler-sentinel update [diff: ~sourceCodeHash]
   aws:apigatewayv2:Integration cerulean-cloud-test-lambda-titiler-integration  
   aws:cloudwatch:MetricAlarm cerulean-cloud-test-lambda-titiler-alarm  
   aws:lambda:Permission cerulean-cloud-test-lambda-titiler-permission  
   aws:apigatewayv2:Route cerulean-cloud-test-lambda-titiler-route  
   aws:apigatewayv2:Stage cerulean-cloud-test-lambda-titiler-stage  
   pulumi:pulumi:Stack cerulean-cloud-test running Creating lambda package in [/home/runner/work/cerulean-cloud/cerulean-cloud] [running in Docker]...
   pulumi:pulumi:Stack cerulean-cloud-test running Checking Docker is available...
   pulumi:pulumi:Stack cerulean-cloud-test running Building container image...
   pulumi:pulumi:Stack cerulean-cloud-test running Sucessfully built container image with id sha256:23f342d5ac8f51d78eb60bb3a56e5f45f410e510621149eebe2876e0828d2516
   pulumi:pulumi:Stack cerulean-cloud-test running Creating installation package.zip ...
   pulumi:pulumi:Stack cerulean-cloud-test running Sucessfully created package.zip at /home/runner/work/cerulean-cloud/cerulean-cloud/package.zip
@ previewing update....
   pulumi:pulumi:Stack cerulean-cloud-test  9 warnings; 6 messages
Diagnostics:
 pulumi:pulumi:Stack (cerulean-cloud-test):
   warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: serving_state is deprecated: `serving_state` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: env_froms is deprecated: `env_from` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.
   warning: working_dir is deprecated: `working_dir` is deprecated and will be removed in a future major release. This field is not supported by the Cloud Run API.

   Creating lambda package in [/home/runner/work/cerulean-cloud/cerulean-cloud] [running in Docker]...
   Checking Docker is available...
   Building container image...
   Sucessfully built container image with id sha256:23f342d5ac8f51d78eb60bb3a56e5f45f410e510621149eebe2876e0828d2516
   Creating installation package.zip ...
   Sucessfully created package.zip at /home/runner/work/cerulean-cloud/cerulean-cloud/package.zip

Resources:
   ~ 8 to update
   +-3 to replace
   11 changes. 56 unchanged

sstill88 · 2024-12-11T18:30:16Z

cerulean_cloud/cloud_run_orchestrator/handler.py

+            features=tileset_fc_list, min_overlaps_to_keep=1
+        )
+
+        # Stitch inferences


Is this stitching and ensembling done twice or is there a subtle difference I am missing?

jonaraphael · 2024-12-11

Stitching and Ensembling are related concepts that we try to keep distinct in this code. Here are my working definitions:
Stitching: connecting adjacent tiles that do not overlap
Ensembling: combining multiple (overlapping) runs over a collection of data into a single, more robust result

I'm not sure I'm interpreting your question correctly, but yes, we do run `nms_feature_reduction()` twice:
• the first takes place inside `postprocess_tileset()` and is applied to the scene after it has gone through Stitching
`self.nms_feature_reduction(feature_collection)`
• the second takes place in `_orchestrate()` and is our primary method of Ensembling
`model.nms_feature_reduction(features=tileset_fc_list, min_overlaps_to_keep=1)`

This operation scales roughly as N^2 in the number of features, so smaller datasets lead to substantially fewer intersection operations.
The first call generates a single tileset's best candidates for the oil in the scene. Its main job is to eliminate the lowest-likelihood classes when a single polygon of oil shows up in two or more class layers (very likely).
The second call chooses which of the tilesets' results is best.

Is your question / argument that we might be able to do a single `nms_feature_reduction()`? If so, I'm not sure we would have much to gain by that, and we might suffer much worse performance on some "messy" scenes where the polygon count is high.
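
To make the cost argument above concrete, here is a minimal sketch of greedy non-maximum suppression run in two stages versus one pass. It is not the project's `nms_feature_reduction()`: the `Box` class, `overlaps`, `toy_nms`, and `make_tileset` are hypothetical names, axis-aligned boxes stand in for oil polygons, and the `min_overlaps_to_keep` parameter is not modeled. It only counts how many feature pairs each approach has to visit.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned box standing in for an oil polygon (hypothetical)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    score: float  # stand-in for the model's confidence in this feature


def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share some interior area."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def toy_nms(features: list[Box]) -> tuple[list[Box], int]:
    """Greedy NMS: a box is dropped if a higher-scoring survivor overlaps it.

    Returns the survivors (highest score first) and the number of pairs
    visited, which is N * (N - 1) / 2 for N input features.
    """
    ordered = sorted(features, key=lambda f: f.score, reverse=True)
    suppressed: set[int] = set()
    pairs = 0
    for i, j in combinations(range(len(ordered)), 2):
        pairs += 1
        if i in suppressed or j in suppressed:
            continue
        if overlaps(ordered[i], ordered[j]):
            suppressed.add(j)  # j always has the lower (or equal) score
    survivors = [f for k, f in enumerate(ordered) if k not in suppressed]
    return survivors, pairs


def make_tileset(shift: float) -> list[Box]:
    """Two slicks, each detected in two class layers, offset by `shift`."""
    boxes = []
    for origin, high in ((0.0, 0.9), (5.0, 0.8)):
        for score in (high - shift / 10, high - 0.5):
            boxes.append(Box(origin + shift, origin,
                             origin + shift + 1, origin + 1, score))
    return boxes


tilesets = [make_tileset(s) for s in (0.0, 0.1, 0.2)]

# Stage 1: reduce each tileset on its own (the per-tileset call).
per_tileset = [toy_nms(t) for t in tilesets]
stage_one_pairs = sum(pairs for _, pairs in per_tileset)

# Stage 2: reduce the survivors across tilesets (the ensembling call).
candidates = [box for kept, _ in per_tileset for box in kept]
final, stage_two_pairs = toy_nms(candidates)

# Alternative: one pass over every raw feature from every tileset.
single, single_pairs = toy_nms([b for t in tilesets for b in t])

print(stage_one_pairs + stage_two_pairs, single_pairs)
```

In this toy setup the two-stage route visits 33 pairs against 66 for the single pass, and both keep the same two boxes. The gap widens as the per-tileset feature count grows, which is the "messy scene" case described above. These are counts from the sketch, not measurements of the real pipeline.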

sstill88 · 2024-12-11

This comment refers to the stitching and ensembling blocks each being run twice.

jonaraphael added 13 commits September 18, 2024 19:03

singleton engine, and garbage collection

0eff4aa

Clean up some engine use in orchestrator

b2924db

oops

b112974

Clean up singleton use

8004447

Was closing the problem?

dff7c4c

orchestrator object needed to persist across multiple db connections

e5c1522

oops--forgot to flush

7bcf88e

Swap engine maintenance to fastapi's app lifetime.

e888ced

Handle errors more gracefully

Garbage collection

88672a5

update AIS as well

fc2bd8e

Merge branch 'bug/engine-maintenance' into bug/memory-loss

554a94c

Oops! Deleted before assignment...

cf254f1

more debug statement

8bf5803

jonaraphael requested a review from sstill88 November 20, 2024 18:35

Merge branch 'main' into bug/memory-loss

7f24c5f

# Conflicts:
#   cerulean_cloud/cloud_function_ais_analysis/main.py
#   cerulean_cloud/cloud_run_orchestrator/handler.py
#   cerulean_cloud/database_client.py
#   cerulean_cloud/models.py

sstill88 reviewed Dec 11, 2024

remove duplicate stitching/ensembling

06d8ea1

jonaraphael temporarily deployed to test December 12, 2024 17:30 — with GitHub Actions Inactive

sstill88 added 10 commits December 13, 2024 16:57

updated logging, added retries to get_bounds

673d687

more logs

885734e

added retries and cleared some MemoryFiles

712325f

reformat

2d3035c

moved more print --> logger

b03b482

oops, revert one change to MemoryFile back

426df03

clear memory when run_parallel_inference fails

af03c6c

return None if fail inf fetch_and_process_image

17f6d94

error logs

0860f3d

undo memfile clear; returns the memfile

29c228f

sstill88 temporarily deployed to test December 18, 2024 14:51 — with GitHub Actions Inactive

make logs json serializable

5e12ef7

sstill88 temporarily deployed to test December 18, 2024 15:26 — with GitHub Actions Inactive

sstill88 added 4 commits December 18, 2024 09:17

removed log formatting (return jsonPayload)

2c553e9

update logs

0b2b72c

clearer logs for if image is empty vs there is no imagery

5a50854

exception --> error

377bd34

sstill88 temporarily deployed to test December 18, 2024 19:30 — with GitHub Actions Inactive

sstill88 temporarily deployed to test December 18, 2024 19:55 — with GitHub Actions Inactive

sstill88 added 5 commits December 18, 2024 13:21

moved structured_log to utils and added severity

070b736

add more retries

c4dd899

raise error instead of return None

1b54756

import logging correctly...

6c4dd26

add method=get to test

766a4eb

sstill88 had a problem deploying to test December 19, 2024 15:41 — with GitHub Actions Failure

sstill88 had a problem deploying to test December 19, 2024 16:03 — with GitHub Actions Failure

add cloud_run_orchestrator/utils.py to ais cloud build

9b444b1

sstill88 had a problem deploying to test December 19, 2024 22:12 — with GitHub Actions Failure

typo...

c3d5293

sstill88 had a problem deploying to test December 19, 2024 22:20 — with GitHub Actions Failure

make cloud_run_orchestrator directory in action yaml

9287563

sstill88 temporarily deployed to test December 19, 2024 22:32 — with GitHub Actions Inactive

sstill88 added 7 commits December 19, 2024 18:52

bugfix and add separate try/except for roda (after titiler metadata)

32628c9

added tracebacks to errors

ab93827

oops. removed some nested try/except/if weirdness

30e874c

create separate error for if asyncio.gather fails

d21a9ad

typo - make sure aux_datasets are cleared

a34d604

reduced log size when possible

be9cfe9

return RuntimeErrors instead of None

56ccaca

sstill88 closed this Dec 20, 2024
