Implement a (most likely) lambda function that periodically fires off and gathers the health status of each of the below services.
The status will be gathered into a JSON file, which will be uploaded to an S3 bucket:
Where are the Health Check Endpoints defined?
The set of "healthCheck" endpoints will be defined by what's in SSM.
Health check endpoints will be defined in SSM parameters, starting with /unity/healthCheck/...
For example: /unity/healthCheck/<MARKETPLACE_ITEM>/<COMPONENT_NAME>
For shared services, shared-services is effectively the MARKETPLACE_ITEM
example: /unity/healthCheck/shared-services/data-catalog
For venue services, an example would be: /unity/healthCheck/sps/airflowUi
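The naming convention above can be illustrated with a small parser. This is only a sketch: it assumes the two-level form `/unity/healthCheck/<MARKETPLACE_ITEM>/<COMPONENT_NAME>` shown in the examples, and the names `HealthCheckKey` and `parse_health_check_name` are hypothetical, not part of any Unity code.

```python
from typing import NamedTuple

HEALTH_CHECK_PREFIX = "/unity/healthCheck/"


class HealthCheckKey(NamedTuple):
    """The two variable segments of a health check parameter name."""
    marketplace_item: str
    component_name: str


def parse_health_check_name(name: str) -> HealthCheckKey:
    """Split '/unity/healthCheck/<MARKETPLACE_ITEM>/<COMPONENT_NAME>' into its parts."""
    if not name.startswith(HEALTH_CHECK_PREFIX):
        raise ValueError(f"not a health check parameter: {name!r}")
    parts = name[len(HEALTH_CHECK_PREFIX):].split("/")
    # Exactly two non-empty segments are expected under the prefix.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"unexpected health check parameter layout: {name!r}")
    return HealthCheckKey(*parts)
```

For instance, `parse_health_check_name("/unity/healthCheck/shared-services/data-catalog")` yields `shared-services` as the marketplace item and `data-catalog` as the component name.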
Who Creates the SSM entries?
The Service Areas (not U-CS) are responsible for creating the SSM entries.
If the deployment occurs via the Management Console/Marketplace, then the deployment's infrastructure as code (IaC, usually Terraform) will be responsible for creating the SSM values.
Otherwise the SSM entry can be created manually in the venue.
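For the manual case, creating an entry might look like the sketch below. It assumes a boto3 SSM client is passed in and that the health check URL is stored as a plain `String` parameter; the function name `register_health_check` and the choice to overwrite an existing value are assumptions, not something this ticket specifies.

```python
def register_health_check(ssm_client, marketplace_item, component_name, url):
    """Store a health check URL under the /unity/healthCheck/ naming convention.

    ssm_client is expected to behave like boto3.client("ssm").
    """
    name = f"/unity/healthCheck/{marketplace_item}/{component_name}"
    ssm_client.put_parameter(
        Name=name,
        Value=url,
        Type="String",   # assumption: the URL itself is not secret
        Overwrite=True,  # assumption: re-running should update the entry
    )
    return name


# Example call (requires boto3 and AWS credentials; the URL is a placeholder):
#   import boto3
#   register_health_check(boto3.client("ssm"), "sps", "airflowUi",
#                         "https://example.invalid/health")
```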
How does the querying occur?
A lambda function periodically fires off (nominally every 5 minutes -- probably leveraging AWS EventBridge) and:
queries SSM for all params starting with /unity/healthCheck/
gathers the health status of each of the URLs found in the /unity/healthCheck/... SSM values. For now, HTTP 200 represents HEALTHY, and anything else represents UNHEALTHY. Some of the URLs represented in the SSM values are endpoints in the shared services AWS account, and others are in the venue account.
generates the JSON status file, with the statuses (healthy or unhealthy). EXAMPLE JSON file:
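The example file from the ticket is not reproduced here. The steps above could be sketched roughly as follows; this is not the implemented lambda. The report fields (`generatedAt`, `services`, `ssmParameter`, `url`, `status`), the `HEALTH_CHECK_BUCKET` environment variable, and the object key are all hypothetical. Only the SSM prefix and the "HTTP 200 is HEALTHY" rule come from the description.

```python
import json
import os
import urllib.error
import urllib.request
from datetime import datetime, timezone

HEALTH_CHECK_PATH = "/unity/healthCheck"


def list_health_check_params(ssm_client):
    """Return {parameter name: URL} for everything under the prefix, following pagination."""
    params, kwargs = {}, {"Path": HEALTH_CHECK_PATH, "Recursive": True}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for param in page.get("Parameters", []):
            params[param["Name"]] = param["Value"]
        token = page.get("NextToken")
        if not token:
            return params
        kwargs["NextToken"] = token


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return None      # unreachable host, timeout, or malformed URL


def build_status_report(ssm_client, fetch_status=http_status):
    """Check every registered URL; only HTTP 200 counts as HEALTHY."""
    services = []
    for name, url in sorted(list_health_check_params(ssm_client).items()):
        status = "HEALTHY" if fetch_status(url) == 200 else "UNHEALTHY"
        services.append({"ssmParameter": name, "url": url, "status": status})
    return {"generatedAt": datetime.now(timezone.utc).isoformat(), "services": services}


def lambda_handler(event, context):
    """Entry point for a scheduled invocation (for example, an EventBridge rule)."""
    import boto3  # imported lazily so the helpers above can be tested without AWS

    report = build_status_report(boto3.client("ssm"))
    boto3.client("s3").put_object(
        Bucket=os.environ["HEALTH_CHECK_BUCKET"],  # hypothetical variable name
        Key="health_check_latest.json",            # hypothetical object key
        Body=json.dumps(report, indent=2).encode("utf-8"),
        ContentType="application/json",
    )
    return report
```

Passing `fetch_status` as a parameter keeps the HTTP call replaceable, which makes the report logic testable without network access.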
galenatjpl changed the title from "Implement framework in Venue account to periodically gather health" to "Implement lambda in Venue account to periodically gather health status" on Apr 10, 2024
@mike-gangl This ticket is implemented, and we are closing this, to take credit for the work in 24.2. We can run everything manually, and it's fine. We will open up another ticket to do the final testing in 24.3. @hargitayjpl and @jdrodjpl will be getting together to run the test and confirm things.
The lambda queries SSM for all params starting with /unity/healthCheck/. Parameter names take one of these forms:
/unity/healthCheck/${PROJECT}/${VENUE}/<MARKETPLACE_ITEM>/<COMPONENT_NAME>
/unity/healthCheck/shared-services/<MARKETPLACE_ITEM>/<COMPONENT_NAME>
It then gathers the health status of each URL found in the /unity/healthCheck/... SSM values. For now, HTTP 200 represents HEALTHY, and anything else represents UNHEALTHY. Some of the URLs represented in the SSM values are endpoints in the shared services AWS account, and others are in the venue account.
What if the healthCheck endpoint is secured? How will I work around that?
@mike-gangl mentions that there is a methodology for getting the username/password from SSM, then getting a token.
See https://github.com/unity-sds/unity-data-services/blob/develop/cumulus_lambda_functions/lib/cognito_login/cognito_token_retriever.py for an example of how U-DS gets a token. That code handles the Cognito login; something like https://github.com/unity-sds/unity-data-services/blob/develop/cumulus_lambda_functions/stage_in_out/dapa_client.py then uses that Cognito token to make calls.
See also: https://github.com/unity-sds/sounder-sips-tutorial/blob/develop/jupyter-notebooks/tutorials/2_working_with_data.ipynb
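The approach described above (read the username/password from SSM, then get a token) might look roughly like the sketch below. It is not taken from the linked U-DS code. The SSM parameter names are supplied by the caller because this ticket does not name them, and it is an assumption that the Cognito app client allows the `USER_PASSWORD_AUTH` flow and that the endpoint accepts the access token as a bearer token.

```python
import urllib.request


def get_cognito_token(ssm_client, cognito_client,
                      username_param, password_param, client_id_param):
    """Read credentials from SSM and exchange them for a Cognito access token.

    ssm_client and cognito_client are expected to behave like
    boto3.client("ssm") and boto3.client("cognito-idp").
    """
    def read(name):
        # WithDecryption covers SecureString parameters such as passwords.
        return ssm_client.get_parameter(Name=name, WithDecryption=True)["Parameter"]["Value"]

    response = cognito_client.initiate_auth(
        AuthFlow="USER_PASSWORD_AUTH",  # assumption: enabled on the app client
        AuthParameters={"USERNAME": read(username_param), "PASSWORD": read(password_param)},
        ClientId=read(client_id_param),
    )
    return response["AuthenticationResult"]["AccessToken"]


def authorized_request(url, token):
    """Build a GET request that carries the token as a bearer credential."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

The returned request object can be passed to `urllib.request.urlopen` in place of a bare URL when checking a secured endpoint.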
See diagrams and other notes in unity-sds/unity-project-management#101
Dependencies
Other epics or outside tickets required for this to work