feat(alerts): configure alerts for payment wallet ingestion #2608

Open
wants to merge 6 commits into base: main

pietro-tota (Contributor) commented Nov 26, 2024

List of changes

Add alerts to monitor:

  • wallet ingestion storage queue write/read event rate
  • wallet ingestion event hub written events in a day

Motivation and context

These alerts monitor the queues dedicated to payment wallet ingestion, checking that onboarded wallet events are processed and sent to the event hub dedicated to the data lake.
The alerts are not linked to Opsgenie because they do not need to reach the on-call system; any ingestion problem will be handled during working hours.

Type of changes

  • Add new resources
  • Update configuration to existing resources
  • Remove existing resources

Does this introduce a change to production resources with possible user impact?

  • Yes, users may be impacted applying this change
  • No

Does this introduce an unwanted change on infrastructure? Check terraform plan execution result

  • Yes
  • No

Other information


If PR is partially applied, why? (reserved to maintainers)

pietro-tota self-assigned this Nov 26, 2024
pietro-tota requested review from a team as code owners November 26, 2024 13:44
metric_name = "IncomingMessages"
description = "Payment wallet onboarding written events less than 2000 detected in the last 24h"
operator = "LessThanOrEqual"
threshold = 2000
Contributor

This threshold is too high; it is very close to the onboarding volume on low-traffic days (see the traffic from November 2)

Contributor Author

Yes, maybe we can lower it to 1000 events per day and see how it behaves in prod once it is activated

Contributor Author

Lowered to 1000 with 23ab4ea
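
For readers following the thread, here is a minimal sketch of how a 24h event hub alert with the lowered threshold could be expressed as a standalone `azurerm_monitor_metric_alert` resource. Only `metric_name`, `operator`, and the threshold value of 1000 come from this PR; the resource name, variables, severity, evaluation frequency, `EntityName` dimension filter, and reworded description are assumptions made for illustration, and the repository may well define its alerts through a module rather than a plain resource.

```hcl
# Hypothetical inputs: none of these names come from the PR.
variable "resource_group_name" { type = string }
variable "eventhub_namespace_id" { type = string }
variable "eventhub_name" { type = string }
variable "action_group_id" { type = string } # assumed non-on-call group (no Opsgenie)

resource "azurerm_monitor_metric_alert" "wallet_ingestion_written_events" {
  name                = "wallet-ingestion-written-events" # hypothetical name
  resource_group_name = var.resource_group_name
  scopes              = [var.eventhub_namespace_id]
  description         = "Payment wallet onboarding written events less than or equal to 1000 detected in the last 24h"
  severity            = 3      # assumption: low severity, handled during working hours
  frequency           = "PT1H" # assumption: re-evaluate every hour
  window_size         = "P1D"  # look back over the last 24 hours

  criteria {
    metric_namespace = "Microsoft.EventHub/namespaces"
    metric_name      = "IncomingMessages"
    aggregation      = "Total" # sum of events written in the window
    operator         = "LessThanOrEqual"
    threshold        = 1000

    # Assumption: restrict the namespace-level metric to one event hub.
    dimension {
      name     = "EntityName"
      operator = "Include"
      values   = [var.eventhub_name]
    }
  }

  action {
    action_group_id = var.action_group_id
  }
}
```

With `LessThanOrEqual`, the alert fires when the 24h total is at or below the threshold, so a lower value leaves more margin above it for legitimately quiet days such as the one mentioned in the review.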

3 participants