[NOT URGENT][Incident] Pagerduty alert on binder-staging #2262

Open
5 tasks
damianavila opened this issue Feb 27, 2023 · 0 comments


damianavila commented Feb 27, 2023

Summary

Ref: https://2i2c-org.pagerduty.com/incidents/Q2MDFTOFMUO3CL?utm_campaign=channel&utm_source=slack

Impact on users

This affects only binder-staging, so as far as I know there is no real impact on users.
Also related (decommission): #2026

Important information

  • Hub URL: binder-staging.2i2c.cloud
  • Support ticket ref: None

Tasks and updates

  • Discuss and address incident, leaving comments below with updates
  • Incident has been dealt with or is over
  • Copy/paste the after-action report below and fill in relevant sections
  • Incident title is discoverable and accurate
  • All actionable items in report have linked GitHub Issues
After-action report template
# After-action report

These sections should be filled out once we've resolved the incident and know what happened.
They should focus on the knowledge we've gained and any improvements we should make.

## Timeline

_A short list of dates / times and major updates, with links to relevant comments in the issue for more context._

All times in {{ most convenient timezone }}.

- {{ yyyy-mm-dd }} - [Summary of first update](link to comment)
- {{ yyyy-mm-dd }} - [Summary of another update](link to comment)
- {{ yyyy-mm-dd }} - [Summary of final update](link to comment)


## What went wrong

_Things that could have gone better. Ideally these should result in concrete
action items that have GitHub issues created for them and linked to under
Action items._

- Thing one
- Thing two

## Where we got lucky

_These are good things that happened to us but not because we had planned for them._

- Thing one
- Thing two

## Follow-up actions

_Every action item should have a GitHub issue (even a small skeleton of one) attached to it, so that it does not get forgotten. These issues don't have to be in `infrastructure/`; they can be in other repositories._

### Process improvements

1. {{ summary }} [link to github issue]
2. {{ summary }} [link to github issue]

### Documentation improvements

1. {{ summary }} [link to github issue]
2. {{ summary }} [link to github issue]

### Technical improvements

1. {{ summary }} [link to github issue]
2. {{ summary }} [link to github issue]
damianavila moved this to Needs Shaping / Refinement in DEPRECATED Engineering and Product Backlog Feb 27, 2023
damianavila moved this to Todo 👍 in Sprint Board Mar 1, 2023
pnasrat added a commit that referenced this issue Mar 9, 2023
This allows removing uptime checks on a single hub eg binder-staging

See #2262