Add 30m support ticket evaluation checklist #781

Merged
merged 2 commits into from
Nov 23, 2023
9 changes: 9 additions & 0 deletions projects/managed-hubs/index.md
@@ -6,3 +6,12 @@ The Managed JupyterHub Service is an ongoing service to **sustain and scale** a
**[`docs.2i2c.org`](https://docs.2i2c.org) has most of the information about this service**.

The sections here contain information that is more relevant to 2i2c team members (like support process documentation).

```{toctree}
:maxdepth: 2
showcase-hub
sales
support
timeboxed-initial-ticket-evaluation
incidents
```
6 changes: 4 additions & 2 deletions projects/managed-hubs/support.md
@@ -177,18 +177,20 @@ This process is carried out in an ongoing basis by the {term}`Support Stewards`.
The goal of the non-incident response process is to bring standardization to our support response. This simple workflow tries to counter the bias towards a reactive response, while also establishing some common patterns so that all of our non-incident support responses are cohesive and shared among our support stewards.
The current iteration of the workflow states each step and who should be responsible/accountable for the specific step, plus some other clarifications.

When a new ticket lands in Freshdesk under the support group and it is not an incident, you should follow the following steps:
When a new ticket lands in Freshdesk under the support group and it is not an incident, we aim to respond within 24 working hours with a suggested next action. Follow these steps when resolving a ticket:

1. `Who: Support steward`

**Respond within 24 working hours**. Acknowledge receipt of the support request and let the {term}`Community Representative` know a time-boxed investigation will start soon. Please request any additional information you may need to be able to reproduce the issue in step 2.
**First 24h initial ticket evaluation**. Within the first 24h after a support ticket is opened, you should do an initial evaluation of the ticket and ask the {term}`Community Representative` for any additional information you may need.

2. `Who: Support steward`

**Spend 30 minutes trying to resolve**. If you believe you can resolve the issue within 30 minutes, try resolving it yourself.
1. If you resolve the issue, then jump to the "Confirm resolution" step 7.
2. If you don't believe you can resolve the issue (or you couldn't) in 30 minutes, jump to the next step.

Follow the guide at [](support:timeboxed-evaluation) to try to reach a decision.

3. `Who: Support Steward`

**Open an engineering issue**. If this is a {term}`Change Request` or {term}`Guidance Request` and/or you cannot resolve the issue within 30 minutes, then open a support issue for the team to discuss.
48 changes: 48 additions & 0 deletions projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
@@ -0,0 +1,48 @@
(support:timeboxed-evaluation)=
# Initial timeboxed (30m) ticket resolution checklist

In the [non-incident support response process](https://compass.2i2c.org/projects/managed-hubs/support/#non-incident-response-process), an initial 30m time-boxed ticket resolution process is documented.

The support triagers use this 30m time interval to try to resolve a ticket before opening a follow-up issue about it.

The next sections present an incomplete initial checklist that the support triager can follow to either resolve the ticket or decide to open a tracking issue about it, with the context they gained during this investigation.

The steps to follow depend greatly on the type of ticket. To simplify, only three broad ticket categories are addressed.

## Category 1: Something is not working

```{important}
If something is not working, you might be dealing with an incident, so depending on the scale of the issue and its nature, you might want to consider following the [Incident Response Process](https://compass.2i2c.org/projects/managed-hubs/incidents/#incident-response-process).
```

1. ✅ Ask for any additional info that might be needed.
1. ✅ Check if the errors being reported are listed in this incomplete list of [the most commonly seen errors](https://infrastructure.2i2c.org/howto/troubleshoot/logs/common-errors/).
1. ✅ Depending on the issue being experienced, you should check the relevant logs:

🟡 via cloud-agnostic tools like [kubectl or the deployer](https://infrastructure.2i2c.org/howto/troubleshoot/logs/kubectl-logs), which provide details about the currently running components

🟡 or search [the logs via the console](https://infrastructure.2i2c.org/howto/troubleshoot/logs/cloud-logs), which can be useful for digging out information about components because logs are persisted there for a longer time span (30d in GCP's case).

1. ✅ Save any of the logs that look useful
1. ✅ Check if you are dealing with any of [the most commonly seen problems](https://infrastructure.2i2c.org/sre-guide/common-problems-solutions/) and try to fix it.
1. ❌ If not, then open a new GitHub issue, sharing as much context from the previous steps as possible, and continue with the [non-incident response process](https://compass.2i2c.org/projects/managed-hubs/support/#non-incident-response-process).
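
The "check the relevant logs" and "save any of the logs that look useful" steps could be scripted roughly as in the sketch below. This is only an illustration, not the team's documented workflow: it uses plain `kubectl logs` rather than the deployer, and the helper names, the namespace, and the pod name are all hypothetical. The linked troubleshooting guide remains the reference for the actual commands.

```python
import subprocess
from pathlib import Path


def build_logs_command(namespace: str, pod: str, since: str = "1h") -> list[str]:
    """Build a `kubectl logs` invocation for one pod.

    The namespace and pod names are placeholders supplied by the caller.
    """
    return ["kubectl", "logs", "--namespace", namespace, pod, f"--since={since}"]


def save_logs(namespace: str, pod: str, out_dir: Path, since: str = "1h") -> Path:
    """Run kubectl and save its output so it can be attached to a tracking issue.

    Assumes kubectl is installed and already authenticated against the cluster.
    """
    result = subprocess.run(
        build_logs_command(namespace, pod, since),
        capture_output=True,
        text=True,
        check=True,  # raise if kubectl exits with an error
    )
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{namespace}-{pod}.log"
    out_file.write_text(result.stdout)
    return out_file
```

For example, `save_logs("example-hub", "hub-0", Path("ticket-logs"))` would write the last hour of logs for a hypothetical `hub-0` pod to `ticket-logs/example-hub-hub-0.log`, ready to be shared in a GitHub issue.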

## Category 2: New feature requested
```{list-table}
:widths: 30
:header-rows: 1

* - Is the feature requested documented at [](hub-features)?
* - ✅ Yes? Then enable it after checking it is in the scope of the contract.
* - ❌ No? Then open a GitHub tracking issue about it and continue following the non-incident process.
```

## Category 3: Technical advice
```{list-table}
:widths: 30
:header-rows: 1

* - Is the question about an area the support triager has insight into?
* - ✅ Yes? Then answer the ticket.
* - ❌ No? Then open a GitHub tracking issue about it and continue following the non-incident process.
```
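
The three categories above share the same shape: one yes/no question, with "open a tracking issue" as the fallback. A minimal sketch of that decision logic is shown below. The class, field, and function names are hypothetical and exist only for illustration. The sketch deliberately leaves out the incident check and the log-gathering steps from Category 1.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    NOT_WORKING = 1       # Category 1: something is not working
    FEATURE_REQUEST = 2   # Category 2: new feature requested
    TECHNICAL_ADVICE = 3  # Category 3: technical advice


OPEN_ISSUE = "open a GitHub tracking issue and continue the non-incident process"


@dataclass
class Ticket:
    category: Category
    matches_known_problem: bool = False  # Category 1: listed in the common problems
    feature_is_documented: bool = False  # Category 2: documented as a hub feature
    triager_has_insight: bool = False    # Category 3: triager knows the area


def next_action(ticket: Ticket) -> str:
    """Return the suggested outcome of the 30m timeboxed evaluation."""
    if ticket.category is Category.NOT_WORKING:
        if ticket.matches_known_problem:
            return "apply the documented fix"
        return OPEN_ISSUE
    if ticket.category is Category.FEATURE_REQUEST:
        if ticket.feature_is_documented:
            return "check contract scope, then enable the feature"
        return OPEN_ISSUE
    # Remaining case: technical advice
    if ticket.triager_has_insight:
        return "answer the ticket"
    return OPEN_ISSUE
```

Keeping the fallback in a single constant mirrors the checklist: whatever the category, a ticket that cannot be settled within the timebox ends up as a tracking issue.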