feat(provider): pagerduty provider (#63)
talboren authored Feb 22, 2023
1 parent d36dec6 commit 17577a6
Showing 5 changed files with 254 additions and 1 deletion.
3 changes: 2 additions & 1 deletion .github/workflows/lint-pr.yml
@@ -12,7 +12,8 @@ jobs:
     name: Validate PR title
     runs-on: ubuntu-latest
     steps:
-      - name: semantic-pull-request
+      - name: lint_pr_title
+        id: lint_pr_title
         uses: amannn/[email protected]
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
45 changes: 45 additions & 0 deletions docs/docs/providers/documentation/pagerduty.md
@@ -0,0 +1,45 @@
---
sidebar_label: Pagerduty Provider
---

# Pagerduty Provider

:::note Brief Description
The Pagerduty Provider lets you create incidents in Pagerduty or post events to it.
:::

## Inputs
The `notify` function in the `PagerdutyProvider` class takes the following parameters:
```python
kwargs (dict):
    title (str): Title of the alert or incident. *Required*
    alert_body (str): UTF-8 string of custom message for the alert. Shown in the incident body. *Required for events, not used for incidents*
    dedup (str | None): Any string, max 255 characters, used to deduplicate alerts for events. If not given, the current UTC timestamp is used. *Optional, events only*
    service_id (str): ID of the service for incidents. *Required for incidents, not used for events*
    body (dict): Body of the incident. *Required for incidents, not used for events*
    requester (str): Requester of the incident. *Required for incidents, not used for events*
    incident_key (str | None): Key to identify the incident. If not given, a UUID will be generated. *Optional, incidents only*
```
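
As a rough illustration of the parameter list above, the two modes expect different keyword arguments. The sketch below is not part of the provider: `REQUIRED_PARAMS` and `missing_params` are hypothetical names, and the required sets are read off the `_send_alert` and `_trigger_incident` signatures in this commit, where `dedup` and `incident_key` have defaults.

```python
# Hypothetical helper, not part of PagerdutyProvider: lists which keyword
# arguments each mode needs, based on the method signatures in this commit.
REQUIRED_PARAMS = {
    "events": {"title", "alert_body"},
    "incidents": {"service_id", "title", "body", "requester"},
}


def missing_params(mode: str, kwargs: dict) -> list:
    """Return the required parameter names absent from kwargs, sorted."""
    return sorted(REQUIRED_PARAMS[mode] - set(kwargs))


event_kwargs = {
    "title": "Event Title Example",
    "alert_body": "This is an alert example for PagerDuty!",
}
incident_kwargs = {
    "title": "DB disk is almost full",
    "service_id": "PQ94Q8G",
    "body": {"type": "incident_body", "details": "A disk is getting full."},
}

print(missing_params("events", event_kwargs))        # []
print(missing_params("incidents", incident_kwargs))  # ['requester']
```

A check like this could run before `notify` is called, so a missing `requester` shows up as a readable message rather than a `TypeError` from the underlying method.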

## Authentication Parameters
The `PagerdutyProviderAuthConfig` class takes the following parameters:
- `routing_key` (str | None): Routing key, which is an integration or ruleset key. Defaults to `None`. *Required for events, not used for incidents*
- `api_key` (str | None): API key, which is a user or team API key. Defaults to `None`. *Required for incidents, not used for events*

## Connecting with the Provider

To use the PagerdutyProvider, you'll need to provide either a routing_key or an api_key.

You can find your integration key (routing key) in the PagerDuty web app under **Configuration** > **Integrations**: select the integration you want to use.
You can find your API key in the PagerDuty web app under **Configuration** > **API Access**.

The routing_key is used to post events to Pagerduty using the events API.
The api_key is used to create incidents using the incidents API.
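
To make the difference between the two keys concrete, here is a minimal sketch of where each one ends up in the outgoing request. The field names mirror the provider code in this commit; the function names `build_event_body` and `build_incident_headers` and all key values are placeholders chosen for illustration.

```python
# Sketch only: shows where each credential travels. Nothing is sent.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
INCIDENTS_URL = "https://api.pagerduty.com/incidents"


def build_event_body(routing_key: str, title: str, alert_body: str, dedup: str) -> dict:
    """Events API: the routing_key is a field of the JSON body."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup,
        "payload": {
            "summary": title,
            "source": "custom_event",
            "severity": "critical",
            "custom_details": {"alert_body": alert_body},
        },
    }


def build_incident_headers(api_key: str, requester: str) -> dict:
    """Incidents API: the api_key goes into the Authorization header."""
    return {
        "Content-Type": "application/json",
        "Accept": "application/vnd.pagerduty+json;version=2",
        "Authorization": f"Token token={api_key}",
        "From": requester,
    }


body = build_event_body("PLACEHOLDER_ROUTING_KEY", "Disk almost full", "91% used", "db-disk")
headers = build_incident_headers("PLACEHOLDER_API_KEY", "requester-placeholder")
print(body["routing_key"])
print(headers["Authorization"])
```

The event body would be posted to `EVENTS_URL` and the incident headers would accompany a post to `INCIDENTS_URL`.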

## Notes
The provider uses either the events API to create an alert or the incidents API to create an incident. If a routing_key is configured, the events API is used, even when an api_key is also present; otherwise the incidents API is used with the api_key.
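
The selection rule can be sketched as a small standalone function. `choose_api` is a hypothetical name used only for this illustration; the logic follows `validate_config` and `notify` in this commit, where a configured routing_key is checked first.

```python
from typing import Optional


def choose_api(routing_key: Optional[str] = None, api_key: Optional[str] = None) -> str:
    """Return which PagerDuty API would be used for the given credentials."""
    if not routing_key and not api_key:
        # Same condition that validate_config rejects in the provider.
        raise ValueError("PagerdutyProvider requires either routing_key or api_key")
    # A routing_key wins when both keys are configured.
    return "events" if routing_key else "incidents"


print(choose_api(routing_key="R123"))                  # events
print(choose_api(api_key="A456"))                      # incidents
print(choose_api(routing_key="R123", api_key="A456"))  # events
```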

## Useful Links
- Pagerduty Events API documentation: https://v2.developer.pagerduty.com/docs/send-an-event-events-api-v2
- Pagerduty Incidents API documentation: https://v2.developer.pagerduty.com/docs/create-an-incident-incidents-api-v2
51 changes: 51 additions & 0 deletions examples/alerts/db_disk_space_pagerduty.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Database disk space is low (<10%)
alert:
  id: db-disk-space
  description: Check that the DB has enough disk space
  owners:
    - github-shahargl
    - slack-talboren
  services:
    - db
    - api
  trigger:
    # Run every hour or if the service-is-failing alert is triggered
    interval: 1h
    event:
      - id: service-is-failing
        type: alert
  steps:
    - name: db-no-space
      provider:
        type: mock
        config: "{{ providers.db-server-mock }}"
        with:
          command: df -h | grep /dev/disk3s1s1 | awk '{ print $5}' # Check the disk space
          command_output: 91% # Mock
      condition:
        - type: threshold
          value: "{{ steps.this.results }}"
          compare_to: 90% # Trigger if more than 90% full
  failure_strategy: # What to do if the SSH connection failed?
    - name: ssh-connection-failed
      retry: 5 # Retry 5 times
      alert: true # Finally, alert
  actions:
    - name: trigger-pagerduty
      provider:
        type: pagerduty
        config: " {{ providers.pagerduty-test }} " # see documentation for options
        with: # parameters differ between incident and event mode, see documentation for more information
          title: Event Title Example
          alert_body: This is an alert example for PagerDuty!
          # service_id: PQ94Q8G
          # requester: [email protected]
          # body:
          #   type: incident_body
          #   details: |
          #     A disk is getting full on this machine. You should investigate what is causing the disk to fill ({{ steps.db-no-space.results }})

providers:
  db-server-mock:
    description: Paper DB Server
    authentication:
Empty file.
156 changes: 156 additions & 0 deletions keep/providers/pagerduty_provider/pagerduty_provider.py
@@ -0,0 +1,156 @@
import dataclasses
import datetime
import json
import typing
import uuid

import pydantic
import requests

from keep.exceptions.provider_config_exception import ProviderConfigException
from keep.providers.base.base_provider import BaseProvider
from keep.providers.models.provider_config import ProviderConfig

# Todo: think about splitting in to PagerdutyIncidentsProvider and PagerdutyAlertsProvider
# Read this: https://community.pagerduty.com/forum/t/create-incident-using-python/3596/3


@pydantic.dataclasses.dataclass
class PagerdutyProviderAuthConfig:
    routing_key: str | None = dataclasses.field(
        metadata={
            "required": False,
            "description": "routing_key is an integration or ruleset key",
        },
        default=None,
    )
    api_key: str | None = dataclasses.field(
        metadata={
            "required": False,
            "description": "api_key is a user or team API key",
        },
        default=None,
    )


class PagerdutyProvider(BaseProvider):
    def __init__(self, provider_id: str, config: ProviderConfig):
        super().__init__(provider_id, config)

    def validate_config(self):
        self.authentication_config = PagerdutyProviderAuthConfig(
            **self.config.authentication
        )
        if (
            not self.authentication_config.routing_key
            and not self.authentication_config.api_key
        ):
            raise ProviderConfigException(
                "PagerdutyProvider requires either routing_key or api_key"
            )

    def _build_alert(
        self, title: str, alert_body: str, dedup: str
    ) -> typing.Dict[str, typing.Any]:
        """
        Builds the payload for an event alert.
        Args:
            title: Title of alert
            alert_body: UTF-8 string of custom message for alert. Shown in incident body
            dedup: Any string, max 255 characters, used to deduplicate alerts
        Returns:
            Dictionary of alert body for JSON serialization
        """
        return {
            "routing_key": self.authentication_config.routing_key,
            "event_action": "trigger",
            "dedup_key": dedup,
            "payload": {
                "summary": title,
                "source": "custom_event",
                "severity": "critical",
                "custom_details": {
                    "alert_body": alert_body,
                },
            },
        }

    def _send_alert(self, title: str, alert_body: str, dedup: str | None = None):
        """
        Sends PagerDuty Alert
        Args:
            title: Title of the alert.
            alert_body: UTF-8 string of custom message for alert. Shown in incident body
            dedup: Any string, max 255 characters, used to deduplicate alerts
        """
        # If no dedup is given, use epoch timestamp
        if dedup is None:
            dedup = str(datetime.datetime.utcnow().timestamp())

        # Events API v2 endpoint (single slash before the path)
        url = "https://events.pagerduty.com/v2/enqueue"

        result = requests.post(url, json=self._build_alert(title, alert_body, dedup))

        self.logger.debug("Alert status: %s", result.status_code)
        self.logger.debug("Alert response: %s", result.text)

    def _trigger_incident(
        self,
        service_id: str,
        title: str,
        body: dict,
        requester: str,
        incident_key: str | None = None,
    ):
        """Triggers an incident via the V2 REST API using sample data."""

        # If no incident_key is given, generate one from a UUID
        if not incident_key:
            incident_key = str(uuid.uuid4()).replace("-", "")

        url = "https://api.pagerduty.com/incidents"
        headers = {
            "Content-Type": "application/json",
            "Accept": "application/vnd.pagerduty+json;version=2",
            "Authorization": "Token token={token}".format(
                token=self.authentication_config.api_key
            ),
            "From": requester,
        }

        payload = {
            "incident": {
                "type": "incident",
                "title": title,
                "service": {"id": service_id, "type": "service_reference"},
                "incident_key": incident_key,
                "body": body,
            }
        }

        r = requests.post(url, headers=headers, data=json.dumps(payload))

        # Log through the provider logger, as _send_alert does, instead of print
        self.logger.debug("Incident status: %s", r.status_code)
        self.logger.debug("Incident response: %s", r.text)

    def dispose(self):
        """
        No need to dispose of anything, so just do nothing.
        """
        pass

    def notify(self, **kwargs: dict):
        """
        Create a PagerDuty alert.
            Alert/Incident is created either via the Events API or the Incidents API.
            See https://community.pagerduty.com/forum/t/create-incident-using-python/3596/3 for more information
        Args:
            kwargs (dict): The providers with context
        """
        if self.authentication_config.routing_key:
            self._send_alert(**kwargs)
        else:
            self._trigger_incident(**kwargs)
