Version: 3.8.0

How to Create PagerDuty Alerts

Resoto constantly monitors your infrastructure and can alert you to any issues it detects. PagerDuty is the de facto standard for escalating alerts. In this guide, we will configure Resoto to send alerts to PagerDuty with a custom command.

Prerequisites

This guide assumes that you have already installed and configured Resoto to collect your cloud resources.

You will also need a valid routing key for your PagerDuty account.

Directions

  1. Open the relevant service in PagerDuty and click Integrations. Then, click the Add new integration button.

  2. Expand Events API V2 and copy the revealed integration key.

    Note: We will refer to this key as the "routing key" for the remainder of these instructions.

  3. Open the resoto.core.commands configuration:

    > config edit resoto.core.commands

  4. Add the routing key copied in step 2 as the default value of the routing_key parameter in the pagerduty section. This will allow you to execute the pagerduty command without specifying the routing key parameter each time.

    Info: The pagerduty command has the following parameters, all of which are required (parameters with a default value may be omitted when invoking the command):

    | Parameter    | Description                                                                                         | Default Value                           |
    |--------------|-----------------------------------------------------------------------------------------------------|-----------------------------------------|
    | summary      | Alert summary                                                                                       |                                         |
    | routing_key  | Events API V2 integration key                                                                       |                                         |
    | dedup_key    | String identifier that PagerDuty will use to ensure that only a single alert is active at a time    |                                         |
    | source       | Alert source: location of the affected system (preferably a hostname or FQDN)                       | Resoto                                  |
    | severity     | Alert severity (critical, error, warning, or info)                                                  | warning                                 |
    | event_action | Alert action (trigger, acknowledge, resolve or assign)                                              | trigger                                 |
    | client       | Name of the monitoring client submitting the event                                                  | Resoto                                  |
    | client_url   | URL to the monitoring client                                                                        | https://resoto.com                      |
    | webhook_url  | PagerDuty events API URL endpoint                                                                   | https://events.pagerduty.com/v2/enqueue |

  5. Define the search criteria that will trigger an alert. For example, let's say we want to send alerts whenever we find a Kubernetes Pod updated in the last hour with a restart count greater than 20:

    > search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h
    kind=kubernetes_pod, name=db-operator-mcd4g, restart_count=[42], age=2mo5d, last_update=23m, cloud=k8s, account=prod, region=kube-system

  6. Now that we've defined the alert trigger, we can pipe the result of the search query to the pagerduty command, replacing the summary and dedup_key values with ones that describe your alert:

    > search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h | pagerduty summary="Pods are restarting too often!" dedup_key="Resoto::PodRestartedTooOften"

    If the defined condition is currently true, you should see a new alert in PagerDuty.

  7. Finally, we want to automate checking of the defined alert trigger and send alerts to PagerDuty whenever the result is true. We can accomplish this by creating a job:

    > jobs add --id alert_on_pod_failure --wait-for-event post_collect 'search is(kubernetes_pod) and pod_status.container_statuses[*].restart_count > 20 and last_update<1h | pagerduty summary="Pods are restarting too often!" dedup_key="Resoto::PodRestartedTooOften"'
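
To make the role of each pagerduty parameter more concrete, the following Python sketch assembles the kind of request body that the PagerDuty Events API V2 expects, using the defaults from the parameter table in step 4. It is an illustration only: the function name `build_event` and the placeholder routing key are hypothetical, and how Resoto builds its request internally is an assumption, not something this guide documents.

```python
import json

# Defaults copied from the parameter table in step 4.
DEFAULTS = {
    "source": "Resoto",
    "severity": "warning",
    "event_action": "trigger",
    "client": "Resoto",
    "client_url": "https://resoto.com",
    "webhook_url": "https://events.pagerduty.com/v2/enqueue",
}

VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_event(summary, routing_key, dedup_key, **overrides):
    """Return (url, body) for an Events API V2 request (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if params["severity"] not in VALID_SEVERITIES:
        raise ValueError(f"invalid severity: {params['severity']}")
    body = {
        "routing_key": routing_key,
        "event_action": params["event_action"],
        # Events sharing a dedup_key are grouped into one alert by PagerDuty.
        "dedup_key": dedup_key,
        "client": params["client"],
        "client_url": params["client_url"],
        "payload": {
            "summary": summary,
            "source": params["source"],
            "severity": params["severity"],
        },
    }
    return params["webhook_url"], body


if __name__ == "__main__":
    url, body = build_event(
        summary="Pods are restarting too often!",
        routing_key="YOUR_ROUTING_KEY",  # placeholder, not a real key
        dedup_key="Resoto::PodRestartedTooOften",
    )
    print(url)
    print(json.dumps(body, indent=2))
```

Because the job in step 7 reuses the same dedup_key on every run, repeated matches should update the existing alert rather than open a new one each time the search returns results.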
