Docs: update the current documentation on notification policies (#64316)

Co-authored-by: Eve Meelan <81647476+Eve832@users.noreply.github.com>
This commit is contained in:
Gilles De Mey 2023-03-08 17:18:17 +01:00 committed by GitHub
parent 11bc66a0e8
commit 6827f97b78
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 38 additions and 20 deletions

View File

@ -21,13 +21,15 @@ Grafana uses Alertmanagers to send notifications for firing and resolved alerts.
Notification policies control when and where notifications are sent. A notification policy can choose to send all alerts together in the same notification, send alerts in grouped notifications based on a set of labels, or send alerts as separate notifications. You can configure each notification policy to control how often notifications should be sent as well as having one or more mute timings to inhibit notifications at certain times of the day and on certain days of the week.
Notification policies are organized in a tree structure where at the root of the tree there is a notification policy called the root policy. There can be only one root policy and the root policy cannot be deleted.
Notification policies are organized in a tree structure where at the root of the tree there is a notification policy called the default policy. There can be only one default policy and the default policy cannot be deleted.
Specific routing policies are children of the root policy and can be used to match either all alerts or a subset of alerts based on a set of matching labels. A notification policy matches an alert when its matching labels match the labels in the alert.
Specific routing policies are children of the default policy and can be used to match either all alerts or a subset of alerts based on a set of matching labels. A notification policy matches an alert when its matching labels match the labels in the alert.
A specific routing policy can have its own child policies, called nested policies, which allow for additional matching of alerts. An example of a specific routing policy could be sending infrastructure alerts to the Ops team; while a child policy might send high priority alerts to Pagerduty and low priority alerts as emails.
A nested policy can have its own nested policies, which allow for additional matching of alerts. An example of a nested policy could be sending infrastructure alerts to the Ops team; while a nested policy might send high priority alerts to Pagerduty and low priority alerts as emails.
All alerts, irrespective of their labels, match the root policy. However, when the root policy receives an alert it looks at each specific routing policy and sends the alert to the first specific routing policy that matches the alert. If the specific routing policy has further child policies, then it can attempt to the match the alert against one of its nested policies. If no nested policies match the alert then the specific routing policy is the matching policy. If there are no specific routing policies, or no specific routing policies match the alert, then the root policy is the matching policy.
All alerts, irrespective of their labels, match the default policy. However, when the default policy receives an alert it looks at each nested policy and sends the alert to the first nested policy that matches the alert. If the nested policy has further nested policies, then it can attempt to the match the alert against one of its nested policies. If no nested policies match the alert then the policy itself is the matching policy. If there are no nested policies, or no nested policies match the alert, then the default policy is the matching policy.
<!-- This definitely needs a diagram and some examples (Gilles) -->
## Contact points

View File

@ -16,14 +16,12 @@ weight: 300
# Manage notification policies
Notification policies determine how alerts are routed to contact points. Policies have a tree structure, where each policy can have one or more child policies. Each policy, except for the root policy, can also match specific alert labels. Each alert is evaluated by the root policy and subsequently by each child policy. If the `Continue matching subsequent sibling nodes` option is enabled for a specific policy, then evaluation continues even after one or more matches. A parent policys configuration settings and contact point information govern the behavior of an alert that does not match any of the child policies. A root policy governs any alert that does not match a specific policy.
Notification policies determine how alerts are routed to contact points. Policies have a tree structure, where each policy can have one or more nested policies. Each policy, except for the default policy, can also match specific alert labels. Each alert is evaluated by the default policy and subsequently by each nested policy. If the `Continue matching subsequent sibling nodes` option is enabled for a nested policy, then evaluation continues even after one or more matches. A parent policys configuration settings and contact point information govern the behavior of an alert that does not match any of the nested policies. A default policy governs any alert that does not match a nested policy.
You can configure Grafana managed notification policies as well as notification policies for an external Alertmanager data source.
## Grouping
{{< figure max-width="40%" src="/static/img/docs/alerting/unified/notification-policies-grouping.png" max-width="650px" caption="Notification policies grouping" >}}
Grouping is a new and key concept of Grafana Alerting that categorizes alert notifications of similar nature into a single funnel. This allows you to properly route alert notifications during larger outages when many parts of a system fail at once causing a high number of alerts to fire simultaneously.
For example, suppose you have 100 services connected to a database in different environments. These services are differentiated by the label `env=environmentname`. An alert rule is in place to monitor whether your services can reach the database named `alertname=DatabaseUnreachable`.
@ -34,14 +32,14 @@ You can configure grouping to be `group_by: [alertname]` (take note that the `en
> **Note:** Grafana also has a special label named `...` that you can use to group all alerts by all labels (effectively disabling grouping), therefore each alert will go into its own group. It is different from the default of `group_by: null` where **all** alerts go into a single group.
## Edit root notification policy
## Edit default notification policy
> **Note:** Before Grafana v8.2, the configuration of the embedded Alertmanager was shared across organizations. Users of Grafana 8.0 and 8.1 are advised to use the new Grafana 8 Alerts only if they have one organization. Otherwise, silences for the Grafana managed alerts will be visible by all organizations.
1. In the Grafana menu, click the **Alerting** (bell) icon to open the Alerting page listing existing alerts.
1. Click **Notification policies**.
1. From the **Alertmanager** dropdown, select an external Alertmanager. By default, the Grafana Alertmanager is selected.
1. In the Root policy section, click **Edit** (pen icon).
1. In the Default policy section, click **...** &rsaquo; **Edit** (pen icon).
1. In **Default contact point**, update the contact point to whom notifications should be sent for rules when alert rules do not match any specific policy.
1. In **Group by**, choose labels to group alerts by. If multiple alerts are matched for this policy, then they are grouped by these labels. A notification is sent per group. If the field is empty (default), then all notifications are sent in a single group. Use a special label `...` to group alerts by all labels (which effectively disables grouping).
1. In **Timing options**, select from the following options:
@ -50,7 +48,7 @@ You can configure grouping to be `group_by: [alertname]` (take note that the `en
- **Repeat interval** Minimum time interval for re-sending a notification if no new alerts were added to the group. Default is 4 hours.
1. Click **Save** to save your changes.
## Add new specific policy
## Add new nested policy
1. In the Grafana menu, click the **Alerting** (bell) icon to open the Alerting page listing existing alerts.
1. Click **Notification policies**.
@ -59,7 +57,7 @@ You can configure grouping to be `group_by: [alertname]` (take note that the `en
1. In **Matching labels** section, add one or more rules for matching alert labels.
1. In **Contact point**, add the contact point to send notification to if alert matches only this specific policy and not any of the nested policies.
1. Optionally, enable **Continue matching subsequent sibling nodes** to continue matching sibling policies even after the alert matched the current policy. When this option is enabled, you can get more than one notification for one alert.
1. Optionally, enable **Override grouping** to specify the same grouping as the root policy. If this option is not enabled, the root policy grouping is used.
1. Optionally, enable **Override grouping** to specify the same grouping as the default policy. If this option is not enabled, the default policy grouping is used.
1. Optionally, enable **Override general timings** to override the timing options configured in the group notification policy.
1. Click **Save policy** to save your changes.
@ -76,14 +74,29 @@ You can configure grouping to be `group_by: [alertname]` (take note that the `en
1. Make any changes using instructions in [Add new specific policy](#add-new-specific-policy).
1. Click **Save policy**.
## Searching for policies
Grafana allows you to search within the tree of policies by the following:
- **Label matchers**
- **Contact Points**
To search by contact point simply select a contact point from the **Search by contact point** dropdown. The policies that use that contact point will be highlighted in the user interface.
To search by label matchers simply enter a valid matcher in the **Search by matchers** input field. Multiple matchers can be combined with a comma (`,`).
An example of a valid matchers search input is:
`severity=high, region=~EMEA|NASA`
> All matched policies will be **exact** matches, we currently do not support regex-style or partial matching.
## Example
An example of an alert configuration.
- Create a "default" contact point for slack notifications, and set it on root policy.
- Edit the root policy grouping to group alerts by `cluster`, `namespace` and `severity` so that you get a notification per alert rule and specific kubernetes cluster and namespace.
- Create a "default" contact point for slack notifications, and set it on the default policy.
- Edit the default policy grouping to group alerts by `cluster`, `namespace` and `severity` so that you get a notification per alert rule and specific kubernetes cluster and namespace.
- Create specific route for alerts coming from the development cluster with an appropriate contact point.
- Create a specific route for alerts with "critical" severity with a more invasive contact point integration, like pager duty notification.
- Create specific routes for particular teams that handle their own onduty rotations.
{{< figure max-width="40%" src="/static/img/docs/alerting/unified/notification-policies-8-0.png" max-width="650px" caption="Notification policies" >}}
- Create specific routes for particular teams that handle their own on-call rotations.

View File

@ -36,7 +36,8 @@ The following table highlights the key differences between mute timings and sile
1. In the Grafana menu, click the **Alerting** (bell) icon to open the Alerting page listing existing alerts.
1. Click **Notification policies**.
1. From the **Alertmanager** dropdown, select an external Alertmanager. By default, the Grafana Alertmanager is selected.
1. At the bottom of the page there will be a section titled **Mute timings**. Click the **Add mute timing** button.
1. Click the **Mute Timings** tab.
1. Click **Add mute timing**.
1. You will be redirected to a form to create a [time interval](#time-intervals) to match against for your mute timing.
1. Click **Submit** to create the mute timing.

View File

@ -17,7 +17,7 @@ weight: 800
# View and filter by alert groups
Alert groups show grouped alerts from an Alertmanager instance. By default, alert rules are grouped by the label keys for the root policy in notification policies. Grouping common alert rules into a single alert group prevents duplicate alert rules from being fired.
Alert groups show grouped alerts from an Alertmanager instance. By default, alert rules are grouped by the label keys for the default policy in notification policies. Grouping common alert rules into a single alert group prevents duplicate alert rules from being fired.
You can view alert groups and also filter for alert rules that match specific criteria.
@ -30,7 +30,7 @@ To view alert groups, complete the following steps.
1. From the **Alertmanager** drop-down, select an external Alertmanager as your data source. By default, the `Grafana` Alertmanager is selected.
1. From **custom group by** drop-down, select a combination of labels to view a grouping other than the default. This is useful for debugging and verifying your grouping of notification policies.
If an alert does not contain labels specified either in the grouping of the root policy or the custom grouping, then the alert is added to a catch all group with a header of `No grouping`.
If an alert does not contain labels specified either in the grouping of the default policy or the custom grouping, then the alert is added to a catch all group with a header of `No grouping`.
## Filter alerts

View File

@ -323,7 +323,9 @@ Now that Grafana knows how to notify us, it's time to set up an alert rule:
1. In **Section 4**, you can add some sample text to your summary message. [Read more about message templating here](/docs/grafana/latest/alerting/unified-alerting/message-templating/).
1. Click **Save and exit** at the top of the page.
1. In Grafana's sidebar, hover the cursor over the **Alerting** (bell) icon and then click **Notification policies**.
1. Under **Root policy**, press **Edit** and change the **Default contact point** to **RequestBin**. As a system grows, admins can use the **Notification policies** setting to organize and match alert rules to specific contact points.
1. Under **Default policy**, select **...** &rsaquo; **Edit** and change the **Default contact point** to **RequestBin**.
As a system grows, admins can use the **Notification policies** setting to organize and match alert rules to
specific contact points.
### Trigger a Grafana Managed Alert