mirror of
https://github.com/grafana/grafana.git
synced 2025-02-25 18:55:37 -06:00
sandboxfy alerting tutorial part 2 (#95260)
* sandboxfy alerting tutorial part 2
* format
* format2
* added frontmatter
* all pretty no pity
* added setup, login
* link
* STEPS
* login link
* update steps
* improve visibility of part 2
* link
* new diagram image
* img
* all pretty, no pity
parent 6fcfd132e5
commit 9b773b8501
@ -2,8 +2,7 @@
|
|||||||
Feedback Link: https://github.com/grafana/tutorials/issues/new
|
Feedback Link: https://github.com/grafana/tutorials/issues/new
|
||||||
categories:
|
categories:
|
||||||
- alerting
|
- alerting
|
||||||
description: This is part 2 of the Get started with Grafana Alerting tutorials. Learn how to leverage alert instances, and set up a notification policy that routes alert notifications based on labels to a specific contact point.
|
description: Learn to use alert instances and route notifications by labels to contacts, building on your alerting skills in Grafana for more advanced workflows — Part 2.
|
||||||
id: alerting-get-started-pt2
|
|
||||||
labels:
|
labels:
|
||||||
products:
|
products:
|
||||||
- enterprise
|
- enterprise
|
||||||
@ -13,11 +12,18 @@ tags:
|
|||||||
- beginner
|
- beginner
|
||||||
title: Get started with Grafana Alerting - Part 2
|
title: Get started with Grafana Alerting - Part 2
|
||||||
weight: 50
|
weight: 50
|
||||||
|
killercoda:
|
||||||
|
title: Get started with Grafana Alerting - Part 2
|
||||||
|
description: Learn to use alert instances and route notifications by labels to contacts, building on your alerting skills in Grafana for more advanced workflows — Part 2.
|
||||||
|
backend:
|
||||||
|
imageid: ubuntu
|
||||||
---
|
---
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page intro.md START -->
|
||||||
|
|
||||||
# Get started with Grafana Alerting - Part 2
|
# Get started with Grafana Alerting - Part 2
|
||||||
|
|
||||||
This is part 2 of the [Get Started with Grafana Alerting tutorial](http://grafana.com/tutorials/alerting-get-started/).
|
The Get started with Grafana Alerting tutorial Part 2 is a continuation of [Get started with Grafana Alerting tutorial Part 1](http://www.grafana.com/tutorials/alerting-get-started/).
|
||||||
|
|
||||||
In this guide, we dig into more complex yet equally fundamental elements of Grafana Alerting: **alert instances** and **notification policies**.
|
In this guide, we dig into more complex yet equally fundamental elements of Grafana Alerting: **alert instances** and **notification policies**.
|
||||||
|
|
||||||
@ -29,6 +35,107 @@ After introducing each component, you will learn how to:
|
|||||||
|
|
||||||
Learning about alert instances and notification policies is useful if you have more than one contact point in your organization, or if your alert rule returns a number of metrics that you want to handle separately by routing each alert instance to a specific contact point. The tutorial will introduce each concept, followed by how to apply both concepts in a real-world scenario.
|
Learning about alert instances and notification policies is useful if you have more than one contact point in your organization, or if your alert rule returns a number of metrics that you want to handle separately by routing each alert instance to a specific contact point. The tutorial will introduce each concept, followed by how to apply both concepts in a real-world scenario.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page intro.md END -->
|
||||||
|
<!-- INTERACTIVE page step1.md START -->
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore START -->
|
||||||
|
|
||||||
|
{{< docs/ignore >}}
|
||||||
|
|
||||||
|
## Set up the Grafana stack
|
||||||
|
|
||||||
|
{{< /docs/ignore >}}
|
||||||
|
|
||||||
|
## Before you begin
|
||||||
|
|
||||||
|
There are different ways you can follow along with this tutorial.
|
||||||
|
|
||||||
|
- **Grafana Cloud**
|
||||||
|
|
||||||
|
- As a Grafana Cloud user, you don't have to install anything. [Create your free account](http://www.grafana.com/auth/sign-up/create-user).
|
||||||
|
|
||||||
|
Continue to [Alert instances](#alert-instances).
|
||||||
|
|
||||||
|
- **Interactive learning environment**
|
||||||
|
|
||||||
|
- Alternatively, you can try out this example in our interactive learning environment: [Get started with Grafana Alerting - Part 2](https://killercoda.com/grafana-labs/course/grafana/alerting-get-started-pt2/). It's a fully configured environment with all the dependencies already installed.
|
||||||
|
|
||||||
|
- **Grafana OSS**
|
||||||
|
|
||||||
|
- If you opt to run a Grafana stack locally, ensure you have the following applications installed:
|
||||||
|
|
||||||
|
- [Docker Compose](https://docs.docker.com/get-docker/) (included in Docker for Desktop for macOS and Windows)
|
||||||
|
- [Git](https://git-scm.com/)
|
||||||
|
|
||||||
|
### Set up the Grafana stack (OSS users)
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore END -->
|
||||||
|
|
||||||
|
To demonstrate the observation of data using the Grafana stack, download and run the following files.
|
||||||
|
|
||||||
|
1. Clone the [tutorial environment repository](https://www.github.com/grafana/tutorial-environment).
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec START -->
|
||||||
|
|
||||||
|
```
|
||||||
|
git clone https://github.com/grafana/tutorial-environment.git
|
||||||
|
```
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec END -->
|
||||||
|
|
||||||
|
1. Change to the directory where you cloned the repository:
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec START -->
|
||||||
|
|
||||||
|
```
|
||||||
|
cd tutorial-environment
|
||||||
|
```
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec END -->
|
||||||
|
|
||||||
|
1. Run the Grafana stack:
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore START -->
|
||||||
|
|
||||||
|
```
|
||||||
|
docker compose up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore END -->
|
||||||
|
|
||||||
|
{{< docs/ignore >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec START -->
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker-compose up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
<!-- INTERACTIVE exec END -->
|
||||||
|
|
||||||
|
{{< /docs/ignore >}}
|
||||||
|
|
||||||
|
The first time you run `docker compose up -d`, Docker downloads all the necessary resources for the tutorial. This might take a few minutes, depending on your internet connection.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore START -->
|
||||||
|
|
||||||
|
{{< admonition type="note" >}}
|
||||||
|
If you already have Grafana, Loki, or Prometheus running on your system, you might see errors, because the Docker image is trying to use ports that your local installations are already using. If this is the case, stop the services, then run the command again.
|
||||||
|
{{< /admonition >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore END -->
|
||||||
|
|
||||||
|
{{< docs/ignore >}}
|
||||||
|
|
||||||
|
NOTE:
|
||||||
|
|
||||||
|
If you already have Grafana, Loki, or Prometheus running on your system, you might see errors, because the Docker image is trying to use ports that your local installations are already using. If this is the case, stop the services, then run the command again.
|
||||||
|
|
||||||
|
{{< /docs/ignore >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step1.md END -->
|
||||||
|
<!-- INTERACTIVE page step2.md START -->
|
||||||
|
|
||||||
## Alert instances
|
## Alert instances
|
||||||
|
|
||||||
An [alert instance](https://grafana.com/docs/grafana/latest/alerting/fundamentals/#alert-instances) is an event that matches a metric returned by an alert rule query.
|
An [alert instance](https://grafana.com/docs/grafana/latest/alerting/fundamentals/#alert-instances) is an event that matches a metric returned by an alert rule query.
|
||||||
@ -37,10 +144,13 @@ Let's consider a scenario where you're monitoring website traffic using Grafana.
|
|||||||
|
|
||||||
If the query returns more than one time-series, each time-series represents a different metric or aspect being monitored. In this case, the alert rule is applied individually to each time-series.
|
If the query returns more than one time-series, each time-series represents a different metric or aspect being monitored. In this case, the alert rule is applied individually to each time-series.
|
||||||
|
|
||||||
{{< figure alt="Screenshot displaying alert instances in the context of an alert rule, highlighting the specific alerts triggered by the rule and their respective statuses" src="/media/docs/alerting/get-started-digram-instance-grey.png" max-width="1200px" caption="Alert Instances in the Context of an Alert Rule" >}}
|
{{< figure alt="Screenshot displaying alert instances in the context of an alert rule, highlighting the specific alerts triggered by the rule and their respective statuses" src="/media/docs/alerting/alert-instance-flow.jpg" max-width="1200px" caption="Alert Instances in the Context of an Alert Rule" >}}
|
||||||
|
|
||||||
In this scenario, each time-series is evaluated independently against the alert rule. It results in the creation of an alert instance for each time-series. The time-series corresponding to the desktop page views meets the threshold and, therefore, results in an alert instance in **Firing** state for which an alert notification is sent. The mobile alert instance state remains **Normal**.
|
In this scenario, each time-series is evaluated independently against the alert rule. It results in the creation of an alert instance for each time-series. The time-series corresponding to the desktop page views meets the threshold and, therefore, results in an alert instance in **Firing** state for which an alert notification is sent. The mobile alert instance state remains **Normal**.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step2.md END -->
|
||||||
|
<!-- INTERACTIVE page step3.md START -->
|
||||||
|
|
||||||
## Notification policies
|
## Notification policies
|
||||||
|
|
||||||
[Notification policies](https://grafana.com/docs/grafana/latest/alerting/fundamentals/notifications/notification-policies/) route alerts to different communication channels, reducing alert noise and providing control over when and how alerts are sent. For example, you might use notification policies to ensure that critical alerts about server downtime are sent immediately to the on-call engineer. Another use case could be routing performance alerts to the development team for review and action.
|
[Notification policies](https://grafana.com/docs/grafana/latest/alerting/fundamentals/notifications/notification-policies/) route alerts to different communication channels, reducing alert noise and providing control over when and how alerts are sent. For example, you might use notification policies to ensure that critical alerts about server downtime are sent immediately to the on-call engineer. Another use case could be routing performance alerts to the development team for review and action.
|
||||||
@ -54,10 +164,19 @@ Key Characteristics:
|
|||||||
|
|
||||||
In the above diagram, alert instances and notification policies are matched by labels. For instance, the label `team=operations` matches the alert instance “**Pod stuck in CrashLoop**” and “**Disk Usage -80%**” to child policies that send alert notifications to a particular contact point (operations@grafana.com).
|
In the above diagram, alert instances and notification policies are matched by labels. For instance, the label `team=operations` matches the alert instance “**Pod stuck in CrashLoop**” and “**Disk Usage -80%**” to child policies that send alert notifications to a particular contact point (operations@grafana.com).
|
||||||
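The label matching described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Grafana implements routing: the `Policy` class, the `route` function, and the `default-email` contact point are hypothetical names, and only exact `label=value` matchers are modeled. The `team=operations` label and the operations@grafana.com contact point come from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical model of a notification policy: matchers plus a contact point."""
    contact_point: str
    matchers: dict[str, str] = field(default_factory=dict)
    children: list["Policy"] = field(default_factory=list)

    def matches(self, labels: dict[str, str]) -> bool:
        # Every matcher must appear in the alert labels with an equal value.
        return all(labels.get(key) == value for key, value in self.matchers.items())


def route(policy: Policy, labels: dict[str, str]) -> str:
    """Return the contact point of the first matching child policy,
    falling back to the parent policy when no child matches."""
    for child in policy.children:
        if child.matches(labels):
            return route(child, labels)
    return policy.contact_point


# Default policy with one child policy, mirroring the diagram.
default_policy = Policy(
    contact_point="default-email",  # hypothetical fallback contact point
    children=[
        Policy(contact_point="operations@grafana.com", matchers={"team": "operations"}),
    ],
)

crash_loop = {"alertname": "Pod stuck in CrashLoop", "team": "operations"}
unlabeled = {"alertname": "Some other alert"}

print(route(default_policy, crash_loop))
print(route(default_policy, unlabeled))
```

An alert instance that carries `team=operations` lands on the child policy's contact point, while an instance without a matching label falls back to the default policy.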
|
|
||||||
|
<!-- INTERACTIVE page step3.md END -->
|
||||||
|
<!-- INTERACTIVE page step4.md START -->
|
||||||
|
|
||||||
## Create notification policies
|
## Create notification policies
|
||||||
|
|
||||||
Create a notification policy if you want to handle metrics returned by alert rules separately by routing each alert instance to a specific contact point. In Grafana, click on the icon at the top left corner of the screen to access the navigation menu.
|
Create a notification policy if you want to handle metrics returned by alert rules separately by routing each alert instance to a specific contact point. In Grafana, click on the icon at the top left corner of the screen to access the navigation menu.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore START -->
|
||||||
|
|
||||||
|
1. In your browser, **sign in** to your Grafana Cloud account.
|
||||||
|
|
||||||
|
OSS and interactive learning environment users: To log in, navigate to [http://localhost:3000](http://localhost:3000), where Grafana should be running.
|
||||||
|
|
||||||
1. Navigate to **Alerts & IRM > Alerting > Notification policies**.
|
1. Navigate to **Alerts & IRM > Alerting > Notification policies**.
|
||||||
1. In the Default policy, click **+ New child policy**.
|
1. In the Default policy, click **+ New child policy**.
|
||||||
1. In the field **Label** enter `device`, and in the field **Value** enter `desktop`.
|
1. In the field **Label** enter `device`, and in the field **Value** enter `desktop`.
|
||||||
@ -71,6 +190,29 @@ Create a notification policy if you want to handle metrics returned by alert rul
|
|||||||
|
|
||||||
1. **Repeat the steps above to create a second child policy** to match another alert instance. For labels use: `device=mobile`. Use the Webhook integration for the contact point. Alternatively, experiment by using a different Webhook endpoint or a [different integration](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/#list-of-supported-integrations).
|
1. **Repeat the steps above to create a second child policy** to match another alert instance. For labels use: `device=mobile`. Use the Webhook integration for the contact point. Alternatively, experiment by using a different Webhook endpoint or a [different integration](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/#list-of-supported-integrations).
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore END -->
|
||||||
|
|
||||||
|
{{< docs/ignore >}}
|
||||||
|
|
||||||
|
1. Visit [http://localhost:3000](http://localhost:3000), where Grafana should be running
|
||||||
|
1. Navigate to **Alerts & IRM > Alerting > Notification policies**.
|
||||||
|
1. In the Default policy, click **+ New child policy**.
|
||||||
|
1. In the field **Label** enter `device`, and in the field **Value** enter `desktop`.
|
||||||
|
1. From the **Contact point** drop-down, choose **Webhook**.
|
||||||
|
|
||||||
|
If you don’t have any contact points, add a [Contact point](https://grafana.com/tutorials/alerting-get-started/#create-a-contact-point).
|
||||||
|
|
||||||
|
1. Click **Save Policy**.
|
||||||
|
|
||||||
|
This new child policy routes alerts that match the label `device=desktop` to the Webhook contact point.
|
||||||
|
|
||||||
|
1. **Repeat the steps above to create a second child policy** to match another alert instance. For labels use: `device=mobile`. Use the Webhook integration for the contact point. Alternatively, experiment by using a different Webhook endpoint or a [different integration](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/#list-of-supported-integrations).
|
||||||
|
|
||||||
|
{{< /docs/ignore >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step4.md END -->
|
||||||
|
<!-- INTERACTIVE page step5.md START -->
|
||||||
|
|
||||||
## Create an alert rule that returns alert instances
|
## Create an alert rule that returns alert instances
|
||||||
|
|
||||||
The alert rule that you are about to create is meant to monitor web traffic page views. The objective is to explore what an alert instance is and how to leverage routing individual alert instances by using label matchers and notification policies.
|
The alert rule that you are about to create is meant to monitor web traffic page views. The objective is to explore what an alert instance is and how to leverage routing individual alert instances by using label matchers and notification policies.
|
||||||
@ -85,6 +227,8 @@ Grafana includes a [test data source](https://grafana.com/docs/grafana/latest/da
|
|||||||
1. Click **Save & test**.
|
1. Click **Save & test**.
|
||||||
|
|
||||||
You should see a message confirming that the data source is working.
|
You should see a message confirming that the data source is working.
|
||||||
|
<!-- INTERACTIVE page step5.md END -->
|
||||||
|
<!-- INTERACTIVE page step6.md START -->
|
||||||
|
|
||||||
### Create an alert rule
|
### Create an alert rule
|
||||||
|
|
||||||
@ -97,30 +241,33 @@ Make it short and descriptive as this will appear in your alert notification. Fo
|
|||||||
|
|
||||||
### Define query and alert condition
|
### Define query and alert condition
|
||||||
|
|
||||||
In this section, we use the **Advanced options** for Grafana-managed alert rule creation. The advanced options let us define queries, expressions (used to manipulate the data), and the condition that must be met for the alert to be triggered.
|
In this section, we use the default options for Grafana-managed alert rule creation. The default options let us define the query, an expression (used to manipulate the data; the `WHEN` field in the UI), and the condition that must be met for the alert to be triggered (in default mode, this is the threshold).
|
||||||
|
|
||||||
1. Toggle **Advanced options** to view additional configuration options.
|
|
||||||
1. Select **TestData** data source from the drop-down menu.
|
1. Select **TestData** data source from the drop-down menu.
|
||||||
1. From **Scenario** select **CSV Content**.
|
1. From **Scenario** select **CSV Content**.
|
||||||
|
1. In the Query editor, switch to **Code** mode by clicking the button on the right.
|
||||||
1. Copy in the following CSV data:
|
1. Copy in the following CSV data:
|
||||||
|
|
||||||
```
|
```
|
||||||
device,views
|
device,views
|
||||||
desktop,1200
|
desktop,1200
|
||||||
mobile,900
|
mobile,900
|
||||||
```
|
```
|
||||||
|
|
||||||
The above CSV data simulates a data source returning multiple time series, each leading to the creation of an alert instance for that specific time series. Note that the data returned matches the example in the [Alert instance](#alert-instances) section.
|
The above CSV data simulates a data source returning multiple time series, each leading to the creation of an alert instance for that specific time series. Note that the data returned matches the example in the [Alert instance](#alert-instances) section.
|
||||||
|
|
||||||
|
1. In the **Alert condition** section:
|
||||||
|
|
||||||
|
- Keep `Last` as the value for the reducer function (`WHEN`), and `1000` as the threshold value. This is the value above which the alert rule should trigger.
|
||||||
|
|
||||||
1. Remove the ‘B’ **Reduce expression** (click the bin icon). The Reduce expression is default, and in this case, is not required since the queried data is already reduced. Note that the Threshold expression is now your **Alert condition**.
|
|
||||||
1. In the ‘C’ **Threshold expression**:
|
|
||||||
- Change the **Input** to ‘**A**’ to select the data source.
|
|
||||||
- Enter `1000` as the threshold value. This is the value above which the alert rule should trigger.
|
|
||||||
1. Click **Preview** to run the queries.
|
1. Click **Preview** to run the queries.
|
||||||
|
|
||||||
It should return two series: `desktop` in Firing state, and `mobile` in Normal state. The values `1` and `0` mean that the condition is either `true` or `false`.
|
It should return two series: `desktop` in Firing state, and `mobile` in Normal state. The values `1` and `0` mean that the condition is either `true` or `false`.
|
||||||
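To make the preview result concrete, here is a minimal Python sketch of the evaluation, using the CSV data and the `1000` threshold from the steps above. It is an illustration only, not Grafana's evaluation engine: the `evaluate` function and the shape of the returned dictionaries are assumptions made for this example.

```python
import csv
import io

# CSV data and threshold taken from the tutorial steps.
CSV_CONTENT = """device,views
desktop,1200
mobile,900
"""
THRESHOLD = 1000


def evaluate(csv_text: str, threshold: float) -> list[dict]:
    """Evaluate each row (series) independently and return one alert instance per row."""
    instances = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        views = float(row["views"])
        firing = views > threshold  # the rule triggers above the threshold
        instances.append(
            {
                "labels": {"device": row["device"]},
                "value": int(firing),  # 1 when the condition is true, 0 otherwise
                "state": "Firing" if firing else "Normal",
            }
        )
    return instances


for instance in evaluate(CSV_CONTENT, THRESHOLD):
    print(instance["labels"], instance["value"], instance["state"])
```

Each row produces its own alert instance with a `device` label, which is the label the child notification policies match on.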
|
|
||||||
{{< figure alt="Screenshot showing a preview of a query in Grafana that returns two alert instances, including the query results and relevant alert details" src="/media/docs/alerting/get-started-expression-instances.png" max-width="1200px" caption="Preview of a query returning two alert instances in Grafana." >}}
|
{{< figure alt="Screenshot showing a preview of a query in Grafana that returns two alert instances, including the query results and relevant alert details" src="/media/docs/alerting/firing-instances.png" max-width="1200px" caption="Preview of a query returning two alert instances in Grafana." >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step6.md END -->
|
||||||
|
<!-- INTERACTIVE page step7.md START -->
|
||||||
|
|
||||||
### Set evaluation behavior
|
### Set evaluation behavior
|
||||||
|
|
||||||
@ -144,14 +291,24 @@ In this section, you can select how you want to route your alert instances. Sinc
|
|||||||
|
|
||||||
{{< figure alt="Screenshot showing a routing preview of matched notification policies, detailing how alerts are matched and routed to specific notification channels" src="/media/docs/alerting/get-started-alert-instace-routing-prev.png" max-width="1200px" caption="Routing preview of matched notification policies" >}}
|
{{< figure alt="Screenshot showing a routing preview of matched notification policies, detailing how alerts are matched and routed to specific notification channels" src="/media/docs/alerting/get-started-alert-instace-routing-prev.png" max-width="1200px" caption="Routing preview of matched notification policies" >}}
|
||||||
|
|
||||||
|
<!-- INTERACTIVE ignore START -->
|
||||||
|
|
||||||
{{< admonition type="note" >}}
|
{{< admonition type="note" >}}
|
||||||
Even if both labels match the policies, only the alert instance in Firing state produces an alert notification.
|
Even if both labels match the policies, only the alert instance in Firing state produces an alert notification.
|
||||||
{{</ admonition >}}
|
{{</ admonition >}}
|
||||||
|
<!-- INTERACTIVE ignore END -->
|
||||||
|
|
||||||
|
{{< docs/ignore >}}
|
||||||
|
Even if both labels match the policies, only the alert instance in Firing state produces an alert notification.
|
||||||
|
{{< /docs/ignore >}}
|
||||||
|
|
||||||
1. Click **Save rule and exit**.
|
1. Click **Save rule and exit**.
|
||||||
|
|
||||||
Now that we have set up the alert rule, it’s time to check the alert notification.
|
Now that we have set up the alert rule, it’s time to check the alert notification.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step7.md END -->
|
||||||
|
<!-- INTERACTIVE page step8.md START -->
|
||||||
|
|
||||||
## Receive alert notifications
|
## Receive alert notifications
|
||||||
|
|
||||||
Now that the alert rule has been configured, you should receive alert [notifications](http://grafana.com/docs/grafana/next/alerting/fundamentals/alert-rule-evaluation/state-and-health/#notifications) in the contact point whenever the alert triggers and gets resolved. In our example, each alert instance should be routed separately as we configured labels to match notification policies. Once the evaluation interval has concluded (1m), you should receive an alert notification in the Webhook endpoint.
|
Now that the alert rule has been configured, you should receive alert [notifications](http://grafana.com/docs/grafana/next/alerting/fundamentals/alert-rule-evaluation/state-and-health/#notifications) in the contact point whenever the alert triggers and gets resolved. In our example, each alert instance should be routed separately as we configured labels to match notification policies. Once the evaluation interval has concluded (1m), you should receive an alert notification in the Webhook endpoint.
|
||||||
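If you want to inspect what arrives at the Webhook endpoint, the following Python sketch pulls the `device` label out of a notification body. It assumes the body is JSON with an `alerts` list whose entries carry `status` and `labels` fields; the `firing_devices` helper, the sample payload, and the `alertname` value are made up for illustration, so check the body your endpoint actually receives.

```python
import json


def firing_devices(body: str) -> list[str]:
    """Return the `device` label of every firing alert in a webhook body.

    Assumes a JSON body with an `alerts` list whose entries have
    `status` and `labels` fields.
    """
    payload = json.loads(body)
    return [
        alert.get("labels", {}).get("device", "<no device label>")
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
    ]


# Hypothetical sample body, trimmed to the fields used above.
sample_body = json.dumps(
    {
        "status": "firing",
        "alerts": [
            {
                "status": "firing",
                "labels": {"alertname": "web-traffic", "device": "desktop"},
            }
        ],
    }
)

print(firing_devices(sample_body))
```

With the CSV data from this tutorial, only the `desktop` instance is above the threshold, so only that instance should show up as firing.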
@ -162,6 +319,9 @@ The alert notification details show that the alert instance corresponding to the
|
|||||||
|
|
||||||
Feel free to change the CSV data in the alert rule to trigger the routing of the alert instance that matches the label `device=mobile`.
|
Feel free to change the CSV data in the alert rule to trigger the routing of the alert instance that matches the label `device=mobile`.
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page step8.md END -->
|
||||||
|
<!-- INTERACTIVE page finish.md START -->
|
||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
In this tutorial, you have learned how Grafana Alerting can route individual alert instances using the labels generated by the data-source query and match these labels with notification policies, which in turn routes alert notifications to specific contact points.
|
In this tutorial, you have learned how Grafana Alerting can route individual alert instances using the labels generated by the data-source query and match these labels with notification policies, which in turn routes alert notifications to specific contact points.
|
||||||
@ -169,3 +329,5 @@ In this tutorial, you have learned how Grafana Alerting can route individual ale
|
|||||||
If you run into any problems, you are welcome to post questions in our [Grafana Community forum](https://community.grafana.com/).
|
If you run into any problems, you are welcome to post questions in our [Grafana Community forum](https://community.grafana.com/).
|
||||||
|
|
||||||
Enjoy your monitoring!
|
Enjoy your monitoring!
|
||||||
|
|
||||||
|
<!-- INTERACTIVE page finish.md END -->
|
||||||
|