diff --git a/docs/sources/alerting/alerts-overview.md b/docs/sources/alerting/alerts-overview.md new file mode 100644 index 00000000000..de5b3e04556 --- /dev/null +++ b/docs/sources/alerting/alerts-overview.md @@ -0,0 +1,56 @@ ++++ +title = "Alerts overview" +type = "docs" +[menu.docs] +identifier = "alerting" +parent = "Alerting" +aliases = ["/docs/grafana/latest/alerting/rules/", "/docs/grafana/latest/alerting/metrics/"] +weight = 100 ++++ + +# Alerts overview + +Alerts allow you to identify problems in your system moments after they occur. By quickly identifying unintended changes in your system, you can minimize disruptions to your services. + +Alerts consist of two parts: + +- Alert rules - When the alert is triggered. Alert rules are defined by one or more conditions that are regularly evaluated by Grafana. +- Notification channel - How the alert is delivered. When the conditions of an alert rule are met, Grafana notifies the channels configured for that alert. + +Currently, only the graph panel visualization supports alerts. + +## Alert tasks + +You can perform the following tasks for alerts: + +- [Add or edit an alert notification channel]({{< relref "notifications.md" >}}) +- [Create an alert rule]({{< relref "create-alerts.md" >}}) +- [View existing alert rules and their current state]({{< relref "view-alerts.md" >}}) +- [Test alert rules and troubleshoot]({{< relref "troubleshoot-alerts.md" >}}) + +## Clustering + +Currently, alerting supports a limited form of high availability. Since Grafana v4.2.0, alert notifications are deduplicated when running multiple servers. This means all alerts are executed on every server, but no duplicate alert notifications are sent due to the deduplication logic. Proper load balancing of alerts will be introduced in the future. + +## Notifications + +You can also set alert rule notifications along with a detailed message about the alert rule.
The message can contain anything: information about how you might solve the issue, link to runbook, and so on. + +The actual notifications are configured and shared between multiple alerts. + +## Alert execution + +Alert rules are evaluated in the Grafana backend in a scheduler and query execution engine that is part +of core Grafana. Only some data sources are supported right now. They include `Graphite`, `Prometheus`, `InfluxDB`, `Elasticsearch`, +`Stackdriver`, `Cloudwatch`, `Azure Monitor`, `MySQL`, `PostgreSQL`, `MSSQL`, `OpenTSDB`, `Oracle`, and `Azure Data Explorer`. + +## Metrics from the alert engine + +The alert engine publishes some internal metrics about itself. You can read more about how Grafana publishes [internal metrics]({{< relref "../administration/metrics/" >}}). + +Description | Type | Metric name +---------- | ----------- | ---------- +Total number of alerts | counter | `alerting.active_alerts` +Alert execution result | counter | `alerting.result` +Notifications sent counter | counter | `alerting.notifications_sent` +Alert execution timer | timer | `alerting.execution_time` diff --git a/docs/sources/alerting/create-alerts.md b/docs/sources/alerting/create-alerts.md new file mode 100644 index 00000000000..0637d6e0baf --- /dev/null +++ b/docs/sources/alerting/create-alerts.md @@ -0,0 +1,129 @@ ++++ +title = "Create alerts" +description = "Configure alert rules" +keywords = ["grafana", "alerting", "guide", "rules"] +type = "docs" +[menu.docs] +name = "Create alerts" +parent = "alerting" +weight = 200 ++++ + +# Create alerts + +Grafana alerting allows you to attach rules to your dashboard panels. When you save the dashboard, Grafana extracts the alert rules into a separate alert rule storage and schedules them for evaluation. 
+ +{{< imgbox max-width="1000px" img="/img/docs/alerting/drag_handles_gif.gif" caption="Alerting overview" >}} + +In the Alert tab of the graph panel you can configure how often the alert rule should be evaluated and the conditions that need to be met for the alert to change state and trigger its [notifications]({{< relref "notifications.md" >}}). + +Currently only the graph panel supports alert rules. + +## Add or edit an alert rule + +1. Navigate to the panel you want to add or edit an alert rule for, click the title, and then click **Edit**. +1. On the Alert tab, click **Create Alert**. If an alert already exists for this panel, then you can just edit the fields on the Alert tab. +1. Fill out the fields. Descriptions are listed below in [Alert rule fields](#alert-rule-fields). +1. When you have finished writing your rule, click **Save** in the upper right corner to save alert rule and the dashboard. +1. (Optional but recommended) Click **Test rule** to make sure the rule returns the results you expect. + +## Delete an alert + +To delete an alert, scroll to the bottom of the alert and then click **Delete**. + +## Alert rule fields + +This section describes the fields you fill out to create an alert. + +### Rule + +- **Name -** Enter a descriptive name. The name will be displayed in the Alert Rules list. +- **Evaluate every -** Specify how often the scheduler should evaluate the alert rule. This is referred to as the _evaluation interval_. +- **For -** Specify how long the query needs to violate the configured thresholds before the alert notification triggers. + +You can set a minimum evaluation interval in the `alerting.min_interval_seconds` config field, to set a minimum time between evaluations. Refer to [Configuration]({{< relref "../installation/configuration.md" >}}#min-interval-seconds) for more information. + +> **Caution:** Do not use `For` with the `If no data or all values are null` setting set to `No Data`. 
A `No Data` result triggers instantly and does not take `For` into consideration. This may also result in an OK notification not being sent if the alert transitions from `No Data -> Pending -> OK`. + +If an alert rule has a configured `For` and the query violates the configured threshold, then it will first go from `OK` to `Pending`. Grafana does not send any notifications for the transition from `OK` to `Pending`. Once the alert rule has been firing for more than the `For` duration, it changes to `Alerting` and sends alert notifications. + +It is usually a good idea to use this setting, since it is often worse to get a false positive than to wait a few minutes before the alert notification triggers. Alerts in the pending state are visible on the `Alert list` page and in `Alert list` panels. + +Below you can see an example timeline of an alert using the `For` setting. At ~16:04 the alert state changes to `Pending`, and after 4 minutes it changes to `Alerting`, which is when alert notifications are sent. Once the series falls back to normal, the alert rule goes back to `OK`. +{{< imgbox img="/img/docs/v54/alerting-for-dark-theme.png" caption="Alerting For" >}} + +{{< imgbox max-width="40%" img="/img/docs/v4/alerting_conditions.png" caption="Alerting Conditions" >}} + +### Conditions + +Currently, the only condition type is a `Query` condition, which allows you to +specify a query letter, a time range, and an aggregation function. + +#### Query condition example + +```sql +avg() OF query(A, 15m, now) IS BELOW 14 +``` + +- `avg()` Controls how the values for **each** series should be reduced to a value that can be compared against the threshold. Click on the function to change it to another aggregation function. +- `query(A, 15m, now)` The letter defines what query to execute from the **Metrics** tab. The second two parameters define the time range: `15m, now` means 15 minutes ago to now.
You can also do `10m, now-2m` to define a time range that will be 10 minutes ago to 2 minutes ago. This is useful if you want to ignore the last 2 minutes of data. +- `IS BELOW 14` Defines the type of threshold and the threshold value. You can click on `IS BELOW` to change the type of threshold. + +The query used in an alert rule cannot contain any template variables. Currently, only `AND` and `OR` operators are supported between conditions, and they are evaluated serially, left to right. +For example, given 3 conditions in the following order: +*condition:A(evaluates to: TRUE) OR condition:B(evaluates to: FALSE) AND condition:C(evaluates to: TRUE)* +the result is calculated as ((TRUE OR FALSE) AND TRUE) = TRUE. + +We plan to add other condition types in the future, like `Other Alert`, where you can include the state of another alert in your conditions, and `Time Of Day`. + +#### Multiple Series + +If a query returns multiple series, then the aggregation function and threshold check are evaluated for each series. What Grafana does not currently do is track alert rule state **per series**. This has implications that are detailed in the scenario below. + +- An alert condition with a query that returns 2 series: **server1** and **server2** +- The **server1** series causes the alert rule to fire and switch to state `Alerting` +- Notifications are sent out with the message: _load peaking (server1)_ +- In a subsequent evaluation of the same alert rule, the **server2** series also causes the alert rule to fire +- No new notifications are sent, because the alert rule is already in state `Alerting`. + +As this scenario shows, Grafana does not send out notifications when other series cause the alert to fire if the rule is already in state `Alerting`. To improve support for queries that return multiple series, we plan to track state **per series** in a future release. + +> Starting with Grafana v5.3 you can configure reminders to be sent for triggered alerts.
This will send additional notifications +> when an alert continues to fire. If other series (like server2 in the example above) also cause the alert rule to fire, they are included in the reminder notification. Depending on the notification channel you use, you may be able to take advantage of this feature to identify new or existing series causing an alert to fire. + +### No Data & Error Handling + +These options configure how the rule evaluation engine should handle queries that return no data or only null values. + +No Data Option | Description +------------ | ------------- +No Data | Set alert rule state to `NoData` +Alerting | Set alert rule state to `Alerting` +Keep Last State | Keep the current alert rule state, whatever it is. +Ok | Set alert rule state to `OK`. + +### Execution errors or timeouts + +Tell Grafana how to handle execution or timeout errors. + +Error or timeout option | Description +------------ | ------------- +Alerting | Set alert rule state to `Alerting` +Keep Last State | Keep the current alert rule state, whatever it is. + +If you have an unreliable time series store from which queries sometimes time out or fail randomly, you can set this option to `Keep Last State` to ignore those errors. + +## Notifications + +In the Alert tab, you can also specify alert rule notifications along with a detailed message about the alert rule. The message can contain anything: information about how you might solve the issue, a link to a runbook, and so on. + +The actual notifications are configured and shared between multiple alerts. Read +[Alert notifications]({{< relref "notifications.md" >}}) for information on how to configure and set up notifications. + +- **Send to -** Select an alert notification channel if you have one set up. +- **Message -** Enter a message to be sent on the notification channel. The message can be in text, markdown, or HTML format.
It can include links and variables as well. +- **Tags -** Specify a list of tags (key/value) to be included in the notification. It is only supported by [some notifiers]({{< relref "notifications/#all-supported-notifiers" >}}). + +## Alert state history and annotations + +Alert state changes are recorded in the internal annotation table in Grafana's database. The state changes are visualized as annotations in the alert rule's graph panel. You can also go into the `State history` submenu in the alert tab to view and clear state history. diff --git a/docs/sources/alerting/metrics.md b/docs/sources/alerting/metrics.md deleted file mode 100644 index 4bb40185b5a..00000000000 --- a/docs/sources/alerting/metrics.md +++ /dev/null @@ -1,21 +0,0 @@ -+++ -title = "Alerting Metrics" -description = "Alerting Metrics Guide" -keywords = ["Grafana", "alerting", "guide", "metrics"] -type = "docs" -[menu.docs] -name = "Metrics" -parent = "alerting" -weight = 2 -+++ - -# Metrics from the alert engine - -The alert engine publishes some internal metrics about itself. You can read more about how Grafana publishes [internal metrics]({{< relref "../administration/metrics/" >}}). 
- -Description | Type | Metric name ----------- | ----------- | ---------- -Total number of alerts | counter | `alerting.active_alerts` -Alert execution result | counter | `alerting.result` -Notifications sent counter | counter | `alerting.notifications_sent` -Alert execution timer | timer | `alerting.execution_time` diff --git a/docs/sources/alerting/notifications.md b/docs/sources/alerting/notifications.md index df4f6012de3..359992aae4f 100644 --- a/docs/sources/alerting/notifications.md +++ b/docs/sources/alerting/notifications.md @@ -1,46 +1,42 @@ +++ -title = "Alerting Notifications" -description = "Alerting Notifications Guide" +title = "Alert notifications" +description = "Alerting notifications guide" keywords = ["Grafana", "alerting", "guide", "notifications"] type = "docs" [menu.docs] name = "Notifications" parent = "alerting" -weight = 2 +weight = 200 +++ - -# Alert Notifications - -> Alerting is only available in Grafana v4.0 and above. +# Alert notifications When an alert changes state, it sends out notifications. Each alert rule can have multiple notifications. In order to add a notification to an alert rule you first need -to add and configure a `notification` channel (can be email, PagerDuty or other integration). -This is done from the Notification Channels page. +to add and configure a `notification` channel (can be email, PagerDuty, or other integration). -## Notification Channel Setup +This is done from the Notification channels page. -On the Notification Channels page hit the `New Channel` button to go the page where you -can configure and setup a new Notification Channel. +> **Note:** Alerting is only available in Grafana v4.0 and above. -You specify a name and a type, and type specific options. You can also test the notification to make -sure it's setup correctly. +## Add a notification channel + +1. In the Grafana side bar, hover your cursor over the **Alerting** (bell) icon and then click **Notification channels**. +1. 
Click **Add channel**. +1. Fill out the fields or select options described below. + +## New notification channel fields ### Default (send on all alerts) -When checked, this option will notify for all alert rules - existing and new. +- **Name -** Enter a name for this channel. It will be displayed when users add notifications to alert rules. +- **Type -** Select the channel type. Refer to the [List of supported notifiers](#list-of-supported-notifiers) for details. +- **Default (send on all alerts) -** When selected, this option sends a notification on this channel for all alert rules. +- **Include Image -** See [Enable images in notifications](#enable-images-in-notifications-external-image-store) for details. +- **Disable Resolve Message -** When selected, this option disables the resolve message [OK] that is sent when the alerting state returns to false. +- **Send reminders -** When this option is checked additional notifications (reminders) will be sent for triggered alerts. You can specify how often reminders should be sent using number of seconds (s), minutes (m) or hours (h), for example `30s`, `3m`, `5m` or `1h`. -### Send reminders - -> Only available in Grafana v5.3 and above. - -{{< docs-imagebox max-width="600px" img="/img/docs/v53/alerting_notification_reminders.png" class="docs-image--right" caption="Alerting notification reminders setup" >}} - -When this option is checked additional notifications (reminders) will be sent for triggered alerts. You can specify how often reminders -should be sent using number of seconds (s), minutes (m) or hours (h), for example `30s`, `3m`, `5m` or `1h` etc. - -**Important:** Alert reminders are sent after rules are evaluated. Therefore a reminder can never be sent more frequently than a configured [alert rule evaluation interval]({{< relref "rules/#name-evaluation-interval" >}}). +**Important:** Alert reminders are sent after rules are evaluated. 
Therefore a reminder can never be sent more frequently than a configured alert rule evaluation interval. These examples show how often and when reminders are sent for a triggered alert. @@ -55,13 +51,28 @@ Alert rule evaluation interval | Send reminders every | Reminder sent every (aft
-### Disable resolve message +## List of supported notifiers -When checked, this option will disable resolve message [OK] that is sent when alerting state returns to false. - -## Supported Notification Types - -Grafana ships with the following set of notification types: +Name | Type | Supports images | Support alert rule tags +-----|------|---------------- | ----------------------- +[DingDing](#dingdingdingtalk) | `dingding` | yes, external only | no +Discord | `discord` | yes | no +[Email](#email) | `email` | yes | no +[Google Hangouts Chat](#google-hangouts-chat) | `googlechat` | yes, external only | no +Hipchat | `hipchat` | yes, external only | no +[Kafka](#kafka) | `kafka` | yes, external only | no +Line | `line` | yes, external only | no +Microsoft Teams | `teams` | yes, external only | no +OpsGenie | `opsgenie` | yes, external only | yes +[Pagerduty](#pagerduty) | `pagerduty` | yes, external only | yes +Prometheus Alertmanager | `prometheus-alertmanager` | yes, external only | yes +Pushover | `pushover` | yes | no +Sensu | `sensu` | yes, external only | no +[Slack](#slack) | `slack` | yes | no +Telegram | `telegram` | yes | no +Threema | `threema` | yes, external only | no +VictorOps | `victorops` | yes, external only | no +[Webhook](#webhook) | `webhook` | yes, external only | yes ### Email @@ -185,29 +196,6 @@ Notifications can be sent by setting up an incoming webhook in Google Hangouts c Squadcast helps you get alerted via Phone call, SMS, Email and Push notifications and lets you take actions on those alerts. Grafana notifications can be sent to Squadcast via a simple incoming webhook. Refer the official [Squadcast support documentation](https://support.squadcast.com/docs/grafana) for configuring these webhooks. 
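As a hedged sketch of what a consumer of the webhook notifier might do with a notification: the field names below (`ruleName`, `state`, `evalMatches`) reflect the legacy alerting webhook body and are an assumption here; verify them against the payload your Grafana version actually sends.

```python
import json

# Illustrative webhook notification body. The field names are an
# assumption based on the legacy alerting webhook payload; verify them
# against what your Grafana version actually sends.
payload = json.loads("""
{
  "title": "[Alerting] Load alert",
  "ruleName": "Load alert",
  "state": "alerting",
  "message": "Load is peaking",
  "evalMatches": [{"metric": "load", "value": 100, "tags": null}]
}
""")

def summarize(alert):
    """Build a one-line summary of a webhook alert notification."""
    matches = ", ".join(
        "{}={}".format(m["metric"], m["value"])
        for m in alert.get("evalMatches", [])
    )
    return "{} is {} ({})".format(alert["ruleName"], alert["state"], matches)

print(summarize(payload))  # Load alert is alerting (load=100)
```

A receiver like this could forward the summary line to any downstream system that does not have a dedicated notifier.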
-### All supported notifiers - -Name | Type | Supports images | Support alert rule tags ------|------|---------------- | ----------------------- -DingDing | `dingding` | yes, external only | no -Discord | `discord` | yes | no -Email | `email` | yes | no -Google Hangouts Chat | `googlechat` | yes, external only | no -Hipchat | `hipchat` | yes, external only | no -Kafka | `kafka` | yes, external only | no -Line | `line` | yes, external only | no -Microsoft Teams | `teams` | yes, external only | no -OpsGenie | `opsgenie` | yes, external only | yes -Pagerduty | `pagerduty` | yes, external only | yes -Prometheus Alertmanager | `prometheus-alertmanager` | yes, external only | yes -Pushover | `pushover` | yes | no -Sensu | `sensu` | yes, external only | no -Slack | `slack` | yes | no -Telegram | `telegram` | yes | no -Threema | `threema` | yes, external only | no -VictorOps | `victorops` | yes, external only | no -Webhook | `webhook` | yes, external only | yes - ## Enable images in notifications {#external-image-store} Grafana can render the panel associated with the alert rule as a PNG image and include that in the notification. Read more about the requirements and how to configure @@ -220,16 +208,6 @@ Be aware that some notifiers require public access to the image to be able to in Notification services which need public image access are marked as 'external only'. -## Use alert rule tags in notifications {#alert-rule-tags} - -> Only available in Grafana v6.3+. - -Grafana can include a list of tags (key/value) in the notification. -It's called alert rule tags to contrast with tags parsed from timeseries. -It currently supports only the Prometheus Alertmanager notifier. - - This is an optional feature. You can get notifications without using alert rule tags. - ## Configure the link back to Grafana from alert notifications All alert notifications contain a link back to the triggered alert in the Grafana instance. 
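For that link to resolve correctly, Grafana must know the URL it is served from. A minimal sketch of the relevant `grafana.ini` setting (the hostname is a placeholder for your own):

```ini
[server]
# The full URL used to access Grafana from a web browser. Alert
# notification links are built from this value.
root_url = https://grafana.example.com/
```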
diff --git a/docs/sources/alerting/pause-an-alert-rule.md b/docs/sources/alerting/pause-an-alert-rule.md new file mode 100644 index 00000000000..dd0ec672e3e --- /dev/null +++ b/docs/sources/alerting/pause-an-alert-rule.md @@ -0,0 +1,17 @@ ++++ +title = "Pause alert rule" +description = "Pause an existing alert rule" +keywords = ["grafana", "alerting", "guide", "rules", "view"] +type = "docs" +[menu.docs] +parent = "alerting" +weight = 400 ++++ + +# Pause an alert rule + +Pausing the evaluation of an alert rule can sometimes be useful. For example, during a maintenance window, pausing alert rules can avoid triggering a flood of alerts. + +1. In the Grafana side bar, hover your cursor over the Alerting (bell) icon and then click **Alert Rules**. All configured alert rules are listed, along with their current state. +1. Find your alert in the list, and click the **Pause** icon on the right. The **Pause** icon turns into a **Play** icon. +1. Click the **Play** icon to resume evaluation of your alert. \ No newline at end of file diff --git a/docs/sources/alerting/rules.md b/docs/sources/alerting/rules.md deleted file mode 100755 index 4a07a208f0b..00000000000 --- a/docs/sources/alerting/rules.md +++ /dev/null @@ -1,174 +0,0 @@ -+++ -title = "Alerting Engine and Rules Guide" -description = "Configuring Alert Rules" -keywords = ["grafana", "alerting", "guide", "rules"] -type = "docs" -[menu.docs] -name = "Engine and Rules" -parent = "alerting" -weight = 1 -+++ - -# Alerting Engine and Rules Guide - -Alerting in Grafana allows you to attach rules to your dashboard panels. When you save the dashboard, -Grafana will extract the alert rules into a separate alert rule storage and schedule them for evaluation. 
- -{{< imgbox max-width="40%" img="/img/docs/v4/drag_handles_gif.gif" caption="Alerting overview" >}} - -In the alert tab of the graph panel you can configure how often the alert rule should be evaluated -and the conditions that need to be met for the alert to change state and trigger its -[notifications]({{< relref "notifications.md" >}}). - -## Execution - -The alert rules are evaluated in the Grafana backend in a scheduler and query execution engine that is part -of core Grafana. Only some data sources are supported right now. They include `Graphite`, `Prometheus`, `InfluxDB`, `Elasticsearch`, -`Stackdriver`, `Cloudwatch`, `Azure Monitor`, `MySQL`, `PostgreSQL`, `MSSQL`, `OpenTSDB`, `Oracle`, and `Azure Data Explorer`. - -## Clustering - -Currently alerting supports a limited form of high availability. Since v4.2.0 of Grafana, alert notifications are deduped when running multiple servers. This means all alerts are executed on every server but no duplicate alert notifications are sent due to the deduping logic. Proper load balancing of alerts will be introduced in the future. - - - -## Rule Config - -Currently only the graph panel supports alert rules. - -### Name and Evaluation interval - -Here you can specify the name of the alert rule and how often the scheduler should evaluate the alert rule. -**Note:** You can set a minimum interval in the `alerting.min_interval_seconds` config field, to set a minimum time between evaluations. Check out the [[configuration]]({{< relref "../installation/configuration.md" >}}#min-interval-seconds) page for more information. - -### For - -> **Important note regarding No Data:** -> -> Do not use `For` with the `If no data or all values are null` setting set to `No Data`. The triggering of `No Data` will trigger instantly and not take `For` into consideration. This may also result in that an OK notification not being sent if alert transitions from `No Data -> Pending -> OK`. 
- -If an alert rule has a configured `For` and the query violates the configured threshold it will first go from `OK` to `Pending`. Going from `OK` to `Pending` Grafana will not send any notifications. Once the alert rule has been firing for more than `For` duration, it will change to `Alerting` and send alert notifications. - -Typically, it's always a good idea to use this setting since it's often worse to get false positive than wait a few minutes before the alert notification triggers. Looking at the `Alert list` or `Alert list panels` you will be able to see alerts in pending state. - -Below you can see an example timeline of an alert using the `For` setting. At ~16:04 the alert state changes to `Pending` and after 4 minutes it changes to `Alerting` which is when alert notifications are sent. Once the series falls back to normal the alert rule goes back to `OK`. -{{< imgbox img="/img/docs/v54/alerting-for-dark-theme.png" caption="Alerting For" >}} - -{{< imgbox max-width="40%" img="/img/docs/v4/alerting_conditions.png" caption="Alerting Conditions" >}} - -### Conditions - -Currently the only condition type that exists is a `Query` condition that allows you to -specify a query letter, time range and an aggregation function. - -### Query condition example - -```sql -avg() OF query(A, 15m, now) IS BELOW 14 -``` - -- `avg()` Controls how the values for **each** series should be reduced to a value that can be compared against the threshold. Click on the function to change it to another aggregation function. -- `query(A, 15m, now)` The letter defines what query to execute from the **Metrics** tab. The second two parameters define the time range, `15m, now` means 15 minutes ago to now. You can also do `10m, now-2m` to define a time range that will be 10 minutes ago to 2 minutes ago. This is useful if you want to ignore the last 2 minutes of data. -- `IS BELOW 14` Defines the type of threshold and the threshold value. 
You can click on `IS BELOW` to change the type of threshold. - -The query used in an alert rule cannot contain any template variables. Currently we only support `AND` and `OR` operators between conditions and they are executed serially. -For example, we have 3 conditions in the following order: -*condition:A(evaluates to: TRUE) OR condition:B(evaluates to: FALSE) AND condition:C(evaluates to: TRUE)* -so the result will be calculated as ((TRUE OR FALSE) AND TRUE) = TRUE. - -We plan to add other condition types in the future, like `Other Alert`, where you can include the state -of another alert in your conditions, and `Time Of Day`. - -#### Multiple Series - -If a query returns multiple series then the aggregation function and threshold check will be evaluated for each series. -What Grafana does not do currently is track alert rule state **per series**. This has implications that are detailed -in the scenario below. - -- Alert condition with query that returns 2 series: **server1** and **server2** -- **server1** series causes the alert rule to fire and switch to state `Alerting` -- Notifications are sent out with message: _load peaking (server1)_ -- In a subsequence evaluation of the same alert rule the **server2** series also cause the alert rule to fire -- No new notifications are sent as the alert rule is already in state `Alerting`. - -So as you can see from the above scenario Grafana will not send out notifications when other series cause the alert -to fire if the rule already is in state `Alerting`. To improve support for queries that return multiple series -we plan to track state **per series** in a future release. - -> Starting with Grafana v5.3 you can configure reminders to be sent for triggered alerts. This will send additional notifications -> when an alert continues to fire. If other series (like server2 in the example above) also cause the alert rule to fire they will -> be included in the reminder notification. 
Depending on what notification channel you're using you may be able to take advantage -> of this feature for identifying new/existing series causing alert to fire. [Read more about notification reminders here]({{< relref "notifications/#send-reminders" >}}). - -### No Data / Null values - -Below your conditions you can configure how the rule evaluation engine should handle queries that return no data or only null values. - -No Data Option | Description ------------- | ------------- -NoData | Set alert rule state to `NoData` -Alerting | Set alert rule state to `Alerting` -Keep Last State | Keep the current alert rule state, what ever it is. - -### Execution errors or timeouts - -The last option tells how to handle execution or timeout errors. - -Error or timeout option | Description ------------- | ------------- -Alerting | Set alert rule state to `Alerting` -Keep Last State | Keep the current alert rule state, what ever it is. - -If you have an unreliable time series store from which queries sometime timeout or fail randomly you can set this option -to `Keep Last State` in order to basically ignore them. - -## Notifications - -In alert tab you can also specify alert rule notifications along with a detailed message about the alert rule. -The message can contain anything, information about how you might solve the issue, link to runbook, etc. - -The actual notifications are configured and shared between multiple alerts. Read the -[notifications]({{< relref "notifications.md" >}}) guide for how to configure and setup notifications. - -## Alert State History and Annotations - -Alert state changes are recorded in the internal annotation table in Grafana's database. The state changes -are visualized as annotations in the alert rule's graph panel. You can also go into the `State history` -submenu in the alert tab to view and clear state history. 
- -## Troubleshooting - -{{< imgbox max-width="40%" img="/img/docs/v4/alert_test_rule.png" caption="Test Rule" >}} - -First level of troubleshooting you can do is hit the **Test Rule** button. You will get result back that you can expand -to the point where you can see the raw data that was returned from your query. - -Further troubleshooting can also be done by inspecting the grafana-server log. If it's not an error or for some reason -the log does not say anything you can enable debug logging for some relevant components. This is done -in Grafana's ini config file. - -Example showing loggers that could be relevant when troubleshooting alerting. - -```ini -[log] -filters = alerting.scheduler:debug \ - alerting.engine:debug \ - alerting.resultHandler:debug \ - alerting.evalHandler:debug \ - alerting.evalContext:debug \ - alerting.extractor:debug \ - alerting.notifier:debug \ - alerting.notifier.slack:debug \ - alerting.notifier.pagerduty:debug \ - alerting.notifier.email:debug \ - alerting.notifier.webhook:debug \ - tsdb.graphite:debug \ - tsdb.prometheus:debug \ - tsdb.opentsdb:debug \ - tsdb.influxdb:debug \ - tsdb.elasticsearch:debug \ - tsdb.elasticsearch.client:debug \ -``` - -If you want to log raw query sent to your TSDB and raw response in log you also have to set grafana.ini option `app_mode` to -`development`. diff --git a/docs/sources/alerting/troubleshoot-alerts.md b/docs/sources/alerting/troubleshoot-alerts.md new file mode 100644 index 00000000000..3e2a3cfac6c --- /dev/null +++ b/docs/sources/alerting/troubleshoot-alerts.md @@ -0,0 +1,45 @@ ++++ +title = "Troubleshoot alerts" +description = "Troubleshoot alert rules" +keywords = ["grafana", "alerting", "guide", "rules", "troubleshoot"] +type = "docs" +[menu.docs] +name = "Troubleshoot alerts" +parent = "alerting" +weight = 500 ++++ + +# Troubleshoot alerts + +If alerts are not behaving as you expect, here are some steps you can take to troubleshoot and figure out what is going wrong. 
+ +{{< imgbox max-width="1000px" img="/img/docs/v4/alert_test_rule.png" caption="Test Rule" >}} + +The first level of troubleshooting you can do is click **Test Rule**. You will get a result back that you can expand to the point where you can see the raw data that was returned from your query. + +Further troubleshooting can also be done by inspecting the grafana-server log. If it's not an error, or the log does not say anything for some reason, then you can enable debug logging for the relevant components in Grafana's ini config file. + +The following example shows loggers that could be relevant when troubleshooting alerting. + +```ini
[log]
filters = alerting.scheduler:debug \
 alerting.engine:debug \
 alerting.resultHandler:debug \
 alerting.evalHandler:debug \
 alerting.evalContext:debug \
 alerting.extractor:debug \
 alerting.notifier:debug \
 alerting.notifier.slack:debug \
 alerting.notifier.pagerduty:debug \
 alerting.notifier.email:debug \
 alerting.notifier.webhook:debug \
 tsdb.graphite:debug \
 tsdb.prometheus:debug \
 tsdb.opentsdb:debug \
 tsdb.influxdb:debug \
 tsdb.elasticsearch:debug \
 tsdb.elasticsearch.client:debug
```
 + +If you want to log the raw query sent to your TSDB and the raw response, then you must also set the grafana.ini option `app_mode` to `development`. diff --git a/docs/sources/alerting/view-alerts.md new file mode 100644 index 00000000000..edc1f35d73c --- /dev/null +++ b/docs/sources/alerting/view-alerts.md @@ -0,0 +1,23 @@ ++++ +title = "View alerts" +description = "View existing alert rules" +keywords = ["grafana", "alerting", "guide", "rules", "view"] +type = "docs" +[menu.docs] +name = "View alerts" +parent = "alerting" +weight = 400 ++++ + +# View existing alert rules + +Grafana stores individual alert rules in the panels where they are defined, but you can also view a list of all existing alert rules and their current state.
+ +In the Grafana sidebar, hover your cursor over the Alerting (bell) icon and then click **Alert Rules**. All configured alert rules are listed, along with their current state. + +You can do several things while viewing alerts: + +- **Filter alerts by name -** Type an alert name in the **Search alerts** field. +- **Filter alerts by state -** In **States**, select which alert states you want to see. All others will be hidden. +- **Pause or resume an alert -** Click the **Pause** or **Play** icon next to the alert to pause or resume evaluation. See [Pause an alert rule]({{< relref "pause-an-alert-rule.md" >}}) for more information. +- **Access alert rule settings -** Click the alert name or the **Edit alert rule** (gear) icon. Grafana opens the Alert tab of the panel where the alert rule is defined. This is helpful when an alert is firing but you don't know which panel it is defined in. \ No newline at end of file diff --git a/docs/sources/features/datasources/azuremonitor.md index c2065522f67..93c19ca1908 100755 --- a/docs/sources/features/datasources/azuremonitor.md +++ b/docs/sources/features/datasources/azuremonitor.md @@ -155,7 +155,7 @@ Not all metrics returned by the Azure Monitor API have values. The Grafana data ### Azure Monitor alerting -Grafana alerting is supported for the Azure Monitor service. This is not Azure Alerts support. Read more about how alerting in Grafana works [here]({{< relref "../../alerting/rules.md" >}}). +Grafana alerting is supported for the Azure Monitor service. This is not Azure Alerts support. Read more about how alerting in Grafana works [here]({{< relref "../../alerting/alerts-overview.md" >}}). {{< docs-imagebox img="/img/docs/v60/azuremonitor-alerting.png" class="docs-image--no-shadow" caption="Azure Monitor Alerting" >}} @@ -216,7 +216,7 @@ Examples: ### Application Insights alerting -Grafana alerting is supported for Application Insights. This is not Azure Alerts support.
Read more about how alerting in Grafana works [here]({{< relref "../../alerting/rules.md" >}}). +Grafana alerting is supported for Application Insights. This is not Azure Alerts support. Read more about how alerting in Grafana works [here]({{< relref "../../alerting/alerts-overview.md" >}}). {{< docs-imagebox img="/img/docs/v60/azuremonitor-alerting.png" class="docs-image--no-shadow" caption="Azure Monitor Alerting" >}} @@ -329,7 +329,7 @@ If you're not currently logged in to the Azure Portal, then the link opens the l > Only available in Grafana v7.0+. -Grafana alerting is supported for Application Insights. This is not Azure Alerts support. Read more about how alerting in Grafana works in [Alerting rules]({{< relref "../../alerting/rules.md" >}}). +Grafana alerting is supported for Application Insights. This is not Azure Alerts support. Read more about how alerting in Grafana works in [Alerts overview]({{< relref "../../alerting/alerts-overview.md" >}}). ### Writing analytics queries For the Application Insights service diff --git a/docs/sources/features/panels/alertlist.md index 03ba1d0030a..e3aeda0fae3 100644 --- a/docs/sources/features/panels/alertlist.md +++ b/docs/sources/features/panels/alertlist.md @@ -14,7 +14,7 @@ weight = 4 {{< docs-imagebox img="/img/docs/v45/alert-list-panel.png" max-width="850px" >}} -The alert list panel allows you to display your dashboards alerts. The list can be configured to show current state or recent state changes. You can read more about alerts [here](http://docs.grafana.org/alerting/rules). +The alert list panel allows you to display your dashboard's alerts. The list can be configured to show current state or recent state changes. You can read more about alerts [here](http://docs.grafana.org/alerting/alerts-overview).
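For reference, an alert list panel is defined in dashboard JSON roughly like the following. This is an illustrative sketch only; the field names are assumptions and should be checked against the panel JSON of an actual dashboard:

```json
{
  "type": "alertlist",
  "title": "Alerts on this deployment",
  "show": "current",
  "limit": 10,
  "stateFilter": { "alerting": true, "no_data": true }
}
```

The `show` option would switch the list between current state and recent state changes.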
## Alert List Options diff --git a/docs/sources/getting-started/what-is-grafana.md b/docs/sources/getting-started/what-is-grafana.md index 862c1743993..4b015aa8565 100644 --- a/docs/sources/getting-started/what-is-grafana.md +++ b/docs/sources/getting-started/what-is-grafana.md @@ -33,7 +33,7 @@ Refer to [Explore]({{< relref "../features/explore/index.md" >}}) for more infor If you're using Grafana alerting, then you can have alerts sent through a number of different [alert notifiers]({{< relref "../alerting/notifications.md" >}}), including PagerDuty, SMS, email, VictorOps, OpsGenie, or Slack. -Alert hooks allow you to create different notifiers with a bit of code if you prefer some other channels of communication. Visually define [alert rules]({{< relref "../alerting/rules.md" >}}) for your most important metrics. +Alert hooks allow you to create different notifiers with a bit of code if you prefer some other channels of communication. Visually define [alert rules]({{< relref "../alerting/alerts-overview.md" >}}) for your most important metrics. ## Annotations diff --git a/docs/sources/guides/whats-new-in-v5-3.md b/docs/sources/guides/whats-new-in-v5-3.md index 94f9ee2d3e1..4d87f6a40e7 100644 --- a/docs/sources/guides/whats-new-in-v5-3.md +++ b/docs/sources/guides/whats-new-in-v5-3.md @@ -59,7 +59,7 @@ certain view mode enabled. Additionally, this also enables [playlists](/referenc ## Notification Reminders Do you use Grafana alerting and have some notifications that are more important than others? Then it's possible to set reminders so that you continue to be alerted until the problem is fixed. This is done on the notification channel itself and will affect all alerts that use that channel. -For additional examples of why reminders might be useful for you, see [multiple series](/alerting/rules/#multiple-series). +For additional examples of why reminders might be useful for you, see [multiple series](/alerting/alerts-overview/#multiple-series). 
Learn how to enable and configure reminders [here](/alerting/notifications/#send-reminders). diff --git a/docs/sources/guides/whats-new-in-v5-4.md index 02a4a495331..8efaa8c3824 100644 --- a/docs/sources/guides/whats-new-in-v5-4.md +++ b/docs/sources/guides/whats-new-in-v5-4.md @@ -26,7 +26,7 @@ Grafana v5.4 brings new features, many enhancements and bug fixes. This article Grafana v5.4 ships with a new alert rule setting named `For` which is great for removing false positives. If an alert rule has a configured `For` and the query violates the configured threshold it will first go from `OK` to `Pending`. When going from `OK` to `Pending`, Grafana will not send any notifications. Once the alert rule has been firing for more than the `For` duration, it will change to `Alerting` and send alert notifications. It's typically a good idea to use this setting since it's often worse to get a false positive than to wait a few minutes before the alert notification triggers. -In the screenshot you can see an example timeline of an alert using the `For` setting. At ~16:04 the alert state changes to `Pending` and after 4 minutes it changes to `Alerting` which is when alert notifications are sent. Once the series falls back to normal the alert rule goes back to `OK`. [Learn more](/alerting/rules/#for). +In the screenshot you can see an example timeline of an alert using the `For` setting. At ~16:04 the alert state changes to `Pending` and after 4 minutes it changes to `Alerting`, which is when alert notifications are sent. Once the series falls back to normal the alert rule goes back to `OK`. [Learn more](/alerting/alerts-overview/#for). Additionally, there's now support for disabling the sending of `OK` alert notifications. [Learn more](/alerting/notifications/#disable-resolve-message).
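The `For` behavior described above is a small state machine: a violating rule first becomes `Pending`, and only becomes `Alerting` (and notifies) once the violation has lasted longer than `For`. An illustrative sketch, with hypothetical names rather than Grafana's actual scheduler code:

```python
# Illustrative model of the `For` setting described above; names are
# hypothetical and this is not Grafana's implementation.
def next_state(violating, violating_for_seconds, for_seconds):
    if not violating:
        return "ok"            # series back to normal, rule returns to OK
    if violating_for_seconds >= for_seconds:
        return "alerting"      # notifications are sent on this transition
    return "pending"           # threshold violated, but no notification yet

# With For = 4 minutes (as in the ~16:04 screenshot example):
print(next_state(True, 0, 240))     # pending
print(next_state(True, 240, 240))   # alerting
print(next_state(False, 300, 240))  # ok
```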
diff --git a/docs/sources/guides/whats-new-in-v6-5.md b/docs/sources/guides/whats-new-in-v6-5.md index 68beca3ccce..c12fb97e082 100755 --- a/docs/sources/guides/whats-new-in-v6-5.md +++ b/docs/sources/guides/whats-new-in-v6-5.md @@ -174,7 +174,7 @@ In the Explore split view, you can now link the two timepickers so that if you c ### Alerting support for Azure Application Insights -The [Azure Monitor]({{< relref "../features/datasources/azuremonitor/" >}}) data source supports multiple services in the Azure cloud. Before Grafana v6.5, only the Azure Monitor service had support for [Grafana Alerting]({{< relref "../alerting/rules" >}}). In Grafana 6.5, alerting support has been implemented for the [Application Insights service]({{< relref "../features/datasources/azuremonitor/#querying-the-application-insights-service" >}}). +The [Azure Monitor]({{< relref "../features/datasources/azuremonitor/" >}}) data source supports multiple services in the Azure cloud. Before Grafana v6.5, only the Azure Monitor service had support for [Grafana Alerting]({{< relref "../alerting/alerts-overview" >}}). In Grafana 6.5, alerting support has been implemented for the [Application Insights service]({{< relref "../features/datasources/azuremonitor/#querying-the-application-insights-service" >}}). ### Allow saving of provisioned dashboards from UI diff --git a/docs/sources/installation/requirements.md b/docs/sources/installation/requirements.md index 9c2008b3d6a..f5ad0a3251e 100644 --- a/docs/sources/installation/requirements.md +++ b/docs/sources/installation/requirements.md @@ -39,7 +39,7 @@ Minimum recommended CPU: 1 Some features might require more memory or CPUs. 
Features that require more resources include: - [Server side rendering of images]({{< relref "../administration/image_rendering/#requirements" >}}) -- [Alerting]({{< relref "../alerting/rules" >}}) +- [Alerting]({{< relref "../alerting/alerts-overview" >}}) - Data source proxy ## Supported databases diff --git a/docs/sources/menu.yaml index a62832b25bd..1d35ce44583 100644 --- a/docs/sources/menu.yaml +++ b/docs/sources/menu.yaml @@ -206,12 +206,18 @@ - name: Alerting link: /alerting/ children: - - link: /alerting/rules/ - name: Engine and Rules - - link: /alerting/metrics/ - name: Metrics + - link: /alerting/alerts-overview/ + name: Overview - link: /alerting/notifications/ - name: Notifications + name: Alert notifications + - link: /alerting/create-alerts/ + name: Create alerts + - link: /alerting/view-alerts/ + name: View alerts + - link: /alerting/pause-an-alert-rule/ + name: Pause alert rule + - link: /alerting/troubleshoot-alerts/ + name: Troubleshoot alerts - name: Image rendering link: /administration/image_rendering/ - name: Linking diff --git a/docs/sources/panels/visualizations/alert-list-panel.md index e961d013d9a..167a25cd91e 100644 --- a/docs/sources/panels/visualizations/alert-list-panel.md +++ b/docs/sources/panels/visualizations/alert-list-panel.md @@ -13,7 +13,7 @@ draft = "true" # Alert list panel -The alert list panel allows you to display your dashboards alerts. You can configure the list to show current state or recent state changes. You can read more about alerts in [Alerting rules](http://docs.grafana.org/alerting/rules). +The alert list panel allows you to display your dashboard's alerts. You can configure the list to show current state or recent state changes. You can read more about alerts in [Alerts overview](http://docs.grafana.org/alerting/alerts-overview). {{< docs-imagebox img="/img/docs/v45/alert-list-panel.png" max-width="850px" >}}