Alerting vale fixes (#87380)

brendamuir 2024-05-06 11:43:34 +02:00 committed by GitHub
parent 601485c74d
commit 526be4fa2b
5 changed files with 23 additions and 23 deletions

@@ -25,7 +25,7 @@ Set up Grafana to use an external Alertmanager as a single Alertmanager to recei
 Grafana Alerting does not support sending alerts to the AWS Managed Service for Prometheus due to the lack of sigv4 support in Prometheus.
 {{% /admonition %}}
-Once you have added the Alertmanager, you can use the Grafana Alerting UI to manage silences, contact points, and notification policies. A drop-down option in these pages allows you to switch between alertmanagers.
+After you have added the Alertmanager, you can use the Grafana Alerting UI to manage silences, contact points, and notification policies. A drop-down option in these pages allows you to switch between alertmanagers.
 External alertmanagers should now be configured as data sources using Grafana Configuration from the main Grafana navigation menu. This enables you to manage the contact points and notification policies of external alertmanagers from within Grafana and also encrypts HTTP basic authentication credentials.

@@ -47,9 +47,9 @@ Grafana Alerting has the following permissions.
 | `alert.silences:create` | `folders:*`<br>`folders:uid:*` | Create rule-specific silences in a folder and its subfolders. |
 | `alert.silences:read` | `folders:*`<br>`folders:uid:*` | Read general and rule-specific silences in a folder and its subfolders. |
 | `alert.silences:write` | `folders:*`<br>`folders:uid:*` | Update and expire rule-specific silences in a folder and its subfolders. |
-| `alert.provisioning:read` | n/a | Read all Grafana alert rules, notification policies, etc via provisioning API. Permissions to folders and datasource are not required. |
+| `alert.provisioning:read` | n/a | Read all Grafana alert rules, notification policies, etc via provisioning API. Permissions to folders and data source are not required. |
 | `alert.provisioning.secrets:read` | n/a | Same as `alert.provisioning:read` plus ability to export resources with decrypted secrets. |
-| `alert.provisioning:write` | n/a | Update all Grafana alert rules, notification policies, etc via provisioning API. Permissions to folders and datasource are not required. |
+| `alert.provisioning:write` | n/a | Update all Grafana alert rules, notification policies, etc via provisioning API. Permissions to folders and data source are not required. |
 | `alert.provisioning.provenance:write` | n/a | Set provisioning status for alerting resources. Cannot be used alone. Requires user to have permissions to access resources |
 To help plan your RBAC rollout strategy, refer to [Plan your RBAC rollout strategy](https://grafana.com/docs/grafana/next/administration/roles-and-permissions/access-control/plan-rbac-rollout-strategy/).

@@ -52,11 +52,11 @@ By default, users with the basic roles Admin, Editor, and Viewer roles have quer
 If you used fixed roles or custom roles, you need to update data source permissions.
-Alternatively, an admin can assign the role **Datasource Reader**, which grants the user access to all data sources.
+Alternatively, an administrator can assign the role **Datasource Reader**, which grants the user access to all data sources.
 To manage data source permissions, complete the following steps.
 1. In the left-side menu, click **Connections** > **Data sources**.
 1. Click the data source you want to change the permissions for.
 1. Click the **Permissions** tab.
-1. In the **Permission column**, update the permission or remove it by clicking **X**.
+1. In the **Permission column**, update the permission, or remove it by clicking **X**.

@@ -20,19 +20,19 @@ weight: 600
 # Performance considerations and limitations
-Grafana Alerting supports multi-dimensional alerting, where one alert rule can generate many alerts. For example, you can configure an alert rule to fire an alert every time the CPU of individual VMs max out. This topic discusses performance considerations resulting from multi-dimensional alerting.
+Grafana Alerting supports multi-dimensional alerting, where one alert rule can generate many alerts. For example, you can configure an alert rule to fire an alert every time the CPU of individual virtual machines max out. This topic discusses performance considerations resulting from multi-dimensional alerting.
 Evaluating alerting rules consumes RAM and CPU to compute the output of an alerting query, and network resources to send alert notifications and write the results to the Grafana SQL database. The configuration of individual alert rules affects the resource consumption and, therefore, the maximum number of rules a given configuration can support.
 The following section provides a list of alerting performance considerations.
-- Frequency of rule evaluation consideration. The "Evaluate Every" property of an alert rule controls the frequency of rule evaluation. We recommend using the lowest acceptable evaluation frequency to support more concurrent rules.
-- Cardinality of the rule's result set. For example, suppose you are monitoring API response errors for every API path, on every VM in your fleet. This set has a cardinality of _n_ number of paths multiplied by _v_ number of VMs. You can reduce the cardinality of a result set - perhaps by monitoring errors-per-VM instead of for each path per VM.
+- Frequency of rule evaluation consideration. The "Evaluate Every" property of an alert rule controls the frequency of rule evaluation. It is recommended to use the lowest acceptable evaluation frequency to support more concurrent rules.
+- Cardinality of the rule's result set. For example, suppose you are monitoring API response errors for every API path, on every virtual machine in your fleet. This set has a cardinality of _n_ number of paths multiplied by _v_ number of VMs. You can reduce the cardinality of a result set - perhaps by monitoring errors-per-VM instead of for each path per VM.
 - Complexity of the alerting query consideration. Queries that data sources can process and respond to quickly consume fewer resources. Although this consideration is less important than the other considerations listed above, if you have reduced those as much as possible, looking at individual query performance could make a difference.
-Each evaluation of an alert rule generates a set of alert instances; one for each member of the result set. The state of all the instances is written to the `alert_instance` table in Grafana's SQL database. This number of write-heavy operations can cause issues when using SQLite.
+Each evaluation of an alert rule generates a set of alert instances; one for each member of the result set. The state of all the instances is written to the `alert_instance` table in the Grafana SQL database. This number of write-heavy operations can cause issues when using SQLite.
-Grafana Alerting exposes a metric, `grafana_alerting_rule_evaluations_total` that counts the number of alert rule evaluations. To get a feel for the influence of rule evaluations on your Grafana instance, you can observe the rate of evaluations and compare it with resource consumption. In a Prometheus-compatible database, you can use the query `rate(grafana_alerting_rule_evaluations_total[5m])` to compute the rate over 5 minute windows of time. It's important to remember that this isn't the full picture of rule evaluation. For example, the load will be unevenly distributed if you have some rules that evaluate every 10 seconds, and others every 30 minutes.
+Grafana Alerting exposes a metric, `grafana_alerting_rule_evaluations_total` that counts the number of alert rule evaluations. To get a feel for the influence of rule evaluations on your Grafana instance, you can observe the rate of evaluations and compare it with resource consumption. In a Prometheus-compatible database, you can use the query `rate(grafana_alerting_rule_evaluations_total[5m])` to compute the rate over 5 minute windows of time. It's important to remember that this isn't the full picture of rule evaluation. For example, the load is unevenly distributed if you have some rules that evaluate every 10 seconds, and others every 30 minutes.
 These factors all affect the load on the Grafana instance, but you should also be aware of the performance impact that evaluating these rules has on your data sources. Alerting queries are often the vast majority of queries handled by monitoring databases, so the same load factors that affect the Grafana instance affect them as well.
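Note on the hunk above: the changed paragraph says the evaluation rate is not the full picture because load is uneven when rules run at different intervals. A minimal Python sketch of that arithmetic follows. The function name and the rule counts (100 rules every 10 seconds, 100 rules every 30 minutes) are hypothetical and are not taken from Grafana or from this commit.

```python
def evaluations_per_second(rule_groups):
    """Return the average evaluations per second for each interval.

    rule_groups maps an evaluation interval in seconds to the number of
    rules that use it (hypothetical inputs chosen for illustration).
    """
    return {interval: count / interval for interval, count in rule_groups.items()}


# Hypothetical fleet: 100 rules every 10 seconds, 100 rules every 30 minutes.
load = evaluations_per_second({10: 100, 1800: 100})
total = sum(load.values())

# Share of all evaluations produced by the 10-second rules.
fast_share = load[10] / total
```

With these assumed numbers, the 10-second rules produce more than 99% of all evaluations even though both groups contain the same number of rules, which is why a single averaged rate can hide where the load comes from.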
@@ -44,25 +44,25 @@ It does not support reading or writing alerting rules from any other data source
 ## Prometheus version support
-We support the latest two minor versions of both Prometheus and Alertmanager. We cannot guarantee that older versions will work.
+The latest two minor versions of both Prometheus and Alertmanager are supported. We cannot guarantee that older versions work.
-As an example, if the current Prometheus version is `2.31.1`, we support >= `2.29.0`.
+As an example, if the current Prometheus version is `2.31.1`, >= `2.29.0` is supported.
 ## The Grafana Alertmanager can only receive Grafana managed alerts
 Grafana cannot be used to receive external alerts. You can only send alerts to the Grafana Alertmanager using Grafana managed alerts.
-You have the option to send Grafana managed alerts to an external Alertmanager, you can find this option in the admin tab on the Alerting page.
+You have the option to send Grafana-managed alerts to an external Alertmanager, you can find this option in the Admin tab on the Alerting page.
 For more information, refer to [this GitHub issue](https://github.com/grafana/grafana/issues/73447).
 ## High load on database caused by a high number of alert instances
 If you have a high number of alert instances, it can happen that the load on the database gets very high, as each state
-transition of an alert instance will be saved in the database.
+transition of an alert instance is saved in the database.
 This can be prevented by writing to the database periodically. For this the feature flag `alertingSaveStatePeriodic` needs
-to be enabled. By default it will save the states every 5 minutes to the database and on each shutdown. The periodic interval
+to be enabled. By default, it saves the states every 5 minutes to the database and on each shutdown. The periodic interval
 can also be configured using the `state_periodic_save_interval` configuration flag.
 The time it takes to write to the database periodically can be monitored using the `state_full_sync_duration_seconds` metric
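Note on the hunk above: the Prometheus version support text gives `2.31.1` as the current version and `2.29.0` as the oldest supported one. A minimal Python sketch of that rule follows, assuming it means "current minor version minus two, patch reset to zero" as in the quoted example. Both helper names are hypothetical, and the sketch does not handle crossing a major-version boundary.

```python
def minimum_supported(current, minors_back=2):
    """Return the oldest supported version for a 'major.minor.patch' string.

    Assumes the rule is 'current minor minus minors_back', following the
    2.31.1 -> 2.29.0 example; it never crosses a major-version boundary.
    """
    major, minor, _patch = (int(part) for part in current.split("."))
    return f"{major}.{max(minor - minors_back, 0)}.0"


def is_supported(version, current):
    """Compare versions numerically, component by component, not as strings."""
    def parse(text):
        return tuple(int(part) for part in text.split("."))

    return parse(version) >= parse(minimum_supported(current))
```

Comparing parsed integer tuples rather than raw strings matters here: as strings, `2.9.0` would sort after `2.29.0`, which would wrongly mark it as supported.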

@@ -82,7 +82,7 @@ For Grafana Cloud, refer to the [instructions to manage a Grafana Cloud stack wi
 | [Notification policy tree][notification-policy] | [grafana_notification_policy](https://registry.terraform.io/providers/grafana/grafana/latest/docs/resources/notification_policy) |
 | [Mute timings][mute-timings] | [grafana_mute_timing](https://registry.terraform.io/providers/grafana/grafana/latest/docs/resources/mute_timing) |
-In this section, we'll create Terraform configurations for each alerting resource and demonstrate how to link them together.
+This section provides examples of Terraform configurations for each alerting resource and demonstrate how to link them together.
 ### Add alert rules
@@ -181,7 +181,7 @@ In this section, we'll create Terraform configurations for each alerting resourc
 - `<terraform_rule_group_name>` with the name of the alert rule group.
-Note that the distinct Grafana resources are connected through `uid` values in their Terraform configurations. The `uid` value will be randomly generated when provisioning.
+Note that the distinct Grafana resources are connected through `uid` values in their Terraform configurations. The `uid` value is randomly generated when provisioning.
 To link the alert rule group with its respective data source and folder in this example, replace the following field values:
@@ -212,7 +212,7 @@ In this section, we'll create Terraform configurations for each alerting resourc
 Replace the following field values:
-- `<terraform_contact_point_name>` with the terraform name of the contact point. It will be used to reference the contact point in other Terraform resources.
+- `<terraform_contact_point_name>` with the terraform name of the contact point. It is used to reference the contact point in other Terraform resources.
 - `<email_address>` with the email to receive alert notifications.
 1. Continue to add more Grafana resources or [use the Terraform CLI for provisioning](#provision-grafana-resources-with-terraform).
@@ -276,7 +276,7 @@ In this section, we'll create Terraform configurations for each alerting resourc
 Replace the following field values:
-- `<terraform_mute_timing_name>` with the name of the Terraform resource. It will be used to reference the mute timing in the Terraform notification policy tree.
+- `<terraform_mute_timing_name>` with the name of the Terraform resource. It is used to reference the mute timing in the Terraform notification policy tree.
 1. Continue to add more Grafana resources or [use the Terraform CLI for provisioning](#provision-grafana-resources-with-terraform).
@@ -286,13 +286,13 @@ In this section, we'll create Terraform configurations for each alerting resourc
 {{% admonition type="warning" %}}
-Since the policy tree is a single resource, provisioning the `grafana_notification_policy` resource will overwrite a policy tree created through any other means.
+Since the policy tree is a single resource, provisioning the `grafana_notification_policy` resource overwrites a policy tree created through any other means.
 {{< /admonition >}}
 1. Find the default notification policy tree. Alternatively, consider writing the resource in code as demonstrated in the example below.
-1. [Export][alerting_export] the notification policy tree in Terraform format. This exports it as [`grafana_notification_policy` Terraform resource](https://registry.terraform.io/providers/grafana/grafana/latest/docs/resources/notification_policy)—edit it if necessary.
+2. [Export][alerting_export] the notification policy tree in Terraform format. This exports it as [`grafana_notification_policy` Terraform resource](https://registry.terraform.io/providers/grafana/grafana/latest/docs/resources/notification_policy)—edit it if necessary.
 ```terraform
 resource "grafana_notification_policy" "my_policy_tree" {
@@ -314,7 +314,7 @@ Since the policy tree is a single resource, provisioning the `grafana_notificati
 - `<terraform_data_source_name>` with the terraform name of the previously defined contact point.
 - `<terraform_folder_name>` with the terraform name of the previously defined mute timing.
-1. Continue to add more Grafana resources or [use the Terraform CLI for provisioning](#provision-grafana-resources-with-terraform).
+3. Continue to add more Grafana resources or [use the Terraform CLI for provisioning](#provision-grafana-resources-with-terraform).
 ### Enable editing resources in the Grafana UI
@@ -368,7 +368,7 @@ To create the previous alerting resources in Grafana with the Terraform CLI, com
 Enter a value:
 ```
-Once you have confirmed to proceed with the changes, Terraform will create the provisioned resources in Grafana!
+After you have confirmed to proceed with the changes, Terraform creates the provisioned resources in Grafana.
 ```shell
 Apply complete! Resources: 4 added, 0 changed, 0 destroyed.