This commit adds contact point testing to ngalerts via a new API
endpoint. This endpoint accepts JSON containing a list of
receiver configurations which are validated and then tested
with a notification for a test alert. The endpoint returns JSON
for each receiver with a status and error message. It accepts
a configurable timeout via the Request-Timeout header (in seconds)
up to a maximum of 30 seconds.
* Alerting: Expose discovered and dropped Alertmanagers
Exposes the API for discovered and dropped Alertmanagers.
* make admin config poll interval configurable
* update after rebase
* wordsmith
* More wordsmithing
* change name of the config
* settings package too
* Alerting: modify table and accessors to limit org access appropriately
* Update migration to create multiple Alertmanager configs
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* replace mg.ClearMigrationEntry()
mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.
Co-authored-by: gotjosh <josue@grafana.com>
* Alerting: Send alerts to external Alertmanager(s)
Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:
- Introduce a new table to hold "admin" (either org or global)
configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
"senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.
There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
* Alerting: Refactor `Run` of the scheduler
A bit of a refactor to make the diff easier to read for supporting
external Alertmanagers.
We'll introduce another routine that checks the database for
configuration and spawns other routines accordingly.
* Block the wait.
* Fix test
* initial attempt at automatic removal of stale states
* test case, need espected states
* finish unit test
* PR feedback
* still multiply by time.second
* pr feedback
* Alerting: deactivate an Alertmanager configuration
Implement DELETE /api/alertmanager/grafana/config/api/v1/alerts
by storing the default configuration which stops existing cnfiguration
from being in use.
* Apply suggestions from code review
* Expand the value of math and reduce expressions in annotations and labels
This commit makes it possible to use the values of reduce and math
expressions in annotations and labels via their RefIDs. It uses the
Stringer interface to ensure that "{{ $values.A }}" still prints the
value in decimal format while also making the labels for each RefID
available with "{{ $values.A.Labels }}" and the float64 value with
"{{ $values.A.Value }}"
* Alerting: Refactor state manager as a dependency
Within the scheduler, the state manager was being passed around a
certain number of functions. I've introduced it as a dependency to keep
the "service" interfaces as clean and homogeneous as possible.
This is relevant, because I'm going to introduce live reload of these
components as part of my next PR and it is better if dependencies are
self-contained.
* remove unused functions
* Fix a few more tests
* Make sure the `stateManager` is declared before the schedule
* Alerting: Allow __value__ label in notifications
was being removed by removePrivateItems
discoverd in #36020, but issue is not about that specifically
* __value__ label to __value_string__ annotation
and .ValueString extended property for notifications
* Fix deleting labels and annotations
* Add test
* Keep no data and error start if not provided
* Allow setting interval and for to zero during rule updates
* Alerting: Implement /status for the notification system
Implements the necessary plumbing to have a /status endpoint on the
notification system.
* Add API examples
* Update API specs
* Update prometheus/common dependency
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
* Fix dashboard alert and nootifier migration for MySQL
* Fix POSTing Alertmanager configuration if no current configuration exists
in case the default configuration has not be stored yet
or has failed to get stored
* Change CreatedAt field type
* Alerting: Expand `{{$labels.xyz}}` template in labels and annotations
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix annotation not updating for same alert
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Alerting: Do not hard fail on templating errors in channels
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
When using mulit-dimensional Grafana managed alerts (e.g. SSE math) extract refIds values and labels so they can be shown in the notification and dashboards.
Threema Gateway supports two types of IDs: Basic IDs (where the
encryption is managed by the API server) and End-to-End IDs (where the
keys are managed by the user).
This plugin currently does not support End-to-End IDs (since it's much
more complex to implement, because the encryption needs to happen
locally). Add a few clarifications to the UI.
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>