363 Commits

Author SHA1 Message Date
Gokhan
cf601fab09
Alerting: Enable sending notifications to a specific topic on Telegram (#79546)
Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2024-02-06 17:19:22 +02:00
William Wernert
2ab7d3c725
Alerting: Receivers API (read only endpoints) (#81751)
* Add single receiver method

* Add receiver permissions

* Add single/multi GET endpoints for receivers

* Remove stable tag from time intervals

See end of PR description here: https://github.com/grafana/grafana/pull/81672
2024-02-05 20:12:15 +02:00
William Wernert
7e939401dc
Alerting: Introduce initial common receiver service (#81211)
* Create locking config store that mimics existing provisioning store

* Rename existing receivers(_test).go

* Introduce shared receiver group service

* Fix test

* Move query model to models package

* ReceiverGroup -> Receiver

* Remove locking config store

* Move convert methods to compat.go

* Cleanup
2024-02-01 14:42:59 -05:00
George Robinson
0726c7c3fa
Alerting: Prevent inhibition rules in Grafana Alertmanager (#81712)
This commit prevents saving configurations containing inhibition
rules in Grafana Alertmanager. It does not reject inhibition
rules when using external Alertmanagers, such as Mimir. This meant
the validation had to be put in the MultiOrgAlertmanager instead of
in the validation of PostableUserConfig. We can remove this when
inhibition rules are supported in Grafana Managed Alerts.
2024-02-01 14:53:15 +00:00
Matthew Jacobson
0ce1ccd6f9
Alerting: Fix inconsistent AM raw config when applied via sync vs API (#81655)
AM config applied via API would use the PostableUserConfig as the AM raw
config and also the hash used to decide when the AM config has changed.
However, when applied via the periodic sync the PostableApiAlertingConfig would
be used instead.

This leads to two issues:
- Inconsistent hash comparisons when modifying the AM causing redundant applies.
- GetStatus assumed the raw config was PostableUserConfig causing the endpoint
to return correctly after a new config is applied via API and then nothing once
 the periodic sync runs.

Note: Technically, the upstream GrafanaAlertmanager GetStatus shouldn't be
returning PostableUserConfig or PostableApiAlertingConfig, but instead
GettableStatus. However, this issue required changes elsewhere and is out of
scope.
2024-01-31 21:05:30 +02:00
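As an illustration of the hash mismatch described in commit 0ce1ccd6f9 above, the following is a minimal sketch, not the Grafana implementation. The types `userConfig` and `innerConfig` are simplified, hypothetical stand-ins for PostableUserConfig and PostableApiAlertingConfig, and the choice of JSON plus SHA-256 is an assumption made only for the demonstration. It shows that hashing the wrapper on one code path and the inner config on the other yields different hashes for the same logical configuration, so a change check based on hash equality would always report a change.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// innerConfig is a hypothetical, simplified stand-in for the inner
// Alertmanager configuration (PostableApiAlertingConfig in the commit text).
type innerConfig struct {
	Receivers []string `json:"receivers"`
}

// userConfig is a hypothetical, simplified stand-in for the wrapper
// configuration (PostableUserConfig in the commit text).
type userConfig struct {
	TemplateFiles map[string]string `json:"template_files"`
	Alerting      innerConfig       `json:"alertmanager_config"`
}

// hashOf serializes a value to JSON and hashes the bytes. The hashing scheme
// is an assumption for this sketch only.
func hashOf(v any) ([32]byte, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return [32]byte{}, err
	}
	return sha256.Sum256(raw), nil
}

func main() {
	cfg := userConfig{
		TemplateFiles: map[string]string{"a.tmpl": "example"},
		Alerting:      innerConfig{Receivers: []string{"email"}},
	}

	// "API" path: hash the whole wrapper.
	viaAPI, err := hashOf(cfg)
	if err != nil {
		panic(err)
	}
	// "Sync" path: hash only the inner config.
	viaSync, err := hashOf(cfg.Alerting)
	if err != nil {
		panic(err)
	}

	// The same logical configuration produces two different hashes, so a
	// "has the config changed?" check would trigger a redundant apply.
	fmt.Println("hashes equal:", viaAPI == viaSync)
}
```

Using the same representation on both paths makes the comparison meaningful again, which is the consistency the commit message describes.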
William Wernert
2203bc2a3d
Alerting: Refactor provisioning tests/fakes (#81205)
* Fix up test Alertmanager config JSON

* Move fake AM config and provisioning stores to fakes package
2024-01-24 17:15:55 -05:00
George Robinson
85b9edcd28
Alerting: Fix incorrect initialization of logger (#81099) 2024-01-23 17:29:38 +02:00
Santiago
3afd94185c
Alerting: Add metric to check for default AM configurations (#80225)
* Alerting: Add metric to check for default AM configurations

* Use a gauge for the config hash

* don't go out of bounds when converting uint64 to float64

* expose metric for config hash

* update metrics after applying config
2024-01-16 17:12:24 +01:00
Santiago
9e78faa7ba
Alerting: Add metrics to the remote Alertmanager struct (#79835)
* Alerting: Add metrics to the remote Alertmanager struct

* rephrase http_requests_failed description

* make linter happy

* remove unnecessary metrics

* extract timed client to separate package

* use histogram collector from dskit

* remove weaveworks dependency

* capture metrics for all requests to the remote Alertmanager (both clients)

* use the timed client in the MimirAuthRoundTripper

* HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function

* refactor

* less git diff

* gauge for last readiness check in seconds

* initialize LastReadinesCheck to 0, tweak metric names and descriptions

* add counters for sync attempts/errors

* last config sync and last state sync timestamps (gauges)

* change latency metric name

* metric for remote Alertmanager mode

* code review comments

* move label constants to metrics package
2024-01-10 11:18:24 +01:00
Santiago
1f6575e65e
Alerting: Test MOA in remote secondary mode (#79828) 2024-01-05 11:05:27 +01:00
Santiago
a77ba40ed4
Alerting: Use the forked Alertmanager for remote secondary mode (#79646)
* (WIP) Alerting: Use the forked Alertmanager for remote secondary mode

* fall back to using internal AM in case of error

* remove TODOs, clean up .ini file, add orgId as part of remote AM config struct

* log warnings and errors, fall back to remoteSecondary, fall back to internal AM only

* extract logic to decide remote Alertmanager mode to a separate function, switch on mode

* tests

* make linter happy

* remove func to decide remote Alertmanager mode

* refactor factory function and options

* add default case to switch statement

* remove ineffectual assignment
2023-12-21 15:26:31 +01:00
Santiago
c46da8ea9b
Alerting: Update alerting package and imports from cluster and clusterpb (#79786)
* Alerting: Update alerting package

* update to latest commit

* alias for imports
2023-12-21 12:34:48 +01:00
Santiago
f7248efff5
Alerting: Fix panic when creating a new Alertmanager returns an error (#79641)
Alerting: Fix panic after error creating new Alertmanager
2023-12-18 15:33:07 +01:00
Santiago
57e0d6bcb5
Chore: Simplify function signature for GetLatestAlertmanagerConfiguration (#79392) 2023-12-12 13:49:54 +01:00
Santiago
73776f37eb
Alerting: Send state to the remote Alertmanager (#78538)
* Alerting: Introduce a Mimir client as part of the Remote Alertmanager

Mimir client that understands the new APIs developed for mimir. Very much a WIP still.

* more wip

* appease the linter

* more linting

* add more code

* get state from kvstore, encode, send

* send state to the remote Alertmanager, extract fullstate logic into its own function

* pass kvstore to remote.NewAlertmanager()

* refactor

* add fake kvstore to tests

* tests

* use FileStore to get state

* always log 'completed state upload'

* refactor compareRemoteConfig

* base64-encode the state in the file store

* export silences and nflog filenames, refactor

* log 'completed state/config upload...' regardless of outcome

* add values to the state store in tests

* address code review comments

* log error from filestore

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-11-29 12:49:39 +01:00
Santiago
01d274852c
Alerting: Add GetFullState method to FileStore (#78701)
* Alerting: Add GetFullState method to FileStore

* make tests compile, create stateStore in NewAlertmanager

* return errors instead of logging, accept an arbitrary number of strings

* make NewAlertmanager() accept a stateStore
2023-11-28 15:34:45 +01:00
Santiago
197f0d2859
Alerting: Add methods for silences to the forked Alertmanager (#77805)
* Alerting: Add an empty Forked Alertmanager

* Alerting: Add methods for silences to the forked Alertmanager

* check for errors in tests

* make linter happy

* make linter happy

* Alerting: Add methods for silences to the forked Alertmanager
2023-11-08 12:03:40 +01:00
Ryan McKinley
5d5f8dfc52
Chore: Upgrade Go to 1.21.3 (#77304) 2023-11-01 09:17:38 -07:00
Santiago
a6b9b27673
Alerting: Remove OrgID() from the Alertmanager interface (#77398) 2023-10-31 10:58:47 +01:00
Yuri Tseretyan
48b55f39bf
Alerting: Add support for responders to Opsgenie integration (#77159)
* add support for responders in opsgenie UI config
* update export model

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-10-27 13:06:46 -04:00
Santiago
f9fc2e4568
Alerting: Remove ConfigHash() from the Alertmanager interface (#77134) 2023-10-25 17:11:53 +02:00
Santiago
322a9c0b15
Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface (#77126)
Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface
2023-10-25 13:58:28 +02:00
gotjosh
866acbd5ac
Alerting: Move ExternalAlertmanager to its own package (#76854)
* Alerting: Move `ExternalAlertmanager` to its own package

We'll avoid import cycles when using components from other packages. In addition to that, I've created an `Options` approach for the multiorg alertmanager to allow us to override how per-tenant alertmanagers are created.

* switch things around

* address review comments

* fix references and warnings
2023-10-20 14:08:13 +02:00
Santiago
a60ec150f9
Alerting: Fetch receivers from remote Alertmanager (#76841)
* Alerting: fetch receivers from remote Alertmanager

* make linter happy

* change require.Eventually() timeout and tick
2023-10-20 11:34:17 +02:00
Santiago
61cb26711e
Alerting: Fetch alerts from a remote Alertmanager (#75844)
* Alerting: post alerts to the remote Alertmanager and fetch them

* fix broken tests

* Alerting: Add Mimir Backend image to devenv (blocks)

* add alerting as code owner for mimir_backend block

* Alerting: Use Mimir image to run integration tests for the remote Alertmanager

* skip integration test when running all tests

* skipping integration test when no Alertmanager URL is provided

* fix bad host for mimir_backend

* remove basic auth testing until we have an nginx image in our CI

* add integration tests for alerts

* fix tests

* change SendCtx -> Send, add context.Context to Send, fix CI

* add recover() for functions from the Prometheus Alertmanager HTTP client that could panic

* add TODO to implement PutAlerts in a way that mimics what Prometheus does

* fix log format
2023-10-19 11:27:37 +02:00
Santiago
7d9b2c73c7
Alerting: Use Mimir image to run integration tests for the remote Alertmanager (#76608)
* Alerting: Use Mimir image to run integration tests for the remote Alertmanager

* skip integration test when running all tests

* skipping integration test when no Alertmanager URL is provided

* fix bad host for mimir_backend

* remove basic auth testing until we have an nginx image in our CI
2023-10-17 12:21:45 +02:00
Matthew Jacobson
82f3127e23
Alerting: Move legacy alert migration from sqlstore migration to service (#72702) 2023-10-12 13:43:10 +01:00
Alexander Weaver
f6649d7a97
Revert "Alerting: Remove vendored models in migration service" (#76387)
Revert "Alerting: Remove vendored models in migration service (#74503)"

This reverts commit 6a8649d544.
2023-10-11 14:21:21 -05:00
Matthew Jacobson
6a8649d544
Alerting: Remove vendored models in migration service (#74503)
This PR replaces the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls.

It also fills in some gaps in the testing suite around:

    - Migration of alert rules: verifying that the actual data model (queries, conditions) is correct 9a7cfa9
    - Secure settings migration: verifying that secure fields remain encrypted for all available notifiers and certain fields migrate from plain text to encrypted secure settings correctly e7d3993

The checks for custom dashboard ACLs will be replaced in a separate targeted PR, as that change is complex enough on its own.
2023-10-11 17:22:09 +01:00
Santiago
73be9449d1
Alerting: Manage remote Alertmanager silences (#75452)
* Alerting: Manage remote Alertmanager silences

* fix typo

* check errors when encoding json in fake external AM

* take path from configured URL, check for nil responses
2023-10-02 07:36:11 -03:00
Santiago
8c1a3f75f9
Alerting: Add empty remote Alertmanager struct (#74864)
* Alerting: Add empty remote alertmanager struct

* Update pkg/services/ngalert/notifier/external_alertmanager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-09-14 08:55:01 -03:00
Nutmos
ad9f0b9e4e
Alerting: Add message options for Telegram contact point (#74635)
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-09-12 10:45:57 -04:00
Santiago
93b9f9b537
Alerting: Use interfaces for the Alertmanager (#73900) 2023-09-06 07:59:29 -03:00
Alexander Weaver
5c9aeaef41
Alerting: Do not exit if Redis ping fails when using redis-based Alertmanager clustering (#74144)
Do not fail redis peer construction if ping fails
2023-09-05 10:43:13 -05:00
Serge Zaitsev
58f6648505
Chore: capitalise messages for alerting (#74335) 2023-09-04 18:46:34 +02:00
George Robinson
439270f6cb
Rename Google Hangouts to Google Chat (#74162)
* Rename Google Hangouts to Google Chat

* Fix prettier
2023-08-31 16:09:22 +03:00
Ryan McKinley
025b2f3011
Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
Alexander Weaver
dfba94e052
Alerting: Limit redis pool size to 5 and make configurable (#74057)
* Limit redis pool size to 5 and expose it in config ini

* Coerce negative pool sizes to the default
2023-08-29 14:59:12 -05:00
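The coercion mentioned in commit dfba94e052 above can be sketched as follows. This is a minimal illustration, not the code from the commit: the names `normalizePoolSize` and `defaultPoolSize` are hypothetical, and passing zero and positive values through unchanged is an assumption. Only the default of 5 and the handling of negative values come from the commit message.

```go
package main

import "fmt"

// defaultPoolSize reflects the limit of 5 named in the commit title.
// The constant name itself is hypothetical.
const defaultPoolSize = 5

// normalizePoolSize (hypothetical name) maps negative configured values to
// the default. Zero and positive values are returned unchanged, which is an
// assumption of this sketch.
func normalizePoolSize(configured int) int {
	if configured < 0 {
		return defaultPoolSize
	}
	return configured
}

func main() {
	for _, size := range []int{-3, 5, 12} {
		fmt.Printf("configured=%d effective=%d\n", size, normalizePoolSize(size))
	}
}
```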
George Robinson
bbef000202
Alerting: Add contact point for Grafana OnCall (#73733)
Add contact point for Grafana OnCall
2023-08-24 10:45:12 +02:00
Yuri Tseretyan
90e3f516ff
Alerting: Update Discord settings to treat 'url' as a secure setting (#69588)
* make discord url secure

* support migrating unsecure settings to secure settings

* Update public/app/features/alerting/unified/utils/receiver-form.ts

Co-authored-by: William Wernert <william.wernert@grafana.com>

---------

Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
Co-authored-by: William Wernert <william.wernert@grafana.com>
2023-08-16 09:03:56 +02:00
Matthew Jacobson
d31d175109
Alerting: Fix contact point testing with secure settings (#72235)
* Alerting: Fix contact point testing with secure settings

Fixes double encryption of secure settings during contact point testing and removes code duplication
that helped cause the drift between alertmanager and test endpoint. Also adds integration tests to cover
the regression.

Note: provisioningStore is created to remove cycle and the unnecessary dependency.
2023-07-25 10:04:27 -04:00
Matthew Jacobson
e3787de470
Alerting: Fix Alertmanager change detection for receivers with secure settings (#71307)
* Alerting: Make ApplyAlertmanagerConfiguration only decrypt/encrypt new/changed secure settings

Previously, ApplyAlertmanagerConfiguration would decrypt and re-encrypt all secure settings. However, this caused re-encrypted secure settings to be included in the raw configuration when applied to the embedded alertmanager, resulting in changes to the hash. Consequently, even if no actual modifications were made, saving any alertmanager configuration triggered an apply/restart and created a new historical entry in the database.

To address the issue, this modifies ApplyAlertmanagerConfiguration, which is called by POST `api/alertmanager/grafana/config/api/v1/alerts`, to decrypt and re-encrypt only new and updated secure settings. Unchanged secure settings are loaded directly from the database without alteration.

We determine whether secure settings have changed based on the following (already in-use) assumption: Only new or updated secure settings are provided via the POST `api/alertmanager/grafana/config/api/v1/alerts` request, while existing unchanged settings are omitted.

* Ensure saving a grafana-managed contact point will only send new/changed secure settings

Previously, when saving a grafana-managed contact point, empty string values were transmitted for all unset secure settings. This led to potential backend issues, as it assumed that only newly added or updated secure settings would be provided.

To address this, we now exclude empty ('', null, undefined) secure settings, unless there was a pre-existing entry in secureFields for that specific setting. In essence, this means we only transmit an empty secure setting if a previously configured value was cleared.

* Fix linting

* refactor omitEmptyUnlessExisting

* fixup

---------

Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
2023-07-11 08:23:07 +02:00
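The approach described in commit e3787de470 above, where only new or changed secure settings are encrypted and unchanged ones are taken from storage as they are, can be sketched like this. It is a simplified illustration under stated assumptions, not the actual ApplyAlertmanagerConfiguration code: `mergeSecureSettings` and `fakeEncrypt` are hypothetical, and treating an empty incoming value as "clear this setting" is an assumption based on the frontend behavior described in the same commit message.

```go
package main

import "fmt"

// mergeSecureSettings (hypothetical helper) combines stored, already
// encrypted settings with the settings supplied in a request. Stored
// ciphertext is copied untouched; only supplied values are encrypted.
func mergeSecureSettings(stored, incoming map[string]string, encrypt func(string) string) map[string]string {
	merged := make(map[string]string, len(stored)+len(incoming))
	for key, ciphertext := range stored {
		merged[key] = ciphertext // unchanged: reuse the stored ciphertext as is
	}
	for key, plaintext := range incoming {
		if plaintext == "" {
			// Assumption: an empty value means the setting was cleared.
			delete(merged, key)
			continue
		}
		merged[key] = encrypt(plaintext)
	}
	return merged
}

func main() {
	calls := 0
	// fakeEncrypt stands in for real encryption. Its output differs on every
	// call, which mimics why re-encrypting unchanged values alters the
	// serialized configuration and therefore its hash.
	fakeEncrypt := func(plaintext string) string {
		calls++
		return fmt.Sprintf("enc#%d(%s)", calls, plaintext)
	}

	stored := map[string]string{"url": "enc#0(https://example.invalid/hook)"}
	incoming := map[string]string{"token": "s3cret"}

	merged := mergeSecureSettings(stored, incoming, fakeEncrypt)
	fmt.Println(merged["url"] == stored["url"]) // true: stored value untouched
	fmt.Println(merged["token"])                // enc#1(s3cret)
}
```

Because untouched settings keep their stored ciphertext, saving a configuration without real modifications serializes to the same bytes as before, so the hash stays the same.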
João Calisto
1d68f5ba77
Alerting: Fix HA alerting membership sync (#70607)
* Alerting: Fix HA alerting membership sync

* Added comment about filtering duplicates
2023-06-26 17:12:10 +01:00
Andreas Deininger
95b1f3c875
Fixing typos (#70487) 2023-06-22 09:43:38 +01:00
Santiago
ff9eff49bd
Alerting: Bump grafana/alerting and refactor the ImageStore/Provider to provide image URL/bytes (#70182)
* implement alerting.images.Provider interface in our ImageStore

* add URLExists() method to fakeConfigStore

* make linter happy

* update integration tests
2023-06-21 20:53:30 -03:00
George Robinson
a1cb7319d5
Alerting: Update in app documentation for customizing message and subject (#70367) 2023-06-20 12:20:01 +02:00
George Robinson
f085e99d3c
Alerting: Add matchers metrics to Alertmanager (#69855) 2023-06-15 09:18:01 +01:00
Matthew Jacobson
ba3994d338
Alerting: Repurpose rule testing endpoint to return potential alerts (#69755)
* Alerting: Repurpose rule testing endpoint to return potential alerts

This feature replaces the existing Grafana ruler testing API endpoint /api/v1/rule/test/grafana, which is no longer in use. The new endpoint returns a list of potential alerts created by the given alert rule, including built-in + interpolated labels and annotations.

The key priority of this endpoint is that it is intended to be as true as possible to what would be generated by the ruler except that the resulting alerts are not filtered to only Resolved / Firing and ready to be sent.

This means that the endpoint will, among other things:

- Attach static annotations and labels from the rule configuration to the alert instances.
- Attach dynamic annotations from the datasource to the alert instances.
- Attach built-in labels and annotations created by the Grafana Ruler (such as alertname and grafana_folder) to the alert instances.
- Interpolate templated annotations / labels and accept allowed template functions.
2023-06-08 18:59:54 -04:00
Matthew Jacobson
c16f1f5e99
Alerting: Fix provisioned templates being ignored by alertmanager (#69485)
* Alerting: Fix provisioned templates being ignored by alertmanager

Template provisioning sets the template in cfg.TemplateFiles while a recent change
made it so that alertmanager reads cfg.AlertmanagerConfig.Templates instead.

This change fixes the issue on both ends: provisioning now sets both fields, and the
change on the alertmanager side is reverted so that it uses cfg.TemplateFiles.
2023-06-02 15:47:43 -04:00
Alexander Weaver
0ed5d3bdf2
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes" (#69265)
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)"

This reverts commit 72a187b0be.
2023-05-30 11:33:33 -05:00