Commit Graph

135 Commits

Author SHA1 Message Date
George Robinson
924deda589
Fix Discord Webhook URL for invalid template (#44763)
This commit fixes an issue where an invalid template for Discord would change the Webhook URL to "" and cause "unsupported protocol scheme" errors.
2022-02-02 14:28:41 +01:00
idafurjes
12420260ef
Remove bus from org invite api (#44530)
* Remove bus from org invite api

* Fix lint

* Remove comment
2022-01-31 17:24:52 +01:00
Serge Zaitsev
84a5910e56
Chore: Remove bus from ngalert (#44465)
* pass notification service down to the notifiers

* add ns to all notifiers

* remove bus from ngalert notifiers

* use smaller interfaces for notificationservice

* attempt to fix the tests

* remove unused struct field

* simplify notification service mock

* trying to resolve issues in the tests

* make linter happy

* make linter even happier

* linter, you are annoying
2022-01-26 16:42:40 +01:00
Yuriy Tseretyan
ea478dec22
Alerting: Remove bridge between log15 and go-kit logger (#43769)
* remove bridge between log15 and go-kit logger.

* fix tests
2022-01-07 09:40:09 +01:00
Alexander Weaver
fd583a0e3b
Alerting: Allow customization of Google chat message (#43568)
* Allow customizable googlechat message via optional setting

* Add optional message field in googlechat contact point configurator

* Fix strange error message on send if template fails to fully evaluate

* Elevate template evaluation failure logs to Warn level

* Extract default.title template embed from all channels to shared constant
2022-01-05 09:47:08 -06:00
idafurjes
8e6d6af744
Rename DispatchCtx to Dispatch (#43563)
2021-12-28 17:36:22 +01:00
idafurjes
7936c4c522
Rename AddHandlerCtx to AddHandler (#43557)
2021-12-28 16:08:07 +01:00
idafurjes
56c3875bb9
Chore: Remove context.TODO (#43458)
* Remove context.TODO() from services

* Fix live test
2021-12-28 10:26:18 +01:00
Alexander Weaver
56b3dc5445
Alerting: Allow configuration of non-ready alertmanagers (#43063)
* Create API test for overwriting invalid alertmanager config

* Avoid requiring alertmanager readiness for config changes

* AlertmanagerSrv depends on functionality rather than concrete types

* Add test for non-ready alertmanagers

* Additional cleanup and polish

* Back out previous integration test changes

* Refactor of tests incorrectly caused a test to become redundant

* Use pre-existing fake secret service

* Drop unused interface

* Test against concrete MultiOrgAlertmanager re-using fake infra from other tests

* Fix linter error

* Empty commit to rerun checks
2021-12-27 17:01:17 -06:00
Alexander Weaver
9abdaf251f
Alerting: Fix global state sensitivity in notifier channel tests (#43508)
2021-12-27 11:58:17 -06:00
Jean-Philippe Quéméner
ffc72aa255
Alerting: fix gosec warning that is not valid (#43425)
2021-12-21 19:47:47 +01:00
Gilles De Mey
bb3b5c10e7
Alerting: fix WeCom channel notifier test assertion (#43173)
2021-12-15 19:45:12 +01:00
Gilles De Mey
cbbbb505b4
Alerting: use HTML-safe characters for the default template (#43148)
2021-12-15 17:57:08 +01:00
smallpath
aec14cba42
Alerting: Support WeCom as a contact point type (#40975)
* add wecom notifier

* fix backend lint

* fix alerting channel test

* update wecom doc

* update notifiers

* update wecom notifier test

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* unify wecom alerting

* fix backend lint

* fix front lint

* fix wecom test

* update docs

* Update pkg/services/ngalert/notifier/channels/wecom.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* remove old wecom notifier

* remove old notifier doc

* fix backend test

* Update docs/sources/alerting/unified-alerting/contact-points.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* fix doc style

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-12-15 16:42:03 +00:00
Yuriy Tseretyan
1db9b1e6a9
Improve bridge for Alertmanager logger (#42958)
* Implement go-kit/log.Logger for internal logger.
2021-12-13 09:41:53 -05:00
gotjosh
bdab1d1f1f
Fix flaky tests in several notifiers (#42668)
* Fix flaky tests in several notifiers

- Non-mocked time in sensu go tests
- Close server in Slack tests
- Use a mutex for writing responses in the fake slack server

* Remove mutex at the fake slack server
2021-12-03 12:34:31 +00:00
Sofia Papagiannaki
9c7b52fd36
Alerting: Fix API specification (#42282)
* Alerting: Fix API specification
2021-11-30 20:55:54 +01:00
George Robinson
9122e7f647
Alerting: Check for nil model.Settings and models.SecureSettings (#37738)
2021-11-22 11:56:18 +00:00
Peter Holmberg
97978a7c02
Alerting: Add value to notifier template (#41951)
* add value to email template

* add value to default template

* update test string

* test: fix ngalert test suite

* test: run CI

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2021-11-22 08:45:44 +01:00
Jean-Philippe Quéméner
b9cdad3814
Alerting: support mute timings configuration through the api for the embedded alertmanager (#41533)
* Alerting: accept mute_timing_intervals through the api for the embedded alertmanager

* add workaround for mutetimeinterval

* add mute timings to routes

* revert changes

* Update pkg/services/ngalert/api/api_alertmanager.go

* Update pkg/services/ngalert/api/api_alertmanager.go

* Update pkg/services/ngalert/api/api_alertmanager.go

* update prometheus/alertmanager dependency

* add some var docs
2021-11-19 16:50:55 +01:00
ying-jeanne
54de1078c8
remove the global log error/warn etc functions (#41404)
* remove the global log error/warn etc functions and use request context logger whenever possible
2021-11-08 17:56:56 +01:00
Tania B
5652bde447
Encryption: Use secrets service (#40251)
* Use secrets service in pluginproxy

* Use secrets service in plugincontext

* Use secrets service in pluginsettings

* Use secrets service in provisioning

* Use secrets service in authinfoservice

* Use secrets service in api

* Use secrets service in sqlstore

* Use secrets service in dashboardsnapshots

* Use secrets service in tsdb

* Use secrets service in datasources

* Use secrets service in alerting

* Use secrets service in ngalert

* Break cyclic dependency

* Refactor service

* Break cyclic dependency

* Add FakeSecretsStore

* Setup Secrets Service in sqlstore

* Fix

* Continue secrets service refactoring

* Fix cyclic dependency in sqlstore tests

* Fix secrets service references

* Fix linter errors

* Add fake secrets service for tests

* Refactor SetupTestSecretsService

* Update setting up secret service in tests

* Fix missing secrets service in multiorg_alertmanager_test

* Use fake db in tests and sort imports

* Use fake db in datasources tests

* Fix more tests

* Fix linter issues

* Attempt to fix plugin proxy tests

* Pass secrets service to getPluginProxiedRequest in pluginproxy tests

* Fix pluginproxy tests

* Revert using secrets service in alerting and provisioning

* Update decryptFn in alerting migration

* Rename defaultProvider to currentProvider

* Use fake secrets service in alert channels tests

* Refactor secrets service test helper

* Update setting up secrets service in tests

* Revert alerting changes in api

* Add comments

* Remove secrets service from background services

* Convert global encryption functions into vars

* Revert "Convert global encryption functions into vars"

This reverts commit 498eb19859.

* Add feature toggle for envelope encryption

* Rename toggle

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
Co-authored-by: Joan López de la Franca Beltran <joanjan14@gmail.com>
2021-11-04 18:47:21 +02:00
Yuriy Tseretyan
a1e1a728ad
Alerting: Update references to alertmanager (#40904)
* update module reference for alertmanager
* remove workaround
2021-10-29 10:03:51 -04:00
Santiago
c9654c4bc0
Fix issues with Slack contact points (#40953)
* recipient validation regex modified, validation at creation/modification implemented

* Remove validation for recipient, fix tests

* Log level changed from Warn to Error
2021-10-27 13:58:37 -03:00
Skye
bce1011361
Alerting: Option for Discord notifier to use webhook name (#40463)
* Added an option to discord notifier to use discord's webhook name (useful for customizing notifications).

* Support ngalert system with discord username toggle

* Added ngalert discord test

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Docs updated with discord username setting

* Fix api integration test

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-10-26 14:55:10 -04:00
Santiago
1095f69740
Fix slack contact point panic (code review changes) (#40770)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test

* Code review changes

* Apply PR comments

Co-authored-by: Armand Grillet <armand.grillet@outlook.com>
2021-10-22 17:23:22 +02:00
gotjosh
74fb491b6a
Alerting: Validate contact point configuration during migration to Unified Alerting (#40717)
* Alerting: Validate contact point configuration during the migration

This minimises the chances of generating broken configuration as part of the migration. Originally, we wanted to generate it and not produce a hard stop in Grafana, but that strategy risks notifications not being delivered to our users.

We now think it's better to hard stop the migration and let the user take care of resolving the configuration manually.
2021-10-22 10:11:06 +01:00
George Robinson
967721068e
Alerting: Support custom annotations and labels when testing contact points
Support custom annotations and labels when testing contact points
2021-10-21 13:47:06 +01:00
Santiago
e54fe220e5
Fix panic when Slack API sends unexpected response (#40721)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test
2021-10-21 08:12:15 +02:00
Yuriy Tseretyan
36e12e2b26
Remove flakiness of google chat tests (#40592)
2021-10-19 16:18:44 -04:00
Jean-Philippe Quéméner
153c356993
Alerting: delete orphaned records from kvstore (#40337)
2021-10-14 12:04:00 +02:00
gotjosh
48d73cb148
Alerting: Fixes a bug when trying to sync broken alertmanager config (#40338)
* Alerting: Fixes a bug when trying to sync broken alertmanager config

Broken alertmanager configuration has the potential to be introduced as part of a migration e.g. due to incompatible data between what grafana accepts and what the Alertmanager expects. When this happens, we expect an eventually consistent behaviour where we'll keep trying to apply the configuration until it works.

As part of the change in https://github.com/grafana/grafana/pull/39237 we introduced a regression that modified this behaviour and instead tried to create a new Alertmanager for that organization every time, which eventually ended up in a panic due to duplicate metrics being registered.

This PR fixes that and introduces a test to catch further regressions.

* Remove disable orgs
2021-10-12 18:10:08 +01:00
Jean-Philippe Quéméner
e1dfec49f9
Alerting: cleanup alert resources on org removal (#39938)
2021-10-12 12:05:02 +02:00
George Robinson
8318e45452
Alerting: Fix error message in ngalert when notifications cannot be sent to alertmanager (#40158)
2021-10-11 14:50:50 +01:00
Jean-Philippe Quéméner
d9c0220824
Alerting: add organization ID to the ngAlert webhook payload (#40189)
* Alerting: add organization ID to the ngAlert webhook payload
2021-10-08 14:52:44 +02:00
Yuriy Tseretyan
5836def6c2
Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976)
2021-10-07 17:30:06 -04:00
Joan López de la Franca Beltran
722c414fef
Encryption: Refactor securejsondata.SecureJsonData to stop relying on global functions (#38865)
* Encryption: Add support to encrypt/decrypt sjd

* Add datasources.Service as a proxy to datasources db operations

* Encrypt ds.SecureJsonData before calling SQLStore

* Move ds cache code into ds service

* Fix tlsmanager tests

* Fix pluginproxy tests

* Remove some securejsondata.GetEncryptedJsonData usages

* Add pluginsettings.Service as a proxy for plugin settings db operations

* Add AlertNotificationService as a proxy for alert notification db operations

* Remove some securejsondata.GetEncryptedJsonData usages

* Remove more securejsondata.GetEncryptedJsonData usages

* Fix lint errors

* Minor fixes

* Remove encryption global functions usages from ngalert

* Fix lint errors

* Minor fixes

* Minor fixes

* Remove securejsondata.DecryptedValue usage

* Refactor the refactor

* Remove securejsondata.DecryptedValue usage

* Move securejsondata to migrations package

* Move securejsondata to migrations package

* Minor fix

* Fix integration test

* Fix integration tests

* Undo undesired changes

* Fix tests

* Add context.Context into encryption methods

* Fix tests

* Fix tests

* Fix tests

* Trigger CI

* Fix test

* Add names to params of encryption service interface

* Remove bus from CacheServiceImpl

* Add logging

* Add keys to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Add missing key to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Undo changes in markdown files

* Fix formatting

* Add context to secrets service

* Rename decryptSecureJsonData to decryptSecureJsonDataFn

* Name args in GetDecryptedValueFn

* Add template back to NewAlertmanagerNotifier

* Copy GetDecryptedValueFn to ngalert

* Add logging to pluginsettings

* Fix pluginsettings test

Co-authored-by: Tania B <yalyna.ts@gmail.com>
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
2021-10-07 17:33:50 +03:00
gotjosh
6572017ec7
Alerting: Allow more characters in label names so notifications are sent (#38629)
Remove validation for labels to be accepted in the Alertmanager. This helps with datasources that produce non-compatible labels.

Adds "object_matchers" to Alertmanager routes so we can support label names with extended characters beyond prometheus/openmetrics. It only does this for the internal Grafana-managed Alertmanager.

This requires a change to Alertmanager, so for now we use grafana/alertmanager, which is a slight fork, with the intention of going back to upstream.

The frontend handles the migration of "matchers" -> "object_matchers" when the route is edited and saved. Once this is done, downgrades will not work, because old versions will not recognize "object_matchers".

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
2021-10-04 15:06:40 +02:00
Yuriy Tseretyan
4dadb8fc51
Alerting: Remove extra field orgId from notifier.Alertmanager (#39870)
2021-10-01 09:54:37 -04:00
Sofia Papagiannaki
012d4f0905
Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki
f6f3a54742
Alerting: tune rule evaluation via configuration (#35623)
* Alerting: Configure max evaluation retries

* Alerting: Enforce minimum rule evaluation interval

* Alerting: Disable rule evaluation from configuration

* Update docs

* Alerting: Configure rule evaluation timeout

* Move options on unified_alerting config section

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-28 13:00:16 +03:00
Yuriy Tseretyan
05eb30e323
Alerting: Move alertmanager default config to UnifiedAlertingSettings (#39597)
2021-09-23 13:52:20 -04:00
Yuriy Tseretyan
1910d85ae0
Alerting: Optimization of fetching data in multiorg alertmanager (#39237)
* Add method GetAllLatestAlertmanagerConfiguration to DBStore
* add method ApplyConfig to AlertManager
* update multiorg alert manager to load all alertmanager configs at once
2021-09-21 11:01:23 -04:00
gotjosh
2ad82b9354
Alerting: Move the unified alerting settings to its own struct (#39350)
2021-09-20 10:12:21 +03:00
Yuriy Tseretyan
e1aae0549e
Provide reader to alertmanager silence instead of file path (#39305)
2021-09-17 14:12:27 -04:00
gotjosh
7db97097c9
Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00
Santiago
0d2e68537c
Alerting: Cleanup template, silence and notification files created du… (#39007)
* Alerting: Cleanup template, silence and notification files created during tests

* Create tempdir for testing, delete afterwards and check for errors

* Refactoring error checks

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:52 -03:00
gotjosh
2b1d3d27e4
Alerting: Fix bug not creating filepath for silences/nflog if it does not exist (#39174)
We created this filepath just as we were about to persist the templates. With the latest change, we now need to create it sooner.
2021-09-14 14:40:59 +01:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
Marcus Efraimsson
2cc0788187
Chore: Disable backend test for now since it adds 10 minutes extra in CI (#39150)
Ref #38586
2021-09-13 19:37:26 +02:00