Commit Graph

476 Commits

Author SHA1 Message Date
Jean-Philippe Quéméner
ffc72aa255
Alerting: fix gosec warning that is not valid (#43425) 2021-12-21 19:47:47 +01:00
idafurjes
ff3cf94b56
Chore: Remove context.TODO() from services (#42555)
* Remove context.TODO() from services

* Fix live test
2021-12-20 17:05:33 +01:00
Yuriy Tseretyan
1a762083d7
Alerting: make alert rule routine evaluation control be thread-safe (#41220)
* change registry.delete to return deleted struct
* use pointer to alertRuleInfo instead copying.
* do not access evaluation channel when routine is stopped
* remove stopCh and use context cancellation
* do not return ctx.Err when channel is cancelled because it cancels all other routines
* make alertRuleInfo fields and functions package private
2021-12-16 14:52:47 -05:00
Ryan McKinley
2754e4fdf0
Expressions: use datasource model from the query (#41376)
* refactor datasource loading

* refactor datasource loading

* pass uid

* use dscache in alerting to get DS

* remove expr/translate pacakge

* remove dup injection entry

* fix DS type on metrics endpoint, remove SQL DS lookup inside SSE

* update test and adapter

* comment fix

* Make eval run as admin when getting datasource info

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>

* fmt and comment

* remove unncessary/redundant code

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2021-12-16 13:51:46 -03:00
Jean-Philippe Quéméner
b605340668
Alerting: log errors happening in the API on server side (#43192)
* Alerting: log errors happening in the API on server side

* adapt tests to reflect changed payload
2021-12-16 13:33:10 +01:00
Gilles De Mey
bb3b5c10e7
Alerting: fix WeCom channel notifier test assertion (#43173) 2021-12-15 19:45:12 +01:00
Gilles De Mey
cbbbb505b4
Alerting: use HTML-safe characters for the default template (#43148) 2021-12-15 17:57:08 +01:00
smallpath
aec14cba42
Alerting: Support WeCom as a contact point type (#40975)
* add wecom notifier

* fix backend lint

* fix alerting channel test

* update wecom doc

* update notifiers

* update wecom notifier test

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* unify wecom alerting

* fix backend lint

* fix front lint

* fix wecom test

* update docs

* Update pkg/services/ngalert/notifier/channels/wecom.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update docs/sources/alerting/old-alerting/notifications.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* remove old wecom notifier

* remove old notifier doc

* fix backend test

* Update docs/sources/alerting/unified-alerting/contact-points.md

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* fix doc style

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-12-15 16:42:03 +00:00
Yuriy Tseretyan
1db9b1e6a9
Improve bridge for Alertmanager logger (#42958)
* Implement go-kit/log.Logger for internal logger.
2021-12-13 09:41:53 -05:00
Sofia Papagiannaki
c6483cd8ed
Alerting: Refactor API handlers to use web.Bind (#42600)
* Alerting: Refactor API handlers to use web.Bind

* lint
2021-12-13 09:22:57 +01:00
gotjosh
bdab1d1f1f
Fix flaky tests in several notifiers (#42668)
* Fix flaky tests in several notifiers

- Non-mocked time in sensu go tests
- Close server in Slack tests
- Use a mutex for writing responses in the fake slack server

* Remove mutex at the fake slack server
2021-12-03 12:34:31 +00:00
George Robinson
c932dc959c
Alerting: Add Ref ID to DatasourceNoData and DatasourceError alerts (#42630) 2021-12-03 09:55:16 +00:00
gotjosh
5b64c4f684
Alerting: Fix panic while proxying 4xx responses of requests to cortex/loki (#42570)
Fixes a panic that would ocurr as we proxy 4xx responses. When this happens and the content type of the response is JSON we try to check if the response has a "message" key. Then, we assume that the key will contain a value of string but we don't take into account that this value can potentially be `null`.

This adds a type assertion check to to this assumption so that we can keep the original JSON body as the response if we're unable to extract an `message`.
2021-12-01 13:53:29 +00:00
gotjosh
357e9ed1ea
Alerting: Fix Annotation Creation when the alerting state changes (#42479)
* Fix Annotation creation
- Remove validation of panelID, now annotations are created irrespective on whether they're attached to a panel or not.
- Alwasy attach the annotation to an AlertID

* Fix annotation creation

* fix tests
2021-12-01 11:04:54 +00:00
Sofia Papagiannaki
9c7b52fd36
Alerting: Fix API specification (#42282)
* Alerting: Fix API specification
2021-11-30 20:55:54 +01:00
Santiago
a21d1e50f1
avoid template execution errors on missing values (#41617) 2021-11-29 15:26:51 -03:00
gotjosh
dd5a2e5128
Alerting: Clear alerting rule evaluation errors after intermittent failures (#42386)
* Alerting: Clear alerting rule evaluation errors after intermittent failures

When an alert transitioned in a way that `alerting -> error -> (alerting|nodata)`, the error provided by the `error` state would never be cleared thus the API and UI would show the health as an error.
2021-11-26 17:58:19 +00:00
George Robinson
1b26d4d88e
Alerting: Create DatasourceError alert if evaluation returns error (#41869)
* Alerting: Create DatasourceError alert if evaluation returns error

* Alerting: Add docs for DatasourceError alert

* Alerting: Fix DatasourceError alert does not have dashboard_uid label

* Alerting: Add break when datasource_uid found

* Alerting: Update TestProcessEvalResults
2021-11-25 11:46:47 +01:00
George Robinson
1e5b0e64ac
Alerting: Add comments to ScheduleService interface (#42228) 2021-11-25 10:12:04 +00:00
Armand Grillet
6523486122
Alerting: Make Unified Alerting enabled by default for those who do not use legacy alerting (#42200)
* update AlertingEnabled and UnifiedAlertingSettings.Enabled to be pointers
* add a pseudo migration to fix the AlertingEnabled and UnifiedAlertingSettings.Enabled if the latter is not defined
* update the default configuration file to make default value for both 'enabled' flags be undefined

Misc
* update Migrator to expose DB engine. This is needed for a ualert migration to access the database while the list of migrations is created.
* add more verbose failure when migrations do not match

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2021-11-24 14:56:07 -05:00
Jean-Philippe Quéméner
cec2d965ec
Alerting: validate mute timings in the alertmanager configuration (#42125)
* Alerting: check for uniqueness of mutetime names

* add some testing

* add name validation

* add root route validation

* add tests for validation

* add check for root route mute_time_intervals

* add duplicate test

* remove useless yaml test

* refactor table test
2021-11-23 16:25:20 +01:00
George Robinson
9122e7f647
Alerting: Check for nil model.Settings and models.SecureSettings (#37738) 2021-11-22 11:56:18 +00:00
Peter Holmberg
97978a7c02
Alerting: Add value to notifier template (#41951)
* add value to email template

* add value to default template

* update test string

* test: fix ngalert test suite

* test: run CI

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2021-11-22 08:45:44 +01:00
Jean-Philippe Quéméner
b9cdad3814
Alerting: support mute timings configuration through the api for the embedded alertmanager (#41533)
* Alerting: accept mute_timing_intervals through the api for the embedded alertmanager

* add workaround for mutetimeinterval

* add mute timings to routes

* revert changes

* Update pkg/services/ngalert/api/api_alertmanager.go

* Update pkg/services/ngalert/api/api_alertmanager.go

* Update pkg/services/ngalert/api/api_alertmanager.go

* update prometheus/alertmanager dependency

* add some var docs
2021-11-19 16:50:55 +01:00
George Robinson
ac59b7615d
Alerting: README.md for ngalert 2021-11-18 10:26:42 +00:00
George Robinson
5f5298ad25
Alerting: Use require.ElementsMatch in TestEvaluateExecutionResultsNoData 2021-11-16 18:58:48 +01:00
George Robinson
4d288cc6c7
Alerting: Fix NoData tests (#41759) 2021-11-16 16:41:32 +00:00
George Robinson
d363e19517
Alerting: Add datasource_uid label to DatasourceNoData alerts (#41621) 2021-11-16 10:03:18 +00:00
Peter Holmberg
b2d7162168
Alerting: Add external Alertmanagers (#39183)
* building ui

* saving alertmanager urls

* add actions and api call to get external ams

* add list to add modal

* add validation and edit/delete

* work on merging results

* merging results

* get color for status heart

* adding tests

* tests added

* rename

* add pollin and status

* fix list sync

* fix polling

* add info icon with actual tooltip

* fix test

* Accessibility things

* fix strict error

* delete public/dist files

* Add API tests for invalid URL

* start redo admin test

* Fix for empty configuration and test

* remove admin test

* text updates after review

* suppress appevent error

* fix tests

* update description

Co-authored-by: gotjosh <josue@grafana.com>

* fix text plus go lint

* updates after pr review

* Adding docs

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* prettier

* updates after docs feedback

Co-authored-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue@grafana.com>
2021-11-12 22:19:16 +01:00
Gilles De Mey
dbe78e47b1
fix: check lotex endpoint URL (#41429)
* fix: check lotex endpoint URL

* Add validation for data sources URLs

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2021-11-11 07:45:56 +01:00
Santiago
a45e4ff73f
graphLink and tableLink template functions (#41369)
* graphLink and tableLink functions, docs updated

* Code review changes

* extract query struct outside of graphLink and tableLink functions

* Fix docs

* Update docs/sources/alerting/unified-alerting/alerting-rules/alert-annotation-label.md

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

* Update docs/sources/alerting/unified-alerting/alerting-rules/alert-annotation-label.md

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

* Update docs/sources/alerting/unified-alerting/alerting-rules/alert-annotation-label.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/alerting/unified-alerting/alerting-rules/alert-annotation-label.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Fix linting errors

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-11-10 10:36:03 -03:00
Marcus Efraimsson
baab021fec
Chore: Refactor usage of legacy data contracts (#41218)
Refactor usage of legacy data contracts. Moves legacy data contracts 
to pkg/tsdb/legacydata package.
Refactor pkg/expr to be a proper service/dependency that can be provided 
to wire to remove some unneeded dependencies to SSE in ngalert and other places.
Refactor pkg/expr to not use the legacydata,RequestHandler and use 
backend.QueryDataHandler instead.
2021-11-10 11:52:16 +01:00
Yuriy Tseretyan
24ad769831
Alerting: Fix flaky TestAlertInstanceOperations (#41435)
* use UID generator for text
2021-11-09 10:42:58 -05:00
ying-jeanne
54de1078c8
remove the global log error/warn etc functions (#41404)
* remove the global log error/warn etc functions and use request context logger whenever possible
2021-11-08 17:56:56 +01:00
gotjosh
6220872633
Alerting: fix bug where user is able to access rules from namespaces user is not part of (#41403)
* Add fix
* Add tests
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
Co-authored-by: George Robinson <george.robinson@grafana.com>
2021-11-08 14:26:08 +01:00
Ryan McKinley
3489721ed6
api/ds/query: simplify data sources lookup for queries and expressions (#41172) 2021-11-05 08:12:55 -07:00
Yuriy Tseretyan
610643a668
Alerting: Special alert instance if rule is in state NoData (#40540)
* do not suppress NoData state
* extract conversion of state to postable alert + tests
* create a special alert instance if nodata 
* use NoData when converting from Keep Last State instead of Alerting
* add silence during migration if NoData is mapped to KeepLastState.
2021-11-04 16:42:34 -04:00
Yuriy Tseretyan
1b5b747885
Alerting: Additional Tests for State Manager (#41291)
* rename fakeInstanceStore to FakeInstanceStore

* update test for state manager to initialize instance store with FakeInstanceStore
2021-11-04 15:15:56 -04:00
Tania B
5652bde447
Encryption: Use secrets service (#40251)
* Use secrets service in pluginproxy

* Use secrets service in pluginxontext

* Use secrets service in pluginsettings

* Use secrets service in provisioning

* Use secrets service in authinfoservice

* Use secrets service in api

* Use secrets service in sqlstore

* Use secrets service in dashboardshapshots

* Use secrets service in tsdb

* Use secrets service in datasources

* Use secrets service in alerting

* Use secrets service in ngalert

* Break cyclic dependancy

* Refactor service

* Break cyclic dependancy

* Add FakeSecretsStore

* Setup Secrets Service in sqlstore

* Fix

* Continue secrets service refactoring

* Fix cyclic dependancy in sqlstore tests

* Fix secrets service references

* Fix linter errors

* Add fake secrets service for tests

* Refactor SetupTestSecretsService

* Update setting up secret service in tests

* Fix missing secrets service in multiorg_alertmanager_test

* Use fake db in tests and sort imports

* Use fake db in datasources tests

* Fix more tests

* Fix linter issues

* Attempt to fix plugin proxy tests

* Pass secrets service to getPluginProxiedRequest in pluginproxy tests

* Fix pluginproxy tests

* Revert using secrets service in alerting and provisioning

* Update decryptFn in alerting migration

* Rename defaultProvider to currentProvider

* Use fake secrets service in alert channels tests

* Refactor secrets service test helper

* Update setting up secrets service in tests

* Revert alerting changes in api

* Add comments

* Remove secrets service from background services

* Convert global encryption functions into vars

* Revert "Convert global encryption functions into vars"

This reverts commit 498eb19859.

* Add feature toggle for envelope encryption

* Rename toggle

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
Co-authored-by: Joan López de la Franca Beltran <joanjan14@gmail.com>
2021-11-04 18:47:21 +02:00
Sofia Papagiannaki
603dae33d6
Fix test (#41287) 2021-11-04 12:22:33 +01:00
Yuriy Tseretyan
8a88caaa71
Alerting: Rule evaluation loop refactoring (#40238)
* move evaluation function out of loop
* extract updateRule function
* isolate alertRule change. update returns new rule and the evaluation accepts the rule as argument
* extract retry loop into a function
* add function wide log context.
* refactor metrics + add tests + replace timeNow with schedule.clock
2021-11-02 17:04:13 -04:00
Yuriy Tseretyan
a1e1a728ad
Alerting: Update references to alertmanager (#40904)
* update module reference for alertmanager
* remove workaround
2021-10-29 10:03:51 -04:00
Santiago
c9654c4bc0
Fix issues with Slack contact points (#40953)
* recipient validation regex modified, validation at creation/modification implemented

* Remove validation for recipient, fix tests

* Log level changed from Warn to Error
2021-10-27 13:58:37 -03:00
Skye
bce1011361
Alerting: Option for Discord notifier to use webhook name (#40463)
* Added an option to discord notifier to use discord's webhook name (useful for customizing notifications).

* Support ngalert system with discord username toggle

* Added ngalert discord test

* Apply suggestions from code review

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Docs updated with discord username setting

* Fix api integration test

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-10-26 14:55:10 -04:00
Yuriy Tseretyan
6709359148
Alerting: Tests for rule evaluation routine (#40646)
* add fake stores to record queries
2021-10-26 13:22:07 -04:00
Santiago
1095f69740
Fix slack contact point panic (code review changes) (#40770)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test

* Code review changes

* Apply PR comments

Co-authored-by: Armand Grillet <armand.grillet@outlook.com>
2021-10-22 17:23:22 +02:00
gotjosh
74fb491b6a
Alerting: Validate contact point configuration during migration to Unified Alerting (#40717)
* Alerting: Validate contact point configuration during the migration

This minimises the chances of generating broken configuration as part of the migration. Originally, we wanted to generate it and not produce a hard stop in Grafana but this strategy has the chance to avoid delivering notifications for our users.

We now think it's better to hard stop the migration and let the user take care of resolving the configuration manually.
2021-10-22 10:11:06 +01:00
George Robinson
967721068e
Alerting: Support custom annotations and labels when testing contact points
Support custom annotations and labels when testing contact points
2021-10-21 13:47:06 +01:00
Santiago
e54fe220e5
Fix panic when Slack API sends unexpected response (#40721)
* panic when unexpected Slack response fixed, tests added

* fix linting errors, add test
2021-10-21 08:12:15 +02:00
Yuriy Tseretyan
36e12e2b26
Remove flakiness of google chat tests (#40592) 2021-10-19 16:18:44 -04:00
Jean-Philippe Quéméner
153c356993
Alerting: delete orphaned records from kvstore (#40337) 2021-10-14 12:04:00 +02:00
gotjosh
48d73cb148
Alerting: Fixes a bug when trying to sync broken alertmanager config (#40338)
* Alerting: Fixes a bug when trying to sync broken alertmanager config

Broken alertmanager configuration has the potential to be introduced as part of a migration e.g. due to incompatible data between what grafana accepts and what the Alertmanager expects. When this happens, we expect an eventually consistent behaviour where we'll keep trying to apply the configuration until it works.

As part of change in https://github.com/grafana/grafana/pull/39237 we introduced a regression that modified this behaviour and instead tried to create a new Alertmanager for that organization everytime, which eventually ended up in a panic due to a duplicate metrics being registered.

This PR fixes that and introduces a test to catch further regressions.

* Remove disable orgs
2021-10-12 18:10:08 +01:00
Jean-Philippe Quéméner
e1dfec49f9
Alerting: cleanup alert resources on org removal (#39938) 2021-10-12 12:05:02 +02:00
gotjosh
f33f098b73
Alerting: Disable flaky assertion in TestSendingToExternalAlertmanager_WithMultipleOrgs (#40197)
* Alerting: Disable flaky assertion in TestSendingToExternalAlertmanager_WithMultipleOrgs

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2021-10-12 10:07:29 +01:00
George Robinson
8318e45452
Alerting: Fix error message in ngalert when notifications cannot be sent to alertmanager (#40158) 2021-10-11 14:50:50 +01:00
Serge Zaitsev
57fcfd578d
Chore: replace macaron with web package (#40136)
* replace macaron with web package

* add web.go
2021-10-11 14:30:59 +02:00
Jean-Philippe Quéméner
d9c0220824
Alerting: add organziation ID to the ngAlert webhook payload (#40189)
* Alerting: add organziation ID to the ngAlert webhook payload
2021-10-08 14:52:44 +02:00
Yuriy Tseretyan
5836def6c2
Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976) 2021-10-07 17:30:06 -04:00
Joan López de la Franca Beltran
722c414fef
Encryption: Refactor securejsondata.SecureJsonData to stop relying on global functions (#38865)
* Encryption: Add support to encrypt/decrypt sjd

* Add datasources.Service as a proxy to datasources db operations

* Encrypt ds.SecureJsonData before calling SQLStore

* Move ds cache code into ds service

* Fix tlsmanager tests

* Fix pluginproxy tests

* Remove some securejsondata.GetEncryptedJsonData usages

* Add pluginsettings.Service as a proxy for plugin settings db operations

* Add AlertNotificationService as a proxy for alert notification db operations

* Remove some securejsondata.GetEncryptedJsonData usages

* Remove more securejsondata.GetEncryptedJsonData usages

* Fix lint errors

* Minor fixes

* Remove encryption global functions usages from ngalert

* Fix lint errors

* Minor fixes

* Minor fixes

* Remove securejsondata.DecryptedValue usage

* Refactor the refactor

* Remove securejsondata.DecryptedValue usage

* Move securejsondata to migrations package

* Move securejsondata to migrations package

* Minor fix

* Fix integration test

* Fix integration tests

* Undo undesired changes

* Fix tests

* Add context.Context into encryption methods

* Fix tests

* Fix tests

* Fix tests

* Trigger CI

* Fix test

* Add names to params of encryption service interface

* Remove bus from CacheServiceImpl

* Add logging

* Add keys to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Add missing key to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Undo changes in markdown files

* Fix formatting

* Add context to secrets service

* Rename decryptSecureJsonData to decryptSecureJsonDataFn

* Name args in GetDecryptedValueFn

* Add template back to NewAlertmanagerNotifier

* Copy GetDecryptedValueFn to ngalert

* Add logging to pluginsettings

* Fix pluginsettings test

Co-authored-by: Tania B <yalyna.ts@gmail.com>
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
2021-10-07 17:33:50 +03:00
George Robinson
935bd34a30
Panel ID annotation cannot be set without Dashboard UID (#40019) 2021-10-06 11:34:11 +01:00
idafurjes
2759b16ef5
Chore: Add context for dashboards (#39844)
* Add context for dashboards

* Remove GetDashboardCtx

* Remove ctx.TODO
2021-10-05 13:26:24 +02:00
Santiago
562cd9e44e
Alerting template functions (#39261)
* Alerting: (wip) add template funcs

* Alerting: (wip) numeric template functions

* Alerting: (wip) template functions

* Test for the "args" function

* Alerting: (wip) Documentation for template functions

* Alerting: template functions - refactor

* code review changes

* disable linter error

* Use Prometheus implementation of TemplateExpander

* Update docs/sources/alerting/unified-alerting/alerting-rules/create-grafana-managed-rule.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* change templateCaptureValue to support using template functions

* Update pkg/services/ngalert/state/template.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Test and documentation added for reReplaceAll template function

* complete missing functions, documentation and tests

* Use the alert instance's evaluation time for expanding the template

* strvalue graphlink and tablelink functions

* delete duplicate test

* make strvalue return an empty string

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-10-04 15:04:37 -03:00
George Robinson
2a4c1b1aa6
You can now get alert rules for a dashboard or a panel using /api/v1/rules endpoints. (#39476)
Get alert rules for a dashboard and panel in /api/v1/rules
2021-10-04 16:33:55 +01:00
gotjosh
6572017ec7
Alerting: Allow more characters in label names so notifications are sent (#38629)
Remove validation for labels to be accepted in the Alertmanager, This helps with datasources that produce non-compatible labels.

Adds an "object_matchers" to alert manager routers so we can support labels names with extended characters beyond prometheus/openmetrics. It only does this for the internal Grafana managed Alert Manager.

This requires a change to alert manager, so for now we use grafana/alertmanager which is a slight fork, with the intention of going back to upstream.

The frontend handles the migration of "matchers" -> "object_matchers" when the route is edited and saved. Once this is done, downgrades will not work old versions will not recognize the "object_matchers".

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
2021-10-04 15:06:40 +02:00
Yuriy Tseretyan
4dadb8fc51
Alerting: Remove extra field orgId from notifier.Alertmanager (#39870) 2021-10-01 09:54:37 -04:00
Domas
e343b62665
Alerting: make /api/prometheus/grafana/api/v1/rules faster (#39660) 2021-10-01 16:39:04 +03:00
Domas
a1d4be0700
Alerting: Alertmanager datasource support for upstream Prometheus AM implementation (#39775) 2021-10-01 16:24:56 +03:00
Yuriy Tseretyan
526961f298
Alerting: rule evaluation loop to not access multiorg Alertmanager if no alerts to add (#39872) 2021-09-30 15:07:56 -04:00
Yuriy Tseretyan
2b4e51f478
Alerting: Parse App URL only once (#39855) 2021-09-30 12:51:20 -04:00
Sofia Papagiannaki
012d4f0905
Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki
f6f3a54742
Alerting: tune rule evaluation via configuration (#35623)
* Alerting: Configure max evaluation retries

* Alerting: Enforce minimum rule evaluation interval

* Alerting: Disable rule evaluation from configuration

* Update docs

* Alerting: Configure rule evaluation timeout

* Move options on unified_alerting config section

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-28 13:00:16 +03:00
Yuriy Tseretyan
05eb30e323
Alerting: Move alertmanager default config to UnifiedAlertingSettings (#39597) 2021-09-23 13:52:20 -04:00
Marcus Efraimsson
518a0d0458
Chore: Propagate context for dashboard guardian (#39201)
Require guardian.New to take context.Context as first argument. 
Migrates the GetDashboardAclInfoListQuery to be dispatched using context.

Ref #36734

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
Co-authored-by: sam boyer <sam.boyer@grafana.com>
2021-09-23 17:43:32 +02:00
George Robinson
27609dc2c5
Fix alerts with evaluation interval more than 30 seconds resolving in Alertmanager (#39513) 2021-09-22 14:55:46 +01:00
Yuriy Tseretyan
1910d85ae0
Alerting: Optimization of fetching data in multiorg alertmanager (#39237)
* Add method GetAllLatestAlertmanagerConfiguration to DBStore
* add method ApplyConfig to AlertManager
* update multiorg alert manager to load all alertmanager configs at once
2021-09-21 11:01:23 -04:00
gotjosh
fcbcfd232b
Alerting: Move spammy log line to debug in the state manager (#39410) 2021-09-20 16:05:55 +01:00
gotjosh
2ad82b9354
Alerting: Move the unified alerting settings to its own struct (#39350) 2021-09-20 10:12:21 +03:00
Yuriy Tseretyan
e1aae0549e
Provide reader to alertmanager silence instead of file path (#39305) 2021-09-17 14:12:27 -04:00
gotjosh
35e5bfce40
Alerting: Metrics should have the label org instead of user (#39353)
An user within Grafana has a completely different meaning. Multi-tenancy is done via Organizations as a top-level concept.
2021-09-17 17:17:26 +01:00
gotjosh
7db97097c9
Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00
Santiago
c3cf95f383
Revert "Alerting: add template funcs (#38404)" (#39258)
This reverts commit d6fb0181fb.
2021-09-15 19:47:22 -03:00
Santiago
0d2e68537c
Alerting: Cleanup template, silence and notification files created du… (#39007)
* Alerting: Cleanup template, silence and notification files created during tests

* Create tempdir for testing, delete afterwards and check for errors

* Refactoring error checks

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:52 -03:00
Santiago
d6fb0181fb
Alerting: add template funcs (#38404)
* Alerting: (wip) add template funcs

* Alerting: (wip) numeric template functions

* Alerting: (wip) template functions

* Test for the "args" function

* Alerting: (wip) Documentation for template functions

* Alerting: template functions - refactor

* code review changes

* disable linter error

* Use Prometheus implementation of TemplateExpander

* Update docs/sources/alerting/unified-alerting/alerting-rules/create-grafana-managed-rule.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:29 -03:00
Serge Zaitsev
063160aae2
Chore: pass url parameters through context.Context (#38826)
* pass url parameters through context.Context

* fix url param names without colon prefix

* change context params to vars

* replace url vars in tests using new api

* rename vars to params

* add some comments

* rename seturlvars to seturlparams
2021-09-14 18:34:56 +02:00
Marcus Efraimsson
fa9857499b
Chore: GetDashboardQuery should be dispatched using DispatchCtx (#36877)
* Chore: GetDashboardQuery should be dispatched using DispatchCtx

* Fix after merge

* Changes after review

* Various fixes

* Use GetDashboardCtx function instead of GetDashboard
2021-09-14 16:08:04 +02:00
gotjosh
2b1d3d27e4
Alerting: Fix bug not creating filepath for silences/nflog if it does not exist (#39174)
We created this filepath just as we're about persist the templates - with the latest change, we now need to create it sooner.
2021-09-14 14:40:59 +01:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
Marcus Efraimsson
2cc0788187
Chore: Disable backend test for now since it adds 10 minutes extra in CI (#39150)
Ref #38586
2021-09-13 19:37:26 +02:00
Sofia Papagiannaki
7af329f385
Alerting: Fix API specification (#38753)
* Alerting: Fix API spec

* Add missing status codes
2021-09-10 12:46:02 +03:00
gotjosh
39a3bb8a1c
Alerting: Persist notification log and silences to the database (#39005)
* Alerting: Persist notification log and silences to the database

This removes the dependency of having persistent disk to run grafana alerting. Instead of regularly flushing the notification log and silences to disk we now flush the binary content of those files to the database encoded as a base64 string.
2021-09-09 17:25:22 +01:00
Todd Treece
6e667cacee
Alerting: Skip query cache for alert queries (#39010) 2021-09-09 16:16:05 +02:00
Yuriy Tseretyan
6c2884ac37
Alerting: Fix notifier tests to close the temp file (#38992) 2021-09-09 09:56:42 -04:00
George Robinson
5caf6cb369
Change templateCaptureValue to support using template functions (#38766)
* Change templateCaptureValue to support using template functions

This commit changes templateCaptureValue to use float64 for the value
instead of *float64. This change means that annotations and labels can
use the float64 value with functions such as printf and avoid having to
check for nil. It also means that absent values are now printed as 0.

* Use math.NaN() instead of 0 for absent value
2021-09-08 10:46:15 +01:00
Sofia Papagiannaki
c19d65b1ad
Alerting: some fixes for updating rules via the API (#38764)
* Alerting: Allow updating rules if quota are exceeded

* Check for rule UID uniqueness in POST request
2021-09-02 19:38:42 +03:00
gotjosh
dd502f22eb
Alerting: Fix alert flapping in the internal alertmanager (#38648)
* Alerting: Fix alert flapping in the alertmanager

fixes a bug that caused Alerts that are evaluated at low intervals (sub 1 minute), to flap in the Alertmanager.
Mostly due to a combination of `EndsAt` and resend delay.

The Alertmanager uses `EndsAt` as a heuristic to know whenever it should resolve a firing alert, in the case that it hasn't heard
back from the alert generation system.

Because grafana sent the alert with an `EndsAt` which is equal to the `For` of the alert itself,
and we had a hard-coded 1 minute re-send delay (only applicable to firing alerts) this meant that a firing alert would resolve in the Alertmanager before we re-notify that it still firing.

This commit, increases the `EndsAt` by 3x the the resend delay or alert interval (depending on which one is higher). The resendDelay has been decreased to 30 seconds.
2021-09-02 16:22:59 +01:00
Serge Zaitsev
643c7fa0cb
Chore: update all +build statements (#38782) 2021-09-01 17:38:56 +03:00
Serge Zaitsev
c3ab2fdeb7
Macaron: remove custom Request type (#37874)
* remove macaron.Request, use http.Request instead

* remove com dependency from bindings module

* fix another c.Req.Request
2021-09-01 11:18:30 +02:00
Arve Knudsen
78596a6756
Migrate to Wire for dependency injection (#32289)
Fixes #30144

Co-authored-by: dsotirakis <sotirakis.dim@gmail.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
Co-authored-by: Jack Westbrook <jack.westbrook@gmail.com>
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Andrej Ocenas <mr.ocenas@gmail.com>
Co-authored-by: spinillos <selenepinillos@gmail.com>
Co-authored-by: Karl Persson <kalle.persson@grafana.com>
Co-authored-by: Leonard Gram <leo@xlson.com>
2021-08-25 15:11:22 +02:00
gotjosh
2f27a5240b
Alerting: Fix flake on test receiver tests (#38511)
* Alerting: Fix flake on test receiver tests

* Make the actual result from the API be sorted

* Use the correct letters
2021-08-24 17:22:11 +01:00
David Parrott
7fbeefc090
Alerting: create wrapper for Alertmanager to enable org level isolation (#37320)
Introduces org-level isolation for the Alertmanager and its components.

Silences, Alerts and Contact points are not separated by org and are not shared between them.

Co-authored with @davidmparrott and @papagian
2021-08-24 11:28:09 +01:00
Domas
cb9912ec0a
Alerting: button to test contact point (#37475) 2021-08-18 10:16:35 +03:00
George Robinson
3ca00f90b5
Contact point testing (#37308)
This commit adds contact point testing to ngalerts via a new API
endpoint. This endpoint accepts JSON containing a list of
receiver configurations which are validated and then tested
with a notification for a test alert. The endpoint returns JSON
for each receiver with a status and error message. It accepts
a configurable timeout via the Request-Timeout header (in seconds)
up to a maximum of 30 seconds.
2021-08-17 13:49:05 +01:00
Sofia Papagiannaki
7a01fb369d
Alerting: Fix API spec generation (#37852)
* Alerting: Fix API spec generation

* Apply suggestion from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-13 16:15:53 +03:00
gotjosh
f3f3fcc727
Alerting: Introduces /api/v1/ngalert/alertmanagers to expose discovered and dropped Alertmanager(s) (#37632)
* Alerting: Expose discovered and dropped Alertmanagers

Exposes the API for discovered and dropped Alertmanagers.

* make admin config poll interval configurable

* update after rebase

* wordsmith

* More wordsmithing

* change name of the config

* settings package too
2021-08-13 13:14:36 +01:00
Kyle Brandt
aef67994a1
Annotations: Fix alerting annotation coloring (#37412)
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
2021-08-12 09:37:54 -07:00
Sofia Papagiannaki
04d5dcb7c8
Alerting: modify DB table, accessors and migration to restrict org access (#37414)
* Alerting: modify table and accessors to limit org access appropriately

* Update migration to create multiple Alertmanager configs

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* replace mg.ClearMigrationEntry()

mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing  mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-12 16:04:09 +03:00
SLAMA
5986d99f51
Alerting frontend : fix line notifier (#37744)
- Fixes #37425
- change `line` type string to uppercase
2021-08-10 18:59:53 +01:00
gotjosh
f83cd401e5
Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Kyle Brandt
aa904a5a04
NGAlert: Send resolve signal to alertmanager on alerting -> Normal (#37363) 2021-07-29 20:29:17 +02:00
gotjosh
442a6677fc
Alerting: Refactor Run of the scheduler (#37157)
* Alerting: Refactor `Run` of the scheduler

A bit of a refactor to make the diff easier to read for supporting
external Alertmanagers.

We'll introduce another routine that checks the database for
configuration and spawns other routines accordingly.

* Block the wait.

* Fix test
2021-07-27 11:52:59 +01:00
Ganesh Vernekar
e8ac802e4f
Alerting: Remove unused fields in Pagerduty struct (#37198)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-27 10:41:48 +05:30
David Parrott
b5f464412d
Alerting: automatically remove stale alerting states (#36767)
* initial attempt at automatic removal of stale states

* test case, need espected states

* finish unit test

* PR feedback

* still multiply by time.second

* pr feedback
2021-07-26 18:12:04 +02:00
Ganesh Vernekar
a65975cca0
Alerting: Remove the fixed wait for notification delivery (#37203)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-26 15:15:09 +02:00
George Robinson
2f4c893cf3
Expand the value string in annotations and labels of alerts (#37051)
This commit makes it possible to use the value string in
annotations and labels for alerts with "{{ $value }}"
2021-07-22 15:20:44 +01:00
Sofia Papagiannaki
7815ed511f
Alerting: Refactor API endpoints for fetching alert rules (#37055)
* Refactor ruler API endpoint for listing rules

* Refactor prometheus API endpoint for listing rules

* Update HTTP API docs
2021-07-22 09:53:14 +03:00
David Parrott
fa0bed7118
do not over write alerting rule duration (#36930) 2021-07-20 11:49:35 +05:30
Djairho Geuens
4cadbba686
Email: Allow configuration of content types for email notifications (#34530)
* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Fix comment style

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Fix template formatting

* Add test and improve error handling

* Fix test

* Fix formatting

* Fix formatting

* Improve documentation and regenerates txt template

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: Djairho Geuens <djairho.geuens@ae.be>
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-07-19 13:31:51 +03:00
Sofia Papagiannaki
f308ba91e3
Alerting: Improve receiver initialisation errors (#36814)
* Alerting: Improve receiver initialisation errors
2021-07-19 11:58:35 +03:00
Sofia Papagiannaki
afe6e793ff
Alerting: deactivate an Alertmanager configuration (#36794)
* Alerting: deactivate an Alertmanager configuration

Implement DELETE /api/alertmanager/grafana/config/api/v1/alerts
by storing the default configuration which stops existing cnfiguration
from being in use.

* Apply suggestions from code review
2021-07-16 20:07:31 +03:00
George Robinson
456dac1303
Expand the value of math and reduce expressions in annotations and labels (#36611)
* Expand the value of math and reduce expressions in annotations and labels

This commit makes it possible to use the values of reduce and math
expressions in annotations and labels via their RefIDs. It uses the
Stringer interface to ensure that "{{ $values.A }}" still prints the
value in decimal format while also making the labels for each RefID
available with "{{ $values.A.Labels }}" and the float64 value with
"{{ $values.A.Value }}"
2021-07-15 13:10:56 +01:00
David Parrott
19f18bcecc
Alerting: annotation on state change (#36535)
* WIP

* Add annotation on alert state change

* move annotation creation to manager

* praise the linter!

* add debug msg when creating annotation
2021-07-13 09:50:10 -07:00
David Parrott
310d3ebe3d
change template expansion missing value handling (#36679) 2021-07-13 06:57:18 -07:00
Ganesh Vernekar
8efe1856e2
Alerting: A better and cleaner way to know if Alertmanager is initialised (#36659)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 18:53:01 +05:30
Ganesh Vernekar
e19c690426
Alerting: Fix potential panic in Alertmanager when starting up (#36562)
* Alerting: Fix potential panic in Alertmanager when starting up

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 11:15:16 +05:30
Andrej Ocenas
ea2ba06b93
CloudWatch/Logs: Fix log alerts in new unified alerting (#36558)
* Pass FromAlert header from new alerting

* Add better error messages
2021-07-09 13:43:22 +02:00
Ganesh Vernekar
94d2520a84
Alerting: Allow space in label and annotation names (#36549)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-08 18:26:09 +05:30
gotjosh
a86ad1190c
Alerting: Refactor state manager as a dependency (#36513)
* Alerting: Refactor state manager as a dependency

Within the scheduler, the state manager was being passed around a
certain number of functions. I've introduced it as a dependency to keep
the "service" interfaces as clean and homogeneous as possible.

This is relevant, because I'm going to introduce live reload of these
components as part of my next PR and it is better if dependencies are
self-contained.

* remove unused functions

* Fix a few more tests

* Make sure the `stateManager` is declared before the schedule
2021-07-07 17:18:31 +01:00
Sofia Papagiannaki
fc90d47863
Alerting API: Restrict access to Alertmanager configuration (#36507)
* Alerting API: Restrict access to Alertmanager configuration to viewers
2021-07-07 16:29:18 +03:00
Sofia Papagiannaki
8a3edf280e
Alerting: Fix prometheus API to check folder permissions (#36301) 2021-07-05 10:49:14 +03:00
Ganesh Vernekar
8fe58fc2e3
Alerting: Add additional newlines to Microsoft Teams notification message where necessary (#36126)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-24 20:52:00 +05:30
Kyle Brandt
19f764739b
Alerting: Change __value__ label to __value_string__ annotation and add ValueString variable in notifications (#36032)
* Alerting: Allow __value__ label in notifications
was being removed by removePrivateItems
discoverd in #36020, but issue is not about that specifically

* __value__ label to __value_string__ annotation
and .ValueString extended property for notifications
2021-06-24 12:45:49 +05:30
Ganesh Vernekar
9a5c1f06df
Alerting: Template all possible variables in notification channels (#35749)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-22 15:12:54 +05:30
Ganesh Vernekar
33d6e11175
Alerting: Decouple default template from channel tests (#35239)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-21 07:59:09 +05:30
David Parrott
4732f832f7
Alerting: recalculate EndsAt (#35830)
* setEndsAt

* one more test case

* add should clause to tests
2021-06-17 10:01:46 -07:00
Ganesh Vernekar
dcd4bf1615
Alerting: Fill the empty GeneratorURL (#35740)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-16 15:34:12 +05:30
Sofia Papagiannaki
e5a5b8e3fe
Alerting: Fix updating alert rule properties with missing/zero values (#35512)
* Fix deleting labels and annotations

* Add test

* Keep no data and error start if not provided

* Allow setting interval and for to zero during rule updates
2021-06-15 20:55:25 +03:00
Sofia Papagiannaki
abe35c8c01
Alerting: Add error recovery during rule evaluations (#35450)
* Alerting: Eval recovery after query failure

* Apply suggestions from code review
2021-06-15 19:30:21 +03:00
gotjosh
f7ed35336d
Alerting: Implement /status for the notification system (#33227)
* Alerting: Implement /status for the notification system

Implements the necessary plumbing to have a /status endpoint on the
notification system.

* Add API examples

* Update API specs

* Update prometheus/common dependency

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-06-15 19:14:02 +03:00
Sofia Papagiannaki
fba90b8f9b
Alerting: Recact html responses (#35277) 2021-06-04 20:57:24 +03:00
Sofia Papagiannaki
8cda1f5153
Alerting: Allow rules with same title across folders (#35270)
* Alerting: Allow rules with same title across folders

* Add test
2021-06-04 20:45:26 +03:00
Sofia Papagiannaki
15c55b0115
Alerting: Fix notification channel migration and handle case when Alertmanager default configuration is absent (#35086)
* Fix dashboard alert and nootifier migration for MySQL

* Fix POSTing Alertmanager configuration if no current configuration exists

in case the default configuration has not be stored yet
or has failed to get stored

* Change CreatedAt field type
2021-06-04 15:52:41 +03:00
Ganesh Vernekar
8417088969
Alerting: Expand {{$labels.xyz}} template in labels and annotations (#35159)
* Alerting: Expand `{{$labels.xyz}}` template in labels and annotations

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix annotation not updating for same alert

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:24:36 +02:00
Ganesh Vernekar
a30e60a0b8
Alerting: Do not hard fail on templating errors in channels (#35165)
* Alerting: Do not hard fail on templating errors in channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:39:32 +05:30
Ganesh Vernekar
a23674ef99
Alerting: Migrate tags as labels and not annotations (#34990)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-31 19:47:17 +05:30
Sofia Papagiannaki
355be158b7
[Alerting]: fix/cleanup API examples (#34588) 2021-05-31 11:18:29 +03:00
Chip Wolf ‮
badec6c6ad
Alerting: Add support for configuring avatar URL for the Discord notifier (#33355)
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-05-28 23:00:21 +02:00
Owen Diehl
cc38613ba4
alerting: fixes per-receiver metric cardinality (#34915) 2021-05-28 12:31:23 -04:00
Owen Diehl
9aca032d10
Alerting/consistent api errors (#34858)
* consolidates alertmanager api errors

* util & testing consistent errors

* consistent errors for rest of ngalert apis

* updates expected errors in testware

* bump ci

* linting

* unrelated: dashboard.go lint
2021-05-28 11:55:03 -04:00
Kyle Brandt
b47e7d12e6
Alerting: Extract values from MD expr alerts (#34757)
When using mulit-dimensional Grafana managed alerts (e.g. SSE math) extract refIds values and labels so they can be shown in the notification and dashboards.
2021-05-28 11:04:20 -04:00
Danilo Bargen
83a83de10a
Clarify that Threema Gateway Alerts support only Basic IDs (#34828)
Threema Gateway supports two types of IDs: Basic IDs (where the
encryption is managed by the API server) and End-to-End IDs (where the
keys are managed by the user).

This plugin currently does not support End-to-End IDs (since it's much
more complex to implement, because the encryption needs to happen
locally). Add a few clarifications to the UI.

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-28 08:54:55 +02:00
Domas
347273cdea
Alerting: check upstream response content type in lotex proxy (#34760) 2021-05-27 14:12:29 +03:00
David Parrott
20d356947c
set state correctly and test (#34680) 2021-05-26 11:37:42 -07:00
Ganesh Vernekar
d69c21acb6
NGAlert: Update the default template to include more URLs (#34715)
* NGAlert: Update the default template to include more URLs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:49:39 +02:00
Ganesh Vernekar
b168223029
NGAlert: Add integration tests for remaining notification channels (#34662)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:33:55 +05:30
Marcus Andersson
e19b3df1a9
Alerting: added possibility to preview grafana managed alert rules. (#34600)
* starting to add eval logic.

* wip

* first version of test rule.

* reverted file.

* add info colum to result to show error or (with CC evalmatches)

* fix labels in evalmatch

* fix be test

* refactored using observables.

* moved widht/height div to outside panel rendere.

* adding docs api level.

* adding container styles to error div.

* increasing size of preview.

Co-authored-by: kyle <kyle@grafana.com>
2021-05-26 10:06:28 +02:00
Owen Diehl
0e0ed43153
Alerting/testing promql extraction (#34665)
* promql compat for marshaling

* extracts upstream instant queries into data frame for alerting

* eval string parity
2021-05-25 11:54:50 -04:00
Sofia Papagiannaki
b48832c0f7
[Alerting]: alertmanager notifier fixes (#34575) 2021-05-24 16:09:29 +03:00
Sofia Papagiannaki
23939eab10
[Alerting]: namespace fixes (#34470)
* [Alerting]: forbid viewers for updating rules if viewers can edit

check for CanSave instead of CanEdit

* Clear ngalert tables when deleting the folder

* Apply suggestions from code review

* Log failure to check save permission

Co-authored-by: gotjosh <josue@grafana.com>
2021-05-20 15:49:33 +03:00
gotjosh
7b04278834
Alerting: Opsgenie notification channel (#34418)
* Alerting: Opsgenie notification channel

This translate the opsgenie notification channel from the old alerting
system to the new alerting system with a few changes:

- The tag system has been replaced in favour of annotation.
- TBD
- TBD

Signed-off-by: Josue Abreu <josue@grafana.com>

* Fix template URL

* Bugfig: dont send resolved when autoClose is false

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix integration tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix URLs in all other channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-20 10:12:08 +02:00
David Parrott
7a83d1f9ff
Alerting resend delay for sending to notifiers (#34312)
* adds resend delay to avoid saturating notifier

* correct method signatures

* pr feedback
2021-05-19 22:15:09 +02:00
Owen Diehl
8f350bc353
actually register metrics this time (#34444) 2021-05-19 22:09:12 +02:00
Ganesh Vernekar
533be16787
NGAlert: Add Threema notification channel (#34159)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:20:52 +02:00
Ganesh Vernekar
b2e84277a3
NGAlert: Add Kafka notification channel (#34156)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:02:09 +02:00
Ganesh Vernekar
ad1d0ae0bf
NGAlert: Add VictorOps notification channel (#34161)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:52:14 +02:00
Ganesh Vernekar
fb9223ab42
NGAlert: Add Line notification channel (#34157)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:04:48 +02:00
Domas
54c33c6cdd
Alerting: update email template (#34205) 2021-05-19 18:58:31 +02:00
Ganesh Vernekar
01e0faf800
NGAlert: Add GoogleChat notification channel (#34153)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 18:24:04 +02:00
David Parrott
a0f175c7a5
also don't allow negative intervalseconds (#34319) 2021-05-19 09:05:32 -07:00
David Parrott
b9f4ec2030
Add discord notifier channel and test (#34150)
* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* update constructor and tests

* group imports sensibly

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 17:31:55 +02:00
Sofia Papagiannaki
a79a4838b8
[Alerting]: Add Pushover integration with the alert manager (#34371)
* [Alerting]: Add Pushover integration with the alert manager

* lint

* Set boundary only for tests

* Remove title field

* fix imports
2021-05-19 16:48:46 +02:00
Owen Diehl
1d2febfa85
[Alerting] Route validations (#34393)
* more routing validation

* go mod

* recursive route validations
2021-05-19 10:36:28 -04:00
Arve Knudsen
9dfaa037d1
Alerting: Migrate Alertmanager notifier (#34304)
* Alerting: Port Alertmanager notifier to v8

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-19 15:27:41 +02:00
Owen Diehl
d6c4c2fcd5
[Alerting] Ensure upstream validations are run (#34333)
* use embedded validations via noop yaml unmarshaler

* lint

* fixes integration tests now that groupings are handled
2021-05-19 06:22:44 -04:00
Owen Diehl
c48c701791
adds missing metric name (#34307) 2021-05-18 17:24:38 -04:00
David Parrott
25485100b0
Alerting: Trim results when at processing instead of on ticker (#34248)
* Trim results when at processing instead of on ticker

* User RWMutex correctly

* remove comment
2021-05-18 10:56:14 -07:00
David Parrott
bbb7bbf891
Alerting: Remove back end logic for supporting KeepLastState (#34242)
* Removed back end logic for supporting KeepLastState

* Map keep_state correctly in migrations
2021-05-18 10:55:43 -07:00
Sofia Papagiannaki
ff112f07e3
[Alerting]: Add Sensu Go integration with the alert manager (#34045)
* [Alerting]: Add sensugo notification channel

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Do not include labels with concatenated rule UID and names

* Modifications after syncing with main

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-05-18 17:31:51 +03:00
Sofia Papagiannaki
11243dec14
[Alerting]: Assign UUID to grafana receivers (#34241)
* [Alerting]: Assign UUID to grafana receivers

* Apply suggestions from code review

* Add test for updating invalid receiver

Co-authored-by: Domas <domasx2@gmail.com>
2021-05-18 17:31:00 +03:00
Kyle Brandt
63b2dd06a5
Alerting: Set "value" with evalmatches in G Managed (#34075)
When, and currently only when using a classic condition, evaluation information is added (which is like the EvalMatches from dashboard alerting).

This is returned via the API and can be included in notifications by reading the `__value__` label attached `.Alerts` in the template. It is a string.
2021-05-18 09:12:39 -04:00
Ganesh Vernekar
89c2b5e863
NGAlert: Remove unwanted fields from notification channel config (#34036)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-18 10:04:47 +02:00
gotjosh
6384f86fb9
Alerting: Allow the notifier to log (#34232)
* Alerting: Allow the notifier to log

The notifier upstream code uses go-kit as its logging library. The
grafana specific logger is not compatible with this API. In this PR, I
have created a wrapper that implements io.Writer to make them
compatible.
2021-05-17 18:06:47 +01:00
Kyle Brandt
331991ca10
UAlerting: Increase default max datapoints (#34223)
Change const value from 100 to 43200 (12 hours at 1sec interval)
2021-05-17 18:46:52 +02:00
Ganesh Vernekar
d5ae55c5dd
NGAlert: Add message field to email notification channel (#34044)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-17 16:05:09 +05:30
Owen Diehl
1367f7171e
Alerting/ruler metrics (#34144)
* adds active configurations metric

* rule evaluation metrics

* ruler metrics

* pr feedback
2021-05-14 16:13:44 -04:00
gotjosh
eb74994b8b
Alerting: Modify configuration apply and save semantics - v2 (#34143)
* Save default configuration to the database and copy over secure settings
2021-05-14 19:49:54 +01:00
Owen Diehl
fc90c36d50
removes unused db method (#34082) 2021-05-13 20:28:10 +02:00
Owen Diehl
baca873a84
extracts alertmanager from DI, including migrations (#34071)
* extracts alertmanager from DI, including migrations

* includes alertmanager Run method in ngalert

* removes 3s test shutdown timeout

* lint
2021-05-13 14:01:38 -04:00
Ganesh Vernekar
ec3214bac2
NGAlert: Add integration tests for notification channels (#33431)
* NGAlert: Add integration tests for notification channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the failing tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Override creation of rule UID, remove only namespace UID

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 22:58:19 +05:30
Kyle Brandt
babb17afd6
Alerting/Chore: Move tests from tests package (#34059)
Instead put in package folder but with package name suffixed with _test
This enables code coverage within the pkg while still allow the tests to operate from external to package perspective (only exported things).
2021-05-13 10:05:33 -04:00
Ganesh Vernekar
5f44ccff0c
NGAlert: Fix unit test to write files in temporary directory (#34032)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 16:08:12 +05:30
Kyle Brandt
3da8db7f3f
Alerting: Run table migrations regardless of feature flag and move out of service (#33996) 2021-05-12 14:39:48 -04:00
Owen Diehl
3b06f52bab
Alerting/allow empty receiver (#33962)
* simplifies yaml unmarshaling: PostableApiReceiver

* allow empty receiver type

* allows name only receivers (blackhole)

* better receiver type parsing

* linting
2021-05-12 07:58:16 -04:00
Kyle Brandt
a735c51202
Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
Ganesh Vernekar
8d442c9b44
NGAlert: Fix templating and remove unwanted default templates (#33918)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-12 15:13:43 +05:30
Sofia Papagiannaki
f4750fb3c8
[Alerting]: Alertmanager API apply permissions (#33843)
* [Alerting]: Alertmanager API apply permissions

* Apply suggestions from code review
2021-05-11 11:31:38 +03:00
Owen Diehl
e18ca8f6f2
enforce receivers align with backend type when posting AM config (#33877) 2021-05-10 16:58:41 -04:00
Sofia Papagiannaki
1c58fd380f
[Alerting]: store encrypted receiver secure settings (#33832)
* [Alerting]: Store secure settings encrypted

* Move encryption to the API handler
2021-05-10 15:30:42 +03:00
David Parrott
e58aca2d20
Alerting: remove instances from db and cache on rule update (#33722)
* remove instances from db and cache on rule update

* fix panic

* rename
2021-05-06 18:39:34 +02:00
Kyle Brandt
fae093bbe2
Alerting: Fix state cache getOrCreate panic (#33777) 2021-05-06 14:35:52 +02:00
Owen Diehl
a5ae8cf377
Unredact/secret (#33723)
* no longer redacts GETing proxied AM configs

* removes unused testfile

* testware fix

* consistently roundtrips yaml<>json and doesnt redact secrets

* lint
2021-05-05 16:21:53 -04:00