Commit Graph

476 Commits

Author SHA1 Message Date
Jean-Philippe Quéméner
153c356993
Alerting: delete orphaned records from kvstore (#40337) 2021-10-14 12:04:00 +02:00
gotjosh
48d73cb148
Alerting: Fixes a bug when trying to sync broken alertmanager config (#40338)
* Alerting: Fixes a bug when trying to sync broken alertmanager config

Broken alertmanager configuration has the potential to be introduced as part of a migration e.g. due to incompatible data between what grafana accepts and what the Alertmanager expects. When this happens, we expect an eventually consistent behaviour where we'll keep trying to apply the configuration until it works.

As part of change in https://github.com/grafana/grafana/pull/39237 we introduced a regression that modified this behaviour and instead tried to create a new Alertmanager for that organization everytime, which eventually ended up in a panic due to a duplicate metrics being registered.

This PR fixes that and introduces a test to catch further regressions.

* Remove disable orgs
2021-10-12 18:10:08 +01:00
Jean-Philippe Quéméner
e1dfec49f9
Alerting: cleanup alert resources on org removal (#39938) 2021-10-12 12:05:02 +02:00
gotjosh
f33f098b73
Alerting: Disable flaky assertion in TestSendingToExternalAlertmanager_WithMultipleOrgs (#40197)
* Alerting: Disable flaky assertion in TestSendingToExternalAlertmanager_WithMultipleOrgs

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2021-10-12 10:07:29 +01:00
George Robinson
8318e45452
Alerting: Fix error message in ngalert when notifications cannot be sent to alertmanager (#40158) 2021-10-11 14:50:50 +01:00
Serge Zaitsev
57fcfd578d
Chore: replace macaron with web package (#40136)
* replace macaron with web package

* add web.go
2021-10-11 14:30:59 +02:00
Jean-Philippe Quéméner
d9c0220824
Alerting: add organziation ID to the ngAlert webhook payload (#40189)
* Alerting: add organziation ID to the ngAlert webhook payload
2021-10-08 14:52:44 +02:00
Yuriy Tseretyan
5836def6c2
Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976) 2021-10-07 17:30:06 -04:00
Joan López de la Franca Beltran
722c414fef
Encryption: Refactor securejsondata.SecureJsonData to stop relying on global functions (#38865)
* Encryption: Add support to encrypt/decrypt sjd

* Add datasources.Service as a proxy to datasources db operations

* Encrypt ds.SecureJsonData before calling SQLStore

* Move ds cache code into ds service

* Fix tlsmanager tests

* Fix pluginproxy tests

* Remove some securejsondata.GetEncryptedJsonData usages

* Add pluginsettings.Service as a proxy for plugin settings db operations

* Add AlertNotificationService as a proxy for alert notification db operations

* Remove some securejsondata.GetEncryptedJsonData usages

* Remove more securejsondata.GetEncryptedJsonData usages

* Fix lint errors

* Minor fixes

* Remove encryption global functions usages from ngalert

* Fix lint errors

* Minor fixes

* Minor fixes

* Remove securejsondata.DecryptedValue usage

* Refactor the refactor

* Remove securejsondata.DecryptedValue usage

* Move securejsondata to migrations package

* Move securejsondata to migrations package

* Minor fix

* Fix integration test

* Fix integration tests

* Undo undesired changes

* Fix tests

* Add context.Context into encryption methods

* Fix tests

* Fix tests

* Fix tests

* Trigger CI

* Fix test

* Add names to params of encryption service interface

* Remove bus from CacheServiceImpl

* Add logging

* Add keys to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Add missing key to logger

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>

* Undo changes in markdown files

* Fix formatting

* Add context to secrets service

* Rename decryptSecureJsonData to decryptSecureJsonDataFn

* Name args in GetDecryptedValueFn

* Add template back to NewAlertmanagerNotifier

* Copy GetDecryptedValueFn to ngalert

* Add logging to pluginsettings

* Fix pluginsettings test

Co-authored-by: Tania B <yalyna.ts@gmail.com>
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
2021-10-07 17:33:50 +03:00
George Robinson
935bd34a30
Panel ID annotation cannot be set without Dashboard UID (#40019) 2021-10-06 11:34:11 +01:00
idafurjes
2759b16ef5
Chore: Add context for dashboards (#39844)
* Add context for dashboards

* Remove GetDashboardCtx

* Remove ctx.TODO
2021-10-05 13:26:24 +02:00
Santiago
562cd9e44e
Alerting template functions (#39261)
* Alerting: (wip) add template funcs

* Alerting: (wip) numeric template functions

* Alerting: (wip) template functions

* Test for the "args" function

* Alerting: (wip) Documentation for template functions

* Alerting: template functions - refactor

* code review changes

* disable linter error

* Use Prometheus implementation of TemplateExpander

* Update docs/sources/alerting/unified-alerting/alerting-rules/create-grafana-managed-rule.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* change templateCaptureValue to support using template functions

* Update pkg/services/ngalert/state/template.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Test and documentation added for reReplaceAll template function

* complete missing functions, documentation and tests

* Use the alert instance's evaluation time for expanding the template

* strvalue graphlink and tablelink functions

* delete duplicate test

* make strvalue return an empty string

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2021-10-04 15:04:37 -03:00
George Robinson
2a4c1b1aa6
You can now get alert rules for a dashboard or a panel using /api/v1/rules endpoints. (#39476)
Get alert rules for a dashboard and panel in /api/v1/rules
2021-10-04 16:33:55 +01:00
gotjosh
6572017ec7
Alerting: Allow more characters in label names so notifications are sent (#38629)
Remove validation for labels to be accepted in the Alertmanager, This helps with datasources that produce non-compatible labels.

Adds an "object_matchers" to alert manager routers so we can support labels names with extended characters beyond prometheus/openmetrics. It only does this for the internal Grafana managed Alert Manager.

This requires a change to alert manager, so for now we use grafana/alertmanager which is a slight fork, with the intention of going back to upstream.

The frontend handles the migration of "matchers" -> "object_matchers" when the route is edited and saved. Once this is done, downgrades will not work old versions will not recognize the "object_matchers".

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
2021-10-04 15:06:40 +02:00
Yuriy Tseretyan
4dadb8fc51
Alerting: Remove extra field orgId from notifier.Alertmanager (#39870) 2021-10-01 09:54:37 -04:00
Domas
e343b62665
Alerting: make /api/prometheus/grafana/api/v1/rules faster (#39660) 2021-10-01 16:39:04 +03:00
Domas
a1d4be0700
Alerting: Alertmanager datasource support for upstream Prometheus AM implementation (#39775) 2021-10-01 16:24:56 +03:00
Yuriy Tseretyan
526961f298
Alerting: rule evaluation loop to not access multiorg Alertmanager if no alerts to add (#39872) 2021-09-30 15:07:56 -04:00
Yuriy Tseretyan
2b4e51f478
Alerting: Parse App URL only once (#39855) 2021-09-30 12:51:20 -04:00
Sofia Papagiannaki
012d4f0905
Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki
f6f3a54742
Alerting: tune rule evaluation via configuration (#35623)
* Alerting: Configure max evaluation retries

* Alerting: Enforce minimum rule evaluation interval

* Alerting: Disable rule evaluation from configuration

* Update docs

* Alerting: Configure rule evaluation timeout

* Move options on unified_alerting config section

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-28 13:00:16 +03:00
Yuriy Tseretyan
05eb30e323
Alerting: Move alertmanager default config to UnifiedAlertingSettings (#39597) 2021-09-23 13:52:20 -04:00
Marcus Efraimsson
518a0d0458
Chore: Propagate context for dashboard guardian (#39201)
Require guardian.New to take context.Context as first argument. 
Migrates the GetDashboardAclInfoListQuery to be dispatched using context.

Ref #36734

Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
Co-authored-by: sam boyer <sam.boyer@grafana.com>
2021-09-23 17:43:32 +02:00
George Robinson
27609dc2c5
Fix alerts with evaluation interval more than 30 seconds resolving in Alertmanager (#39513) 2021-09-22 14:55:46 +01:00
Yuriy Tseretyan
1910d85ae0
Alerting: Optimization of fetching data in multiorg alertmanager (#39237)
* Add method GetAllLatestAlertmanagerConfiguration to DBStore
* add method ApplyConfig to AlertManager
* update multiorg alert manager to load all alertmanager configs at once
2021-09-21 11:01:23 -04:00
gotjosh
fcbcfd232b
Alerting: Move spammy log line to debug in the state manager (#39410) 2021-09-20 16:05:55 +01:00
gotjosh
2ad82b9354
Alerting: Move the unified alerting settings to its own struct (#39350) 2021-09-20 10:12:21 +03:00
Yuriy Tseretyan
e1aae0549e
Provide reader to alertmanager silence instead of file path (#39305) 2021-09-17 14:12:27 -04:00
gotjosh
35e5bfce40
Alerting: Metrics should have the label org instead of user (#39353)
An user within Grafana has a completely different meaning. Multi-tenancy is done via Organizations as a top-level concept.
2021-09-17 17:17:26 +01:00
gotjosh
7db97097c9
Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00
Santiago
c3cf95f383
Revert "Alerting: add template funcs (#38404)" (#39258)
This reverts commit d6fb0181fb.
2021-09-15 19:47:22 -03:00
Santiago
0d2e68537c
Alerting: Cleanup template, silence and notification files created du… (#39007)
* Alerting: Cleanup template, silence and notification files created during tests

* Create tempdir for testing, delete afterwards and check for errors

* Refactoring error checks

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update docs/sources/enterprise/access-control/fine-grained-access-control-references.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:52 -03:00
Santiago
d6fb0181fb
Alerting: add template funcs (#38404)
* Alerting: (wip) add template funcs

* Alerting: (wip) numeric template functions

* Alerting: (wip) template functions

* Test for the "args" function

* Alerting: (wip) Documentation for template functions

* Alerting: template functions - refactor

* code review changes

* disable linter error

* Use Prometheus implementation of TemplateExpander

* Update docs/sources/alerting/unified-alerting/alerting-rules/create-grafana-managed-rule.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-09-15 18:48:29 -03:00
Serge Zaitsev
063160aae2
Chore: pass url parameters through context.Context (#38826)
* pass url parameters through context.Context

* fix url param names without colon prefix

* change context params to vars

* replace url vars in tests using new api

* rename vars to params

* add some comments

* rename seturlvars to seturlparams
2021-09-14 18:34:56 +02:00
Marcus Efraimsson
fa9857499b
Chore: GetDashboardQuery should be dispatched using DispatchCtx (#36877)
* Chore: GetDashboardQuery should be dispatched using DispatchCtx

* Fix after merge

* Changes after review

* Various fixes

* Use GetDashboardCtx function instead of GetDashboard
2021-09-14 16:08:04 +02:00
gotjosh
2b1d3d27e4
Alerting: Fix bug not creating filepath for silences/nflog if it does not exist (#39174)
We created this filepath just as we're about persist the templates - with the latest change, we now need to create it sooner.
2021-09-14 14:40:59 +01:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
Marcus Efraimsson
2cc0788187
Chore: Disable backend test for now since it adds 10 minutes extra in CI (#39150)
Ref #38586
2021-09-13 19:37:26 +02:00
Sofia Papagiannaki
7af329f385
Alerting: Fix API specification (#38753)
* Alerting: Fix API spec

* Add missing status codes
2021-09-10 12:46:02 +03:00
gotjosh
39a3bb8a1c
Alerting: Persist notification log and silences to the database (#39005)
* Alerting: Persist notification log and silences to the database

This removes the dependency of having persistent disk to run grafana alerting. Instead of regularly flushing the notification log and silences to disk we now flush the binary content of those files to the database encoded as a base64 string.
2021-09-09 17:25:22 +01:00
Todd Treece
6e667cacee
Alerting: Skip query cache for alert queries (#39010) 2021-09-09 16:16:05 +02:00
Yuriy Tseretyan
6c2884ac37
Alerting: Fix notifier tests to close the temp file (#38992) 2021-09-09 09:56:42 -04:00
George Robinson
5caf6cb369
Change templateCaptureValue to support using template functions (#38766)
* Change templateCaptureValue to support using template functions

This commit changes templateCaptureValue to use float64 for the value
instead of *float64. This change means that annotations and labels can
use the float64 value with functions such as printf and avoid having to
check for nil. It also means that absent values are now printed as 0.

* Use math.NaN() instead of 0 for absent value
2021-09-08 10:46:15 +01:00
Sofia Papagiannaki
c19d65b1ad
Alerting: some fixes for updating rules via the API (#38764)
* Alerting: Allow updating rules if quota are exceeded

* Check for rule UID uniqueness in POST request
2021-09-02 19:38:42 +03:00
gotjosh
dd502f22eb
Alerting: Fix alert flapping in the internal alertmanager (#38648)
* Alerting: Fix alert flapping in the alertmanager

fixes a bug that caused Alerts that are evaluated at low intervals (sub 1 minute), to flap in the Alertmanager.
Mostly due to a combination of `EndsAt` and resend delay.

The Alertmanager uses `EndsAt` as a heuristic to know whenever it should resolve a firing alert, in the case that it hasn't heard
back from the alert generation system.

Because grafana sent the alert with an `EndsAt` which is equal to the `For` of the alert itself,
and we had a hard-coded 1 minute re-send delay (only applicable to firing alerts) this meant that a firing alert would resolve in the Alertmanager before we re-notify that it still firing.

This commit, increases the `EndsAt` by 3x the the resend delay or alert interval (depending on which one is higher). The resendDelay has been decreased to 30 seconds.
2021-09-02 16:22:59 +01:00
Serge Zaitsev
643c7fa0cb
Chore: update all +build statements (#38782) 2021-09-01 17:38:56 +03:00
Serge Zaitsev
c3ab2fdeb7
Macaron: remove custom Request type (#37874)
* remove macaron.Request, use http.Request instead

* remove com dependency from bindings module

* fix another c.Req.Request
2021-09-01 11:18:30 +02:00
Arve Knudsen
78596a6756
Migrate to Wire for dependency injection (#32289)
Fixes #30144

Co-authored-by: dsotirakis <sotirakis.dim@gmail.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
Co-authored-by: Jack Westbrook <jack.westbrook@gmail.com>
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Andrej Ocenas <mr.ocenas@gmail.com>
Co-authored-by: spinillos <selenepinillos@gmail.com>
Co-authored-by: Karl Persson <kalle.persson@grafana.com>
Co-authored-by: Leonard Gram <leo@xlson.com>
2021-08-25 15:11:22 +02:00
gotjosh
2f27a5240b
Alerting: Fix flake on test receiver tests (#38511)
* Alerting: Fix flake on test receiver tests

* Make the actual result from the API be sorted

* Use the correct letters
2021-08-24 17:22:11 +01:00
David Parrott
7fbeefc090
Alerting: create wrapper for Alertmanager to enable org level isolation (#37320)
Introduces org-level isolation for the Alertmanager and its components.

Silences, Alerts and Contact points are not separated by org and are not shared between them.

Co-authored with @davidmparrott and @papagian
2021-08-24 11:28:09 +01:00
Domas
cb9912ec0a
Alerting: button to test contact point (#37475) 2021-08-18 10:16:35 +03:00
George Robinson
3ca00f90b5
Contact point testing (#37308)
This commit adds contact point testing to ngalerts via a new API
endpoint. This endpoint accepts JSON containing a list of
receiver configurations which are validated and then tested
with a notification for a test alert. The endpoint returns JSON
for each receiver with a status and error message. It accepts
a configurable timeout via the Request-Timeout header (in seconds)
up to a maximum of 30 seconds.
2021-08-17 13:49:05 +01:00
Sofia Papagiannaki
7a01fb369d
Alerting: Fix API spec generation (#37852)
* Alerting: Fix API spec generation

* Apply suggestion from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-13 16:15:53 +03:00
gotjosh
f3f3fcc727
Alerting: Introduces /api/v1/ngalert/alertmanagers to expose discovered and dropped Alertmanager(s) (#37632)
* Alerting: Expose discovered and dropped Alertmanagers

Exposes the API for discovered and dropped Alertmanagers.

* make admin config poll interval configurable

* update after rebase

* wordsmith

* More wordsmithing

* change name of the config

* settings package too
2021-08-13 13:14:36 +01:00
Kyle Brandt
aef67994a1
Annotations: Fix alerting annotation coloring (#37412)
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
2021-08-12 09:37:54 -07:00
Sofia Papagiannaki
04d5dcb7c8
Alerting: modify DB table, accessors and migration to restrict org access (#37414)
* Alerting: modify table and accessors to limit org access appropriately

* Update migration to create multiple Alertmanager configs

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* replace mg.ClearMigrationEntry()

mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing  mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-12 16:04:09 +03:00
SLAMA
5986d99f51
Alerting frontend : fix line notifier (#37744)
- Fixes #37425
- change `line` type string to uppercase
2021-08-10 18:59:53 +01:00
gotjosh
f83cd401e5
Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Kyle Brandt
aa904a5a04
NGAlert: Send resolve signal to alertmanager on alerting -> Normal (#37363) 2021-07-29 20:29:17 +02:00
gotjosh
442a6677fc
Alerting: Refactor Run of the scheduler (#37157)
* Alerting: Refactor `Run` of the scheduler

A bit of a refactor to make the diff easier to read for supporting
external Alertmanagers.

We'll introduce another routine that checks the database for
configuration and spawns other routines accordingly.

* Block the wait.

* Fix test
2021-07-27 11:52:59 +01:00
Ganesh Vernekar
e8ac802e4f
Alerting: Remove unused fields in Pagerduty struct (#37198)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-27 10:41:48 +05:30
David Parrott
b5f464412d
Alerting: automatically remove stale alerting states (#36767)
* initial attempt at automatic removal of stale states

* test case, need espected states

* finish unit test

* PR feedback

* still multiply by time.second

* pr feedback
2021-07-26 18:12:04 +02:00
Ganesh Vernekar
a65975cca0
Alerting: Remove the fixed wait for notification delivery (#37203)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-26 15:15:09 +02:00
George Robinson
2f4c893cf3
Expand the value string in annotations and labels of alerts (#37051)
This commit makes it possible to use the value string in
annotations and labels for alerts with "{{ $value }}"
2021-07-22 15:20:44 +01:00
Sofia Papagiannaki
7815ed511f
Alerting: Refactor API endpoints for fetching alert rules (#37055)
* Refactor ruler API endpoint for listing rules

* Refactor prometheus API endpoint for listing rules

* Update HTTP API docs
2021-07-22 09:53:14 +03:00
David Parrott
fa0bed7118
do not over write alerting rule duration (#36930) 2021-07-20 11:49:35 +05:30
Djairho Geuens
4cadbba686
Email: Allow configuration of content types for email notifications (#34530)
* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Alerting: Allow configuration of content types for email notifications

* Fix lint error

* Improves email templates

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve code comments

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve configuration documentation

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Improve email template

* Remove unnecessary predeclaration

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Adds handling for unrecognized content type

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Move utility function outside of util package

* Fixes syntax

* Remove unused package

* Fix lint error

* improve email templates

* Fix test

* Fix comment style

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Fix template formatting

* Add test and improve error handling

* Fix test

* Fix formatting

* Fix formatting

* Improve documentation and regenerates txt template

* Update docs/sources/administration/configuration.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

Co-authored-by: Djairho Geuens <djairho.geuens@ae.be>
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-07-19 13:31:51 +03:00
Sofia Papagiannaki
f308ba91e3
Alerting: Improve receiver initialisation errors (#36814)
* Alerting: Improve receiver initialisation errors
2021-07-19 11:58:35 +03:00
Sofia Papagiannaki
afe6e793ff
Alerting: deactivate an Alertmanager configuration (#36794)
* Alerting: deactivate an Alertmanager configuration

Implement DELETE /api/alertmanager/grafana/config/api/v1/alerts
by storing the default configuration which stops existing cnfiguration
from being in use.

* Apply suggestions from code review
2021-07-16 20:07:31 +03:00
George Robinson
456dac1303
Expand the value of math and reduce expressions in annotations and labels (#36611)
* Expand the value of math and reduce expressions in annotations and labels

This commit makes it possible to use the values of reduce and math
expressions in annotations and labels via their RefIDs. It uses the
Stringer interface to ensure that "{{ $values.A }}" still prints the
value in decimal format while also making the labels for each RefID
available with "{{ $values.A.Labels }}" and the float64 value with
"{{ $values.A.Value }}"
2021-07-15 13:10:56 +01:00
David Parrott
19f18bcecc
Alerting: annotation on state change (#36535)
* WIP

* Add annotation on alert state change

* move annotation creation to manager

* praise the linter!

* add debug msg when creating annotation
2021-07-13 09:50:10 -07:00
David Parrott
310d3ebe3d
change template expansion missing value handling (#36679) 2021-07-13 06:57:18 -07:00
Ganesh Vernekar
8efe1856e2
Alerting: A better and cleaner way to know if Alertmanager is initialised (#36659)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 18:53:01 +05:30
Ganesh Vernekar
e19c690426
Alerting: Fix potential panic in Alertmanager when starting up (#36562)
* Alerting: Fix potential panic in Alertmanager when starting up

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-12 11:15:16 +05:30
Andrej Ocenas
ea2ba06b93
CloudWatch/Logs: Fix log alerts in new unified alerting (#36558)
* Pass FromAlert header from new alerting

* Add better error messages
2021-07-09 13:43:22 +02:00
Ganesh Vernekar
94d2520a84
Alerting: Allow space in label and annotation names (#36549)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-08 18:26:09 +05:30
gotjosh
a86ad1190c
Alerting: Refactor state manager as a dependency (#36513)
* Alerting: Refactor state manager as a dependency

Within the scheduler, the state manager was being passed around a
certain number of functions. I've introduced it as a dependency to keep
the "service" interfaces as clean and homogeneous as possible.

This is relevant, because I'm going to introduce live reload of these
components as part of my next PR and it is better if dependencies are
self-contained.

* remove unused functions

* Fix a few more tests

* Make sure the `stateManager` is declared before the schedule
2021-07-07 17:18:31 +01:00
Sofia Papagiannaki
fc90d47863
Alerting API: Restrict access to Alertmanager configuration (#36507)
* Alerting API: Restrict access to Alertmanager configuration to viewers
2021-07-07 16:29:18 +03:00
Sofia Papagiannaki
8a3edf280e
Alerting: Fix prometheus API to check folder permissions (#36301) 2021-07-05 10:49:14 +03:00
Ganesh Vernekar
8fe58fc2e3
Alerting: Add additional newlines to Microsoft Teams notification message where necessary (#36126)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-24 20:52:00 +05:30
Kyle Brandt
19f764739b
Alerting: Change __value__ label to __value_string__ annotation and add ValueString variable in notifications (#36032)
* Alerting: Allow __value__ label in notifications
was being removed by removePrivateItems
discoverd in #36020, but issue is not about that specifically

* __value__ label to __value_string__ annotation
and .ValueString extended property for notifications
2021-06-24 12:45:49 +05:30
Ganesh Vernekar
9a5c1f06df
Alerting: Template all possible variables in notification channels (#35749)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-22 15:12:54 +05:30
Ganesh Vernekar
33d6e11175
Alerting: Decouple default template from channel tests (#35239)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-21 07:59:09 +05:30
David Parrott
4732f832f7
Alerting: recalculate EndsAt (#35830)
* setEndsAt

* one more test case

* add should clause to tests
2021-06-17 10:01:46 -07:00
Ganesh Vernekar
dcd4bf1615
Alerting: Fill the empty GeneratorURL (#35740)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-16 15:34:12 +05:30
Sofia Papagiannaki
e5a5b8e3fe
Alerting: Fix updating alert rule properties with missing/zero values (#35512)
* Fix deleting labels and annotations

* Add test

* Keep no data and error start if not provided

* Allow setting interval and for to zero during rule updates
2021-06-15 20:55:25 +03:00
Sofia Papagiannaki
abe35c8c01
Alerting: Add error recovery during rule evaluations (#35450)
* Alerting: Eval recovery after query failure

* Apply suggestions from code review
2021-06-15 19:30:21 +03:00
gotjosh
f7ed35336d
Alerting: Implement /status for the notification system (#33227)
* Alerting: Implement /status for the notification system

Implements the necessary plumbing to have a /status endpoint on the
notification system.

* Add API examples

* Update API specs

* Update prometheus/common dependency

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-06-15 19:14:02 +03:00
Sofia Papagiannaki
fba90b8f9b
Alerting: Recact html responses (#35277) 2021-06-04 20:57:24 +03:00
Sofia Papagiannaki
8cda1f5153
Alerting: Allow rules with same title across folders (#35270)
* Alerting: Allow rules with same title across folders

* Add test
2021-06-04 20:45:26 +03:00
Sofia Papagiannaki
15c55b0115
Alerting: Fix notification channel migration and handle case when Alertmanager default configuration is absent (#35086)
* Fix dashboard alert and nootifier migration for MySQL

* Fix POSTing Alertmanager configuration if no current configuration exists

in case the default configuration has not be stored yet
or has failed to get stored

* Change CreatedAt field type
2021-06-04 15:52:41 +03:00
Ganesh Vernekar
8417088969
Alerting: Expand {{$labels.xyz}} template in labels and annotations (#35159)
* Alerting: Expand `{{$labels.xyz}}` template in labels and annotations

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix annotation not updating for same alert

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:24:36 +02:00
Ganesh Vernekar
a30e60a0b8
Alerting: Do not hard fail on templating errors in channels (#35165)
* Alerting: Do not hard fail on templating errors in channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:39:32 +05:30
Ganesh Vernekar
a23674ef99
Alerting: Migrate tags as labels and not annotations (#34990)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-31 19:47:17 +05:30
Sofia Papagiannaki
355be158b7
[Alerting]: fix/cleanup API examples (#34588) 2021-05-31 11:18:29 +03:00
Chip Wolf ‮
badec6c6ad
Alerting: Add support for configuring avatar URL for the Discord notifier (#33355)
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-05-28 23:00:21 +02:00
Owen Diehl
cc38613ba4
alerting: fixes per-receiver metric cardinality (#34915) 2021-05-28 12:31:23 -04:00
Owen Diehl
9aca032d10
Alerting/consistent api errors (#34858)
* consolidates alertmanager api errors

* util & testing consistent errors

* consistent errors for rest of ngalert apis

* updates expected errors in testware

* bump ci

* linting

* unrelated: dashboard.go lint
2021-05-28 11:55:03 -04:00
Kyle Brandt
b47e7d12e6
Alerting: Extract values from MD expr alerts (#34757)
When using mulit-dimensional Grafana managed alerts (e.g. SSE math) extract refIds values and labels so they can be shown in the notification and dashboards.
2021-05-28 11:04:20 -04:00
Danilo Bargen
83a83de10a
Clarify that Threema Gateway Alerts support only Basic IDs (#34828)
Threema Gateway supports two types of IDs: Basic IDs (where the
encryption is managed by the API server) and End-to-End IDs (where the
keys are managed by the user).

This plugin currently does not support End-to-End IDs (since it's much
more complex to implement, because the encryption needs to happen
locally). Add a few clarifications to the UI.

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-28 08:54:55 +02:00
Domas
347273cdea
Alerting: check upstream response content type in lotex proxy (#34760) 2021-05-27 14:12:29 +03:00
David Parrott
20d356947c
set state correctly and test (#34680) 2021-05-26 11:37:42 -07:00
Ganesh Vernekar
d69c21acb6
NGAlert: Update the default template to include more URLs (#34715)
* NGAlert: Update the default template to include more URLs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:49:39 +02:00
Ganesh Vernekar
b168223029
NGAlert: Add integration tests for remaining notification channels (#34662)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:33:55 +05:30
Marcus Andersson
e19b3df1a9
Alerting: added possibility to preview grafana managed alert rules. (#34600)
* starting to add eval logic.

* wip

* first version of test rule.

* reverted file.

* add info colum to result to show error or (with CC evalmatches)

* fix labels in evalmatch

* fix be test

* refactored using observables.

* moved widht/height div to outside panel rendere.

* adding docs api level.

* adding container styles to error div.

* increasing size of preview.

Co-authored-by: kyle <kyle@grafana.com>
2021-05-26 10:06:28 +02:00
Owen Diehl
0e0ed43153
Alerting/testing promql extraction (#34665)
* promql compat for marshaling

* extracts upstream instant queries into data frame for alerting

* eval string parity
2021-05-25 11:54:50 -04:00
Sofia Papagiannaki
b48832c0f7
[Alerting]: alertmanager notifier fixes (#34575) 2021-05-24 16:09:29 +03:00
Sofia Papagiannaki
23939eab10
[Alerting]: namespace fixes (#34470)
* [Alerting]: forbid viewers for updating rules if viewers can edit

check for CanSave instead of CanEdit

* Clear ngalert tables when deleting the folder

* Apply suggestions from code review

* Log failure to check save permission

Co-authored-by: gotjosh <josue@grafana.com>
2021-05-20 15:49:33 +03:00
gotjosh
7b04278834
Alerting: Opsgenie notification channel (#34418)
* Alerting: Opsgenie notification channel

This translate the opsgenie notification channel from the old alerting
system to the new alerting system with a few changes:

- The tag system has been replaced in favour of annotation.
- TBD
- TBD

Signed-off-by: Josue Abreu <josue@grafana.com>

* Fix template URL

* Bugfig: dont send resolved when autoClose is false

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix integration tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix URLs in all other channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-20 10:12:08 +02:00
David Parrott
7a83d1f9ff
Alerting resend delay for sending to notifiers (#34312)
* adds resend delay to avoid saturating notifier

* correct method signatures

* pr feedback
2021-05-19 22:15:09 +02:00
Owen Diehl
8f350bc353
actually register metrics this time (#34444) 2021-05-19 22:09:12 +02:00
Ganesh Vernekar
533be16787
NGAlert: Add Threema notification channel (#34159)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:20:52 +02:00
Ganesh Vernekar
b2e84277a3
NGAlert: Add Kafka notification channel (#34156)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:02:09 +02:00
Ganesh Vernekar
ad1d0ae0bf
NGAlert: Add VictorOps notification channel (#34161)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:52:14 +02:00
Ganesh Vernekar
fb9223ab42
NGAlert: Add Line notification channel (#34157)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:04:48 +02:00
Domas
54c33c6cdd
Alerting: update email template (#34205) 2021-05-19 18:58:31 +02:00
Ganesh Vernekar
01e0faf800
NGAlert: Add GoogleChat notification channel (#34153)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 18:24:04 +02:00
David Parrott
a0f175c7a5
also don't allow negative intervalseconds (#34319) 2021-05-19 09:05:32 -07:00
David Parrott
b9f4ec2030
Add discord notifier channel and test (#34150)
* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* update constructor and tests

* group imports sensibly

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 17:31:55 +02:00
Sofia Papagiannaki
a79a4838b8
[Alerting]: Add Pushover integration with the alert manager (#34371)
* [Alerting]: Add Pushover integration with the alert manager

* lint

* Set boundary only for tests

* Remove title field

* fix imports
2021-05-19 16:48:46 +02:00
Owen Diehl
1d2febfa85
[Alerting] Route validations (#34393)
* more routing validation

* go mod

* recursive route validations
2021-05-19 10:36:28 -04:00
Arve Knudsen
9dfaa037d1
Alerting: Migrate Alertmanager notifier (#34304)
* Alerting: Port Alertmanager notifier to v8

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-19 15:27:41 +02:00
Owen Diehl
d6c4c2fcd5
[Alerting] Ensure upstream validations are run (#34333)
* use embedded validations via noop yaml unmarshaler

* lint

* fixes integration tests now that groupings are handled
2021-05-19 06:22:44 -04:00
Owen Diehl
c48c701791
adds missing metric name (#34307) 2021-05-18 17:24:38 -04:00
David Parrott
25485100b0
Alerting: Trim results when at processing instead of on ticker (#34248)
* Trim results when at processing instead of on ticker

* User RWMutex correctly

* remove comment
2021-05-18 10:56:14 -07:00
David Parrott
bbb7bbf891
Alerting: Remove back end logic for supporting KeepLastState (#34242)
* Removed back end logic for supporting KeepLastState

* Map keep_state correctly in migrations
2021-05-18 10:55:43 -07:00
Sofia Papagiannaki
ff112f07e3
[Alerting]: Add Sensu Go integration with the alert manager (#34045)
* [Alerting]: Add sensugo notification channel

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Do not include labels with concatenated rule UID and names

* Modifications after syncing with main

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-05-18 17:31:51 +03:00
Sofia Papagiannaki
11243dec14
[Alerting]: Assign UUID to grafana receivers (#34241)
* [Alerting]: Assign UUID to grafana receivers

* Apply suggestions from code review

* Add test for updating invalid receiver

Co-authored-by: Domas <domasx2@gmail.com>
2021-05-18 17:31:00 +03:00
Kyle Brandt
63b2dd06a5
Alerting: Set "value" with evalmatches in G Managed (#34075)
When, and currently only when using a classic condition, evaluation information is added (which is like the EvalMatches from dashboard alerting).

This is returned via the API and can be included in notifications by reading the `__value__` label attached `.Alerts` in the template. It is a string.
2021-05-18 09:12:39 -04:00
Ganesh Vernekar
89c2b5e863
NGAlert: Remove unwanted fields from notification channel config (#34036)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-18 10:04:47 +02:00
gotjosh
6384f86fb9
Alerting: Allow the notifier to log (#34232)
* Alerting: Allow the notifier to log

The notifier upstream code uses go-kit as its logging library. The
grafana specific logger is not compatible with this API. In this PR, I
have created a wrapper that implements io.Writer to make them
compatible.
2021-05-17 18:06:47 +01:00
Kyle Brandt
331991ca10
UAlerting: Increase default max datapoints (#34223)
Change const value from 100 to 43200 (12 hours at 1sec interval)
2021-05-17 18:46:52 +02:00
Ganesh Vernekar
d5ae55c5dd
NGAlert: Add message field to email notification channel (#34044)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-17 16:05:09 +05:30
Owen Diehl
1367f7171e
Alerting/ruler metrics (#34144)
* adds active configurations metric

* rule evaluation metrics

* ruler metrics

* pr feedback
2021-05-14 16:13:44 -04:00
gotjosh
eb74994b8b
Alerting: Modify configuration apply and save semantics - v2 (#34143)
* Save default configuration to the database and copy over secure settings
2021-05-14 19:49:54 +01:00
Owen Diehl
fc90c36d50
removes unused db method (#34082) 2021-05-13 20:28:10 +02:00
Owen Diehl
baca873a84
extracts alertmanager from DI, including migrations (#34071)
* extracts alertmanager from DI, including migrations

* includes alertmanager Run method in ngalert

* removes 3s test shutdown timeout

* lint
2021-05-13 14:01:38 -04:00
Ganesh Vernekar
ec3214bac2
NGAlert: Add integration tests for notification channels (#33431)
* NGAlert: Add integration tests for notification channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the failing tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Override creation of rule UID, remove only namespace UID

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 22:58:19 +05:30
Kyle Brandt
babb17afd6
Alerting/Chore: Move tests from tests package (#34059)
Instead put in package folder but with package name suffixed with _test
This enables code coverage within the pkg while still allow the tests to operate from external to package perspective (only exported things).
2021-05-13 10:05:33 -04:00
Ganesh Vernekar
5f44ccff0c
NGAlert: Fix unit test to write files in temporary directory (#34032)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 16:08:12 +05:30
Kyle Brandt
3da8db7f3f
Alerting: Run table migrations regardless of feature flag and move out of service (#33996) 2021-05-12 14:39:48 -04:00
Owen Diehl
3b06f52bab
Alerting/allow empty receiver (#33962)
* simplifies yaml unmarshaling: PostableApiReceiver

* allow empty receiver type

* allows name only receivers (blackhole)

* better receiver type parsing

* linting
2021-05-12 07:58:16 -04:00
Kyle Brandt
a735c51202
Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
Ganesh Vernekar
8d442c9b44
NGAlert: Fix templating and remove unwanted default templates (#33918)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-12 15:13:43 +05:30
Sofia Papagiannaki
f4750fb3c8
[Alerting]: Alertmanager API apply permissions (#33843)
* [Alerting]: Alertmanager API apply permissions

* Apply suggestions from code review
2021-05-11 11:31:38 +03:00
Owen Diehl
e18ca8f6f2
enforce receivers align with backend type when posting AM config (#33877) 2021-05-10 16:58:41 -04:00
Sofia Papagiannaki
1c58fd380f
[Alerting]: store encrypted receiver secure settings (#33832)
* [Alerting]: Store secure settings encrypted

* Move encryption to the API handler
2021-05-10 15:30:42 +03:00
David Parrott
e58aca2d20
Alerting: remove instances from db and cache on rule update (#33722)
* remove instances from db and cache on rule update

* fix panic

* rename
2021-05-06 18:39:34 +02:00
Kyle Brandt
fae093bbe2
Alerting: Fix state cache getOrCreate panic (#33777) 2021-05-06 14:35:52 +02:00
Owen Diehl
a5ae8cf377
Unredact/secret (#33723)
* no longer redacts GETing proxied AM configs

* removes unused testfile

* testware fix

* consistently roundtrips yaml<>json and doesnt redact secrets

* lint
2021-05-05 16:21:53 -04:00
Ganesh Vernekar
1b8c0ce88b
NGAlert: Fix some TODOs in notification channels (#33739)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-05 17:48:40 +05:30
David Parrott
b1a8c67689
Alerting return evaluation errors to /rules (#33663)
* Set and return errors produced by evaluation results

* test fixup
2021-05-04 13:08:12 -04:00
David Parrott
39099bf3c0
Alerting nested state cache (#33666)
* nest cache by orgID, ruleUID, stateID

* update accessors to use new cache structure

* test and linter fixup

* fix panic

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* add comment to identify what's going on with nested maps in cache

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-05-04 09:57:50 -07:00
David Parrott
5072fefc22
allow saving pending alerts (#33667) 2021-05-04 09:24:20 -07:00
Sofia Papagiannaki
540f110220
[Alerting]: Extend quota service to optionally set limits on alerts (#33283)
* Quota: Extend service to set limit on alerts

* Add test for applying quota to alert rules

* Apply suggestions from code review

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Get used alert quota only if naglert is enabled

* Set alert limit to zero if nglalert is not enabled
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
2021-05-04 19:16:28 +03:00
Ganesh Vernekar
918552d34b
NGAlert: Send list of available ngalert notification channels via API (#33489)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-04 13:58:39 +02:00
Kyle Brandt
48358efc13
Alerting: remove State cache entries on Ruler Delete (#33638)
for https://github.com/grafana/alerting-squad/issues/133
2021-05-03 14:01:33 -04:00
Owen Diehl
070627d11e
better handle metrics for state transitions (#33648) 2021-05-03 11:57:24 -04:00
Kyle Brandt
c1034f3118
Alerting: Create instanceStore (#33587)
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt
c2a5da79e3
Alerting: Avoid panic by not loading instances without a rule (#33597) 2021-05-01 19:01:28 +02:00
Kyle Brandt
759a0cd71b
Build: Fix with cleanup call maybe? (#33590) 2021-04-30 13:02:37 -07:00
Kyle Brandt
7823842c5d
Alerting: Load annotations from rule into State cache (#33542)
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt
b8f01fe034
Alerting: backend "ng" code cleanup (#33578) 2021-04-30 13:21:57 -04:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt
6c8ef2a9c2
Alerting: Alert Rule migration (#33000)
* Not complete, put migration behind env flag for now:
UALERT_MIG=iDidBackup
* Important to backup, and not expect the same DB to keep working until the env trigger is removed.
* Alerting: Migrate dashboard alert permissions
* Do not use imported models
* Change folder titles

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-04-29 13:24:37 -04:00
Sofia Papagiannaki
1e380e869e
[Alerting]: some fixes (#33538)
* Fix fialure when adding state annotations

* Fix get org rules API

Do not fail response if user has no access to view a namespace.
Do not include the namespace in the response instead.

* lint
2021-04-29 19:15:15 +03:00
Kyle Brandt
d32fcbe2bc
Alerting: Eval pkg tests and more specific error handling (#33496)
* comment updates
* more friendly error messages, in particular if it looks like time series data
2021-04-29 07:27:32 -04:00
Owen Diehl
ec37b4cb87
[Alerting] Automatic request instrumentation (#33444)
* alerting: automatic request instrumentation

* always expose alerting prom metrics

* globally register alerting metrics
2021-04-28 16:59:15 -04:00
Kyle Brandt
914443c816
Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
Sofia Papagiannaki
7ccb022c03
Alerting: validate condition before updating rulegroup (#33367)
* Alerting: validate condition before updating rulegroup

* Apply suggestions from code review
2021-04-28 11:31:51 +03:00
Kyle Brandt
b590e95682
AlertingAPI: Change list response query prop (#33419)
* Alerting: change to full []AlertQuery as json in a string and not just model.
2021-04-27 22:15:00 +02:00
Ganesh Vernekar
467ab124dd
NGAlert: Fix GET for Alertmanager config (#33379)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 20:48:19 +05:30
Kyle Brandt
adcba36d39
AlertingAPI: update swagger json files match datasourceUid change (#33332)
* update swagger json files match datasourceUid change
underlying change made in https://github.com/grafana/grafana/pull/33282
* Document DatasourceUID field in AlertQuery model
* Run spec generation from inside a docker container
* Generate latest spec

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-27 16:50:30 +02:00
Owen Diehl
86c8eed386
Instrument/ruler api (#33290)
* ruler api histogram instrumentation

* register ruler metrics
2021-04-27 08:25:32 -04:00
Ganesh Vernekar
be1affe0a4
NGAlert: Fix flaky test (#33415)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 17:03:22 +05:30
David Parrott
788bc2a793
Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00
David Parrott
ca79206498
Alerting: Handle NoData and Error evaluation results (#33194)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* not those annotations
2021-04-23 20:47:52 +02:00
Kyle Brandt
5e818146de
Alerting/Expr: New SSE Request/QueryType, alerting move data source UID (#33282) 2021-04-23 16:52:32 +02:00
Ganesh Vernekar
659ea20c3c
NGAlert: Run the maintenance cycle for the silences (#33301)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 16:19:03 +02:00
Ganesh Vernekar
d66a5e65a4
AlertingNG: Add webhook notification channel (#33229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 18:59:28 +05:30
Ganesh Vernekar
a0e567f80f
AlertingNG: Add Dingding notification channel (#32995)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 19:30:49 +02:00
Ganesh Vernekar
4ec1edfca3
AlertingNG: Add Teams notification channel (#32979)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 18:16:26 +02:00
Ganesh Vernekar
c9cd7ea701
AlertingNG: Add Telegram notification channel (#32795)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 17:24:59 +02:00
Ganesh Vernekar
0a03d5c29e
AlertingNG: Correctly set StartsAt, EndsAt, UpdatedAt after alert reception (#33109)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 20:42:18 +05:30
Ganesh Vernekar
3056f86f76
AlertingNG: Fix TODOs in email notification channel (#33169)
* AlertingNG: Fix TODOs in email notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test fixup

* Remove the receiver field it is not needed for the email notification

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-22 10:01:55 -04:00
Arve Knudsen
6408b55a7c
Slack: Use chat.postMessage API by default (#32511)
* Slack: Use only chat.postMessage API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Check for response error

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Support custom webhook URL

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Simplify

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rewrite tests to use stdlib

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/alerting/notifiers/slack.go

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>

* Clarify URL field name

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix linting issue

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix test

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix up new Slack notifier

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Improve tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix lint

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Make token not required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Alerting: Send validation errors back to client

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Document how token is required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Make recipient required when using Slack API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix field description

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>
2021-04-22 16:00:21 +02:00
Arve Knudsen
66020b419c
NGAlert: Consolidate on standard errors package (#33249)
* NGAlert: Don't use pkg/errors

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/ngalert/notifier/alertmanager.go

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>

* Fix logging

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
2021-04-22 11:18:25 +02:00
Sofia Papagiannaki
b2288f7ef9
[Alerting]: Add alerting endpoint for Query Evaluation (#33174)
* [Alerting]: Add alerting endpoint for Query Evaluation

* Fix passing down now parameter

* Add validations and test

* Fix eval queries and expressions test

* Add eval tests
2021-04-21 22:44:50 +03:00
gotjosh
de0802cf3b
Alerting: Fixes the integration test currently failing at master (#33233)
* Alerting: Fixes the integration test currently failing at master

* Skip the state tracker test for now
2021-04-21 14:57:17 -04:00
David Parrott
4be1d84f23
Alerting: Enhancements to /rules (#33085)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* pr feedback

* Do not initialize mutex unnecessarily

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* linter

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-21 09:30:03 -07:00
Sofia Papagiannaki
87a70af7eb
[Alerting]: Fix updating rule group and add tests (#33074)
* [Alerting]: Fix updating rule group and add test

* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule

* Test updating annotations

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* add test for posting an unknown rule UID

* Fix alert rule validation and add tests

* Remove org id from PostableGrafanaRule

This field was not used; each rule gets the organisation of the user making
the rerquest

* Update pkg/tests/api/alerting/api_alertmanager_test.go

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-21 17:22:58 +03:00
gotjosh
23c7e7ab60
Alerting: Various fixes for the alerts endpoint (#33182)
A set of fixes for the GET alert and groups endpoints.

- First, is the fact that the default values where not being for the query params. I've introduced a new method in the Grafana context that allow us to do this.
- Second, is the fact that alerts were never being transitioned to active. To my surprise this is actually done by the inhibitor in the pipeline - if an alert is not muted, or inhibited then it's active.
- Third, I have added an integration test to cover for regressions.

Signed-off-by: Josue Abreu <josue@grafana.com>
2021-04-21 06:34:42 -04:00
Owen Diehl
e065e19583
Fix/ngalert generation (#33172)
* fixes pkg names & alerting openapi generation

* cleans up api generation, uses docker & removes python
2021-04-20 13:12:32 -04:00
Oscar Kilhed
bc2d90f140
Fix lint issue in cortex ruler test (#33158) 2021-04-20 14:03:58 +02:00
Owen Diehl
e37a780e14
Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename fodler->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Ganesh Vernekar
6271777ec6
AlertingNG: Remove the receivers field from postable alerts (#33068)
* AlertingNG: Remove the receivers field from postable alerts and update tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-19 12:28:44 +05:30
Kyle Brandt
2c862678ab
AlertingNG/SSE: Datasource UID and UID/ID only (drop name) (#33039)
SSE still will support ID until dashboard/frontend always requests UID
update *.http examples

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-16 15:29:19 +02:00
David Parrott
555da77527
Dparrott/labels on alert rule (#33057)
* move state tracker tests to /tests

* set default labels on alerts

* handle empty labels in result.Instance

* create annotation on transition to alerting state
2021-04-16 15:11:40 +02:00
gotjosh
362c4d4276
Alerting: Integration test rule creation (#33047)
* Alerting: Integration test rule creation

* Appease the linter

* Cleanup

* Make anonymous user role a parameter
2021-04-16 08:00:07 -04:00
David Parrott
2276e9556a
Alerting: set query in rules response (#33010)
* set query in rules response

* Theme: tweaking dark theme colors (#33007)

* Library Panels: Add library panel tab to share modal (#32953)

* Explore: Scroll split panes in Explore independently (#32978)

* Change default prometheus to latest and prometheus v1 to prometheus1

* Update README

* Remove prometheus1 block as not used

* Explore: Separatae scrolling in split view

* Update snapshot

* Allow skip migrations in tests via environment variable (#32958)

* Dashboard: Fix issue where Slack notifications won't link to users (#32861)

* DashboardPage: refactored styles from sass to emotion (#32955)

* DashboardPage: refactored styles from sass to emotion

* refactored dashboardPage component to be alot easier to read and understand

* more refactoring...

* more cleaning...

* fixes frontend test

* fixes frontend test- I hope

* fixes frontend test- I hope

* moves dashboard scss styles back to it's standalone file

* GraphNG: use theme font family and size for axis labels (#33009)

* GraphNG: use theme font family and size for axis labels

* fix test

* AlertingNG: Slack notification channel (#32675)

* AlertingNG: Slack notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and small refactoring

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* GraphNG: stacking (#30749)

* First iteration

* Dev dash

* Re-use StackingMode type

* Fix ts and api issues

* Stacking work resurected

* Fix overrides

* Correct values in tooltip and updated test dashboard

* Update dev dashboard

* Apply correct bands for stacking

* Merge fix

* Update snapshot

* Revert go.sum

* Handle null values correctyl and make filleBelowTo and stacking mutual exclusive

* Snapshots update

* Graph->Time series stacking migration

* Review comments

* Indicate overrides in StandardEditorContext

* Change stacking UI editor, migrate stacking to object option

* Small refactor, fix for hiding series and dev dashboard

* VizLegend: sets a min and max value of the seriesCount control in Storybook (#33022)

* Alerting: Filter rules list (#32818)

* Chore: Reduces strict errors (#33012)

* Chore: reduces strict error in OptionPicker tests

* Chore: reduces strict errors in FormDropdownCtrl

* Chore: reduces has no initializer and is not definitely assigned in the constructor errors

* Chore: reduces has no initializer and is not definitely assigned in the constructor errors

* Chore: lowers strict count limit

* Tests: updates snapshots

* Tests: updates snapshots

* Chore: updates after PR comments

* Refactor: removes throw and changes signature for DashboardSrv.getCurrent

* [Alerting]: Several modifications in alert rules (#32983)

* [Alerting]: Use common properties for all rules

* Add Labels in rules

* Fix update ruleGroup API

Return 400 Bad Request response
when the request contains a UID that does not exist

* Check permissions and return namespace id

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* WIP (#33025)

* Chore: Bump strict error count limit (#33035)

* set query in rules response

Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: kay delaney <45561153+kaydelaney@users.noreply.github.com>
Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>
Co-authored-by: Dafydd <72009875+dafydd-t@users.noreply.github.com>
Co-authored-by: n-wbrown <n-wbrown@users.noreply.github.com>
Co-authored-by: Uchechukwu Obasi <obasiuche62@gmail.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Dominik Prokop <dominik.prokop@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
Co-authored-by: Hugo Häggmark <hugo.haggmark@grafana.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
2021-04-15 22:23:16 +02:00
Sofia Papagiannaki
6bbb2fd4ba
[Alerting]: Several modifications in alert rules (#32983)
* [Alerting]: Use common properties for all rules

* Add Labels in rules

* Fix update ruleGroup API

Return 400 Bad Request response
when the request contains a UID that does not exist

* Check permissions and return namespace id

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-15 15:54:37 +03:00
Ganesh Vernekar
04a8d5407e
AlertingNG: Slack notification channel (#32675)
* AlertingNG: Slack notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and small refactoring

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-15 16:01:41 +05:30
Sofia Papagiannaki
624fbf5dda
[Alerting]: Fix empty rules evaluation statuses (#32997)
* [Alerting]: Fix empty rules evaluation statuses

`GetRuleGroupAlertRules()` requires an non empty namespaceUID

* Include the namespace into the response
2021-04-14 17:49:26 +00:00
Sofia Papagiannaki
8848d825e0
[Alerting]: Use title instead of slug for retrieving the namespace (#32957)
* [Alerting]: Use title instead of slug for retrieving the namespace

* Apply suggestions from code review

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-04-14 12:02:50 +03:00
David Parrott
567a6a09bd
Alerting: Return RuleResponse for api/prometheus/grafana/api/v1/rules (#32919)
* Return RuleResponse for api/prometheus/grafana/api/v1/rules

* change TODO to note

Co-authored-by: gotjosh <josue@grafana.com>

* pr feedback

* test fixup

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-13 17:38:09 -04:00
Sofia Papagiannaki
e7ff04a167
[Alerting]: Implement test rule API route (#32837)
* [Alerting]: Implement test rule API route

* Apply suggestions from code review

* Call /query instead of /query_range
2021-04-13 20:58:34 +03:00
gotjosh
528ca9134b
Alerting: Use a default configuration and periodically poll for new ones (#32851)
* Alerting: Use a default configuration and periodically poll for new ones

Use a default configuration to make sure we always start the grafana
instance. Then, regularly poll for new ones.

I've also made sure that failures to apply configuration do not stop the
Grafana server but instead keep polling until it is a success.
2021-04-13 13:02:44 +01:00
Sofia Papagiannaki
54689f2739
[Alerting]: Fix YAML encoding/decoding issues when proxying lotex requests (#32854)
* [Alerting]: Fix GET lotex rule group config

* [Alerting]: Fix POST lotex user config
2021-04-12 12:04:37 +03:00
Kyle Brandt
80dfa83380
AlertingNG: Add For+Annotations to Grafana_Alert (#32793)
* add db columns
* Fix deserialisation issue of AlertRule For field (#32848)
* Update to latest alerting-api

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-04-09 16:50:04 +02:00
gotjosh
c9e5088e8b
Alerting: Cleanup and move legacy to a legacy file (#32803)
* Alerting: Cleanup and move legacy to a legacy file

A quick cleanup of the ngalert/api directory, optimising for an easy
removal of what is will be considered legacy at some point. A quick
summary of what's done is:

- Add a prefix `generated` prefix to files that are auto-generated by
  our swagger definitions.
- Create a legacy file to place all the legacy API routes implementation
  and helpers. Deleting files that where no longer needed after this
move.
- Rename the `lotex` file to `lotex_ruler`
- Adding a couple of comments here and there.

With this, I hope to organise our code in this directory a bit better
given there's a lot going on.
2021-04-09 05:55:41 -04:00
Ganesh Vernekar
e3a1d3d158
AlertingNG: PagerDuty notification channel (#32604)
* AlertingNG: PagerDuty notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 19:21:09 +02:00
Ganesh Vernekar
b1c84c795f
AlertingNG: Add a global registry for notification channels (#32781)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 22:01:23 +05:30
gotjosh
fe67680c42
Alerting: Allow querying of Alerts from notifications (#32614)
* Alerting: Allow querying of Alerts from notifications

* Wire everything up

* Remove unused functions

* Remove duplicate line
2021-04-08 07:27:59 -04:00
Owen Diehl
8b8fc293b7
safer, more idiomatic proxy helper (#32732)
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-07 15:36:50 -04:00
Kyle Brandt
d519913843
AlertingNG: Temp endpoint to translate dashboard alert into rule group (#32694)
* Set NoData and ExecErr states
* make save an option
* TODOs
* adjust interval
* FOR and alertRuleTags not done yet
2021-04-07 14:28:06 +02:00
Ganesh Vernekar
0f7d8ae6d2
Update email template for AlertingNG (#32691)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-07 14:52:48 +05:30
Domas
a56293142a
Alerting: unified alerting frontend (#32708) 2021-04-07 08:42:43 +03:00
Sofia Papagiannaki
9d7d33ebb3
[Alerting]: Require login for alerting API routes (#32688)
* [Alerting]: Require login for alerting API routes

* Fix example requests

* [Alerting]: Add /api prefix to all routes (#32716)
2021-04-06 17:22:05 +03:00
David Parrott
c0d83fc01e
Alerting: Return cached alerts for prometheus/api/v1/alerts (#32654)
* Return cached alerts for prometheus/api/v1/alerts

* Return not implemented for /prometheus/grafana/api/v1/rules

* Set StartsAt for already alerting states

* Fix tests
2021-04-05 15:05:39 -07:00
Sofia Papagiannaki
a80f31a5c8
Alerting: 404 error status when no alertmanager configuration (#32651) 2021-04-05 17:03:00 +03:00
Sofia Papagiannaki
daabf64aa1
[Alerting]: Update scheduler to evaluate rules created by the unified API (#32589)
* Update scheduler

* Fix tests

* Fixes after code review feedback

* lint - add uncommitted modifications

Co-authored-by: kyle <kyle@grafana.com>
2021-04-03 20:13:29 +03:00
Kyle Brandt
948da25c13
ngalerting: represent nil/empty labels the same (#32652)
was getting duplicates of [] and null before
2021-04-02 13:49:45 -04:00
Kyle Brandt
7fcb6ecb91
Alerting: Fix persistance migration (#32650) 2021-04-02 18:31:03 +02:00
Sofia Papagiannaki
0e350ae6c8
Remove more dead code (#32645) 2021-04-02 18:24:27 +03:00
David Parrott
2a8446e435
Alerting: Persist alerts on evaluation and shutdown. Warm cache from DB on startup (#32576)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* Save alert transitions and save all state on shutdown

* pr feedback

* WIP

* WIP

* Persist alerts on evaluation and shutdown. Warm cache on startup

* Filter non-firing alerts before sending to notifier

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-02 08:11:33 -07:00
Kyle Brandt
6ad02315eb
ngalert: json dataframe on temp endpoints (#32641) 2021-04-02 15:52:38 +02:00
Ryan McKinley
c7ea96940a
Arrow: move arrow support from frontend to backend only (#32575) 2021-04-01 10:30:08 -07:00
Sofia Papagiannaki
8793f5c7f8
[Alerting]: Delete obsolete database table and code (#32595)
* Delete obsolete migration

* Remove redundant code
2021-04-01 19:41:57 +03:00
Sofia Papagiannaki
ee06970d72
[Alerting]: Grafana managed ruler API implementation (#32537)
* [Alerting]: Grafana managed ruler API impl

* Apply suggestions from code review

* fix lint

* Add validation for ruleGroup name length

* Fix MySQL migration

Co-authored-by: kyle <kyle@grafana.com>
2021-04-01 11:11:45 +03:00
Sofia Papagiannaki
a5e95823b2
[Alerting]: Alertmanager API implementation (#32174)
* Add validation for grafana recipient

* Alertmanager API implementation (WIP)

* Fix encoding/decoding receiver settings from/to YAML

* Save templates together with the configuration

* update POST to apply latest config

* Alertmanager service enabled by the ngalert toggle

* Silence API integration with Alertmanager

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-31 23:00:56 +03:00
gotjosh
433f6b91d0
Alerting: Introduce the silencing interface (#32517)
* Alerting: Introduce the silencing interface

The operations introduced are:

- Listing silences
- Retrieving an specific silence
- Deleting a silence
- Creating a silence

Signed-off-by: Josue Abreu <josue@grafana.com>

* Add a comment to listing silences

* Update to upstream alertmanager

* Remove copied code from the Alertmanager
2021-03-31 07:36:36 -04:00
David Parrott
b1cb74c0c9
Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
Sofia Papagiannaki
c4d5a67b38
[Alerting] Forking alert manager API (#32300)
* Alertmanager lotex ruler

* Apply suggestions from code review
2021-03-29 18:18:25 +03:00
Ganesh Vernekar
740c5813d4
AlertingNG: Fix dispatcher metrics in notifier (#32434)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-29 20:35:15 +05:30
Domas
0779dab0de
NgAlerting: loki & cortex have different prom & ruler endpoint prefixes (#32344) 2021-03-29 08:55:09 +03:00
Ganesh Vernekar
a0db4dce32
Render new email template and fix the title (#32314)
* Render new email template and fix the title

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nit

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-26 12:53:14 +05:30
Ganesh Vernekar
093e5947f4
Upgrade Prometheus Alertmanager and small fixes (#32280)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-25 12:51:44 +01:00
David Parrott
d33a77a67f
Alerting: add state tracker to alerting evaluation (#32298)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup
2021-03-24 15:34:18 -07:00
gotjosh
58b814bd7d
Alerting: Add StartAt and FiredAt to the alert evaluation result (#32302) 2021-03-24 20:27:04 +00:00
Kyle Brandt
66548878fe
ngalert: add addition temp translation endpoint (#32287)
spits out new sse condition/data json from old alert id to help with generating UI models
also moves this api code into another file
2021-03-24 17:12:34 +01:00
gotjosh
9b52ffc6a9
Alerting: Fetch configuration from the database and run a notification service (#32175)
* Alerting: Fetch configuration from the database and run a notification
instance

Co-Authored-By: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-24 14:20:44 +00:00
Kyle Brandt
05fa9d6ad9
AlertingNG: rename Condition properties to match front end (#32267) 2021-03-24 08:07:29 -04:00
Owen Diehl
2179a2658e
Extricates reusable utilities for different alerting proxy types (#32268)
* backendtype helper

* abstracts alertingproxy

* updates alerting api dep

* prom endpoints
2021-03-24 07:43:25 -04:00
Kyle Brandt
7bb79158ed
SSE/Alerting: First pass at query/condition translation (#31693)
- Takes the conditions property from the settings column of an alert from alerts table and turns into an ng alerting condition with the queries and classic condition.
- Has temp API rest endpoint that will take the dashboard conditions json, translate it to SEE queries + classic condition, and execute it (only enabled in dev mode).
- Changes expressions to catch query responses with a non-nil error property
- Adds two new states for an NG instance result (NoData, Error) and updates evaluation to match those states
- Changes the AsDataFrame (for frontend) from Bool to string to represent additional states
- Fix bug in condition model to accept first Operator as empty string.
- In ngalert, adds GetQueryDataRequest, which was part of execute and is still called from there. But this allows me to get the Expression request from a condition to make the "pipeline" can be built.
- Update AsDataFrame for evalresult to be row based so it displays a little better for now
2021-03-23 12:11:15 -04:00
Sofia Papagiannaki
24cb059a6b
[Alerting]: implement backend checking for forking to Lotex ruler (#32208)
* Rename DatasourceId path parameter

* Implement fork ruler backendType()

* Apply suggestions from code review
2021-03-23 18:08:57 +02:00
Owen Diehl
93d0f7163f
[Alerting] Forking LoTex ruler (#32138)
* updates alerting api to master

* skeleton for lotex ruler

* withPath helper & legacyRulerPrefix const

* forked ruler

* wires up proxy

* safeMacaronWrapper

* working proxy

* jsonExtractor

* lint
2021-03-19 10:32:13 -04:00
Ganesh Vernekar
8854001b67
AlertingNG: Refactor notifier to support config reloads (#32099)
* AlertingNG: Refactor notifier to support config reloads

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and make reloading of config a sync operation

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-19 09:26:00 +01:00
gotjosh
cc74b1fe46
Alerting: Add database table for persisting alerting configuration (#32042)
* Alerting: Add database table for persisting alerting configuration

* Fix the linter

* Address review comments

* Don't split templates and configuration

It is already bundled together as part of a of the API so might as well
marshall it directly.
2021-03-18 18:12:28 +00:00
Ganesh Vernekar
0b788b5ce8
AlertingNG: Notification channel for emails (#31768)
* Email notification channel in ngalert

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use existing templating system

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Update template and add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-18 14:55:11 +00:00
Ganesh Vernekar
974ccf8091
AlertingNG: Fix the alerting stage for legacy alerts (#32025)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-17 16:38:33 +00:00