Commit Graph

441 Commits

Author SHA1 Message Date
Sofia Papagiannaki
e5a5b8e3fe
Alerting: Fix updating alert rule properties with missing/zero values (#35512)
* Fix deleting labels and annotations

* Add test

* Keep no data and error start if not provided

* Allow setting interval and for to zero during rule updates
2021-06-15 20:55:25 +03:00
Sofia Papagiannaki
abe35c8c01
Alerting: Add error recovery during rule evaluations (#35450)
* Alerting: Eval recovery after query failure

* Apply suggestions from code review
2021-06-15 19:30:21 +03:00
gotjosh
f7ed35336d
Alerting: Implement /status for the notification system (#33227)
* Alerting: Implement /status for the notification system

Implements the necessary plumbing to have a /status endpoint on the
notification system.

* Add API examples

* Update API specs

* Update prometheus/common dependency

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-06-15 19:14:02 +03:00
Sofia Papagiannaki
fba90b8f9b
Alerting: Redact html responses (#35277) 2021-06-04 20:57:24 +03:00
Sofia Papagiannaki
8cda1f5153
Alerting: Allow rules with same title across folders (#35270)
* Alerting: Allow rules with same title across folders

* Add test
2021-06-04 20:45:26 +03:00
Sofia Papagiannaki
15c55b0115
Alerting: Fix notification channel migration and handle case when Alertmanager default configuration is absent (#35086)
* Fix dashboard alert and notifier migration for MySQL

* Fix POSTing Alertmanager configuration if no current configuration exists

in case the default configuration has not been stored yet
or has failed to get stored

* Change CreatedAt field type
2021-06-04 15:52:41 +03:00
Ganesh Vernekar
8417088969
Alerting: Expand {{$labels.xyz}} template in labels and annotations (#35159)
* Alerting: Expand `{{$labels.xyz}}` template in labels and annotations

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix annotation not updating for same alert

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:24:36 +02:00
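A minimal sketch of the kind of label expansion #35159 describes, using Go's standard `text/template`. The `expand` helper and its data shape are assumptions made for illustration and are not taken from the Grafana code; the only idea shown is that prepending a variable definition makes `{{$labels.xyz}}` resolvable inside a label or annotation value.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// expand is a hypothetical helper (not Grafana's actual function). It renders
// text as a Go template with $labels bound to the given label set.
func expand(text string, labels map[string]string) (string, error) {
	// Prepend a variable definition so that {{$labels.xyz}} resolves.
	tmpl, err := template.New("expand").Parse("{{$labels := .Labels}}" + text)
	if err != nil {
		return "", err
	}
	data := struct{ Labels map[string]string }{Labels: labels}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := expand(
		"Instance {{$labels.instance}} is down",
		map[string]string{"instance": "web-1"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Returning the error instead of panicking leaves the caller free to decide whether a bad template should fail the rule or fall back to the unexpanded text.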
Ganesh Vernekar
a30e60a0b8
Alerting: Do not hard fail on templating errors in channels (#35165)
* Alerting: Do not hard fail on templating errors in channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-03 19:39:32 +05:30
Ganesh Vernekar
a23674ef99
Alerting: Migrate tags as labels and not annotations (#34990)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-31 19:47:17 +05:30
Sofia Papagiannaki
355be158b7
[Alerting]: fix/cleanup API examples (#34588) 2021-05-31 11:18:29 +03:00
Chip Wolf
badec6c6ad
Alerting: Add support for configuring avatar URL for the Discord notifier (#33355)
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-05-28 23:00:21 +02:00
Owen Diehl
cc38613ba4
alerting: fixes per-receiver metric cardinality (#34915) 2021-05-28 12:31:23 -04:00
Owen Diehl
9aca032d10
Alerting/consistent api errors (#34858)
* consolidates alertmanager api errors

* util & testing consistent errors

* consistent errors for rest of ngalert apis

* updates expected errors in testware

* bump ci

* linting

* unrelated: dashboard.go lint
2021-05-28 11:55:03 -04:00
Kyle Brandt
b47e7d12e6
Alerting: Extract values from MD expr alerts (#34757)
When using multi-dimensional Grafana managed alerts (e.g. SSE math), extract refId values and labels so they can be shown in notifications and dashboards.
2021-05-28 11:04:20 -04:00
Danilo Bargen
83a83de10a
Clarify that Threema Gateway Alerts support only Basic IDs (#34828)
Threema Gateway supports two types of IDs: Basic IDs (where the
encryption is managed by the API server) and End-to-End IDs (where the
keys are managed by the user).

This plugin currently does not support End-to-End IDs (since it's much
more complex to implement, because the encryption needs to happen
locally). Add a few clarifications to the UI.

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-28 08:54:55 +02:00
Domas
347273cdea
Alerting: check upstream response content type in lotex proxy (#34760) 2021-05-27 14:12:29 +03:00
David Parrott
20d356947c
set state correctly and test (#34680) 2021-05-26 11:37:42 -07:00
Ganesh Vernekar
d69c21acb6
NGAlert: Update the default template to include more URLs (#34715)
* NGAlert: Update the default template to include more URLs

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:49:39 +02:00
Ganesh Vernekar
b168223029
NGAlert: Add integration tests for remaining notification channels (#34662)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-26 16:33:55 +05:30
Marcus Andersson
e19b3df1a9
Alerting: added possibility to preview grafana managed alert rules. (#34600)
* starting to add eval logic.

* wip

* first version of test rule.

* reverted file.

* add info column to result to show error or (with CC evalmatches)

* fix labels in evalmatch

* fix be test

* refactored using observables.

* moved width/height div to outside panel renderer.

* adding docs api level.

* adding container styles to error div.

* increasing size of preview.

Co-authored-by: kyle <kyle@grafana.com>
2021-05-26 10:06:28 +02:00
Owen Diehl
0e0ed43153
Alerting/testing promql extraction (#34665)
* promql compat for marshaling

* extracts upstream instant queries into data frame for alerting

* eval string parity
2021-05-25 11:54:50 -04:00
Sofia Papagiannaki
b48832c0f7
[Alerting]: alertmanager notifier fixes (#34575) 2021-05-24 16:09:29 +03:00
Sofia Papagiannaki
23939eab10
[Alerting]: namespace fixes (#34470)
* [Alerting]: forbid viewers for updating rules if viewers can edit

check for CanSave instead of CanEdit

* Clear ngalert tables when deleting the folder

* Apply suggestions from code review

* Log failure to check save permission

Co-authored-by: gotjosh <josue@grafana.com>
2021-05-20 15:49:33 +03:00
gotjosh
7b04278834
Alerting: Opsgenie notification channel (#34418)
* Alerting: Opsgenie notification channel

This translates the Opsgenie notification channel from the old alerting
system to the new alerting system with a few changes:

- The tag system has been replaced in favour of annotation.
- TBD
- TBD

Signed-off-by: Josue Abreu <josue@grafana.com>

* Fix template URL

* Bugfix: don't send resolved when autoClose is false

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix integration tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix URLs in all other channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-20 10:12:08 +02:00
David Parrott
7a83d1f9ff
Alerting resend delay for sending to notifiers (#34312)
* adds resend delay to avoid saturating notifier

* correct method signatures

* pr feedback
2021-05-19 22:15:09 +02:00
Owen Diehl
8f350bc353
actually register metrics this time (#34444) 2021-05-19 22:09:12 +02:00
Ganesh Vernekar
533be16787
NGAlert: Add Threema notification channel (#34159)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:20:52 +02:00
Ganesh Vernekar
b2e84277a3
NGAlert: Add Kafka notification channel (#34156)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 20:02:09 +02:00
Ganesh Vernekar
ad1d0ae0bf
NGAlert: Add VictorOps notification channel (#34161)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:52:14 +02:00
Ganesh Vernekar
fb9223ab42
NGAlert: Add Line notification channel (#34157)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 19:04:48 +02:00
Domas
54c33c6cdd
Alerting: update email template (#34205) 2021-05-19 18:58:31 +02:00
Ganesh Vernekar
01e0faf800
NGAlert: Add GoogleChat notification channel (#34153)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 18:24:04 +02:00
David Parrott
a0f175c7a5
also don't allow negative intervalseconds (#34319) 2021-05-19 09:05:32 -07:00
David Parrott
b9f4ec2030
Add discord notifier channel and test (#34150)
* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* Add discord notifier channel and test

* Correct payload

* remove print statement

* PR feedback and update due to changes in main

* update constructor and tests

* group imports sensibly

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-19 17:31:55 +02:00
Sofia Papagiannaki
a79a4838b8
[Alerting]: Add Pushover integration with the alert manager (#34371)
* [Alerting]: Add Pushover integration with the alert manager

* lint

* Set boundary only for tests

* Remove title field

* fix imports
2021-05-19 16:48:46 +02:00
Owen Diehl
1d2febfa85
[Alerting] Route validations (#34393)
* more routing validation

* go mod

* recursive route validations
2021-05-19 10:36:28 -04:00
Arve Knudsen
9dfaa037d1
Alerting: Migrate Alertmanager notifier (#34304)
* Alerting: Port Alertmanager notifier to v8

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-05-19 15:27:41 +02:00
Owen Diehl
d6c4c2fcd5
[Alerting] Ensure upstream validations are run (#34333)
* use embedded validations via noop yaml unmarshaler

* lint

* fixes integration tests now that groupings are handled
2021-05-19 06:22:44 -04:00
Owen Diehl
c48c701791
adds missing metric name (#34307) 2021-05-18 17:24:38 -04:00
David Parrott
25485100b0
Alerting: Trim results when at processing instead of on ticker (#34248)
* Trim results when at processing instead of on ticker

* Use RWMutex correctly

* remove comment
2021-05-18 10:56:14 -07:00
David Parrott
bbb7bbf891
Alerting: Remove back end logic for supporting KeepLastState (#34242)
* Removed back end logic for supporting KeepLastState

* Map keep_state correctly in migrations
2021-05-18 10:55:43 -07:00
Sofia Papagiannaki
ff112f07e3
[Alerting]: Add Sensu Go integration with the alert manager (#34045)
* [Alerting]: Add sensugo notification channel

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Do not include labels with concatenated rule UID and names

* Modifications after syncing with main

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-05-18 17:31:51 +03:00
Sofia Papagiannaki
11243dec14
[Alerting]: Assign UUID to grafana receivers (#34241)
* [Alerting]: Assign UUID to grafana receivers

* Apply suggestions from code review

* Add test for updating invalid receiver

Co-authored-by: Domas <domasx2@gmail.com>
2021-05-18 17:31:00 +03:00
Kyle Brandt
63b2dd06a5
Alerting: Set "value" with evalmatches in G Managed (#34075)
When, and currently only when, using a classic condition, evaluation information is added (which is like the EvalMatches from dashboard alerting).

This is returned via the API and can be included in notifications by reading the `__value__` label attached to `.Alerts` in the template. It is a string.
2021-05-18 09:12:39 -04:00
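As an illustration of reading the `__value__` label described in #34075, here is a small sketch using Go's `text/template`. The `alert` and `templateData` types are hypothetical stand-ins for whatever the notification templates actually receive, and the label content used in `main` is a placeholder, not the real format of the value string.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alert and templateData are hypothetical stand-ins for the data passed to a
// notification template; only the Labels map matters for this sketch.
type alert struct {
	Labels map[string]string
}

type templateData struct {
	Alerts []alert
}

// renderValues writes the __value__ label of every alert, one per line.
func renderValues(data templateData) (string, error) {
	const text = `{{ range .Alerts }}{{ index .Labels "__value__" }}
{{ end }}`
	tmpl, err := template.New("values").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := templateData{Alerts: []alert{
		// Placeholder content: the real string format is not specified here.
		{Labels: map[string]string{"__value__": "placeholder"}},
	}}
	out, err := renderValues(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```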
Ganesh Vernekar
89c2b5e863
NGAlert: Remove unwanted fields from notification channel config (#34036)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-18 10:04:47 +02:00
gotjosh
6384f86fb9
Alerting: Allow the notifier to log (#34232)
* Alerting: Allow the notifier to log

The notifier upstream code uses go-kit as its logging library. The
grafana specific logger is not compatible with this API. In this PR, I
have created a wrapper that implements io.Writer to make them
compatible.
2021-05-17 18:06:47 +01:00
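The wrapper approach from #34232 can be sketched with the standard library alone. Everything named here (`Logger`, `logWriter`, `memLogger`) is hypothetical and stands in for the Grafana logger and the wrapper from the PR. The point is only that an `io.Writer` adapter lets a library that logs to a writer (go-kit's logfmt logger, for instance, is constructed from an `io.Writer`) feed an application-specific logger.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Logger is a hypothetical stand-in for an application-specific logger.
type Logger interface {
	Info(msg string)
}

// logWriter adapts a Logger to io.Writer so that code which logs to a writer
// can be pointed at it.
type logWriter struct {
	logger Logger
}

// Write forwards each write as one log line, dropping the trailing newline.
func (w logWriter) Write(p []byte) (int, error) {
	w.logger.Info(strings.TrimRight(string(p), "\n"))
	return len(p), nil
}

// memLogger records messages in memory; it exists only for this example.
type memLogger struct {
	lines []string
}

func (m *memLogger) Info(msg string) { m.lines = append(m.lines, msg) }

func main() {
	rec := &memLogger{}
	var w io.Writer = logWriter{logger: rec}
	fmt.Fprintln(w, "level=info msg=hello")
	fmt.Println(len(rec.lines), rec.lines[0])
}
```

Reporting `len(p)` as written, even though the newline is trimmed, keeps the adapter within the `io.Writer` contract, which treats a short count without an error as a failure.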
Kyle Brandt
331991ca10
UAlerting: Increase default max datapoints (#34223)
Change const value from 100 to 43200 (12 hours at 1sec interval)
2021-05-17 18:46:52 +02:00
Ganesh Vernekar
d5ae55c5dd
NGAlert: Add message field to email notification channel (#34044)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-17 16:05:09 +05:30
Owen Diehl
1367f7171e
Alerting/ruler metrics (#34144)
* adds active configurations metric

* rule evaluation metrics

* ruler metrics

* pr feedback
2021-05-14 16:13:44 -04:00
gotjosh
eb74994b8b
Alerting: Modify configuration apply and save semantics - v2 (#34143)
* Save default configuration to the database and copy over secure settings
2021-05-14 19:49:54 +01:00
Owen Diehl
fc90c36d50
removes unused db method (#34082) 2021-05-13 20:28:10 +02:00
Owen Diehl
baca873a84
extracts alertmanager from DI, including migrations (#34071)
* extracts alertmanager from DI, including migrations

* includes alertmanager Run method in ngalert

* removes 3s test shutdown timeout

* lint
2021-05-13 14:01:38 -04:00
Ganesh Vernekar
ec3214bac2
NGAlert: Add integration tests for notification channels (#33431)
* NGAlert: Add integration tests for notification channels

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the failing tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Override creation of rule UID, remove only namespace UID

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 22:58:19 +05:30
Kyle Brandt
babb17afd6
Alerting/Chore: Move tests from tests package (#34059)
Instead put in package folder but with package name suffixed with _test
This enables code coverage within the pkg while still allowing the tests to operate from an external-to-package perspective (only exported things).
2021-05-13 10:05:33 -04:00
Ganesh Vernekar
5f44ccff0c
NGAlert: Fix unit test to write files in temporary directory (#34032)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 16:08:12 +05:30
Kyle Brandt
3da8db7f3f
Alerting: Run table migrations regardless of feature flag and move out of service (#33996) 2021-05-12 14:39:48 -04:00
Owen Diehl
3b06f52bab
Alerting/allow empty receiver (#33962)
* simplifies yaml unmarshaling: PostableApiReceiver

* allow empty receiver type

* allows name only receivers (blackhole)

* better receiver type parsing

* linting
2021-05-12 07:58:16 -04:00
Kyle Brandt
a735c51202
Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
Ganesh Vernekar
8d442c9b44
NGAlert: Fix templating and remove unwanted default templates (#33918)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-12 15:13:43 +05:30
Sofia Papagiannaki
f4750fb3c8
[Alerting]: Alertmanager API apply permissions (#33843)
* [Alerting]: Alertmanager API apply permissions

* Apply suggestions from code review
2021-05-11 11:31:38 +03:00
Owen Diehl
e18ca8f6f2
enforce receivers align with backend type when posting AM config (#33877) 2021-05-10 16:58:41 -04:00
Sofia Papagiannaki
1c58fd380f
[Alerting]: store encrypted receiver secure settings (#33832)
* [Alerting]: Store secure settings encrypted

* Move encryption to the API handler
2021-05-10 15:30:42 +03:00
David Parrott
e58aca2d20
Alerting: remove instances from db and cache on rule update (#33722)
* remove instances from db and cache on rule update

* fix panic

* rename
2021-05-06 18:39:34 +02:00
Kyle Brandt
fae093bbe2
Alerting: Fix state cache getOrCreate panic (#33777) 2021-05-06 14:35:52 +02:00
Owen Diehl
a5ae8cf377
Unredact/secret (#33723)
* no longer redacts GETing proxied AM configs

* removes unused testfile

* testware fix

* consistently roundtrips yaml<>json and doesnt redact secrets

* lint
2021-05-05 16:21:53 -04:00
Ganesh Vernekar
1b8c0ce88b
NGAlert: Fix some TODOs in notification channels (#33739)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-05 17:48:40 +05:30
David Parrott
b1a8c67689
Alerting return evaluation errors to /rules (#33663)
* Set and return errors produced by evaluation results

* test fixup
2021-05-04 13:08:12 -04:00
David Parrott
39099bf3c0
Alerting nested state cache (#33666)
* nest cache by orgID, ruleUID, stateID

* update accessors to use new cache structure

* test and linter fixup

* fix panic

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* add comment to identify what's going on with nested maps in cache

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-05-04 09:57:50 -07:00
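The nesting described in #33666 (by orgID, then ruleUID, then stateID) can be pictured with the sketch below. The `State` and `cache` types are hypothetical simplifications, and the `sync.RWMutex` guard is borrowed from other commits in this log that mention it, not from this PR's description.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a hypothetical, stripped-down alert state.
type State struct {
	ID    string
	Value string
}

// cache nests states by org ID, then rule UID, then state ID.
type cache struct {
	mu     sync.RWMutex
	states map[int64]map[string]map[string]*State
}

func newCache() *cache {
	return &cache{states: make(map[int64]map[string]map[string]*State)}
}

// set stores a state, creating the intermediate maps on demand.
func (c *cache) set(orgID int64, ruleUID string, s *State) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.states[orgID]; !ok {
		c.states[orgID] = make(map[string]map[string]*State)
	}
	if _, ok := c.states[orgID][ruleUID]; !ok {
		c.states[orgID][ruleUID] = make(map[string]*State)
	}
	c.states[orgID][ruleUID][s.ID] = s
}

// get looks a state up; reading through a missing inner map is safe in Go.
func (c *cache) get(orgID int64, ruleUID, stateID string) (*State, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.states[orgID][ruleUID][stateID]
	return s, ok
}

// removeRule drops every state of one rule, e.g. when the rule is deleted.
func (c *cache) removeRule(orgID int64, ruleUID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.states[orgID], ruleUID)
}

func main() {
	c := newCache()
	c.set(1, "rule-a", &State{ID: "s1", Value: "Alerting"})
	s, ok := c.get(1, "rule-a", "s1")
	fmt.Println(ok, s.Value)
}
```

One consequence of the nesting is that removing all states for a rule becomes a single `delete` on the middle map instead of a scan over a flat key space.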
David Parrott
5072fefc22
allow saving pending alerts (#33667) 2021-05-04 09:24:20 -07:00
Sofia Papagiannaki
540f110220
[Alerting]: Extend quota service to optionally set limits on alerts (#33283)
* Quota: Extend service to set limit on alerts

* Add test for applying quota to alert rules

* Apply suggestions from code review

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Get used alert quota only if ngalert is enabled

* Set alert limit to zero if ngalert is not enabled
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
2021-05-04 19:16:28 +03:00
Ganesh Vernekar
918552d34b
NGAlert: Send list of available ngalert notification channels via API (#33489)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-04 13:58:39 +02:00
Kyle Brandt
48358efc13
Alerting: remove State cache entries on Ruler Delete (#33638)
for https://github.com/grafana/alerting-squad/issues/133
2021-05-03 14:01:33 -04:00
Owen Diehl
070627d11e
better handle metrics for state transitions (#33648) 2021-05-03 11:57:24 -04:00
Kyle Brandt
c1034f3118
Alerting: Create instanceStore (#33587)
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt
c2a5da79e3
Alerting: Avoid panic by not loading instances without a rule (#33597) 2021-05-01 19:01:28 +02:00
Kyle Brandt
759a0cd71b
Build: Fix with cleanup call maybe? (#33590) 2021-04-30 13:02:37 -07:00
Kyle Brandt
7823842c5d
Alerting: Load annotations from rule into State cache (#33542)
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt
b8f01fe034
Alerting: backend "ng" code cleanup (#33578) 2021-04-30 13:21:57 -04:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt
6c8ef2a9c2
Alerting: Alert Rule migration (#33000)
* Not complete, put migration behind env flag for now:
UALERT_MIG=iDidBackup
* Important to backup, and not expect the same DB to keep working until the env trigger is removed.
* Alerting: Migrate dashboard alert permissions
* Do not use imported models
* Change folder titles

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-04-29 13:24:37 -04:00
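The environment gate mentioned in #33000 (`UALERT_MIG=iDidBackup`) could be checked along these lines. The `migrationEnabled` function is a hypothetical illustration; how the real migration reads the variable is not shown in the commit message.

```go
package main

import (
	"fmt"
	"os"
)

// migrationEnabled is a hypothetical gate: the migration runs only when the
// operator has explicitly confirmed a backup via the environment variable.
// The lookup function is injected so the check is easy to exercise in tests.
func migrationEnabled(getenv func(string) string) bool {
	return getenv("UALERT_MIG") == "iDidBackup"
}

func main() {
	if migrationEnabled(os.Getenv) {
		fmt.Println("running alert rule migration")
	} else {
		fmt.Println("skipping alert rule migration")
	}
}
```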
Sofia Papagiannaki
1e380e869e
[Alerting]: some fixes (#33538)
* Fix failure when adding state annotations

* Fix get org rules API

Do not fail response if user has no access to view a namespace.
Do not include the namespace in the response instead.

* lint
2021-04-29 19:15:15 +03:00
Kyle Brandt
d32fcbe2bc
Alerting: Eval pkg tests and more specific error handling (#33496)
* comment updates
* more friendly error messages, in particular if it looks like time series data
2021-04-29 07:27:32 -04:00
Owen Diehl
ec37b4cb87
[Alerting] Automatic request instrumentation (#33444)
* alerting: automatic request instrumentation

* always expose alerting prom metrics

* globally register alerting metrics
2021-04-28 16:59:15 -04:00
Kyle Brandt
914443c816
Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
Sofia Papagiannaki
7ccb022c03
Alerting: validate condition before updating rulegroup (#33367)
* Alerting: validate condition before updating rulegroup

* Apply suggestions from code review
2021-04-28 11:31:51 +03:00
Kyle Brandt
b590e95682
AlertingAPI: Change list response query prop (#33419)
* Alerting: change to full []AlertQuery as json in a string and not just model.
2021-04-27 22:15:00 +02:00
Ganesh Vernekar
467ab124dd
NGAlert: Fix GET for Alertmanager config (#33379)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 20:48:19 +05:30
Kyle Brandt
adcba36d39
AlertingAPI: update swagger json files match datasourceUid change (#33332)
* update swagger json files match datasourceUid change
underlying change made in https://github.com/grafana/grafana/pull/33282
* Document DatasourceUID field in AlertQuery model
* Run spec generation from inside a docker container
* Generate latest spec

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-27 16:50:30 +02:00
Owen Diehl
86c8eed386
Instrument/ruler api (#33290)
* ruler api histogram instrumentation

* register ruler metrics
2021-04-27 08:25:32 -04:00
Ganesh Vernekar
be1affe0a4
NGAlert: Fix flaky test (#33415)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 17:03:22 +05:30
David Parrott
788bc2a793
Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs that have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00
David Parrott
ca79206498
Alerting: Handle NoData and Error evaluation results (#33194)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs that have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* not those annotations
2021-04-23 20:47:52 +02:00
Kyle Brandt
5e818146de
Alerting/Expr: New SSE Request/QueryType, alerting move data source UID (#33282) 2021-04-23 16:52:32 +02:00
Ganesh Vernekar
659ea20c3c
NGAlert: Run the maintenance cycle for the silences (#33301)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 16:19:03 +02:00
Ganesh Vernekar
d66a5e65a4
AlertingNG: Add webhook notification channel (#33229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 18:59:28 +05:30
Ganesh Vernekar
a0e567f80f
AlertingNG: Add Dingding notification channel (#32995)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 19:30:49 +02:00
Ganesh Vernekar
4ec1edfca3
AlertingNG: Add Teams notification channel (#32979)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 18:16:26 +02:00
Ganesh Vernekar
c9cd7ea701
AlertingNG: Add Telegram notification channel (#32795)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 17:24:59 +02:00
Ganesh Vernekar
0a03d5c29e
AlertingNG: Correctly set StartsAt, EndsAt, UpdatedAt after alert reception (#33109)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 20:42:18 +05:30
Ganesh Vernekar
3056f86f76
AlertingNG: Fix TODOs in email notification channel (#33169)
* AlertingNG: Fix TODOs in email notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test fixup

* Remove the receiver field it is not needed for the email notification

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-22 10:01:55 -04:00
Arve Knudsen
6408b55a7c
Slack: Use chat.postMessage API by default (#32511)
* Slack: Use only chat.postMessage API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Check for response error

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Support custom webhook URL

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Simplify

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rewrite tests to use stdlib

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/alerting/notifiers/slack.go

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>

* Clarify URL field name

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix linting issue

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix test

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix up new Slack notifier

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Improve tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix lint

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Make token not required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Alerting: Send validation errors back to client

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Document how token is required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Make recipient required when using Slack API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix field description

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>
2021-04-22 16:00:21 +02:00
Arve Knudsen
66020b419c
NGAlert: Consolidate on standard errors package (#33249)
* NGAlert: Don't use pkg/errors

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/ngalert/notifier/alertmanager.go

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>

* Fix logging

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
2021-04-22 11:18:25 +02:00
Sofia Papagiannaki
b2288f7ef9
[Alerting]: Add alerting endpoint for Query Evaluation (#33174)
* [Alerting]: Add alerting endpoint for Query Evaluation

* Fix passing down now parameter

* Add validations and test

* Fix eval queries and expressions test

* Add eval tests
2021-04-21 22:44:50 +03:00
gotjosh
de0802cf3b
Alerting: Fixes the integration test currently failing at master (#33233)
* Alerting: Fixes the integration test currently failing at master

* Skip the state tracker test for now
2021-04-21 14:57:17 -04:00
David Parrott
4be1d84f23
Alerting: Enhancements to /rules (#33085)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* pr feedback

* Do not initialize mutex unnecessarily

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* linter

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-21 09:30:03 -07:00
Sofia Papagiannaki
87a70af7eb
[Alerting]: Fix updating rule group and add tests (#33074)
* [Alerting]: Fix updating rule group and add test

* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule

* Test updating annotations

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* add test for posting an unknown rule UID

* Fix alert rule validation and add tests

* Remove org id from PostableGrafanaRule

This field was not used; each rule gets the organisation of the user making
the request

* Update pkg/tests/api/alerting/api_alertmanager_test.go

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-21 17:22:58 +03:00
gotjosh
23c7e7ab60
Alerting: Various fixes for the alerts endpoint (#33182)
A set of fixes for the GET alert and groups endpoints.

- First, is the fact that the default values were not being set for the query params. I've introduced a new method in the Grafana context that allows us to do this.
- Second, is the fact that alerts were never being transitioned to active. To my surprise this is actually done by the inhibitor in the pipeline - if an alert is not muted, or inhibited then it's active.
- Third, I have added an integration test to cover for regressions.

Signed-off-by: Josue Abreu <josue@grafana.com>
2021-04-21 06:34:42 -04:00
Owen Diehl
e065e19583
Fix/ngalert generation (#33172)
* fixes pkg names & alerting openapi generation

* cleans up api generation, uses docker & removes python
2021-04-20 13:12:32 -04:00
Oscar Kilhed
bc2d90f140
Fix lint issue in cortex ruler test (#33158) 2021-04-20 14:03:58 +02:00
Owen Diehl
e37a780e14
Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename folder->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Ganesh Vernekar
6271777ec6
AlertingNG: Remove the receivers field from postable alerts (#33068)
* AlertingNG: Remove the receivers field from postable alerts and update tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-19 12:28:44 +05:30
Kyle Brandt
2c862678ab
AlertingNG/SSE: Datasource UID and UID/ID only (drop name) (#33039)
SSE still will support ID until dashboard/frontend always requests UID
update *.http examples

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-16 15:29:19 +02:00
David Parrott
555da77527
Dparrott/labels on alert rule (#33057)
* move state tracker tests to /tests

* set default labels on alerts

* handle empty labels in result.Instance

* create annotation on transition to alerting state
2021-04-16 15:11:40 +02:00
gotjosh
362c4d4276
Alerting: Integration test rule creation (#33047)
* Alerting: Integration test rule creation

* Appease the linter

* Cleanup

* Make anonymous user role a parameter
2021-04-16 08:00:07 -04:00
David Parrott
2276e9556a
Alerting: set query in rules response (#33010)
* set query in rules response

* Theme: tweaking dark theme colors (#33007)

* Library Panels: Add library panel tab to share modal (#32953)

* Explore: Scroll split panes in Explore independently (#32978)

* Change default prometheus to latest and prometheus v1 to prometheus1

* Update README

* Remove prometheus1 block as not used

* Explore: Separate scrolling in split view

* Update snapshot

* Allow skip migrations in tests via environment variable (#32958)

* Dashboard: Fix issue where Slack notifications won't link to users (#32861)

* DashboardPage: refactored styles from sass to emotion (#32955)

* DashboardPage: refactored styles from sass to emotion

* refactored dashboardPage component to be a lot easier to read and understand

* more refactoring...

* more cleaning...

* fixes frontend test

* fixes frontend test- I hope

* fixes frontend test- I hope

* moves dashboard scss styles back to its standalone file

* GraphNG: use theme font family and size for axis labels (#33009)

* GraphNG: use theme font family and size for axis labels

* fix test

* AlertingNG: Slack notification channel (#32675)

* AlertingNG: Slack notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and small refactoring

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* GraphNG: stacking (#30749)

* First iteration

* Dev dash

* Re-use StackingMode type

* Fix ts and api issues

* Stacking work resurrected

* Fix overrides

* Correct values in tooltip and updated test dashboard

* Update dev dashboard

* Apply correct bands for stacking

* Merge fix

* Update snapshot

* Revert go.sum

* Handle null values correctly and make fillBelowTo and stacking mutually exclusive

* Snapshots update

* Graph->Time series stacking migration

* Review comments

* Indicate overrides in StandardEditorContext

* Change stacking UI editor, migrate stacking to object option

* Small refactor, fix for hiding series and dev dashboard

* VizLegend: sets a min and max value of the seriesCount control in Storybook (#33022)

* Alerting: Filter rules list (#32818)

* Chore: Reduces strict errors (#33012)

* Chore: reduces strict error in OptionPicker tests

* Chore: reduces strict errors in FormDropdownCtrl

* Chore: reduces has no initializer and is not definitely assigned in the constructor errors

* Chore: reduces has no initializer and is not definitely assigned in the constructor errors

* Chore: lowers strict count limit

* Tests: updates snapshots

* Tests: updates snapshots

* Chore: updates after PR comments

* Refactor: removes throw and changes signature for DashboardSrv.getCurrent

* [Alerting]: Several modifications in alert rules (#32983)

* [Alerting]: Use common properties for all rules

* Add Labels in rules

* Fix update ruleGroup API

Return 400 Bad Request response
when the request contains a UID that does not exist

* Check permissions and return namespace id

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* WIP (#33025)

* Chore: Bump strict error count limit (#33035)

* set query in rules response

Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: kay delaney <45561153+kaydelaney@users.noreply.github.com>
Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>
Co-authored-by: Dafydd <72009875+dafydd-t@users.noreply.github.com>
Co-authored-by: n-wbrown <n-wbrown@users.noreply.github.com>
Co-authored-by: Uchechukwu Obasi <obasiuche62@gmail.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Dominik Prokop <dominik.prokop@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
Co-authored-by: Hugo Häggmark <hugo.haggmark@grafana.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
2021-04-15 22:23:16 +02:00
Sofia Papagiannaki
6bbb2fd4ba
[Alerting]: Several modifications in alert rules (#32983)
* [Alerting]: Use common properties for all rules

* Add Labels in rules

* Fix update ruleGroup API

Return 400 Bad Request response
when the request contains a UID that does not exist

* Check permissions and return namespace id

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-15 15:54:37 +03:00
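The "Fix update ruleGroup API" item in the commit above (400 Bad Request for a UID that does not exist) can be illustrated with a small Go sketch. This is not the code from the commit: `statusForGroupUpdate` is a hypothetical helper, and both the 202 success status and the treatment of an empty UID as "create a new rule" are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusForGroupUpdate returns the HTTP status for a rule group update.
// existing holds the UIDs already stored; requestUIDs are the UIDs in the payload.
// Hypothetical helper for illustration only.
func statusForGroupUpdate(existing map[string]struct{}, requestUIDs []string) int {
	for _, uid := range requestUIDs {
		if uid == "" {
			// Assumption: a rule without a UID is a new rule to be created.
			continue
		}
		if _, ok := existing[uid]; !ok {
			// The payload refers to a rule that does not exist.
			return http.StatusBadRequest
		}
	}
	// Assumption: a successful update is acknowledged with 202 Accepted.
	return http.StatusAccepted
}

func main() {
	existing := map[string]struct{}{"abc": {}}
	fmt.Println(statusForGroupUpdate(existing, []string{"abc", ""}))
	fmt.Println(statusForGroupUpdate(existing, []string{"missing"}))
}
```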
Ganesh Vernekar
04a8d5407e
AlertingNG: Slack notification channel (#32675)
* AlertingNG: Slack notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and small refactoring

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-15 16:01:41 +05:30
Sofia Papagiannaki
624fbf5dda
[Alerting]: Fix empty rules evaluation statuses (#32997)
* [Alerting]: Fix empty rules evaluation statuses

`GetRuleGroupAlertRules()` requires a non-empty namespaceUID

* Include the namespace into the response
2021-04-14 17:49:26 +00:00
Sofia Papagiannaki
8848d825e0
[Alerting]: Use title instead of slug for retrieving the namespace (#32957)
* [Alerting]: Use title instead of slug for retrieving the namespace

* Apply suggestions from code review

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-04-14 12:02:50 +03:00
David Parrott
567a6a09bd
Alerting: Return RuleResponse for api/prometheus/grafana/api/v1/rules (#32919)
* Return RuleResponse for api/prometheus/grafana/api/v1/rules

* change TODO to note

Co-authored-by: gotjosh <josue@grafana.com>

* pr feedback

* test fixup

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-13 17:38:09 -04:00
Sofia Papagiannaki
e7ff04a167
[Alerting]: Implement test rule API route (#32837)
* [Alerting]: Implement test rule API route

* Apply suggestions from code review

* Call /query instead of /query_range
2021-04-13 20:58:34 +03:00
gotjosh
528ca9134b
Alerting: Use a default configuration and periodically poll for new ones (#32851)
* Alerting: Use a default configuration and periodically poll for new ones

Use a default configuration to make sure we always start the grafana
instance. Then, regularly poll for new ones.

I've also made sure that failures to apply configuration do not stop the
Grafana server but instead keep polling until it is a success.
2021-04-13 13:02:44 +01:00
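The behaviour described in the commit above (start from a default configuration, then keep polling without letting a failed apply stop the server) can be sketched in Go. This is an illustrative sketch only, not the code from the commit: `Config`, `Store`, `Manager`, `SyncOnce`, `Run` and `flakyStore` are hypothetical names, and the poll interval is an assumed parameter.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Config is a hypothetical stand-in for a notification configuration.
type Config struct{ Raw string }

// Store is a hypothetical source of saved configurations.
type Store interface {
	Latest() (Config, error)
}

// Manager holds the currently applied configuration.
type Manager struct {
	store   Store
	current Config
}

// NewManager starts from the default so that start-up never depends on the store.
func NewManager(store Store, def Config) *Manager {
	return &Manager{store: store, current: def}
}

// SyncOnce tries to apply the latest stored configuration.
// On failure it keeps the current configuration and returns the error.
func (m *Manager) SyncOnce() error {
	cfg, err := m.store.Latest()
	if err != nil {
		return err
	}
	m.current = cfg
	return nil
}

// Run polls until ctx is cancelled; failures are logged and retried, never fatal.
func (m *Manager) Run(ctx context.Context, every time.Duration) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			if err := m.SyncOnce(); err != nil {
				fmt.Println("config sync failed, will retry:", err)
			}
		}
	}
}

// flakyStore fails on the first call and succeeds afterwards (demo only).
type flakyStore struct{ calls int }

func (s *flakyStore) Latest() (Config, error) {
	s.calls++
	if s.calls == 1 {
		return Config{}, errors.New("no configuration stored yet")
	}
	return Config{Raw: "stored"}, nil
}

func main() {
	m := NewManager(&flakyStore{}, Config{Raw: "default"})
	_ = m.SyncOnce() // fails: the default stays in place
	fmt.Println(m.current.Raw)
	_ = m.SyncOnce() // succeeds: the stored configuration is applied
	fmt.Println(m.current.Raw)
}
```

Keeping the single sync step separate from the ticker loop makes the failure path testable without timers.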
Sofia Papagiannaki
54689f2739
[Alerting]: Fix YAML encoding/decoding issues when proxying lotex requests (#32854)
* [Alerting]: Fix GET lotex rule group config

* [Alerting]: Fix POST lotex user config
2021-04-12 12:04:37 +03:00
Kyle Brandt
80dfa83380
AlertingNG: Add For+Annotations to Grafana_Alert (#32793)
* add db columns
* Fix deserialisation issue of AlertRule For field (#32848)
* Update to latest alerting-api

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-04-09 16:50:04 +02:00
gotjosh
c9e5088e8b
Alerting: Cleanup and move legacy to a legacy file (#32803)
* Alerting: Cleanup and move legacy to a legacy file

A quick cleanup of the ngalert/api directory, optimising for an easy
removal of what will be considered legacy at some point. A quick
summary of what's done is:

- Add a `generated` prefix to files that are auto-generated by
  our swagger definitions.
- Create a legacy file to hold all the legacy API route implementations
  and helpers, and delete the files that were no longer needed after this
  move.
- Rename the `lotex` file to `lotex_ruler`
- Add a couple of comments here and there.

With this, I hope to organise our code in this directory a bit better
given there's a lot going on.
2021-04-09 05:55:41 -04:00
Ganesh Vernekar
e3a1d3d158
AlertingNG: PagerDuty notification channel (#32604)
* AlertingNG: PagerDuty notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 19:21:09 +02:00
Ganesh Vernekar
b1c84c795f
AlertingNG: Add a global registry for notification channels (#32781)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 22:01:23 +05:30
gotjosh
fe67680c42
Alerting: Allow querying of Alerts from notifications (#32614)
* Alerting: Allow querying of Alerts from notifications

* Wire everything up

* Remove unused functions

* Remove duplicate line
2021-04-08 07:27:59 -04:00
Owen Diehl
8b8fc293b7
safer, more idiomatic proxy helper (#32732)
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-07 15:36:50 -04:00
Kyle Brandt
d519913843
AlertingNG: Temp endpoint to translate dashboard alert into rule group (#32694)
* Set NoData and ExecErr states
* make save an option
* TODOs
* adjust interval
* FOR and alertRuleTags not done yet
2021-04-07 14:28:06 +02:00
Ganesh Vernekar
0f7d8ae6d2
Update email template for AlertingNG (#32691)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-07 14:52:48 +05:30
Domas
a56293142a
Alerting: unified alerting frontend (#32708) 2021-04-07 08:42:43 +03:00
Sofia Papagiannaki
9d7d33ebb3
[Alerting]: Require login for alerting API routes (#32688)
* [Alerting]: Require login for alerting API routes

* Fix example requests

* [Alerting]: Add /api prefix to all routes (#32716)
2021-04-06 17:22:05 +03:00
David Parrott
c0d83fc01e
Alerting: Return cached alerts for prometheus/api/v1/alerts (#32654)
* Return cached alerts for prometheus/api/v1/alerts

* Return not implemented for /prometheus/grafana/api/v1/rules

* Set StartsAt for already alerting states

* Fix tests
2021-04-05 15:05:39 -07:00
Sofia Papagiannaki
a80f31a5c8
Alerting: 404 error status when no alertmanager configuration (#32651) 2021-04-05 17:03:00 +03:00
Sofia Papagiannaki
daabf64aa1
[Alerting]: Update scheduler to evaluate rules created by the unified API (#32589)
* Update scheduler

* Fix tests

* Fixes after code review feedback

* lint - add uncommitted modifications

Co-authored-by: kyle <kyle@grafana.com>
2021-04-03 20:13:29 +03:00
Kyle Brandt
948da25c13
ngalerting: represent nil/empty labels the same (#32652)
was getting duplicates of [] and null before
2021-04-02 13:49:45 -04:00
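The duplicate `[]` and `null` values mentioned in the commit above match how Go's `encoding/json` encodes slices: a nil slice marshals to `null` and an empty non-nil slice to `[]`. A minimal sketch of one way to make both cases produce the same key follows. `labelsKey` is a hypothetical helper, not the function from the commit, and encoding labels as sorted name/value pairs is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// labelsKey returns a canonical JSON encoding of a label set.
// Hypothetical helper: nil and empty maps both encode as "[]".
func labelsKey(labels map[string]string) (string, error) {
	// A non-nil slice marshals to "[]"; a nil slice would marshal to "null".
	tuples := make([][2]string, 0, len(labels))
	for k, v := range labels {
		tuples = append(tuples, [2]string{k, v})
	}
	// Sort by label name so the key does not depend on map iteration order.
	sort.Slice(tuples, func(i, j int) bool { return tuples[i][0] < tuples[j][0] })
	b, err := json.Marshal(tuples)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	a, _ := labelsKey(nil)
	b, _ := labelsKey(map[string]string{})
	fmt.Println(a == b, a)
}
```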
Kyle Brandt
7fcb6ecb91
Alerting: Fix persistence migration (#32650) 2021-04-02 18:31:03 +02:00
Sofia Papagiannaki
0e350ae6c8
Remove more dead code (#32645) 2021-04-02 18:24:27 +03:00
David Parrott
2a8446e435
Alerting: Persist alerts on evaluation and shutdown. Warm cache from DB on startup (#32576)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* Save alert transitions and save all state on shutdown

* pr feedback

* WIP

* WIP

* Persist alerts on evaluation and shutdown. Warm cache on startup

* Filter non-firing alerts before sending to notifier

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-02 08:11:33 -07:00
Kyle Brandt
6ad02315eb
ngalert: json dataframe on temp endpoints (#32641) 2021-04-02 15:52:38 +02:00
Ryan McKinley
c7ea96940a
Arrow: move arrow support from frontend to backend only (#32575) 2021-04-01 10:30:08 -07:00
Sofia Papagiannaki
8793f5c7f8
[Alerting]: Delete obsolete database table and code (#32595)
* Delete obsolete migration

* Remove redundant code
2021-04-01 19:41:57 +03:00
Sofia Papagiannaki
ee06970d72
[Alerting]: Grafana managed ruler API implementation (#32537)
* [Alerting]: Grafana managed ruler API impl

* Apply suggestions from code review

* fix lint

* Add validation for ruleGroup name length

* Fix MySQL migration

Co-authored-by: kyle <kyle@grafana.com>
2021-04-01 11:11:45 +03:00
Sofia Papagiannaki
a5e95823b2
[Alerting]: Alertmanager API implementation (#32174)
* Add validation for grafana recipient

* Alertmanager API implementation (WIP)

* Fix encoding/decoding receiver settings from/to YAML

* Save templates together with the configuration

* update POST to apply latest config

* Alertmanager service enabled by the ngalert toggle

* Silence API integration with Alertmanager

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-31 23:00:56 +03:00
gotjosh
433f6b91d0
Alerting: Introduce the silencing interface (#32517)
* Alerting: Introduce the silencing interface

The operations introduced are:

- Listing silences
- Retrieving a specific silence
- Deleting a silence
- Creating a silence

Signed-off-by: Josue Abreu <josue@grafana.com>

* Add a comment to listing silences

* Update to upstream alertmanager

* Remove copied code from the Alertmanager
2021-03-31 07:36:36 -04:00
David Parrott
b1cb74c0c9
Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
Sofia Papagiannaki
c4d5a67b38
[Alerting] Forking alert manager API (#32300)
* Alertmanager lotex ruler

* Apply suggestions from code review
2021-03-29 18:18:25 +03:00
Ganesh Vernekar
740c5813d4
AlertingNG: Fix dispatcher metrics in notifier (#32434)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-29 20:35:15 +05:30
Domas
0779dab0de
NgAlerting: loki & cortex have different prom & ruler endpoint prefixes (#32344) 2021-03-29 08:55:09 +03:00
Ganesh Vernekar
a0db4dce32
Render new email template and fix the title (#32314)
* Render new email template and fix the title

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nit

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-26 12:53:14 +05:30
Ganesh Vernekar
093e5947f4
Upgrade Prometheus Alertmanager and small fixes (#32280)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-25 12:51:44 +01:00
David Parrott
d33a77a67f
Alerting: add state tracker to alerting evaluation (#32298)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup
2021-03-24 15:34:18 -07:00
gotjosh
58b814bd7d
Alerting: Add StartAt and FiredAt to the alert evaluation result (#32302) 2021-03-24 20:27:04 +00:00
Kyle Brandt
66548878fe
ngalert: add additional temp translation endpoint (#32287)
spits out new sse condition/data json from old alert id to help with generating UI models
also moves this api code into another file
2021-03-24 17:12:34 +01:00
gotjosh
9b52ffc6a9
Alerting: Fetch configuration from the database and run a notification service (#32175)
* Alerting: Fetch configuration from the database and run a notification
instance

Co-Authored-By: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-24 14:20:44 +00:00
Kyle Brandt
05fa9d6ad9
AlertingNG: rename Condition properties to match front end (#32267) 2021-03-24 08:07:29 -04:00
Owen Diehl
2179a2658e
Extricates reusable utilities for different alerting proxy types (#32268)
* backendtype helper

* abstracts alertingproxy

* updates alerting api dep

* prom endpoints
2021-03-24 07:43:25 -04:00
Kyle Brandt
7bb79158ed
SSE/Alerting: First pass at query/condition translation (#31693)
- Takes the conditions property from the settings column of an alert from alerts table and turns it into an ng alerting condition with the queries and classic condition.
- Has a temp REST API endpoint that will take the dashboard conditions JSON, translate it to SSE queries + classic condition, and execute it (only enabled in dev mode).
- Changes expressions to catch query responses with a non-nil error property
- Adds two new states for an NG instance result (NoData, Error) and updates evaluation to match those states
- Changes the AsDataFrame (for frontend) from Bool to string to represent additional states
- Fix bug in condition model to accept first Operator as empty string.
- In ngalert, adds GetQueryDataRequest, which was part of execute and is still called from there. This makes it possible to get the Expression request from a condition so that the "pipeline" can be built.
- Update AsDataFrame for evalresult to be row based so it displays a little better for now
2021-03-23 12:11:15 -04:00
Sofia Papagiannaki
24cb059a6b
[Alerting]: implement backend checking for forking to Lotex ruler (#32208)
* Rename DatasourceId path parameter

* Implement fork ruler backendType()

* Apply suggestions from code review
2021-03-23 18:08:57 +02:00
Owen Diehl
93d0f7163f
[Alerting] Forking LoTex ruler (#32138)
* updates alerting api to master

* skeleton for lotex ruler

* withPath helper & legacyRulerPrefix const

* forked ruler

* wires up proxy

* safeMacaronWrapper

* working proxy

* jsonExtractor

* lint
2021-03-19 10:32:13 -04:00
Ganesh Vernekar
8854001b67
AlertingNG: Refactor notifier to support config reloads (#32099)
* AlertingNG: Refactor notifier to support config reloads

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and make reloading of config a sync operation

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-19 09:26:00 +01:00
gotjosh
cc74b1fe46
Alerting: Add database table for persisting alerting configuration (#32042)
* Alerting: Add database table for persisting alerting configuration

* Fix the linter

* Address review comments

* Don't split templates and configuration

It is already bundled together as part of the API, so we might as well
marshal it directly.
2021-03-18 18:12:28 +00:00
Ganesh Vernekar
0b788b5ce8
AlertingNG: Notification channel for emails (#31768)
* Email notification channel in ngalert

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use existing templating system

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Update template and add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-18 14:55:11 +00:00
Ganesh Vernekar
974ccf8091
AlertingNG: Fix the alerting stage for legacy alerts (#32025)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-17 16:38:33 +00:00
Sofia Papagiannaki
68b05b8aaa
AlertingNG: Unified alerting API mock (#32040)
* AlertingNG: Alertmanager mock API

* AlertingNG: Remove permissions API routes

* Add example POST payloads

* Prometheus and testing mock API
2021-03-17 12:47:03 +02:00
Ganesh Vernekar
ecbc98ba5d
AlertingNG: Add alert provider and basic structure with dispatcher, silences and delivery stages (#31833)
* AlertingNG: Add alert provider

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Alertmanager WIP

* Merge alertmanager into notifier

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fixes for PR 31833 (#31990)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use alertmanager from upgrad-uuid temporarily to unblock

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix lint

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-16 10:44:52 +01:00
Sofia Papagiannaki
fe628c6282
AlertingNG: base API implementation (#31824)
* AlertingNG: base API implementation

* Pass the interface instead of the base impl

* Ruler mock draft (WIP)

* Update alerting-api dependency

* Improve mock implementation
2021-03-11 21:28:00 +02:00
Sofia Papagiannaki
53bccf1b77
Replace eval.Condition with models.Condition (#31909) 2021-03-11 18:56:58 +02:00
Sofia Papagiannaki
4ce0a49eac
AlertingNG: Split into several packages (#31719)
* AlertingNG: Split into several packages

* Move AlertQuery to models
2021-03-08 22:19:21 +02:00
Arve Knudsen
b79e61656a
Introduce TSDB service (#31520)
* Introduce TSDB service

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Erik Sundell <erik.sundell87@gmail.com>
Co-authored-by: Will Browne <will.browne@grafana.com>
Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Zoltán Bedi <zoltan.bedi@gmail.com>
2021-03-08 07:02:49 +01:00
Sofia Papagiannaki
bd2390c49f
AlertingNG: code refactoring (#30787)
* AlertingNG: refactoring

* Fix tests
2021-03-03 17:52:19 +02:00
Hugo Häggmark
1725bf773f
Chore: Fixes small typos (#31461) 2021-02-25 08:59:26 +01:00
Peter Holmberg
aaf5710748
AlertingNG: Edit Alert Definition (#30676)
* break out new and edit

* changed model to match new model in backend

* AlertingNG: API modifications (#30683)

* Fix API consistency

* Change eval alert definition to POST request

* Fix eval endpoint to accept custom now parameter

* Change JSON input property for create/update endpoints

* model adjustments

* set mixed datasource, fix put url

* update snapshots

* remove edit and add landing page

* remove snapshot tests and snapshots

* wrap linkbutton in array

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-02-04 09:13:02 +01:00
Sofia Papagiannaki
5d029abc42
AlertingNG: change API permissions (#30781) 2021-02-02 10:37:01 +02:00
Sofia Papagiannaki
1c158744e8
AlertingNG: pause/unpause definitions via the API (#30627)
* AlertingNG: pause/unpause definitions via the API

* Apply suggestions from code review

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>

* Enable pausing/unpausing multiple definitions

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
2021-01-27 15:51:00 +02:00
Sofia Papagiannaki
9ada4b6052
Expressions: Add option to disable feature (#30541)
* Expressions: Add option to disable feature

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-01-22 19:27:33 +02:00
Sofia Papagiannaki
b1debc9c46
AlertingNG: Enforce unique alert definition title (non empty)/UID per organisation (#30380)
* Enforce unique alert definition title/uid per org

* Remove print statement from test

* Do not allow empty alert definition titles

* update error message on dup title

* also add title error to update

* CamelCase json properties

* Add test for title unique enforcement in updates

Co-authored-by: kyle <kyle@grafana.com>
2021-01-19 19:11:11 +02:00
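The rule enforced by the commit above (titles must be non-empty and unique per organisation, on create and on update) can be sketched with an in-memory store. This is an illustration under stated assumptions, not the commit's implementation: `definition`, `memStore`, `save` and the two error values are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errEmptyTitle     = errors.New("alert definition title must not be empty")
	errDuplicateTitle = errors.New("alert definition title already used in this organisation")
)

// definition is a hypothetical, reduced alert definition.
type definition struct {
	OrgID int64
	UID   string
	Title string
}

// key identifies a definition by organisation and UID.
type key struct {
	orgID int64
	uid   string
}

// memStore is an in-memory stand-in for the database table.
type memStore struct{ defs map[key]definition }

func newMemStore() *memStore { return &memStore{defs: map[key]definition{}} }

// save creates or updates a definition, rejecting empty titles and
// titles already used by a different definition in the same organisation.
func (s *memStore) save(d definition) error {
	if d.Title == "" {
		return errEmptyTitle
	}
	for _, other := range s.defs {
		if other.OrgID == d.OrgID && other.Title == d.Title && other.UID != d.UID {
			return errDuplicateTitle
		}
	}
	s.defs[key{orgID: d.OrgID, uid: d.UID}] = d
	return nil
}

func main() {
	s := newMemStore()
	fmt.Println(s.save(definition{OrgID: 1, UID: "a", Title: "cpu"}))
	fmt.Println(s.save(definition{OrgID: 1, UID: "b", Title: "cpu"}))
	fmt.Println(s.save(definition{OrgID: 2, UID: "c", Title: "cpu"}))
}
```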
Sofia Papagiannaki
8c31e25926
AlertingNG: Save alert instances (#30223)
* AlertingNG: Save alert instances

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Rename alert instance fields/columns

* Include definition title in listing alert instances

* Delete instances when deleting definition

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-01-18 20:57:17 +02:00
Sofia Papagiannaki
2b15581339
AlertingNG: Modify queries and transform endpoint to get datasource UIDs (#30297)
* Pass skipCache from context

* Use macaron Params instead of ParamsEscape for UIDs

* Modify queries and transform to get datasource UIDs

* Update github.com/grafana/grafana-plugin-sdk-go to v0.83.0
2021-01-15 18:33:50 +02:00
Hugo Häggmark
3d41267fc4
Chore: Moves common and response into separate packages (#30298)
* Chore: moves common and response into separate packages

* Chore: moves common and response into separate packages

* Update pkg/api/utils/common.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Chore: changes after PR comments

* Chore: move wrap to routing package

* Chore: move functions in common to response package

* Chore: move functions in common to response package

* Chore: formats imports

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-01-15 14:43:20 +01:00
Sofia Papagiannaki
551428af61
Fix alert definition routine stop (#30117) 2021-01-11 16:14:03 +02:00
Sofia Papagiannaki
5560be73bf
Alerting NG: update API to expect UIDs instead of IDs (#29896)
* Change API to expect UIDs instead of ID

* Remove unnecessary transactions

When only one query is executed

* Modify API responses

* Cleanup tests

* Use orgID and UID globally for identifying alert definitions
2021-01-07 17:45:42 +02:00
Sofia Papagiannaki
3cac10e598
AlertingNG: Create a scheduler to evaluate alert definitions (#29305)
* Always use cache: stop passing skipCache among ngalert functions

* Add updated column

* Scheduler initial draft

* Add retry on failure

* Allow setting/updating alert definition interval

Set default interval if no interval is provided during alert definition creation.
Keep existing alert definition interval if no interval is provided during alert definition update.

* Parameterise alerting.Ticker to run on custom interval

* Allow updating alert definition interval without having to provide the queries and expressions

* Add schedule tests

* Use xorm tags for having initialisms with consistent case in Go

* Add ability to pause/unpause the scheduler

* Add alert definition versioning

* Optimise scheduler to fetch alert definition only when it's necessary

* Change MySQL data column to mediumtext

* Delete alert definition versions

* Increase default scheduler interval to 10 seconds

* Fix setting OrgID on updates

* Add validation for alert definition name length

* Recreate tables
2020-12-17 16:00:09 +02:00
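The interval rules stated in the commit above (use a default when creating without an interval, keep the existing interval when updating without one) can be sketched in Go. `intervalOnCreate`, `intervalOnUpdate` and the value of `defaultIntervalSeconds` are hypothetical and chosen only for illustration. A pointer stands in for "not provided" so that an explicit value can be told apart from a missing one.

```go
package main

import "fmt"

// defaultIntervalSeconds is an assumed value for illustration only.
const defaultIntervalSeconds int64 = 60

// intervalOnCreate returns the requested interval, or the default if none was provided.
func intervalOnCreate(requested *int64) int64 {
	if requested == nil {
		return defaultIntervalSeconds
	}
	return *requested
}

// intervalOnUpdate returns the requested interval, or keeps the existing one
// if the update did not provide an interval.
func intervalOnUpdate(existing int64, requested *int64) int64 {
	if requested == nil {
		return existing
	}
	return *requested
}

func main() {
	thirty := int64(30)
	fmt.Println(intervalOnCreate(nil), intervalOnCreate(&thirty))
	fmt.Println(intervalOnUpdate(30, nil))
}
```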
Kyle Brandt
6d64c603c2
Expr: fix failure to execute due to OrgID (#29653)
* Expr: fix failure to execute due to OrgID
Get orgID from the plugin context, which makes more sense anyways.
makes expressions work again after https://github.com/grafana/grafana/pull/29449 changes.

* Do not save organisation on its alert query model

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2020-12-07 10:30:38 -05:00
Kyle Brandt
0cb29d337a
Expressions: Move GEL into core as expressions (#29072)
* comes from grafana/gel-app
* remove transform plugin code
* move __expr__ and -100 constants to expr pkg
* set OrgID on request plugin context
* use gtime for resample duration
* in resample, rename "rule" to "window", use gtime for duration, parse duration before exec
* remove gel entry from plugins-bundled/external.json
which creates an empty array for plugins
2020-11-19 07:17:00 -05:00
Sofia Papagiannaki
43f580c299
AlertingNG: manage and evaluate alert definitions via the API (#28377)
* Alerting NG: prototype v2 (WIP)

* Separate eval package

* Modify eval alert definition endpoint

* Disable migration if ngalert is not enabled

* Remove premature test

* Fix lint issues

* Delete obsolete struct

* Apply suggestions from code review

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Add API endpoint for listing alert definitions

* Introduce index for alert_definition table

* make ds object for expression to avoid panic

* wrap error

* Update pkg/services/ngalert/eval/eval.go

* Switch to backend.DataQuery

* Export TransformWrapper callback

* Fix lint issues

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Validate alert definitions before storing them

* Introduce AlertQuery

* Add test

* Add QueryType in AlertQuery

* Accept only float64 (seconds) durations

* Apply suggestions from code review

* Get rid of bus

* Do not export symbols

* Fix failing test

* Fix failure due to service initialization order

Introduce MediumHigh service priority and assign it to backendplugin
service

* Fix test

* Apply suggestions from code review

* Fix renamed reference

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2020-11-12 15:11:30 +02:00
Arve Knudsen
7897c6b7d5
Chore: Fix staticcheck issues (#28854)
* Chore: Fix issues reported by staticcheck

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Undo changes

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2020-11-05 11:57:20 +01:00
Kyle Brandt
44a795cb17
AlertingNG: remove warn/crit from eval prototype (#28334)
and misc cleanup
2020-10-16 12:33:57 -04:00
Sofia Papagiannaki
4ab90f9397
Fix linting: remove commented code (#28208) 2020-10-13 08:35:41 +02:00
Sofia Papagiannaki
4acbcd7053
AlertingNG: POC of evaluator under feature flag. (#27922)
* New feature toggle for enabling alerting NG

* Initial commit

* Modify evaluate alert API request

* Check for unique labels in alert execution result dataframes

* Remove print statement

* Additional minor fixes/comments

* Fix lint issues

* Add API endpoint for evaluating panel queries

* Push missing renaming

* add refId for condition to API

* add refId for condition to API

* switch dashboard based eval to get method

* add from/to params to dashboard based eval

* add from/to params to eval endpoint

Co-authored-by: kyle <kyle@grafana.com>
2020-10-12 21:51:39 +03:00