Ganesh Vernekar
d5ae55c5dd
NGAlert: Add message field to email notification channel ( #34044 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-17 16:05:09 +05:30
Owen Diehl
1367f7171e
Alerting/ruler metrics ( #34144 )
...
* adds active configurations metric
* rule evaluation metrics
* ruler metrics
* pr feedback
2021-05-14 16:13:44 -04:00
gotjosh
eb74994b8b
Alerting: Modify configuration apply and save semantics - v2 ( #34143 )
...
* Save default configuration to the database and copy over secure settings
2021-05-14 19:49:54 +01:00
Owen Diehl
fc90c36d50
removes unused db method ( #34082 )
2021-05-13 20:28:10 +02:00
Owen Diehl
baca873a84
extracts alertmanager from DI, including migrations ( #34071 )
...
* extracts alertmanager from DI, including migrations
* includes alertmanager Run method in ngalert
* removes 3s test shutdown timeout
* lint
2021-05-13 14:01:38 -04:00
Ganesh Vernekar
ec3214bac2
NGAlert: Add integration tests for notification channels ( #33431 )
...
* NGAlert: Add integration tests for notification channels
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix the failing tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Override creation of rule UID, remove only namespace UID
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 22:58:19 +05:30
Kyle Brandt
babb17afd6
Alerting/Chore: Move tests from tests package ( #34059 )
...
Instead put them in the package folder, but with the package name suffixed with _test.
This enables code coverage within the pkg while still allowing the tests to operate from a perspective external to the package (only exported things).
2021-05-13 10:05:33 -04:00
Ganesh Vernekar
5f44ccff0c
NGAlert: Fix unit test to write files in temporary directory ( #34032 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-13 16:08:12 +05:30
Kyle Brandt
3da8db7f3f
Alerting: Run table migrations regardless of feature flag and move out of service ( #33996 )
2021-05-12 14:39:48 -04:00
Owen Diehl
3b06f52bab
Alerting/allow empty receiver ( #33962 )
...
* simplifies yaml unmarshaling: PostableApiReceiver
* allow empty receiver type
* allows name only receivers (blackhole)
* better receiver type parsing
* linting
2021-05-12 07:58:16 -04:00
Kyle Brandt
a735c51202
Alerting/Chore: Backend remove def_ columns from instance ( #33875 )
...
Renames def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
Ganesh Vernekar
8d442c9b44
NGAlert: Fix templating and remove unwanted default templates ( #33918 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-12 15:13:43 +05:30
Sofia Papagiannaki
f4750fb3c8
[Alerting]: Alertmanager API apply permissions ( #33843 )
...
* [Alerting]: Alertmanager API apply permissions
* Apply suggestions from code review
2021-05-11 11:31:38 +03:00
Owen Diehl
e18ca8f6f2
enforce receivers align with backend type when posting AM config ( #33877 )
2021-05-10 16:58:41 -04:00
Sofia Papagiannaki
1c58fd380f
[Alerting]: store encrypted receiver secure settings ( #33832 )
...
* [Alerting]: Store secure settings encrypted
* Move encryption to the API handler
2021-05-10 15:30:42 +03:00
David Parrott
e58aca2d20
Alerting: remove instances from db and cache on rule update ( #33722 )
...
* remove instances from db and cache on rule update
* fix panic
* rename
2021-05-06 18:39:34 +02:00
Kyle Brandt
fae093bbe2
Alerting: Fix state cache getOrCreate panic ( #33777 )
2021-05-06 14:35:52 +02:00
Owen Diehl
a5ae8cf377
Unredact/secret ( #33723 )
...
* no longer redacts GETing proxied AM configs
* removes unused testfile
* testware fix
* consistently roundtrips yaml<>json and doesn't redact secrets
* lint
2021-05-05 16:21:53 -04:00
Ganesh Vernekar
1b8c0ce88b
NGAlert: Fix some TODOs in notification channels ( #33739 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-05 17:48:40 +05:30
David Parrott
b1a8c67689
Alerting return evaluation errors to /rules ( #33663 )
...
* Set and return errors produced by evaluation results
* test fixup
2021-05-04 13:08:12 -04:00
David Parrott
39099bf3c0
Alerting nested state cache ( #33666 )
...
* nest cache by orgID, ruleUID, stateID
* update accessors to use new cache structure
* test and linter fixup
* fix panic
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* add comment to identify what's going on with nested maps in cache
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-05-04 09:57:50 -07:00
David Parrott
5072fefc22
allow saving pending alerts ( #33667 )
2021-05-04 09:24:20 -07:00
Sofia Papagiannaki
540f110220
[Alerting]: Extend quota service to optionally set limits on alerts ( #33283 )
...
* Quota: Extend service to set limit on alerts
* Add test for applying quota to alert rules
* Apply suggestions from code review
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
* Get used alert quota only if ngalert is enabled
* Set alert limit to zero if ngalert is not enabled
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
2021-05-04 19:16:28 +03:00
Ganesh Vernekar
918552d34b
NGAlert: Send list of available ngalert notification channels via API ( #33489 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-04 13:58:39 +02:00
Kyle Brandt
48358efc13
Alerting: remove State cache entries on Ruler Delete ( #33638 )
...
for https://github.com/grafana/alerting-squad/issues/133
2021-05-03 14:01:33 -04:00
Owen Diehl
070627d11e
better handle metrics for state transitions ( #33648 )
2021-05-03 11:57:24 -04:00
Kyle Brandt
c1034f3118
Alerting: Create instanceStore ( #33587 )
...
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt
c2a5da79e3
Alerting: Avoid panic by not loading instances without a rule ( #33597 )
2021-05-01 19:01:28 +02:00
Kyle Brandt
759a0cd71b
Build: Fix with cleanup call maybe? ( #33590 )
2021-04-30 13:02:37 -07:00
Kyle Brandt
7823842c5d
Alerting: Load annotations from rule into State cache ( #33542 )
...
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt
b8f01fe034
Alerting: backend "ng" code cleanup ( #33578 )
2021-04-30 13:21:57 -04:00
Owen Diehl
5e48b54549
Alerting/metrics ( #33547 )
...
* moves alerting metrics to their own pkg
* adds grafana_alerting_alerts (by state) metric
* alerts_received_{total,invalid}
* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)
* use silence metrics from alertmanager lib
* fix - manager has metrics
* updates ngalert tests
* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>
* cleaner prom registry code
* removes ngalert global metrics
* new registry use in all tests
* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations
* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt
6c8ef2a9c2
Alerting: Alert Rule migration ( #33000 )
...
* Not complete, put migration behind env flag for now:
UALERT_MIG=iDidBackup
* Important to backup, and not expect the same DB to keep working until the env trigger is removed.
* Alerting: Migrate dashboard alert permissions
* Do not use imported models
* Change folder titles
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
2021-04-29 13:24:37 -04:00
Sofia Papagiannaki
1e380e869e
[Alerting]: some fixes ( #33538 )
...
* Fix failure when adding state annotations
* Fix get org rules API
Do not fail response if user has no access to view a namespace.
Do not include the namespace in the response instead.
* lint
2021-04-29 19:15:15 +03:00
Kyle Brandt
d32fcbe2bc
Alerting: Eval pkg tests and more specific error handling ( #33496 )
...
* comment updates
* more friendly error messages, in particular if it looks like time series data
2021-04-29 07:27:32 -04:00
Owen Diehl
ec37b4cb87
[Alerting] Automatic request instrumentation ( #33444 )
...
* alerting: automatic request instrumentation
* always expose alerting prom metrics
* globally register alerting metrics
2021-04-28 16:59:15 -04:00
Kyle Brandt
914443c816
Alerting: Fix state cache id duplication ( #33480 )
2021-04-28 11:42:19 -04:00
Sofia Papagiannaki
7ccb022c03
Alerting: validate condition before updating rulegroup ( #33367 )
...
* Alerting: validate condition before updating rulegroup
* Apply suggestions from code review
2021-04-28 11:31:51 +03:00
Kyle Brandt
b590e95682
AlertingAPI: Change list response query prop ( #33419 )
...
* Alerting: change to full []AlertQuery as json in a string and not just model.
2021-04-27 22:15:00 +02:00
Ganesh Vernekar
467ab124dd
NGAlert: Fix GET for Alertmanager config ( #33379 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 20:48:19 +05:30
Kyle Brandt
adcba36d39
AlertingAPI: update swagger json files to match datasourceUid change ( #33332 )
...
* update swagger json files to match datasourceUid change
underlying change made in https://github.com/grafana/grafana/pull/33282
* Document DatasourceUID field in AlertQuery model
* Run spec generation from inside a docker container
* Generate latest spec
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-27 16:50:30 +02:00
Owen Diehl
86c8eed386
Instrument/ruler api ( #33290 )
...
* ruler api histogram instrumentation
* register ruler metrics
2021-04-27 08:25:32 -04:00
Ganesh Vernekar
be1affe0a4
NGAlert: Fix flaky test ( #33415 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 17:03:22 +05:30
David Parrott
788bc2a793
Alerting: refactor state tracker ( #33292 )
...
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* add support for NoData and Error results
* rename test
* bring in changes from other PRs that have been merged
* pr feedback
* add integration test
* close state tracker cleanup on context.Done
* fixup test
* rename state tracker
* set EvaluationDuration on Result
* default labels set as constants
* separate cache and state from manager
* use RWMutex in cache
2021-04-23 21:32:25 +02:00
David Parrott
ca79206498
Alerting: Handle NoData and Error evaluation results ( #33194 )
...
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* add support for NoData and Error results
* rename test
* bring in changes from other PRs that have been merged
* pr feedback
* add integration test
* close state tracker cleanup on context.Done
* fixup test
* not those annotations
2021-04-23 20:47:52 +02:00
Kyle Brandt
5e818146de
Alerting/Expr: New SSE Request/QueryType, alerting move data source UID ( #33282 )
2021-04-23 16:52:32 +02:00
Ganesh Vernekar
659ea20c3c
NGAlert: Run the maintenance cycle for the silences ( #33301 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 16:19:03 +02:00
Ganesh Vernekar
d66a5e65a4
AlertingNG: Add webhook notification channel ( #33229 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 18:59:28 +05:30
Ganesh Vernekar
a0e567f80f
AlertingNG: Add Dingding notification channel ( #32995 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 19:30:49 +02:00
Ganesh Vernekar
4ec1edfca3
AlertingNG: Add Teams notification channel ( #32979 )
...
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 18:16:26 +02:00