Commit Graph

720 Commits

Author SHA1 Message Date
Sofia Papagiannaki
1399ab50b3 API: Universal swagger generation (#51033) 2022-06-27 10:54:31 +03:00
George Robinson
dc68213114 Alerting: Remove fmt.Println from Threema (#51380) 2022-06-24 14:50:53 +01:00
Alexander Weaver
0d9389e1f4 Alerting: Code-gen parsing of URL parameters and fix related bugs (#50731)
* Extend template and generate

* Generate and fix up alertmanager endpoints

* Prometheus routes

* fix up Testing endpoints

* touch up ruler API

* Update provisioning and fix 500

* Drop dead code

* Remove more dead code

* Resolve merge conflicts
2022-06-23 15:13:39 -05:00
Karl Persson
b9bb0513e3 Remove version property from fixed roles (#51298) 2022-06-23 12:09:03 +02:00
Selene
ecc15a2f71 KVStore: Extend kvstore to retrieve all items (#50848)
* Extend kvstore to retrieve all items

* Fix comment

* Fix tests

* Change test order

* Move test outside to avoid order conditions

* Update Items to GetAll function and return a map

* Add explanation of map result

* Add description comment

Co-authored-by: Tania B <yalyna.ts@gmail.com>
2022-06-23 11:12:07 +02:00
Yuriy Tseretyan
ee5bcf2b96 make test more stable (#51268) 2022-06-22 12:53:16 -04:00
gotjosh
90646e7f41 Alerting: Don't stop the migration when alert rule tags are invalid (#51253)
* Alerting: Don't stop the migration when alert rule tags are invalid

As we migrate we expect the `alertRuleTags` on a dashboard alert to be a JSON object. However, it seems this is not really validated by Grafana and a user can change the format to something else that the JSON parser is not able to unmarshal into a `map[string]string`.

Let's do a bit better by "attempting" to parse the tags and if we can't we'll simply return an empty map. The data is still there so if the user wishes they can go back, fix the data and attempt the migration again.
2022-06-22 17:39:17 +01:00
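
The fallback described in this commit message could look roughly like the sketch below. The function name `parseAlertRuleTags` is hypothetical and is not taken from the Grafana code base; only the target type `map[string]string` and the "return an empty map on failure" behaviour come from the message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseAlertRuleTags is a hypothetical helper, not Grafana's actual function.
// It tries to decode raw JSON into a map[string]string and falls back to an
// empty map when the payload has any other shape.
func parseAlertRuleTags(raw []byte) map[string]string {
	tags := map[string]string{}
	if err := json.Unmarshal(raw, &tags); err != nil {
		// Invalid tags must not stop the migration: discard any partial
		// result and continue with an empty map.
		return map[string]string{}
	}
	if tags == nil {
		// A JSON null decodes to a nil map; normalise it to an empty one.
		return map[string]string{}
	}
	return tags
}

func main() {
	fmt.Println(parseAlertRuleTags([]byte(`{"team":"infra"}`)))
	fmt.Println(parseAlertRuleTags([]byte(`["not","an","object"]`)))
}
```

Because the original data is never modified, a user can still correct the stored tags and rerun the migration.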
Yuriy Tseretyan
4b42cd3c1d Alerting: State manager to use clock (#51219)
* manager to use clock, to be able to mock real time
2022-06-22 12:18:42 -04:00
Yuriy Tseretyan
4d02f73e5f Alerting: Persist rule position in the group (#50051)
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration

API:
* set group index on update. Use the natural order of items in the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent requests touching the same groups.

UI:
* update UI to keep the order of alerts in a group
2022-06-22 10:52:46 -04:00
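
As an illustration of the API behaviour listed in this commit, the sketch below assigns each rule an index from its position in the submitted array and sorts by that index when reading. The `rule` type, its field names and the 1-based numbering are assumptions made for the example, not details confirmed by the commit.

```go
package main

import (
	"fmt"
	"sort"
)

// rule is a simplified stand-in for an alert rule; field names are illustrative.
type rule struct {
	Title    string
	GroupIdx int
}

// assignGroupIndex stores each rule's position in the submitted slice.
// Starting at 1 is an assumption made for this sketch.
func assignGroupIndex(rules []rule) {
	for i := range rules {
		rules[i].GroupIdx = i + 1
	}
}

// sortByGroupIndex restores the persisted order, as a GET handler might do.
func sortByGroupIndex(rules []rule) {
	sort.SliceStable(rules, func(i, j int) bool {
		return rules[i].GroupIdx < rules[j].GroupIdx
	})
}

func main() {
	rules := []rule{{Title: "cpu"}, {Title: "memory"}, {Title: "disk"}}
	assignGroupIndex(rules)

	// Simulate the database returning rows in a different order.
	rules[0], rules[2] = rules[2], rules[0]

	sortByGroupIndex(rules)
	for _, r := range rules {
		fmt.Println(r.GroupIdx, r.Title)
	}
}
```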
George Robinson
6e44b36a30 Alerting: Add support for images in Kafka alerts (#50758) 2022-06-22 11:03:08 +01:00
George Robinson
99516360c9 Alerting: Add support for images in VictorOps alerts (#50759) 2022-06-22 10:00:50 +01:00
Yuriy Tseretyan
157c12211d Alerting: State manager to use tick time to determine stale states (#50991)
* use correct stale timestamp
* calculate stale using tick time instead of time.now

* remove unused dependency on sql store
2022-06-22 00:16:53 +02:00
George Robinson
c8466d285c Alerting: Add support for image annotation in Alertmanager alerts (#50686) 2022-06-21 09:06:00 +01:00
George Robinson
67046c5e79 Alerting: Add support for images in Threema alerts (#50734) 2022-06-20 15:45:35 +01:00
George Robinson
7235480be5 Alerting: Use ErrImagesDone in Discord and SensuGo (#51106) 2022-06-20 14:39:27 +01:00
George Robinson
18c3456d13 Alerting: Support up to N fake images (#51111) 2022-06-20 14:34:53 +01:00
Gilles De Mey
81a5436c1e Alerting: Adds Mimir to Alertmanager data source implementation (#50943) 2022-06-20 12:56:38 +02:00
George Robinson
62c2b1ec78 Alerting: Add ErrImagesDone to return from withStoredImages (#51098) 2022-06-20 10:56:28 +01:00
George Robinson
2dbaf259a7 Alerting: Update test funcs for notifications (#51013) 2022-06-20 09:05:21 +01:00
Yuriy Tseretyan
81089b956a Alerting: Update authorization rules for RouteGetNamespaceRulesConfig (#50965)
* use authorizeAccessToRuleGroup
* use toGettableRuleGroupConfig in get by namespace
* add comments for controller methods
2022-06-17 13:55:31 -04:00
Matthew Jacobson
5dee2ed24c Alerting: Add first Grafana reserved label grafana_folder (#50262)
* Alerting: Add first Grafana reserved label g_label

g_label holds the title of the folder containing the alert. The intention of this label
is to use it as part of the new default notification policy groupBy.

* Add nil check on updateRule labels map

* Disable gocyclo lint on schedule.ruleRoutine

will remove later in a separate refactoring PR to reduce complexity.

* Address doc suggestions

* Update g_folder for rules in folder when folder title changes

* Remove global bus in FolderService

* Modify tests to fit new common g_folder label

* Add changelog entry

* Fix merge conflicts

* Switch GrafanaReservedLabelPrefix from `g_` to `grafana_`
2022-06-17 13:10:49 -04:00
Alexander Weaver
9bbfeedadf Alerting: Create algorithm to process receiver changes and keep them consistent internally (#50738)
* Algorithm to fix up receivers

* Extract for tests

* Add tests, fix bug

* Add test which demonstrates how it fixes up broken groups

* Fix package prefix
2022-06-17 10:19:22 -05:00
Yuriy Tseretyan
c1550d1f07 Alerting: Rule api to fail update if provisioned rules are affected (#50835)
* add function that checks whether changes mention provisioned rules
* update API that updates group of rules to fail if check does not pass
2022-06-15 16:01:14 -04:00
Ben Kochie
68691d7775 Convert some metrics to Histograms (#50420)
Because Summary metrics cannot be aggregated, convert them to histograms
so that users with HA deployments can use these metrics.
* Convert metrics registration to promauto.
* Improve help text style.

Signed-off-by: SuperQ <superq@gmail.com>
2022-06-15 13:19:43 +02:00
Serge Zaitsev
ae9491c3a7 Chore: Make test tracer noop and return no errors (#50797) 2022-06-15 12:40:41 +02:00
George Robinson
87f3bb3156 Alerting: Add support for images in SensuGo alerts (#50718) 2022-06-15 10:15:16 +01:00
Alexander Weaver
d61d439b11 Handle bsd vs gnu sed (#50641) 2022-06-14 15:35:23 -05:00
Serge Zaitsev
0b55c41d05 Chore: Remove global bus variable (#50765)
* Chore: Remove global bus variable

* fix bus in tests
2022-06-14 16:07:41 +02:00
Karl Persson
44ffbfd6aa RBAC: Refactor GetUserPermissions to use []accesscontrol.Permission (#50683)
* Return slice of permissions instead of slice of pointers for permissions
2022-06-14 10:17:48 +02:00
Alexander Weaver
17e76b06ff Alerting: Fix rendering issues in OpenAPI docs (#50630)
* Clean up status codes

* Missing consumes tag

* Regenerate

* Fix incorrect documented responses and missing UI elements

* Fix response docs

* Fix wrong response copy paste

* Regenerate

* Temporarily revert
2022-06-13 12:51:07 -05:00
Yuriy Tseretyan
c314ce48c7 Alerting: Support for optimistic locking for alert rules (#50274)
* add support for optimistic locking for alert_rule table
* return 409 in the case of optimistic lock
2022-06-13 12:15:28 -04:00
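
A minimal in-memory sketch of optimistic locking follows. The `store` type, `errVersionConflict` and `statusFor` are hypothetical names; in a SQL-backed store the same check would typically be expressed as a version condition on the UPDATE statement. Only the 409 response on conflict comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errVersionConflict is a hypothetical sentinel error for a lost update.
var errVersionConflict = errors.New("alert rule version conflict")

// alertRule is a simplified stand-in; Version is the optimistic lock column.
type alertRule struct {
	UID     string
	Title   string
	Version int64
}

// store is an in-memory substitute for the alert_rule table.
// It is not safe for concurrent use; a real store relies on the database.
type store struct {
	rules map[string]alertRule
}

// update applies the change only if the caller read the current version.
func (s *store) update(r alertRule) error {
	current, ok := s.rules[r.UID]
	if !ok {
		return fmt.Errorf("rule %q not found", r.UID)
	}
	if current.Version != r.Version {
		return errVersionConflict
	}
	r.Version++
	s.rules[r.UID] = r
	return nil
}

// statusFor maps the update result to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errVersionConflict):
		return http.StatusConflict // 409
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	s := &store{rules: map[string]alertRule{"a": {UID: "a", Title: "cpu", Version: 1}}}

	// Two clients read version 1; only the first write succeeds.
	fmt.Println(statusFor(s.update(alertRule{UID: "a", Title: "cpu high", Version: 1})))
	fmt.Println(statusFor(s.update(alertRule{UID: "a", Title: "cpu low", Version: 1})))
}
```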
Jean-Philippe Quéméner
1ed7280363 Alerting: add right provenance when creating mute timings (#50707) 2022-06-13 18:05:41 +02:00
Jean-Philippe Quéméner
ed6a887737 Alerting: remove unused function in alert rule store (#50696) 2022-06-13 11:24:29 -04:00
Kat Yang
bd35e6917a Chore: Exclude integration tests from running on test-backend step (#50359)
* Chore: Exclude integration tests from running on test-backend step

* Remove -v from go test command

* Add check to skip integration tests before each integration test

* Try to restart pipeline

* Retrying to make pipeline run
2022-06-10 11:46:21 -04:00
Yuriy Tseretyan
b0ae4d460e Alerting: Make ticker to tick at predictable time (#50197) 2022-06-10 10:27:17 -04:00
Jean-Philippe Quéméner
862f51216b Alerting: improve provisioning docs (#50347)
* Alerting: improve provisioning docs

* add new provisioning page

* add api docs

* fix formatting and add better descriptions

* fix typo
2022-06-10 16:25:15 +02:00
Gabriel MABILLE
840a442796 RBAC: Rename alerting roles to match naming convention (#50504) 2022-06-09 14:29:27 +02:00
Alexander Weaver
7dd78fee2c Alerting: Fix provisioning validation status codes and panics (#50464)
* Updates to all except alert rules

* Return 400 when rules fail to validate, add testinfra

* More sane package aliases

* More package alias renames

* One more bug in contact point validation

* remove unused function

Co-authored-by: Jean-Philippe Quémémer <jeanphilippe.quemener@grafana.com>
Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2022-06-09 10:38:46 +02:00
Jean-Philippe Quéméner
cf684ed38f Alerting: bump rule version when updating rule group interval (#50295)
* Alerting: move group update to alert rule service

* rename validateAlertRuleInterval to validateRuleGroupInterval

* init baseinterval correctly

* add seconds suffix

* extract validation function for reusability

* add context to err message
2022-06-09 09:28:32 +02:00
Yuriy Tseretyan
54fa04263b Alerting: Add RBAC actions and role for provisioning API routes (#50459)
* add alert provisioning actions and role

* linter
2022-06-09 09:18:57 +02:00
Joe Blubaugh
ecf080825e Alerting: Fix image embed in email template. (#50370)
The ng_alert_notification email template did not include templating for
linked or embedded images. This change updates that.

Additionally, this change supports embedding an image for each alert in
an email batch.

Fixes #50315
2022-06-09 10:01:58 +08:00
Santiago
9dc7e752b7 Optional custom title and description for OpsGenie (#50131)
* optional custom description for OpsGenie

* custom title and message, tests

* update changelog

* check for empty / whitespace only strings

* truncate the title to 130 characters if needed

* unnecessary validation removed

* truncate title to 127 characters and add three dots
2022-06-08 17:55:31 -03:00
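
The truncation rule from the last two bullets (a 130-character limit, reached by keeping 127 characters and appending three dots) can be sketched as below. The helper name and the choice to count runes rather than bytes are assumptions for this example.

```go
package main

import "fmt"

// maxTitleLen is the title limit mentioned in the commit message.
const maxTitleLen = 130

// truncateTitle is a hypothetical helper. Titles longer than the limit are cut
// to 127 characters and suffixed with "...", giving exactly 130 characters.
// Counting runes instead of bytes is an assumption made for this sketch.
func truncateTitle(title string) string {
	runes := []rune(title)
	if len(runes) <= maxTitleLen {
		return title
	}
	return string(runes[:maxTitleLen-3]) + "..."
}

func main() {
	long := ""
	for i := 0; i < 200; i++ {
		long += "a"
	}
	fmt.Println(len(truncateTitle(long)))
	fmt.Println(truncateTitle("disk usage high"))
}
```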
gotjosh
c59938b235 Alerting: Schedule Alert rules metric tracking (#50415)
* Alerting: Schedule Alert rules metric tracking

Change the record of metrics from one place to two as an attempt to have a semi-accurate record.
2022-06-08 18:37:33 +01:00
Yuriy Tseretyan
a89d4a5be7 Alerting: Scheduler to drop ticks if a rule's evaluation is too slow (#48885)
* drop ticks if evaluation of a rule is too slow.
* add metric schedule_rule_evaluations_missed_total
2022-06-08 12:50:44 -04:00
Jean-Philippe Quéméner
fd664e4beb Alerting: replace a duplicated configuration key (#50350)
This PR renames the configuration key enabled to capture. This is needed as we already have a configuration key with the name enabled.

Fixes #50328

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2022-06-08 11:04:51 +08:00
Alexander Weaver
28a47b56d2 Bump provisioning to admin-only in lieu of dedicated RBAC permissions (#50366) 2022-06-07 17:26:48 -05:00
gotjosh
0cde283505 Alerting: Logs should not be capitalized and the errors key should be "err" (#50333)
* Alerting: decapitalize log lines and use "err" as the key for errors

Found using (logger|log).(Warn|Debug|Info|Error)\([A-Z] and (logger|log).(Warn|Debug|Info|Error)\(.+"error"
2022-06-07 19:54:23 +02:00
George Robinson
c83f84348c Alerting: Fix database unavailable removes rules from scheduler (#49874) 2022-06-07 16:20:06 +01:00
Karl Persson
c4a75f9eb3 RBAC: Add scope resolvers for dashboards (#50110)
* Inject access control into dashboard service

* Add function to parse id scopes

* Add dashboard as return value

* Update mock

* Return only err to keep service interface

* Add scope resolvers for dashboard id scopes

* Add function to parse uid scopes

* Add dashboard uid scope resolver

* Register scope resolvers for dashboards

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2022-06-07 11:02:20 +02:00
Jean-Philippe Quéméner
4b8a4449ed Alerting: remove feature toggle for provisioning API (#50167)
* Alerting: remove feature toggle for provisioning API

* remove missed code parts

* remove unused import

* remove empty line

* mark routes as stable
2022-06-05 07:45:36 +02:00