Commit Graph

441 Commits

Author SHA1 Message Date
Alexander Weaver
078a578803
Drop ProvenanceOrgAdapter and build into store API instead (#48137) 2022-04-26 10:30:57 -05:00
Matthew Jacobson
0301d956da
Alerting: Create fewer contact points on migration (#47291)
* Alerting: Create fewer contact points on migration

Previously a new contact point was created for every unique combination
of channels attached to any legacy alert. This was very hard to maintain,
requiring modifications in every generated contact point.

This change deduplicates the generated contact points to a more
reasonable state. There should now only be one contact point per legacy
channel, and we attached multiple contact points to a route by nesting
them. The sole exception to this is if there were multiple default
legacy channels, in which case we create a redundant contact point
containing all of them used only in the root policy. This allows for a
much simpler notification policy structure.

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2022-04-26 16:17:30 +02:00
Guilherme Caulada
a367ad730c
Secrets: Implement basic unified secret store service (#45804)
* wip: Implement kvstore for secrets

* wip: Refactor kvstore for secrets

* wip: Add format key function to secrets kvstore sql

* wip: Add migration for secrets kvstore

* Remove unused Key field from secrets kvstore

* Remove secret values from debug logs

* Integrate unified secrets with datasources

* Fix minor issues and tests for kvstore

* Create test service helper for secret store

* Remove encryption tests from datasources

* Move secret operations after datasources

* Fix datasource proxy tests

* Fix legacy data tests

* Add Name to all delete data source commands

* Implement decryption cache on sql secret store

* Fix minor issue with cache and tests

* Use secret type on secret store datasource operations

* Add comments to make create and update clear

* Rename itemFound variable to isFound

* Improve secret deletion and cache management

* Add base64 encoding to sql secret store

* Move secret retrieval to decrypted values function

* Refactor decrypt secure json data functions

* Fix expr tests

* Fix datasource tests

* Fix plugin proxy tests

* Fix query tests

* Fix metrics api tests

* Remove unused fake secrets service from query tests

* Add rename function to secret store

* Add check for error renaming secret

* Remove bus from tests to fix merge conflicts

* Add background secrets migration to datasources

* Get datasource secure json fields from secrets

* Move migration to secret store

* Revert "Move migration to secret store"

This reverts commit 7c3f872072.

* Add secret service to datasource service on tests

* Fix datasource tests

* Remove merge conflict on wire

* Add ctx to data source http transport on prometheus stats collector

* Add ctx to data source http transport on stats collector test
2022-04-25 13:57:45 -03:00
gotjosh
25c07ff85e
Alerting: Wrap legacy alerting metrics with legacy_ (#48190)
* Alerting: Wrap legacy alerting metrics with `legacy_`
2022-04-25 17:19:36 +02:00
George Robinson
c5547123bc
Remove redundant queries in GetAlertRules and GetOrgAlertRules and replace with ListAlertRules (#48108) 2022-04-25 11:42:42 +01:00
Yuriy Tseretyan
75ba4e98c6
Alerting: Remove unused features from ticker + metric + tests (#47828)
* remove not used code:
  - remove offset in ticket because it is not used
  - remove unused ticker and scheduler methods

* use duration for interval
* add metrics grafana_alerting_ticker_last_consumed_tick_timestamp_seconds, grafana_alerting_ticker_next_tick_timestamp_seconds, grafana_alerting_ticker_interval_seconds
2022-04-22 15:09:47 -04:00
Alexander Weaver
8310789ef1
Indicate whether routes are provisioned when GETting Alertmanager configuration (#47857)
* Test composition simplification from last PR

* Policies use proper API model everywhere

* Expose policy provenance in API, miss some dep injection

* Complete injection

* fix args

* Tests for provenance value

* Extract test helpers so tests are very readable

* Single source adapter struct that was copied in 3 places

* Drop redundant test

* Resolve merge conflicts on changelog
2022-04-22 11:57:56 -05:00
George Robinson
d66fc6ed1a
Alerting: Add GetRuleGroups to RuleStore (#48036)
This commit adds a new method GetRuleGroups to RuleStore which returns the set of rule groups across all organizations.
2022-04-21 17:59:22 +01:00
Vardan Torosyan
a0553de8dd
Rename FGAC to RBAC in the codebase (#48051) 2022-04-21 14:31:02 +02:00
Joe Blubaugh
3d91047e6e
Alerting: Notification URL points to alert view page instead of alert edit page (#47752)
Before this change, notifications generated by the Grafana Alertmanager
pointed to '/alerting/:ruleID/edit'. This change instead points them to
the view path '/alerting/grafana/:ruleID/view'. The view page has a
better UX, including timeseries display. It's also where many alert
state improvements will land in the next few versions of Grafana.

Fixes #45301

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-04-20 21:43:55 +08:00
Sofia Papagiannaki
54962c2f0c
Alerting: Rename Recipient path parameter to DatasourceID (#47949) 2022-04-20 16:20:17 +03:00
Peter Holmberg
39d3c8afd7
Alerting: Fix issue with Slack contact point validation (#47559)
* secureFields and secureSettings

* revert channelIndex

* readd lost code

* use specific return

* register secure fields and use not hard coded index

* fix for determineReadOnly

* fix lint error

* fix test suite

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2022-04-20 09:40:57 +02:00
Yuriy Tseretyan
0c31399e34
Alerting: Fix nav-links for RBAC and other (#47798) 2022-04-19 11:47:28 -04:00
Alexander Weaver
758364e78b
Alerting: Refactor GET/POST alerting config routes to be more extensible (#47229)
* Refactor GET am config to be extensible

* Extract post config route

* Fix tests

* Remove temporary duplication

* Fix broken test due to layer shift

* Fix duplicated error message

* Properly return 400 on config rejection

* Revert weird half method extraction

* Move things to notifier package and avoid redundant interface

* Simplify documentation

* Split encryption service and depend on minimal abstractions

* Properly initialize things all the way up to the composition root

* Encryption -> Crypto

* Address misc feedback

* Missing docstring

* Few more simple polish improvements

* Unify on MultiOrgAlertmanager. Discover bug in existing test

* Fix rebase conflicts

* Misc feedback, renames, docs

* Access crypto hanging off MultiOrgAlertmanager rather than having a separate API to initialize
2022-04-14 13:06:21 -05:00
Jean-Philippe Quéméner
060ccacbf9
Alerting: unwrap upsert into insert and update function (#47731)
* Alerting: unwrap upsert into insert and update function

* add changelog entry

* remove changelog entry

* rename upsertrule to updaterule

* use directly alertrule model for inserts

* add test for updating a rule with a conflicting name
2022-04-14 14:21:36 +02:00
Alexander Weaver
c266a4ac81
Alerting: Remove mis-behaving fake and fix masked test failure in AM config API (#47747)
* Remove misbehaving fake

* Fix bug and inject logger
2022-04-13 19:31:57 -05:00
Jean-Philippe Quéméner
388ecb4037
Alerting: Provisioning API - Contact points (#47197) 2022-04-13 22:15:55 +02:00
Santiago
5fb80498b1
Apply templating on alert notifications on OK state (#47355)
* OK notifications using previous evaluation data

* copy rule.EvalMatches to avoid changes to the underlying array

* test cases added/modified

* delete trailing newline

* fix double newline in go import

* add change to the changelog

* specify that this only affects legacy alerting (changelog)

* use current eval data instead of prev eval data

* create evalMatch just once

* code comments, renamings, getTemplateMatches() function

* changelog and docs updated
2022-04-13 17:04:10 -03:00
Yuriy Tseretyan
884c885289
Alerting: Support OK option for Error state (#47670)
* support OK state for Error
2022-04-13 14:45:29 -04:00
Yuriy Tseretyan
af9353caec
Alerting: Add check for datasource permission in alert rule read API (#47087)
* add check for access to rule's data source in GET APIs

* use more general method GetAlertRules instead of GetNamespaceAlertRules.
* remove unused GetNamespaceAlertRules.

Tests:
* create a method to generate permissions for rules
* extract method to create RuleSrv
* add tests for RouteGetNamespaceRulesConfig
2022-04-11 17:37:44 -04:00
Yuriy Tseretyan
48519f9ebb
Alerting: reduce database calls in prometheus-comptible rules API (#47080)
* move validation at the beginning of method
* remove usage of GetOrgRuleGroups because it is not necessary. All information is already available in memory.
* remove unused method
2022-04-11 10:54:29 -04:00
Joe Blubaugh
631dd718a2
47470: Add additional delay to silences in test. (#47482)
This test of silence cleanup was flaky because of its use of real wall
time. In CI environments with slow execution, delays could cause the
test to fail. This change mitigates the problem by increasing the end time of
silences in the test.

After Prometheus merges this PR: https://github.com/prometheus/alertmanager/pull/2867
we can make the test fully deterministic by using a fake clock.

Fixes #47470

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-04-08 14:52:08 +08:00
Alexander Weaver
c3ad36ba72
Temporarily skip intermittent test (#47471) 2022-04-07 12:52:00 -05:00
gotjosh
94f72acbb3
Alerting: Introduce an internal changelog (#47390)
* Alerting: Introduce an internal changelog

Please note that this is not intented to replace Grafana's "add to changelog" label. It is _mostly_ for internal consumption of the Alerting team that owns this part of Grafana.

* Fix markdown formatting

* Fix changelog entry
2022-04-07 15:24:26 +01:00
Alexander Weaver
dde0b93cf1
Alerting: Provisioning API - Notification Policies (#46755)
* Base-line API for provisioning notification policies

* Wire API up, some simple tests

* Return provenance status through API

* Fix missing call

* Transactions

* Clarity in package dependencies

* Unify receivers in definitions

* Fix issue introduced by receiver change

* Drop unused internal test implementation

* FGAC hooks for provisioning routes

* Polish, swap names

* Asserting on number of exposed routes

* Don't bubble up updated object

* Integrate with new concurrency token feature in store

* Back out duplicated changes

* Remove redundant tests

* Regenerate and create unit tests for API layer

* Integration tests for auth

* Address linter errors

* Put route behind toggle

* Use alternative store API and fix feature toggle in tests

* Fixes, polish

* Fix whitespace

* Re-kick drone

* Rename services to provisioning
2022-04-05 16:48:51 -05:00
gotjosh
cb6124c921
Alerting: Accurately set value for prom-compatible APIs (#47216)
* Alerting: Accurately set value for prom-compatible APIs

Sets the value fields for the prometheus compatible API based on a combination of condition `refID` and the values extracted from the different frames.

* Fix an extra test

* Ensure a consitent ordering

* Address review comments

* address review comments
2022-04-05 19:36:42 +01:00
Gabriel MABILLE
e430f5021d
AccessControl: Alerting role grants folder read on all folders to viewers (#47278) 2022-04-05 07:04:02 +00:00
Konrad Lalik
6992d17924
Alerting: Add support to distinguish Prometheus datasource subtypes (Mimir, Cortex and Vanilla Prometheus) (#46771)
* Add basic UI for custom ruler URL

* Add build info fetching for alerting data sources

* Add keeping data sources build info in the store

* Use data source build info to construct data source urls

* Remove unused code

* Add custom ruler support in prometheus api calls

* Migrate actions

* Use thunk condition to prevent multiple data source buildinfo fetches

* Unify prom and ruler rules loading

* Upgrade RuleEditor tests

* Upgrade RuleList tests

* Upgrade PanelAlertTab tests

* Upgrade actions tests

* Build info refactoring

* Get rid of lotex ruler support action

* Add prom ruler availability checking when the buildinfo is not available

* Add rulerUrlBuilder tests

* Improve prometheus data source validation, small build info refactoring

* Change prefix based on Prometheus subtype

* Use the correct path

* Revert config routing

* Add deprecation notice for /api/prom prefix

* Add tests to the datasource subtype

* Remove custom ruler support

* Remove deprecation notice

* Prevent fetching ruler rules when ruler api is not available

* Add build info tests

* Unify naming of ruler methods

* Fix test

* Change buildinfo data source validation

* Use strings for subtype params and unveil mimir

* organise imports

* frontend changes and wordsmithing

* fix test suite

* add a nicer verbose message for prometheus datasources

* detect Mimir datasource

* fix test

* fix buildinfo test for Mimir

* shrink vectors

* add some code documentation

* DRY prepareRulesFilterQueryParams

* clarify that Prometheus does not support managing rules

* Improve buildinfo error handling

Co-authored-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2022-04-04 18:30:17 +01:00
Yuriy Tseretyan
e94d0c1b96
Alerting: update rule test endpoints to respect data source permissions (#47169)
* make eval.Evaluator an interface
* inject Evaluator to TestingApiSrv
* move conditionEval to RouteTestGrafanaRuleConfig because it is the only place where it is used
* update rule test api to check data source permissions
2022-04-02 02:00:23 +02:00
Yuriy Tseretyan
51114527dc
Alerting: handle folder permissions when fine-grained access enabled (#47035)
* Use alert:create action for folder search with edit permissions. This matches the action that is used to query dashboards (the update will be addressed later)
* Update rule store to use FindDashboards instead of folder service to list folders the user has access to view alerts. Folder service does not support query type and additional filters. 
* Do not check whether the user can save to folder if FGAC is enabled because it is checked on API level.
2022-04-01 19:33:26 -04:00
Yuriy Tseretyan
8a2c368031
check that user is authorized to create\update silences (#47163) 2022-04-01 09:39:59 -04:00
Santiago
4b1af6fb06
Fix empty contact point URLs when template parsing fails (#47029)
* fix empty URLs

* leave URL templating, use fallback

* better fix, new tests cases

* fix linting errors
2022-03-31 15:57:48 -03:00
George Robinson
79769132c0
Alerting: Alert rule should wait For duration when execution error state is Alerting (#47052)
Alerting: Alert rule should wait For duration when execution error state is Alerting
2022-03-31 09:57:58 +01:00
Alexander Weaver
502cf8b37f
Alerting: Unify Swagger/OpenAPI generation tooling (#46928)
* Unify makefiles

* Improve documentation
2022-03-31 09:34:46 +02:00
gotjosh
84e5f336fe
Alerting: Classic conditions can now display multiple values (#46971)
* Alerting: Extract classic condition values by RefID

* uncapitalise function

* update documentation

* Update pkg/services/ngalert/eval/extract_md.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Update pkg/services/ngalert/state/state.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Update pkg/services/ngalert/state/state.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Update pkg/services/ngalert/eval/extract_md.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Update docs/sources/alerting/unified-alerting/alerting-rules/alert-annotation-label.md

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Update pkg/services/ngalert/eval/extract_md.go

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Run prettier

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2022-03-29 20:33:03 +01:00
Matthew Jacobson
932f43b220
Alerting: Add resolved count to notification title when both firing and resolved present (#46697)
* Alerting: Add resolved count to notification title when both firing and resolved are present

* Fix test case default_template_test.go
2022-03-29 11:22:28 -04:00
Yuriy Tseretyan
c1dbe7617c
fix scope for datasource:query action (#46973) 2022-03-29 09:58:59 -04:00
Kat Yang
90f2233ea9
Chore: Remove global database engine variable from annotation (#46940)
* Chore: Remove global database engine variable from annotation

* 💩
2022-03-25 13:23:09 -04:00
Yuriy Tseretyan
e20d157a9b
Alerting: rules delete API to check data source authorization (#46906)
* merge RuleSrv rule delete methods
* remove unused store methods
* implement delete by uid for fake store
* add scheduler mock
* implement tests for RouteDeleteAlertRules
2022-03-25 12:39:24 -04:00
Yuriy Tseretyan
6610adf090
Alerting: remove UpdateRuleGroup from fake rule store (#46941)
* remove UpdateRuleGroup from fake rule store because It is not part of interface anymore
2022-03-24 19:29:19 -04:00
Yuriy Tseretyan
15e4556c2f
Alerting: update authorization logic to use proper legacy roles when fine-grained access is disabled (#46931)
* require legacy Editor for post, put, delete endpoints
* require user to be signed in on group level because handler that checks that user has role Editor does not check it is signed in
2022-03-24 17:13:47 -04:00
Yuriy Tseretyan
8868848e93
Alerting: rule group update API to ignore deletes of rules user is not authorized to access (#46905)
* verify that the user has access to all data sources used by the rule that needs to be deleted from the group
* if a user is not authorized to access the rule, the rule is removed from the list to delete
2022-03-24 16:53:00 -04:00
Yuriy Tseretyan
60d4cd80bf
Alerting: update DeleteAlertRuleByUID to accept many UID (#46890) 2022-03-23 16:09:53 -04:00
Selene
d57c94fb6a
Chore: Remove bus from folder service (#46840)
* Remove bus from folder service

* Fix tests
2022-03-23 19:40:22 +01:00
Yuriy Tseretyan
4ee48c2e77
Alerting: Update GetRuleGroupAlertRules to accept optional rule group (#46889)
* rename GetRuleGroupAlertRules to GetAlertRules
* make rule group optional in GetAlertRulesQuery
* simplify FakeStore. the current structure did not support optional rule group
2022-03-23 17:36:25 +00:00
Yuriy Tseretyan
acd7be1cb4
Alerting: Change getEvaluatorForAlertRule to checkDatasourcePermissionsForRule (#46887)
update method getEvaluatorForAlertRule to accept permissions evaluator and exit on the first negative result, which is more effective than returning an evaluator that in fact is a bunch of slices.
2022-03-23 17:11:30 +00:00
Joe Blubaugh
481a68cbf5
Unified Alerting: Make log message follow codebase convention. (#46881)
1. Keep log lines lower case.
2. The key-value pair arguments are not format argument for the string.
3. Always use the "err" key.
2022-03-23 15:07:07 +01:00
Joe Blubaugh
c5b39dd3cd
Unified Alerting, Issue 41156: Clean up expired silences. (#46740)
Expired silences older than the retention period were not being cleaned up. The root problem was that notifier.Alertmanager overrides the Prometheus alert manager's silence maintenance function and was not calling Silences.GC() in the overriden function.
2022-03-23 09:49:02 +01:00
Jean-Philippe Quéméner
a80f04c949
Alerting: add collision safe update function for alertmanager configurations (#46692)
* Alerting: add collision safe update function for alertmanager configurations

* fix typo

* use bootstrap func for tests

* move hash calculation to store

* remove icons lol

* remove removed field
2022-03-23 09:31:46 +01:00
Eng Zer Jun
b56848f006
test: use T.TempDir to create temporary test directory (#44947)
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-03-22 15:43:29 +01:00