Commit Graph

104 Commits

Author SHA1 Message Date
Yuriy Tseretyan
a081764fd8 Alerting: Scheduler to use AlertRule (#52354)
* update GetAlertRulesForSchedulingQuery to have result AlertRule
* update fetcher utils and registry to support AlertRule
* alertRuleInfo to use alert rule instead of version
* update updateCh hanlder of ruleRoutine to just clean up the state. The updated rule will be provided at the next evaluation
* update evalCh handler of ruleRoutine to use rule from the message and clear state as well as update extra labels

* remove unused function in ruleRoutine
* remove unused model SchedulableAlertRule

* store rule version in ruleRoutine instead of rule
* do not call the sender if nothing to send
2022-07-26 09:40:06 -04:00
Jean-Philippe Quéméner
320262c3db Alerting: Cleanup the alert_configuration table on write (#51497) 2022-07-20 16:54:18 +02:00
Yuriy Tseretyan
054fe54b03 Alerting: Split Scheduler and AlertRouter tests (#52416)
* move fake FakeExternalAlertmanager to sender package
* move tests from scheduler to router
* update alerts router to have all fields private
* update scheduler tests to use sender mock
2022-07-19 09:32:54 -04:00
Yuriy Tseretyan
6e1e4a4215 Alerting: Update DbStore to use disabled orgs from the config (#52156)
* update DbStore to use UnifiedAlerting settings
* remove disabled orgs from scheduler and use config in db store instead
* remove test
2022-07-15 14:13:30 -04:00
Yuriy Tseretyan
8b3b667a47 Alerting: Fix rule API to accept 0 duration of field For (#50992)
* make 'for' pointer to distinguish between missing field and 0
* set 'for' to -1 if the value is missing but not allow negative in the request + path -1 with the value from original rule
* update store validation to not allow negative 'for'
* update usages to use pointer
2022-06-30 11:46:26 -04:00
Emil Tullstedt
7d815a1db5 Alerting: Use google/uuid instead of gofrs/uuid (#51242) 2022-06-28 11:57:24 +02:00
Jean-Philippe Quéméner
cd0fefec5b Alerting: change optimistic lock to use proper insert select (#51461)
* Alerting: change optimistic lock to proper insert select

* remove debug logging

* fix postgres

* fix mysql

* remove empty line for go-lint

* add some docs

* use constants
2022-06-28 00:20:21 +02:00
Yuriy Tseretyan
4d02f73e5f Alerting: Persist rule position in the group (#50051)
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration

API:
* set group index on update. Use the natural order of items in  the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.

UI:
* update UI to keep the order of alerts in a group
2022-06-22 10:52:46 -04:00
Matthew Jacobson
5dee2ed24c Alerting: Add first Grafana reserved label grafana_folder (#50262)
* Alerting: Add first Grafana reserved label g_label

g_label holds the title of the folder container the alert. The intention of this label
is to use it as part of the new default notification policy groupBy.

* Add nil check on updateRule labels map

* Disable gocyclo lint on schedule.ruleRoutine

will remove later in a separate refactoring PR to reduce complexity.

* Address doc suggestions

* Update g_folder for rules in folder when folder title changes

* Remove global bus in FolderService

* Modify tests to fit new common g_folder label

* Add changelog entry

* Fix merge conflicts

* Switch GrafanaReservedLabelPrefix from `g_` to `grafana_`
2022-06-17 13:10:49 -04:00
Yuriy Tseretyan
c314ce48c7 Alerting: Support for optimistic locking for alert rules (#50274)
* add support for optimistic locking for alert_rule table
* return 409 in the case of opitimistic lock
2022-06-13 12:15:28 -04:00
Jean-Philippe Quéméner
ed6a887737 Alerting: remove unused function in alert rule store (#50696) 2022-06-13 11:24:29 -04:00
Kat Yang
bd35e6917a Chore: Exclude integration tests from running on test-backend step (#50359)
* Chore: Exclude integration tests from running on test-backend step

* Remove -v from go test command

* Add check to skip integration tests before each integration test

* Try to restart pipeline

* Retrying to make pipeline run
2022-06-10 11:46:21 -04:00
Jean-Philippe Quéméner
cf684ed38f Alerting: bump rule version when updating rule group interval (#50295)
* Alerting: move group update to alert rule service

* rename validateAlertRuleInterval to validateRuleGroupInterval

* init baseinterval correctly

* add seconds suffix

* extract validation function for reusability

* add context to err message
2022-06-09 09:28:32 +02:00
gotjosh
0cde283505 Alerting: Logs should not be capitalized and the errors key should be "err" (#50333)
* Alerting: decapitalize log lines and use "err" as the key for errors

Found using (logger|log).(Warn|Debug|Info|Error)\([A-Z] and (logger|log).(Warn|Debug|Info|Error)\(.+"error"
2022-06-07 19:54:23 +02:00
Jean-Philippe Quéméner
81d360529b Alerting: Provisioning API - Alert rules (#47930) 2022-06-02 14:48:53 +02:00
Kat Yang
c63ebc887b Chore: Run integration tests without grabpl (#49448)
* Chore: Run integration tests without grabpl

* Add new step for integration tests in lib.star

* Remove old integration test step from lib.star

* Change drone signature

* Fix: Edit starlark integration step to not affect enterprise

* Remove all build tags & rename starlark integration test step

* Resync .drone.yml with .drone.star

* Fix lint errors

* Fix lint errors

* Fix lint errors

* Fix more lint errors

* Fix another lint error

* Rename integration test step

* Fix last lint error

* Recomment enterprise step

* Remove comment from Makefile

Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
2022-06-01 14:55:22 -04:00
Yuriy Tseretyan
ad25e2a20c Alerting: Update RBAC for alert rules to consider access to rule as access to group it belongs (#49033)
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* a new query that returns alerts in group by UID of alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
2022-06-01 10:23:54 -04:00
George Robinson
47a3ddd968 Alerting: Add GetImages to ImageStore (#49717)
* Alerting: Add GetImages to ImageStore

* Use assert.ElementsMatch instead of sort.Sort
2022-05-30 09:26:16 +01:00
Joe Blubaugh
9e8efaa459 Alerting: Add stored screenshot utilities to the channels package. (#49470)
Adds three functions:
`withStoredImages` iterates over a list of models.Alerts, extracting a stored image's data from storage, if available, and executing a user-provided function.
`withStoredImage` does this for an image attached to a specific alert.
`openImage` finds and opens an image file on disk.

Moves `store.Image` to `models.Image`
Simplifies `channels.ImageStore` interface and updates notifiers that use it to use the simpler methods.
Updates all pkg/alert/notifier/channels to use withStoredImage routines.
2022-05-26 13:29:56 +08:00
Kristin Laemmert
debbb8d59d sqlstore: finish removing Find and SearchDashboards (#49347)
* chore: replace artisnal FakeDashboardService with generated mock

Maintaining a handcrafted FakeDashboardService is not sustainable now that we are in the process of moving the dashboard-related functions out of sqlstore.

* sqlstore: finish removing Find and SearchDashboards

Find and SearchDashboards were previously copied into the dashboard service. This commit completes that work, removing Find and SearchDashboards from the sqlstore and updating callers to use the dashboard service.

* dashboards: remove SearchDashboards from Store interface

SearchDashboards is a wrapper around FindDashboard that transforms the results, so it's been moved out of the Store entirely and the functionality moved into the Dashboard Service's search implementation.

The database tests depended heavily on the transformation, so I added testSearchDashboards, a copy of search dashboards, instead of (heavily) refactoring all the tests.
2022-05-24 09:24:55 -04:00
Kat Yang
50c2b4682a Chore: Rename integration tests (#49438)
* Chore: Rename integration tests

* Remove one Integration

Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
2022-05-24 11:04:03 +02:00
Joe Blubaugh
ccd160a75e Alerting: Add image url or file attachment to email notifications. (#49381)
If an image token is present in an alert instance, the email notifier will attempt to find a public URL for the image token. If found, it will add that to the email as the `ImageLink` field. If only local file data is available, the notifier will attach the file to the outgoing email using the `EmbeddedImage` field.
2022-05-23 23:08:28 +08:00
Joe Blubaugh
1cc034d960 Alerting: Add a "Reason" to Alert Instances to show underlying cause of state. (#49259)
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.

Fixes #42606

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-05-23 16:49:49 +08:00
Joe Blubaugh
12c25759da Alerting: Attach screenshot data to Slack notifications. (#49374)
This change extracts screenshot data from alert messages via a private annotation `__alertScreenshotToken__` and attaches a URL to a Slack message or uploads the data to an image upload endpoint if needed.

This change also implements a few foundational functions for use in other notifiers.
2022-05-23 14:24:20 +08:00
Joe Blubaugh
1d724810de Alerting: State Manager takes screenshots. (#49338)
The State Manager will now take screenshots when an alert instance
switches to an Alerting or Resolved state.

Signed-off-by: Joe Blubaugh joe.blubaugh@grafana.com
2022-05-23 10:53:41 +08:00
Joe Blubaugh
687e79538b Alerting: Add a general screenshot service and alerting-specific image service. (#49293)
This commit adds a pkg/services/screenshot package for taking and uploading screenshots of Grafana dashboards. It supports taking screenshots of both dashboards and individual panels within a dashboard, using the rendering service.

The screenshot package has the following services, most of which can be composed:

BrowserScreenshotService (Takes screenshots with headless Chrome)
CachableScreenshotService (Caches screenshots taken with another service such as BrowserScreenshotService)
NoopScreenshotService (A no-op screenshot service for tests)
SingleFlightScreenshotService (Prevents duplicate screenshots when taking screenshots of the same dashboard or panel in parallel)
ScreenshotUnavailableService (A screenshot service that returns ErrScreenshotsUnavailable)
UploadingScreenshotService (A screenshot service that uploads taken screenshots)

The screenshot package does not support wire dependency injection yet. ngalert constructs its own version of the service. See https://github.com/grafana/grafana/issues/49296

This PR also adds an ImageScreenshotService to ngAlert. This is used to take screenshots with a screenshotservice and then store their location reference for use by alert instances and notifiers.
2022-05-22 22:33:49 +08:00
Yuriy Tseretyan
369fcc5e9a Alerting: scheduler to use short version of model for alert rule (#48916)
* scheduler to use a short version of alert rule model
2022-05-12 09:55:05 -04:00
Alexander Weaver
078a578803 Drop ProvenanceOrgAdapter and build into store API instead (#48137) 2022-04-26 10:30:57 -05:00
George Robinson
c5547123bc Remove redundant queries in GetAlertRules and GetOrgAlertRules and replace with ListAlertRules (#48108) 2022-04-25 11:42:42 +01:00
George Robinson
d66fc6ed1a Alerting: Add GetRuleGroups to RuleStore (#48036)
This commit adds a new method GetRuleGroups to RuleStore which returns the set of rule groups across all organizations.
2022-04-21 17:59:22 +01:00
Jean-Philippe Quéméner
060ccacbf9 Alerting: unwrap upsert into insert and update function (#47731)
* Alerting: unwrap upsert into insert and update function

* add changelog entry

* remove changelog entry

* rename upsertrule to updaterule

* use directly alertrule model for inserts

* add test for updating a rule with a conflicting name
2022-04-14 14:21:36 +02:00
Jean-Philippe Quéméner
388ecb4037 Alerting: Provisioning API - Contact points (#47197) 2022-04-13 22:15:55 +02:00
Yuriy Tseretyan
af9353caec Alerting: Add check for datasource permission in alert rule read API (#47087)
* add check for access to rule's data source in GET APIs

* use more general method GetAlertRules instead of GetNamespaceAlertRules.
* remove unused GetNamespaceAlertRules.

Tests:
* create a method to generate permissions for rules
* extract method to create RuleSrv
* add tests for RouteGetNamespaceRulesConfig
2022-04-11 17:37:44 -04:00
Yuriy Tseretyan
48519f9ebb Alerting: reduce database calls in prometheus-comptible rules API (#47080)
* move validation at the beginning of method
* remove usage of GetOrgRuleGroups because it is not necessary. All information is already available in memory.
* remove unused method
2022-04-11 10:54:29 -04:00
Alexander Weaver
dde0b93cf1 Alerting: Provisioning API - Notification Policies (#46755)
* Base-line API for provisioning notification policies

* Wire API up, some simple tests

* Return provenance status through API

* Fix missing call

* Transactions

* Clarity in package dependencies

* Unify receivers in definitions

* Fix issue introduced by receiver change

* Drop unused internal test implementation

* FGAC hooks for provisioning routes

* Polish, swap names

* Asserting on number of exposed routes

* Don't bubble up updated object

* Integrate with new concurrency token feature in store

* Back out duplicated changes

* Remove redundant tests

* Regenerate and create unit tests for API layer

* Integration tests for auth

* Address linter errors

* Put route behind toggle

* Use alternative store API and fix feature toggle in tests

* Fixes, polish

* Fix whitespace

* Re-kick drone

* Rename services to provisioning
2022-04-05 16:48:51 -05:00
Yuriy Tseretyan
51114527dc Alerting: handle folder permissions when fine-grained access enabled (#47035)
* Use alert:create action for folder search with edit permissions. This matches the action that is used to query dashboards (the update will be addressed later)
* Update rule store to use FindDashboards instead of folder service to list folders the user has access to view alerts. Folder service does not support query type and additional filters. 
* Do not check whether the user can save to folder if FGAC is enabled because it is checked on API level.
2022-04-01 19:33:26 -04:00
Kat Yang
90f2233ea9 Chore: Remove global database engine variable from annotation (#46940)
* Chore: Remove global database engine variable from annotation

* 💩
2022-03-25 13:23:09 -04:00
Yuriy Tseretyan
e20d157a9b Alerting: rules delete API to check data source authorization (#46906)
* merge RuleSrv rule delete methods
* remove unused store methods
* implement delete by uid for fake store
* add scheduler mock
* implement tests for RouteDeleteAlertRules
2022-03-25 12:39:24 -04:00
Yuriy Tseretyan
6610adf090 Alerting: remove UpdateRuleGroup from fake rule store (#46941)
* remove UpdateRuleGroup from fake rule store because It is not part of interface anymore
2022-03-24 19:29:19 -04:00
Yuriy Tseretyan
60d4cd80bf Alerting: update DeleteAlertRuleByUID to accept many UID (#46890) 2022-03-23 16:09:53 -04:00
Yuriy Tseretyan
4ee48c2e77 Alerting: Update GetRuleGroupAlertRules to accept optional rule group (#46889)
* rename GetRuleGroupAlertRules to GetAlertRules
* make rule group optional in GetAlertRulesQuery
* simplify FakeStore. the current structure did not support optional rule group
2022-03-23 17:36:25 +00:00
Jean-Philippe Quéméner
a80f04c949 Alerting: add collision safe update function for alertmanager configurations (#46692)
* Alerting: add collision safe update function for alertmanager configurations

* fix typo

* use bootstrap func for tests

* move hash calculation to store

* remove icons lol

* remove removed field
2022-03-23 09:31:46 +01:00
ying-jeanne
adc0cbf176 remove global variable in annotation (#46746)
* remove global varaible in annotation

* remove todo

* replace intransaction with withdbtransaction

* fix typo
2022-03-22 19:20:57 +08:00
Alexander Weaver
92716cb602 Alerting: Create abstraction for launching transactions and refactor existing transaction management to use it (#46216)
* Remove InTransaction from RuleStore and make it its own interface

* Ensure that ctx-based is clear from name

* Resolve merge conflicts

* Refactor tests to work in terms of the introduced abstraction rather than concrete dbstore
2022-03-15 11:48:42 -05:00
gotjosh
a75d4fcbd8 Alerting: Display query from grafana-managed alert rules on /api/v1/rules (#45969)
* Aleting: Extract query from alerting rule model for api/v1/rules

* more changes and fixtures

* appease the linter
2022-03-14 10:39:20 +00:00
Yuriy Tseretyan
f75bea481d Alerting: validate rules and calculate changes in API controller (#45072)
* Update API controller
   - add validation of rules API model
   - add function to calculate changes between the submitted alerts and existing alerts
   - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction

* Update DBStore
   - delete unused storage method. All the logic is moved upstream.
   - upsert to not modify fields of new by values from the existing alert
   - if rule has UID do not try to pull it from db. (it is done upstream)

* Add rule generator
2022-02-23 11:30:04 -05:00
Selene
d5b98772ed Dashboards: Refactor service to make it injectable by wire (#44588)
* Add providers to folder and dashboard services

* Refactor folder and dashboard services

* Move store implementation to its own file due wire cannot allow us to cast to SQLStore

* Add store in some places and more missing dependencies

* Bad merge fix

* Remove old functions from tests and few fixes

* Fix provisioning

* Remove store from http server and some test fixes

* Test fixes

* Fix dashboard and folder tests

* Fix library tests

* Fix provisioning tests

* Fix plugins manager tests

* Fix alert and org users tests

* Refactor service package and more test fixes

* Fix dashboard_test tets

* Fix api tests

* Some lint fixes

* Fix lint

* More lint :/

* Move dashboard integration tests to dashboards service and fix dependencies

* Lint + tests

* More integration tests fixes

* Lint

* Lint again

* Fix tests again and again anda again

* Update searchstore_test

* Fix goimports

* More go imports

* More imports fixes

* Fix lint

* Move UnprovisionDashboard function into dashboard service and remove bus

* Use search service instead of bus

* Fix test

* Fix go imports

* Use nil in tests
2022-02-16 14:15:44 +01:00
Yuriy Tseretyan
02f8e99ca1 Alerting: move fake stores to store package (#45428)
* make fake storage public
* move fake storages to store package
2022-02-15 17:24:39 -05:00
George Robinson
4e3a72fc2a Add context.Context to AlertingStore (#45069) 2022-02-09 09:22:09 +00:00
George Robinson
67a3e1d6fd Add context.Context to InstanceStore (#45049) 2022-02-08 13:49:04 +00:00