Commit Graph

134 Commits

Author SHA1 Message Date
Alexander Weaver
3ddb28bad9
Find-and-replace 'err' logs to 'error' to match log search conventions (#57309) 2022-10-19 17:36:54 -04:00
Kristin Laemmert
05709ce411
chore: remove sqlstore & mockstore dependencies from (most) packages (#57087)
* chore: add alias for InitTestDB and Session

Adds an alias for the sqlstore InitTestDB and Session, and updates tests using these to reduce dependencies on the sqlstore.Store.

* next pass of removing sqlstore imports
* last little bit
* remove mockstore where possible
2022-10-19 09:02:15 -04:00
Kristin Laemmert
c61b5e85b4
chore: replace sqlstore.Store with db.DB (#57010)
* chore: replace sqlstore.SQLStore with db.DB

* more post-sqlstore.SQLStore cleanup
2022-10-14 15:33:06 -04:00
Serge Zaitsev
53baecd71f
Chore: Move folder service into a separate package (#56591)
* Chore: move folder service interface into a separate package

* copy implementation into a standalone package

* move implementation and tests to the new folder package

* remove leftovers from wire

* add test doubles for folder service

* fix tests in library panels/elements

* fix provideservice in ngalert
2022-10-10 21:47:53 +02:00
Joe Blubaugh
7312a2dab0
Alerting: Mark all tests that interact with the database as Integration tests. (#54875)
Previously, two tests were not explicitly marked as integration tests
and so were not run against all 3 supported databases in the CI
environment.
2022-10-10 01:54:54 -04:00
George Robinson
762688d67f
Alerting: Fix pq: missing FROM-clause for table "a" (#56453)
This commit fixes a bug where changing the Folder or Rule Group of an existing rule returns the following error in PostgreSQL "pq: missing FROM-clause for table a"
2022-10-07 10:18:49 +01:00
Joe Blubaugh
b476ae62fb
Alerting: Write and Delete multiple alert instances. (#55350)
Prior to this change, all alert instance writes and deletes happened
individually, in their own database transaction. This change batches up
writes or deletes for a given rule's evaluation loop into a single
transaction before applying it.

These new transactions are off by default, guarded by the feature toggle "alertingBigTransactions"

Before:

```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8           398           2991381 ns/op         1133537 B/op      27703 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.619s
```

After:

```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8          1440            816484 ns/op          352297 B/op       6529 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.383s
```

So we cut time by about 75% and memory allocations by about 60% when
storing and deleting 100 instances.
2022-10-06 14:22:58 +08:00
Alexander Weaver
c16317e5b8
Alerting: Move fake rule store to the test utilities package (#56062)
* Move fakeRuleStore to tests/fakes package

* Break stub dependencies on store

* Update existing tests to point to new location

* Remove unused stub of TimeNow

* Rename fake to take advantage of package name
2022-09-30 14:36:51 -05:00
Alexander Weaver
d66ed6fe35
Alerting: Move stray model structs in store package to model package (#55968)
* Move stray command structs to model package like the rest

* Fix broken references
2022-09-29 15:47:56 -05:00
Sofia Papagiannaki
8b77ee2734
SQLStore: Ensure that sessions are always closed (#55864)
* SQLStore: Ensure that sessions are always closed

Delete `NewSession()` in favour of `WithDbSession()`

* Add WithDbSessionForceNewSession to the interface

* Apply suggestions from code review
2022-09-29 15:55:47 +03:00
Alexander Weaver
e6f99fc418
Alerting: Decouple schedule package from store (#55858)
* Separate out fake for scheduler tests

* Delete extracted methods from older fake
2022-09-27 13:48:12 -05:00
Alexander Weaver
d17ab82b98
Alerting: Break up store.RuleStore interface, delete dead code (#55776)
* Refactor state manager to not depend on rule store interface

* Refactor grafana and proxied ruler APIs to not depend on store.RuleStore

* Refactor folder subscription logic to not use store.RuleStore

* Delete dead code

* Delete store.RuleStore
2022-09-27 08:56:30 -05:00
Alexander Weaver
f11495a4c3
Alerting: Remove dead functionality from alert instance store (#55774)
* Update tests to use ListAlertInstances

* Drop the actual methods rather than just updating tests
2022-09-26 14:38:53 -05:00
Alexander Weaver
a00879ae21
Alerting: Refactor store to not export its own interface for InstanceStore, delete dead dependency injection (#55772)
* Add consumer-side store interface to state manager

* Remove dead dependency

* Delete dead dependency in API struct

* Delete store-layer InstanceStore interface

* Move fake for state's InstanceStore interface to state package
2022-09-26 13:55:05 -05:00
George Robinson
bad4f7fec5
Alerting: Change screenshots to use components (#55156)
* Alerting: Change screenshots to use components

This commit changes screenshots to use a number of components instead of a set of functional wrappers.

It moves the uploading of screenshots from the screenshot package to the image package so we can re-use the same code for both uploading screenshots and server-side images; SingleFlight from the screenshot package to the image package so we can use it for both taking and uploading the screenshot, where as before it was used just for taking the screenshot; and it also removes the use of a cache because we know that screenshots can be taken at most once per tick of the scheduler.
2022-09-21 10:25:07 +01:00
Sofia Papagiannaki
754eea20b3
Chore: SQL store split for annotations (#55089)
* Chore: SQL store split for annotations

* Apply suggestion from code review
2022-09-19 10:54:37 +03:00
Emil Tullstedt
b287047052
Chore: Upgrade Go to 1.19.1 (#54902)
* WIP

* Set public_suffix to a pre Ruby 2.6 version

* we don't need to install python

* Stretch->Buster

* Bump versions in lib.star

* Manually update linter

Sort of messy, but the .mod-file need to contain all dependencies that
use 1.16+ features, otherwise they're assumed to be compiled with
-lang=go1.16 and cannot access generics et al.

Bingo doesn't seem to understand that, but it's possible to manually
update things to get Bingo happy.

* undo reformatting

* Various lint improvements

* More from the linter

* goimports -w ./pkg/

* Disable gocritic

* Add/modify linter exceptions

* lint + flatten nested list

Go 1.19 doesn't support nested lists, and there wasn't an obvious workaround.
https://go.dev/doc/comment#lists
2022-09-12 12:03:49 +02:00
Joe Blubaugh
22c937340e
Revert "Alerting: Write and Delete multiple alert instances. (#54072)" (#54885)
This reverts commit 5e4fd94413.
2022-09-09 17:44:06 +02:00
George Robinson
77e53f9986
Alerting: Fix boolean comparison on PostgreSQL (#54730) 2022-09-06 08:28:42 +01:00
Joe Blubaugh
5e4fd94413
Alerting: Write and Delete multiple alert instances. (#54072)
Prior to this change, all alert instance writes and deletes happened
individually, in their own database transaction. This change batches up
writes or deletes for a given rule's evaluation loop into a single
transaction before applying it.

Before:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8           398           2991381 ns/op         1133537 B/op      27703 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.619s
```

After:
```
goos: darwin
goarch: arm64
pkg: github.com/grafana/grafana/pkg/services/ngalert/store
BenchmarkAlertInstanceOperations-8          1440            816484 ns/op          352297 B/op       6529 allocs/op
--- BENCH: BenchmarkAlertInstanceOperations-8
    util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created
    util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created
PASS
ok      github.com/grafana/grafana/pkg/services/ngalert/store   1.383s
```

So we cut time by about 75% and memory allocations by about 60% when
storing and deleting 100 instances.

This change also updates some of our tests so that they run successfully against postgreSQL - we were using random Int64s, but postgres integers, which our tables use, max out at 2^31-1
2022-09-02 11:17:20 +08:00
Yuriy Tseretyan
76ea0b15ae
Alerting: Scheduler to fetch folders along with rules (#52842)
* Update GetAlertRulesForScheduling to query for folders (if needed)
* Update scheduler's alertRulesRegistry to cache folder titles along with rules
* Update rule eval loop to take folder title from the
* Extract interface RuleStore 
* Pre-fetch the rule keys with the version to detect changes, and query the full table only if there are changes.
2022-08-31 11:08:19 -04:00
Yuriy Tseretyan
03e746d9df
Alerting: Delete state from the database on reset (#53919)
* make ResetStatesByRuleUID return states
* delete rule states when reset
* rule eval routine to clean up the state only when rule is deleted
2022-08-25 14:12:22 -04:00
Valério Valério
b5142832fa
Alerting: Fix saving of screenshots uploaded with a signed url (#53933)
The URL of screenshots uploaded to external image storages can be optionally signed, resulting in a long string (800+ chars).
2022-08-24 12:40:50 +01:00
idafurjes
a14621fff6
Chore: Add user service method SetUsingOrg and GetSignedInUserWithCacheCtx (#53343)
* Chore: Add user service method SetUsingOrg

* Chore: Add user service method GetSignedInUserWithCacheCtx

* Use method GetSignedInUserWithCacheCtx from user service

* Fix lint after rebase

* Fix lint

* Fix lint error

* roll back some changes

* Roll back changes in api and middleware

* Add xorm tags to SignedInUser ID fields
2022-08-11 13:28:55 +02:00
Alexander Weaver
b198559225
Alerting: Extend PUT rule-group route to write the entire rule group rather than top-level fields only (#53078)
* Wire up to full alert rule struct

* Extract group change detection logic to dedicated file

* GroupDiff -> GroupDelta for consistency

* Calculate deltas and handle backwards compatible requests

* Separate changes and insert/update/delete as needed

* Regenerate files

* Don't touch the DB if there are no changes

* Quota checking, delete unused file

* Mark modified records as provisioned

* Validation + a couple API layer tests

* Address linter errors

* Fix issue with UID assignment and rule creation

* Propagate top level group fields to all rules

* Tests for repeated updates and versioning

* Tests for quota and provenance checks

* Fix linter errors

* Regenerate

* Factor out some shared logic

* Drop unnecessary multiple nilchecks

* Use alternative strategy for rolling UIDs on inserted rules

* Fix tests, add back nilcheck, refresh UIDs during test

* Address feedback

* Add missing nil-check
2022-08-10 12:33:41 -05:00
idafurjes
6afad51761
Move SignedInUser to user service and RoleType and Roles to org (#53445)
* Move SignedInUser to user service and RoleType and Roles to org

* Use go naming convention for roles

* Fix some imports and leftovers

* Fix ldap debug test

* Fix lint

* Fix lint 2

* Fix lint 3

* Fix type and not needed conversion

* Clean up messages in api tests

* Clean up api tests 2
2022-08-10 11:56:48 +02:00
George Robinson
196b781c70
Alerting: Delete expired images from the database (#53236)
This commit adds a DeleteExpiredService that deletes expired images from the database. It is run in the periodic collector service.
2022-08-09 15:28:36 +01:00
Alexander Weaver
c50cbea0bb
Alerting: Extract alert rule diff logic into separate file with exported API (#53083)
* Refactor diff logic into separate file with exported API

* Fix linter complaint
2022-08-01 23:41:23 -05:00
Yuriy Tseretyan
5fb778814c
Alerting: Update rules version when folder title is updated (#53013)
* remove support for bus from scheduler
* rename event to FolderTitleUpdated and fire only if title has changed
* add method to increase version of all rules that belong to a folder
* update ngalert service to subscribe to folder title change event call data store and update scheduler
* add tests
2022-08-01 19:28:38 -04:00
Alexander Weaver
cc20f04860
Alerting: Increase alert rule operation perf by replacing subquery with threshold calculation (#53069)
* Replace subquery with threshold calculation

* Use offset/limit to account for orgs with large gaps in IDs

* Collapse into one statement

* Drop dead constants

* Revert to 2 query approach

* Drop unused consts again
2022-08-01 16:48:34 -05:00
Yuriy Tseretyan
a081764fd8
Alerting: Scheduler to use AlertRule (#52354)
* update GetAlertRulesForSchedulingQuery to have result AlertRule
* update fetcher utils and registry to support AlertRule
* alertRuleInfo to use alert rule instead of version
* update updateCh hanlder of ruleRoutine to just clean up the state. The updated rule will be provided at the next evaluation
* update evalCh handler of ruleRoutine to use rule from the message and clear state as well as update extra labels

* remove unused function in ruleRoutine
* remove unused model SchedulableAlertRule

* store rule version in ruleRoutine instead of rule
* do not call the sender if nothing to send
2022-07-26 09:40:06 -04:00
Jean-Philippe Quéméner
320262c3db
Alerting: Cleanup the alert_configuration table on write (#51497) 2022-07-20 16:54:18 +02:00
Yuriy Tseretyan
054fe54b03
Alerting: Split Scheduler and AlertRouter tests (#52416)
* move fake FakeExternalAlertmanager to sender package
* move tests from scheduler to router
* update alerts router to have all fields private
* update scheduler tests to use sender mock
2022-07-19 09:32:54 -04:00
Yuriy Tseretyan
6e1e4a4215
Alerting: Update DbStore to use disabled orgs from the config (#52156)
* update DbStore to use UnifiedAlerting settings
* remove disabled orgs from scheduler and use config in db store instead
* remove test
2022-07-15 14:13:30 -04:00
Yuriy Tseretyan
8b3b667a47
Alerting: Fix rule API to accept 0 duration of field For (#50992)
* make 'for' pointer to distinguish between missing field and 0
* set 'for' to -1 if the value is missing but not allow negative in the request + path -1 with the value from original rule
* update store validation to not allow negative 'for'
* update usages to use pointer
2022-06-30 11:46:26 -04:00
Emil Tullstedt
7d815a1db5
Alerting: Use google/uuid instead of gofrs/uuid (#51242) 2022-06-28 11:57:24 +02:00
Jean-Philippe Quéméner
cd0fefec5b
Alerting: change optimistic lock to use proper insert select (#51461)
* Alerting: change optimistic lock to proper insert select

* remove debug logging

* fix postgres

* fix mysql

* remove empty line for go-lint

* add some docs

* use constants
2022-06-28 00:20:21 +02:00
Yuriy Tseretyan
4d02f73e5f
Alerting: Persist rule position in the group (#50051)
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration

API:
* set group index on update. Use the natural order of items in  the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.

UI:
* update UI to keep the order of alerts in a group
2022-06-22 10:52:46 -04:00
Matthew Jacobson
5dee2ed24c
Alerting: Add first Grafana reserved label grafana_folder (#50262)
* Alerting: Add first Grafana reserved label g_label

g_label holds the title of the folder container the alert. The intention of this label
is to use it as part of the new default notification policy groupBy.

* Add nil check on updateRule labels map

* Disable gocyclo lint on schedule.ruleRoutine

will remove later in a separate refactoring PR to reduce complexity.

* Address doc suggestions

* Update g_folder for rules in folder when folder title changes

* Remove global bus in FolderService

* Modify tests to fit new common g_folder label

* Add changelog entry

* Fix merge conflicts

* Switch GrafanaReservedLabelPrefix from `g_` to `grafana_`
2022-06-17 13:10:49 -04:00
Yuriy Tseretyan
c314ce48c7
Alerting: Support for optimistic locking for alert rules (#50274)
* add support for optimistic locking for alert_rule table
* return 409 in the case of opitimistic lock
2022-06-13 12:15:28 -04:00
Jean-Philippe Quéméner
ed6a887737
Alerting: remove unused function in alert rule store (#50696) 2022-06-13 11:24:29 -04:00
Kat Yang
bd35e6917a
Chore: Exclude integration tests from running on test-backend step (#50359)
* Chore: Exclude integration tests from running on test-backend step

* Remove -v from go test command

* Add check to skip integration tests before each integration test

* Try to restart pipeline

* Retrying to make pipeline run
2022-06-10 11:46:21 -04:00
Jean-Philippe Quéméner
cf684ed38f
Alerting: bump rule version when updating rule group interval (#50295)
* Alerting: move group update to alert rule service

* rename validateAlertRuleInterval to validateRuleGroupInterval

* init baseinterval correctly

* add seconds suffix

* extract validation function for reusability

* add context to err message
2022-06-09 09:28:32 +02:00
gotjosh
0cde283505
Alerting: Logs should not be capitalized and the errors key should be "err" (#50333)
* Alerting: decapitalize log lines and use "err" as the key for errors

Found using (logger|log).(Warn|Debug|Info|Error)\([A-Z] and (logger|log).(Warn|Debug|Info|Error)\(.+"error"
2022-06-07 19:54:23 +02:00
Jean-Philippe Quéméner
81d360529b
Alerting: Provisioning API - Alert rules (#47930) 2022-06-02 14:48:53 +02:00
Kat Yang
c63ebc887b
Chore: Run integration tests without grabpl (#49448)
* Chore: Run integration tests without grabpl

* Add new step for integration tests in lib.star

* Remove old integration test step from lib.star

* Change drone signature

* Fix: Edit starlark integration step to not affect enterprise

* Remove all build tags & rename starlark integration test step

* Resync .drone.yml with .drone.star

* Fix lint errors

* Fix lint errors

* Fix lint errors

* Fix more lint errors

* Fix another lint error

* Rename integration test step

* Fix last lint error

* Recomment enterprise step

* Remove comment from Makefile

Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
2022-06-01 14:55:22 -04:00
Yuriy Tseretyan
ad25e2a20c
Alerting: Update RBAC for alert rules to consider access to rule as access to group it belongs (#49033)
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* a new query that returns alerts in group by UID of alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
2022-06-01 10:23:54 -04:00
George Robinson
47a3ddd968
Alerting: Add GetImages to ImageStore (#49717)
* Alerting: Add GetImages to ImageStore

* Use assert.ElementsMatch instead of sort.Sort
2022-05-30 09:26:16 +01:00
Joe Blubaugh
9e8efaa459
Alerting: Add stored screenshot utilities to the channels package. (#49470)
Adds three functions:
`withStoredImages` iterates over a list of models.Alerts, extracting a stored image's data from storage, if available, and executing a user-provided function.
`withStoredImage` does this for an image attached to a specific alert.
`openImage` finds and opens an image file on disk.

Moves `store.Image` to `models.Image`
Simplifies `channels.ImageStore` interface and updates notifiers that use it to use the simpler methods.
Updates all pkg/alert/notifier/channels to use withStoredImage routines.
2022-05-26 13:29:56 +08:00
Kristin Laemmert
debbb8d59d
sqlstore: finish removing Find and SearchDashboards (#49347)
* chore: replace artisnal FakeDashboardService with generated mock

Maintaining a handcrafted FakeDashboardService is not sustainable now that we are in the process of moving the dashboard-related functions out of sqlstore.

* sqlstore: finish removing Find and SearchDashboards

Find and SearchDashboards were previously copied into the dashboard service. This commit completes that work, removing Find and SearchDashboards from the sqlstore and updating callers to use the dashboard service.

* dashboards: remove SearchDashboards from Store interface

SearchDashboards is a wrapper around FindDashboard that transforms the results, so it's been moved out of the Store entirely and the functionality moved into the Dashboard Service's search implementation.

The database tests depended heavily on the transformation, so I added testSearchDashboards, a copy of search dashboards, instead of (heavily) refactoring all the tests.
2022-05-24 09:24:55 -04:00