Commit Graph

329 Commits

Author SHA1 Message Date
Alex Moreno
7a465f42a6
Alerting: Allow pausing alerts from provisioning (#62263)
* Allow pausing alerts from provisioning

* Update swagger

* Add IsPaused to provision export endpoints

* Add pause field in sample.yml

* Add exception for reset state in first loop iteration of scheduler if rule is paused

* Update provision definition and swagger docs

* Fix provisioning export tests

* Suggestion: Simplify if condition

* Add more context to a comment
2023-01-30 16:29:05 +01:00
Serge Zaitsev
d6d4097567
Chore: Fix goimports grouping in alerting (#62424)
* fix goimports

* fix goimports order
2023-01-30 09:55:35 +01:00
Matthew Jacobson
c006df375a
Alerting: Create endpoints for exporting in provisioning file format (#58623)
This adds provisioning endpoints for downloading alert rules and alert rule groups in a 
format that is compatible with file provisioning. Each endpoint supports both json and 
yaml response types via Accept header as well as a query parameter 
download=true/false that will set Content-Disposition to recommend initiating a download 
or inline display.

This also makes some package changes to keep structs with potential to drift closer 
together. Eventually, other alerting file structs should also move into this new file 
package, but the rest require some refactoring that is out of scope for this PR.
2023-01-27 11:39:16 -05:00
George Robinson
b6db1ed524
Revert "Alerting: Add is_paused attr to the POST alert rule group endpoint" (#62310)
Revert "Alerting: Add is_paused attr to the POST alert rule group endpoint (#62253)"

This reverts commit 3ccafe3a5a.
2023-01-27 13:41:36 +01:00
Alex Moreno
3ccafe3a5a
Alerting: Add is_paused attr to the POST alert rule group endpoint (#62253)
Add is_paused attr to the POST alert rule group endpoint
2023-01-27 10:50:06 +01:00
idafurjes
6c5a573772
Chore: Move ReqContext to contexthandler service (#62102)
* Chore: Move ReqContext to contexthandler service

* Rename package to contextmodel

* Generate ngalert files

* Remove unused imports
2023-01-27 08:50:36 +01:00
Alex Moreno
531b439cf1
Alerting: Add alert pausing feature (#60734)
* Add field in alert_rule model, add state to alert_instance model, and state to eval

* Remove paused state from eval package

* Skip paused alert rules in scheduler

* Add migration to add is_paused field to alert_rule table

* Convert to postable alerts only if not normal, pernding, or paused

* Handle paused eval results in state manager

* Add Paused state to eval package

* Add paused alerts logic in scheduler

* Skip alert on scheduler

* Remove paused status from eval package

* Apply suggestions from code review

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Remove state

* Rethink schedule and manager for paused alerts

* Change return to continue

* Remove unused var

* Rethink alert pausing

* Paused alerts storing annotations

* Only add one state transition

* Revert boolean method renaming refactor

* Revert take image refactor

* Make registry errors public

* Revert method extraction for getting a folder title

* Revert variable renaming refactor

* Undo unnecessary changes

* Revert changes in test

* Remove IsPause check in PatchPartiLAlertRule function

* Use SetNormal to set state

* Fix text by returning to old behaviour on alert rule deletion

* Add test in schedule_unit_test.go to test ticks with paused alerts

* Add coment to clarify usage of context.Background()

* Add comment to clarify resetStateByRuleUID method usage

* Move rule get to a more limited scope

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* rum gofmt on pkg/services/ngalert/schedule/schedule.go

* Remove defer cancel for context

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/testing.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* skip scheduler rule state clean up on paused alert rule

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Fix mock in test

* Add (hopefully) final suggestions

* Use error channel from recordAnnotationsSync to cancel context

* Run make gen-cue

* Place pause alert check in channel update after version check

* Reduce branching un update channel select

* Add if for error and move code inside if in state manager ResetStateByRuleUID

* Add reason to logs

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Do not delete alert rule routine, just exit on eval if is paused

* Reduce branching and create-close a channel to avoid deadlocks

* Separate state deletion and state reset (includes history saving)

* Add current pause state in rule route in scheduler

* Split clearState and bring errCh closer to RecordStatesAsync call

* Change rule to ruleMeta in RecordStatesAsync

* copy state to be able to modify it

* Add timeout to context creation

* Shorten the timeout

* Use resetState is rule is paused and deleteState if rule is not paused

* Remove Empty state reason

* Save every rule change in historian

* Add tests for DeleteStateByRuleUID and ResetStateByRuleUID

* Remove useless line

* Remove outdated comment

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2023-01-26 18:29:10 +01:00
Alexander Weaver
eb1293ebb1
Alerting: Re-generate swagger definitions (#62154)
Regenerate swagger, add body binding so parameters work
2023-01-25 15:01:42 -06:00
Santiago
e5920c211e
Chore: Fix random indices for slices in test files (#61884)
* Fix random indices for slices in test files

* Empty commit
2023-01-24 15:07:37 -03:00
gotjosh
511dab3b4b
Update grafana/alerting to the latest main (#61810)
* Update `grafana/alerting` to the latest main

Also updates prometheus-alertmanager since we use that one directly for some structs.
2023-01-19 20:44:49 +00:00
George Robinson
d4256b352d
Docs: Rename Message templates to Notification templates (#59477)
This commit renames "Message templates" to "Notification templates"
in the user interface as it suggests that these templates cannot
be used to template anything other than the message. However, message
templates are much more general and can be used to template other fields
too such as the subject of an email, or the title of a Slack message.
2023-01-18 17:26:34 +00:00
Jo
dcfeab2c73
AuthN: User Quota (#61540)
* remove reqContext from quota checks in login

* add guards for nil ScopeParams
2023-01-16 11:54:15 +01:00
gotjosh
e7cd6eb13c
Alerting: Use alerting.GrafanaAlertmanager instead of initialising Alertmanager components directly (#61230)
* Alerting: Use `alerting.GrafanaAlertmanager` instead of initialising Alertmanager components directly
2023-01-13 12:54:38 -04:00
gotjosh
fd6f107ded
Alerting Unification: Use the errors from grafana/alerting in Alerts (#61425) 2023-01-12 15:23:34 -04:00
gotjosh
39e429a14b
Alerting Unification: Use the errors from grafana/alerting in Silences (#61334) 2023-01-12 10:03:49 -04:00
George Robinson
2a291afbae
Alerting: Use consts from alerting package (#61241) 2023-01-10 19:59:13 +00:00
Yuri Tseretyan
f990be58cb
Alerting: Use all notifiers from alerting repository (#60655) 2022-12-22 09:27:18 -05:00
Yuri Tseretyan
35090c376c
Alerting: Replace VictorOps receiver with the one from alerting repository (#60543)
* replace victorops with one from alerting

* update other usages
2022-12-20 10:55:41 +01:00
Yuri Tseretyan
f0cabe14d5
Alerting: import Grafana alerting package and update usages (#60490)
* update remaining notifiers to use alerting package
2022-12-19 10:53:58 -05:00
Yuri Tseretyan
9ad45aedcf
Alerting: replace usage of simplejson to json.RawMessage in NotificationChannelConfig (#60423)
* introduce alias for json.RawMessage with name RawMessage. This is needed to keep raw JSON and implement a marshaler for YAML, which does not seem to be used but there are tests that fail.
* replace usage of simplejson with RawMessage in NotificationChannelConfig
* remove usage of simplejson in tests
* change migration code to convert simplejson to raw message
2022-12-16 13:01:06 -05:00
Alex Moreno
174c61b949
Alerting: Set Dashboard and Panel IDs on rule group replacement (#60374)
* Set Dashboard and Panel IDs on rule group replacement

* fix comments and abbreviate test variable name

* Update pkg/services/ngalert/provisioning/alert_rules.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2022-12-16 11:47:25 +01:00
Yuri Tseretyan
6637333748
Alerting: refactor notifiers to use package specific Logger interface (#60361)
* introduce Logger interface local to channles + implementaton that wraps the Grafana logger
* make NewFactoryConfig accept LoggerFactory
* add logger field to FactoryConfig
* update usages of log.Logger to internal interface
2022-12-15 11:10:31 -05:00
Yuri Tseretyan
ad09feed83
Alerting: rule backtesting API (#57318)
* Implement backtesting engine that can process regular rule specification (with queries to datasource) as well as special kind of rules that have data frame instead of query.
* declare a new API endpoint and model
* add feature toggle `alertingBacktesting`
2022-12-14 09:44:14 -05:00
Alexander Weaver
e97b43cd58
Alerting: Add provisioning endpoint to fetch all rules (#59989)
* Domain layer api for fetching all rules

* Add endpoint for getting all rules
2022-12-13 11:54:08 +01:00
Bart Peeters
57d6adbc7c
Alerting: Support Prometheus durations in Provisioning API (#58293)
Provisioning API should support Prometheus durations
2022-11-21 18:58:25 +00:00
Karl Persson
fef1e1d5bc
Auth: Refactor auth package (#58920)
* Auth: move interface to its own file

* Auth: move to test package

* Auth: move quota consts to auth file

* Auth: move service to impl package

* Auth: move interfaces and related models to auth package

* Auth: Create sub package and type alias to avoid circular dependency
2022-11-18 09:56:06 +01:00
Sofia Papagiannaki
9855e74b92
Chore: Refactor quota service (#58643)
Chore: Refactor quota service (#57586)

* Chore: refactore quota service

* Apply suggestions from code review
2022-11-14 21:08:10 +02:00
Alex Moreno
78bb8c10ce
Alerting: Allow none provenance alert rule creation from provisioning API (#58410) 2022-11-11 19:58:45 +01:00
idafurjes
080ea88af7
Nested Folders: Support getting of nested folder in folder service wh… (#58597)
* Nested Folders: Support getting of nested folder in folder service when feature flag is set

* Fix lint

* Fix some tests

* Fix ngalert test

* ngalert fix

* Fix API tests

* Fix some tests and lint

* Fix lint 2

* Fix library elements and panels

* Add access control to get folder

* Cleanup and minor test change
2022-11-11 14:28:24 +01:00
Alex Moreno
45facbba11
Alerting: Remove url based external alertmanagers config (#57918)
* Remove URL-based alertmanagers from endpoint config

* WIP

* Add migration and alertmanagers from admin_configuration

* Empty comment removed

* set BasicAuth true when user is present in url

* Remove Alertmanagers from GET /admin_config payload

* Remove URL-based alertmanager configuration from UI

* Fix new uid generation in external alertmanagers migration

* Fix tests for URL-based external alertmanagers

* Fix API tests

* Add more tests, move migration code to separate file, and remove possible am duplicate urls

* Fix edge cases in migration

* Fix imports

* Remove useless fields and fix created_at/updated_at retrieval

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>
2022-11-10 16:34:13 +01:00
Sofia Papagiannaki
96cdf77995
Revert "Chore: Refactor quota service (#57586)" (#58394)
This reverts commit 326ea86a57.
2022-11-08 11:52:07 +02:00
Sofia Papagiannaki
326ea86a57
Chore: Refactor quota service (#57586)
* Chore: refactore quota service

* Apply suggestions from code review
2022-11-08 10:25:34 +02:00
Alexander Weaver
cc8c1380e2
Alerting: Persist annotations from multidimensional rules in batches (#56575)
* Reduce piecemeal state fields

* Read data directly off state instead of rule

* Unify state and context into single struct

* Expose contextual information to layer above setNextState

* Work in terms of ContextualState and call historian in batches

* Call annotations service in batches

* Export format state and reason and remove workaround in unrelated test package

* Add new method to annotation service for batch inserting

* Fix loop variable aliasing bug caught by linter, didn't change behavior

* Incl timerange on annotation tests

* Insert one at a time if tags are present

* Point to rule from ContextualState rather than copy fields

* Build annotations and copy data prior to starting goroutine

* Rename to StateTransition

* Use new bulk-insert utility

* Remove rule from StateTransition and pass in directly to historian

* Simplify annotations logic since we have only one rule

* Fix logs and context, nilcheck, simplify method name

* Regenerate mock
2022-11-04 10:39:26 -05:00
George Robinson
f2e4cb7c4e
Alerting: Fix feedback (#57922) 2022-11-02 22:36:14 +00:00
Yuriy Tseretyan
e3a4bde622
Alerting: Condition evaluator with cached pipeline (#57479)
* create rule evaluator
* load header from the context
* init one factory
* update scheduler
2022-11-02 10:13:39 -04:00
George Robinson
4c581b5f85
Alerting: Fix response is not returned for invalid Duration in Provisioning API (#58046) 2022-11-02 08:21:23 -04:00
George Robinson
b0a927b138
Alerting: Add debug logs in validateAndGetPrefix (#57002) 2022-10-31 16:40:28 +00:00
Yuriy Tseretyan
d848cc629b
Alerting: Refactor rule interval validation to be reusable (#57792) 2022-10-28 14:40:11 +00:00
Santiago
cdb5d4230a
Alerting: Fix "Not Implemented" responses (#57710)
* fix swagger spec, return 404 instead of 501 when an endpoint does not exist

* update number of paths in authorization_test.go
2022-10-26 23:35:52 -03:00
Yuriy Tseretyan
0a4121cef8
Alerting: Contextual log provider for rule key (#57476)
* create contextual log context provider
* use contextual provider in scheduler
* init logger in the package
* use context for log context
* use context in state manager
2022-10-26 19:16:02 -04:00
Alexander Weaver
3ddb28bad9
Find-and-replace 'err' logs to 'error' to match log search conventions (#57309) 2022-10-19 17:36:54 -04:00
Alexander Weaver
4eb8e4ff66
Alerting: Add traceability headers for alert queries (#57127)
* Define EvaluationContext

* Refactor ConditionEval to use new context struct

* Refactor QueriesAndExpressionsEval to use EvaluationContext

* Remove dead field from AlertExecCtx

* Refactor Validate to use EvaluationContext

* Get rid of privately used AlertExecCtx

* Move EvaluationContext to new file and add helper

* Add builder pattern and bind rule info to context

* Extract header logic and add rule UID header

* Fix missing call
2022-10-19 14:19:43 -05:00
Kristin Laemmert
05709ce411
chore: remove sqlstore & mockstore dependencies from (most) packages (#57087)
* chore: add alias for InitTestDB and Session

Adds an alias for the sqlstore InitTestDB and Session, and updates tests using these to reduce dependencies on the sqlstore.Store.

* next pass of removing sqlstore imports
* last little bit
* remove mockstore where possible
2022-10-19 09:02:15 -04:00
Santiago
6ad405e256
fix swagger spec for receivers API response (#57124) 2022-10-17 16:58:55 -03:00
Yuriy Tseretyan
888bdfd4ad
Alerting: Use correct response body for silence post API (#57114) 2022-10-17 15:43:37 -04:00
Santiago
09f8e026a1
Alerting: Expose info about notification delivery errors in a new /receivers endpoint (#55429)
* (WIP) switch to fork AM, first implementation of the API, generate spec

* get receivers avoiding race conditions

* use latest version of our forked AM, tests

* make linter happy, delete TODO comment

* update number of expected paths to += 2

* delete unused endpoint code, code review comments, tests

* Update pkg/services/ngalert/notifier/alertmanager.go

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>

* remove call to fmt.Println

* clear naming for fields

* shorter variable names in GetReceivers

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2022-10-03 10:58:41 -03:00
Alexander Weaver
c16317e5b8
Alerting: Move fake rule store to the test utilities package (#56062)
* Move fakeRuleStore to tests/fakes package

* Break stub dependencies on store

* Update existing tests to point to new location

* Remove unused stub of TimeNow

* Rename fake to take advantage of package name
2022-09-30 14:36:51 -05:00
Alexander Weaver
d66ed6fe35
Alerting: Move stray model structs in store package to model package (#55968)
* Move stray command structs to model package like the rest

* Fix broken references
2022-09-29 15:47:56 -05:00
Alexander Weaver
d17ab82b98
Alerting: Break up store.RuleStore interface, delete dead code (#55776)
* Refactor state manager to not depend on rule store interface

* Refactor grafana and proxied ruler APIs to not depend on store.RuleStore

* Refactor folder subscription logic to not use store.RuleStore

* Delete dead code

* Delete store.RuleStore
2022-09-27 08:56:30 -05:00
Alexander Weaver
a00879ae21
Alerting: Refactor store to not export its own interface for InstanceStore, delete dead dependency injection (#55772)
* Add consumer-side store interface to state manager

* Remove dead dependency

* Delete dead dependency in API struct

* Delete store-layer InstanceStore interface

* Move fake for state's InstanceStore interface to state package
2022-09-26 13:55:05 -05:00