* Add health fields to rules and an aggregator method to the scheduler
* Move health, last error, and last eval time in together to minimize state processing
* Wire up a readonly scheduler to prom api
* Extract to exported function
* Use health in api_prometheus and fix up tests
* Rename health struct to status
* Fix tests one more time
* Several new tests
* Handle inactive rules
* Push state mapping into state manager
* rename to StatusReader
* Rectify cyclo complexity rebase
* Convert existing package local status implementation to models one
* fix tests
* undo RuleDefs rename
* Support record struct in provisioning API
* Update api spec
* Use record field
* Restrict API endpoints following toggle
* Fix swagger spec
* Add recording rule validation to store validator
* Alerting: Add optional metadata to GET silence responses
- ruleMetadata: to request rule metadata.
- accesscontrol: to request access control metadata.
* Alerting: Add single rule checks to alert rule access control
Modifies ruler api single rule read to no longer fetch entire groups and instead
use the new single rule ac check.
Simplifies provisioning api getAlertRuleAuthorized logic to always load a single
rule instead of conditionally loading the entire group when provisioning
permissions are not present.
* Swap out Has/AuthorizeAccessToRule for Has/AuthorizeAccessInFolder
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated
* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.
* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings
* update GET API so only admins can see auto-gen
* update GetUserVisibleNamespaces to use FolderSeriver
* update GetNamespaceByUID to use FolderService.GetFolders
* update GetAlertRulesForScheduling to use FolderService.GetFolders
* Update API and GetAlertRulesForScheduling to use the folder's full path
* get full path of folder in RouteTestGrafanaRuleConfig
* fix escaping of titles for MySQL
* Add single receiver method
* Add receiver permissions
* Add single/multi GET endpoints for receivers
* Remove stable tag from time intervals
See end of PR description here: https://github.com/grafana/grafana/pull/81672
* declare new API and models GettableTimeIntervals, PostableTimeIntervals
* add new actions alert.notifications.time-intervals:read and alert.notifications.time-intervals:write.
* update existing alerting roles with the read action. Add to all alerting roles.
* add integration tests
This PR has two steps that together create a functional dry-run capability for the migration.
By enabling the feature flag alertingPreviewUpgrade when on legacy alerting it will:
a. Allow all Grafana Alerting background services except for the scheduler to start (multiorg alertmanager, state manager, routes, …).
b. Allow the UI to show Grafana Alerting pages alongside legacy ones (with appropriate in-app warnings that UA is not actually running).
c. Show a new “Alerting Upgrade” page and register associated /api/v1/upgrade endpoints that will allow the user to upgrade their organization live without restart and present a summary of the upgrade in a table.
* add metrics and tracing to state manager
* propagate tracer to state manager
* add scheduler metrics
* fix backtesting
* add test for state metrics
* remove StateUpdateCount
* update docs
* metrics can be null
* add tracer to new tests
* Alerting: Repurpose rule testing endpoint to return potential alerts
This feature replaces the existing no-longer in-use grafana ruler testing API endpoint /api/v1/rule/test/grafana. The new endpoint returns a list of potential alerts created by the given alert rule, including built-in + interpolated labels and annotations.
The key priority of this endpoint is that it is intended to be as true as possible to what would be generated by the ruler except that the resulting alerts are not filtered to only Resolved / Firing and ready to be sent.
This means that the endpoint will, among other things:
- Attach static annotations and labels from the rule configuration to the alert instances.
- Attach dynamic annotations from the datasource to the alert instances.
- Attach built-in labels and annotations created by the Grafana Ruler (such as alertname and grafana_folder) to the alert instances.
- Interpolate templated annotations / labels and accept allowed template functions.
* Alerting: Allow hooking into request handler functions.
Adds a facility to AlertNG for hooking into API handlers, allowing the
replacement of request handlers for specific paths. One of goals of this
approach was to allow hooking as late as possible in the request, e.g.
after all middleware has been applied, to simplfiy usage.
* Update pkg/services/ngalert/api/hooks.go
Co-authored-by: gotjosh <josue.abreu@gmail.com>
* Update pkg/services/ngalert/api/hooks.go
Co-authored-by: gotjosh <josue.abreu@gmail.com>
* Update pkg/services/ngalert/ngalert.go
Co-authored-by: gotjosh <josue.abreu@gmail.com>
* Fixes to review comments
* Fix passing logger in
---------
Co-authored-by: gotjosh <josue.abreu@gmail.com>
* stop using the scheduler's Update and Delete methods all communication must be via the database
* update scheduler's registry to calculate diff before re-setting the cache
* update fetcher to return the diff generated by registry
* update processTick to update rule eval routine if the rule was updated and it is not going to be evaluated at this tick.
* remove references to the scheduler from api package
* remove unused methods in the scheduler
* Define endpoint and generate
* Wire up and register endpoint
* Cleanup, define authorization
* Forgot the leading slash
* Wire up query and SignedInUser
* Wire up timerange query params
* Add todo for label queries
* Drop comment
* Update path to rules subtree
* Implement backtesting engine that can process regular rule specification (with queries to datasource) as well as special kind of rules that have data frame instead of query.
* declare a new API endpoint and model
* add feature toggle `alertingBacktesting`
* create contextual log context provider
* use contextual provider in scheduler
* init logger in the package
* use context for log context
* use context in state manager
* (WIP) switch to fork AM, first implementation of the API, generate spec
* get receivers avoiding race conditions
* use latest version of our forked AM, tests
* make linter happy, delete TODO comment
* update number of expected paths to += 2
* delete unused endpoint code, code review comments, tests
* Update pkg/services/ngalert/notifier/alertmanager.go
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
* remove call to fmt.Println
* clear naming for fields
* shorter variable names in GetReceivers
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
* Refactor state manager to not depend on rule store interface
* Refactor grafana and proxied ruler APIs to not depend on store.RuleStore
* Refactor folder subscription logic to not use store.RuleStore
* Delete dead code
* Delete store.RuleStore
* Add consumer-side store interface to state manager
* Remove dead dependency
* Delete dead dependency in API struct
* Delete store-layer InstanceStore interface
* Move fake for state's InstanceStore interface to state package
Removes various custom headers logic sprinkled around in the backend.
It should automatically be applied to outgoing HTTP requests via the
CustomHeadersMiddleware.
This also removes decryption of SecureJSONData to populate custom
headers in ngalert which seemed to have caused a ton of CPU usage.
This changes the API codegen template (controller-api.mustache) to simplify some names. When this package was created, most APIs "forked" to either a Grafana backend implementation or a "Lotex" remote implementation. As we have added APIs it's no longer the case. Provisioning, configuration, and testing APIs do not fork, and we are likely to add additional APIs that don't fork.
This change replaces {{classname}}ForkingService with {{classname}} for interface names, and names the concrete implementation {{classname}}Handler. It changes the implied implementation of a route handler from fork{{nickname}} to handle{{nickname}}. So PrometheusApiForkingService becomes PrometheusApi, ForkedPrometheusApi becomes PrometheusApiHandler and forkRouteGetGrafanaAlertStatuses becomes handleRouteGetGrafanaAlertStatuses
It also renames some files - APIs that do no forking go from forked_{{name}}.go to {{name}}.go and APIs that still fork go from forked_{{name}}.go to forking_{{name}}.go to capture the idea that those files a "doing forking" rather than "are a fork of something."
Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
* Remove user from preferences, stars, orguser, team member
* Fix lint
* Add Delete user from org and dashboard acl
* Delete user from user auth
* Add DeleteUser to quota
* Add test files and adjust user auth store
* Rename package in wire for user auth
* Import Quota Service interface in other services
* do the same in tests
* fix lint tests
* Fix tests
* Add some tests
* Rename InsertUser and DeleteUser to InsertOrgUser and DeleteOrgUser
* Rename DeleteUser to DeleteByUser in quota
* changing a method name in few additional places
* Fix in other places
* Fix lint
* Fix tests
* Rename DeleteOrgUser to DeleteUserFromAll
* Update pkg/services/org/orgimpl/org_test.go
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
* Update pkg/services/preference/prefimpl/inmemory_test.go
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
* Rename Acl to ACL
* Fix wire after merge with main
* Move test to uni test
Co-authored-by: Emil Tullstedt <emil.tullstedt@grafana.com>
* Introduce AlertsRouter in the sender package, and move all fields and methods related to notifications out of the scheduler to this router.
* Introduce a new interface AlertsSender in the schedule package and replace calls of anonymous function `notify` inside the ruleRoutine to calling methods of that interface.
* Rename interface Scheduler in api package to ExternalAlertmanagerProvider, and replace scheduler with AlertRouter as struct that implements the interface.