Commit Graph

68 Commits

Author SHA1 Message Date
Yuriy Tseretyan
49d93fb67e Alerting: Update alert rule diff to not see difference between nil and empty map (#50192) 2022-06-03 21:27:29 +02:00
Yuriy Tseretyan
ad25e2a20c Alerting: Update RBAC for alert rules to consider access to rule as access to group it belongs (#49033)
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* a new query that returns alerts in group by UID of alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
2022-06-01 10:23:54 -04:00
Joe Blubaugh
9e8efaa459 Alerting: Add stored screenshot utilities to the channels package. (#49470)
Adds three functions:
`withStoredImages` iterates over a list of models.Alerts, extracting a stored image's data from storage, if available, and executing a user-provided function.
`withStoredImage` does this for an image attached to a specific alert.
`openImage` finds and opens an image file on disk.

Moves `store.Image` to `models.Image`
Simplifies `channels.ImageStore` interface and updates notifiers that use it to use the simpler methods.
Updates all pkg/alert/notifier/channels to use withStoredImage routines.
2022-05-26 13:29:56 +08:00
Joe Blubaugh
1cc034d960 Alerting: Add a "Reason" to Alert Instances to show underlying cause of state. (#49259)
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.

Fixes #42606

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-05-23 16:49:49 +08:00
Joe Blubaugh
1d724810de Alerting: State Manager takes screenshots. (#49338)
The State Manager will now take screenshots when an alert instance
switches to an Alerting or Resolved state.

Signed-off-by: Joe Blubaugh joe.blubaugh@grafana.com
2022-05-23 10:53:41 +08:00
George Robinson
43358c7248 Alerting: Keep private annotations across evaluations (#49080) 2022-05-18 11:21:18 +02:00
Yuriy Tseretyan
952cb4fc0b Alerting: introduce AlertRuleGroupKey and use it in API handlers (#48945)
* create AlertGroupKey structure
* update PrometheusSrv.
  - extract creation of RuleGroup to a separate method. Use group key for grouping
* update RuleSrv 
 - update calculateChanges to use groupKey
 - authorize to use groupkey
2022-05-16 15:45:45 -04:00
Yuriy Tseretyan
369fcc5e9a Alerting: scheduler to use short version of model for alert rule (#48916)
* scheduler to use a short version of alert rule model
2022-05-12 09:55:05 -04:00
Alexander Weaver
078a578803 Drop ProvenanceOrgAdapter and build into store API instead (#48137) 2022-04-26 10:30:57 -05:00
George Robinson
c5547123bc Remove redundant queries in GetAlertRules and GetOrgAlertRules and replace with ListAlertRules (#48108) 2022-04-25 11:42:42 +01:00
George Robinson
d66fc6ed1a Alerting: Add GetRuleGroups to RuleStore (#48036)
This commit adds a new method GetRuleGroups to RuleStore which returns the set of rule groups across all organizations.
2022-04-21 17:59:22 +01:00
Jean-Philippe Quéméner
388ecb4037 Alerting: Provisioning API - Contact points (#47197) 2022-04-13 22:15:55 +02:00
Alexander Weaver
dde0b93cf1 Alerting: Provisioning API - Notification Policies (#46755)
* Base-line API for provisioning notification policies

* Wire API up, some simple tests

* Return provenance status through API

* Fix missing call

* Transactions

* Clarity in package dependencies

* Unify receivers in definitions

* Fix issue introduced by receiver change

* Drop unused internal test implementation

* FGAC hooks for provisioning routes

* Polish, swap names

* Asserting on number of exposed routes

* Don't bubble up updated object

* Integrate with new concurrency token feature in store

* Back out duplicated changes

* Remove redundant tests

* Regenerate and create unit tests for API layer

* Integration tests for auth

* Address linter errors

* Put route behind toggle

* Use alternative store API and fix feature toggle in tests

* Fixes, polish

* Fix whitespace

* Re-kick drone

* Rename services to provisioning
2022-04-05 16:48:51 -05:00
Yuriy Tseretyan
4ee48c2e77 Alerting: Update GetRuleGroupAlertRules to accept optional rule group (#46889)
* rename GetRuleGroupAlertRules to GetAlertRules
* make rule group optional in GetAlertRulesQuery
* simplify FakeStore. the current structure did not support optional rule group
2022-03-23 17:36:25 +00:00
Jean-Philippe Quéméner
a80f04c949 Alerting: add collision safe update function for alertmanager configurations (#46692)
* Alerting: add collision safe update function for alertmanager configurations

* fix typo

* use bootstrap func for tests

* move hash calculation to store

* remove icons lol

* remove removed field
2022-03-23 09:31:46 +01:00
gotjosh
a338c78ca8 Alerting: Remove internal labels from prometheus compatible API responses (#46548)
* Alerting: Remove internal labels from prometheus compatible API responses

* Appease the linter

* Fix integration tests

* Fix API documentation & linter

* move removal of internal labels to the models
2022-03-16 16:04:19 +00:00
gotjosh
a75d4fcbd8 Alerting: Display query from grafana-managed alert rules on /api/v1/rules (#45969)
* Aleting: Extract query from alerting rule model for api/v1/rules

* more changes and fixtures

* appease the linter
2022-03-14 10:39:20 +00:00
Yuriy Tseretyan
4502e40ed8 Alerting: Revert Revert "Alerting: Calculate diff for two AlertRules" (#46034)
* Revert "Revert "Alerting: Calculate diff for two AlertRules (#45877)" (#46023)"

This reverts commit 82aa5acba6.

* remove flakiness
2022-03-01 11:10:29 -05:00
Jean-Philippe Quéméner
82aa5acba6 Revert "Alerting: Calculate diff for two AlertRules (#45877)" (#46023)
This reverts commit 4e19d7df63.
2022-03-01 13:40:47 +01:00
Yuriy Tseretyan
4e19d7df63 Alerting: Calculate diff for two AlertRules (#45877)
* add custom diff reporter DiffReporter that reports only paths that have a difference
* create Diff method for AlertRule that returns DiffReport, which is an alias for []Diff

Tests:
* create copy method for AlertRule in testing
* create GenerateAlertQuery method in testing
2022-02-28 17:13:53 +01:00
Yuriy Tseretyan
f75bea481d Alerting: validate rules and calculate changes in API controller (#45072)
* Update API controller
   - add validation of rules API model
   - add function to calculate changes between the submitted alerts and existing alerts
   - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction

* Update DBStore
   - delete unused storage method. All the logic is moved upstream.
   - upsert to not modify fields of new by values from the existing alert
   - if rule has UID do not try to pull it from db. (it is done upstream)

* Add rule generator
2022-02-23 11:30:04 -05:00
Alexander Weaver
935059a376 Alerting: Create basic storage layer for provisioning (#44679)
* Simplistic store API for provenance lookups on arbitrary types

* Add a few notes in comments

* Improved type safety for provisioned objects

* Clean-up TODOs for future PRs

* Clean up provisioning model

* Clean up tests

* Restrict allowable types in interface

* Fix linter error

* Move AlertRule domain methods to same file as AlertRule definition

* Update pkg/services/ngalert/models/provisioning.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Complete interface rename

* Pass context through store API

* More idiomatic method names

* Better error description

* Improve code-docs

* Use ORM language instead of raw sql

* Add support for records in different orgs

* ResourceTypeID -> ResourceType since it's not an ID

Co-authored-by: George Robinson <george.robinson@grafana.com>
2022-02-04 13:23:19 -06:00
Santiago
04d93751b8 Alerting: send alerts to external, internal, or both alertmanagers (#40341)
* (WIP) send alerts to external, internal, or both alertmanagers

* Modify admin configuration endpoint, update swagger docs

* Integration test for admin config updated

* Code review changes

* Fix alertmanagers choice not changing bug, add unit test

* Add AlertmanagersChoice as enum in swagger, code review changes

* Fix API and tests errors

* Change enum from int to string, use 'SendAlertsTo' instead of 'AlertmanagerChoice' where necessary

* Fix tests to reflect last changes

* Keep senders running when alerts are handled just internally

* Check if any external AM has been discovered before sending alerts, update tests

* remove duplicate data from logs

* update comment

* represent alertmanagers choice as an int instead of a string

* default alertmanagers choice to all alertmanagers, test cases

* update definitions and generate spec
2022-02-01 20:36:55 -03:00
George Robinson
1b26d4d88e Alerting: Create DatasourceError alert if evaluation returns error (#41869)
* Alerting: Create DatasourceError alert if evaluation returns error

* Alerting: Add docs for DatasourceError alert

* Alerting: Fix DatasourceError alert does not have dashboard_uid label

* Alerting: Add break when datasource_uid found

* Alerting: Update TestProcessEvalResults
2021-11-25 11:46:47 +01:00
Peter Holmberg
b2d7162168 Alerting: Add external Alertmanagers (#39183)
* building ui

* saving alertmanager urls

* add actions and api call to get external ams

* add list to add modal

* add validation and edit/delete

* work on merging results

* merging results

* get color for status heart

* adding tests

* tests added

* rename

* add pollin and status

* fix list sync

* fix polling

* add info icon with actual tooltip

* fix test

* Accessibility things

* fix strict error

* delete public/dist files

* Add API tests for invalid URL

* start redo admin test

* Fix for empty configuration and test

* remove admin test

* text updates after review

* suppress appevent error

* fix tests

* update description

Co-authored-by: gotjosh <josue@grafana.com>

* fix text plus go lint

* updates after pr review

* Adding docs

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md

Co-authored-by: gotjosh <josue@grafana.com>

* prettier

* updates after docs feedback

Co-authored-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue@grafana.com>
2021-11-12 22:19:16 +01:00
Ryan McKinley
3489721ed6 api/ds/query: simplify data sources lookup for queries and expressions (#41172) 2021-11-05 08:12:55 -07:00
Yuriy Tseretyan
5836def6c2 Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976) 2021-10-07 17:30:06 -04:00
George Robinson
2a4c1b1aa6 You can now get alert rules for a dashboard or a panel using /api/v1/rules endpoints. (#39476)
Get alert rules for a dashboard and panel in /api/v1/rules
2021-10-04 16:33:55 +01:00
Sofia Papagiannaki
012d4f0905 Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki
04d5dcb7c8 Alerting: modify DB table, accessors and migration to restrict org access (#37414)
* Alerting: modify table and accessors to limit org access appropriately

* Update migration to create multiple Alertmanager configs

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* replace mg.ClearMigrationEntry()

mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing  mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-12 16:04:09 +03:00
gotjosh
f83cd401e5 Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Sofia Papagiannaki
7815ed511f Alerting: Refactor API endpoints for fetching alert rules (#37055)
* Refactor ruler API endpoint for listing rules

* Refactor prometheus API endpoint for listing rules

* Update HTTP API docs
2021-07-22 09:53:14 +03:00
Ganesh Vernekar
dcd4bf1615 Alerting: Fill the empty GeneratorURL (#35740)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-16 15:34:12 +05:30
Sofia Papagiannaki
8cda1f5153 Alerting: Allow rules with same title across folders (#35270)
* Alerting: Allow rules with same title across folders

* Add test
2021-06-04 20:45:26 +03:00
Sofia Papagiannaki
15c55b0115 Alerting: Fix notification channel migration and handle case when Alertmanager default configuration is absent (#35086)
* Fix dashboard alert and nootifier migration for MySQL

* Fix POSTing Alertmanager configuration if no current configuration exists

in case the default configuration has not be stored yet
or has failed to get stored

* Change CreatedAt field type
2021-06-04 15:52:41 +03:00
David Parrott
bbb7bbf891 Alerting: Remove back end logic for supporting KeepLastState (#34242)
* Removed back end logic for supporting KeepLastState

* Map keep_state correctly in migrations
2021-05-18 10:55:43 -07:00
Kyle Brandt
331991ca10 UAlerting: Increase default max datapoints (#34223)
Change const value from 100 to 43200 (12 hours at 1sec interval)
2021-05-17 18:46:52 +02:00
gotjosh
eb74994b8b Alerting: Modify configuration apply and save semantics - v2 (#34143)
* Save default configuration to the database and copy over secure settings
2021-05-14 19:49:54 +01:00
Owen Diehl
fc90c36d50 removes unused db method (#34082) 2021-05-13 20:28:10 +02:00
Kyle Brandt
a735c51202 Alerting/Chore: Backend remove def_ columns from instance (#33875)
rename def_uid and def_org_id to rule_uid and rule_org_id on the alert_instance table and drops the definition table.
2021-05-12 07:17:43 -04:00
David Parrott
5072fefc22 allow saving pending alerts (#33667) 2021-05-04 09:24:20 -07:00
Kyle Brandt
c1034f3118 Alerting: Create instanceStore (#33587)
for https://github.com/grafana/alerting-squad/issues/129
2021-05-03 07:19:15 -04:00
Kyle Brandt
7823842c5d Alerting: Load annotations from rule into State cache (#33542)
for https://github.com/grafana/alerting-squad/issues/127
2021-04-30 20:23:12 +02:00
Kyle Brandt
b8f01fe034 Alerting: backend "ng" code cleanup (#33578) 2021-04-30 13:21:57 -04:00
Kyle Brandt
914443c816 Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
Kyle Brandt
b590e95682 AlertingAPI: Change list response query prop (#33419)
* Alerting: change to full []AlertQuery as json in a string and not just model.
2021-04-27 22:15:00 +02:00
Kyle Brandt
adcba36d39 AlertingAPI: update swagger json files match datasourceUid change (#33332)
* update swagger json files match datasourceUid change
underlying change made in https://github.com/grafana/grafana/pull/33282
* Document DatasourceUID field in AlertQuery model
* Run spec generation from inside a docker container
* Generate latest spec

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-04-27 16:50:30 +02:00
David Parrott
788bc2a793 Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00
Kyle Brandt
5e818146de Alerting/Expr: New SSE Request/QueryType, alerting move data source UID (#33282) 2021-04-23 16:52:32 +02:00
Sofia Papagiannaki
87a70af7eb [Alerting]: Fix updating rule group and add tests (#33074)
* [Alerting]: Fix updating rule group and add test

* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule

* Test updating annotations

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* add test for posting an unknown rule UID

* Fix alert rule validation and add tests

* Remove org id from PostableGrafanaRule

This field was not used; each rule gets the organisation of the user making
the rerquest

* Update pkg/tests/api/alerting/api_alertmanager_test.go

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-21 17:22:58 +03:00