Commit Graph

36 Commits

Author SHA1 Message Date
Santiago
7a34cdb3a2 Alerting: Add configuration options to migrate to an external Alertmanager (#71318)
* add configuration options to .ini file and parse them

* updates on config options, add external AM config to the main config struct

* separate external AM configs from general alerting configs, naming

* comments about usage of tenantID in basic auth & not using config options yet
2023-09-05 11:24:35 -03:00
Alexander Weaver
dfba94e052 Alerting: Limit redis pool size to 5 and make configurable (#74057)
* Limit redis pool size to 5 and expose it in config ini

* Coerce negative pool sizes to the default
2023-08-29 14:59:12 -05:00
Yuri Tseretyan
c7598cc6fb Alerting: Add ability to control scheduler tick interval via config (#71980)
* add ability to control scheduler interval via config
* add feature flag `configurableSchedulerTick`
2023-07-26 12:44:12 -04:00
George Robinson
7edbe72483 Alerting: Support concurrent queries for saving alert instances (#70525)
This commit adds support for concurrent queries when saving alert
instances to the database. This is an experimental feature in
response to some customers experiencing delays between rule evaluation
and sending alerts to Alertmanager, resulting in flapping. It is
disabled by default.
2023-06-23 11:36:07 +01:00
Jean-Philippe Quéméner
8bb62a8316 Alerting: Add option for memberlist label (#67982) 2023-05-09 10:32:23 +02:00
Jean-Philippe Quéméner
bc11a484ed Alerting: Add support for running HA using Redis (#65267)
Co-authored-by: Steve Simpson <steve.simpson@grafana.com>
2023-04-19 17:05:26 +02:00
Alexander Weaver
b2abb63286 Alerting: Introduce proper feature toggles for common state history backend combinations (#65497)
* define 3 feature toggles for rollout phases

* Pass feature toggles along

* Implement first feature toggle

* Try a different strategy with fall-throughs to specific configurations

* Apply toggle overrides once outside of backend composition

* Emit log messages when we coerce backends

* Run code generator for feature toggle files

* Improve wording in flag descs

* Re-run generator

* Use code-generated constants instead of plain strings

* Use converted enum values rather than strings for pre-parsing
2023-03-30 13:53:21 -05:00
Alexander Weaver
a31672fa40 Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once (#64774)
* Rename RecordStatesAsync to Record

* Rename QueryStates to Query

* Implement fanout writes

* Implement primary queries

* Simplify error joining

* Add test for query path

* Add tests for writes and error propagation

* Allow fanout backend to be configured

* Touch up log messages and config validation

* Consistent documentation for all backend structs

* Parse and normalize backend names more consistently against an enum

* Touch-ups to documentation

* Improve clarity around multi-record blocking

* Keep primary and secondaries more distinct

* Rename fanout backend to multiple backend

* Simplify config keys for multi backend mode
2023-03-17 12:41:18 -05:00
Alexander Weaver
e7ace4ed62 Alerting: Allow separate read and write path URLs for Loki state history (#62268)
Extract config parsing and add tests
2023-01-30 16:30:05 -06:00
Alexander Weaver
b4682fe3cb Alerting: Configurable externalLabels for Loki state history (#62404)
* Add config option for external labels

* Remove redundant nilcheck
2023-01-30 14:24:45 -06:00
Serge Zaitsev
aebcecf538 Chore: Fix goimports grouping in other backend platform packages (#62422)
* fix goimports

* fix goimports order

* fix goimports order

* fix goimports order

* fix goimports order

* fix goimports order
2023-01-30 08:26:42 +00:00
Jean-Philippe Quéméner
44b11d3228 Alerting: support basic auth for the state history loki client (#61696) 2023-01-18 20:24:40 +01:00
Alexander Weaver
1ac89ea040 Alerting: Add client configuration for remote Loki historian backend and test connection (#61114)
* Create loki client type and ping method

* Expose TestConnection on client

* Configure and ping Loki URL

* Close response body reader if present

* Add 30 second timeout

* Remove duplicate close
2023-01-17 13:58:52 -06:00
Alexander Weaver
eb960d9725 Alerting: Add un-documented toggle for changing state history backend, add shells for remote loki and sql (#61072)
* Add toggle for state history backend and shells

* Extract some shared logic and add tests
2023-01-06 12:06:01 -06:00
Alexander Weaver
8c3a5f6da0 Alerting: Allow state history to be disabled through configuration (#61006)
* Add configuration option for if state history should be enabled

* Inject no-op when history is disabled
2023-01-05 12:21:07 -06:00
George Robinson
9af7adef76 Alerting: Support customizable timeout for screenshots (#60981)
This commit adds a customizable timeout for screenshots called
capture_timeout. The default value is 10 seconds, and the maximum
value is 30 seconds. This timeout should be less than the minimum
Interval of all Evaluation Groups to avoid back pressure on alert
rule evaluation.
2023-01-05 16:07:46 +00:00
Sasha Melentyev
c02003af3c Refactor time durations (#58484)
This change uses `time.Second` in place of `1000 * time.Millisecond` and `time.Minute` in place of `60*time.Second`.
2022-11-22 15:09:15 +08:00
Matthew Jacobson
28dd413c1d Alerting: Add config disabled_labels to disable reserved labels (#51832)
* Alerting: Add config disabled_labels to disable reserved labels

[unified_alerting.reserved_labels]
disabled_labels

* Replace IsGrafanaFolderDisabled with more generic IsReservedLabelDisabled

* Simplify SchedulerCfg by including UnifiedAlertingSettings
2022-07-11 12:41:40 -04:00
Matthew Jacobson
434e94ef2b Alerting: Update default route groupBy to [grafana_folder, alertname] (#50052)
* Alerting: Update default route groupBy to [grafana_folder, alertname]

Default group by for new routes and migrations is now [grafana_folder, alertname]
2022-07-11 12:24:43 -04:00
George Robinson
bda47df4ad Alerting: Invalid setting of enabled for unified alerting should return error (#49876) 2022-06-10 08:59:58 +01:00
Jean-Philippe Quéméner
fd664e4beb Alerting: replace a duplicated configuration key (#50350)
This PR renames the configuration key enabled to capture. This is needed as we already have a configuration key with the name enabled.

Fixes #50328

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2022-06-08 11:04:51 +08:00
George Robinson
3b7f871bf4 Alerting: Enable Unified Alerting for open source and enterprise (#49834)
This commit enables Unified Alerting for open source and enterprise unless disabled in configuration.
2022-05-30 16:47:15 +01:00
Joe Blubaugh
687e79538b Alerting: Add a general screenshot service and alerting-specific image service. (#49293)
This commit adds a pkg/services/screenshot package for taking and uploading screenshots of Grafana dashboards. It supports taking screenshots of both dashboards and individual panels within a dashboard, using the rendering service.

The screenshot package has the following services, most of which can be composed:

BrowserScreenshotService (Takes screenshots with headless Chrome)
CachableScreenshotService (Caches screenshots taken with another service such as BrowserScreenshotService)
NoopScreenshotService (A no-op screenshot service for tests)
SingleFlightScreenshotService (Prevents duplicate screenshots when taking screenshots of the same dashboard or panel in parallel)
ScreenshotUnavailableService (A screenshot service that returns ErrScreenshotsUnavailable)
UploadingScreenshotService (A screenshot service that uploads taken screenshots)

The screenshot package does not support wire dependency injection yet. ngalert constructs its own version of the service. See https://github.com/grafana/grafana/issues/49296

This PR also adds an ImageScreenshotService to ngAlert. This is used to take screenshots with a screenshotservice and then store their location reference for use by alert instances and notifiers.
2022-05-22 22:33:49 +08:00
Yuriy Tseretyan
e44ea3d589 Alerting: Rename setting AlertForDuration to DefaultRuleEvaluationInterval (#45569)
* fix AlertForDuration to DefaultRuleEvaluationInterval

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2022-02-18 10:05:06 -05:00
Yuriy Tseretyan
095ea44e97 Alerting: Move BaseInterval and MinInterval to UnifiedAlerting config (#45107)
* use base interval if legacy value is less than the base interval
2022-02-11 16:13:49 -05:00
Ryan McKinley
5d66194ec5 FeatureFlags: define features outside settings.Cfg (take 3) (#44443) 2022-01-26 09:44:20 -08:00
Agnès Toulet
65bdb3a899 FeatureFlags: Revert managing feature flags outside of settings.Cfg (#44382)
* Revert "FeatureToggles: register all enterprise feature toggles (#44336)"

This reverts commit f53b3fb007.

* Revert "FeatureFlags: manage feature flags outside of settings.Cfg (#43692)"

This reverts commit f94c0decbd.
2022-01-24 16:08:05 +01:00
Ryan McKinley
f94c0decbd FeatureFlags: manage feature flags outside of settings.Cfg (#43692) 2022-01-20 13:42:05 -08:00
Yuriy Tseretyan
005c8f8894 Alerting: Disable unified alerting by default in Enterprise Grafana (#42476)
* fallback to enable false if Enterprise is true

* anyBoolean
2021-11-29 20:51:15 +01:00
Armand Grillet
6523486122 Alerting: Make Unified Alerting enabled by default for those who do not use legacy alerting (#42200)
* update AlertingEnabled and UnifiedAlertingSettings.Enabled to be pointers
* add a pseudo migration to fix the AlertingEnabled and UnifiedAlertingSettings.Enabled if the latter is not defined
* update the default configuration file to make default value for both 'enabled' flags be undefined

Misc
* update Migrator to expose DB engine. This is needed for a ualert migration to access the database while the list of migrations is created.
* add more verbose failure when migrations do not match

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2021-11-24 14:56:07 -05:00
Sofia Papagiannaki
012d4f0905 Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
Sofia Papagiannaki
f6f3a54742 Alerting: tune rule evaluation via configuration (#35623)
* Alerting: Configure max evaluation retries

* Alerting: Enforce minimum rule evaluation interval

* Alerting: Disable rule evaluation from configuration

* Update docs

* Alerting: Configure rule evaluation timeout

* Move options on unified_alerting config section

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-28 13:00:16 +03:00
Yuriy Tseretyan
05eb30e323 Alerting: Move alertmanager default config to UnifiedAlertingSettings (#39597) 2021-09-23 13:52:20 -04:00
Andres Martinez Gotor
64c8d32fe7 Use sdk pkg for gtime (#39354) 2021-09-21 13:08:52 +02:00
gotjosh
2ad82b9354 Alerting: Move the unified alerting settings to its own struct (#39350) 2021-09-20 10:12:21 +03:00
gotjosh
7db97097c9 Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00