grafana

mirror of https://github.com/grafana/grafana.git synced 2024-12-01 21:19:28 -06:00

Author	SHA1	Message	Date
George Robinson	bd29071a0d	Revert "Alerting: Add limits to the Prometheus Rules API" (#65842 )	2023-04-03 15:20:37 +00:00
George Robinson	d96b0a71d3	Alerting: Add limits to the Prometheus Rules API (#65169 ) This commit adds a number of limits to the Grafana flavor of the Prometheus Rules API: 1. `limit` limits the maximum number of Rule Groups returned 2. `limit_rules` limits the maximum number of rules per Rule Group 3. `limit_alerts` limits the maximum number of alerts per rule It sorts Rule Groups and rules within Rule Groups such that data in the response is stable across requests. It also returns summaries (totals) for all Rule Groups, individual Rule Groups and rules.	2023-04-03 10:17:02 +01:00
Santiago	aba91d3053	Alerting: Fetch all applied alerting configurations (#65728 ) * WIP * skip invalid historic configurations instead of erroring * add warning log when bad historic config is found * remove unused custom marshaller for GettableHistoricUserConfig * add id to historic user config, move limit check to store, fix typo * swagger spec	2023-03-31 17:43:04 -03:00
Yuri Tseretyan	9eaffdf5a8	Alerting: Remove dependency on alerting package in definitions (#65390 ) * move export rules to definitions package * move provisioning contact point methods to provisioning package * move AlertRuleGroupWithFolderTitle to ngalert models and adapter functions to api's compat * move rule_types files back to where they were before.	2023-03-29 13:34:59 -04:00
Serge Zaitsev	0beb768427	Chore: Remove result fields from ngalert (#65410 ) * remove result fields from ngalert * remove duplicate imports	2023-03-28 10:34:35 +02:00
Yuri Tseretyan	f066e8cdcd	Alerting: Update to alerting 20230203015918-0e4e2675d7aa (after refactoring) (#62823 ) * add alerting prefix to some packages from alerting that have similar names in prometheus alertmanager	2023-02-03 11:36:49 -05:00
Santiago	ba731f7865	Alerting: Mark AM configuration as applied (#61330 ) * Mark AM configuration as applied * add missing checks, make linter happy * fix deadlock, mark as valid on save and on load * mark configurations only if needed * check error after applyConfig() * code review comments * code review changes * more code review changes * clean HistoricConfigFromAlertConfig function	2023-02-02 14:45:17 -03:00
Alexander Weaver	6ad1cfef38	Alerting: Add endpoint for querying state history (#62166 ) * Define endpoint and generate * Wire up and register endpoint * Cleanup, define authorization * Forgot the leading slash * Wire up query and SignedInUser * Wire up timerange query params * Add todo for label queries * Drop comment * Update path to rules subtree	2023-02-02 11:34:00 -06:00
Alex Moreno	53945afedf	Alerting: Allow alert rule pausing from API (#62326 ) * Add is_paused attr to the POST alert rule group endpoint * Add is_paused to alerting API POST alert rule group * Fixed tests * Add is_paused to alerting gettable endpoints * Fix integration tests * Alerting: allow to pause existing rules (#62401) * Display Pause Rule switch in Editing Rule form * add isPaused property to form interface and dto * map isPaused prop with is_paused value from DTO Also update test snapshots * Append '(Paused)' text on alert list state column when appropriate * Change Switch styles according to discussion with UX Also adding a tooltip with info what this means * Adjust styles * Fix alignment and isPaused type definition Co-authored-by: gillesdemey <gilles.de.mey@gmail.com> * Fix test * Fix test * Fix RuleList test --------- Co-authored-by: gillesdemey <gilles.de.mey@gmail.com> * wip * Fix tests and add comments to clarify AlertRuleWithOptionals * Fix one more test * Fix tests * Fix typo in comment * Fix alert rule(s) cannot be paused via API * Add integration tests for alerting api pausing flow * Remove duplicated integration test --------- Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com> Co-authored-by: gillesdemey <gilles.de.mey@gmail.com> Co-authored-by: George Robinson <george.robinson@grafana.com>	2023-02-01 13:15:03 +01:00
ismail simsek	91221bc436	Expressions: Fixes the issue showing expressions editor (#62510 ) * Use suggested value for uid * update the snapshot * use __expr__ * replace all -100 with __expr__ * update snapshot * more changes * revert redundant change * Use expr.DatasourceUID where it's possible * generate files	2023-01-31 18:50:10 +01:00
Serge Zaitsev	d6d4097567	Chore: Fix goimports grouping in alerting (#62424 ) * fix goimports * fix goimports order	2023-01-30 09:55:35 +01:00
Yuri Tseretyan	05bf241952	Alerting: Update state manager to return StateTransitions when Delete or Reset (#62264 ) * update Delete and Reset methods to return state transitions this will be used by notifier code to decide whether alert needs to be sent or not. * update scheduler to provide reason to delete states and use transitions * update FromAlertsStateToStoppedAlert to accept StateTransition and filter by old state * fixup * fix tests	2023-01-27 09:46:21 +01:00
Alex Moreno	531b439cf1	Alerting: Add alert pausing feature (#60734 ) * Add field in alert_rule model, add state to alert_instance model, and state to eval * Remove paused state from eval package * Skip paused alert rules in scheduler * Add migration to add is_paused field to alert_rule table * Convert to postable alerts only if not normal, pending, or paused * Handle paused eval results in state manager * Add Paused state to eval package * Add paused alerts logic in scheduler * Skip alert on scheduler * Remove paused status from eval package * Apply suggestions from code review Co-authored-by: George Robinson <george.robinson@grafana.com> * Remove state * Rethink schedule and manager for paused alerts * Change return to continue * Remove unused var * Rethink alert pausing * Paused alerts storing annotations * Only add one state transition * Revert boolean method renaming refactor * Revert take image refactor * Make registry errors public * Revert method extraction for getting a folder title * Revert variable renaming refactor * Undo unnecessary changes * Revert changes in test * Remove IsPause check in PatchPartialAlertRule function * Use SetNormal to set state * Fix text by returning to old behaviour on alert rule deletion * Add test in schedule_unit_test.go to test ticks with paused alerts * Add comment to clarify usage of context.Background() * Add comment to clarify resetStateByRuleUID method usage * Move rule get to a more limited scope * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * run gofmt on pkg/services/ngalert/schedule/schedule.go * Remove defer cancel for context * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/testing.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * skip scheduler rule state clean up on paused alert rule * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Fix mock in test * Add (hopefully) final suggestions * Use error channel from recordAnnotationsSync to cancel context * Run make gen-cue * Place pause alert check in channel update after version check * Reduce branching in update channel select * Add if for error and move code inside if in state manager ResetStateByRuleUID * Add reason to logs * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * Do not delete alert rule routine, just exit on eval if is paused * Reduce branching and create-close a channel to avoid deadlocks * Separate state deletion and state reset (includes history saving) * Add current pause state in rule route in scheduler * Split clearState and bring errCh closer to RecordStatesAsync call * Change rule to ruleMeta in RecordStatesAsync * copy state to be able to modify it * Add timeout to context creation * Shorten the timeout * Use resetState if rule is paused and deleteState if rule is not paused * Remove Empty state reason * Save every rule change in historian * Add tests for DeleteStateByRuleUID and ResetStateByRuleUID * Remove useless line * Remove outdated comment Co-authored-by: George Robinson <george.robinson@grafana.com> Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>	2023-01-26 18:29:10 +01:00
Santiago	e5920c211e	Chore: Fix random indices for slices in test files (#61884 ) * Fix random indices for slices in test files * Empty commit	2023-01-24 15:07:37 -03:00
Alexander Weaver	c10713ea76	Alerting: Create query interface for state history along with annotation-based implementation (#61646 )	2023-01-19 10:45:31 +01:00
Matthew Jacobson	23e05373a7	Alerting: Fix flaky TestIntegrationUpdateAlertRules (#61641 ) Prevents random OrgID=0 in test alert generation causing invalid alert rule.	2023-01-17 19:09:46 +00:00
Yuri Tseretyan	9d57b1c72e	Alerting: Do not persist noop transition from Normal state. (#61201 ) * add feature flag `alertingNoNormalState` * update instance database to support exclusion of state in list operation * do not save normal state and delete transitions to normal * update get methods to filter out normal state	2023-01-13 18:29:29 -05:00
George Robinson	2a291afbae	Alerting: Use consts from alerting package (#61241 )	2023-01-10 19:59:13 +00:00
Marcus Efraimsson	c35c689a96	Plugins: Automatically forward plugin request HTTP headers in outgoing HTTP requests (#60417 ) Automatically forward core plugin request HTTP headers in outgoing HTTP requests. Core datasource plugin authors don't have to specifically handle forwarding of HTTP headers, e.g. do not have to "hardcode" the header-names in the datasource plugin, if not having custom needs. Fixes #57065	2022-12-21 13:25:58 +01:00
Alex Moreno	174c61b949	Alerting: Set Dashboard and Panel IDs on rule group replacement (#60374 ) * Set Dashboard and Panel IDs on rule group replacement * fix comments and abbreviate test variable name * Update pkg/services/ngalert/provisioning/alert_rules.go Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com> Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>	2022-12-16 11:47:25 +01:00
George Robinson	76601f3ae7	Alerting: Better define how we set states (#59977 ) This commit better defines how we set states in resultNormal, resultAlerting, resultError and resultNoData. It changes the existing code to call methods such as SetAlerting, SetPending, SetNormal, SetError and NoData instead of assigning values to each individual field whenever the state is changed. This should make it easier to understand what fields should be set for which states and avoid cases where states are missing, or have additional unexpected fields.	2022-12-08 20:12:13 +00:00
Yuri Tseretyan	abb49d96b5	Alerting: update state manager to return StateTransition instead of State (#58867 ) * improve test for stale states * update state manager return StateTransition * update scheduler to accept state transitions	2022-12-06 13:07:39 -05:00
Sofia Papagiannaki	9855e74b92	Chore: Refactor quota service (#58643 ) Chore: Refactor quota service (#57586) * Chore: refactor quota service * Apply suggestions from code review	2022-11-14 21:08:10 +02:00
idafurjes	080ea88af7	Nested Folders: Support getting of nested folder in folder service wh… (#58597 ) * Nested Folders: Support getting of nested folder in folder service when feature flag is set * Fix lint * Fix some tests * Fix ngalert test * ngalert fix * Fix API tests * Fix some tests and lint * Fix lint 2 * Fix library elements and panels * Add access control to get folder * Cleanup and minor test change	2022-11-11 14:28:24 +01:00
Alex Moreno	45facbba11	Alerting: Remove url based external alertmanagers config (#57918 ) * Remove URL-based alertmanagers from endpoint config * WIP * Add migration and alertmanagers from admin_configuration * Empty comment removed * set BasicAuth true when user is present in url * Remove Alertmanagers from GET /admin_config payload * Remove URL-based alertmanager configuration from UI * Fix new uid generation in external alertmanagers migration * Fix tests for URL-based external alertmanagers * Fix API tests * Add more tests, move migration code to separate file, and remove possible am duplicate urls * Fix edge cases in migration * Fix imports * Remove useless fields and fix created_at/updated_at retrieval Co-authored-by: George Robinson <george.robinson@grafana.com> Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>	2022-11-10 16:34:13 +01:00
George Robinson	c5ae1bcfe0	Alerting: Fix logging pointer address of DashboardUID and PanelID variables (#58539 )	2022-11-10 09:58:38 +00:00
Alexander Weaver	2bfdda5b68	Alerting: Break dependency between state and image packages (#58381 ) * Refactor state and manager to not depend directly on image interface * Move generic errors to models package * Move NotAvailableImageService to state as its only references are in state tests * Move NoopImageService to state package * Move mock to state package * Fix linter error * Fix comment styling * Fix a couple added references introduced by rebase * Empty commit to kick build	2022-11-09 15:06:49 -06:00
Yuri Tseretyan	bad4f28d0d	Alerting: update test TestAlertingTicker to not rely on clock (#58544 ) * extract method processTick * make processTick return scheduled rules * move state manager tests to state manager * update test * move all tests into one file * remove unused fields	2022-11-09 15:08:57 -05:00
Kristin Laemmert	ef7145e4aa	feat(nested folders): Add CountAlertRulesInFolder to ngalert store (#58269 ) * chore: refactor CountDashboardsInFolder to use the more efficient Count() sql function * feat(nested folders): Add CountAlertRulesInFolder to ngalert store This commit adds CountAlertRulesInFolder and a new model for the CountAlertRulesQuery. It returns a count of alert rules associated with a given orgID and parent folder UID. (the namespace referenced inside alert rules is the parent folder). I'm not sure where this belongs in the ngalert service, so that will come in a future commit.	2022-11-08 11:51:00 +01:00
Sofia Papagiannaki	96cdf77995	Revert "Chore: Refactor quota service (#57586 )" (#58394 ) This reverts commit `326ea86a57`.	2022-11-08 11:52:07 +02:00
Sofia Papagiannaki	326ea86a57	Chore: Refactor quota service (#57586 ) * Chore: refactor quota service * Apply suggestions from code review	2022-11-08 10:25:34 +02:00
George Robinson	8353f307aa	Alerting: Fix test fails in some environments (#58251 )	2022-11-07 16:34:37 +00:00
Neel	db1fd10ff1	Alerting: Append org ID to alert notification URLs (#57123 )	2022-11-07 16:03:25 +00:00
Yuriy Tseretyan	0a4121cef8	Alerting: Contextual log provider for rule key (#57476 ) * create contextual log context provider * use contextual provider in scheduler * init logger in the package * use context for log context * use context in state manager	2022-10-26 19:16:02 -04:00
Yuriy Tseretyan	2d20c8db7b	Chore: Expression engine to support relative time range (#57474 ) * make TimeRange interface and add relative range * make Execute methods support the current time * update resample to support relative time range * update DSNode to support relative time range * update query service to create queries with absolute time * make alerting evaluator create relative time ranges	2022-10-26 16:13:58 -04:00
George Robinson	802d67eeca	Alerting: Support values in notification templates (#56457 ) We have received a lot of feedback regarding the ValueString in alert notifications. Perhaps one of the most frequent complaints about ValueString is that it is difficult to read because it contains a lot of information, and the information is shown as a JSON-like string. Users have often asked how it can be templated and the answer is that it can't. Until now users have been able to add custom annotations to their alert rules which contain values via the $values variable added in previous versions of Grafana. However, these custom annotations must be added for each of the user's alert rules, instead of once in a template that all of their alerts can be notified via. This commit then adds the much requested feature to support values in notification templates. Users can then create a single template that prints the annotations, labels and values of their alerts in a format of their choice!	2022-10-10 13:40:21 +01:00
Joe Blubaugh	b476ae62fb	Alerting: Write and Delete multiple alert instances. (#55350 ) Prior to this change, all alert instance writes and deletes happened individually, in their own database transaction. This change batches up writes or deletes for a given rule's evaluation loop into a single transaction before applying it. These new transactions are off by default, guarded by the feature toggle "alertingBigTransactions" Before: ``` goos: darwin goarch: arm64 pkg: github.com/grafana/grafana/pkg/services/ngalert/store BenchmarkAlertInstanceOperations-8 398 2991381 ns/op 1133537 B/op 27703 allocs/op --- BENCH: BenchmarkAlertInstanceOperations-8 util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created PASS ok github.com/grafana/grafana/pkg/services/ngalert/store 1.619s ``` After: ``` goos: darwin goarch: arm64 pkg: github.com/grafana/grafana/pkg/services/ngalert/store BenchmarkAlertInstanceOperations-8 1440 816484 ns/op 352297 B/op 6529 allocs/op --- BENCH: BenchmarkAlertInstanceOperations-8 util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created PASS ok github.com/grafana/grafana/pkg/services/ngalert/store 1.383s ``` So we cut time by about 75% and memory allocations by about 60% when storing and deleting 100 instances.	2022-10-06 14:22:58 +08:00
Alexander Weaver	d66ed6fe35	Alerting: Move stray model structs in store package to model package (#55968 ) * Move stray command structs to model package like the rest * Fix broken references	2022-09-29 15:47:56 -05:00
Alexander Weaver	d17ab82b98	Alerting: Break up store.RuleStore interface, delete dead code (#55776 ) * Refactor state manager to not depend on rule store interface * Refactor grafana and proxied ruler APIs to not depend on store.RuleStore * Refactor folder subscription logic to not use store.RuleStore * Delete dead code * Delete store.RuleStore	2022-09-27 08:56:30 -05:00
Alexander Weaver	f11495a4c3	Alerting: Remove dead functionality from alert instance store (#55774 ) * Update tests to use ListAlertInstances * Drop the actual methods rather than just updating tests	2022-09-26 14:38:53 -05:00
Yuriy Tseretyan	2d38664fe6	Alerting: Improve validation of query and expressions on rule submit (#53258 ) * Improve error messages of server-side expression * move validation of alert queries and a condition to eval package	2022-09-21 15:14:11 -04:00
Yuriy Tseretyan	199996cbf9	Alerting: Resolve stale state + add state reason to notifications (#49352 ) * adds a new reserved annotation `grafana_state_reason` * explicitly resolve stale states	2022-09-21 13:24:47 -04:00
Joe Blubaugh	22c937340e	Revert "Alerting: Write and Delete multiple alert instances. (#54072 )" (#54885 ) This reverts commit `5e4fd94413`.	2022-09-09 17:44:06 +02:00
Joe Blubaugh	5e4fd94413	Alerting: Write and Delete multiple alert instances. (#54072 ) Prior to this change, all alert instance writes and deletes happened individually, in their own database transaction. This change batches up writes or deletes for a given rule's evaluation loop into a single transaction before applying it. Before: ``` goos: darwin goarch: arm64 pkg: github.com/grafana/grafana/pkg/services/ngalert/store BenchmarkAlertInstanceOperations-8 398 2991381 ns/op 1133537 B/op 27703 allocs/op --- BENCH: BenchmarkAlertInstanceOperations-8 util.go:127: alert definition: {orgID: 1, UID: FovKXiRVzm} with title: "an alert definition FTvFXmRVkz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: foDFXmRVkm} with title: "an alert definition fovFXmRVkz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: VQvFuigVkm} with title: "an alert definition VwDKXmR4kz" interval: 60 created PASS ok github.com/grafana/grafana/pkg/services/ngalert/store 1.619s ``` After: ``` goos: darwin goarch: arm64 pkg: github.com/grafana/grafana/pkg/services/ngalert/store BenchmarkAlertInstanceOperations-8 1440 816484 ns/op 352297 B/op 6529 allocs/op --- BENCH: BenchmarkAlertInstanceOperations-8 util.go:127: alert definition: {orgID: 1, UID: 302r_igVzm} with title: "an alert definition q0h9lmR4zz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: 71hrlmR4km} with title: "an alert definition nJ29_mR4zz" interval: 60 created util.go:127: alert definition: {orgID: 1, UID: Cahr_mR4zm} with title: "an alert definition ja2rlmg4zz" interval: 60 created PASS ok github.com/grafana/grafana/pkg/services/ngalert/store 1.383s ``` So we cut time by about 75% and memory allocations by about 60% when storing and deleting 100 instances. This change also updates some of our tests so that they run successfully against postgreSQL - we were using random Int64s, but postgres integers, which our tables use, max out at 2^31-1	2022-09-02 11:17:20 +08:00
Timur Olzhabayev	b5b41988cf	Docs: Deprecating packages_api and removing it from our pipelines (#54473 )	2022-09-01 18:15:44 +02:00
Yuriy Tseretyan	76ea0b15ae	Alerting: Scheduler to fetch folders along with rules (#52842 ) * Update GetAlertRulesForScheduling to query for folders (if needed) * Update scheduler's alertRulesRegistry to cache folder titles along with rules * Update rule eval loop to take folder title from the * Extract interface RuleStore * Pre-fetch the rule keys with the version to detect changes, and query the full table only if there are changes.	2022-08-31 11:08:19 -04:00
Yuriy Tseretyan	41bd36eb97	Alerting: Update rules delete endpoint to handle rules in group (#53790 ) * update RouteDeleteAlertRules rules to update as a group * remove expecter from scheduler mock to support variadic function * create function to check for provisioning status + tests Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>	2022-08-24 15:33:33 -04:00
Yuriy Tseretyan	9f90a7b54d	Alerting: State manager to use InstanceStore (#53852 ) * move saving the state to state manager when scheduler stops * move saving state to ProcessEvalResults * add GetRuleKey to State * add LogContext to AlertRuleKey	2022-08-18 09:40:33 -04:00
Alexander Weaver	f093c249ac	Alerting: Fix incorrect embedded DTO being returned when handling rule groups (#53701 ) * Fix DTO embedding when getting/putting alert rule groups * Drop usage of word 'Domain' * Rename var as well	2022-08-12 16:36:50 -05:00
George Robinson	196b781c70	Alerting: Delete expired images from the database (#53236 ) This commit adds a DeleteExpiredService that deletes expired images from the database. It is run in the periodic collector service.	2022-08-09 15:28:36 +01:00
Jean-Philippe Quéméner	54217a2037	Alerting: set dashboard and panel id using annotations in provisioning api (#53221 )	2022-08-03 16:05:32 +02:00
Yuriy Tseretyan	5fb778814c	Alerting: Update rules version when folder title is updated (#53013 ) * remove support for bus from scheduler * rename event to FolderTitleUpdated and fire only if title has changed * add method to increase version of all rules that belong to a folder * update ngalert service to subscribe to folder title change event call data store and update scheduler * add tests	2022-08-01 19:28:38 -04:00
Yuriy Tseretyan	a081764fd8	Alerting: Scheduler to use AlertRule (#52354 ) * update GetAlertRulesForSchedulingQuery to have result AlertRule * update fetcher utils and registry to support AlertRule * alertRuleInfo to use alert rule instead of version * update updateCh handler of ruleRoutine to just clean up the state. The updated rule will be provided at the next evaluation * update evalCh handler of ruleRoutine to use rule from the message and clear state as well as update extra labels * remove unused function in ruleRoutine * remove unused model SchedulableAlertRule * store rule version in ruleRoutine instead of rule * do not call the sender if nothing to send	2022-07-26 09:40:06 -04:00
Yuriy Tseretyan	054fe54b03	Alerting: Split Scheduler and AlertRouter tests (#52416 ) * move fake FakeExternalAlertmanager to sender package * move tests from scheduler to router * update alerts router to have all fields private * update scheduler tests to use sender mock	2022-07-19 09:32:54 -04:00
Yuriy Tseretyan	6e1e4a4215	Alerting: Update DbStore to use disabled orgs from the config (#52156 ) * update DbStore to use UnifiedAlerting settings * remove disabled orgs from scheduler and use config in db store instead * remove test	2022-07-15 14:13:30 -04:00
Yuriy Tseretyan	e5e8747ee9	Alerting: Update state manager to accept reserved labels (#52189 ) * add tests for cache getOrCreate * update ProcessEvalResults to accept extra labels * extract to getRuleExtraLabels * move populating of constant rule labels to extra labels	2022-07-14 15:59:59 -04:00
Alexander Weaver	2d7389c34d	Alerting: Provisioning API respects global rule quota (#52180 ) * Inject interface for quota service and create mock * Check quota and return 403 if limit exceeded * Implement tests for quota being exceeded	2022-07-13 17:36:17 -05:00
Yuriy Tseretyan	554ebd647b	Alerting: Refactor Evaluator (#51673 ) * AlertRule to return condition * update ConditionEval to not return an error because it's always nil * make getExprRequest private * refactor executeCondition to just converter and move execution to the ConditionEval as this makes code more readable. * log error if results have errors * change signature of evaluate function to not return an error	2022-07-12 16:51:32 -04:00
George Robinson	6844ac9879	Alerting: Change __alertScreenshotToken__ to __alertImageToken__ (#50771 )	2022-07-04 06:05:36 -04:00
Jean-Philippe Quéméner	580c5b6ad2	Alerting: add YAML support for relative time range (#51694 )	2022-07-04 06:03:34 -04:00
Yuriy Tseretyan	8b3b667a47	Alerting: Fix rule API to accept 0 duration of field `For` (#50992 ) * make 'for' pointer to distinguish between missing field and 0 * set 'for' to -1 if the value is missing but not allow negative in the request + patch -1 with the value from original rule * update store validation to not allow negative 'for' * update usages to use pointer	2022-06-30 11:46:26 -04:00
Yuriy Tseretyan	78c012df65	move eval_conditions to API models package (#51447 )	2022-06-27 11:52:41 -04:00
Yuriy Tseretyan	ee5bcf2b96	make test more stable (#51268 )	2022-06-22 12:53:16 -04:00
Yuriy Tseretyan	4d02f73e5f	Alerting: Persist rule position in the group (#50051 ) Migrations: * add a new column alert_group_idx to alert_rule table * add a new column alert_group_idx to alert_rule_version table * re-index existing rules during migration API: * set group index on update. Use the natural order of items in the array as group index * sort rules in the group on GET * update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups. UI: * update UI to keep the order of alerts in a group	2022-06-22 10:52:46 -04:00
Matthew Jacobson	5dee2ed24c	Alerting: Add first Grafana reserved label grafana_folder (#50262 ) * Alerting: Add first Grafana reserved label g_label g_label holds the title of the folder containing the alert. The intention of this label is to use it as part of the new default notification policy groupBy. * Add nil check on updateRule labels map * Disable gocyclo lint on schedule.ruleRoutine will remove later in a separate refactoring PR to reduce complexity. * Address doc suggestions * Update g_folder for rules in folder when folder title changes * Remove global bus in FolderService * Modify tests to fit new common g_folder label * Add changelog entry * Fix merge conflicts * Switch GrafanaReservedLabelPrefix from `g_` to `grafana_`	2022-06-17 13:10:49 -04:00
Yuriy Tseretyan	c314ce48c7	Alerting: Support for optimistic locking for alert rules (#50274 ) * add support for optimistic locking for alert_rule table * return 409 in the case of optimistic lock	2022-06-13 12:15:28 -04:00
Jean-Philippe Quéméner	862f51216b	Alerting: improve provisioning docs (#50347 ) * Alerting: improve provisioning docs * add new provisioning page * add api docs * fix formatting and add better descriptions * fix typo	2022-06-10 16:25:15 +02:00
Jean-Philippe Quéméner	cf684ed38f	Alerting: bump rule version when updating rule group interval (#50295 ) * Alerting: move group update to alert rule service * rename validateAlertRuleInterval to validateRuleGroupInterval * init baseinterval correctly * add seconds suffix * extract validation function for reusability * add context to err message	2022-06-09 09:28:32 +02:00
Yuriy Tseretyan	a89d4a5be7	Alerting: Scheduler to drop ticks if a rule's evaluation is too slow (#48885 ) * drop ticks if evaluation of a rule is too slow. * add metric schedule_rule_evaluations_missed_total	2022-06-08 12:50:44 -04:00
Yuriy Tseretyan	49d93fb67e	Alerting: Update alert rule diff to not see difference between nil and empty map (#50192 )	2022-06-03 21:27:29 +02:00
Yuriy Tseretyan	ad25e2a20c	Alerting: Update RBAC for alert rules to consider access to rule as access to group it belongs (#49033 ) * update authz to exclude entire group if user does not have access to rule * change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule * a new query that returns alerts in group by UID of alert that belongs to that group * collect all affected groups during calculate changes * update authorize to check access to groups * update tests for calculateChanges to assert new fields * add authorization tests	2022-06-01 10:23:54 -04:00
Joe Blubaugh	9e8efaa459	Alerting: Add stored screenshot utilities to the channels package. (#49470 ) Adds three functions: `withStoredImages` iterates over a list of models.Alerts, extracting a stored image's data from storage, if available, and executing a user-provided function. `withStoredImage` does this for an image attached to a specific alert. `openImage` finds and opens an image file on disk. Moves `store.Image` to `models.Image` Simplifies `channels.ImageStore` interface and updates notifiers that use it to use the simpler methods. Updates all pkg/alert/notifier/channels to use withStoredImage routines.	2022-05-26 13:29:56 +08:00
Joe Blubaugh	1cc034d960	Alerting: Add a "Reason" to Alert Instances to show underlying cause of state. (#49259 ) This change adds a field to state.State and models.AlertInstance that indicate the "Reason" that an instance has its current state. This helps us account for cases where the state is "Normal" but the underlying evaluation returned "NoData" or "Error", for example. Fixes #42606 Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-05-23 16:49:49 +08:00
Joe Blubaugh	1d724810de	Alerting: State Manager takes screenshots. (#49338 ) The State Manager will now take screenshots when an alert instance switches to an Alerting or Resolved state. Signed-off-by: Joe Blubaugh joe.blubaugh@grafana.com	2022-05-23 10:53:41 +08:00
George Robinson	43358c7248	Alerting: Keep private annotations across evaluations (#49080 )	2022-05-18 11:21:18 +02:00
Yuriy Tseretyan	952cb4fc0b	Alerting: introduce AlertRuleGroupKey and use it in API handlers (#48945 ) * create AlertGroupKey structure * update PrometheusSrv. - extract creation of RuleGroup to a separate method. Use group key for grouping * update RuleSrv - update calculateChanges to use groupKey - authorize to use groupkey	2022-05-16 15:45:45 -04:00
Yuriy Tseretyan	369fcc5e9a	Alerting: scheduler to use short version of model for alert rule (#48916 ) * scheduler to use a short version of alert rule model	2022-05-12 09:55:05 -04:00
Alexander Weaver	078a578803	Drop ProvenanceOrgAdapter and build into store API instead (#48137 )	2022-04-26 10:30:57 -05:00
George Robinson	c5547123bc	Remove redundant queries in GetAlertRules and GetOrgAlertRules and replace with ListAlertRules (#48108 )	2022-04-25 11:42:42 +01:00
George Robinson	d66fc6ed1a	Alerting: Add GetRuleGroups to RuleStore (#48036 ) This commit adds a new method GetRuleGroups to RuleStore which returns the set of rule groups across all organizations.	2022-04-21 17:59:22 +01:00
Jean-Philippe Quéméner	388ecb4037	Alerting: Provisioning API - Contact points (#47197 )	2022-04-13 22:15:55 +02:00
Alexander Weaver	dde0b93cf1	Alerting: Provisioning API - Notification Policies (#46755 ) * Base-line API for provisioning notification policies * Wire API up, some simple tests * Return provenance status through API * Fix missing call * Transactions * Clarity in package dependencies * Unify receivers in definitions * Fix issue introduced by receiver change * Drop unused internal test implementation * FGAC hooks for provisioning routes * Polish, swap names * Asserting on number of exposed routes * Don't bubble up updated object * Integrate with new concurrency token feature in store * Back out duplicated changes * Remove redundant tests * Regenerate and create unit tests for API layer * Integration tests for auth * Address linter errors * Put route behind toggle * Use alternative store API and fix feature toggle in tests * Fixes, polish * Fix whitespace * Re-kick drone * Rename services to provisioning	2022-04-05 16:48:51 -05:00
Yuriy Tseretyan	4ee48c2e77	Alerting: Update GetRuleGroupAlertRules to accept optional rule group (#46889 ) * rename GetRuleGroupAlertRules to GetAlertRules * make rule group optional in GetAlertRulesQuery * simplify FakeStore. the current structure did not support optional rule group	2022-03-23 17:36:25 +00:00
Jean-Philippe Quéméner	a80f04c949	Alerting: add collision safe update function for alertmanager configurations (#46692 ) * Alerting: add collision safe update function for alertmanager configurations * fix typo * use bootstrap func for tests * move hash calculation to store * remove icons lol * remove removed field	2022-03-23 09:31:46 +01:00
gotjosh	a338c78ca8	Alerting: Remove internal labels from prometheus compatible API responses (#46548 ) * Alerting: Remove internal labels from prometheus compatible API responses * Appease the linter * Fix integration tests * Fix API documentation & linter * move removal of internal labels to the models	2022-03-16 16:04:19 +00:00
gotjosh	a75d4fcbd8	Alerting: Display query from grafana-managed alert rules on `/api/v1/rules` (#45969 ) * Aleting: Extract query from alerting rule model for api/v1/rules * more changes and fixtures * appease the linter	2022-03-14 10:39:20 +00:00
Yuriy Tseretyan	4502e40ed8	Alerting: Revert Revert "Alerting: Calculate diff for two AlertRules" (#46034 ) * Revert "Revert "Alerting: Calculate diff for two AlertRules (#45877)" (#46023)" This reverts commit `82aa5acba6`. * remove flakiness	2022-03-01 11:10:29 -05:00
Jean-Philippe Quéméner	82aa5acba6	Revert "Alerting: Calculate diff for two AlertRules (#45877 )" (#46023 ) This reverts commit `4e19d7df63`.	2022-03-01 13:40:47 +01:00
Yuriy Tseretyan	4e19d7df63	Alerting: Calculate diff for two AlertRules (#45877 ) * add custom diff reporter DiffReporter that reports only paths that have a difference * create Diff method for AlertRule that returns DiffReport, which is an alias for []Diff Tests: * create copy method for AlertRule in testing * create GenerateAlertQuery method in testing	2022-02-28 17:13:53 +01:00
Yuriy Tseretyan	f75bea481d	Alerting: validate rules and calculate changes in API controller (#45072 ) * Update API controller - add validation of rules API model - add function to calculate changes between the submitted alerts and existing alerts - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction * Update DBStore - delete unused storage method. All the logic is moved upstream. - upsert to not modify fields of new by values from the existing alert - if rule has UID do not try to pull it from db. (it is done upstream) * Add rule generator	2022-02-23 11:30:04 -05:00
Alexander Weaver	935059a376	Alerting: Create basic storage layer for provisioning (#44679 ) * Simplistic store API for provenance lookups on arbitrary types * Add a few notes in comments * Improved type safety for provisioned objects * Clean-up TODOs for future PRs * Clean up provisioning model * Clean up tests * Restrict allowable types in interface * Fix linter error * Move AlertRule domain methods to same file as AlertRule definition * Update pkg/services/ngalert/models/provisioning.go Co-authored-by: George Robinson <george.robinson@grafana.com> * Complete interface rename * Pass context through store API * More idiomatic method names * Better error description * Improve code-docs * Use ORM language instead of raw sql * Add support for records in different orgs * ResourceTypeID -> ResourceType since it's not an ID Co-authored-by: George Robinson <george.robinson@grafana.com>	2022-02-04 13:23:19 -06:00
Santiago	04d93751b8	Alerting: send alerts to external, internal, or both alertmanagers (#40341 ) * (WIP) send alerts to external, internal, or both alertmanagers * Modify admin configuration endpoint, update swagger docs * Integration test for admin config updated * Code review changes * Fix alertmanagers choice not changing bug, add unit test * Add AlertmanagersChoice as enum in swagger, code review changes * Fix API and tests errors * Change enum from int to string, use 'SendAlertsTo' instead of 'AlertmanagerChoice' where necessary * Fix tests to reflect last changes * Keep senders running when alerts are handled just internally * Check if any external AM has been discovered before sending alerts, update tests * remove duplicate data from logs * update comment * represent alertmanagers choice as an int instead of a string * default alertmanagers choice to all alertmanagers, test cases * update definitions and generate spec	2022-02-01 20:36:55 -03:00
George Robinson	1b26d4d88e	Alerting: Create DatasourceError alert if evaluation returns error (#41869 ) * Alerting: Create DatasourceError alert if evaluation returns error * Alerting: Add docs for DatasourceError alert * Alerting: Fix DatasourceError alert does not have dashboard_uid label * Alerting: Add break when datasource_uid found * Alerting: Update TestProcessEvalResults	2021-11-25 11:46:47 +01:00
Peter Holmberg	b2d7162168	Alerting: Add external Alertmanagers (#39183 ) * building ui * saving alertmanager urls * add actions and api call to get external ams * add list to add modal * add validation and edit/delete * work on merging results * merging results * get color for status heart * adding tests * tests added * rename * add pollin and status * fix list sync * fix polling * add info icon with actual tooltip * fix test * Accessibility things * fix strict error * delete public/dist files * Add API tests for invalid URL * start redo admin test * Fix for empty configuration and test * remove admin test * text updates after review * suppress appevent error * fix tests * update description Co-authored-by: gotjosh <josue@grafana.com> * fix text plus go lint * updates after pr review * Adding docs * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * Update docs/sources/alerting/unified-alerting/fundamentals/alertmanager.md Co-authored-by: gotjosh <josue@grafana.com> * prettier * updates after docs feedback Co-authored-by: gotjosh <josue.abreu@gmail.com> Co-authored-by: gotjosh <josue@grafana.com>	2021-11-12 22:19:16 +01:00
Ryan McKinley	3489721ed6	api/ds/query: simplify data sources lookup for queries and expressions (#41172 )	2021-11-05 08:12:55 -07:00
Yuriy Tseretyan	5836def6c2	Alerting: declare constants for __dashboardUid__ and __panelId__ literals (#39976 )	2021-10-07 17:30:06 -04:00
George Robinson	2a4c1b1aa6	You can now get alert rules for a dashboard or a panel using /api/v1/rules endpoints. (#39476 ) Get alert rules for a dashboard and panel in /api/v1/rules	2021-10-04 16:33:55 +01:00
Sofia Papagiannaki	012d4f0905	Alerting: Remove `ngalert` feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746 ) * Remove `ngalert` feature toggle * Update frontend Remove all references of ngalert feature toggle * Update docs * Disable unified alerting for specific orgs * Add backend tests * Apply suggestions from code review Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com> * Disabled unified alerting by default * Ensure backward compatibility with old ngalert feature toggle * Apply suggestions from code review Co-authored-by: gotjosh <josue@grafana.com>	2021-09-29 16:16:40 +02:00
Sofia Papagiannaki	04d5dcb7c8	Alerting: modify DB table, accessors and migration to restrict org access (#37414 ) * Alerting: modify table and accessors to limit org access appropriately * Update migration to create multiple Alertmanager configs * Apply suggestions from code review Co-authored-by: gotjosh <josue@grafana.com> * replace mg.ClearMigrationEntry() mg.ClearMigrationEntry() would create a new session. This commit introduces a new migration for clearing an entry from migration log for replacing mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction. It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log. Co-authored-by: gotjosh <josue@grafana.com>	2021-08-12 16:04:09 +03:00
gotjosh	f83cd401e5	Alerting: Send alerts to external Alertmanager(s) (#37298 ) * Alerting: Send alerts to external Alertmanager(s) Within this PR we're adding support for registering or unregistering sending to a set of external alertmanagers. A few of the things that are going are: - Introduce a new table to hold "admin" (either org or global) configuration we can change at runtime. - A new periodic check that polls for this configuration and adjusts the "senders" accordingly. - Introduces a new concept of "senders" that are responsible for shipping the alerts to the external Alertmanager(s). In a nutshell, this is the Prometheus notifier (the one in charge of sending the alert) mapped to a multi-tenant map. There are a few code movements here and there but those are minor, I tried to keep things intact as much as possible so that we could have an easier diff.	2021-08-06 13:06:56 +01:00

187 Commits