grafana

mirror of https://github.com/grafana/grafana.git synced 2024-11-25 18:30:41 -06:00

Author	SHA1	Message	Date
Alexander Weaver	b926b6336d	Alerting: Scheduled recording rules execute their queries (#88309) * Basic eval flow * Wiring-up * fix * Extend todo * Start with tests * Include some relevant tests, skip ones that seem to have timing-based race conditions * Some tests, touch up linter and todo * Solve TODO * Add tracing * Tests to make sure an eval went through * Wire up feature toggles * Update pkg/services/ngalert/schedule/recording_rule.go Co-authored-by: Steve Simpson <steve.simpson@grafana.com> * Update pkg/services/ngalert/schedule/recording_rule_test.go Co-authored-by: Steve Simpson <steve.simpson@grafana.com> * Update pkg/services/ngalert/schedule/recording_rule_test.go Co-authored-by: Steve Simpson <steve.simpson@grafana.com> * Update pkg/services/ngalert/schedule/recording_rule_test.go Co-authored-by: Steve Simpson <steve.simpson@grafana.com> --------- Co-authored-by: Steve Simpson <steve.simpson@grafana.com>	2024-05-28 10:59:21 -05:00
Alexander Weaver	89b54d06e9	Alerting: Schedule a shim implementation for recording rules (#87939) * Add shim rule implementation for recording rules * Give ruleFactory access to the original rule definition * Schedule shim implementation if the rule is a recording rule * Fix or suppress linter * Fix nolint	2024-05-21 16:42:58 -05:00
Alexander Weaver	36ef611cf4	Alerting: Add database migration for recording rule fields (#87012) * Create recording rule fields in model * Add migration * Write to database, support in version table * extend fingerprint * Force fields to be empty on validate * Another storage spot, tests for fingerprint * Explicitly set defaults in provisioning API * Tests for main API validation * Add diff tests even though fields are unpopulated for now * Use struct tag approach instead of FromDB/ToDB hooks as it better handles nulls when deserializing * test for deser * Backout RecordTo for now since it's not decided in the doc * back out of migration too * Drop datasourceref for now * address linter complaints * Try a single outer struct with all fields embedded	2024-05-09 12:12:44 -05:00
Benoit Tigeot	6f38ac6615	Alerting: Reduce set of fields that could trigger alert state change (#83496) We want to avoid too much change of alert state based on change on alert's fields. For that we ignore some fields from the diff.	2024-03-26 12:35:30 -04:00
ismail simsek	6137c4e0a6	Chore: Bump golangci-lint v1.57.1 (#84998) * bump golangci-lint v1.57.1 * update setting * remove goconst * fix linting issues * prettier * fix G601 * go mod tidy go work sync	2024-03-25 15:28:24 +01:00
Alexander Weaver	6c5e94095d	Alerting: Scheduler and registry handle rules by an interface (#84044) * export Evaluation * Export Evaluation * Export RuleVersionAndPauseStatus * export Eval, create interface * Export update and add to interface * Export Stop and Run and add to interface * Registry and scheduler use rule by interface and not concrete type * Update factory to use interface, update tests to work over public API rather than writing to channels directly * Rename map in registry * Rename getOrCreateInfo to not reference a specific implementation * Genericize alertRuleInfoRegistry into ruleRegistry * Rename alertRuleInfo to alertRule * Comments on interface * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com> --------- Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>	2024-03-11 22:57:38 +02:00
Alexander Weaver	d5fda06147	Alerting: Decouple rule routine from scheduler (#84018) * create rule factory for more complicated dep injection into rules * Rules get direct access to metrics, logs, traces utilities, use factory in tests * Use clock internal to rule * Use sender, statemanager, evalfactory directly * evalApplied and stopApplied * use schedulableAlertRules behind interface * loaded metrics reader * 3 relevant config options * Drop unused scheduler parameter * Rename ruleRoutine to run * Update READMED * Handle long parameter lists * remove dead branch	2024-03-06 13:44:53 -06:00
Alexander Weaver	fa51724bc6	Alerting: Move alertRuleInfo and tests to new files (#83854) Move ruleinfo and tests to new files	2024-03-04 11:24:49 -06:00
Yuri Tseretyan	1eebd2a4de	Alerting: Support for simplified notification settings in rule API (#81011) * Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping * Support validation of notification settings when rules are updated * Implement route generator for Alertmanager configuration. That fetches all notification settings. * Update multi-tenant Alertmanager to run the generator before applying the configuration. * Add notification settings labels to state calculation * update the Multi-tenant Alertmanager to provide validation for notification settings * update GET API so only admins can see auto-gen	2024-02-15 09:45:10 -05:00
Yuri Tseretyan	131c72d655	Alerting: Fix scheduler to group folders by the unique key (orgID and UID) (#81303)	2024-01-30 17:14:11 -05:00
Yuri Tseretyan	ada325de2a	Alerting: Use unsafe.Slice for hashing a string during rule fingerprint calculation (#71000)	2023-06-30 14:58:23 -04:00
Yuri Tseretyan	9eb10bee1f	Alerting: Scheduler use rule fingerprint instead of version (#66531) * implement calculation of fingerprint for ruleWithFolder * update scheduler to use fingerprint instead of rule's version	2023-04-28 10:42:16 -04:00
Yuri Tseretyan	85a954cd81	Alerting: Update scheduler to get updates only from database (#64635) * stop using the scheduler's Update and Delete methods all communication must be via the database * update scheduler's registry to calculate diff before re-setting the cache * update fetcher to return the diff generated by registry * update processTick to update rule eval routine if the rule was updated and it is not going to be evaluated at this tick. * remove references to the scheduler from api package * remove unused methods in the scheduler	2023-03-14 18:02:51 -04:00
Alex Moreno	531b439cf1	Alerting: Add alert pausing feature (#60734) * Add field in alert_rule model, add state to alert_instance model, and state to eval * Remove paused state from eval package * Skip paused alert rules in scheduler * Add migration to add is_paused field to alert_rule table * Convert to postable alerts only if not normal, pernding, or paused * Handle paused eval results in state manager * Add Paused state to eval package * Add paused alerts logic in scheduler * Skip alert on scheduler * Remove paused status from eval package * Apply suggestions from code review Co-authored-by: George Robinson <george.robinson@grafana.com> * Remove state * Rethink schedule and manager for paused alerts * Change return to continue * Remove unused var * Rethink alert pausing * Paused alerts storing annotations * Only add one state transition * Revert boolean method renaming refactor * Revert take image refactor * Make registry errors public * Revert method extraction for getting a folder title * Revert variable renaming refactor * Undo unnecessary changes * Revert changes in test * Remove IsPause check in PatchPartiLAlertRule function * Use SetNormal to set state * Fix text by returning to old behaviour on alert rule deletion * Add test in schedule_unit_test.go to test ticks with paused alerts * Add coment to clarify usage of context.Background() * Add comment to clarify resetStateByRuleUID method usage * Move rule get to a more limited scope * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * rum gofmt on pkg/services/ngalert/schedule/schedule.go * Remove defer cancel for context * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/testing.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * skip scheduler rule state clean up on paused alert rule * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Fix mock in test * Add (hopefully) final suggestions * Use error channel from recordAnnotationsSync to cancel context * Run make gen-cue * Place pause alert check in channel update after version check * Reduce branching un update channel select * Add if for error and move code inside if in state manager ResetStateByRuleUID * Add reason to logs * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * Do not delete alert rule routine, just exit on eval if is paused * Reduce branching and create-close a channel to avoid deadlocks * Separate state deletion and state reset (includes history saving) * Add current pause state in rule route in scheduler * Split clearState and bring errCh closer to RecordStatesAsync call * Change rule to ruleMeta in RecordStatesAsync * copy state to be able to modify it * Add timeout to context creation * Shorten the timeout * Use resetState is rule is paused and deleteState if rule is not paused * Remove Empty state reason * Save every rule change in historian * Add tests for DeleteStateByRuleUID and ResetStateByRuleUID * Remove useless line * Remove outdated comment Co-authored-by: George Robinson <george.robinson@grafana.com> Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>	2023-01-26 18:29:10 +01:00
Timur Olzhabayev	b5b41988cf	Docs: Deprecating packages_api and removing it from our pipelines (#54473)	2022-09-01 18:15:44 +02:00
Yuriy Tseretyan	76ea0b15ae	Alerting: Scheduler to fetch folders along with rules (#52842) * Update GetAlertRulesForScheduling to query for folders (if needed) * Update scheduler's alertRulesRegistry to cache folder titles along with rules * Update rule eval loop to take folder title from the * Extract interface RuleStore * Pre-fetch the rule keys with the version to detect changes, and query the full table only if there are changes.	2022-08-31 11:08:19 -04:00
Yuriy Tseretyan	03e746d9df	Alerting: Delete state from the database on reset (#53919) * make ResetStatesByRuleUID return states * delete rule states when reset * rule eval routine to clean up the state only when rule is deleted	2022-08-25 14:12:22 -04:00
Yuriy Tseretyan	a081764fd8	Alerting: Scheduler to use AlertRule (#52354) * update GetAlertRulesForSchedulingQuery to have result AlertRule * update fetcher utils and registry to support AlertRule * alertRuleInfo to use alert rule instead of version * update updateCh hanlder of ruleRoutine to just clean up the state. The updated rule will be provided at the next evaluation * update evalCh handler of ruleRoutine to use rule from the message and clear state as well as update extra labels * remove unused function in ruleRoutine * remove unused model SchedulableAlertRule * store rule version in ruleRoutine instead of rule * do not call the sender if nothing to send	2022-07-26 09:40:06 -04:00
Yuriy Tseretyan	a7509ba18b	Alerting: rule evaluation loop's update channel to provide version (#52170) * handler for update message in rule evaluation routine ignores the message if its version greater or equal. * replace messages to update the channel if it is not empty	2022-07-15 12:32:52 -04:00
gotjosh	c59938b235	Alerting: Schedule Alert rules metric tracking (#50415) * Alerting: Schedule Alert rules metric tracking Change the record of metrics from one place to two as an attempt to have a semi-accurate record.	2022-06-08 18:37:33 +01:00
Yuriy Tseretyan	a89d4a5be7	Alerting: Scheduler to drop ticks if a rule's evaluation is too slow (#48885) * drop ticks if evaluation of a rule is too slow. * add metric schedule_rule_evaluations_missed_total	2022-06-08 12:50:44 -04:00
George Robinson	c83f84348c	Alerting: Fix database unavailable removes rules from scheduler (#49874)	2022-06-07 16:20:06 +01:00
Yuriy Tseretyan	99156b40bd	Alerting: Move alertRuleRegistry to its own file (#48890) * move alertRuleRegistry to its own file * move tests to separate file	2022-05-11 10:04:50 -04:00

23 Commits