grafana

mirror of https://github.com/grafana/grafana.git synced 2025-02-25 18:55:37 -06:00

Author	SHA1	Message	Date
Alexander Weaver	b4682fe3cb	Alerting: Configurable externalLabels for Loki state history (#62404 ) * Add config option for external labels * Remove redundant nilcheck	2023-01-30 14:24:45 -06:00
Serge Zaitsev	d6d4097567	Chore: Fix goimports grouping in alerting (#62424 ) * fix goimports * fix goimports order	2023-01-30 09:55:35 +01:00
Yuri Tseretyan	0c4671e31f	Alerting: Update historian to ignore transitions from Normal Paused and Updated (#62267 )	2023-01-27 16:26:22 -05:00
Yuri Tseretyan	05bf241952	Alerting: Update state manager to return StateTransitions when Delete or Reset (#62264 ) * update Delete and Reset methods to return state transitions this will be used by notifier code to decide whether alert needs to be sent or not. * update scheduler to provide reason to delete states and use transitions * update FromAlertsStateToStoppedAlert to accept StateTransition and filter by old state * fixup * fix tests	2023-01-27 09:46:21 +01:00
Alex Moreno	531b439cf1	Alerting: Add alert pausing feature (#60734 ) * Add field in alert_rule model, add state to alert_instance model, and state to eval * Remove paused state from eval package * Skip paused alert rules in scheduler * Add migration to add is_paused field to alert_rule table * Convert to postable alerts only if not normal, pernding, or paused * Handle paused eval results in state manager * Add Paused state to eval package * Add paused alerts logic in scheduler * Skip alert on scheduler * Remove paused status from eval package * Apply suggestions from code review Co-authored-by: George Robinson <george.robinson@grafana.com> * Remove state * Rethink schedule and manager for paused alerts * Change return to continue * Remove unused var * Rethink alert pausing * Paused alerts storing annotations * Only add one state transition * Revert boolean method renaming refactor * Revert take image refactor * Make registry errors public * Revert method extraction for getting a folder title * Revert variable renaming refactor * Undo unnecessary changes * Revert changes in test * Remove IsPause check in PatchPartiLAlertRule function * Use SetNormal to set state * Fix text by returning to old behaviour on alert rule deletion * Add test in schedule_unit_test.go to test ticks with paused alerts * Add coment to clarify usage of context.Background() * Add comment to clarify resetStateByRuleUID method usage * Move rule get to a more limited scope * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * rum gofmt on pkg/services/ngalert/schedule/schedule.go * Remove defer cancel for context * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/testing.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/schedule/schedule_unit_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Update pkg/services/ngalert/models/instance_test.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * skip scheduler rule state clean up on paused alert rule * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> * Fix mock in test * Add (hopefully) final suggestions * Use error channel from recordAnnotationsSync to cancel context * Run make gen-cue * Place pause alert check in channel update after version check * Reduce branching un update channel select * Add if for error and move code inside if in state manager ResetStateByRuleUID * Add reason to logs * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: George Robinson <george.robinson@grafana.com> * Do not delete alert rule routine, just exit on eval if is paused * Reduce branching and create-close a channel to avoid deadlocks * Separate state deletion and state reset (includes history saving) * Add current pause state in rule route in scheduler * Split clearState and bring errCh closer to RecordStatesAsync call * Change rule to ruleMeta in RecordStatesAsync * copy state to be able to modify it * Add timeout to context creation * Shorten the timeout * Use resetState is rule is paused and deleteState if rule is not paused * Remove Empty state reason * Save every rule change in historian * Add tests for DeleteStateByRuleUID and ResetStateByRuleUID * Remove useless line * Remove outdated comment Co-authored-by: George Robinson <george.robinson@grafana.com> Co-authored-by: Santiago <santiagohernandez.1997@gmail.com> Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>	2023-01-26 18:29:10 +01:00
George Robinson	a7eab8e46e	Alerting: Support context.Context in Loki interface (#61979 ) This commit adds support for canceleable contexts in the Loki interface.	2023-01-26 09:31:20 +00:00
Alexander Weaver	046a9bb7c1	Alerting: Copy rule definitions into state history (#62032 ) * Copy rules instead of accepting pointer * Deep-copy the rule, for even more guarantees * Create struct just for needed fields * Move RuleMeta to historian/model package, iron out package dependencies * Move tests for dash ID parsing to model package along with code	2023-01-25 11:29:57 -06:00
idafurjes	b54b80f473	Chore: Remove Result from dashboard models (#61997 ) * Chore: Remove Result from dashboard models * Fix lint tests * Fix dashboard service tests * Fix API tests * Remove commented out code * Chore: Merge main - cleanup	2023-01-25 10:36:26 +01:00
George Robinson	239d94205a	Alerting: Return chan <-error for #61811 (#61858 )	2023-01-24 15:41:38 +00:00
Alexander Weaver	7ccc845187	Alerting: Push state history entries to Loki (#61724 ) * Implement push endpoint * Drop duplicated struct * Genericize auth/tenant headers and improve logging in error case * Flesh out the data model * Drop dead code * Drop log line entirely * Drop unused arg * Rename a few type manipulation functions * Extract label keys as constants * Improve logs when loki responds with error * Inline lokiRepresentation function	2023-01-23 16:31:03 -06:00
Alexander Weaver	c10713ea76	Alerting: Create query interface for state history along with annotation-based implementation (#61646 )	2023-01-19 10:45:31 +01:00
Jean-Philippe Quéméner	44b11d3228	Alerting: support basic auth for the state history loki client (#61696 )	2023-01-18 20:24:40 +01:00
Alexander Weaver	1ac89ea040	Alerting: Add client configuration for remote Loki historian backend and test connection (#61114 ) * Create loki client type and ping method * Expose TestConnection on client * Configure and ping Loki URL * Close response body reader if present * Add 30 second timeout * Remove duplicate close	2023-01-17 13:58:52 -06:00
Denis Limarev	e6dee8a723	Perfomance: Preallocate slices (#61580 )	2023-01-17 11:50:17 +00:00
idafurjes	7c2522c477	Chore: Move dashboard models to dashboard pkg (#61458 ) * Copy dashboard models to dashboard pkg * Use some models from current pkg instead of models * Adjust api pkg * Adjust pkg services * Fix lint	2023-01-16 16:33:55 +01:00
Yuri Tseretyan	9d57b1c72e	Alerting: Do not persist noop transition from Normal state. (#61201 ) * add feature flag `alertingNoNormalState` * update instance database to support exclusion of state in list operation * do not save normal state and delete transitions to normal * update get methods to filter out normal state	2023-01-13 18:29:29 -05:00
Alexander Weaver	b289b8ac6e	Alerting: Set error annotation on EvaluationError regardless of underlying error type (#61506 ) Set error annotation regardless of underlying error type	2023-01-13 13:58:02 -06:00
Denis Limarev	90badc8729	Performance: Add preallocation for some slices (#59593 )	2023-01-11 18:03:37 +01:00
Yuri Tseretyan	86b5fbbf60	Alerting: Introduce state manager config structure (#61249 )	2023-01-10 16:26:15 -05:00
Alexander Weaver	eb960d9725	Alerting: Add un-documented toggle for changing state history backend, add shells for remote loki and sql (#61072 ) * Add toggle for state history backend and shells * Extract some shared logic and add tests	2023-01-06 12:06:01 -06:00
Alexander Weaver	8c3a5f6da0	Alerting: Allow state history to be disabled through configuration (#61006 ) * Add configuration option for if state history should be enabled * Inject no-op when history is disabled	2023-01-05 12:21:07 -06:00
Yuri Tseretyan	4d989860fb	Alerting: Fix conversion of alert state from db state during manager warmup (#60933 )	2023-01-04 09:40:04 -05:00
Alexander Weaver	b88b8bc291	Alerting: Fix missing dashboard/panelID links in annotations (#60926 ) Assign thru ref	2023-01-03 14:12:27 -06:00
Santiago	05c9af5110	Extract custom template functions (#60695 ) extract custom template functions and export the FuncMap	2022-12-22 17:31:40 -03:00
Kristina	5a7f38053b	Remove explore compact URLs (#59686 ) * Remove explore compact URLs * Remove two explore link builders that create compact URLs * Fix merge conflict	2022-12-14 12:57:53 -06:00
Yuri Tseretyan	4374966987	Alerting: Replace hardcoded <no value> to [no value] in label expansion (#60129 ) * replace hardcoded <no value> to [no value] in label expansion	2022-12-12 10:12:30 -05:00
George Robinson	76601f3ae7	Alerting: Better define how we set states (#59977 ) This commit better defines how we set states in resultNormal, resultAlerting, resultError and resultNoData. It changes the existing code to call methods such as SetAlerting, SetPending, SetNormal, SetError and NoData instead of assigning values to each individual field whenever the state is changed. This should make it easier to understand what fields should be set for which states and avoid cases where states are missing, or have additional unexpected fields.	2022-12-08 20:12:13 +00:00
George Robinson	6359dab040	Alerting: Change resultError in preparation for supporting ForError duration (#59894 )	2022-12-07 10:45:56 +00:00
George Robinson	3c249e1b99	Fix incorrect start time for DatasourceError alerts (#59903 )	2022-12-06 18:44:06 +00:00
Yuri Tseretyan	abb49d96b5	Alerting: update state manager to return StateTransition instead of State (#58867 ) * improve test for stale states * update state manager return StateTransition * update scheduler to accept state transitions	2022-12-06 13:07:39 -05:00
Yuri Tseretyan	a85adeed96	Alerting: Update state history service to filter states transitions (#58863 ) * rename the method to better reflect its behavior * make historian filter transition on itself * call historian with all changes	2022-12-06 12:33:15 -05:00
Sasha Melentyev	c02003af3c	Refactor time durations (#58484 ) This change uses `time.Second` in place of `1000 * time.Millisecond` and `time.Minute` in place of `60*time.Second`.	2022-11-22 15:09:15 +08:00
Yuri Tseretyan	28d39d35fd	Alerting: Update state manager to save state transitions in one batch (#58358 ) * change stale results handler to not update database but return transitions * save all transitions in one call	2022-11-14 10:57:51 -05:00
George Robinson	c5ae1bcfe0	Alerting: Fix logging pointer address of DashboardUID and PanelID variables (#58539 )	2022-11-10 09:58:38 +00:00
Alexander Weaver	2bfdda5b68	Alerting: Break dependency between state and image packages (#58381 ) * Refactor state and manager to not depend directly on image interface * Move generic errors to models package * Move NotAvailableImageService to state as its only references are in state tests * Move NoopImageService to state package * Move mock to state package * Fix linter error * Fix comment styling * Fix a couple added references introduced by rebase * Empty commit to kick build	2022-11-09 15:06:49 -06:00
Yuri Tseretyan	bad4f28d0d	Alerting: update test TestAlertingTicker to not rely on clock (#58544 ) * extract method processTick * make processTick return scheduled rules * move state manager tests to state manager * update test * move all tests into one file * remove unused fields	2022-11-09 15:08:57 -05:00
George Robinson	1290951b65	Alerting: Small improvements to staleResultsHandler (#58007 )	2022-11-09 11:08:32 +00:00
Yuri Tseretyan	3621cf5a12	Alerting: Update handling of stale state (#58276 ) * delete all stale states in one lock * do not use touched states to detect stale rely only on LastEvaluationTime maintained correctly * fix tests to use correct eval time * delete unused method	2022-11-07 11:03:53 -05:00
Yuri Tseretyan	623de12e35	Alerting: Create AlertInstanceKey in one place (#58278 ) * use method GetAlertInstanceKey * do not add key if error	2022-11-07 09:35:29 -05:00
Yuri Tseretyan	f9c88e72ae	Alerting: Update saveAlertStates in state manager to not return results (#58279 )	2022-11-07 09:09:19 -05:00
Yuri Tseretyan	978f1119d7	Alerting: Run state manager as regular sub-service (#58246 )	2022-11-04 17:06:47 -04:00
Yuri Tseretyan	dce8879145	Alerting: Update state manager to accept rule store as Warm method argument (#58244 )	2022-11-04 14:23:08 -04:00
Alexander Weaver	cc8c1380e2	Alerting: Persist annotations from multidimensional rules in batches (#56575 ) * Reduce piecemeal state fields * Read data directly off state instead of rule * Unify state and context into single struct * Expose contextual information to layer above setNextState * Work in terms of ContextualState and call historian in batches * Call annotations service in batches * Export format state and reason and remove workaround in unrelated test package * Add new method to annotation service for batch inserting * Fix loop variable aliasing bug caught by linter, didn't change behavior * Incl timerange on annotation tests * Insert one at a time if tags are present * Point to rule from ContextualState rather than copy fields * Build annotations and copy data prior to starting goroutine * Rename to StateTransition * Use new bulk-insert utility * Remove rule from StateTransition and pass in directly to historian * Simplify annotations logic since we have only one rule * Fix logs and context, nilcheck, simplify method name * Regenerate mock	2022-11-04 10:39:26 -05:00
Alex Moreno	ba15d675e7	Alerting: Add values to annotations (#57738 ) * Add values to annotations * Fix imports * Use State attrs instead of Result attrs * Remove unnecessary variable	2022-11-03 10:35:34 +01:00
George Robinson	215ffee437	Alerting: Fix screenshot is not taken for stale series (#57982 )	2022-11-02 22:14:22 +00:00
Yuriy Tseretyan	3294918e9f	Alerting: Update state manager to support nil stores and metrics (#57791 )	2022-10-28 13:10:28 -04:00
Yuriy Tseretyan	0a4121cef8	Alerting: Contextual log provider for rule key (#57476 ) * create contextual log context provider * use contextual provider in scheduler * init logger in the package * use context for log context * use context in state manager	2022-10-26 19:16:02 -04:00
Alexander Weaver	de46c1b002	Alerting: Improve logs in state manager and historian (#57374 ) * Touch up log statements, fix casing, add and normalize contexts * Dedicated logger for dashboard resolver * Avoid injecting logger to historian * More minor log touch-ups * Dedicated logger for state manager * Use rule context in annotation creator * Rename base logger and avoid redundant contextual loggers	2022-10-21 16:16:51 -05:00
Alexander Weaver	3ddb28bad9	Find-and-replace 'err' logs to 'error' to match log search conventions (#57309 )	2022-10-19 17:36:54 -04:00
Alexander Weaver	129a28919b	Alerting: Cache result of dashboard ID lookups (#56587 ) * Create caching dashboard resolver * A couple tests for dashboard resolving * Log warning on not found * Additional polish + review nits * Move to singleflight instead of a plain mutex * Store errors instead of -1 in cache and use reflection when reading * Address linter error * One more linter error	2022-10-14 15:48:02 -05:00

257 Commits