grafana/pkg/services/ngalert/state
Alexander Weaver 19d01dff91
Alerting: Expose Prometheus metrics for persisting state history (#63157)
* Create historian metrics and dependency inject

* Record counter for total number of state transitions logged

* Track write failures

* Track current number of active write goroutines

* Record histogram of how long it takes to write history data

* Don't copy the registerer

* Adjust naming of write failures metric

* Introduce WritesTotal to complement WritesFailedTotal

* Measure TransitionsFailedTotal to complement TransitionsTotal

* Rename all to state_history

* Remove redundant Total suffix

* Increment totals all the time, not just on success

* Drop ActiveWriteGoroutines

* Drop PersistDuration in favor of WriteDuration

* Drop unused gauge

* Make writes and writesFailed per org

* Add metric indicating backend and a spot for future metadata

* Drop _batch_ from names and update help

* Add metric for bytes written

* Better pairing of total + failure metric updates

* Few tweaks to wording and naming

* Record info metric during composition

* Create fakeRequester and simple happy path test using it

* Blocking test for the full historian and test for happy path metrics

* Add tests for failure case metrics

* Smoke test for full annotation persistence

* Create test for metrics on annotation persistence, both happy and failing paths

* Address linter complaints

* More linter complaints

* Remove unnecessary whitespace

* Consistency improvements to help texts

* Update tests to match new descs
2023-03-06 10:40:37 -06:00
| Name | Last commit | Date |
| --- | --- | --- |
| historian | Alerting: Expose Prometheus metrics for persisting state history (#63157) | 2023-03-06 10:40:37 -06:00 |
| template | Chore: Remove xorcare/pointer dependency (#63900) | 2023-03-06 05:23:15 -05:00 |
| cache_test.go | Alerting: Refactor state manager's cache (#56197) | 2022-10-06 15:30:12 -04:00 |
| cache.go | Alerting: Small readability improvements to template.go (#63422) | 2023-02-20 09:24:11 +00:00 |
| image_mock.go | Chore: Fix goimports grouping in alerting (#62424) | 2023-01-30 09:55:35 +01:00 |
| manager_bench_test.go | Alerting: Expose Prometheus metrics for persisting state history (#63157) | 2023-03-06 10:40:37 -06:00 |
| manager_private_test.go | Alerting: Do not persist noop transition from Normal state. (#61201) | 2023-01-13 18:29:29 -05:00 |
| manager_test.go | Alerting: Expose Prometheus metrics for persisting state history (#63157) | 2023-03-06 10:40:37 -06:00 |
| manager.go | Alerting: Update state manager to return StateTransitions when Delete or Reset (#62264) | 2023-01-27 09:46:21 +01:00 |
| persist.go | Alerting: Copy rule definitions into state history (#62032) | 2023-01-25 11:29:57 -06:00 |
| state_test.go | Chore: Remove xorcare/pointer dependency (#63900) | 2023-03-06 05:23:15 -05:00 |
| state.go | Alerting: Do not persist noop transition from Normal state. (#61201) | 2023-01-13 18:29:29 -05:00 |
| testing.go | Alerting: Add alert pausing feature (#60734) | 2023-01-26 18:29:10 +01:00 |