Commit Graph

8 Commits

Author SHA1 Message Date
Owen Diehl
1367f7171e
Alerting/ruler metrics (#34144)
* adds active configurations metric

* rule evaluation metrics

* ruler metrics

* pr feedback
2021-05-14 16:13:44 -04:00
Kyle Brandt
fae093bbe2
Alerting: Fix state cache getOrCreate panic (#33777) 2021-05-06 14:35:52 +02:00
David Parrott
39099bf3c0
Alerting nested state cache (#33666)
* nest cache by orgID, ruleUID, stateID

* update accessors to use new cache structure

* test and linter fixup

* fix panic

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* add comment to identify what's going on with nested maps in cache

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-05-04 09:57:50 -07:00
Kyle Brandt
48358efc13
Alerting: remove State cache entries on Ruler Delete (#33638)
for https://github.com/grafana/alerting-squad/issues/133
2021-05-03 14:01:33 -04:00
Owen Diehl
070627d11e
better handle metrics for state transitions (#33648) 2021-05-03 11:57:24 -04:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt
914443c816
Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
David Parrott
788bc2a793
Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00