Commit Graph

21 Commits

Author SHA1 Message Date
Santiago
9e78faa7ba
Alerting: Add metrics to the remote Alertmanager struct (#79835)
* Alerting: Add metrics to the remote Alertmanager struct

* rephrase http_requests_failed description

* make linter happy

* remove unnecessary metrics

* extract timed client to separate package

* use histogram collector from dskit

* remove weaveworks dependency

* capture metrics for all requests to the remote Alertmanager (both clients)

* use the timed client in the MimirAuthRoundTripper

* HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function

* refactor

* less git diff

* gauge for last readiness check in seconds

* initialize LastReadinesCheck to 0, tweak metric names and descriptions

* add counters for sync attempts/errors

* last config sync and last state sync timestamps (gauges)

* change latency metric name

* metric for remote Alertmanager mode

* code review comments

* move label constants to metrics package
2024-01-10 11:18:24 +01:00
William Wernert
f7bf818527
Alerting: Make alert state history Loki http client public (#78291)
* Make state history Loki client public

* Make historian metrics subsystem configurable
2023-11-27 09:20:50 -05:00
Alexander Weaver
e77621649d
Alerting: Instrument outgoing state history requests using weaveworks/common (#63600)
* Loki backend and client depend on a requester

* Instrument all requests to loki using weaveworks TimedClient

* Construct collector in metrics package
2023-02-23 17:52:02 -06:00
gotjosh
55e7cf1aed
Alerting: Introduce Metric Aggregation starting with Silences (#62512)
* Alerting: Introduce Metric Aggregation starting with Silences
---------

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2023-01-31 19:54:38 +00:00
gotjosh
3c616da83f
Alerting: Refactor metrics/ngalert.go into seperate files (#62362)
* Alerting: Refactor metrics/ngalert.go into seperate files
2023-01-27 18:49:49 +00:00
idafurjes
6c5a573772
Chore: Move ReqContext to contexthandler service (#62102)
* Chore: Move ReqContext to contexthandler service

* Rename package to contextmodel

* Generate ngalert files

* Remove unused imports
2023-01-27 08:50:36 +01:00
Alexander Weaver
9977c7ea43
Alerting: Simplify scheduler configuration and remove dependency on Grafana-wide settings (#59735)
* Make scheduler not depend directly on grafana-wide settings

* Re-add missing interval
2022-12-02 16:02:07 -06:00
Alexander Weaver
bd6a5c900f
Alerting: Extract ticker into shared package (#55703)
* Move ticker files to dedicated package with no changes

* Fix package naming and resolve naming conflicts

* Fix up all existing references to moved objects

* Remove all alerting-specific references from shared util

* Rename TickerMetrics to simply Metrics

* Rename base ticker type to T and rename NewTicker to simply New
2022-09-26 12:35:33 -05:00
Ben Kochie
68691d7775
Convert some metrics to Histograms (#50420)
Because Summary metrics can not be aggreated, convert them to histograms
so that users with HA deployments can use these metrics.
* Convert metrics registration to promauto.
* Improve help text style.

Signed-off-by: SuperQ <superq@gmail.com>
2022-06-15 13:19:43 +02:00
gotjosh
c59938b235
Alerting: Schedule Alert rules metric tracking (#50415)
* Alerting: Schedule Alert rules metric tracking

Change the record of metrics from one place to two as an attempt to have a semi-accurate record.
2022-06-08 18:37:33 +01:00
Yuriy Tseretyan
a89d4a5be7
Alerting: Scheduler to drop ticks if a rule's evaluation is too slow (#48885)
* drop ticks if evaluation of a rule is too slow.
* add metric schedule_rule_evaluations_missed_total
2022-06-08 12:50:44 -04:00
George Robinson
c83f84348c
Alerting: Fix database unavailable removes rules from scheduler (#49874) 2022-06-07 16:20:06 +01:00
sh0rez
3ca3a59079
pkg/web: remove dependency injection (#49123)
* pkg/web: store http.Handler internally

* pkg/web: remove injection

Removes any injection code from pkg/web.

It already was no longer functional, as we already only injected into
`http.Handler`, meaning we only inject ctx.Req and ctx.Resp.

Any other types (*Context, *ReqContext) were already accessed using the
http.Request.Context.Value() method.

* *: remove type mappings

Removes any call to the previously removed TypeMapper, as those were
non-functional already.

* pkg/web: remove Context.Invoke

was no longer used outside of pkg/web and also no longer functional
2022-05-24 15:35:08 -04:00
Yuriy Tseretyan
75ba4e98c6
Alerting: Remove unused features from ticker + metric + tests (#47828)
* remove not used code:
  - remove offset in ticket because it is not used
  - remove unused ticker and scheduler methods

* use duration for interval
* add metrics grafana_alerting_ticker_last_consumed_tick_timestamp_seconds, grafana_alerting_ticker_next_tick_timestamp_seconds, grafana_alerting_ticker_interval_seconds
2022-04-22 15:09:47 -04:00
Sofia Papagiannaki
54962c2f0c
Alerting: Rename Recipient path parameter to DatasourceID (#47949) 2022-04-20 16:20:17 +03:00
George Robinson
5e2280ceee
Add metrics to ngalert scheduler (#44602)
This pull request adds metrics to the ngalert scheduler so we can see how long it takes to evaluate a tick.
2022-01-31 16:56:43 +00:00
Serge Zaitsev
57fcfd578d
Chore: replace macaron with web package (#40136)
* replace macaron with web package

* add web.go
2021-10-11 14:30:59 +02:00
gotjosh
35e5bfce40
Alerting: Metrics should have the label org instead of user (#39353)
An user within Grafana has a completely different meaning. Multi-tenancy is done via Organizations as a top-level concept.
2021-09-17 17:17:26 +01:00
gotjosh
7db97097c9
Alerting: Support Unified Alerting with Grafana HA (#37920)
* Alerting: Support Unified Alerting in Grafana's HA mode.
2021-09-16 15:33:51 +01:00
Serge Zaitsev
063160aae2
Chore: pass url parameters through context.Context (#38826)
* pass url parameters through context.Context

* fix url param names without colon prefix

* change context params to vars

* replace url vars in tests using new api

* rename vars to params

* add some comments

* rename seturlvars to seturlparams
2021-09-14 18:34:56 +02:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00