Commit Graph

8 Commits

Author SHA1 Message Date
Yuriy Tseretyan
ddfe2dce74
Alerting: Split grafana and lotex routes (#44742)
* split Lotex and Grafana routes
* update template to use authorize function for every route
2022-02-04 12:42:04 -05:00
Sofia Papagiannaki
c6483cd8ed
Alerting: Refactor API handlers to use web.Bind (#42600)
* Alerting: Refactor API handlers to use web.Bind

* lint
2021-12-13 09:22:57 +01:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
gotjosh
f83cd401e5
Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Owen Diehl
ec37b4cb87
[Alerting] Automatic request instrumentation (#33444)
* alerting: automatic request instrumentation

* always expose alerting prom metrics

* globally register alerting metrics
2021-04-28 16:59:15 -04:00
Owen Diehl
e065e19583
Fix/ngalert generation (#33172)
* fixes pkg names & alerting openapi generation

* cleans up api generation, uses docker & removes python
2021-04-20 13:12:32 -04:00
gotjosh
c9e5088e8b
Alerting: Cleanup and move legacy to a legacy file (#32803)
* Alerting: Cleanup and move legacy to a legacy file

A quick cleanup of the ngalert/api directory, optimising for an easy
removal of what is will be considered legacy at some point. A quick
summary of what's done is:

- Add a prefix `generated` prefix to files that are auto-generated by
  our swagger definitions.
- Create a legacy file to place all the legacy API routes implementation
  and helpers. Deleting files that where no longer needed after this
move.
- Rename the `lotex` file to `lotex_ruler`
- Adding a couple of comments here and there.

With this, I hope to organise our code in this directory a bit better
given there's a lot going on.
2021-04-09 05:55:41 -04:00