Commit Graph

25 Commits

Author SHA1 Message Date
Carl Bergquist
a39d2ae8ea
instrumentation: change slogroup for alerting handlers to high-slow (#75460)
instrumentation: change slogroup for alerting handlers to high-fast

Signed-off-by: bergquist <carl.bergquist@gmail.com>
2023-09-29 14:56:48 +02:00
Carl Bergquist
10a82e30ba
Alerting: add route owner middleware (#73869)
alerting: add route owner middleware

Signed-off-by: bergquist <carl.bergquist@gmail.com>
2023-08-29 12:43:33 +02:00
Matthew Jacobson
ba3994d338
Alerting: Repurpose rule testing endpoint to return potential alerts (#69755)
* Alerting: Repurpose rule testing endpoint to return potential alerts

This feature replaces the existing no-longer in-use grafana ruler testing API endpoint /api/v1/rule/test/grafana. The new endpoint returns a list of potential alerts created by the given alert rule, including built-in + interpolated labels and annotations.

The key priority of this endpoint is that it is intended to be as true as possible to what would be generated by the ruler except that the resulting alerts are not filtered to only Resolved / Firing and ready to be sent.

This means that the endpoint will, among other things:

- Attach static annotations and labels from the rule configuration to the alert instances.
- Attach dynamic annotations from the datasource to the alert instances.
- Attach built-in labels and annotations created by the Grafana Ruler (such as alertname and grafana_folder) to the alert instances.
- Interpolate templated annotations / labels and accept allowed template functions.
2023-06-08 18:59:54 -04:00
Steve Simpson
9effb9a708
Alerting: Allow hooking into request handler functions. (#67000)
* Alerting: Allow hooking into request handler functions.

Adds a facility to AlertNG for hooking into API handlers, allowing the
replacement of request handlers for specific paths. One of goals of this
approach was to allow hooking as late as possible in the request, e.g.
after all middleware has been applied, to simplfiy usage.

* Update pkg/services/ngalert/api/hooks.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/services/ngalert/api/hooks.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Fixes to review comments

* Fix passing logger in

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-04-24 18:18:44 +02:00
idafurjes
6c5a573772
Chore: Move ReqContext to contexthandler service (#62102)
* Chore: Move ReqContext to contexthandler service

* Rename package to contextmodel

* Generate ngalert files

* Remove unused imports
2023-01-27 08:50:36 +01:00
Alexander Weaver
eb1293ebb1
Alerting: Re-generate swagger definitions (#62154)
Regenerate swagger, add body binding so parameters work
2023-01-25 15:01:42 -06:00
Yuri Tseretyan
ad09feed83
Alerting: rule backtesting API (#57318)
* Implement backtesting engine that can process regular rule specification (with queries to datasource) as well as special kind of rules that have data frame instead of query.
* declare a new API endpoint and model
* add feature toggle `alertingBacktesting`
2022-12-14 09:44:14 -05:00
Joe Blubaugh
689ae96a0e
Alerting: Refactor API types generation with different names. (#51785)
This changes the API codegen template (controller-api.mustache) to simplify some names. When this package was created, most APIs "forked" to either a Grafana backend implementation or a "Lotex" remote implementation. As we have added APIs it's no longer the case. Provisioning, configuration, and testing APIs do not fork, and we are likely to add additional APIs that don't fork.

This change replaces {{classname}}ForkingService with {{classname}} for interface names, and names the concrete implementation {{classname}}Handler. It changes the implied implementation of a route handler from fork{{nickname}} to handle{{nickname}}. So PrometheusApiForkingService becomes PrometheusApi, ForkedPrometheusApi becomes PrometheusApiHandler and forkRouteGetGrafanaAlertStatuses becomes handleRouteGetGrafanaAlertStatuses

It also renames some files - APIs that do no forking go from forked_{{name}}.go to {{name}}.go and APIs that still fork go from forked_{{name}}.go to forking_{{name}}.go to capture the idea that those files a "doing forking" rather than "are a fork of something."

Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-07-18 03:08:08 -04:00
Alexander Weaver
0d9389e1f4
Alerting: Code-gen parsing of URL parameters and fix related bugs (#50731)
* Extend template and generate

* Generate and fix up alertmanager endpoints

* Prometheus routes

* fix up Testing endpoints

* touch up ruler API

* Update provisioning and fix 500

* Drop dead code

* Remove more dead code

* Resolve merge conflicts
2022-06-23 15:13:39 -05:00
Jean-Philippe Quéméner
81d360529b
Alerting: Provisioning API - Alert rules (#47930) 2022-06-02 14:48:53 +02:00
Sofia Papagiannaki
925784f514
Alerting: Modify endpoint for testing a datasource rule using the UID (#48070)
* Modify testing endpoint to expect the datasource UID

* Update docs
2022-05-17 14:10:20 +03:00
Sofia Papagiannaki
54962c2f0c
Alerting: Rename Recipient path parameter to DatasourceID (#47949) 2022-04-20 16:20:17 +03:00
Yuriy Tseretyan
15e4556c2f
Alerting: update authorization logic to use proper legacy roles when fine-grained access is disabled (#46931)
* require legacy Editor for post, put, delete endpoints
* require user to be signed in on group level because handler that checks that user has role Editor does not check it is signed in
2022-03-24 17:13:47 -04:00
Yuriy Tseretyan
ddfe2dce74
Alerting: Split grafana and lotex routes (#44742)
* split Lotex and Grafana routes
* update template to use authorize function for every route
2022-02-04 12:42:04 -05:00
Sofia Papagiannaki
c6483cd8ed
Alerting: Refactor API handlers to use web.Bind (#42600)
* Alerting: Refactor API handlers to use web.Bind

* lint
2021-12-13 09:22:57 +01:00
gotjosh
a2f4344bf2
Alerting: Refactor & fix unified alerting metrics structure (#39151)
* Alerting: Refactor & fix unified alerting metrics structure

Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.

This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
2021-09-14 12:55:01 +01:00
Sofia Papagiannaki
7af329f385
Alerting: Fix API specification (#38753)
* Alerting: Fix API spec

* Add missing status codes
2021-09-10 12:46:02 +03:00
gotjosh
f83cd401e5
Alerting: Send alerts to external Alertmanager(s) (#37298)
* Alerting: Send alerts to external Alertmanager(s)

Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:

- Introduce a new table to hold "admin" (either org or global)
  configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
  "senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
  shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.

There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.
2021-08-06 13:06:56 +01:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Owen Diehl
ec37b4cb87
[Alerting] Automatic request instrumentation (#33444)
* alerting: automatic request instrumentation

* always expose alerting prom metrics

* globally register alerting metrics
2021-04-28 16:59:15 -04:00
Sofia Papagiannaki
b2288f7ef9
[Alerting]: Add alerting endpoint for Query Evaluation (#33174)
* [Alerting]: Add alerting endpoint for Query Evaluation

* Fix passing down now parameter

* Add validations and test

* Fix eval queries and expressions test

* Add eval tests
2021-04-21 22:44:50 +03:00
Owen Diehl
e065e19583
Fix/ngalert generation (#33172)
* fixes pkg names & alerting openapi generation

* cleans up api generation, uses docker & removes python
2021-04-20 13:12:32 -04:00
Owen Diehl
e37a780e14
Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename fodler->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Sofia Papagiannaki
e7ff04a167
[Alerting]: Implement test rule API route (#32837)
* [Alerting]: Implement test rule API route

* Apply suggestions from code review

* Call /query instead of /query_range
2021-04-13 20:58:34 +03:00
gotjosh
c9e5088e8b
Alerting: Cleanup and move legacy to a legacy file (#32803)
* Alerting: Cleanup and move legacy to a legacy file

A quick cleanup of the ngalert/api directory, optimising for an easy
removal of what is will be considered legacy at some point. A quick
summary of what's done is:

- Add a prefix `generated` prefix to files that are auto-generated by
  our swagger definitions.
- Create a legacy file to place all the legacy API routes implementation
  and helpers. Deleting files that where no longer needed after this
move.
- Rename the `lotex` file to `lotex_ruler`
- Adding a couple of comments here and there.

With this, I hope to organise our code in this directory a bit better
given there's a lot going on.
2021-04-09 05:55:41 -04:00