Commit Graph

17 Commits

Author SHA1 Message Date
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Kyle Brandt
914443c816
Alerting: Fix state cache id duplication (#33480) 2021-04-28 11:42:19 -04:00
David Parrott
788bc2a793
Alerting: refactor state tracker (#33292)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* rename state tracker

* set EvaluationDuration on Result

* default labels set as constants

* separate cache and state from manager

* use RWMutex in cache
2021-04-23 21:32:25 +02:00
David Parrott
ca79206498
Alerting: Handle NoData and Error evaluation results (#33194)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* add support for NoData and Error results

* rename test

* bring in changes from other PRs tha have been merged

* pr feedback

* add integration test

* close state tracker cleanup on context.Done

* fixup test

* not those annotations
2021-04-23 20:47:52 +02:00
gotjosh
de0802cf3b
Alerting: Fixes the integration test currently failing at master (#33233)
* Alerting: Fixes the integration test currently failing at master

* Skip the state tracker test for now
2021-04-21 14:57:17 -04:00
David Parrott
4be1d84f23
Alerting: Enhancements to /rules (#33085)
* set processing time

* merge labels and set on response

* use state cache for adding alerts to rules

* minor cleanup

* pr feedback

* Do not initialize mutex unnecessarily

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* linter

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-04-21 09:30:03 -07:00
Sofia Papagiannaki
87a70af7eb
[Alerting]: Fix updating rule group and add tests (#33074)
* [Alerting]: Fix updating rule group and add test

* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule

* Test updating annotations

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* add test for posting an unknown rule UID

* Fix alert rule validation and add tests

* Remove org id from PostableGrafanaRule

This field was not used; each rule gets the organisation of the user making
the rerquest

* Update pkg/tests/api/alerting/api_alertmanager_test.go

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-21 17:22:58 +03:00
Owen Diehl
e37a780e14
Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename fodler->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Kyle Brandt
2c862678ab
AlertingNG/SSE: Datasource UID and UID/ID only (drop name) (#33039)
SSE still will support ID until dashboard/frontend always requests UID
update *.http examples

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-16 15:29:19 +02:00
David Parrott
555da77527
Dparrott/labels on alert rule (#33057)
* move state tracker tests to /tests

* set default labels on alerts

* handle empty labels in result.Instance

* create annotation on transition to alerting state
2021-04-16 15:11:40 +02:00
David Parrott
567a6a09bd
Alerting: Return RuleResponse for api/prometheus/grafana/api/v1/rules (#32919)
* Return RuleResponse for api/prometheus/grafana/api/v1/rules

* change TODO to note

Co-authored-by: gotjosh <josue@grafana.com>

* pr feedback

* test fixup

Co-authored-by: gotjosh <josue@grafana.com>
2021-04-13 17:38:09 -04:00
Sofia Papagiannaki
daabf64aa1
[Alerting]: Update scheduler to evaluate rules created by the unified API (#32589)
* Update scheduler

* Fix tests

* Fixes after code review feedback

* lint - add uncommitted modifications

Co-authored-by: kyle <kyle@grafana.com>
2021-04-03 20:13:29 +03:00
Kyle Brandt
948da25c13
ngalerting: represent nil/empty labels the same (#32652)
was getting duplicates of [] and null before
2021-04-02 13:49:45 -04:00
David Parrott
2a8446e435
Alerting: Persist alerts on evaluation and shutdown. Warm cache from DB on startup (#32576)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* Save alert transitions and save all state on shutdown

* pr feedback

* WIP

* WIP

* Persist alerts on evaluation and shutdown. Warm cache on startup

* Filter non-firing alerts before sending to notifier

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-02 08:11:33 -07:00
David Parrott
b1cb74c0c9
Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
David Parrott
d33a77a67f
Alerting: add state tracker to alerting evaluation (#32298)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup
2021-03-24 15:34:18 -07:00
Sofia Papagiannaki
4ce0a49eac
AlertingNG: Split into several packages (#31719)
* AlertingNG: Split into several packages

* Move AlertQuery to models
2021-03-08 22:19:21 +02:00