Commit Graph

135 Commits

Author SHA1 Message Date
Ganesh Vernekar
8d442c9b44
NGAlert: Fix templating and remove unwanted default templates (#33918)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-12 15:13:43 +05:30
Sofia Papagiannaki
1c58fd380f
[Alerting]: store encrypted receiver secure settings (#33832)
* [Alerting]: Store secure settings encrypted

* Move encryption to the API handler
2021-05-10 15:30:42 +03:00
Ganesh Vernekar
1b8c0ce88b
NGAlert: Fix some TODOs in notification channels (#33739)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-05 17:48:40 +05:30
Ganesh Vernekar
918552d34b
NGAlert: Send list of available ngalert notification channels via API (#33489)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-05-04 13:58:39 +02:00
Owen Diehl
5e48b54549
Alerting/metrics (#33547)
* moves alerting metrics to their own pkg

* adds grafana_alerting_alerts (by state) metric

* alerts_received_{total,invalid}

* embed alertmanager alerting struct in ng metrics & remove duplicated notification metrics (already embed alertmanager notifier metrics)

* use silence metrics from alertmanager lib

* fix - manager has metrics

* updates ngalert tests

* comment lint
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>

* cleaner prom registry code

* removes ngalert global metrics

* new registry use in all tests

* ngalert metrics impl service, hack testinfra code to prevent duplicate metric registrations

* nilmetrics unexported
2021-04-30 12:28:06 -04:00
Ganesh Vernekar
be1affe0a4
NGAlert: Fix flaky test (#33415)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-27 17:03:22 +05:30
Ganesh Vernekar
659ea20c3c
NGAlert: Run the maintenance cycle for the silences (#33301)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 16:19:03 +02:00
Ganesh Vernekar
d66a5e65a4
AlertingNG: Add webhook notification channel (#33229)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-23 18:59:28 +05:30
Ganesh Vernekar
a0e567f80f
AlertingNG: Add Dingding notification channel (#32995)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 19:30:49 +02:00
Ganesh Vernekar
4ec1edfca3
AlertingNG: Add Teams notification channel (#32979)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 18:16:26 +02:00
Ganesh Vernekar
c9cd7ea701
AlertingNG: Add Telegram notification channel (#32795)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 17:24:59 +02:00
Ganesh Vernekar
0a03d5c29e
AlertingNG: Correctly set StartsAt, EndsAt, UpdatedAt after alert reception (#33109)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-22 20:42:18 +05:30
Ganesh Vernekar
3056f86f76
AlertingNG: Fix TODOs in email notification channel (#33169)
* AlertingNG: Fix TODOs in email notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test fixup

* Remove the receiver field it is not needed for the email notification

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-04-22 10:01:55 -04:00
Arve Knudsen
6408b55a7c
Slack: Use chat.postMessage API by default (#32511)
* Slack: Use only chat.postMessage API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Check for response error

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Support custom webhook URL

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Simplify

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Rewrite tests to use stdlib

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/alerting/notifiers/slack.go

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>

* Clarify URL field name

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix linting issue

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix test

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix up new Slack notifier

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Improve tests

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix lint

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Slack: Make token not required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Alerting: Send validation errors back to client

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Document how token is required

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Make recipient required when using Slack API

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Fix field description

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Dimitris Sotirakis <sotirakis.dim@gmail.com>
2021-04-22 16:00:21 +02:00
Arve Knudsen
66020b419c
NGAlert: Consolidate on standard errors package (#33249)
* NGAlert: Don't use pkg/errors

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/services/ngalert/notifier/alertmanager.go

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>

* Fix logging

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
2021-04-22 11:18:25 +02:00
gotjosh
23c7e7ab60
Alerting: Various fixes for the alerts endpoint (#33182)
A set of fixes for the GET alert and groups endpoints.

- First, is the fact that the default values where not being for the query params. I've introduced a new method in the Grafana context that allow us to do this.
- Second, is the fact that alerts were never being transitioned to active. To my surprise this is actually done by the inhibitor in the pipeline - if an alert is not muted, or inhibited then it's active.
- Third, I have added an integration test to cover for regressions.

Signed-off-by: Josue Abreu <josue@grafana.com>
2021-04-21 06:34:42 -04:00
Owen Diehl
e37a780e14
Inhouse alerting api (#33129)
* init

* autogens AM route

* POST dashboards/db spec

* POST alert-notifications spec

* fix description

* re inits vendor, updates grafana to master

* go mod updates

* alerting routes

* renames to receivers

* prometheus endpoints

* align config endpoint with cortex, include templates

* Change grafana receiver type

* Update receivers.go

* rename struct to stop swagger thrashing

* add rules API

* index html

* standalone swagger ui html page

* Update README.md

* Expose GrafanaManagedAlert properties

* Some fixes

- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties

* am alerts routes

* rename prom swagger section for clarity, remove example endpoints

* Add missing json and yaml tags

* folder perms

* make folders POST again

* fix grafana receiver type

* rename fodler->namespace for perms

* make ruler json again

* PR fixes

* silences

* fix Ok -> Ack

* Add id to POST /api/v1/silences (#9)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add POST /api/v1/alerts (#10)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* fix silences

* Add testing endpoints

* removes grpc replace directives

* [wip] starts validation

* pkg cleanup

* go mod tidy

* ignores vendor dir

* Change response type for Cortex/Loki alerts

* receiver unmarshaling tests

* ability to split routes between AM & Grafana

* api marshaling & validation

* begins work on routing lib

* [hack] ignores embedded field in generation

* path specific datasource for alerting

* align endpoint names with cloud

* single route per Alerting config

* removes unused routing pkg

* regens spec

* adds datasource param to ruler/prom route paths

* Modifications for supporting migration

* Apply suggestions from code review

* hack for cleaning circular refs in swagger definition

* generates files

* minor fixes for prom endpoints

* decorate prom apis with required: true where applicable

* Revert "generates files"

This reverts commit ef7e975584.

* removes server autogen

* Update imported structs from ngalert

* Fix listing rules response

* Update github.com/prometheus/common dependency

* Update get silence response

* Update get silences response

* adds ruler validation & backend switching

* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response

* Distinct gettable and postable grafana receivers

* Remove permissions routes

* Latest JSON specs

* Fix testing routes

* inline yaml annotation on apirulenode

* yaml test & yamlv3 + comments

* Fix yaml annotations for embedded type

* Rename DatasourceId path parameter

* Implement Backend.String()

* backend zero value is a real backend

* exports DiscoveryBase

* Fix GO initialisms

* Silences: Use PostableSilence as the base struct for creating silences

* Use type alias instead of struct embedding

* More fixes to alertmanager silencing routes

* post and spec JSONs

* Split rule config to postable/gettable

* Fix empty POST /silences payload

Recreating the generated JSON specs fixes the issue
without further modifications

* better yaml unmarshaling for nested yaml docs in cortex-am configs

* regens spec

* re-adds config.receivers

* omitempty to align with prometheus API behavior

* Prefix routes with /api

* Update Alertmanager models

* Make adjustments to follow the Alertmanager API

* ruler: add for and annotations to grafana alert (#45)

* Modify testing API routes

* Fix grafana rule for field type

* Move PostableUserConfig validation to this library

* Fix PostableUserConfig YAML encoding/decoding

* Use common fields for grafana and lotex rules

* Add namespace id in GettableGrafanaRule

* Apply suggestions from code review

* fixup

* more changes

* Apply suggestions from code review

* aligns structure pre merge

* fix new imports & tests

* updates tooling readme

* goimports

* lint

* more linting!!

* revive lint

Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-04-19 14:26:04 -04:00
Ganesh Vernekar
6271777ec6
AlertingNG: Remove the receivers field from postable alerts (#33068)
* AlertingNG: Remove the receivers field from postable alerts and update tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-19 12:28:44 +05:30
Ganesh Vernekar
04a8d5407e
AlertingNG: Slack notification channel (#32675)
* AlertingNG: Slack notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and small refactoring

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-15 16:01:41 +05:30
gotjosh
528ca9134b
Alerting: Use a default configuration and periodically poll for new ones (#32851)
* Alerting: Use a default configuration and periodically poll for new ones

Use a default configuration to make sure we always start the grafana
instance. Then, regularly poll for new ones.

I've also made sure that failures to apply configuration do not stop the
Grafana server but instead keep polling until it is a success.
2021-04-13 13:02:44 +01:00
Ganesh Vernekar
e3a1d3d158
AlertingNG: PagerDuty notification channel (#32604)
* AlertingNG: PagerDuty notification channel

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix lint

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 19:21:09 +02:00
Ganesh Vernekar
b1c84c795f
AlertingNG: Add a global registry for notification channels (#32781)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-08 22:01:23 +05:30
gotjosh
fe67680c42
Alerting: Allow querying of Alerts from notifications (#32614)
* Alerting: Allow querying of Alerts from notifications

* Wire everything up

* Remove unused functions

* Remove duplicate line
2021-04-08 07:27:59 -04:00
Ganesh Vernekar
0f7d8ae6d2
Update email template for AlertingNG (#32691)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-04-07 14:52:48 +05:30
Sofia Papagiannaki
a5e95823b2
[Alerting]: Alertmanager API implementation (#32174)
* Add validation for grafana recipient

* Alertmanager API implementation (WIP)

* Fix encoding/decoding receiver settings from/to YAML

* Save templates together with the configuration

* update POST to apply latest config

* Alertmanager service enabled by the ngalert toggle

* Silence API integration with Alertmanager

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-31 23:00:56 +03:00
gotjosh
433f6b91d0
Alerting: Introduce the silencing interface (#32517)
* Alerting: Introduce the silencing interface

The operations introduced are:

- Listing silences
- Retrieving an specific silence
- Deleting a silence
- Creating a silence

Signed-off-by: Josue Abreu <josue@grafana.com>

* Add a comment to listing silences

* Update to upstream alertmanager

* Remove copied code from the Alertmanager
2021-03-31 07:36:36 -04:00
David Parrott
b1cb74c0c9
Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
Ganesh Vernekar
740c5813d4
AlertingNG: Fix dispatcher metrics in notifier (#32434)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-29 20:35:15 +05:30
Ganesh Vernekar
a0db4dce32
Render new email template and fix the title (#32314)
* Render new email template and fix the title

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nit

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-26 12:53:14 +05:30
Ganesh Vernekar
093e5947f4
Upgrade Prometheus Alertmanager and small fixes (#32280)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-25 12:51:44 +01:00
gotjosh
9b52ffc6a9
Alerting: Fetch configuration from the database and run a notification service (#32175)
* Alerting: Fetch configuration from the database and run a notification
instance

Co-Authored-By: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-24 14:20:44 +00:00
Ganesh Vernekar
8854001b67
AlertingNG: Refactor notifier to support config reloads (#32099)
* AlertingNG: Refactor notifier to support config reloads

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and make reloading of config a sync operation

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-19 09:26:00 +01:00
Ganesh Vernekar
0b788b5ce8
AlertingNG: Notification channel for emails (#31768)
* Email notification channel in ngalert

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use existing templating system

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Update template and add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-18 14:55:11 +00:00
Ganesh Vernekar
974ccf8091
AlertingNG: Fix the alerting stage for legacy alerts (#32025)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-17 16:38:33 +00:00
Ganesh Vernekar
ecbc98ba5d
AlertingNG: Add alert provider and basic structure with dispatcher, silences and delivery stages (#31833)
* AlertingNG: Add alert provider

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Alertmanager WIP

* Merge alertmanager into notifier

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fixes for PR 31833 (#31990)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use alertmanager from upgrad-uuid temporarily to unblock

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix lint

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-16 10:44:52 +01:00