Commit Graph

1446 Commits

Author SHA1 Message Date
gotjosh
433f6b91d0
Alerting: Introduce the silencing interface (#32517)
* Alerting: Introduce the silencing interface

The operations introduced are:

- Listing silences
- Retrieving an specific silence
- Deleting a silence
- Creating a silence

Signed-off-by: Josue Abreu <josue@grafana.com>

* Add a comment to listing silences

* Update to upstream alertmanager

* Remove copied code from the Alertmanager
2021-03-31 07:36:36 -04:00
David Parrott
b1cb74c0c9
Alerting: Send alerts from state tracker to notifier, logging, and cleanup task (#32333)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup

* Alerting: Send alerts from state tracker to notifier

* Add evaluation time and test

Add evaluation time and test

* Add cleanup routine and logging

* Pull in compact.go and reconcile differences

* pr feedback

* pr feedback

Pull in compact.go and reconcile differences

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-30 09:37:56 -07:00
Sofia Papagiannaki
c4d5a67b38
[Alerting] Forking alert manager API (#32300)
* Alertmanager lotex ruler

* Apply suggestions from code review
2021-03-29 18:18:25 +03:00
Ganesh Vernekar
740c5813d4
AlertingNG: Fix dispatcher metrics in notifier (#32434)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-29 20:35:15 +05:30
Domas
0779dab0de
NgAlerting: loki & cortex have different prom & ruler endpoint prefixes (#32344) 2021-03-29 08:55:09 +03:00
Ganesh Vernekar
a0db4dce32
Render new email template and fix the title (#32314)
* Render new email template and fix the title

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nit

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-26 12:53:14 +05:30
Ganesh Vernekar
093e5947f4
Upgrade Prometheus Alertmanager and small fixes (#32280)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-25 12:51:44 +01:00
David Parrott
d33a77a67f
Alerting: add state tracker to alerting evaluation (#32298)
* Initial commit for state tracking

* basic state transition logic and tests

* constructor. test and interface fixup

* use new sig for sch.definitionRoutine()

* test fixup

* make the linter happy

* more minor linting cleanup
2021-03-24 15:34:18 -07:00
gotjosh
58b814bd7d
Alerting: Add StartAt and FiredAt to the alert evaluation result (#32302) 2021-03-24 20:27:04 +00:00
Kyle Brandt
66548878fe
ngalert: add addition temp translation endpoint (#32287)
spits out new sse condition/data json from old alert id to help with generating UI models
also moves this api code into another file
2021-03-24 17:12:34 +01:00
gotjosh
9b52ffc6a9
Alerting: Fetch configuration from the database and run a notification service (#32175)
* Alerting: Fetch configuration from the database and run a notification
instance

Co-Authored-By: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-03-24 14:20:44 +00:00
Kyle Brandt
05fa9d6ad9
AlertingNG: rename Condition properties to match front end (#32267) 2021-03-24 08:07:29 -04:00
Owen Diehl
2179a2658e
Extricates reusable utilities for different alerting proxy types (#32268)
* backendtype helper

* abstracts alertingproxy

* updates alerting api dep

* prom endpoints
2021-03-24 07:43:25 -04:00
Kyle Brandt
7bb79158ed
SSE/Alerting: First pass at query/condition translation (#31693)
- Takes the conditions property from the settings column of an alert from alerts table and turns into an ng alerting condition with the queries and classic condition.
- Has temp API rest endpoint that will take the dashboard conditions json, translate it to SEE queries + classic condition, and execute it (only enabled in dev mode).
- Changes expressions to catch query responses with a non-nil error property
- Adds two new states for an NG instance result (NoData, Error) and updates evaluation to match those states
- Changes the AsDataFrame (for frontend) from Bool to string to represent additional states
- Fix bug in condition model to accept first Operator as empty string.
- In ngalert, adds GetQueryDataRequest, which was part of execute and is still called from there. But this allows me to get the Expression request from a condition to make the "pipeline" can be built.
- Update AsDataFrame for evalresult to be row based so it displays a little better for now
2021-03-23 12:11:15 -04:00
Sofia Papagiannaki
24cb059a6b
[Alerting]: implement backend checking for forking to Lotex ruler (#32208)
* Rename DatasourceId path parameter

* Implement fork ruler backendType()

* Apply suggestions from code review
2021-03-23 18:08:57 +02:00
Owen Diehl
93d0f7163f
[Alerting] Forking LoTex ruler (#32138)
* updates alerting api to master

* skeleton for lotex ruler

* withPath helper & legacyRulerPrefix const

* forked ruler

* wires up proxy

* safeMacaronWrapper

* working proxy

* jsonExtractor

* lint
2021-03-19 10:32:13 -04:00
Ganesh Vernekar
8854001b67
AlertingNG: Refactor notifier to support config reloads (#32099)
* AlertingNG: Refactor notifier to support config reloads

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments and make reloading of config a sync operation

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-19 09:26:00 +01:00
gotjosh
cc74b1fe46
Alerting: Add database table for persisting alerting configuration (#32042)
* Alerting: Add database table for persisting alerting configuration

* Fix the linter

* Address review comments

* Don't split templates and configuration

It is already bundled together as part of a of the API so might as well
marshall it directly.
2021-03-18 18:12:28 +00:00
Ganesh Vernekar
0b788b5ce8
AlertingNG: Notification channel for emails (#31768)
* Email notification channel in ngalert

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use existing templating system

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Update template and add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-18 14:55:11 +00:00
Ganesh Vernekar
974ccf8091
AlertingNG: Fix the alerting stage for legacy alerts (#32025)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2021-03-17 16:38:33 +00:00
Sofia Papagiannaki
68b05b8aaa
AlertingNG: Unified alerting API mock (#32040)
* AlertingNG: Alertmanager mock API

* AlertingNG: Remove permissions API routes

* Add example POST payloads

* Prometheus and testing mock API
2021-03-17 12:47:03 +02:00
Ganesh Vernekar
ecbc98ba5d
AlertingNG: Add alert provider and basic structure with dispatcher, silences and delivery stages (#31833)
* AlertingNG: Add alert provider

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Alertmanager WIP

* Merge alertmanager into notifier

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fixes for PR 31833 (#31990)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Use alertmanager from upgrad-uuid temporarily to unblock

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix lint

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

Co-authored-by: Josue Abreu <josue@grafana.com>
2021-03-16 10:44:52 +01:00
Sofia Papagiannaki
fe628c6282
AlertingNG: base API implementation (#31824)
* AlertingNG: base API implementation

* Pass the interface instead of the base impl

* Ruler mock draft (WIP)

* Update alerting-api dependency

* Improve mock implementation
2021-03-11 21:28:00 +02:00
Sofia Papagiannaki
53bccf1b77
Replace eval.Condition with models.Condition (#31909) 2021-03-11 18:56:58 +02:00
Sofia Papagiannaki
4ce0a49eac
AlertingNG: Split into several packages (#31719)
* AlertingNG: Split into several packages

* Move AlertQuery to models
2021-03-08 22:19:21 +02:00
Arve Knudsen
b79e61656a
Introduce TSDB service (#31520)
* Introduce TSDB service

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Co-authored-by: Erik Sundell <erik.sundell87@gmail.com>
Co-authored-by: Will Browne <will.browne@grafana.com>
Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Zoltán Bedi <zoltan.bedi@gmail.com>
2021-03-08 07:02:49 +01:00
Sofia Papagiannaki
bd2390c49f
AlertingNG: code refactoring (#30787)
* AlertingNG: refactoring

* Fix tests
2021-03-03 17:52:19 +02:00
Hugo Häggmark
1725bf773f
Chore: Fixes small typos (#31461) 2021-02-25 08:59:26 +01:00
Peter Holmberg
aaf5710748
AlertingNG: Edit Alert Definition (#30676)
* break out new and edit

* changed model to match new model in backend

* AlertingNG: API modifications (#30683)

* Fix API consistency

* Change eval alert definition to POST request

* Fix eval endpoint to accept custom now parameter

* Change JSON input property for create/update endpoints

* model adjustments

* set mixed datasource, fix put url

* update snapshots

* remove edit and add landing page

* remove snapshot tests ans snapshots

* wrap linkbutton in array

Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2021-02-04 09:13:02 +01:00
Sofia Papagiannaki
5d029abc42
AlertingNG: change API permissions (#30781) 2021-02-02 10:37:01 +02:00
Sofia Papagiannaki
1c158744e8
AlertingNG: pause/unpause definitions via the API (#30627)
* AlertingNG: pause/unpause definitions via the API

* Apply suggestions from code review

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>

* Enable pausing/unpausing multiple definitions

Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
2021-01-27 15:51:00 +02:00
Sofia Papagiannaki
9ada4b6052
Expressions: Add option to disable feature (#30541)
* Expressions: Add option to disable feature

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>
2021-01-22 19:27:33 +02:00
Sofia Papagiannaki
b1debc9c46
AlertingNG: Enforce unique alert definition title (non empty)/UID per organisation (#30380)
* Enforce unique alert definition title/uid per org

* Remove print statement from test

* Do not allow empty alert definition titles

* update error message on dup title

* also add title error to update

* CamelCase json properties

* Add test for title unique enforcement in updates

Co-authored-by: kyle <kyle@grafana.com>
2021-01-19 19:11:11 +02:00
Sofia Papagiannaki
8c31e25926
AlertingNG: Save alert instances (#30223)
* AlertingNG: Save alert instances

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Rename alert instance fields/columns

* Include definition title in listing alert instances

* Delete instances when deleting defintion

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2021-01-18 20:57:17 +02:00
Sofia Papagiannaki
2b15581339
AlertingNG: Modify queries and transform endpoint to get datasource UIDs (#30297)
* Pass skipCache from context

* Use macaron Params instead of ParamsEscape for UIDs

* Modify queries and transform to get datasource UIDs

* Update github.com/grafana/grafana-plugin-sdk-go to v0.83.0
2021-01-15 18:33:50 +02:00
Hugo Häggmark
3d41267fc4
Chore: Moves common and response into separate packages (#30298)
* Chore: moves common and response into separate packages

* Chore: moves common and response into separate packages

* Update pkg/api/utils/common.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Chore: changes after PR comments

* Chore: move wrap to routing package

* Chore: move functions in common to response package

* Chore: move functions in common to response package

* Chore: formats imports

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2021-01-15 14:43:20 +01:00
Sofia Papagiannaki
551428af61
Fix alert definition routine stop (#30117) 2021-01-11 16:14:03 +02:00
Sofia Papagiannaki
5560be73bf
Alerting NG: update API to expect UIDs instead of IDs (#29896)
* Change API to expect UIDs instead of ID

* Remove unnecessary transactions

When only one query is executed

* Modify API responses

* Cleanup tests

* Use globally orgID and UID for identifying alert definitions
2021-01-07 17:45:42 +02:00
Sofia Papagiannaki
3cac10e598
AlertingNG: Create a scheduler to evaluate alert definitions (#29305)
* Always use cache: stop passing skipCache among ngalert functions

* Add updated column

* Scheduler initial draft

* Add retry on failure

* Allow settting/updating alert definition interval

Set default interval if no interval is provided during alert definition creation.
Keep existing alert definition interval if no interval is provided during alert definition update.

* Parameterise alerting.Ticker to run on custom interval

* Allow updating alert definition interval without having to provide the queries and expressions

* Add schedule tests

* Use xorm tags for having initialisms with consistent case in Go

* Add ability to pause/unpause the scheduler

* Add alert definition versioning

* Optimise scheduler to fetch alert definition only when it's necessary

* Change MySQL data column to mediumtext

* Delete alert definition versions

* Increase default scheduler interval to 10 seconds

* Fix setting OrgID on updates

* Add validation for alert definition name length

* Recreate tables
2020-12-17 16:00:09 +02:00
Kyle Brandt
6d64c603c2
Expr: fix failure to execute due to OrgID (#29653)
* Expr: fix failure to execute due to OrgID
Get orgID from the plugin context, which makes more sense anyways.
makes expressions work again after https://github.com/grafana/grafana/pull/29449 changes.

* Do not save organisation on its alert query model

Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
2020-12-07 10:30:38 -05:00
Kyle Brandt
0cb29d337a
Expressions: Move GEL into core as expressions (#29072)
* comes from grafana/gel-app
* remove transform plugin code
* move __expr__ and -100 constants to expr pkg
* set OrgID on request plugin context
* use gtime for resample duration
* in resample, rename "rule" to "window", use gtime for duration, parse duration before exec
* remove gel entry from plugins-bundled/external.json
which creates an empty array for plugins
2020-11-19 07:17:00 -05:00
Sofia Papagiannaki
43f580c299
AlertingNG: manage and evaluate alert definitions via the API (#28377)
* Alerting NG: prototype v2 (WIP)

* Separate eval package

* Modify eval alert definition endpoint

* Disable migration if ngalert is not enabled

* Remove premature test

* Fix lint issues

* Delete obsolete struct

* Apply suggestions from code review

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Add API endpoint for listing alert definitions

* Introduce index for alert_definition table

* make ds object for expression to avoid panic

* wrap error

* Update pkg/services/ngalert/eval/eval.go

* Swith to backend.DataQuery

* Export TransformWrapper callback

* Fix lint issues

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: Kyle Brandt <kyle@grafana.com>

* Validate alert definitions before storing them

* Introduce AlertQuery

* Add test

* Add QueryType in AlertQuery

* Accept only float64 (seconds) durations

* Apply suggestions from code review

* Get rid of bus

* Do not export symbols

* Fix failing test

* Fix failure due to service initialization order

Introduce MediumHigh service priority and assign it to backendplugin
service

* Fix test

* Apply suggestions from code review

* Fix renamed reference

Co-authored-by: Kyle Brandt <kyle@grafana.com>
2020-11-12 15:11:30 +02:00
Arve Knudsen
7897c6b7d5
Chore: Fix staticcheck issues (#28854)
* Chore: Fix issues reported by staticcheck

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Undo changes

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2020-11-05 11:57:20 +01:00
Kyle Brandt
44a795cb17
AlertingNG: remove warn/crit from eval prototype (#28334)
and misc cleanup
2020-10-16 12:33:57 -04:00
Sofia Papagiannaki
4ab90f9397
Fix linting: remove commented code (#28208) 2020-10-13 08:35:41 +02:00
Sofia Papagiannaki
4acbcd7053
AlertingNG: POC of evaluator under feature flag. (#27922)
* New feature toggle for enabling alerting NG

* Initial commit

* Modify evaluate alert API request

* Check for unique labels in alert execution result dataframes

* Remove print statement

* Additional minor fixes/comments

* Fix lint issues

* Add API endpoint for evaluating panel queries

* Push missing renaming

* add refId for condition to API

* add refId for condition to API

* switch dashboard based eval to get method

* add from/to params to dashboard based eval

* add from/to params to  eval endpoint

Co-authored-by: kyle <kyle@grafana.com>
2020-10-12 21:51:39 +03:00