* [Alerting]: Add alerting endpoint for Query Evaluation
* Fix passing down now parameter
* Add validations and test
* Fix eval queries and expressions test
* Add eval tests
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* pr feedback
* Do not initialize mutex unnecessarily
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* linter
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* [Alerting]: Fix updating rule group and add test
* Fix updating rule labels
* Set default values for rule no data and error states
if they are missing
* Add test for updating rule
* Test updating annotations
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* add test for posting an unknown rule UID
* Fix alert rule validation and add tests
* Remove org id from PostableGrafanaRule
This field was not used; each rule gets the organisation of the user making
the rerquest
* Update pkg/tests/api/alerting/api_alertmanager_test.go
Co-authored-by: gotjosh <josue@grafana.com>
A set of fixes for the GET alert and groups endpoints.
- First, is the fact that the default values where not being for the query params. I've introduced a new method in the Grafana context that allow us to do this.
- Second, is the fact that alerts were never being transitioned to active. To my surprise this is actually done by the inhibitor in the pipeline - if an alert is not muted, or inhibited then it's active.
- Third, I have added an integration test to cover for regressions.
Signed-off-by: Josue Abreu <josue@grafana.com>
* init
* autogens AM route
* POST dashboards/db spec
* POST alert-notifications spec
* fix description
* re inits vendor, updates grafana to master
* go mod updates
* alerting routes
* renames to receivers
* prometheus endpoints
* align config endpoint with cortex, include templates
* Change grafana receiver type
* Update receivers.go
* rename struct to stop swagger thrashing
* add rules API
* index html
* standalone swagger ui html page
* Update README.md
* Expose GrafanaManagedAlert properties
* Some fixes
- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties
* am alerts routes
* rename prom swagger section for clarity, remove example endpoints
* Add missing json and yaml tags
* folder perms
* make folders POST again
* fix grafana receiver type
* rename fodler->namespace for perms
* make ruler json again
* PR fixes
* silences
* fix Ok -> Ack
* Add id to POST /api/v1/silences (#9)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Add POST /api/v1/alerts (#10)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* fix silences
* Add testing endpoints
* removes grpc replace directives
* [wip] starts validation
* pkg cleanup
* go mod tidy
* ignores vendor dir
* Change response type for Cortex/Loki alerts
* receiver unmarshaling tests
* ability to split routes between AM & Grafana
* api marshaling & validation
* begins work on routing lib
* [hack] ignores embedded field in generation
* path specific datasource for alerting
* align endpoint names with cloud
* single route per Alerting config
* removes unused routing pkg
* regens spec
* adds datasource param to ruler/prom route paths
* Modifications for supporting migration
* Apply suggestions from code review
* hack for cleaning circular refs in swagger definition
* generates files
* minor fixes for prom endpoints
* decorate prom apis with required: true where applicable
* Revert "generates files"
This reverts commit ef7e975584.
* removes server autogen
* Update imported structs from ngalert
* Fix listing rules response
* Update github.com/prometheus/common dependency
* Update get silence response
* Update get silences response
* adds ruler validation & backend switching
* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response
* Distinct gettable and postable grafana receivers
* Remove permissions routes
* Latest JSON specs
* Fix testing routes
* inline yaml annotation on apirulenode
* yaml test & yamlv3 + comments
* Fix yaml annotations for embedded type
* Rename DatasourceId path parameter
* Implement Backend.String()
* backend zero value is a real backend
* exports DiscoveryBase
* Fix GO initialisms
* Silences: Use PostableSilence as the base struct for creating silences
* Use type alias instead of struct embedding
* More fixes to alertmanager silencing routes
* post and spec JSONs
* Split rule config to postable/gettable
* Fix empty POST /silences payload
Recreating the generated JSON specs fixes the issue
without further modifications
* better yaml unmarshaling for nested yaml docs in cortex-am configs
* regens spec
* re-adds config.receivers
* omitempty to align with prometheus API behavior
* Prefix routes with /api
* Update Alertmanager models
* Make adjustments to follow the Alertmanager API
* ruler: add for and annotations to grafana alert (#45)
* Modify testing API routes
* Fix grafana rule for field type
* Move PostableUserConfig validation to this library
* Fix PostableUserConfig YAML encoding/decoding
* Use common fields for grafana and lotex rules
* Add namespace id in GettableGrafanaRule
* Apply suggestions from code review
* fixup
* more changes
* Apply suggestions from code review
* aligns structure pre merge
* fix new imports & tests
* updates tooling readme
* goimports
* lint
* more linting!!
* revive lint
Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* move state tracker tests to /tests
* set default labels on alerts
* handle empty labels in result.Instance
* create annotation on transition to alerting state
* set query in rules response
* Theme: tweaking dark theme colors (#33007)
* Library Panels: Add library panel tab to share modal (#32953)
* Explore: Scroll split panes in Explore independently (#32978)
* Change default prometheus to latest and prometheus v1 to prometheus1
* Update README
* Remove prometheus1 block as not used
* Explore: Separatae scrolling in split view
* Update snapshot
* Allow skip migrations in tests via environment variable (#32958)
* Dashboard: Fix issue where Slack notifications won't link to users (#32861)
* DashboardPage: refactored styles from sass to emotion (#32955)
* DashboardPage: refactored styles from sass to emotion
* refactored dashboardPage component to be alot easier to read and understand
* more refactoring...
* more cleaning...
* fixes frontend test
* fixes frontend test- I hope
* fixes frontend test- I hope
* moves dashboard scss styles back to it's standalone file
* GraphNG: use theme font family and size for axis labels (#33009)
* GraphNG: use theme font family and size for axis labels
* fix test
* AlertingNG: Slack notification channel (#32675)
* AlertingNG: Slack notification channel
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Add tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments and small refactoring
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* GraphNG: stacking (#30749)
* First iteration
* Dev dash
* Re-use StackingMode type
* Fix ts and api issues
* Stacking work resurected
* Fix overrides
* Correct values in tooltip and updated test dashboard
* Update dev dashboard
* Apply correct bands for stacking
* Merge fix
* Update snapshot
* Revert go.sum
* Handle null values correctyl and make filleBelowTo and stacking mutual exclusive
* Snapshots update
* Graph->Time series stacking migration
* Review comments
* Indicate overrides in StandardEditorContext
* Change stacking UI editor, migrate stacking to object option
* Small refactor, fix for hiding series and dev dashboard
* VizLegend: sets a min and max value of the seriesCount control in Storybook (#33022)
* Alerting: Filter rules list (#32818)
* Chore: Reduces strict errors (#33012)
* Chore: reduces strict error in OptionPicker tests
* Chore: reduces strict errors in FormDropdownCtrl
* Chore: reduces has no initializer and is not definitely assigned in the constructor errors
* Chore: reduces has no initializer and is not definitely assigned in the constructor errors
* Chore: lowers strict count limit
* Tests: updates snapshots
* Tests: updates snapshots
* Chore: updates after PR comments
* Refactor: removes throw and changes signature for DashboardSrv.getCurrent
* [Alerting]: Several modifications in alert rules (#32983)
* [Alerting]: Use common properties for all rules
* Add Labels in rules
* Fix update ruleGroup API
Return 400 Bad Request response
when the request contains a UID that does not exist
* Check permissions and return namespace id
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* WIP (#33025)
* Chore: Bump strict error count limit (#33035)
* set query in rules response
Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: kay delaney <45561153+kaydelaney@users.noreply.github.com>
Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>
Co-authored-by: Dafydd <72009875+dafydd-t@users.noreply.github.com>
Co-authored-by: n-wbrown <n-wbrown@users.noreply.github.com>
Co-authored-by: Uchechukwu Obasi <obasiuche62@gmail.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Dominik Prokop <dominik.prokop@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
Co-authored-by: Hugo Häggmark <hugo.haggmark@grafana.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
* [Alerting]: Use common properties for all rules
* Add Labels in rules
* Fix update ruleGroup API
Return 400 Bad Request response
when the request contains a UID that does not exist
* Check permissions and return namespace id
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* [Alerting]: Fix empty rules evaluation statuses
`GetRuleGroupAlertRules()` requires an non empty namespaceUID
* Include the namespace into the response
* [Alerting]: Use title instead of slug for retrieving the namespace
* Apply suggestions from code review
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
* Alerting: Use a default configuration and periodically poll for new ones
Use a default configuration to make sure we always start the grafana
instance. Then, regularly poll for new ones.
I've also made sure that failures to apply configuration do not stop the
Grafana server but instead keep polling until it is a success.
* add db columns
* Fix deserialisation issue of AlertRule For field (#32848)
* Update to latest alerting-api
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
* Alerting: Cleanup and move legacy to a legacy file
A quick cleanup of the ngalert/api directory, optimising for an easy
removal of what is will be considered legacy at some point. A quick
summary of what's done is:
- Add a prefix `generated` prefix to files that are auto-generated by
our swagger definitions.
- Create a legacy file to place all the legacy API routes implementation
and helpers. Deleting files that where no longer needed after this
move.
- Rename the `lotex` file to `lotex_ruler`
- Adding a couple of comments here and there.
With this, I hope to organise our code in this directory a bit better
given there's a lot going on.
* Return cached alerts for prometheus/api/v1/alerts
* Return not implemented for /prometheus/grafana/api/v1/rules
* Set StartsAt for already alerting states
* Fix tests
* Initial commit for state tracking
* basic state transition logic and tests
* constructor. test and interface fixup
* use new sig for sch.definitionRoutine()
* test fixup
* make the linter happy
* more minor linting cleanup
* Alerting: Send alerts from state tracker to notifier
* Add evaluation time and test
Add evaluation time and test
* Add cleanup routine and logging
* Pull in compact.go and reconcile differences
* Save alert transitions and save all state on shutdown
* pr feedback
* WIP
* WIP
* Persist alerts on evaluation and shutdown. Warm cache on startup
* Filter non-firing alerts before sending to notifier
Co-authored-by: Josue Abreu <josue@grafana.com>
* Add validation for grafana recipient
* Alertmanager API implementation (WIP)
* Fix encoding/decoding receiver settings from/to YAML
* Save templates together with the configuration
* update POST to apply latest config
* Alertmanager service enabled by the ngalert toggle
* Silence API integration with Alertmanager
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* Alerting: Introduce the silencing interface
The operations introduced are:
- Listing silences
- Retrieving an specific silence
- Deleting a silence
- Creating a silence
Signed-off-by: Josue Abreu <josue@grafana.com>
* Add a comment to listing silences
* Update to upstream alertmanager
* Remove copied code from the Alertmanager
* Initial commit for state tracking
* basic state transition logic and tests
* constructor. test and interface fixup
* use new sig for sch.definitionRoutine()
* test fixup
* make the linter happy
* more minor linting cleanup
* Alerting: Send alerts from state tracker to notifier
* Add evaluation time and test
Add evaluation time and test
* Add cleanup routine and logging
* Pull in compact.go and reconcile differences
* pr feedback
* pr feedback
Pull in compact.go and reconcile differences
Co-authored-by: Josue Abreu <josue@grafana.com>
* Render new email template and fix the title
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix nit
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Initial commit for state tracking
* basic state transition logic and tests
* constructor. test and interface fixup
* use new sig for sch.definitionRoutine()
* test fixup
* make the linter happy
* more minor linting cleanup
* Alerting: Fetch configuration from the database and run a notification
instance
Co-Authored-By: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
- Takes the conditions property from the settings column of an alert from alerts table and turns into an ng alerting condition with the queries and classic condition.
- Has temp API rest endpoint that will take the dashboard conditions json, translate it to SEE queries + classic condition, and execute it (only enabled in dev mode).
- Changes expressions to catch query responses with a non-nil error property
- Adds two new states for an NG instance result (NoData, Error) and updates evaluation to match those states
- Changes the AsDataFrame (for frontend) from Bool to string to represent additional states
- Fix bug in condition model to accept first Operator as empty string.
- In ngalert, adds GetQueryDataRequest, which was part of execute and is still called from there. But this allows me to get the Expression request from a condition to make the "pipeline" can be built.
- Update AsDataFrame for evalresult to be row based so it displays a little better for now
* Alerting: Add database table for persisting alerting configuration
* Fix the linter
* Address review comments
* Don't split templates and configuration
It is already bundled together as part of a of the API so might as well
marshall it directly.
* AlertingNG: base API implementation
* Pass the interface instead of the base impl
* Ruler mock draft (WIP)
* Update alerting-api dependency
* Improve mock implementation
* break out new and edit
* changed model to match new model in backend
* AlertingNG: API modifications (#30683)
* Fix API consistency
* Change eval alert definition to POST request
* Fix eval endpoint to accept custom now parameter
* Change JSON input property for create/update endpoints
* model adjustments
* set mixed datasource, fix put url
* update snapshots
* remove edit and add landing page
* remove snapshot tests ans snapshots
* wrap linkbutton in array
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
* AlertingNG: pause/unpause definitions via the API
* Apply suggestions from code review
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
* Enable pausing/unpausing multiple definitions
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
* Enforce unique alert definition title/uid per org
* Remove print statement from test
* Do not allow empty alert definition titles
* update error message on dup title
* also add title error to update
* CamelCase json properties
* Add test for title unique enforcement in updates
Co-authored-by: kyle <kyle@grafana.com>
* Pass skipCache from context
* Use macaron Params instead of ParamsEscape for UIDs
* Modify queries and transform to get datasource UIDs
* Update github.com/grafana/grafana-plugin-sdk-go to v0.83.0
* Chore: moves common and response into separate packages
* Chore: moves common and response into separate packages
* Update pkg/api/utils/common.go
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
* Chore: changes after PR comments
* Chore: move wrap to routing package
* Chore: move functions in common to response package
* Chore: move functions in common to response package
* Chore: formats imports
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
* Change API to expect UIDs instead of ID
* Remove unnecessary transactions
When only one query is executed
* Modify API responses
* Cleanup tests
* Use globally orgID and UID for identifying alert definitions
* Always use cache: stop passing skipCache among ngalert functions
* Add updated column
* Scheduler initial draft
* Add retry on failure
* Allow settting/updating alert definition interval
Set default interval if no interval is provided during alert definition creation.
Keep existing alert definition interval if no interval is provided during alert definition update.
* Parameterise alerting.Ticker to run on custom interval
* Allow updating alert definition interval without having to provide the queries and expressions
* Add schedule tests
* Use xorm tags for having initialisms with consistent case in Go
* Add ability to pause/unpause the scheduler
* Add alert definition versioning
* Optimise scheduler to fetch alert definition only when it's necessary
* Change MySQL data column to mediumtext
* Delete alert definition versions
* Increase default scheduler interval to 10 seconds
* Fix setting OrgID on updates
* Add validation for alert definition name length
* Recreate tables
* Expr: fix failure to execute due to OrgID
Get orgID from the plugin context, which makes more sense anyways.
makes expressions work again after https://github.com/grafana/grafana/pull/29449 changes.
* Do not save organisation on its alert query model
Co-authored-by: Sofia Papagiannaki <sofia@grafana.com>
* comes from grafana/gel-app
* remove transform plugin code
* move __expr__ and -100 constants to expr pkg
* set OrgID on request plugin context
* use gtime for resample duration
* in resample, rename "rule" to "window", use gtime for duration, parse duration before exec
* remove gel entry from plugins-bundled/external.json
which creates an empty array for plugins
* Alerting NG: prototype v2 (WIP)
* Separate eval package
* Modify eval alert definition endpoint
* Disable migration if ngalert is not enabled
* Remove premature test
* Fix lint issues
* Delete obsolete struct
* Apply suggestions from code review
* Update pkg/services/ngalert/ngalert.go
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* Add API endpoint for listing alert definitions
* Introduce index for alert_definition table
* make ds object for expression to avoid panic
* wrap error
* Update pkg/services/ngalert/eval/eval.go
* Swith to backend.DataQuery
* Export TransformWrapper callback
* Fix lint issues
* Update pkg/services/ngalert/ngalert.go
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* Validate alert definitions before storing them
* Introduce AlertQuery
* Add test
* Add QueryType in AlertQuery
* Accept only float64 (seconds) durations
* Apply suggestions from code review
* Get rid of bus
* Do not export symbols
* Fix failing test
* Fix failure due to service initialization order
Introduce MediumHigh service priority and assign it to backendplugin
service
* Fix test
* Apply suggestions from code review
* Fix renamed reference
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* New feature toggle for enabling alerting NG
* Initial commit
* Modify evaluate alert API request
* Check for unique labels in alert execution result dataframes
* Remove print statement
* Additional minor fixes/comments
* Fix lint issues
* Add API endpoint for evaluating panel queries
* Push missing renaming
* add refId for condition to API
* add refId for condition to API
* switch dashboard based eval to get method
* add from/to params to dashboard based eval
* add from/to params to eval endpoint
Co-authored-by: kyle <kyle@grafana.com>