Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration
API:
* set group index on update. Use the natural order of items in the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.
UI:
* update UI to keep the order of alerts in a group
* Alerting: decapitalize log lines and use "err" as the key for errors
Found using (logger|log).(Warn|Debug|Info|Error)\([A-Z] and (logger|log).(Warn|Debug|Info|Error)\(.+"error"
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* a new query that returns alerts in group by UID of alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.
Fixes#42606
Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
* introduce a fallback handler that checks that role is Viewer.
* update UI nav links to allow alerting tabs for anonymous user
* update rule api to check for Viewer role instead of SignedIn when RBAC is disabled
* create AlertGroupKey structure
* update PrometheusSrv.
- extract creation of RuleGroup to a separate method. Use group key for grouping
* update RuleSrv
- update calculateChanges to use groupKey
- authorize to use groupkey
* add check for access to rule's data source in GET APIs
* use more general method GetAlertRules instead of GetNamespaceAlertRules.
* remove unused GetNamespaceAlertRules.
Tests:
* create a method to generate permissions for rules
* extract method to create RuleSrv
* add tests for RouteGetNamespaceRulesConfig
* move validation at the beginning of method
* remove usage of GetOrgRuleGroups because it is not necessary. All information is already available in memory.
* remove unused method
* Alerting: Accurately set value for prom-compatible APIs
Sets the value fields for the prometheus compatible API based on a combination of condition `refID` and the values extracted from the different frames.
* Fix an extra test
* Ensure a consitent ordering
* Address review comments
* address review comments
* Use alert:create action for folder search with edit permissions. This matches the action that is used to query dashboards (the update will be addressed later)
* Update rule store to use FindDashboards instead of folder service to list folders the user has access to view alerts. Folder service does not support query type and additional filters.
* Do not check whether the user can save to folder if FGAC is enabled because it is checked on API level.
* Alerting: Remove internal labels from prometheus compatible API responses
* Appease the linter
* Fix integration tests
* Fix API documentation & linter
* move removal of internal labels to the models
* Chore: GetDashboardQuery should be dispatched using DispatchCtx
* Fix after merge
* Changes after review
* Various fixes
* Use GetDashboardCtx function instead of GetDashboard
When, and currently only when using a classic condition, evaluation information is added (which is like the EvalMatches from dashboard alerting).
This is returned via the API and can be included in notifications by reading the `__value__` label attached `.Alerts` in the template. It is a string.
* nest cache by orgID, ruleUID, stateID
* update accessors to use new cache structure
* test and linter fixup
* fix panic
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* add comment to identify what's going on with nested maps in cache
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* add support for NoData and Error results
* rename test
* bring in changes from other PRs tha have been merged
* pr feedback
* add integration test
* close state tracker cleanup on context.Done
* fixup test
* rename state tracker
* set EvaluationDuration on Result
* default labels set as constants
* separate cache and state from manager
* use RWMutex in cache
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* add support for NoData and Error results
* rename test
* bring in changes from other PRs tha have been merged
* pr feedback
* add integration test
* close state tracker cleanup on context.Done
* fixup test
* not those annotations
* set processing time
* merge labels and set on response
* use state cache for adding alerts to rules
* minor cleanup
* pr feedback
* Do not initialize mutex unnecessarily
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* linter
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
* init
* autogens AM route
* POST dashboards/db spec
* POST alert-notifications spec
* fix description
* re inits vendor, updates grafana to master
* go mod updates
* alerting routes
* renames to receivers
* prometheus endpoints
* align config endpoint with cortex, include templates
* Change grafana receiver type
* Update receivers.go
* rename struct to stop swagger thrashing
* add rules API
* index html
* standalone swagger ui html page
* Update README.md
* Expose GrafanaManagedAlert properties
* Some fixes
- /api/v1/rules/{Namespace} should return a map
- update ExtendedUpsertAlertDefinitionCommand properties
* am alerts routes
* rename prom swagger section for clarity, remove example endpoints
* Add missing json and yaml tags
* folder perms
* make folders POST again
* fix grafana receiver type
* rename fodler->namespace for perms
* make ruler json again
* PR fixes
* silences
* fix Ok -> Ack
* Add id to POST /api/v1/silences (#9)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Add POST /api/v1/alerts (#10)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* fix silences
* Add testing endpoints
* removes grpc replace directives
* [wip] starts validation
* pkg cleanup
* go mod tidy
* ignores vendor dir
* Change response type for Cortex/Loki alerts
* receiver unmarshaling tests
* ability to split routes between AM & Grafana
* api marshaling & validation
* begins work on routing lib
* [hack] ignores embedded field in generation
* path specific datasource for alerting
* align endpoint names with cloud
* single route per Alerting config
* removes unused routing pkg
* regens spec
* adds datasource param to ruler/prom route paths
* Modifications for supporting migration
* Apply suggestions from code review
* hack for cleaning circular refs in swagger definition
* generates files
* minor fixes for prom endpoints
* decorate prom apis with required: true where applicable
* Revert "generates files"
This reverts commit ef7e975584.
* removes server autogen
* Update imported structs from ngalert
* Fix listing rules response
* Update github.com/prometheus/common dependency
* Update get silence response
* Update get silences response
* adds ruler validation & backend switching
* Fix GET /alertmanager/{DatasourceId}/config/api/v1/alerts response
* Distinct gettable and postable grafana receivers
* Remove permissions routes
* Latest JSON specs
* Fix testing routes
* inline yaml annotation on apirulenode
* yaml test & yamlv3 + comments
* Fix yaml annotations for embedded type
* Rename DatasourceId path parameter
* Implement Backend.String()
* backend zero value is a real backend
* exports DiscoveryBase
* Fix GO initialisms
* Silences: Use PostableSilence as the base struct for creating silences
* Use type alias instead of struct embedding
* More fixes to alertmanager silencing routes
* post and spec JSONs
* Split rule config to postable/gettable
* Fix empty POST /silences payload
Recreating the generated JSON specs fixes the issue
without further modifications
* better yaml unmarshaling for nested yaml docs in cortex-am configs
* regens spec
* re-adds config.receivers
* omitempty to align with prometheus API behavior
* Prefix routes with /api
* Update Alertmanager models
* Make adjustments to follow the Alertmanager API
* ruler: add for and annotations to grafana alert (#45)
* Modify testing API routes
* Fix grafana rule for field type
* Move PostableUserConfig validation to this library
* Fix PostableUserConfig YAML encoding/decoding
* Use common fields for grafana and lotex rules
* Add namespace id in GettableGrafanaRule
* Apply suggestions from code review
* fixup
* more changes
* Apply suggestions from code review
* aligns structure pre merge
* fix new imports & tests
* updates tooling readme
* goimports
* lint
* more linting!!
* revive lint
Co-authored-by: Sofia Papagiannaki <papagian@gmail.com>
Co-authored-by: Domas <domasx2@gmail.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: David Parrott <stomp.box.yo@gmail.com>
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* set query in rules response
* Theme: tweaking dark theme colors (#33007)
* Library Panels: Add library panel tab to share modal (#32953)
* Explore: Scroll split panes in Explore independently (#32978)
* Change default prometheus to latest and prometheus v1 to prometheus1
* Update README
* Remove prometheus1 block as not used
* Explore: Separatae scrolling in split view
* Update snapshot
* Allow skip migrations in tests via environment variable (#32958)
* Dashboard: Fix issue where Slack notifications won't link to users (#32861)
* DashboardPage: refactored styles from sass to emotion (#32955)
* DashboardPage: refactored styles from sass to emotion
* refactored dashboardPage component to be alot easier to read and understand
* more refactoring...
* more cleaning...
* fixes frontend test
* fixes frontend test- I hope
* fixes frontend test- I hope
* moves dashboard scss styles back to it's standalone file
* GraphNG: use theme font family and size for axis labels (#33009)
* GraphNG: use theme font family and size for axis labels
* fix test
* AlertingNG: Slack notification channel (#32675)
* AlertingNG: Slack notification channel
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Add tests
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix review comments and small refactoring
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* GraphNG: stacking (#30749)
* First iteration
* Dev dash
* Re-use StackingMode type
* Fix ts and api issues
* Stacking work resurected
* Fix overrides
* Correct values in tooltip and updated test dashboard
* Update dev dashboard
* Apply correct bands for stacking
* Merge fix
* Update snapshot
* Revert go.sum
* Handle null values correctyl and make filleBelowTo and stacking mutual exclusive
* Snapshots update
* Graph->Time series stacking migration
* Review comments
* Indicate overrides in StandardEditorContext
* Change stacking UI editor, migrate stacking to object option
* Small refactor, fix for hiding series and dev dashboard
* VizLegend: sets a min and max value of the seriesCount control in Storybook (#33022)
* Alerting: Filter rules list (#32818)
* Chore: Reduces strict errors (#33012)
* Chore: reduces strict error in OptionPicker tests
* Chore: reduces strict errors in FormDropdownCtrl
* Chore: reduces has no initializer and is not definitely assigned in the constructor errors
* Chore: reduces has no initializer and is not definitely assigned in the constructor errors
* Chore: lowers strict count limit
* Tests: updates snapshots
* Tests: updates snapshots
* Chore: updates after PR comments
* Refactor: removes throw and changes signature for DashboardSrv.getCurrent
* [Alerting]: Several modifications in alert rules (#32983)
* [Alerting]: Use common properties for all rules
* Add Labels in rules
* Fix update ruleGroup API
Return 400 Bad Request response
when the request contains a UID that does not exist
* Check permissions and return namespace id
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* WIP (#33025)
* Chore: Bump strict error count limit (#33035)
* set query in rules response
Co-authored-by: Torkel Ödegaard <torkel@grafana.org>
Co-authored-by: kay delaney <45561153+kaydelaney@users.noreply.github.com>
Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>
Co-authored-by: Dafydd <72009875+dafydd-t@users.noreply.github.com>
Co-authored-by: n-wbrown <n-wbrown@users.noreply.github.com>
Co-authored-by: Uchechukwu Obasi <obasiuche62@gmail.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Dominik Prokop <dominik.prokop@grafana.com>
Co-authored-by: Nathan Rodman <nathanrodman@gmail.com>
Co-authored-by: Hugo Häggmark <hugo.haggmark@grafana.com>
Co-authored-by: Sofia Papagiannaki <papagian@users.noreply.github.com>
Co-authored-by: gotjosh <josue@grafana.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
* [Alerting]: Use common properties for all rules
* Add Labels in rules
* Fix update ruleGroup API
Return 400 Bad Request response
when the request contains a UID that does not exist
* Check permissions and return namespace id
* Apply suggestions from code review
Co-authored-by: gotjosh <josue@grafana.com>
* [Alerting]: Fix empty rules evaluation statuses
`GetRuleGroupAlertRules()` requires an non empty namespaceUID
* Include the namespace into the response
* Return cached alerts for prometheus/api/v1/alerts
* Return not implemented for /prometheus/grafana/api/v1/rules
* Set StartsAt for already alerting states
* Fix tests