* Alerting: Remove double quotes from matchers
With #38629, a new Alertmanager configuration object, `object_matchers`, was introduced. It was meant to work around the fact that Prometheus label names don't support a set of characters that Grafana needs to support for alerts, silences, matchers, etc. (a common example being Elasticsearch's `.`).
This new object does not include the level of sanitization or validation that its Prometheus equivalent, `matchers`, supports, so the two are not semantically equivalent.
This caused a problem: when the migration runs, we populate `matchers` in the configuration for routing policies, but when the UI does its first save, this object is transformed into `object_matchers`.
Matchers that were previously running just fine would immediately stop working as soon as the configuration is saved.
This problem surfaced with the introduction of #49952, where we stopped stripping double quotes from matchers (not just regex matchers but _all_ of them).
* Add comment explaining rationale and future removal
Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
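The quote stripping described for matchers above could be sketched roughly as follows. This is a minimal illustration, not the actual Grafana code: the `ObjectMatcher` type, `stripSurroundingQuotes`, and `normalizeMatchers` are hypothetical names, and it assumes that only one enclosing pair of double quotes is removed from each matcher value.

```go
package main

import (
	"fmt"
	"strings"
)

// ObjectMatcher is a hypothetical, simplified stand-in for one entry of
// `object_matchers`: a label name, an operator, and a value.
type ObjectMatcher struct {
	Name  string
	Op    string
	Value string
}

// stripSurroundingQuotes removes a single pair of enclosing double quotes,
// if present. Values without an enclosing pair are returned unchanged.
func stripSurroundingQuotes(s string) string {
	if len(s) >= 2 && strings.HasPrefix(s, `"`) && strings.HasSuffix(s, `"`) {
		return s[1 : len(s)-1]
	}
	return s
}

// normalizeMatchers returns a copy of ms with quotes stripped from every
// value (assumption: all operators are treated the same, not just regex).
func normalizeMatchers(ms []ObjectMatcher) []ObjectMatcher {
	out := make([]ObjectMatcher, 0, len(ms))
	for _, m := range ms {
		m.Value = stripSurroundingQuotes(m.Value)
		out = append(out, m)
	}
	return out
}

func main() {
	ms := normalizeMatchers([]ObjectMatcher{
		{Name: "severity", Op: "=", Value: `"critical"`},
		{Name: "es.index", Op: "=~", Value: `"logs-.*"`},
	})
	for _, m := range ms {
		fmt.Printf("%s%s%s\n", m.Name, m.Op, m.Value)
	}
}
```

Normalizing at the point where the configuration is read would keep a matcher written with quotes under `matchers` behaving the same after the UI rewrites it to `object_matchers`.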
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* add a new query that returns the alerts in a group, given the UID of an alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
* Add validator for mute timing and make it provisionable
* Add tests to ensure prometheus validators are running and errors are propagated
* Internal API for manipulating mute timings
* Define and generate API layer
* Wire up generated code
* Implement API handlers
* Tests for golang layer
* Fix reference bug
* Fix linter and auth tests
* Resolve semantic errors and regenerate
* Remove pointless comment
* Extract out provisioning path param keys, simplify
* Expected number of paths
* Support for documenting stable vs unstable alerting routes
* empty commit, restart drone
* Touch-up references in root makefile and drop trailing escape newline
* Rebase and regenerate
* Extend README with docs for this change
This change adds a field to state.State and models.AlertInstance
that indicate the "Reason" that an instance has its current state. This
helps us account for cases where the state is "Normal" but the
underlying evaluation returned "NoData" or "Error", for example.
Fixes #42606
Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
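A minimal sketch of the idea behind the "Reason" field, under stated assumptions: the `EvalState` constants, the `State` struct, and the `resolve` function below are hypothetical simplifications and do not reproduce `state.State` or `models.AlertInstance`. The sketch assumes a rule can be configured to map a "NoData" or "Error" evaluation onto another state.

```go
package main

import "fmt"

// EvalState is a hypothetical enumeration of evaluation outcomes.
type EvalState string

const (
	Normal   EvalState = "Normal"
	Alerting EvalState = "Alerting"
	NoData   EvalState = "NoData"
	Error    EvalState = "Error"
)

// State is a simplified stand-in for an alert instance's state.
// Reason records why the instance is in State when that differs from
// what the underlying evaluation returned.
type State struct {
	State  EvalState
	Reason string
}

// resolve maps an evaluation result to an instance state. noDataAs and
// errorAs are the (assumed) configured states for NoData and Error results.
func resolve(result, noDataAs, errorAs EvalState) State {
	switch result {
	case NoData:
		if noDataAs != NoData {
			return State{State: noDataAs, Reason: string(NoData)}
		}
	case Error:
		if errorAs != Error {
			return State{State: errorAs, Reason: string(Error)}
		}
	}
	// The state matches the evaluation result, so no reason is needed.
	return State{State: result}
}

func main() {
	s := resolve(NoData, Normal, Error)
	fmt.Printf("state=%s reason=%q\n", s.State, s.Reason)
}
```

With this shape, an instance that is "Normal" only because a "NoData" result was mapped to it can be told apart from one that evaluated to "Normal" directly.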
This change extracts screenshot data from alert messages via a private annotation `__alertScreenshotToken__` and attaches a URL to a Slack message or uploads the data to an image upload endpoint if needed.
This change also implements a few foundational functions for use in other notifiers.
* introduce a fallback handler that checks that role is Viewer.
* update UI nav links to allow alerting tabs for anonymous user
* update rule api to check for Viewer role instead of SignedIn when RBAC is disabled
* create AlertGroupKey structure
* update PrometheusSrv.
- extract creation of RuleGroup to a separate method. Use group key for grouping
* update RuleSrv
- update calculateChanges to use groupKey
- authorize to use groupkey
* Generate API for writing templates
* Persist templates app logic layer
* Validate templates
* Extract logic, make set and delete methods
* Drop post route for templates
* Fix response details, wire up remainder of API
* Authorize routes
* Mirror some existing tests on new APIs
* Generate mock for prov store
* Wire up prov store mock, add tests using it
* Cover cases for both storage paths
* Add happy path tests and fix bugs if file contains no template section
* Normalize template content with define statement
* Tests for deletion
* Fix linter error
* Move provenance field to DTO
* empty commit
* ID to name
* Fix in auth too
* Template service
* Add GET routes and implement them
* Generate mock for persist layer
* Unit tests for reading templates
* Set up composition root and get integration tests working
* Fix prealloc issue
* Extract setup boilerplate
* Update AuthorizationTest
* Rebase and resolve
* Fix linter error
Invalid PostableSilences could be passed to the Alerting API - if they
are passed all the way down into the alertmanager data layer, they can
cause a panic. This change adds validation to avoid a panic in the
alertmanager.
* wip: Implement kvstore for secrets
* wip: Refactor kvstore for secrets
* wip: Add format key function to secrets kvstore sql
* wip: Add migration for secrets kvstore
* Remove unused Key field from secrets kvstore
* Remove secret values from debug logs
* Integrate unified secrets with datasources
* Fix minor issues and tests for kvstore
* Create test service helper for secret store
* Remove encryption tests from datasources
* Move secret operations after datasources
* Fix datasource proxy tests
* Fix legacy data tests
* Add Name to all delete data source commands
* Implement decryption cache on sql secret store
* Fix minor issue with cache and tests
* Use secret type on secret store datasource operations
* Add comments to make create and update clear
* Rename itemFound variable to isFound
* Improve secret deletion and cache management
* Add base64 encoding to sql secret store
* Move secret retrieval to decrypted values function
* Refactor decrypt secure json data functions
* Fix expr tests
* Fix datasource tests
* Fix plugin proxy tests
* Fix query tests
* Fix metrics api tests
* Remove unused fake secrets service from query tests
* Add rename function to secret store
* Add check for error renaming secret
* Remove bus from tests to fix merge conflicts
* Add background secrets migration to datasources
* Get datasource secure json fields from secrets
* Move migration to secret store
* Revert "Move migration to secret store"
This reverts commit 7c3f872072.
* Add secret service to datasource service on tests
* Fix datasource tests
* Remove merge conflict on wire
* Add ctx to data source http transport on prometheus stats collector
* Add ctx to data source http transport on stats collector test
* Test composition simplification from last PR
* Policies use proper API model everywhere
* Expose policy provenance in API, miss some dep injection
* Complete injection
* fix args
* Tests for provenance value
* Extract test helpers so tests are very readable
* Single source adapter struct that was copied in 3 places
* Drop redundant test
* Resolve merge conflicts on changelog
* Refactor GET am config to be extensible
* Extract post config route
* Fix tests
* Remove temporary duplication
* Fix broken test due to layer shift
* Fix duplicated error message
* Properly return 400 on config rejection
* Revert weird half method extraction
* Move things to notifier package and avoid redundant interface
* Simplify documentation
* Split encryption service and depend on minimal abstractions
* Properly initialize things all the way up to the composition root
* Encryption -> Crypto
* Address misc feedback
* Missing docstring
* Few more simple polish improvements
* Unify on MultiOrgAlertmanager. Discover bug in existing test
* Fix rebase conflicts
* Misc feedback, renames, docs
* Access crypto hanging off MultiOrgAlertmanager rather than having a separate API to initialize
* Alerting: unwrap upsert into insert and update function
* add changelog entry
* remove changelog entry
* rename upsertrule to updaterule
* use directly alertrule model for inserts
* add test for updating a rule with a conflicting name
* add check for access to rule's data source in GET APIs
* use more general method GetAlertRules instead of GetNamespaceAlertRules.
* remove unused GetNamespaceAlertRules.
Tests:
* create a method to generate permissions for rules
* extract method to create RuleSrv
* add tests for RouteGetNamespaceRulesConfig
* move validation at the beginning of method
* remove usage of GetOrgRuleGroups because it is not necessary. All information is already available in memory.
* remove unused method
* Base-line API for provisioning notification policies
* Wire API up, some simple tests
* Return provenance status through API
* Fix missing call
* Transactions
* Clarity in package dependencies
* Unify receivers in definitions
* Fix issue introduced by receiver change
* Drop unused internal test implementation
* FGAC hooks for provisioning routes
* Polish, swap names
* Asserting on number of exposed routes
* Don't bubble up updated object
* Integrate with new concurrency token feature in store
* Back out duplicated changes
* Remove redundant tests
* Regenerate and create unit tests for API layer
* Integration tests for auth
* Address linter errors
* Put route behind toggle
* Use alternative store API and fix feature toggle in tests
* Fixes, polish
* Fix whitespace
* Re-kick drone
* Rename services to provisioning
* Alerting: Accurately set value for prom-compatible APIs
Sets the value fields for the prometheus compatible API based on a combination of condition `refID` and the values extracted from the different frames.
* Fix an extra test
* Ensure a consistent ordering
* Address review comments
* address review comments
* Add basic UI for custom ruler URL
* Add build info fetching for alerting data sources
* Add keeping data sources build info in the store
* Use data source build info to construct data source urls
* Remove unused code
* Add custom ruler support in prometheus api calls
* Migrate actions
* Use thunk condition to prevent multiple data source buildinfo fetches
* Unify prom and ruler rules loading
* Upgrade RuleEditor tests
* Upgrade RuleList tests
* Upgrade PanelAlertTab tests
* Upgrade actions tests
* Build info refactoring
* Get rid of lotex ruler support action
* Add prom ruler availability checking when the buildinfo is not available
* Add rulerUrlBuilder tests
* Improve prometheus data source validation, small build info refactoring
* Change prefix based on Prometheus subtype
* Use the correct path
* Revert config routing
* Add deprecation notice for /api/prom prefix
* Add tests to the datasource subtype
* Remove custom ruler support
* Remove deprecation notice
* Prevent fetching ruler rules when ruler api is not available
* Add build info tests
* Unify naming of ruler methods
* Fix test
* Change buildinfo data source validation
* Use strings for subtype params and unveil mimir
* organise imports
* frontend changes and wordsmithing
* fix test suite
* add a nicer verbose message for prometheus datasources
* detect Mimir datasource
* fix test
* fix buildinfo test for Mimir
* shrink vectors
* add some code documentation
* DRY prepareRulesFilterQueryParams
* clarify that Prometheus does not support managing rules
* Improve buildinfo error handling
Co-authored-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>