Commit Graph

1612 Commits

Author SHA1 Message Date
Brandon
fbad76007d
Alerting: Limit and clean up old alert rules versions (#89754) 2024-10-05 00:31:21 +03:00
Matthew Jacobson
099055e8a5
Alerting: Verify receiver permission read on rule create/update (#94286)
* Alerting: Verify receiver permission read on rule create/update
2024-10-04 23:52:38 +03:00
Yuri Tseretyan
27c7e33217
Alerting: Update permissions to reciever and template test API (#94282)
* add action "alert.notifications.receivers:test" to receiver creator

* update API permissions to accept new granular actions
2024-10-04 15:52:44 -04:00
Alexander Zobnin
5d724c2482
Zanzana: Initial dashboard search (#93093)
* Zanzana: Search in a background and compare results

* refactor

* Search with check

* instrument zanzana client

* add single_read option

* refactor

* refactor move check into separate function

* Fix tests

* refactor

* refactor getFindDashboardsFn

* add resource type to span attributes

* run ListObjects concurrently

* Use list and search in less cases

* adjust metrics buckets

* refactor: move Check and ListObjects to AccessControl implementation

* Revert "Fix tests"

This reverts commit b0c2f072a2.

* refactor: use own types for Check and ListObjects inside accesscontrol package

* Fix search scenario with low limit and empty query string

* more accurate search with checks

* revert

* fix linter

* Revert "revert"

This reverts commit ee5f14eea8.

* add search errors metric

* fix query performance under some conditions

* simplify check strategy

* fix pagination

* refactor findDashboardsZanzanaList

* Iterate over multiple pages while making check request

* refactor listUserResources

* avoid unnecessary db call

* remove unused zclient

* Add notes for SkipAccessControlFilter

* use more accurate check loop

* always use check for search with provided UIDs

* rename single_read to zanzana_only_evaluation

* refactor

* update go workspace

* fix linter

* don't use deprecated fields

* refactor

* fail if no org specified

* refactor

* initial integration tests

* Fix tests

* fix linter errors

* fix linter

* Fix tests

* review suggestions

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>

* fix limit

* refactor

* refactor tests

* fix db config in tests

* fix migrator (postgres)

---------

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2024-10-04 12:27:10 +02:00
Yuri Tseretyan
78290301f4
Alerting: Update GettableRuleGroupConfig and PostableRuleGroupConfig with missing fields supported by Prometheus (#94030) 2024-10-01 14:17:57 -04:00
Arati R.
e399fe6d09
Folders: Set folder creation permission as part of legacy create (#94040)
* Add folder store to dashboard permissions
* Include folder store in annotation scope resolver
* Add folder store when initialising library elements
* Include folder store in search v2 service initialisation
* Include folder store in GetInheritedScopes
* Add folder store to folder permissions provider
* Include cfg, folder permissions in folder service
* Move setting of folder permissions for folder service create method
2024-10-01 14:03:02 +02:00
Yuri Tseretyan
0c1aafd643
Alerting: skip flaky test TestBroadcastAndHandleMessages (#94039) 2024-09-30 18:50:55 -04:00
Alexander Weaver
393faa8732
Alerting: Move rule evaluation status logic out of prometheus API and into scheduler (#89141)
* Add health fields to rules and an aggregator method to the scheduler

* Move health, last error, and last eval time in together to minimize state processing

* Wire up a readonly scheduler to prom api

* Extract to exported function

* Use health in api_prometheus and fix up tests

* Rename health struct to status

* Fix tests one more time

* Several new tests

* Handle inactive rules

* Push state mapping into state manager

* rename to StatusReader

* Rectify cyclo complexity rebase

* Convert existing package local status implementation to models one

* fix tests

* undo RuleDefs rename
2024-09-30 16:52:49 -05:00
Santiago
aa77023008
Alerting: Fix panics when attempting to create an Alertmanager after failing (#94023) 2024-09-30 13:50:35 -03:00
Santiago
80611b381c
Alerting: Decrypt secure settings when testing receivers in the remote Alertmanager (#93864)
* Alerting: Decrypt secure settings when testing receivers in the remote Alertmanager

* go work sync

* make update-workspace

* point to latest main in grafana/alerting

* unit test

* import definitions only once
2024-09-30 13:28:30 -03:00
Arati R.
ed75aea21d
Folders: Export folder store implementation (#93897)
* Export folder store implementation

* Rename folder store

* Add folder store as a parameter to folder service

* Add folder store to dash service implementation

* Fix folder store comments
2024-09-30 10:28:47 +02:00
Yuri Tseretyan
84c079d93f
Alerting: Add time intervals fixed roles (#93942)
add time intervals role
2024-09-27 16:12:25 -04:00
Alexander Weaver
c2799b4901
Alerting: Fix incorrect permission on POST external rule groups endpoint [CVE-2024-8118] (#93940)
Fix endpoint permission on rule write endpoint
2024-09-27 14:23:21 -05:00
Tom Ratcliffe
fc51ec70ba
Alerting: Add manage permissions UI logic for Contact Points (#92885)
* Add showPolicies prop

* Add manage permissions component for easier reuse within alerting

* Add method for checking whether to show access control within alerting

* Remove accidental console.log from main

* Tweak styling for contact point width and add manage permissions drawer

* Improve typing for access control type response

* Add basic test for manage permissions on contact points list

* Only show manage permissions if grafana AM and alertingApiServer is enabled

* Update i18n

* Add test utils for turning features on and back off

* Add access control handlers

* Update tests with new util

* Pass AM in and add tests

* Receiver OSS resource permissions

There is a complication that is not fully addressed: Viewer defaults to read:*
and Editor defaults to read+write+delete:*

This is different to other resource permissions where non-admin are not granted
any global permissions and instead access is handled solely by resource-specific
permissions that are populated on create and removed on delete.

This allows them to easily remove permission to view or edit a single resource
from basic roles.

The reason this is tricky here is that we have multiple APIs that can
create/delete receivers: config api, provisioning api, and k8s receivers api.
Config api in particular is not well-equipped to determine when creates/deletes
are happening and thus ensuring that the proper resource-specific permissions
are created/deleted is finicky.

We would also have to create a migration to populate resource-specific
permissions for all current receivers. This migration would need to be reset so
it can run again if the flag is disabled.

* Add access control permissions

* Pass in contact point ID to receivers form

* Temporarily remove access control check for contact points

* Include access control metadata in k8s receiver List & Get

GET: Always included.
LIST: Included by adding a label selector with value `grafana.com/accessControl`

* Include new permissions for contact points navbar

* Fix receiver creator fixed role to not give global read

* Include in-use metadata in k8s receiver List & Get

GET: Always included.
LIST: Included by adding a label selector with value `grafana.com/inUse`

* Add receiver creator permission to receiver writer

* Add receiver creator permission to navbar

* Always allow listing receivers, don't return 403

* Remove receiver read precondition from receiver create

Otherwise, Creator role will not be able to create their first receiver

* Update routes permissions

* Add further support for RBAC in contact points

* Update routes permissions

* Update contact points header logic

* Back out test feature toggle refactor

Not working atm, not sure why

* Tidy up imports

* Update mock permissions

* Revert more test changes

* Update i18n

* Sync inuse metadata pr

* Add back canAdmin permissions after main merge

* Split out check for policies navtree item

* Tidy up utils and imports and fix rules in use

* Fix contact point tests and act warnings

* Add missing ReceiverPermissionAdmin after merge conflict

* Move contact points permissions

* Only show contact points filter when permissions are correct

* Move to constants

* Fallback to empty array and remove labelSelectors (not needed)

* Allow `toAbility` to take multiple actions

* Show builtin alertmanager if contact points permission

* Add empty state and hide templates if missing permissions

* Translations

* Tidy up mock data

* Fix tests and templates permission

* Update message for unused contact points

* Don't return 403 when user lists receivers and has access to none

* Fix receiver create not adding empty uid permissions

* Move SetDefaultPermissions to ReceiverPermissionService

* Have SetDefaultPermissions use uid from string

Fixes circular dependency

* Add FakeReceiverPermissionsService and fix test wiring

* Implement resource permission handling in provisioning API and renames

Create: Sets to default permissions
Delete: Removes permissions
Update: If receiver name is modified and the new name doesn't exist, it copies
the permissions from the old receiver to the newly created one. If old receiver
is now empty, it removes the old permissions as well.

* Split contact point permissions checks for read/modify

* Generalise getting annotation values from k8s entities

* Proxy RouteDeleteAlertingConfig through MultiOrgAlertmanager

* Cleanup permissions on config api reset and restore

* Cleanup permissions on config api POST

note this is still not available with feature flag enabled

* Gate the permission manager behind FF until initial migration is added

* Sync changes from config api PR

* Switch to named export

* Revert unnecessary changes

* Revert Filter auth change and implement in k8s api only

* Don't allow new scoped permissions to give access without FF

Prevents complications around mixed support for the scoped permissions causing
oddities in the UI.

* Fix integration tests to account for list permission change

* Move to `permissions` file

* Add additional tests for contact points

* Fix redirect for viewer on edit page

* Combine alerting test utils and move to new file location

* Allow new permissions to access provisioning export paths with FF

* Always allow exporting if its grafana flavoured

* Fix logic for showing auto generated policies

* Fix delete logic for contact point only referenced by a rule

* Suppress warning message when renaming a contact point

* Clear team and role perm cache on receiver rename

Prevents temporarily broken UI permissions after rename when a user's source of
elevated permissions comes from a cached team or basic role permission.

* Debug log failed cache clear on CopyPermissions

---------

Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>
2024-09-27 19:56:32 +01:00
Yuri Tseretyan
86faeae6d2
Alerting: Update GetTemplates to return sorted list of templates (#93933) 2024-09-27 18:49:37 +01:00
Alexander Weaver
378d92130d
Alerting: Don't suppress translation errors in PointsFromFrames (#93747)
* don't suppress error

* reorder

* re-add nilcheck
2024-09-26 16:30:50 -05:00
Steve Simpson
acb051b314
Alerting: Fix logging for failed annotations writing. (#93856) 2024-09-26 23:27:40 +02:00
Jeff Levin
a21a232a8e
Revert read replica POC (#93551)
* Revert "chore: add replDB to team service (#91799)"

This reverts commit c6ae2d7999.

* Revert "experiment: use read replica for Get and Find Dashboards (#91706)"

This reverts commit 54177ca619.

* Revert "QuotaService: refactor to use ReplDB for Get queries (#91333)"

This reverts commit 299c142f6a.

* Revert "refactor replCfg to look more like plugins/plugin config (#91142)"

This reverts commit ac0b4bb34d.

* Revert "chore (replstore): fix registration with multiple sql drivers, again (#90990)"

This reverts commit daedb358dd.

* Revert "Chore (sqlstore): add validation and testing for repl config (#90683)"

This reverts commit af19f039b6.

* Revert "ReplStore: Add support for round robin load balancing between multiple read replicas (#90530)"

This reverts commit 27b52b1507.

* Revert "DashboardStore: Use ReplDB and get dashboard quotas from the ReadReplica (#90235)"

This reverts commit 8a6107cd35.

* Revert "accesscontrol service read replica (#89963)"

This reverts commit 77a4869fca.

* Revert "Fix: add mapping for the new mysqlRepl driver (#89551)"

This reverts commit ab5a079bcc.

* Revert "fix: sql instrumentation dual registration error (#89508)"

This reverts commit d988f5c3b0.

* Revert "Experimental Feature Toggle: databaseReadReplica (#89232)"

This reverts commit 50244ed4a1.
2024-09-25 15:21:39 -08:00
Alexander Akhmetov
b9964865cb
Alerting: Copy alert rule metadata when the rule is updated via provisioning API (#93723)
Alerting: Copy alert rule metadata when the rule is updated
2024-09-25 22:31:02 +02:00
Matthew Jacobson
e86929eb0a
Alerting: Managed receiver resource permission in config api (#93632)
* Alerting: Managed receiver resource permission in config api
2024-09-25 09:39:36 -04:00
Yuri Tseretyan
10582e48f7
Alerting: Notifications Templates API (#91349) 2024-09-25 09:31:57 -04:00
Jean-Philippe Quéméner
bfc6c032c4
refactor(alerting): remove transformation that is now done by the querier (#93660) 2024-09-24 14:46:03 +03:00
Matthew Jacobson
e699348d39
Alerting: Managed receiver resource permission in provisioning (#93631)
* Alerting: Managed receiver resource permission in provisioning
2024-09-23 17:52:14 -04:00
Matthew Jacobson
6652233493
Alerting: Managed receiver resource permission in receiver_svc (#93556)
* Alerting: Managed receiver resource permission in receiver_svc
2024-09-23 21:12:25 +03:00
Karl Persson
2e38329026
RBAC: Add required component to perform access control checks for user api when running single tenant (#93104)
* Unexport store and create new constructor function

* Add ResourceAuthorizer and LegacyAccessClient

* Configure checks for user store

* List with checks if AccessClient is configured

* Allow system user service account to read all users

---------

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2024-09-23 11:26:44 +02:00
Matthew Jacobson
1ede1e32b8
Alerting: Receiver resource permissions service (#93552) 2024-09-20 18:31:42 -04:00
Matthew Jacobson
7398fe3fcb
Alerting: Proxy RouteDeleteAlertingConfig through MultiOrgAlertmanager (#93549)
Proxy RouteDeleteAlertingConfig through MultiOrgAlertmanager
2024-09-20 15:25:14 -04:00
Alexander Akhmetov
0ed70d0b2f
Alerting: Add a metric to track the number of rules with simplified editor settings (#93511)
* Alerting: Add a metric to track the number of rules with simplified editor settings
2024-09-20 17:56:40 +02:00
William Wernert
f1ba7deff5
Alerting: Also clear fields in model/store validation for recording rules (#93506)
* Fix model validation

* Remove validation from provisioning service
2024-09-20 00:27:37 +03:00
Alexander Weaver
534bfba7e3
Alerting: Use typed errors in prometheus remote writer (#93500)
Strongly typed writer errors
2024-09-19 13:34:35 -05:00
Alexander Akhmetov
9f5b05f936
Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
Alexander Akhmetov
e59ea00518
Alerting: Add TLS, QoS and retain options to the MQTT receiver (#92331) 2024-09-17 21:11:16 +02:00
Yuri Tseretyan
0f788d8d83
Alerting: Support for renaming receivers (#93349)
* update RenameReceiverInNotificationSettings in DbStore to check for provisioning

* implement renaming in receiver service and provisioning

* do not patch route when stitching

* fix bug in stitching because it returned new name but the old one was expected

* update receiver service to always return result converted from storage model this makes sure that UID and version are consistent with GET\LIST operations

* use provided metadata.name for UID of domain model because rename changes UID and request fails

* remove rename guard

* update UI to not disable receiver name when k8s api enabled

* create should calculate uid from name because new receiver does not have UID yet.
2024-09-17 19:07:31 +03:00
Matthew Jacobson
1ea873950b
Alerting: Reject receiver update in config API when FlagAlertingApiServer enabled (#93300)
* Reject receiver update in config API when FlagAlertingApiServer enabled
2024-09-17 16:49:17 +03:00
Jean-Philippe Quéméner
10314585ec
fix(alerting): extend instant vector check for non-nullable types (#93323) 2024-09-17 13:20:40 +02:00
Matthew Jacobson
3bf77d2e05
Alerting: Include in-use metadata in k8s receiver LIST & GET (#93016)
* Include in-use metadata in k8s receiver List & Get
2024-09-13 20:20:09 +03:00
Matthew Jacobson
bd9fc8127b
Alerting: Fix config api POST provenance guard (#93244)
* Add failing tests

* Fix bug in provenance guard on renaming receivers or moving integrations

* Linting
2024-09-13 12:42:33 -04:00
Matthew Jacobson
ff6a20f54a
Alerting: Include access control metadata in k8s receiver LIST & GET (#93013)
* Include access control metadata in k8s receiver List & Get

* Add tests for receiver access

* Simplify receiver access provisioning extension

- prevents edge case infinite recursion
- removes read requirement from create
2024-09-12 20:57:53 +03:00
Yuri Tseretyan
f8fa5286a1
Alerting: Introduce alert rule models in storage (#93187)
* introduce storage model for alert rule tables
* remove AlertRuleVersion from models because it's not used anywhere other than in storage
* update historian xorm store to use alerting store to fetch rules

* fix folder tests

---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2024-09-12 13:20:33 -04:00
William Wernert
efe62086f9
Alerting: Add type label rule_group_rules metric (#91425)
* Add group and type labels to rule_group_rules metric

* Don't include group to avoid high cardinality

* Add comments

* Reset rule_group_rules before recording new values

* Edit description for rule_group_rules

* Include ruleGroup combo key in labels

* Fix lint
2024-09-12 17:27:09 +03:00
Jean-Philippe Quéméner
eabf3b9f73
feat(alerting): add support for query service instant vectors (#92091) 2024-09-12 15:33:00 +02:00
Tito Lins
a910188675
Replace prom MustRegister with Register (#92725) 2024-09-12 10:24:12 +02:00
Yuri Tseretyan
cb372d3fa8
Alerting: Support secrets in contact points nested fields (#92035)
Back-end:
* update alerting module
* update GetSecretKeysForContactPointType to extract secret fields from nested options
* Update RemoveSecretsForContactPoint to support complex settings
* update PostableGrafanaReceiverToEmbeddedContactPoint to support nested secrets
* update Integration to support nested settings in models.Integration
* make sigv4 fields optional

Front-end:
* add UI support for encrypted subform fields
* allow emptying nested secure fields
* Omit non touched secure fields in POST payload when saving a contact point
* Use SecretInput from grafana-ui instead of the new EncryptedInput
* use produce from immer
* rename mapClone
* rename sliceClone
* Don't use produce from immer as we need to delete the fileds afterwards

---------

Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>
2024-09-10 22:26:23 -04:00
Tito Lins
539d363d2c
Create histogram and observe grafana config size (#93028) 2024-09-06 18:25:16 +02:00
Alexander Akhmetov
152d3540db
Alerting: Log number of dimensions instead of all evaluation results (#92733) 2024-08-30 12:35:02 +02:00
Yuri Tseretyan
ce64d79027
Alerting: Integration tests for Receiver API (#90632)
---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2024-08-29 22:27:26 -04:00
Matthew Jacobson
d5fd6aceca
Alerting: Stop redacting receivers by default in receiver_svc (#92631)
* Stop redacting receivers by default in receiver_svc

[REDACTED] is only used in provisioning API since response doesn't include
SecureFields. This is not necessary in k8s or notifications api, instead we do
not include the encrypted settings in Settings at all, leaving it to
SecureFields to specify when a secure field exists.

* Capitalize logs messages
2024-08-29 14:48:59 -04:00
Matthew Jacobson
e43ddd516d
Alerting: Ensure k8s receiver API create/update will never store nil settings (#92701)
Ensure Create/Update will never store nil Settings
2024-08-29 20:00:55 +03:00
Todd Treece
2bb2183b41
Scopes: Move title and groups to status in ScopeDashboardBinding (#92377)
---------

Co-authored-by: Kyle Brandt <kyle@grafana.com>
Co-authored-by: Bogdan Matei <bogdan.matei@grafana.com>
2024-08-28 08:59:18 -04:00
Alexander Akhmetov
7f6b6dea45
Alerting: Change expire placeholder for Pushover in the UI to 10800 seconds (#92379)
* Alerting: Change max retry for Pushover in the UI to 10800 seconds
* Update alerting to 70248a7a3a674e50e026a37205ebac86e1ec25fd
2024-08-27 11:13:58 +03:00