* Folders: Expose function for getting all org folders with specific UIDs
* lint
* Fix test
* fixup
* Apply suggestion from code review
* Remove changes in alerting scheduler
* fixup
* fixup after merge with main
* Add batching
* Use strings.Builder
* Return all org folders if UIDs is empty
* Filter out not accessible folders by the user
* Remove comment
* Fix batching when count is zero
* Do not include dashboard permissions
* Add some tests
* fix test
* Use batch request for folders
* Use batch request to deduplicate folders
* Refactor
* Fix after merging main
* Refactor
---------
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
* Add API test
* Add move tests
* Fix create folder
* Fix move
* Fix test
* Drop and re-create index so that allows a folder to contain a dashboard and a subfolder with same name
* Get folder by title defaults to root folder and optionally fetches folder by provided parent folder
* Apply suggestions from code review
* Folders: Expose function for getting all org folders with specific UIDs
* Return all org folders if UIDs is empty
* Filter out not accessible folders by the user
* Modify query to optionally returning a string that contains the UIDs of all parent folders separated by slash.
* Alerting: Add action, scope, role_id to permission table
The existing role_id, action, scope index has the wrong ordering to be most
effectively used in dashboard/folder permission requests.
On a large tests set, the slow database calls were on the order of ~30-40ms, so
when performed individually they don't have that large of a latency impact.
However, when done in bulk in the migration this adds up to some very slow
requests.
After the index is added these same database calls are reduced to ~4-5ms
* Change index to action, scope, role_id
* Make new index unique and drop [role_id, action, scope] index
* Alerting: During legacy migration reduce the number of created silences
During legacy migration every migrated rule was given a label rule_uid=<uid>.
This was used to silence DatasourceError/DatasourceNoData alerts for
migrated rules that had either ExecutionErrorState/NoDataState set to
keep_state, respectively.
This could potentially create a large amount of silences and a high cardinality
label. Both of these scenarios have poor outcomes for CPU load and latency in
unified alerting.
Instead, this change creates one label per ExecutionErrorState/NoDataState when
they are set to keep_state as well as two silence rules, if rules with said
labels were created during migration. These silence rules are:
- __legacy_silence_error_keep_state__ = true
- __legacy_silence_nodata_keep_state__ = true
This will drastically reduce the number of created silence rules in most cases
as well as not create the potentially high cardinality label `rule_uid`.
* introduce feature toggle
* create base service structure
* fix sample metric
* register metrics
* add to codeowners
* separate api dtos from service models
* remove leading newline
* Add AuthNSvc reload handling
* Working, need to add test
* Remove commented out code
* Add Reload implementation to connectors
* Align and add tests, refactor
* Add more tests, linting
* Add extra checks + tests to oauth client
* Clean up based on reviews
* Move config instantiation into newSocialBase
* Use specific error
These don't get marshalled and unmarshalled in the same way as they are represented in Go
This PR changes the OpenAPI spec to reflect what the API accepts and sends back
* Simple, per-base-interval jitter
* Add log just for test purposes
* Add strategy approach, allow choosing between group or rule
* Add flag to jitter rules
* Add second toggle for jittering within a group
* Wire up toggles to strategy
* Slightly improve comment ordering
* Add tests for offset generation
* Rename JitterStrategyFrom
* Improve debug log message
* Use grafana SDK labels rather than prometheus labels
* add deployment registry API cloud only
* update versions
* add feature flag endpoints
* use helpers
* merge main
* update AllowSelfServie and re-run code gen
* fix package name
* add allowselfserve flag to payload
* remove config
* update list api to return the full registry including states
* change enabled check
* fix compile error
* add feature toggle and split path in frontend
* changes
* with status
* add more status/state
* add back config thing
* add back config thing
* merge main
* merge main
* now on the /current api endpoint
* now on the /current api endpoint
* drop frontend changes
* change group name to featuretoggle (singular)
* use the same settings
* now with patch
* more common refs
* more common refs
* WIP actually do the webhook
* fix comment
* fewer imports
* registe standalone
* one less file
* fix singular name
---------
Co-authored-by: Michael Mandrus <michael.mandrus@grafana.com>
* ngalert openapi: Use same `basePath` as rest of Grafana
Currently, there are two issues that prevent easily merging `ngalert` and grafana openapi specs:
- The basePath is different. `grafana` has `/api` and `ngalert` has `/api/v1`. I changed `ngalert` to use `/api`
- The `ngalert` endpoints have their basePath in the each operation path. The basePath should actually be omitted
---------
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
* first touches
* Merge missing SSO settings to support Advanced Auth pages
* fix
* Update secrets correctly
* Add test for upsert with redactedsecret
* Verify decryption in the List tests
* AuthnSync: Rename files and structures
* AuthnSync: register rbac cloud role sync if feature toggle is enabled
* RBAC: Add new sync function to service interface
* RBAC: add common prefix and role names for cloud fixed roles
* AuthnSync+RBAC: implement rbac cloud role sync
Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Change ruler API to expect the folder UID as namespace
* Update example requests
* Fix tests
* Update swagger
* Modify FIle field in /api/prometheus/grafana/api/v1/rules
* Fix ruler export
* Modify folder in responses to be formatted as <parent UID>/<title>
* Add alerting test with nested folders
* Apply suggestion from code review
* Alerting: use folder UID instead of title in rule API (#77166)
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
* Drop a few more latent uses of namespace_id
* move getNamespaceKey to models package
* switch GetAlertRulesForScheduling to use folder table
* update GetAlertRulesForScheduling to return folder titles in format `parent_uid/title`.
* fi tests
* add tests for GetAlertRulesForScheduling when parent uid
* fix integration tests after merge
* fix test after merge
* change format of the namespace to JSON array
this is needed for forward compatibility, when we migrate to full paths
* update EF code to decode nested folder
---------
Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
* Alerting: Add metric to check for default AM configurations
* Use a gauge for the config hash
* don't go out of bounds when converting uint64 to float64
* expose metric for config hash
* update metrics after applying config
* remove latest.json and replace with api call to grafana.com
* remove latest.json
* Revert "remove latest.json"
This reverts commit bcff43d898.
* Revert "remove latest.json and replace with api call to grafana.com"
This reverts commit 02b867d84e.
* add deprecation message to latest.json
* first commit
* add: pagination to anondevices
* fmt
* swagger and tests
* swagger
* testing out test
* fixing tests
* made it possible to query for from and to time
* refactor: change to query for ip adress instead
* fix: tests
* separate nestedFolderPickerOverride toggle to force enable it without nestedFolders
* let's call it newFolderPicker
* update unit tests and keyboard handling
* reduce spacing when no folder open chevron
---------
Co-authored-by: Josh Hunt <joshhunt@users.noreply.github.com>
* Split subquery when cleaning annotations
* update comment
* Raise batch size, now that we pay attention to it
* Iterate in batches
* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use
* Use split-out utility in outer batching loop so it respects context cancellation
* guard against empty queries
* Use SQL parameters
* Use same approach for tags
* drop unused function
* Work around parameter limit on sqlite for large batches
* Bulk insert test data in DB
* Refactor test to customise test data creation
* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit
* Turn annotation cleanup test to integration tests
* lint
---------
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
* add get mute timing by name to MuteTimingService
* update get mute timing request handler to use the service method
* replace validation, uniqueness and used errors with errutils
* update mute timing methods return errutil responses
* use the term "time interval" in errors bevause mute timings are deprecated in Alertmanager and will be replaced by time intervals in the future.
* update create and update methods to return struct instead of pointer
* Remove FolderID from service tests
* Add models
* Add folderID pack to publicdashboard tests
* Remove folderID from dashboard tests
* Remove folderID from folders
* Remove folderID from ngalert tests
* Remove nolint comment
* Add back some tests after rebase
* Alerting: Increase size of kvstore value type for MySQL to LONGTEXT
alertmanager uses the kvstore to persist its notification log and the current
column limit for MySQL (16.7mb) puts the maximum entries at a level that is
potentially achievable for heavy alerting users (~40-80k entries).
In comparison, the current type for PSQL (TEXT) is effectively unlimited and
I believe SQLIte defaults to 2gb which is also plenty of leeway.
* Move scope type vars to testutil package
* Expose parts of state historian for use in annotation backend
* Implement Loki ASH Annotation store
This store will only implement the `Get` method of a RepositoryImpl since alert state history
writes to Loki elsewhere.
* Use interface for Loki HTTP Client
* Add tests for Loki ASH Annotation store
* Add missing test
* Fix lint
* Organize tests
* Add filter tests
* Improve tests
* Move filter logic into outer function
* Fix lint
* Add comment
* Fix tests
* Fix lint
* Rename historian store + refactor
* Cleanup historian store
* Fix tests
* Minor cleanup
* Use new `ShouldRecordAnnotation` filter
* Fix logic and add tests for this check
* Fix typos, remove unused variables, `< 1` -> `== 0`
* More closely mimic RBAC filter from xorm to ensure correct logic
* Move off weaveworks client
* Address PR comments