* Alerting: Create feature flag for alert query optimization
Adds a feature flag alertingQueryOptimization for an already existing
functionality: alert query optimization. This feature flag will now be disabled
by default.
* reload SSO settings for HA setups
* remove check for grafana HA
* add unit tests
* fetch all sso settings with one sql query
* register background service
* Add enablePluginsTracingByDefault feature flag
* Enable tracing for all plugins if enablePluginsTracingByDefault is set
* fix docstrings for IsEnabled and IsEnabledGlobally
* fix tests
* do not use separate feature manager
* add test case
* Revert "fix tests"
This reverts commit 46a2420ed1.
* cleanup
* fix plugin tracing disabled if wrong plugin setting is present
* add test case for enabled on plugin with wrong plugin setting but with enablePluginsTracingByDefault feature flag
* Add RequiresRestart = true to enablePluginsTracingByDefault
* re-generate feature flags
* pr review feedback
* Alerting: Add metrics to the remote Alertmanager struct
* rephrase http_requests_failed description
* make linter happy
* remove unnecessary metrics
* extract timed client to separate package
* use histogram collector from dskit
* remove weaveworks dependency
* capture metrics for all requests to the remote Alertmanager (both clients)
* use the timed client in the MimirAuthRoundTripper
* HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function
* refactor
* less git diff
* gauge for last readiness check in seconds
* initialize LastReadinesCheck to 0, tweak metric names and descriptions
* add counters for sync attempts/errors
* last config sync and last state sync timestamps (gauges)
* change latency metric name
* metric for remote Alertmanager mode
* code review comments
* move label constants to metrics package
* Alerting: Fix NoData & Error alerts not resolving when rule is reset
On rule reset, when creating the PostableAlerts StateToPostableAlert did not
attach the correct NoData/Error alertname and rulename labels to expire/resolve
the active alerts when the previous cached state was NoData/Error.
* Return data in camelCase from the OAuth fb strategy
* changes
* wip
* Add defaults for oauth fb strategy
* revert other changes
* basic includeDefaults query param implementation
* basic secret removal and etag implementation
* correct imports
* rebase
* move default settings filter to models
* only replace ClientSecret value if set
* first GetForProvider test & use FNV for ETag to avoid Blocklisted import error
* add tests
* add annotation for the openapi spec & generate spec
* remove TODO
* use IsSecret, improve tests, remove DefaultOAuthSettings
* add comment explaining generateFNVETag
* add error handling for generateFNVETag
* run go generate
* Update pkg/services/ssosettings/api/api.go
Co-authored-by: Mihai Doarna <mihai.doarna@grafana.com>
* move isSecret to service, create GetForProviderWithRedactedSecrets func
* add unit test for GetForProviderWithRedactedSecrets & remove duplicated code
* regen openapi/swagger
* revert dependency bumps
---------
Co-authored-by: Mihaly Gyongyosi <mgyongyosi@users.noreply.github.com>
Co-authored-by: Mihai Doarna <mihai.doarna@grafana.com>
This PR has two steps that together create a functional dry-run capability for the migration.
By enabling the feature flag alertingPreviewUpgrade when on legacy alerting it will:
a. Allow all Grafana Alerting background services except for the scheduler to start (multiorg alertmanager, state manager, routes, …).
b. Allow the UI to show Grafana Alerting pages alongside legacy ones (with appropriate in-app warnings that UA is not actually running).
c. Show a new “Alerting Upgrade” page and register associated /api/v1/upgrade endpoints that will allow the user to upgrade their organization live without restart and present a summary of the upgrade in a table.
* extract get and save operations to a alertmanagerConfigStore. this removes duplicated code in service (currently only mute timings) and improves testing
* replace generic errors with errutils one with better messages.
* update provisioning services to use new store
---------
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
* Docs: Add table data in PDF
* fix lint issues
* Switch to public preview
* Apply suggestions from code review
Co-authored-by: Isabel <76437239+imatwawana@users.noreply.github.com>
---------
Co-authored-by: Isabel <76437239+imatwawana@users.noreply.github.com>
There were a few errors that prevented these endpoints (which are the most up-to-date ones) from being present in the openapi spec:
- The `enterprise` tag excluded the endpoints from being generated
- `okRespoonse` typo
- Invalid templating on the parameters
- Missing parameter structs
Some refactoring that will simplify next changes for dry-run PRs. This should be no-op as far as the created ngalert resources and database state, though it does change some logs.
The key change here is to modify migrateOrg to return pairs of legacy struct + ngalert struct instead of actually persisting the alerts and alertmanager config. This will allow us to capture error information during dry-run migration.
It also moves most persistence-related operations such as title deduplication and folder creation to the right before we persist. This will simplify eventual partial migrations (individual alerts, dashboards, channels, ...).
Additionally it changes channel code to deal with PostableGrafanaReceiver instead of PostableApiReceiver (integration instead of contact point).
* Separate overlapping legacy and UA alerting routes
api/alert-notifiers, alerting/list, and alerting/notifications existed in both
legacy and UA.
Rename legacy route paths and nav ids to be independent of UA ones.
Backend:
* Update the Grafana Alerting engine to provide feedback to HysteresisCommand. The feedback information is stored in state.Manager as a fingerprint of each state. The fingerprint is persisted to the database. Only fingerprints that belong to Pending and Alerting states are considered as "loaded" and provided back to the command.
- add ResultFingerprint to state.State. It's different from other fingerprints we store in the state because it is calculated from the result labels.
- add rule_fingerprint column to alert_instance
- update alerting evaluator to accept AlertingResultsReader via context, and update scheduler to provide it.
- add AlertingResultsFromRuleState that implements the new interface in eval package
- update getExprRequest to patch the hysteresis command.
* Only one "Recovery Threshold" query is allowed to be used in the alert rule and it must be the Condition.
Frontend:
* Add hysteresis option to Threshold in UI. It's called "Recovery Threshold"
* Add test for getUnloadEvaluatorTypeFromCondition
* Hide hysteresis in panel expressions
* Refactor isInvalid and add test for it
* Remove unnecesary React.memo
* Add tests for updateEvaluatorConditions
---------
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>