* Remove Result field from AddDataSourceCommand
* Remove DatasourcesPermissionFilterQuery Result
* Remove GetDataSourceQuery Result
* Remove GetDataSourcesByTypeQuery Result
* Remove GetDataSourcesQuery Result
* Remove GetDefaultDataSourceQuery Result
* Remove UpdateDataSourceCommand Result
* Mark AM configuration as applied
* add missing checks, make linter happy
* fix deadlock, mark as valid on save and on load
* mark configurations only if needed
* check error after applyConfig()
* code review comments
* code review changes
* more code review changes
* clean HistoricConfigFromAlertConfig function
This change marks tests in the `sender` package that use an external
process as integration tests instead of unit tests. This speeds up the
package's unit tests by about 20 seconds.
This change also reduces the number of alert instances in the `store`
package's bulk write integration test from 20_000 to 10_000. This is
still enough to exercise the bulk-write code but speeds up the package
tests from about 250s to 130s.
Put together, integration tests go to about 160s while also speeding up
unit tests by 20s.
* Remove URL-based alertmanagers from endpoint config
* WIP
* Add migration and alertmanagers from admin_configuration
* Empty comment removed
* set BasicAuth true when user is present in url
* Remove Alertmanagers from GET /admin_config payload
* Remove URL-based alertmanager configuration from UI
* Fix new uid generation in external alertmanagers migration
* Fix tests for URL-based external alertmanagers
* Fix API tests
* Add more tests, move migration code to separate file, and remove possible am duplicate urls
* Fix edge cases in migration
* Fix imports
* Remove useless fields and fix created_at/updated_at retrieval
Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>
* Audit logs in sender package
* Fix casing and touch up a few key names
* Avoid logging entire alert struct
* Log configuration ID being applied
* Revert change to errorf rather than log
* Tune levels further and remove some redundancies
* Adjust logger naming and standardize log context
* Adjust logger naming in router
* Move log and get rid of dead error handling code
* Alerting: Sanitize invalid label/annotation names for external alertmanagers
Grafana's built-in Alertmanager supports both Unicode label keys and values; however, if using an external
Prometheus Alertmanager label keys must be compatible with their data model.
This means label keys must only contain ASCII letters, numbers, as well as underscores and match the regex
`[a-zA-Z_][a-zA-Z0-9_]*`.
Any invalid characters will now be removed or replaced by the Grafana alerting engine before being sent to
the external Alertmanager according to the following rules:
- `Whitespace` will be removed.
- `ASCII characters` will be replaced with `_`.
- `All other characters` will be replaced with their lower-case hex representation.
* Prefix hex replacements with `0x`
* Refactor for clarity
* Apply suggestions from code review
Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
* move fake FakeExternalAlertmanager to sender package
* move tests from scheduler to router
* update alerts router to have all fields private
* update scheduler tests to use sender mock
* Introduce AlertsRouter in the sender package, and move all fields and methods related to notifications out of the scheduler to this router.
* Introduce a new interface AlertsSender in the schedule package and replace calls of anonymous function `notify` inside the ruleRoutine to calling methods of that interface.
* Rename interface Scheduler in api package to ExternalAlertmanagerProvider, and replace scheduler with AlertRouter as struct that implements the interface.
* Alerting: Refactor & fix unified alerting metrics structure
Fixes and refactors the metrics structure we have for the ngalert service. Now, each component has its own metric struct that includes the JUST the metrics it uses. Additionally, I have fixed the configuration metrics and added new metrics to determine if we have discovered and started all the necessary configurations of an instance.
This allows us to alert on `grafana_alerting_discovered_configurations - grafana_alerting_active_configurations != 0` to know whether an alertmanager instance did not start successfully.
* Alerting: Send alerts to external Alertmanager(s)
Within this PR we're adding support for registering or unregistering
sending to a set of external alertmanagers. A few of the things that are
going are:
- Introduce a new table to hold "admin" (either org or global)
configuration we can change at runtime.
- A new periodic check that polls for this configuration and adjusts the
"senders" accordingly.
- Introduces a new concept of "senders" that are responsible for
shipping the alerts to the external Alertmanager(s). In a nutshell,
this is the Prometheus notifier (the one in charge of sending the alert)
mapped to a multi-tenant map.
There are a few code movements here and there but those are minor, I
tried to keep things intact as much as possible so that we could have an
easier diff.