grafana

mirror of https://github.com/grafana/grafana.git synced 2024-11-25 18:30:41 -06:00

Author	SHA1	Message	Date
Ieva	167151b211	Chore: Remove use of deprecated method in AC code (#87541 ) * switch from using cfg to using featuremgmt for checking a feature toggle in AC code * merge test fixes	2024-05-10 11:56:52 +01:00
Matthew Jacobson	babfa2beac	Alerting: Hook up GMA silence APIs to new authentication handler (#86625 ) This PR connects the new RBAC authentication service to existing alertmanager API silence endpoints.	2024-05-03 15:32:30 -04:00
Santiago	b76a9e4d31	Alerting: Implement GetStatus in the remote Alertmanager struct (#84887 ) * Alerting: Implement GetStatus in the remote Alertmanager struct * update tests * fix tests, extract AlertmanagerConfig from PostableConfig * get the remote AM config instead of the Grafana one from the remote AM * pass grafana AM config in test * return error in GetStatus instead of logging it (internal AM)	2024-05-03 13:59:02 +02:00
Matthew Jacobson	3397e8bf09	Alerting: Improve error when receiver or time interval used by rule is deleted (#86865 ) * Alerting: Improve error when receiver used by rule is deleted * Remove RuleUID from public error and data * Improve fallback error in am config post * Refactor to expand to time intervals * Fix message on unchecked errors to be same as before	2024-04-25 13:36:00 -04:00
Julian Siebert	14f018e3fc	Docs: Use correct description for "og_priority" (#80889 )	2024-04-22 13:53:18 +00:00
Santiago	529f55cfe8	Alerting: Remove isDefault field from receivers (Alertmanager configuration) (#86605 ) Alerting: Remove isDefault field from receivers in the Alertmanager configuration	2024-04-19 15:44:20 +02:00
Santiago	309a7e7684	Alerting: Implement SaveAndApplyDefaultConfig in the remote Alertmanager struct (#85005 ) * Alerting: Implement SaveAndApplyDefaultConfig in the remote Alertmanager struct * send the hash of the encrypted configuration * tests, default config hash in AM struct * add missing default config to test * restore build directory * go work file... * fix broken test * remove unnecessary conversion to []byte * go work again... * make things work again with latest main branch changes * update error messages in tests for decrypting config	2024-04-19 15:11:07 +02:00
Matthew Jacobson	71445002b7	Alerting: Fix simplified routing group by override (#86552 ) * Alerting: Fix simplified routing custom group by override Custom group by overrides for simplified routing were missing required fields GroupBy and GroupByAll normally set during upstream Route validation. This fix ensures those missing fields are applied to the generated routes. * Inline GroupBy and GroupByAll initialization instead of normalize after	2024-04-18 21:08:14 -04:00
Matthew Jacobson	533bed6d94	Alerting: Fix simplified routes '...' groupBy creating invalid routes (#86006 ) * Alerting: Fix simplified routes '...' groupBy creating invalid routes There were a few ways to go about this fix: 1. Modifying our copy of upstream validation to allow this 2. Modify our notification settings validation to prevent this 3. Normalize group by on save 4. Normalize group by on generate Option 4. was chosen as the others have a mix of the following cons: - Generated routes risk being incompatible with upstream/remote AM - Awkward FE UX when using '...' - Rule definition changing after save and potential pitfalls with TF With option 4. generated routes stay compatible with external/remote AMs, FE doesn't need to change as we allow mixed '...' and custom label groupBys, and settings we save to db are the same ones requested. In addition, it has the slight benefit of allowing us to hide the internal implementation details of `alertname, grafana_folder` from the user in the future, since we don't need to send them with every FE or TF request. * Safer use of DefaultNotificationSettingsGroupBy * Fix missed API tests	2024-04-16 12:14:39 -04:00
Matthew Jacobson	f79dd7c7f9	Alerting: Persist silence state immediately on Create/Delete (#84705 ) * Alerting: Persist silence state immediately on Create/Delete Persists the silence state to the kvstore immediately instead of waiting for the next maintenance run. This is used after Create/Delete to prevent silences from being lost when a new Alertmanager is started before the state has persisted. This can happen, for example, in a rolling deployment scenario. * Fix test that requires real data * Don't error if silence state persist fails, maintenance will correct	2024-04-09 13:39:34 -04:00
Santiago	2e7cc68394	Alerting: Remove CleanUp method from the Alertmanager (#85650 ) Alerting: Remove Cleanup method from the Alertmanager	2024-04-09 12:13:27 +02:00
Santiago	6a75a8f354	Alerting: Update grafana/alerting and use Upsert for creating silences (#85676 ) * Alerting: Update grafana/alerting and use Upsert for creating silences * go.work.sum * change error message in tests for silences (save -> upsert)	2024-04-08 11:46:14 +02:00
Santiago	c7573bb0f7	Alerting: Make retention period configurable for the notification log (#85605 ) * Alerting: Make retention period configurable for the notification log * update sample.ini * fix outdated comment (on disk -> kvstore) * skip checking cyclomatic complexity for ReadUnifiedAlertingSettings	2024-04-05 12:25:43 +02:00
Dave Henderson	5687243d0b	Feature Flags: use FeatureToggles interface where possible (#85131 ) * Feature Flags: use FeatureToggles interface where possible Signed-off-by: Dave Henderson <dave.henderson@grafana.com> * Replace TestFeatureToggles with existing WithFeatures Signed-off-by: Dave Henderson <dave.henderson@grafana.com> --------- Signed-off-by: Dave Henderson <dave.henderson@grafana.com>	2024-04-04 12:22:31 -04:00
Matthew Jacobson	0c3c5c5607	Alerting: Stop persisting silences and nflog to disk (#84706 ) With this change, we no longer need to persist silence/nflog states to disk in addition to the kvstore	2024-03-23 00:37:33 +02:00
Pepe Cano	2d6586952d	Alerting: Add placeholder to the Email Contact Point Message (#84064 )	2024-03-21 13:03:12 -04:00
Santiago	c9bb18101c	Alerting: Decrypt secrets before sending configuration to the remote Alertmanager (#83640 ) * (WIP) Alerting: Decrypt secrets before sending configuration to the remote Alertmanager * refactor, fix tests * test decrypting secrets * tidy up * test SendConfiguration, quote keys, refactor tests * make linter happy * decrypt configuration before comparing * copy configuration struct before decrypting * reduce diff in TestCompareAndSendConfiguration * clean up remote/alertmanager.go * make linter happy * avoid serializing into JSON to copy struct * codeowners	2024-03-19 12:12:03 +01:00
Matthew Jacobson	2e8c514cfd	Alerting: Stop persisting user-defined templates to disk (#83456 ) Updates Grafana Alertmanager to work with new interface from grafana/alerting#161. This change stops passing user-defined templates to the Grafana Alertmanager by persisting them to disk and instead passes them by string.	2024-03-04 20:12:49 +02:00
Santiago	8ad367e4ad	Chore: Remove redundant error check (#83769 )	2024-03-01 13:28:08 -03:00
George Robinson	a564c8c439	Alerting: Keep order of time and mute time intervals consistent (#83257 )	2024-02-22 16:57:20 +00:00
George Robinson	1ed1242358	Alerting: Basic support for time_intervals (#83216 ) This commit adds basic support for time_intervals, as mute_time_intervals is deprecated in Alertmanager and scheduled to be removed before 1.0. It does not add support for time_intervals in API or file provisioning, nor does it support exporting time intervals. This will be added in later commits to keep the changes as simple as possible.	2024-02-22 15:58:56 +00:00
Yuri Tseretyan	1eebd2a4de	Alerting: Support for simplified notification settings in rule API (#81011 ) * Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping * Support validation of notification settings when rules are updated * Implement route generator for Alertmanager configuration. That fetches all notification settings. * Update multi-tenant Alertmanager to run the generator before applying the configuration. * Add notification settings labels to state calculation * update the Multi-tenant Alertmanager to provide validation for notification settings * update GET API so only admins can see auto-gen	2024-02-15 09:45:10 -05:00
Dan Cech	790e1feb93	Chore: Update test database initialization (#81673 ) * streamline initialization of test databases, support on-disk sqlite test db * clean up test databases * introduce testsuite helper * use testsuite everywhere we use a test db * update documentation * improve error handling * disable entity integration test until we can figure out locking error	2024-02-09 09:35:39 -05:00
Gokhan	cf601fab09	Alerting: Enable sending notifications to a specific topic on Telegram (#79546 ) Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>	2024-02-06 17:19:22 +02:00
William Wernert	2ab7d3c725	Alerting: Receivers API (read only endpoints) (#81751 ) * Add single receiver method * Add receiver permissions * Add single/multi GET endpoints for receivers * Remove stable tag from time intervals See end of PR description here: https://github.com/grafana/grafana/pull/81672	2024-02-05 20:12:15 +02:00
William Wernert	7e939401dc	Alerting: Introduce initial common receiver service (#81211 ) * Create locking config store that mimics existing provisioning store * Rename existing receivers(_test).go * Introduce shared receiver group service * Fix test * Move query model to models package * ReceiverGroup -> Receiver * Remove locking config store * Move convert methods to compat.go * Cleanup	2024-02-01 14:42:59 -05:00
George Robinson	0726c7c3fa	Alerting: Prevent inhibition rules in Grafana Alertmanager (#81712 ) This commit prevents saving configurations containing inhibition rules in Grafana Alertmanager. It does not reject inhibition rules when using external Alertmanagers, such as Mimir. This meant the validation had to be put in the MultiOrgAlertmanager instead of in the validation of PostableUserConfig. We can remove this when inhibition rules are supported in Grafana Managed Alerts.	2024-02-01 14:53:15 +00:00
Matthew Jacobson	0ce1ccd6f9	Alerting: Fix inconsistent AM raw config when applied via sync vs API (#81655 ) AM config applied via API would use the PostableUserConfig as the AM raw config and also the hash used to decide when the AM config has changed. However, when applied via the periodic sync the PostableApiAlertingConfig would be used instead. This leads to two issues: - Inconsistent hash comparisons when modifying the AM causing redundant applies. - GetStatus assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs. Note: Technically, the upstream GrafanaAlertmanager GetStatus shouldn't be returning PostableUserConfig or PostableApiAlertingConfig, but instead GettableStatus. However, this issue required changes elsewhere and is out of scope.	2024-01-31 21:05:30 +02:00
William Wernert	2203bc2a3d	Alerting: Refactor provisioning tests/fakes (#81205 ) * Fix up test Alertmanager config JSON * Move fake AM config and provisioning stores to fakes package	2024-01-24 17:15:55 -05:00
George Robinson	85b9edcd28	Alerting: Fix incorrect initialization of logger (#81099 )	2024-01-23 17:29:38 +02:00
Santiago	3afd94185c	Alerting: Add metric to check for default AM configurations (#80225 ) * Alerting: Add metric to check for default AM configurations * Use a gauge for the config hash * don't go out of bounds when converting uint64 to float64 * expose metric for config hash * update metrics after applying config	2024-01-16 17:12:24 +01:00
Santiago	9e78faa7ba	Alerting: Add metrics to the remote Alertmanager struct (#79835 ) * Alerting: Add metrics to the remote Alertmanager struct * rephrase http_requests_failed description * make linter happy * remove unnecessary metrics * extract timed client to separate package * use histogram collector from dskit * remove weaveworks dependency * capture metrics for all requests to the remote Alertmanager (both clients) * use the timed client in the MimirAuthRoundTripper * HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function * refactor * less git diff * gauge for last readiness check in seconds * initialize LastReadinesCheck to 0, tweak metric names and descriptions * add counters for sync attempts/errors * last config sync and last state sync timestamps (gauges) * change latency metric name * metric for remote Alertmanager mode * code review comments * move label constants to metrics package	2024-01-10 11:18:24 +01:00
Santiago	1f6575e65e	Alerting: Test MOA in remote secondary mode (#79828 )	2024-01-05 11:05:27 +01:00
Santiago	a77ba40ed4	Alerting: Use the forked Alertmanager for remote secondary mode (#79646 ) * (WIP) Alerting: Use the forked Alertmanager for remote secondary mode * fall back to using internal AM in case of error * remove TODOs, clean up .ini file, add orgId as part of remote AM config struct * log warnings and errors, fall back to remoteSecondary, fall back to internal AM only * extract logic to decide remote Alertmanager mode to a separate function, switch on mode * tests * make linter happy * remove func to decide remote Alertmanager mode * refactor factory function and options * add default case to switch statement * remove ineffectual assignment	2023-12-21 15:26:31 +01:00
Santiago	c46da8ea9b	Alerting: Update alerting package and imports from cluster and clusterpb (#79786 ) * Alerting: Update alerting package * update to latest commit * alias for imports	2023-12-21 12:34:48 +01:00
Santiago	f7248efff5	Alerting: Fix panic when creating a new Alertmanager returns an error (#79641 ) Alerting: Fix panic after error creating new Alertmanager	2023-12-18 15:33:07 +01:00
Santiago	57e0d6bcb5	Chore: Simplify function signature for GetLatestAlertmanagerConfiguration (#79392 )	2023-12-12 13:49:54 +01:00
Santiago	73776f37eb	Alerting: Send state to the remote Alertmanager (#78538 ) * Alerting: Introduce a Mimir client as part of the Remote Alertmanager Mimir client that understands the new APIs developed for mimir. Very much a WIP still. * more wip * appease the linter * more linting * add more code * get state from kvstore, encode, send * send state to the remote Alertmanager, extract fullstate logic into its own function * pass kvstore to remote.NewAlertmanager() * refactor * add fake kvstore to tests * tests * use FileStore to get state * always log 'completed state upload' * refactor compareRemoteConfig * base64-encode the state in the file store * export silences and nflog filenames, refactor * log 'completed state/config upload...' regardless of outcome * add values to the state store in tests * address code review comments * log error from filestore --------- Co-authored-by: gotjosh <josue.abreu@gmail.com>	2023-11-29 12:49:39 +01:00
Santiago	01d274852c	Alerting: Add GetFullState method to FileStore (#78701 ) * Alerting: Add GetFullState method to FileStore * make tests compile, create stateStore in NewAlertmanager * return errors instead of logging, accept an arbitrary number of strings * make NewAlertmanager() accept a stateStore	2023-11-28 15:34:45 +01:00
Santiago	197f0d2859	Alerting: Add methods for silences to the forked Alertmanager (#77805 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * make linter happy * Alerting: Add methods for silences to the forked Alertmanager	2023-11-08 12:03:40 +01:00
Ryan McKinley	5d5f8dfc52	Chore: Upgrade Go to 1.21.3 (#77304 )	2023-11-01 09:17:38 -07:00
Santiago	a6b9b27673	Alerting: Remove OrgID() from the Alertmanager interface (#77398 )	2023-10-31 10:58:47 +01:00
Yuri Tseretyan	48b55f39bf	Alerting: Add support for responders to Opsgenie integration (#77159 ) * add support for responders in opsgenie UI config * update export model Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>	2023-10-27 13:06:46 -04:00
Santiago	f9fc2e4568	Alerting: Remove ConfigHash() from the Alertmanager interface (#77134 )	2023-10-25 17:11:53 +02:00
Santiago	322a9c0b15	Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface (#77126 ) Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface	2023-10-25 13:58:28 +02:00
gotjosh	866acbd5ac	Alerting: Move `ExternalAlertmanager` to its own package (#76854 ) * Alerting: Move `ExternalAlertmanager` to its own package We'll avoid import cycles when using components from other packages. In addition to that, I've created an `Options` approach for the multiorg alertmanager to allow us to override how per-tenant alertmanagers are created. * switch things around * address review comments * fix references and warnings	2023-10-20 14:08:13 +02:00
Santiago	a60ec150f9	Alerting: Fetch receivers from remote Alertmanager (#76841 ) * Alerting: fetch receivers from remote Alertmanager * make linter happy * change require.Eventually() timeout and tick	2023-10-20 11:34:17 +02:00
Santiago	61cb26711e	Alerting: Fetch alerts from a remote Alertmanager (#75844 ) * Alerting: post alerts to the remote Alertmanager and fetch them * fix broken tests * Alerting: Add Mimir Backend image to devenv (blocks) * add alerting as code owner for mimir_backend block * Alerting: Use Mimir image to run integration tests for the remote Alertmanager * skip integration test when running all tests * skipping integration test when no Alertmanager URL is provided * fix bad host for mimir_backend * remove basic auth testing until we have an nginx image in our CI * add integration tests for alerts * fix tests * change SendCtx -> Send, add context.Context to Send, fix CI * add recover() for functions from the Prometheus Alertmanager HTTP client that could panic * add TODO to implement PutAlerts in a way that mimics what Prometheus does * fix log format	2023-10-19 11:27:37 +02:00
Santiago	7d9b2c73c7	Alerting: Use Mimir image to run integration tests for the remote Alertmanager (#76608 ) * Alerting: Use Mimir image to run integration tests for the remote Alertmanager * skip integration test when running all tests * skipping integration test when no Alertmanager URL is provided * fix bad host for mimir_backend * remove basic auth testing until we have an nginx image in our CI	2023-10-17 12:21:45 +02:00
Matthew Jacobson	82f3127e23	Alerting: Move legacy alert migration from sqlstore migration to service (#72702 )	2023-10-12 13:43:10 +01:00

386 Commits