grafana

mirror of https://github.com/grafana/grafana.git synced 2024-11-25 18:30:41 -06:00

Author	SHA1	Message	Date
Santiago	309a7e7684	Alerting: Implement SaveAndApplyDefaultConfig in the remote Alertmanager struct (#85005 ) * Alerting: Implement SaveAndApplyDefaultConfig in the remote Alertmanager struct * send the hash of the encrypted configuration * tests, default config hash in AM struct * add missing default config to test * restore build directory * go work file... * fix broken test * remove unnecessary conversion to []byte * go work again... * make things work again with latest main branch changes * update error messages in tests for decrypting config	2024-04-19 15:11:07 +02:00
Santiago	a2ce8fefed	Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager (#86451 ) * Alerting: Use a struct when sending a Grafana AM configuration to the remote Alertmanager * remove '-distroless' from mimir image name	2024-04-19 13:04:18 +02:00
Matthew Jacobson	f79dd7c7f9	Alerting: Persist silence state immediately on Create/Delete (#84705 ) * Alerting: Persist silence state immediately on Create/Delete Persists the silence state to the kvstore immediately instead of waiting for the next maintenance run. This is used after Create/Delete to prevent silences from being lost when a new Alertmanager is started before the state has persisted. This can happen, for example, in a rolling deployment scenario. * Fix test that requires real data * Don't error if silence state persist fails, maintenance will correct	2024-04-09 13:39:34 -04:00
Santiago	2e7cc68394	Alerting: Remove CleanUp method from the Alertmanager (#85650 ) Alerting: Remove Cleanup method from the Alertmanager	2024-04-09 12:13:27 +02:00
Matthew Jacobson	0c3c5c5607	Alerting: Stop persisting silences and nflog to disk (#84706 ) With this change, we no longer need to persist silence/nflog states to disk in addition to the kvstore	2024-03-23 00:37:33 +02:00
Santiago	a2facbecd4	Alerting: Implement ApplyConfig for remote primary mode (forked AM) (#84811 ) * Alerting: Implement ApplyConfig for remote primary mode (forked AM) * add TODO for saving the config hash in other config-related methods * fix bad method receiver name (m -> am) * tests * add mutex * remove sync loop	2024-03-22 15:17:41 +01:00
Santiago	4ad6d66479	Alerting: Remove ID from UserGrafanaConfig struct (#84602 ) * Alerting: Remove ID from UserGrafanaConfig struct * use custom mimir image without id in grafana config * change mimir image name	2024-03-19 12:47:13 +01:00
Santiago	c9bb18101c	Alerting: Decrypt secrets before sending configuration to the remote Alertmanager (#83640 ) * (WIP) Alerting: Decrypt secrets before sending configuration to the remote Alertmanager * refactor, fix tests * test decrypting secrets * tidy up * test SendConfiguration, quote keys, refactor tests * make linter happy * decrypt configuration before comparing * copy configuration struct before decrypting * reduce diff in TestCompareAndSendConfiguration * clean up remote/alertmanager.go * make linter happy * avoid serializing into JSON to copy struct * codeowners	2024-03-19 12:12:03 +01:00
Santiago	fbbda6c05e	Alerting: Retry readiness check to the remote Alertmanager on 5xx status code responses (#81174 )	2024-01-24 21:39:06 +01:00
Santiago	3217a0dc05	Alerting: Fix state sync errors counter increment (#80702 )	2024-01-17 11:04:27 +01:00
Santiago	6c87d9a1e7	Alerting: Stop retries on 4xx status code responses (remote Alertmanager readiness check) (#80350 )	2024-01-11 12:12:35 +01:00
Santiago	9e78faa7ba	Alerting: Add metrics to the remote Alertmanager struct (#79835 ) * Alerting: Add metrics to the remote Alertmanager struct * rephrase http_requests_failed description * make linter happy * remove unnecessary metrics * extract timed client to separate package * use histogram collector from dskit * remove weaveworks dependency * capture metrics for all requests to the remote Alertmanager (both clients) * use the timed client in the MimirAuthRoundTripper * HTTPRequestsDuration -> HTTPRequestDuration, clean up mimir client factory function * refactor * less git diff * gauge for last readiness check in seconds * initialize LastReadinesCheck to 0, tweak metric names and descriptions * add counters for sync attempts/errors * last config sync and last state sync timestamps (gauges) * change latency metric name * metric for remote Alertmanager mode * code review comments * move label constants to metrics package	2024-01-10 11:18:24 +01:00
Santiago	a77ba40ed4	Alerting: Use the forked Alertmanager for remote secondary mode (#79646 ) * (WIP) Alerting: Use the forked Alertmanager for remote secondary mode * fall back to using internal AM in case of error * remove TODOs, clean up .ini file, add orgId as part of remote AM config struct * log warnings and errors, fall back to remoteSecondary, fall back to internal AM only * extract logic to decide remote Alertmanager mode to a separate function, switch on mode * tests * make linter happy * remove func to decide remote Alertmanager mode * refactor factory function and options * add default case to switch statement * remove ineffectual assignment	2023-12-21 15:26:31 +01:00
Santiago	9945514baa	Alerting: Validate configuration for the remote Alertmanager struct (#79691 ) * Alerting: Validate configuration for the remote Alertmanager struct * add TenantID to test * add OrgID to config struct in tests	2023-12-19 18:41:48 +01:00
Santiago	23b4568597	Alerting: Send configuration and state to the remote Alertmanager on shutdown (#78682 ) * Alerting: Send configuration and state to the remote Alertmanager on shutdown * Alerting: Add a sync interval for ApplyConfig in remote secondary mode * add routine to sync states and configs * pass a cancellable context to syncRoutine(), remove tests for ApplyConfig, cache last config in memory * extract logic to update config and state in the remote Alertmanager * get latest config from the database * avoid using separate goroutine for updating state and config * clean up PR * refactor, comments, tests * update tests * remove canceled context from calls to StopAndWait() * create context with timeout and send config and state to remote Alertmanager * update tests * address code review comments	2023-12-13 22:53:09 +01:00
Santiago	91836e7832	Alerting: Add time-based convergence in remote secondary mode (#78809 ) * Alerting: Add a sync interval for ApplyConfig in remote secondary mode * add routine to sync states and configs * pass a cancellable context to syncRoutine(), remove tests for ApplyConfig, cache last config in memory * extract logic to update config and state in the remote Alertmanager * get latest config from the database * avoid using separate goroutine for updating state and config * clean up PR * refactor, comments, tests * update tests * add config struct for remote secondary forked Alertmanager * use errgroups for sync operations * use waitgroup instead of errgroup * remove helper method to sync AMs * check for errors instead of bool syncErr	2023-12-13 13:36:17 +01:00
Santiago	1a5c2cb55b	Alerting: Check whether the internal Alertmanager is ready in remote secondary mode (#79406 ) Alerting: Check whether the internal Alertmanager is ready in remote secondary	2023-12-12 18:33:11 +01:00
gotjosh	cc3c0a2cc2	Alerting: Refactor readiness check (#78799 ) * Alerting: Refactor readiness check Moves the readiness check to the mimir client and removes the need to assert that we have senders - it already has a queue and can hold notifications until we're ready to send them. --------- Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-12-12 15:34:54 +00:00
Santiago	d64c2b6f4e	Alerting: Implement ApplyConfig in the forked Alertmanager (#78684 ) * Alerting: Add a sync interval for ApplyConfig in remote secondary mode * remove out of scope code * remove parentheses after CleanUp for consistency in test comments * Add comment to ApplyConfig	2023-11-30 15:36:41 +01:00
Santiago	316c8b50bc	Alerting: Add SaveAndApply methods to the forked Alertmanager (remote secondary) (#78827 ) * Alerting: Add configuration methods to the forked Alertmanager for remote secondary modes * update comments	2023-11-30 15:18:56 +01:00
Santiago	73776f37eb	Alerting: Send state to the remote Alertmanager (#78538 ) * Alerting: Introduce a Mimir client as part of the Remote Alertmanager Mimir client that understands the new APIs developed for mimir. Very much a WIP still. * more wip * appease the linter * more linting * add more code * get state from kvstore, encode, send * send state to the remote Alertmanager, extract fullstate logic into its own function * pass kvstore to remote.NewAlertmanager() * refactor * add fake kvstore to tests * tests * use FileStore to get state * always log 'completed state upload' * refactor compareRemoteConfig * base64-encode the state in the file store * export silences and nflog filenames, refactor * log 'completed state/config upload...' regardless of outcome * add values to the state store in tests * address code review comments * log error from filestore --------- Co-authored-by: gotjosh <josue.abreu@gmail.com>	2023-11-29 12:49:39 +01:00
Santiago	01d274852c	Alerting: Add GetFullState method to FileStore (#78701 ) * Alerting: Add GetFullState method to FileStore * make tests compile, create stateStore in NewAlertmanager * return errors instead of logging, accept an arbitrary number of strings * make NewAlertmanager() accept a stateStore	2023-11-28 15:34:45 +01:00
gotjosh	8120306fea	Remote Alertmanager(refactor): Only parse the URL once (#78631 ) * Remote Alertmanager(refactor): Only parse the URL once Exactly what it says in the tin. Signed-off-by: gotjosh <josue.abreu@gmail.com> * use the existing tests Signed-off-by: gotjosh <josue.abreu@gmail.com> --------- Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-11-24 11:05:13 +00:00
gotjosh	23fe8f4e9c	Alerting: Introduce a Mimir client as part of the Remote Alertmanager (#78357 ) * Alerting: Introduce a Mimir client as part of the Remote Alertmanager This is our first attempt at making Grafana use Mimir as a backend - it uses a new set of APIs that we've developed on the Mimir side to upload the Grafana configuration and Alertmanager state so that it can then be ported over. Codewise, we've introduced a couple of things: a client to isolate in its own package all the communication that happens with Mimir; a few changes to the remote/alertmanager to include uploading the configuration and state when it starts; a few refactors that align a bit better with the design approach that we're thinking of; an integration test against these newly developed APIs using a custom image. --------- Signed-off-by: gotjosh <josue.abreu@gmail.com> Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>	2023-11-23 16:59:36 +00:00
Santiago	4a152a0e35	Alerting: Add lifecycle methods to the forked Alertmanager (#77741 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * Alerting: Add methods for alerts to the forked Alertmanager * Alerting: Add methods for receivers to the forked Alertmanager * Alerting: Add TestTemplate method to the forked Alertmanager * make linter happy * separate into both forked AMs * fix tests * Alerting: Add lifecycle methods to the forked Alertmanager	2023-11-14 11:17:17 +01:00
Santiago	8b751eb216	Alerting: Add TestTemplate method to the forked Alertmanager (#77577 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * Alerting: Add methods for alerts to the forked Alertmanager * Alerting: Add methods for receivers to the forked Alertmanager * Alerting: Add TestTemplate method to the forked Alertmanager * make linter happy * separate into both forked AMs * fix tests	2023-11-09 12:35:24 +01:00
Santiago	ba51c371ec	Alerting: Add methods for receivers to the forked Alertmanager (#77574 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * Alerting: Add methods for alerts to the forked Alertmanager * Alerting: Add methods for receivers to the forked Alertmanager * make linter happy * separate into both forked AMs * fix tests * rename testErr -> expErr	2023-11-09 11:38:16 +01:00
Santiago	e24fe96d90	Alerting: Add methods for alerts to the forked Alertmanager (#77571 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * Alerting: Add methods for alerts to the forked Alertmanager * make linter happy * separate into both forked AMs * rename testErr -> expErr	2023-11-08 13:52:04 +01:00
Santiago	197f0d2859	Alerting: Add methods for silences to the forked Alertmanager (#77805 ) * Alerting: Add an empty Forked Alertmanager * Alerting: Add methods for silences to the forked Alertmanager * check for errors in tests * make linter happy * make linter happy * Alerting: Add methods for silences to the forked Alertmanager	2023-11-08 12:03:40 +01:00
Santiago	01af8f61f1	Alerting: Separate the forked Alertmanager into two implementations (#77582 )	2023-11-02 17:53:18 +01:00
Santiago	8fc9873443	Alerting: Add an empty Forked Alertmanager struct (#77550 ) Alerting: Add an empty Forked Alertmanager	2023-11-02 16:49:03 +01:00
Santiago	a6b9b27673	Alerting: Remove OrgID() from the Alertmanager interface (#77398 )	2023-10-31 10:58:47 +01:00
Santiago	f9fc2e4568	Alerting: Remove ConfigHash() from the Alertmanager interface (#77134 )	2023-10-25 17:11:53 +02:00
Santiago	322a9c0b15	Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface (#77126 ) Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface	2023-10-25 13:58:28 +02:00
Santiago	01add144b8	Alerting: Send alerts to the remote Alertmanager (#77034 ) * Alerting: Rename remote.ExternalAlertmanager to remote.Alertmanager * Alerting: Send alerts to the remote Alertmanager * add ticker to readiness check, add tests * use options when creating a new sender.ExternalAlertmanager * unexport defaultMaxQueueCapacity * delete unused defaultConfig field * add debug log line when sending alerts to the remote alertmanager * move and refactor readiness check * update tests to not include defaultConfig	2023-10-25 11:52:48 +02:00
Santiago	488a60aee6	Alerting: Rename remote.ExternalAlertmanager to remote.Alertmanager (#76956 )	2023-10-23 15:37:14 +02:00
gotjosh	866acbd5ac	Alerting: Move `ExternalAlertmanager` to its own package (#76854 ) * Alerting: Move `ExternalAlertmanager` to its own package We'll avoid import cycles when using components from other packages. In addition to that, I've created an `Options` approach for the multiorg alertmanager to allow us to override how per tenant alertmanagers are created. * switch things around * address review comments * fix references and warnings	2023-10-20 14:08:13 +02:00

37 Commits