grafana/pkg
Steve Simpson 73873f5a8a
Alerting: Optimize rule status gathering APIs when a limit is applied. (#86568)

The frontend very commonly calls the `/rules` API with `limit_alerts=16`. When
there are a very large number of alert instances present, this API is quite
slow to respond, and profiling suggests that a big part of the problem is
sorting the alerts by importance, in order to select the first 16.

This changes the application of the limit to use a more efficient heap-based
top-k algorithm. This maintains a slice of only the highest ranked items whilst
iterating the full set of alert instances, which substantially reduces the
number of comparisons needed. This is particularly effective, as the
`AlertsByImportance` comparison is quite complex.
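
The heap-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `alert` type, the `less` function (a stand-in for the far more involved `AlertsByImportance` comparison), and this `topK` signature are all hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// alert is a hypothetical stand-in for an alert instance; a larger
// importance value means the alert ranks higher.
type alert struct {
	name       string
	importance int
}

// less reports whether a ranks strictly below b. It is an assumed,
// simplified replacement for the AlertsByImportance comparison.
func less(a, b alert) bool { return a.importance < b.importance }

// minHeap keeps the lowest-ranked retained alert at index 0.
type minHeap []alert

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return less(h[i], h[j]) }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(alert)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topK returns the k highest-ranked alerts, highest first, while
// holding at most k alerts in memory besides the input.
func topK(alerts []alert, k int) []alert {
	if k <= 0 {
		return nil
	}
	h := make(minHeap, 0, k)
	for _, a := range alerts {
		if len(h) < k {
			heap.Push(&h, a)
			continue
		}
		// Replace the root only when a outranks the weakest retained alert.
		if less(h[0], a) {
			h[0] = a
			heap.Fix(&h, 0)
		}
	}
	// The heap holds at most k items, so this final sort is cheap.
	sort.Slice(h, func(i, j int) bool { return less(h[j], h[i]) })
	return h
}

func main() {
	alerts := []alert{{"a", 3}, {"b", 9}, {"c", 1}, {"d", 7}, {"e", 5}}
	for _, a := range topK(alerts, 2) {
		fmt.Println(a.name, a.importance)
	}
}
```

Most alerts are rejected after a single comparison against the heap root, which is where the saving over a full sort comes from when k is small.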

I've included a benchmark to compare the new TopK function to the existing
Sort/limit strategy. It shows that for small limits the new approach is
much faster, especially at high numbers of alerts. For example:

100K alerts / limit 16: 1.91s (Sort/limit) vs 0.02s (TopK), a 99% reduction

When there is no effective limit, sorting is marginally faster. The API
implementation therefore sorts the alerts as before if there is either a) no
limit or b) no effective limit, i.e. the limit would not exclude any alerts.
The heap also has a space overhead, which would matter for large limits.
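
The choice between the two paths might look like the sketch below. The `pickStrategy` function and the `strategy` type are hypothetical names. The convention that a negative limit means "no limit" is an assumption made for illustration and may not match the API.

```go
package main

import "fmt"

// strategy names the two ways of producing the limited alert list.
type strategy string

const (
	fullSort strategy = "sort"  // sort every alert, as before
	heapTopK strategy = "top-k" // keep only the best `limit` alerts in a heap
)

// pickStrategy chooses how to apply a limit to numAlerts alerts.
// Assumption: a negative limit means "no limit". A limit that is at
// least numAlerts is not an effective limit, because nothing would be
// excluded, so a plain sort is used in that case too.
func pickStrategy(limit, numAlerts int) strategy {
	if limit < 0 || limit >= numAlerts {
		return fullSort
	}
	return heapTopK
}

func main() {
	fmt.Println(pickStrategy(16, 100000))     // small limit, many alerts
	fmt.Println(pickStrategy(-1, 100000))     // no limit
	fmt.Println(pickStrategy(200000, 100000)) // limit excludes nothing
}
```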

* Remove commented test cases

* Make linter happy
2024-04-19 11:51:22 +02:00
api QueryService: Add feature toggles to better support testing (#86493) 2024-04-19 12:26:21 +03:00
apimachinery Plugins: Expose backendplugin for client proto interface (#86207) 2024-04-17 18:47:01 +02:00
apis Scopes: Add basic integration tests (#85351) 2024-03-29 16:12:28 +02:00
apiserver Plugins: Expose backendplugin for client proto interface (#86207) 2024-04-17 18:47:01 +02:00
build K8s: Add slog wrapper (#84680) 2024-04-09 18:39:25 +03:00
bus Tracing: Standardize on otel tracing (#75528) 2023-10-03 14:54:20 +02:00
cmd Plugins: Pass cancellable context during API server creation (#86545) 2024-04-19 09:22:14 +03:00
codegen Core: Remove thema and kindsys dependencies (#84499) 2024-03-21 11:11:29 +01:00
components Chore: Remove public vars in setting package (#81018) 2024-01-23 12:36:22 +01:00
events Alerting: Update rules version when folder title is updated (#53013) 2022-08-01 19:28:38 -04:00
expr SSE: Threshold expression to use simple functions (#86062) 2024-04-16 13:35:41 -04:00
extensions K8s: Add slog wrapper (#84680) 2024-04-09 18:39:25 +03:00
generated K8s: update hack codegen script (#81216) 2024-01-25 12:01:09 -08:00
ifaces/gcsifaces Chore: Upgrade Go to 1.19.1 (#54902) 2022-09-12 12:03:49 +02:00
infra Tracing: Allow otel service name and attributes to be overridden from env (#85937) 2024-04-11 15:18:46 +02:00
kinds Schemas: Replace registry generation and github workflow (#83490) 2024-03-13 18:05:21 +02:00
login/social samlsettings: api integration (#84300) 2024-03-25 10:54:45 +01:00
middleware Chore: Remove repetitive words (#84132) 2024-03-11 08:55:18 -04:00
mocks/mock_gcsifaces Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
models Auth: Add empty role definition (#64694) 2023-07-06 15:40:06 +02:00
modules Storage Api: Add metrics (#85316) 2024-04-08 08:35:01 -06:00
plugins Return plugin error when requesting settings (#86052) 2024-04-18 14:29:02 +02:00
promlib Prometheus: (Instrumentation) Add rawExpr (pre-interpolation) to traces (#86449) 2024-04-17 19:53:38 +02:00
registry QueryService: Add feature toggles to better support testing (#86493) 2024-04-19 12:26:21 +03:00
server Storage: Watch tests (#85496) 2024-04-08 11:42:12 -04:00
services Alerting: Optimize rule status gathering APIs when a limit is applied. (#86568) 2024-04-19 11:51:22 +02:00
setting Storage Api: Adds traces (#85391) 2024-04-16 08:30:51 -06:00
tests Alerting: Fix simplified routes '...' groupBy creating invalid routes (#86006) 2024-04-16 12:14:39 -04:00
tsdb MSSQL: Add SQL_VARIANT converter and update test (#85823) 2024-04-17 16:49:51 -05:00
util Unified Storage: added pkg/util/ring package to handle queueing of notifications (#84657) 2024-04-11 19:32:31 -03:00
web Image Rendering: Add settings for default width, height and scale (#82040) 2024-02-26 13:27:34 +01:00
README.md Chore: Move all backend contribution documents to a single directory (#61140) 2023-01-11 11:16:52 +01:00
ruleguard.rules.go Chore: update all +build statements (#38782) 2021-09-01 17:38:56 +03:00

This directory contains the code for the Grafana backend.

The contributor documentation for Grafana's backend is in /contribute/backend/README.md.