* Alerting: Attempt to retry retryable errors
Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.
I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.
There's two small differences between how retries work now and how they used to work in legacy alerting.
Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* first round of entityapi updates
- quote column names and clean up insert/update queries
- replace grn with guid
- streamline table structure
fixes
streamline entity history
move EntitySummary into proto
remove EntitySummary
add guid to json
fix tests
change DB_Uuid to DB_NVarchar
fix folder test
convert interface to any
more cleanup
start entity store under grafana-apiserver dskit target
CRUD working, kind of
rough cut of wiring entity api to kube-apiserver
fake grafana user in context
add key to entity
list working
revert unnecessary changes
move entity storage files to their own package, clean up
use accessor to read/write grafana annotations
implement separate Create and Update functions
* go mod tidy
* switch from Kind to resource
* basic grpc storage server
* basic support for grpc entity store
* don't connect to database unless it's needed, pass user identity over grpc
* support getting user from k8s context, fix some mysql issues
* assign owner to snowflake dependency
* switch from ulid to uuid for guids
* cleanup, rename Search to List
* remove entityListResult
* EntityAPI: remove extra user abstraction (#79033)
* remove extra user abstraction
* add test stub (but
* move grpc context setup into client wrapper, fix lint issue
* remove unused constants
* remove custom json stuff
* basic list filtering, add todo
* change target to storage-server, allow entityStore flag in prod mode
* fix issue with Update
* EntityAPI: make test work, need to resolve expected differences (#79123)
* make test work, need to resolve expected differences
* remove the fields not supported by legacy
* sanitize out the bits legacy does not support
* sanitize out the bits legacy does not support
---------
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
* update feature toggle generated files
* remove unused http headers
* update feature flag strategy
* devmode
* update readme
* spelling
* readme
---------
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
* Alerting: Attempt to retry retryable errors
Currently in a draft state, but this was the minimal diff I could put together to exemplify how could achieve this.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Have the first iteration
* Prepare bench testing
* rename the test files
* Remove unnecessary test file
* Introduce influxqlStreamingParser feature flag
* Apply streaming parser feature flag
* Add new tests
* More tests
* return executedQueryString only in first frame
* add frame meta and config
* Update golden json files
* Support tags/labels
* more tests
* more tests
* Don't change original response_parser.go
* provide context
* create util package
* don't pass the row
* update converter with formatted frameName
* add executedQueryString info only to first frame
* update golden files
* rename
* update test file
* use pointer values
* update testdata
* update parsing
* update converter for null values
* prepare converter for table response
* clean up
* return timeField in fields
* handle no time column responses
* better nil field handling
* refactor the code
* add table tests
* fix config for table
* table response format
* fix value
* if there is no time column set name
* linting
* refactoring
* handle the status code
* add tracing
* Update pkg/tsdb/influxdb/influxql/converter/converter_test.go
Co-authored-by: İnanç Gümüş <m@inanc.io>
* fix import
* update test data
* sanity
* sanity
* linting
* simplicity
* return empty rsp
* rename to prevent confusion
* nullableJson field type for null values
* better handling null values
* remove duplicate test file
* fix healthcheck
* use util for pointer
* move bench test to root
* provide fake feature manager
* add more tests
* partial fix for null values in table response format
* handle partial null fields
* comments for easy testing
* move frameName allocation in readSeries
* one less append operation
* performance improvement by making string conversion once
pkg: github.com/grafana/grafana/pkg/tsdb/influxdb/influxql
│ stream2.txt │ stream3.txt │
│ sec/op │ sec/op vs base │
ParseJson-10 314.4m ± 1% 303.9m ± 1% -3.34% (p=0.000 n=10)
│ stream2.txt │ stream3.txt │
│ B/op │ B/op vs base │
ParseJson-10 425.2Mi ± 0% 382.7Mi ± 0% -10.00% (p=0.000 n=10)
│ stream2.txt │ stream3.txt │
│ allocs/op │ allocs/op vs base │
ParseJson-10 7.224M ± 0% 6.689M ± 0% -7.41% (p=0.000 n=10)
* add comment lines
---------
Co-authored-by: İnanç Gümüş <m@inanc.io>
* Unified Alerting: Set `max_attempts` to 1 by default
The retry logic for unified alerting has been broken as far as v9.4.x, rather than fixing it in one go and causing a headache to our users with rules putting extra load on their datasources - I think a better approach is to simply set 1 as a default and then let our users change it.
I see two cons with this approach:
- Configuration for legacy to unified alerting cannot be ported over automatically, users will have to manually set `max_attempts` to 3 when migrating.
- Users expecting to get any sort of retrying (as with legacy alerting) will not have it out of the box and will have to manually edit the configuration.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Folders: Show folders user has access to at the root level
* Refactor
* Refactor
* Hide parent folders user has no access to
* Skip expensive computation if possible
* Fix tests
* Fix potential nil access
* Fix duplicated folders
* Fix linter error
* Fix querying folders if no managed permissions set
* Update benchmark
* Add special shared with me folder and fetch available non-root folders on demand
* Fix parents query
* Improve db query for folders
* Reset benchmark changes
* Fix permissions for shared with me folder
* Simplify dedup
* Add option to include shared folder permission to user's permissions
* Fix nil UID
* Remove duplicated folders from shared list
* Folders: Fix fetching empty folder
* Nested folders: Show dashboards with directly assigned permissions
* Fix slow dashboards fetch
* Refactor
* Fix cycle dependencies
* Move shared folder to models
* Fix shared folder links
* Refactor
* Use feature flag for permissions
* Use feature flag
* Review comments
* Expose shared folder UID through frontend settings
* Add frontend type for sharedWithMeFolderUID option
* Refactor: apply review suggestions
* Fix parent uid for shared folder
* Fix listing shared dashboards for users with access to all folders
* Prevent creating folder with "shared" UID
* Add tests for shared folders
* Add test for shared dashboards
* Fix linter
* Add metrics for shared with me folder
* Add metrics for shared with me dashboards
* Fix tests
* Tests: add metrics as a dependency
* Fix access control metadata for shared with me folder
* Use constant for shared with me
* Optimize parent folders access check, fetch all folders in one query.
* Use labels for metrics
* Export Notification Policy correctly (#78020)
The JSON version of an exported Notification Policy now
inline correctly the policy in the same way the Yaml version
does.
Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
* ngalert `make`: Support GNU install on Darwin
Currently, the Makefile assumes that Darwin is using the Mac version of `sed`
I have the GNU version, so it failed. With this PR, it checks which version is installed
I also called `make` and there are some changes that came out of it
* swagger-gen
fix: ability to (de)serialize parents and current index
- refactored so only history stores parent relations and current index
- rounded indirect parent links
The definition for preferences is globally named `Spec` because that's the type that cue outputs
This adds a swagger annotation to rename the definition in the swagger schema to `Preferences`
This will be easier to use in generated clients