* Use the current namespace
* Enable PeakQ API
* Enable PeakQ API when Query API is enabled
* Enable PeakQ API when Query API & Query Library are enabled
* Add search index table
* Stub a test
* Add more tests
* Add basic index
* Switch to UID and add a test for the index
* Improve tests coverage
* Remove redundant whitespaces
* Load all data source APIs when query history is loaded
* Fix column type
* Fix migration
* Clean-up the index
* Fix linting
* Fix migrations
* Fix migrations
* Fix migrations
* Rename index to details
* implement querying gms for snapshot status
* add some documentation
* provide snapshot resources after snapshot is created
* add rate limiting to backend
* fix compilation error
* fix typo
* add unit tests
* finish merge
* lint
* swagger gen
* more testing
* remove duplicate test
* address a couple PR comments
* update switch statement to a map
* add timeouts to gms client through the http client
* remove extra whitespace
* put method back where it was so the PR is less confusing
* fix tests
* add todo
* fix final unit test
* Create some integration testing infra for RRs
* whoops
* Require no error in responding
* fix linter
* Panic, no need to pass testing around
* Extend status test
* add WWW-Authenticate header in the http response of /metrics endpoint in case of wrong basic auth credentials
Signed-off-by: Syed Nihal <syed.nihal@nokia.com>
* added change log for the change fixing the issue https://github.com/grafana/grafana/issues/86902
Signed-off-by: Syed Nihal <syed.nihal@nokia.com>
* Update CHANGELOG.md
---------
Signed-off-by: Syed Nihal <syed.nihal@nokia.com>
* Cloud migration: upload snapshot files using presigned url
* log error if index file cannot be closed
* log error if file cannot be closed in uploadUsingPresignedURL
* Implement EventDetails for expanded rows and pagination on the events list
* Add test for getPanelDataForRule function
* prettier
* refactor EventState component
* create interfaces for props
* Add missing translations
* Update some comments
* Add plus button in alertrulename, to add it into the filter
* Add plus button to add filters from the list labels and alert name
* Add clear labels filter button
* run prettier
* fix RBAC checks
* Update AlertLabels onLabelClick functionality
* add limit=0 in useCombinedRule call
* Add filter by state
* remove plus button in labels
* Fix state filter
* Add filter by previous state
* fix some errors after solving conflicts
* Add comments and remove some type assertions
* Update the number of transitions calculation to be for each instance
* Add tests for state filters
* remove type assertion
* Address review comments
* Update returnTo prop in alert list view url
* Update translations
* address review comments
* prettier
* update cursor to pointer
* Address Deyan review comments
* address review pr comments from Deyan
* fix label styles
* Visualize expanded row as a state graph and address some pr review comments
* Add warning when limit of events is reached and rename onClickLabel
* Update texts
* Fix translations
* Update some Labels in the expanded states visualization
* move getPanelDataForRule to a separate file
* Add header to the list of events
* Move HistoryErrorMessage to a separate file
* remove getPanelDataForRule function and test
* add comment
* filter by instance label results shown in the state chart
* remove defaults.ini changes
* fix having single event on time state chart
---------
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
* order session list descending
* add snapshot status method to store
* query stats while retrieving snapshot
* return stats in dto
* swagger
* fix tests
* commit results of bingo get
* fix swagger
* minor improvement
* fix typo
* forgot a file
* Plugins: Enhanced plugin instrumentation
* use backend.CallResourceResponseSenderFunc
* sdk v0.237.0
* support admission control
* cover all handlers in log and metrics middlewares
* fix after review
* soft delete
* Fix bench test
Co-authored-by: Bruno Abrantes <bruno@brunoabrantes.com>
* Add integration test for soft deletion
---------
Co-authored-by: Ryan McKinley <ryantxu@gmail.com>
* fix kind of TimeInterval
* register custom fields for selectors
* support field selectors in legacy storage
* support selectors in storage
===== Misc
* refactor conversions to build in one place
* hide implementation of provenance status behind accessors to use the key in selectors
* fix provenance error
* Unify values
* Fix with latest changes on main
* Fix up NaN test
* Keep refIDs with -1 as value
* Test that refIDs are preserved on Normal to Error transition
* Alerting to err test too
* Add a blurb to docs about this behavior
The contact point deletion API was returning 500 when it should have been
returning a 4xx error, when the contact point is in use:
- When in use by a notification policy, we were missing
the `.Errorf("")` to convert `errutil.Base` into `errutil.Error`.
- When in use by an alert rule, a regular error was returned.
* Compare results when reading/writing between unified_storage and legacy
* Always use name when comparing objects
* Compare on get method
* Update pkg/apiserver/rest/dualwriter.go
Co-authored-by: Dan Cech <dcech@grafana.com>
* Add new metric to count how many times we read from legacy in mode 2
* Move counter
* Add name in mode1
---------
Co-authored-by: Dan Cech <dcech@grafana.com>
* filter the k6 folder out in the SQL queries rather than during post processing to ensure that the correct number of results is always returned
* linting
* Split org_mapping correctly if it contains multiple colons
* Improve tests
* Use backslash as an escape character for colons
* Cleanup, address feedback
* Change test to use double quotes as an example
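The colon-splitting behavior described above could look roughly like the following sketch. It assumes that only `\:` is treated as an escape sequence and that any other backslash is kept literally; the function name `splitOrgMapping` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOrgMapping splits an entry on unescaped colons. A backslash followed
// by a colon yields a literal colon; this escape rule is an assumption made
// for the example and may differ from the real parser.
func splitOrgMapping(entry string) []string {
	var parts []string
	var cur strings.Builder
	escaped := false
	for _, r := range entry {
		switch {
		case escaped:
			// Keep the backslash unless it was escaping a colon.
			if r != ':' {
				cur.WriteRune('\\')
			}
			cur.WriteRune(r)
			escaped = false
		case r == '\\':
			escaped = true
		case r == ':':
			parts = append(parts, cur.String())
			cur.Reset()
		default:
			cur.WriteRune(r)
		}
	}
	if escaped {
		// A trailing backslash is kept as-is.
		cur.WriteRune('\\')
	}
	return append(parts, cur.String())
}

func main() {
	fmt.Printf("%q\n", splitOrgMapping(`Group\:A:Org1:Editor`))
}
```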
* Revert "Chore: Return influxdb query error early before parsing the result (#88549)"
This reverts commit a87c155c06.
* Handle error in buffered parser
* handle error message in streaming parser
* Add org_mapping and org_attribute_path to the UI
* Add validators, allow setting org mapping to only Grafana Admins
* comment
* Address feedback, improve validation, fix FE test, lint
* Cloud migrations: create snapshot and store it on disk
* fix merge conflicts
* implement StartSnapshot for gms client
* pass snapshot directory as argument to snapshot builder
* ensure snapshot folder is set
* make swagger-gen
* remove Test_ExecuteAsyncWorkflow
* pass signed in user to buildSnapshot method / use github.com/grafana/grafana-cloud-migration-snapshot to create snapshot files
* fix FakeServiceImpl.CreateSnapshot
* remove new line
Adds more spans for timing in accesscontrol and removes permission deduplicating code after benchmarking
---------
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
Co-authored-by: Dave Henderson <dave.henderson@grafana.com>
Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
* Zanzana: Listen http to handle fga cli requests.
* make configurable
* start http server during service run
* wait until GRPC server is ready
* remove unnecessary logs
* fix linter errors
* run only in devenv
* make address configurable
* Alerting: Add setting for maximum allowed rule evaluation results
Added a new configuration setting `quota.alerting_rule_evaluation_results` to set the maximum number of alert rule evaluation results per rule. If the limit is exceeded, the evaluation will result in an error.
This PR reduces the number of allocations made while caching permissions from the database, fixes the hierarchy of spans and adds new spans for tracing.
---------
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
Co-authored-by: Dave Henderson <dave.henderson@grafana.com>
* add method CanReadAllRules to rule authorization service
* add alias type Namespace for Folder in ngalert's models package. It implements the Namespacer interface that is used by authz logic
* update state history's backends to authorize access to rules.
* update Loki to add folders UIDs to query.
* Update BuildLogQuery to drop filter by folders if it's too long and fall back to in-memory filtering.
* WIP implement generic compare interface
* Use global compare fn for all entities
* Lint
* Update pkg/apiserver/rest/dualwriter.go
Co-authored-by: Dan Cech <dcech@grafana.com>
* Don't need to hash, just compare bytes
* Fix tests
---------
Co-authored-by: Dan Cech <dcech@grafana.com>
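As a rough illustration of "just compare bytes", the sketch below serializes two objects and compares the results directly instead of hashing them first. The `playlist` type and the choice of JSON as the serialization are assumptions for the example, not the dual writer's actual comparison.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// playlist is a hypothetical stand-in for an object read from either store.
type playlist struct {
	Name  string   `json:"name"`
	Items []string `json:"items"`
}

// sameObject reports whether two values serialize to identical bytes.
// No hash is computed; bytes.Equal already answers the question.
func sameObject(a, b any) (bool, error) {
	ab, err := json.Marshal(a)
	if err != nil {
		return false, err
	}
	bb, err := json.Marshal(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(ab, bb), nil
}

func main() {
	legacy := playlist{Name: "tv", Items: []string{"dash-1"}}
	unified := playlist{Name: "tv", Items: []string{"dash-1"}}
	eq, err := sameObject(legacy, unified)
	fmt.Println(eq, err)
}
```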
Alerting: fix preserving errors in the alert rule state during error to error transitions
Alert state transition from one error to another did not update state.Error correctly.
The error in state.Error remained as the initial error encountered.
This led to another issue, where after a Grafana restart, the error was lost because
the state of the alert rule did not change, but the Error is not preserved in the database
between restarts.
This could happen if the expression service returned an error or the alert routine panicked
during querying.
* create a new table for migration resources
* remove raw result bytes from db
* more snapshot resource management stuff
* integrate new table with snapshots
* pass in result limit and offset as params
* combine create and update
* set up xorm store test
* add unit tests
* save some cpu
* remove unneeded arg
* regen swagger
* fix bug with result processing
* fix update create logic so that uid isn't required for lookup
* change offset to page
* regen swagger
* revert accidental changes to file
* curl command page should be 1 indexed
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Dan Cech <dcech@grafana.com>
Co-authored-by: Andres Martinez Gotor <andres.martinez@grafana.com>
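The switch from offset to 1-indexed pages mentioned above implies a conversion along these lines. This is a minimal sketch; the function name `pageToOffset` and the validation rules are assumptions.

```go
package main

import "fmt"

// pageToOffset converts a 1-indexed page number into a row offset.
// Rejecting page < 1 is an assumption; a real API might clamp instead.
func pageToOffset(page, limit int) (int, error) {
	if page < 1 {
		return 0, fmt.Errorf("page must be >= 1, got %d", page)
	}
	if limit < 0 {
		return 0, fmt.Errorf("limit must be >= 0, got %d", limit)
	}
	return (page - 1) * limit, nil
}

func main() {
	offset, err := pageToOffset(3, 100)
	fmt.Println(offset, err)
}
```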
* add regex support for api tests
* revert dumb thing
* add api tests
* add unit test for core async workflow
* add xorm store unit tests
* fix typo
* remove unnecessary assignment
* expose ngalert API to public
* add delete action to time-intervals
* introduce time-interval model generated by app-platform-sdk from CUE model. The fields of the model are chosen to be compatible with the current model
* implement api server
* add feature flag alertingApiServer
---- Test Infra
* update helper to support creating custom users with enterprise permissions
* add generator for Interval model
* Simple replace of State.Resolved with State.ResolvedAt
* Retain ResolvedAt time between Normal->Normal transition
* Introduce ResolvedRetention to keep sending recently resolved alerts
* Make ResolvedRetention configurable with resolved_alert_retention
* Tick-based LastSentAt for testing of ResendDelay and ResolvedRetention
* Do not reset ResolvedAt during Normal->Pending transition
Initially this was done to be in line with Prom ruler. However, Prom ruler
doesn't keep track of Inactive->Pending/Alerting using the same alert instance,
so it's more understandable that they choose not to retain ResolvedAt. In our
case, since we use the same cached instance to represent the transition, it
makes more sense to retain it.
This should help alleviate some odd situations where temporarily entering
Pending will stop future resolved notifications that would have happened
because of ResolvedRetention.
* Pointers for ResolvedAt & LastSentAt
To avoid awkward time.Time{}.Unix() defaults on persist
* Zanzana: Use grafana migrations to run openFGA migration files and initialize store.
* Add feature toggle
* Zanzana: return noop client if feature toggle is disabled
* add new apis
* add payloads
* create snapshot status type
* add some impl
* finish implementing update
* start implementing build snapshot func
* add more fake build logic
* add cancel endpoint. do some cleanup
* implement GetSnapshot
* implement upload snapshot
* merge onprem status with gms result
* get it working
* update comment
* rename list endpoint
* add query limit and offset
* add helper method to snapshot
* little bit of cleanup
* work on swagger annotations
* manual merge
* generate swagger specs
* clean up curl commands
* fix bugs found during final testing
* fix linter issue
* fix unit test
This adds a version of the SQLStore that includes a ReadReplica. The primary DB can be accessed directly - from the caller's standpoint, there is no difference between the SQLStore and ReplStore unless they wish to explicitly call the ReadReplica() and use that for the DB sessions.
Currently only the stats service GetSystemStats and GetAdminStats are using the ReadReplica(); if it's misconfigured or if the databaseReadReplica feature flag is not turned on, it will fall back to the usual (SQLStore) behavior.
Testing requires a database and read replica - the replication should already be configured. I have been testing this locally with a docker mysql setup (https://medium.com/@vbabak/docker-mysql-master-slave-replication-setup-2ff553fceef2) and the following config:
[feature_toggles]
databaseReadReplica = true
[database]
type = mysql
name = grafana
user = grafana
password = password
host = 127.0.0.1:3306
[database_replica]
type = mysql
name = grafana
user = grafana
password = password
host = 127.0.0.1:3307
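The fallback behavior described above can be sketched as follows. This is a simplified model, not the actual ReplStore: the `DB` type, the field names, and the `featureEnabled` flag are assumptions standing in for the real store and the `databaseReadReplica` toggle.

```go
package main

import "fmt"

// DB is a hypothetical placeholder for a database handle.
type DB struct{ Name string }

// ReplStore is a simplified model of a store with an optional read replica.
type ReplStore struct {
	primary        *DB
	replica        *DB  // nil when the replica is missing or misconfigured
	featureEnabled bool // stands in for the databaseReadReplica toggle
}

// DB returns the primary, so existing callers behave as before.
func (rs *ReplStore) DB() *DB { return rs.primary }

// ReadReplica returns the replica when available, else the primary.
func (rs *ReplStore) ReadReplica() *DB {
	if !rs.featureEnabled || rs.replica == nil {
		return rs.primary
	}
	return rs.replica
}

func main() {
	primary := &DB{Name: "primary:3306"}
	replica := &DB{Name: "replica:3307"}

	on := &ReplStore{primary: primary, replica: replica, featureEnabled: true}
	off := &ReplStore{primary: primary, replica: replica, featureEnabled: false}
	fmt.Println(on.ReadReplica().Name, off.ReadReplica().Name)
}
```

Only callers that explicitly ask for `ReadReplica()` are affected, which matches the opt-in use by the stats service described above.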
* keep config in a separate struct in LDAP
* implement reload function for LDAP
* remove param from sso service constructor
* update unit tests
* add feature flag
* remove nil params
* address feedback
* add unit test for disabled config
* Fix restoring dashboard to root folder
* use a root folder representation instead of nil
* change root folder by general folder
---------
Co-authored-by: Ezequiel Victorero <ezequiel.victorero@grafana.com>
* Zanzana: Initial work to run zanzana as embedded or standalone
* Add addr settings for when remote client is used.
* sync dependencies
* Lock mysql driver version
---------
Co-authored-by: Dan Cech <dcech@grafana.com>
* add function to search for free port
* Update pkg/tsdb/influxdb/fsql/fsql_test.go
Co-authored-by: Dave Henderson <dave.henderson@grafana.com>
* Update pkg/tsdb/influxdb/fsql/fsql_test.go
Co-authored-by: Dave Henderson <dave.henderson@grafana.com>
* fix test
* fix go linting issue
* fix go lint
---------
Co-authored-by: Dave Henderson <dave.henderson@grafana.com>
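A common way to find a free port for tests is to listen on port 0 and read back the port the OS assigned. The sketch below assumes that approach; the helper name `freePort` is hypothetical and the actual test helper may differ.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the OS for an unused TCP port on the loopback interface.
// The port is released before returning, so another process could claim it
// in the meantime; this is usually acceptable for tests.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not find a free port:", err)
		return
	}
	fmt.Println("free port:", port)
}
```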
* WIP
* Add barchart panel with scenes
* Fix timerange in barchart panel
* Refactor: component names
* Remove not used css styles and rename panel title
* Remove unnecessary HistoryEventsListObject class and update text in labels filter
* add padding top for filter
* Add translations
* update limit labels constant
* Update showing state reason
* Fix scene object
* Address review comments
* Update icons
* use endpoints instead of the autogenerated hook
* Address review comments
* Add tooltip for alert name
* use private polling interval
* fix autogenerated translations
* Address pr review comments
* Address review comments
* Update text in placeholder
* Rename variable and remove spaces in Trans children
* Fix several broadcaster data races and error handling
- Separate concerns between sender and receiver sides in channel usage
- broadcaster: Fix data race between Subscribe/Unsubscribe and start
- Fix Subscribe error to be io.EOF when broadcaster is terminated
- Fix Watch never unsubscribing
- General cleanup
- fix usage of context
- add a huge amount of documentation about channels
* Add TracedClient
* Handle errors and status codes
* Wire up tracing to normal ASH and loki annotation mapping
* Add tracing to remote alertmanager
* one more spot
* and not or
* More consistency with other grafana traces, lower cardinality name
* chore(perf): Pre-allocate where possible (enable prealloc linter)
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
* fix TestAlertManagers_buildRedactedAMs
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
* prealloc a slice that appeared after rebase
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
---------
Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
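The kind of change the prealloc linter asks for looks roughly like this sketch, where the slice capacity is set up front because the final length is known. The `user` type and `names` function are hypothetical.

```go
package main

import "fmt"

// user is a hypothetical input type for the example.
type user struct{ Login string }

// names pre-allocates the result with the known capacity, so append never
// has to grow the backing array while the loop runs.
func names(users []user) []string {
	out := make([]string, 0, len(users))
	for _, u := range users {
		out = append(out, u.Login)
	}
	return out
}

func main() {
	got := names([]user{{Login: "a"}, {Login: "b"}, {Login: "c"}})
	fmt.Println(got, len(got), cap(got))
}
```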
* Pass prometheus registerer to the dual writer
* Fix tests
* Remove unused var
* Fix tests
* Uncomment test
* Remove leading line
* Fix tests. Reuse registerer if there's already one
* Lint
* Improve double registering logic
* Rebase main
* rename some stuff
* more renaming
* clean up api
* rename more functions
* rename cms -> gms
* update comment
* update swagger gen
* update endpoints
* overzealous
* final touches
* dont modify existing migrations
* break structs into domain and dtos
* add some conversion funcs
* fix build
* update frontend
* try to make swagger happy
* resolve action sets when GetPermissions is called
* a fix to ensure that dashboard permissions that override parent folder permissions are displayed on top of the inherited permission
* linting
* linting pt2
Improves log line to help with debugging in Server Side Expressions. In particular, the traceId, datasourceType, and datasourceUid will now be included.
* include and resolve action sets when fetching user's permissions
* expand both action and action prefix (returns an empty set for the one that isn't specified)
Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
* if action is specified, check for exact match; also extend tests
* make the config sync happen on each call to ApplyConfig(), fix tests
* send autogen config
* add fake autogen function for tests
* update stale comments, tidy things up, make linter happy
* add auto-gen routes only if the feature toggle is enabled
* remove unnecessary fake autogen function
* throttle configuration syncs
* restore pkg/services/store/entity/sqlstash/sql_storage_server.go
* test sync loop in ApplyConfig, skip invalid autogen routes
* restore conf/defaults.ini
* restore conf/defaults.ini
* avoid skipping invalid auto-gen routes in SaveAndApplyConfig
* test that autogenFn is called and its errors are returned
* add debug message about the sync interval not having elapsed
* collapse two log lines into one
* Docs: Update "Configure high availability" guide with ha_reconnect_timeout configuration
---------
Co-authored-by: Christopher Moyer <35463610+chri2547@users.noreply.github.com>
* Make MakeDependencyError public for tests in another package
* Create tests for errors in eval results
* Extract logic to pull frame errors out into exported function
* Maybe we can drop cyclomatic complexity lint suppression now?
* extract frame errors and fail recording rules if frames contain error
* Fix up retry logic to actually work
* Do not retry non retryable errors
* add root and client certificate value fields for LDAP
* update error messages for connection error
* add LDAP fallback strategy for SSO settings service
* fix params for sso service provider
* fix params for sso service provider
* sort imports
* sort imports
* replace json.Number with int64 in config map
* remove type assertions
* add test for the bug
* remove unused struct
* update db store to post process filters by group using go-lang's case-sensitive string comparison
--------
Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
* iam-716 - prevent a folder move operation when the folder's uid or any of its parents uids begin with k6-app
* fix folder move check and only list non-k6 folders to users
* adding tests for moving
* add a test for listing folders
* fix the other tests
* use method that adds folder parent
---------
Co-authored-by: IevaVasiljeva <ieva.vasiljeva@grafana.com>
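The move check described above can be sketched as a prefix test over the folder UID and its ancestors. The `k6-app` prefix comes from the commit message; the function name `canMoveFolder` and the way parent UIDs are passed in are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// k6UIDPrefix is the UID prefix named in the commit message.
const k6UIDPrefix = "k6-app"

// canMoveFolder returns false when the folder itself or any of its parents
// has a UID that begins with the k6 prefix.
func canMoveFolder(uid string, parentUIDs []string) bool {
	if strings.HasPrefix(uid, k6UIDPrefix) {
		return false
	}
	for _, p := range parentUIDs {
		if strings.HasPrefix(p, k6UIDPrefix) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(canMoveFolder("team-a", []string{"root"}))
	fmt.Println(canMoveFolder("child", []string{"k6-app-main"}))
}
```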
* Add and fix tests for playlists in mode1
* Make etcd tests pass mode1 for now
* Fix mode1 and add more tests for playlists in mode 1
* Remove repeated test
* Fix test setup