* Use context in aws DescribeRegionsWithContext
Currently DescribeRegions is used, which doesn't allow cancelling the
request when the context is cancelled. DescribeRegionsWithContext is the
preferred call.
* Fix context variable
* Revert GetRegionsWithContext to GetRegions
GetRegions is not an AWS SDK method, so it does not need to be renamed to
add context support. The existing name is enough.
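A minimal sketch of why the context-aware call matters. It uses a simplified stand-in for the EC2 client rather than the SDK itself: the real DescribeRegionsWithContext takes an input struct and returns an output struct, and `regionLister`, `slowClient` and `runDemo` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// regionLister is a hypothetical, simplified stand-in for the EC2 client.
type regionLister interface {
	DescribeRegionsWithContext(ctx context.Context) ([]string, error)
}

// slowClient simulates a request that needs `delay` to finish,
// unless the context ends first.
type slowClient struct{ delay time.Duration }

func (c slowClient) DescribeRegionsWithContext(ctx context.Context) ([]string, error) {
	select {
	case <-time.After(c.delay):
		return []string{"us-east-1", "eu-west-1"}, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// GetRegions keeps its name; only the context is threaded through.
func GetRegions(ctx context.Context, c regionLister) ([]string, error) {
	return c.DescribeRegionsWithContext(ctx)
}

// runDemo issues a slow request with a short deadline and returns the error.
func runDemo() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	_, err := GetRegions(ctx, slowClient{delay: time.Second})
	return err
}

func main() {
	err := runDemo()
	fmt.Println("deadline exceeded:", errors.Is(err, context.DeadlineExceeded))
}
```

With a call that takes no context, the caller would have to wait for the full request even after the deadline passed.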
expose apiserver metrics
Add a route to the apiserver metrics on a new endpoint, `/apiserver-metrics`. This requires a signed-in user but otherwise ignores the MetricsEndpoint-related configuration; that will come in a following PR.
* Add proxy service template
* Replace SA srv with proxy for external SA srv
* Move service account prefix to a constant
* Prevent deletion from external service account
* Make SA validation a reusable function
* Add protection for creating service accounts
* Add protection when updating service accounts
* Add IsExternal field for service account
* Protect ext service account token generation
* Add verbose errors for form name or sa name
* add tests
* Add logs
* Adjusts tests
---------
Co-authored-by: Misi <mgyongyosi@users.noreply.github.com>
Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
* Alerting: Move `ExternalAlertmanager` to its own package
This avoids import cycles when using components from other packages. In addition, I've created an `Options` approach for the multiorg alertmanager to allow us to override how per-tenant alertmanagers are created.
* switch things around
* address review comments
* fix references and warnings
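The `Options` approach mentioned above could look roughly like the functional-options sketch below. All type and function names here (`MultiOrg`, `Option`, `WithFactory`, `localAM`, `remoteAM`) are hypothetical and are not taken from the change itself.

```go
package main

import "fmt"

// Alertmanager is a hypothetical per-tenant interface.
type Alertmanager interface{ Name() string }

type localAM struct{ orgID int64 }

func (a localAM) Name() string { return fmt.Sprintf("local-%d", a.orgID) }

type remoteAM struct{ orgID int64 }

func (a remoteAM) Name() string { return fmt.Sprintf("remote-%d", a.orgID) }

// factory builds the Alertmanager for a single org.
type factory func(orgID int64) (Alertmanager, error)

// MultiOrg holds the factory used to create per-tenant Alertmanagers.
type MultiOrg struct{ newAM factory }

// Option mutates a MultiOrg during construction.
type Option func(*MultiOrg)

// WithFactory overrides how per-tenant Alertmanagers are created.
func WithFactory(f factory) Option {
	return func(m *MultiOrg) { m.newAM = f }
}

// NewMultiOrg applies the options on top of a default local factory.
func NewMultiOrg(opts ...Option) *MultiOrg {
	m := &MultiOrg{newAM: func(orgID int64) (Alertmanager, error) {
		return localAM{orgID: orgID}, nil
	}}
	for _, opt := range opts {
		opt(m)
	}
	return m
}

// ForOrg creates the Alertmanager for one org using the configured factory.
func (m *MultiOrg) ForOrg(orgID int64) (Alertmanager, error) {
	return m.newAM(orgID)
}

func main() {
	def, _ := NewMultiOrg().ForOrg(1)
	custom, _ := NewMultiOrg(WithFactory(func(orgID int64) (Alertmanager, error) {
		return remoteAM{orgID: orgID}, nil
	})).ForOrg(2)
	fmt.Println(def.Name(), custom.Name())
}
```

Callers that do not pass an option keep the default behaviour, so existing call sites would not need to change.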
* initial commit for PromQAIL
* add feature toggle and start button
* add drawer
* set up drawer and state
* fix styles and start the conditional text display
* add data info list going to ai
* add logos and style
* metric display table style, need to make responsive
* make feature toggle frontend only
* add logic for want prompt for list or not, add helpers, addquerysuggestion type
* make query suggestion component
* add buttons to add or refine prompt
* refactor logic to add interactions to have multiple AI or historical interactions
* refactor and enable multiple questions, all flow
* add colorful AI icon to drawer open button
* fix linting
* refactor for hooking up promQail app and only giving one suggestion
* design fixes
* fix next prompt button styling
* historical suggestions give us 5, fixed that and some design things
* hook up the api, provide defense filler if it's down, refactor lots
* use query, fix linting
* add metadata to explain for ai suggestions
* styling fixes
* give metadata for historical suggestions by parsing query on the fly
* no prompt field to query-suggestion endpoint if prompt is empty
* fix linting
* use suggest rte for historical list, fix long code style
* fix historical bug
* added prompt file
* updated llm logic in explainer helper
* bump @grafana/experimental from 1.7.0 to 1.7.2
* use llmservice and vectorservice
* cleanup prompts + streaming explainer
* promqail feature toggle: fix re-order
* PromQL non-llm fallback recommendation logic (#75469)
* added template recommendation logic directly in helpers
* also added selected labels to recommendation
* PromQail: query gen: fix prompt formatting and fetch metric labels to be used (#75450)
* PromQail: query gen: fix prompt formatting and fetch metric labels to be used
* Code fixes as suggested
* Use newly decided collection name for promql templates
* Prometheus: Promqail tests and bug fixes (#75852)
* add tests for drawer
* refine one prompt at a time, fix css
* scroll into view on interaction change
* fix styles for light
* disable prompt input after getting suggestions for that interaction
* make buttons disappear after selecting refine prompt or show historical queries to prevent user from clicking many times
* fix border radius
* fix new eslint rule about css requiring objects and not template literals
* add scrollIntoView for test
* grafana_prometheus_promqail_explanation_feedback - add feedback rudderstack interaction for explanation
* add form link to feedback for query suggestions
* fix bugs
* for prettier
* PromQL Builder Explainer: Added promql documentation and updated prompt (#75593)
* added promql documentation and updated prompt
* refactor prompt generation into isolated function
* updated prompt to answer with a question
* removed commented code
* updated metadata logic
* updated documentation body logic
* Prometheus: PromQAIL UI fixes (#76654)
* align buttons at 16px
* only autoscroll when an interaction has been added or the suggestions have been updated
* add 12px below explain for suggested queries
* add . after suggestion number
* fix linting error
* Prometheus: PromQAIL feedback improvements (#76711)
* align buttons at 16px
* only autoscroll when an interaction has been added or the suggestions have been updated
* add 12px below explain for suggested queries
* add . after suggestion number
* add text indication for explanation feedback
* add form for suggestion feedback, add form for not helpful explanation feedback
* fix linting error
* make radio button feedback required
* required text, padding additions, thank you for your feedback
* PromQL Builder Suggestion: Added type level templates and removed explainer steps for fallback suggestion logic (#75764)
* adding more detailed templates to promql fallback suggest
* remove debug logs
* added missing explain logic
* Fix Brendan's type issue
---------
Co-authored-by: Brendan O'Handley <brendan.ohandley@grafana.com>
Co-authored-by: bohandley <brendan.ohandley@gmail.com>
* make yarn.lock equal to current in main
* fix feature toggles
* fix prettier issues
---------
Co-authored-by: Edward Qian <edward.qian@grafana.com>
Co-authored-by: Yasir Ekinci <yas.ekinci@grafana.com>
Co-authored-by: Edward Qian <edward.c.qian@gmail.com>
Co-authored-by: Gerry Boland <gerboland@users.noreply.github.com>
* Alerting: Move migration from background service run to ngalert init
SQLite write contention between the migration's single transaction and
dashboard provisioning's frequent commits was causing the migration to
fail with SQLITE_BUSY/SQLITE_BUSY_SNAPSHOT on all retries.
This is not a new issue for sqlite+grafana, but the difference in
transaction length made the failure very consistent. In addition,
since a failed migration undermines the assumed correctness of the
alertmanager and alert rule definition state, we shut the server down on
error. This can make e2e tests, as well as some high-load provisioned
sqlite installations, flaky on startup.
The correct fix for this is better transaction management across various
services and is out of scope for this change as we're primarily interested in
mitigating the current bout of server failures in e2e tests when using sqlite.
* introduce data source admin role and fix frontend check
* introduce fixed roles for data source creator and team reader
* add documentation
* undo an unintended change
* Alerting: post alerts to the remote Alertmanager and fetch them
* fix broken tests
* Alerting: Add Mimir Backend image to devenv (blocks)
* add alerting as code owner for mimir_backend block
* Alerting: Use Mimir image to run integration tests for the remote Alertmanager
* skip integration test when running all tests
* skipping integration test when no Alertmanager URL is provided
* fix bad host for mimir_backend
* remove basic auth testing until we have an nginx image in our CI
* add integration tests for alerts
* fix tests
* change SendCtx -> Send, add context.Context to Send, fix CI
* add recover() for functions from the Prometheus Alertmanager HTTP client that could panic
* add TODO to implement PutAlerts in a way that mimics what Prometheus does
* fix log format
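A minimal sketch of the recover() guard described above, assuming the wrapped call is any function that returns an error. `safeCall` is a hypothetical helper name, not the one used in the change.

```go
package main

import (
	"errors"
	"fmt"
)

// safeCall runs fn and converts a panic into a regular error so the
// caller can log it instead of crashing.
func safeCall(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return fn()
}

func main() {
	// A call that panics is reported as an error.
	fmt.Println(safeCall(func() error { panic("boom") }))

	// A call that fails normally keeps its own error.
	fmt.Println(safeCall(func() error { return errors.New("plain failure") }))

	// A successful call returns nil.
	fmt.Println(safeCall(func() error { return nil }))
}
```

The named return value is what lets the deferred function replace the result after a panic.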
* Move rotate logic into its own function
* Move oauth token sync to session client
* Add user to the local cache if refresh tokens are not enabled for the provider so we can skip the check in other requests
* feat: add cost management to admin and put adaptive metrics and log volume under it
* test: fix applinks test
* chore: fix lint error
* remove "new" from feature toggle description
---------
Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
* Update cue to have an AuthProvider entry
* Cable the new auth provider
* Add feature flag check to the accesscontrol service
* Fix test
* Change the structure of externalServiceRegistration (#76673)
* Add teamHeaders for datasource proxy requests
* adds validation for the teamHeaders
* added tests for applying teamHeaders
* remove previous implementation
* validation for header values being set to authproxy
* removed unnecessary checks
* newline
* Add middleware for injecting headers on the data source backend
* renamed feature toggle
* Get user teams from context
* Fix feature toggle name
* added test for validation of the auth headers and fixed evaluation to cover headers
* renaming of teamHeaders to teamHTTPHeaders
* use of header set for non-existing header and add for existing headers
* moves types into datasources
* fixed unchecked errors
* Refactor
* Add tests for data model
* Update pkg/api/datasources.go
Co-authored-by: Victor Cinaglia <victor@grafana.com>
* Update pkg/api/datasources.go
Co-authored-by: Victor Cinaglia <victor@grafana.com>
---------
Co-authored-by: Alexander Zobnin <alexanderzobnin@gmail.com>
Co-authored-by: Victor Cinaglia <victor@grafana.com>
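The "set for non-existing header and add for existing headers" rule from the list above can be sketched with the standard library alone. `applyTeamHeaders` and the header names are hypothetical; the real middleware and data model are not shown in this log.

```go
package main

import (
	"fmt"
	"net/http"
)

// applyTeamHeaders sets a header when the request does not carry it yet
// and appends a value when it already exists.
func applyTeamHeaders(h http.Header, teamHeaders map[string]string) {
	for name, value := range teamHeaders {
		if len(h.Values(name)) == 0 {
			h.Set(name, value)
		} else {
			h.Add(name, value)
		}
	}
}

func main() {
	h := http.Header{}
	h.Set("X-Example-Policy", "existing") // hypothetical header name

	applyTeamHeaders(h, map[string]string{
		"X-Example-Policy": "from-team",
		"X-Example-Other":  "new",
	})

	fmt.Println(h.Values("X-Example-Policy"))
	fmt.Println(h.Values("X-Example-Other"))
}
```

Using Add for existing headers keeps the value that was already on the request instead of overwriting it.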
* Alerting: Use Mimir image to run integration tests for the remote Alertmanager
* skip integration test when running all tests
* skipping integration test when no Alertmanager URL is provided
* fix bad host for mimir_backend
* remove basic auth testing until we have an nginx image in our CI
* update with sdk
* do sql
* fix core plugins
* fix proxy settings
* bump SDK version
* tidy
* enable pdc for test
* add codeowners
* bump dep
* go mod tidy
* bump SDK
* Replace FixedRoleUID function with a common function to generate these prefixes
* Use common function to generate prefixed uid for external service accounts
Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>
---------
Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>
fetch fresh permissions for global in AuthorizeInOrgMiddleware
Update pkg/services/accesscontrol/authorize_in_org_test.go
do not load viewer permissions in global ID
* update data migration to update rows that have changes
* fix migration for sqlite
* remove id; fix postgres
* Fix for MySQL
* delete old items from folder table
* change integer to boolean
---------
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
* Add images
* Basic button functionality; TODO placeholders for dispatching contentOutlineToggle and rendering content outline component
* Basic content outline container
* Content outline toggles
* Remove icon files from explore
* Scroll into view v1
* Outline that reflects Explore's order of vizs
* Update icon name
* Add scrollId to PanelChrome; scrolling enabled for Table
* Add queries icon
* Improve scroll behavior in split view
* Add wrapper so the sticky navigation doesn't scroll when on the bottom of the window
* Fix the issue with logs gap; center icons
* Memoize register and unregister functions; adjust content height
* Make displayOrderId optional
* Use Node API for finding position of panels in content outline; add tooltip
* Dock content outline in expanded mode; add tooltip to toggle button
* Handle content outline visibility from Explore and not redux; pass outlineItems as a prop
* Fix ContentOutline test
* Add interaction tracking
* Add padding to fix test
* Replace string literals with objects for styles
* Update event reporting payloads
* Custom content outline button; content outline container improvements
* Add aria-expanded to content outline button in ExploreToolbar
* Fix vertical and horizontal scrolling
* Add aria-controls
* Remove unnecessary css since ExploreToolbar is sticky
* Update feature toggles; Fix typos
* Make content outline button more prominent in split mode; add padding to content outline items
* Diego's UX updates
* WIP: some scroll fixes
* Fix test and type error
* Add id to ContentOutline to differentiate in split mode
* No default exports
---------
Co-authored-by: Giordano Ricci <me@giordanoricci.com>
* Use apache/arrow v13
* remove apache/thrift
* go mod tidy with go1.21.1
* add metrics team as owner
---------
Co-authored-by: Kyle Brandt <kyle@grafana.com>
* Added spans to trace.go
* Added spans to search_stream.go
* Added spans to parca datasource
* Added spans for pyroscope
* Fix tests
* Fix another test
* Lint
* Revert "Fix another test"
This reverts commit a1639049e3.
* Use grafana-sdk-go tracing
* Fix migration of custom dashboard permissions
Dashboard alert permissions were determined by both its dashboard and
folder scoped permissions, while UA alert rules only have folder
scoped permissions.
This means, when migrating an alert, we'll need to decide if the parent folder
is a correct location for the newly created alert rule so that users, teams,
and org roles have the same access to it as they did in legacy.
To do this, we translate both the folder and dashboard resource
permissions to two sets of SetResourcePermissionCommands. Each of these
encapsulates a mapping of all:
OrgRoles -> Viewer/Editor/Admin
Teams -> Viewer/Editor/Admin
Users -> Viewer/Editor/Admin
When the dashboard permissions (including those inherited from the parent
folder) differ from the parent folder permissions alone, we need to create a
new folder to represent the access-level of the legacy dashboard.
Compromises:
When determining the SetResourcePermissionCommands we only take into account
managed and basic roles. Fixed and custom roles introduce significant complexity
and synchronicity hurdles. Instead, we log a warning that they had the
potential to override the newly created folder permissions.
Also, we don't attempt to reconcile datasource permissions that were
not necessary in legacy alerting. Users without access to the necessary
datasources to edit an alert rule will need to obtain said access separately
from the migration.
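A rough sketch of the "does this dashboard need its own folder" decision described above. It assumes that inherited and dashboard-level permissions combine by taking the higher level per subject; the log does not confirm that. The `perms` type, the subject keys and `needsNewFolder` are all hypothetical.

```go
package main

import "fmt"

// level is a hypothetical ordering of access levels.
type level int

const (
	none level = iota
	view
	edit
	admin
)

// perms maps a subject (org role, team or user) to its access level.
type perms map[string]level

// merge combines folder and dashboard permissions, keeping the higher
// level per subject (an assumption made for this sketch).
func merge(folder, dashboard perms) perms {
	out := perms{}
	for subject, l := range folder {
		out[subject] = l
	}
	for subject, l := range dashboard {
		if l > out[subject] {
			out[subject] = l
		}
	}
	return out
}

// equal reports whether two permission sets grant the same access.
func equal(a, b perms) bool {
	if len(a) != len(b) {
		return false
	}
	for subject, l := range a {
		if bl, ok := b[subject]; !ok || bl != l {
			return false
		}
	}
	return true
}

// needsNewFolder is true when the dashboard's effective permissions
// differ from those of its parent folder alone.
func needsNewFolder(folder, dashboard perms) bool {
	return !equal(merge(folder, dashboard), folder)
}

func main() {
	folder := perms{"role:Editor": edit}
	fmt.Println(needsNewFolder(folder, perms{}))
	fmt.Println(needsNewFolder(folder, perms{"team:7": view}))
}
```

Fixed and custom roles are left out here, matching the compromise described in the message.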
* Manage service account secrets
* Wip
* WIP
* WIP
* Revert to keep a light interface
* Implement SaveExternalService
* Remove unnecessary functions from the interface
* Remove unused field
* Better log
* Leave ext svc credentials out of the extsvcauth package for now
* Remove todo
* Add tests to SaveExternalService
* Test that secret has been removed from store
* Lint
* Nit.
* Rename commands and structs
Co-authored-by: Kalle Persson <kalle.persson@grafana.com>
* Account for PR feedback
Co-authored-by: Andres Martinez Gotor <andres.martinez@grafana.com>
* Linting
* Add nosec comment G101 - this is not a hardcoded secret
* Lowercase kvStoreType
---------
Co-authored-by: Kalle Persson <kalle.persson@grafana.com>
Co-authored-by: Andres Martinez Gotor <andres.martinez@grafana.com>
* Update origin annotation names
k8s does not support annotation names with multiple slashes in them, so this PR updates the origin annotations to match the format for updated and created annotations.
* fix tests
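For context on the annotation rename: a Kubernetes annotation key is an optional prefix and a name separated by a single slash. The sketch below checks only that one rule and ignores the length and character limits; the example keys are illustrative and `splitAnnotationKey` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAnnotationKey splits a key into its optional prefix and its name.
// It rejects keys with more than one slash or with an empty segment.
func splitAnnotationKey(key string) (prefix, name string, ok bool) {
	parts := strings.Split(key, "/")
	switch len(parts) {
	case 1:
		return "", parts[0], parts[0] != ""
	case 2:
		return parts[0], parts[1], parts[0] != "" && parts[1] != ""
	default:
		return "", "", false
	}
}

func main() {
	for _, key := range []string{
		"example.com/origin/name", // two slashes: rejected
		"example.com/originName",  // one slash: accepted
	} {
		prefix, name, ok := splitAnnotationKey(key)
		fmt.Printf("%q prefix=%q name=%q ok=%v\n", key, prefix, name, ok)
	}
}
```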
This PR replaces the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls.
It also fills in some gaps in the testing suite around:
- Migration of alert rules: verifying that the actual data model (queries, conditions) is correct 9a7cfa9
- Secure settings migration: verifying that secure fields remain encrypted for all available notifiers and certain fields migrate from plain text to encrypted secure settings correctly e7d3993
The checks for custom dashboard ACLs will be replaced in a separate targeted PR as it will be complex enough alone.
* Move errors to error file
* Move check for both empty username and email to user service
* Move check for empty email and username to user service Update
* Wrap inner error
* Set username in test