Commit Graph

6002 Commits

Author SHA1 Message Date
Todd Treece
bf8af608a7
K8s: Add tracer provider to config (#77003) 2023-10-25 15:19:44 -04:00
Ryan McKinley
7e069f9d91
K8s: Move the namespace mapper to the same package that resolves them (#77101) 2023-10-25 14:13:46 -04:00
Misi
1e81ffccac
Auth: Handle when access token has already been refreshed in OAuth token sync (#77118)
* Use singleflight to prevent logging error if the token has already been refreshed

* Change order of error checks

* align tests, change error name

* Change sf key

* Update based on the review

* refactor
2023-10-25 18:15:41 +02:00
Ryan McKinley
d2732ae726
K8s: Add explicit table converter (#77098) 2023-10-25 09:00:20 -07:00
Santiago
f9fc2e4568
Alerting: Remove ConfigHash() from the Alertmanager interface (#77134) 2023-10-25 17:11:53 +02:00
Alexander Weaver
6ee52ac80c
Alerting: Allow more time before Alertmanager expire-resolves alerts (#77094)
* Sync endsAt factor with prometheus

* Fix state tests
2023-10-25 10:03:46 -05:00
linoman
dff7403b29
auth: implement feature flag for service account proxy (#77129)
* add FlagExternalServiceAccounts to proxy service

* add FlagExternalServiceAccounts value to tests

---------

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2023-10-25 16:44:05 +02:00
Gilles De Mey
e12e40fc24
Alerting: Contact Points v2 part IV (#76063) 2023-10-25 15:57:53 +02:00
kay delaney
b215d2f0fb
Library Panels: Fix library panel creation with RBAC enabled (#76553) 2023-10-25 16:29:57 +03:00
Marcus Efraimsson
9bf7eb5fbc
Plugins: Adds logging around loading of plugins for better tracking (#76896) 2023-10-25 14:01:30 +02:00
Santiago
322a9c0b15
Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface (#77126)
Alerting: Replace FileStore() for CleanUp() in the Alertmanager interface
2023-10-25 13:58:28 +02:00
linoman
1bc81b7bd1
auth: migrate api interface implementation (#77040)
* expand serviceaccount service interface

* implement FakeServiceAccountService

* Replace SA service interface from api

* merge sa proxy tests with new fake service

* implement DeleteServiceAccountToken

* add test for DeleteServiceAccountToken
2023-10-25 12:40:30 +02:00
Santiago
01add144b8
Alerting: Send alerts to the remote Alertmanager (#77034)
* Alerting: Rename remote.ExternalAlertmanager to remote.Alertmanager

* Alerting: Send alerts to the remote Alertmanager

* add ticker to readiness check, add tests

* use options when creating a new sender.ExternalAlertmanager

* unexport defaultMaxQueueCapacity

* delete unused defaultConfig field

* add debug log line when sending alerts to the remote alertmanager

* move and refactor readiness check

* update tests to not include defaultConfig
2023-10-25 11:52:48 +02:00
Hugo Kiyodi Oshiro
dfc1875061
Plugins: Add managed instance installation resources (#76767)
* Plugins: Add configs to allow managed install

* Expose methods to use with cloud plugin installer

* Change plugins installer bind to OSS
2023-10-24 16:21:37 +02:00
Todd Treece
162a422f0a
K8s: Playlist apply fix (#76971) 2023-10-24 10:19:17 -04:00
Gabriel MABILLE
897e3a4dab
AuthN: Add metrics to external service accounts management (#76789)
* AuthN: Add metrics to external service accounts management

* Add a new metric to count stored external service accounts

* Update variable names

Co-authored-by: linoman <2051016+linoman@users.noreply.github.com>

* Add test to SearchOrgServiceAccounts

* Add feature flags checks before registering and using the metrics

---------

Co-authored-by: linoman <2051016+linoman@users.noreply.github.com>
2023-10-24 15:54:14 +02:00
Matias Chomicki
765defea1e
Loki Queries: Query Splitting enabled by default (#75876)
* Loki Query Splitting: enable by default

* Query splitting: add gdev dashboard

* Update testdata file

* Update devenv/dev-dashboards/datasource-loki/loki_query_splitting.json

Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>

* Revert "Update testdata file"

This reverts commit 5a891ba1f2.

* Update feature-toggles readme

---------

Co-authored-by: Ivana Huckova <30407135+ivanahuckova@users.noreply.github.com>
2023-10-24 16:09:30 +03:00
Alexander Zobnin
cad3c43bb1
Team LBAC: Move middleware to enterprise (#76969)
* Team LBAC: Move middleware to enterprise

* Remove ds proxy part

* Move utils to enterprise
2023-10-24 14:06:18 +03:00
Gabriel MABILLE
3015e5921f
Chore: Move extsvcaccounts package to serviceaccounts (#76977)
* Chore: Move extsvcaccounts package to serviceaccounts

* Fix proxy

* Fix tests

* Fix linting
2023-10-24 11:01:04 +02:00
Ieva
159bb3c032
RBAC: Allow scoping access to root level dashboards (#76987)
* correctly check permissions to list dashboards on the root

* correctly display the access inherited from general folder for dashboards

* Update pkg/services/sqlstore/permissions/dashboard.go

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>

* Update dashboard_filter_no_subquery.go

---------

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2023-10-24 11:55:38 +03:00
Sofia Papagiannaki
03a626f1d6
Search: Fix empty folder details for nested folder items (#76504)
* Introduce dashboard.folder_uid column

* Add data migration

* Search: Fix empty folder details for nested folders

* Set `dashboard.folder_uid` and update tests

* Add unique index

* lint

Ignore cyclomatic complexity of func
`(*DashboardServiceImpl).BuildSaveDashboardCommand`

* Fix search by folder UID
2023-10-24 10:04:45 +03:00
Todd Treece
949b3af1b2
K8s: Remove duplicate listener in production (#76583) 2023-10-23 21:42:10 +03:00
Kyle Brandt
59ef1558e8
Prometheus: (Chore) Switch to sdk tracing from infra tracing (#76975) 2023-10-23 13:11:12 -04:00
Ivan Ortega Alba
a03f9e7660
Feature toggle: Mark dashgpt as GA (#76304)
Co-authored-by: nmarrs <nathanielmarrs@gmail.com>
2023-10-23 09:39:12 -07:00
Alexander Weaver
39599fa7f7
Alerting: Alert rule constraint violations return as 400s in provisioning API (#76396)
Constraint violations become 400s
2023-10-23 10:28:40 -05:00
Sofia Papagiannaki
b04a014341
Chore: Fix failure when importing dashboard (#76947) 2023-10-23 18:16:46 +03:00
Kristin Laemmert
f166202e11
chore(grafana-apiserver): expose apiserver metrics endpoint (#76572)
expose apiserver metrics

Add a route to the apiserver metrics on a new endpoint, `/apiserver-metrics`. This requires a signed-in user but otherwise ignores the MetricsEndpoint-related configuration; that will come in a following PR.
2023-10-23 10:05:50 -04:00
Santiago
488a60aee6
Alerting: Rename remote.ExternalAlertmanager to remote.Alertmanager (#76956) 2023-10-23 15:37:14 +02:00
linoman
359d84799e
auth: add serviceaccount proxy (#76815)
* Add proxy service template

* Replace SA srv with proxy for external SA srv

* Move service account prefix to a constant

* Prevent deletion from external service account

* Make SA validation a reusable function

* Add protection for creating service accounts

* Add protection when updating service accounts

* Add IsExternal field for service account

* Protect ext service account token generation

* Add verbose errors for form name or sa name

* add tests

* Add logs

* Adjusts tests

---------

Co-authored-by: Misi <mgyongyosi@users.noreply.github.com>
Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2023-10-23 14:09:42 +02:00
Isabella Siu
ecbc52f515
CloudWatch: Update query batching logic (#76075) 2023-10-20 15:09:41 -04:00
Karl Persson
ed1c50233f
Revert "AuthN: move oauth token hook into session client" (#76882)
Revert "AuthN: move oauth token hook into session client (#76688)"

This reverts commit 455cede699.
2023-10-20 16:09:46 +02:00
Michael Mandrus
c3102c7d0a
Caching: Enable useCachingService feature toggle by default (#76845)
* enable by default

* update docs

* add helpful comment
2023-10-20 10:00:37 -04:00
gotjosh
866acbd5ac
Alerting: Move ExternalAlertmanager to its own package (#76854)
* Alerting: Move `ExternalAlertmanager` to its own package

We'll avoid import cycles when using components from other packages. In addition to that, I've created an `Options` approach for the multiorg alertmanager to allow us to override how per-tenant alertmanagers are created.

* switch things around

* address review comments

* fix references and warnings
2023-10-20 14:08:13 +02:00
Santiago
a60ec150f9
Alerting: Fetch receivers from remote Alertmanager (#76841)
* Alerting: fetch receivers from remote Alertmanager

* make linter happy

* change require.Eventually() timeout and tick
2023-10-20 11:34:17 +02:00
Steve Simpson
a0476741f2
Alerting: Fix HCL export for alerts with non-zero "for" field. (#76739)
* Alerting: Fix HCL export for alerts with non-zero "for" field.

Fixes #76734

* fix tests

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2023-10-20 11:09:08 +02:00
Brendan O'Handley
5580d06101
Prometheus: PromQAIL frontend, drawer, feature toggle, workflow, etc. (#73020)
* initial commit for PromQAIL

* add feature toggle and start button

* add drawer

* set up drawer and state

* fix styles and start the conditional text display

* add data info list going to ai

* add logos and style

* metric display table style, need to make responsive

* make feature toggle frontend only

* add logic for want prompt for list or not, add helpers, addquerysuggestion type

* make query suggestion component

* add buttons to add or refine prompt

* refactor logic to add interactions to have multiple AI or historical interactions

* refactor and enable multiple questions, all flow

* add colorful AI icon to drawer open button

* fix linting

* refactor for hooking up promQail app and only giving one suggestion

* design fixes

* fix next prompt button styling

* historical suggestions give us 5, fixed that and some design things

* hook up the api, provide defense filler if it's down, refactor lots

* use query, fix linting

* add metadata to explain for ai suggestions

* styling fixes

* give metadata for historical suggestions by parsing query on the fly

* no prompt field to query-suggestion endpoint if prompt is empty

* fix linting

* use suggest rte for historical list, fix long code style

* use suggest rte for historical list, fix long code style

* fix historical bug

* added prompt file

* updated llm logic in explainer helper

* bump @grafana/experimental from 1.7.0 to 1.7.2

* use llmservice and vectorservice

* cleanup prompts + streaming explainer

* promqail feature toggle: fix re-order

* PromQL non-llm failback recommendation logic (#75469)

* added template recommendation logic directly in helpers

* also added selected labels to recommendation

* PromQail: query gen: fix prompt formatting and fetch metric labels to be used (#75450)

* PromQail: query gen: fix prompt formatting and fetch metric labels to be used

* Code fixes as suggested

* Use newly decided collection name for promql templates

* Prometheus: Promqail tests and bug fixes (#75852)

* add tests for drawer

* refine one prompt at a time, fix css

* scroll into view on interaction change

* fix styles for light

* disable prompt input after getting suggestions for that interaction

* make buttons disappear after selecting refine prompt or show historical queries to prevent user from clicking many times

* fix border radius

* fix new eslint rule about css requiring objects and not template literals

* add scrollIntoView for test

* grafana_prometheus_promqail_explanation_feedback - add feedback rudderstack interaction for explanation

* add form link to feedback for query suggestions

* fix bugs

* for prettier

* PromQL Builder Explainer: Added promql documentation and updated prompt (#75593)

* added promql documentation and updated prompt

* refactor prompt generation into isolated function

* updated prompt to answer with a question

* removed commented code

* updated metadata logic

* updated documentation body logic

* Prometheus: PromQAIL UI fixes (#76654)

* align buttons at 16px

* only autoscroll when an interaction has been added or the suggestions have been updated

* add 12px below explain for suggested queries

* add . after suggestion number

* fix linting error

* Prometheus: PromQAIL feedback improvements (#76711)

* align buttons at 16px

* only autoscroll when an interaction has been added or the suggestions have been updated

* add 12px below explain for suggested queries

* add . after suggestion number

* add text indication for explanation feedback

* add form for suggestion feedback, add form for not helpful explanation feedback

* fix linting error

* make radio button feedback required

* required text, padding additions, thank you for your feedback

* PromQL Builder Suggestion: Added type level templates and removed explainer steps for fallback suggestion logic (#75764)

* adding more detailed templates to promql fallback suggest

* remove debug logs

* added missing explain logic

* Fix brendan's type issue

---------

Co-authored-by: Brendan O'Handley <brendan.ohandley@grafana.com>
Co-authored-by: bohandley <brendan.ohandley@gmail.com>

* make yarn.lock equal to current in main

* fix feature toggles

* fix prettier issues

---------

Co-authored-by: Edward Qian <edward.qian@grafana.com>
Co-authored-by: Yasir Ekinci <yas.ekinci@grafana.com>
Co-authored-by: Edward Qian <edward.c.qian@gmail.com>
Co-authored-by: Gerry Boland <gerboland@users.noreply.github.com>
2023-10-19 10:45:32 -05:00
Matthew Jacobson
c2efcdde09
Alerting: Fix flaky SQLITE_BUSY when migrating with provisioned dashboards (#76658)
* Alerting: Move migration from background service run to ngalert init

sqlite database write contention between the migration's single transaction and
dashboard provisioning's frequent commits was causing the migration to
fail with SQLITE_BUSY/SQLITE_BUSY_SNAPSHOT on all retries.

This is not a new issue for sqlite+grafana, but the discrepancy between the
length of the transactions was causing it to be very consistent. In addition,
since a failed migration has implications on the assumed correctness of the
alertmanager and alert rule definition state, we cause a server shutdown on
error. This can make e2e tests as well as some high-load provisioned
sqlite installations flaky on startup.

The correct fix for this is better transaction management across various
services and is out of scope for this change as we're primarily interested in
mitigating the current bout of server failures in e2e tests when using sqlite.
2023-10-19 10:03:00 -04:00
Ieva
94fec65192
RBAC: introduce a data source admin role (#75915)
* introduce data source admin role and fix frontend check

* introduce fixed roles for data source creator and team reader

* add documentation

* undo an unintended change
2023-10-19 14:36:41 +01:00
linoman
e06f7251d7
Add prefix for external service accounts (#76794)
* Add prefix for external service accounts
2023-10-19 13:06:09 +02:00
Giuseppe Guerra
48a1dae834
Plugins: Add contextual logger to streaming methods in ContextualLoggerMiddleware (#76761) 2023-10-19 11:52:50 +02:00
Santiago
61cb26711e
Alerting: Fetch alerts from a remote Alertmanager (#75844)
* Alerting: post alerts to the remote Alertmanager and fetch them

* fix broken tests

* Alerting: Add Mimir Backend image to devenv (blocks)

* add alerting as code owner for mimir_backend block

* Alerting: Use Mimir image to run integration tests for the remote Alertmanager

* skip integration test when running all tests

* skipping integration test when no Alertmanager URL is provided

* fix bad host for mimir_backend

* remove basic auth testing until we have an nginx image in our CI

* add integration tests for alerts

* fix tests

* change SendCtx -> Send, add context.Context to Send, fix CI

* add recover() for functions from the Prometheus Alertmanager HTTP client that could panic

* add TODO to implement PutAlerts in a way that mimics what Prometheus does

* fix log format
2023-10-19 11:27:37 +02:00
Alexander Weaver
acee3efcf9
Alerting: Use common StateReason values for NoData/Error mapped states (#76781)
Fix hardcoded state reasons
2023-10-18 17:26:41 -05:00
Eric Leijonmarck
17fe1d3fc7
Team LBAC: Refactor to use only the teamHeader json part (#76756)
* refactor: to check for feature toggle and for checking for jsonData field

* fix tests

* whitelisting of X-Prom-Label-Policy Header
2023-10-18 16:09:22 +01:00
Hugo Kiyodi Oshiro
43add83d1a
Plugins: Add feat toggle to install managed plugins (#75973) 2023-10-18 15:17:03 +02:00
Karl Persson
455cede699
AuthN: move oauth token hook into session client (#76688)
* Move rotate logic into its own function

* Move oauth token sync to session client

* Add user to the local cache if refresh tokens are not enabled for the provider so we can skip the check in other requests
2023-10-18 12:51:15 +02:00
Marcus Efraimsson
872386b427
Instrumentation: Log errors embedded within query data responses (#76285)
Fixes #76140

Co-authored-by: Giuseppe Guerra <giuseppe.guerra@grafana.com>
2023-10-18 11:59:36 +02:00
Ieva
1fc375855c
Chore: delete team related entries for an org after the org gets deleted (#76706)
* delete team related entries for an org after the org gets deleted

* fix tests

* one more test fix
2023-10-18 10:40:26 +01:00
Adam Bannach
de1ed216f4
Feat: Add cloud plugin cost management to admin section (#76547)
* feat: add cost management to admin and put adaptive metrics and log volume under it

* test: fix applinks test

* chore: fix lint error

* remove "new" from feature toggle description

---------

Co-authored-by: Ashley Harrison <ashley.harrison@grafana.com>
2023-10-17 17:15:51 +01:00
Todd Treece
863f25acf7
K8s: Add grafana-apiserver config (#76649)
Co-authored-by: Kristin Laemmert <mildwonkey@users.noreply.github.com>
2023-10-17 11:29:06 -04:00
Todd Treece
ec7ed11ea1
K8s: Logging improvements (#76646) 2023-10-17 10:44:23 -04:00