Commit Graph

1355 Commits

Author SHA1 Message Date
Santiago
d3bb9fbbaf
Alerting: Use only token for images in notifications (#70196)
* Alerting: Use only tokens for images in notifications

* update tests

* make linter and modfile validator happy
2023-06-21 20:53:45 -03:00
Santiago
ff9eff49bd
Alerting: Bump grafana/alerting and refactor the ImageStore/Provider to provide image URL/bytes (#70182)
* implement alerting.images.Provider interface in our ImageStore

* add URLExists() method to fakeConfigStore

* make linter happy

* update integration tests
2023-06-21 20:53:30 -03:00
Alexander Weaver
ce6f73bd32
Alerting: Add two missing tests which cover missing URLs for Loki state history (#70460)
Add two missing tests which cover individual missing URLs
2023-06-21 12:58:37 -05:00
George Robinson
8a13ee3cd4
Alerting: Add debug logs when saving instances is finished (#70447) 2023-06-21 14:19:04 +02:00
George Robinson
a1cb7319d5
Alerting: Update in app documentation for customizing message and subject (#70367) 2023-06-20 12:20:01 +02:00
George Robinson
815e98ed95
Alerting: Add debug logs for EndsAt timestamp (#70336)
This commit adds debug logs for previous_ends_at and next_ends_at
to state.go to help us debug issues where alerts are resolved in
Alertmanager due to expiration. This change is in response to a
support escalation where this information was needed but unavailable.
2023-06-20 12:13:38 +03:00
SatVeer Singh
1bfa3a0f1e
Chore: Replace go-multierror with errors package (#66432)
* code refactor and type assertions added to tests

* no-lint rule added for specific line
2023-06-19 12:29:45 +03:00
Jean-Philippe Quéméner
934ba1aaa1
Alerting: Rewrite range to instant queries if possible (#69976) 2023-06-16 19:55:49 +02:00
Yuri Tseretyan
842f33580e
SSE: Add functions that determine NodeType by UID and construct a data source struct from NodeType (#70106)
* add NodeTypeFromDatasourceUID and DataSourceModelFromNodeType()
* deprecate expr.DataSourceModel
* replace usages of IsDataSource with NodeTypeFromDatasourceUID
* replace usages of DataSourceModel with DataSourceModelFromNodeType()
2023-06-16 13:05:06 -04:00
Yuri Tseretyan
f1d47d18a8
Alerting: Sort RefIDs in error message returned by api.validateCondition (#70198)
sort RefIDs in error message
2023-06-15 18:37:30 -03:00
Yuri Tseretyan
b963defa44
Alerting: update rules POST API to validate query and condition only for rules that changed. (#68667)
* replace condition validation with just structural validation
* validate conditions of only new and updated rules
* add integration tests for rule update/delete API

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-06-15 13:33:42 -04:00
Yuri Tseretyan
baffe83da6
Alerting: Improve performance of cache.getOrCreate (#63909)
* move expansion of labels and annotations outside of mutex lock
* propagate struct but not pointer
2023-06-15 09:37:47 -04:00
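The locking change described in the commit above (#63909) can be sketched as follows. This is a minimal illustration, not Grafana's actual code: the `state` and `cache` types, the `expand` helper, and the `{{ value }}` placeholder are all hypothetical stand-ins. The point is only that the expensive expansion runs before the mutex is taken, and that the state is returned by value rather than by pointer.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// state and cache are hypothetical stand-ins, not Grafana's actual types.
type state struct {
	Labels map[string]string
}

type cache struct {
	mu     sync.Mutex
	states map[string]state
}

// expand stands in for label/annotation template expansion (the costly step).
func expand(templates map[string]string, value string) map[string]string {
	out := make(map[string]string, len(templates))
	for k, tmpl := range templates {
		out[k] = strings.ReplaceAll(tmpl, "{{ value }}", value)
	}
	return out
}

// getOrCreate expands templates before locking, so the critical section
// covers only the map lookup and insert. It returns the state by value.
func (c *cache) getOrCreate(key string, templates map[string]string, value string) state {
	labels := expand(templates, value) // no lock held here

	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.states[key]; ok {
		return s
	}
	s := state{Labels: labels}
	c.states[key] = s
	return s
}

func main() {
	c := &cache{states: map[string]state{}}
	s := c.getOrCreate("rule-1", map[string]string{"summary": "value is {{ value }}"}, "42")
	fmt.Println(s.Labels["summary"])
}
```

One trade-off of this shape, under these assumptions, is that the expansion result is discarded when the key already exists; in exchange, concurrent callers are not blocked while templates are expanded.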
George Robinson
f085e99d3c
Alerting: Add matchers metrics to Alertmanager (#69855) 2023-06-15 09:18:01 +01:00
Santiago
ff3e028a85
Alerting: Add image URI annotation only when there's an image (#69825)
* Alerting: Add image URI annotation only when there's an image

* fix function name (changed on main branch)
2023-06-09 10:59:24 -03:00
Matthew Jacobson
ba3994d338
Alerting: Repurpose rule testing endpoint to return potential alerts (#69755)
* Alerting: Repurpose rule testing endpoint to return potential alerts

This feature replaces the existing, no-longer-used Grafana ruler testing API endpoint /api/v1/rule/test/grafana. The new endpoint returns a list of potential alerts created by the given alert rule, including built-in and interpolated labels and annotations.

The key priority of this endpoint is that it is intended to be as true as possible to what would be generated by the ruler except that the resulting alerts are not filtered to only Resolved / Firing and ready to be sent.

This means that the endpoint will, among other things:

- Attach static annotations and labels from the rule configuration to the alert instances.
- Attach dynamic annotations from the datasource to the alert instances.
- Attach built-in labels and annotations created by the Grafana Ruler (such as alertname and grafana_folder) to the alert instances.
- Interpolate templated annotations / labels and accept allowed template functions.
2023-06-08 18:59:54 -04:00
Matthew Jacobson
0c688190f7
Alerting: Fix unique violation when updating rule group with title chains/cycles (#67868)
* Alerting: Fix unique violation when updating rule group with title chains/cycles

The uniqueness constraint for titles within an org+folder is enforced on every update within a transaction instead of on commit (deferred constraint). This means that there could be a set of updates that will throw a unique constraint violation in an intermediate step even though the final state is valid. For example, a chain of updates RuleA -> RuleB -> RuleC could fail if not executed in the correct order, or a swap of titles RuleA <-> RuleB cannot be executed in any order without violating the constraint.

The exact solution to this is complex and requires determining directed paths and cycles in the update graph, adding in temporary updates to break cycles, and then executing the updates in reverse topological order (see first commit in PR if curious).

This is not implemented here.

Instead, we choose a simpler solution that works in all cases but might perform more updates than necessary. This simpler solution makes a determination of whether an intermediate collision could occur and if so, adds a temporary title on all updated rules to break any cycles and remove the need for specific ordering.

In addition, we make sure diffs are executed in the following order: DELETES, UPDATES, INSERTS.
2023-06-08 18:51:50 -04:00
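The simpler solution described in the commit above (#67868) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the PR: the `titleUpdate` type, the function names, and the `tmp-<UID>` temporary title scheme are hypothetical, and the sketch assumes temporary titles never clash with real ones. It first checks whether any new title is currently held by another rule in the same batch, and if so routes every updated rule through a temporary title so that execution order no longer matters.

```go
package main

import "fmt"

// titleUpdate is a hypothetical type: one rule's rename within an org+folder.
type titleUpdate struct {
	UID      string
	OldTitle string
	NewTitle string
}

// needsTempTitles reports whether any new title is currently held by a
// different rule in the same batch, i.e. an intermediate unique-constraint
// violation is possible (chains such as A -> B -> C, or swaps A <-> B).
func needsTempTitles(updates []titleUpdate) bool {
	held := make(map[string]string, len(updates)) // current title -> UID
	for _, u := range updates {
		held[u.OldTitle] = u.UID
	}
	for _, u := range updates {
		if owner, ok := held[u.NewTitle]; ok && owner != u.UID {
			return true
		}
	}
	return false
}

// plan returns the writes to execute in order. When a collision is possible,
// every updated rule first gets a temporary title (assumed unique), then its
// final title. This may perform more writes than strictly necessary.
func plan(updates []titleUpdate) []titleUpdate {
	if !needsTempTitles(updates) {
		return updates
	}
	steps := make([]titleUpdate, 0, 2*len(updates))
	for _, u := range updates {
		steps = append(steps, titleUpdate{UID: u.UID, OldTitle: u.OldTitle, NewTitle: "tmp-" + u.UID})
	}
	for _, u := range updates {
		steps = append(steps, titleUpdate{UID: u.UID, OldTitle: "tmp-" + u.UID, NewTitle: u.NewTitle})
	}
	return steps
}

func main() {
	swap := []titleUpdate{
		{UID: "a", OldTitle: "RuleA", NewTitle: "RuleB"},
		{UID: "b", OldTitle: "RuleB", NewTitle: "RuleA"},
	}
	for _, s := range plan(swap) {
		fmt.Printf("%s: %q -> %q\n", s.UID, s.OldTitle, s.NewTitle)
	}
}
```

For the swap case, no ordering of the two direct renames is valid, which is why the sketch doubles the writes instead of trying to sort them.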
Will Browne
624777258b
Plugins: Refactor creation of plugin context to dedicated service (#66451)
* first pass

* fix tests

* return errs

* change signature

* tidy

* delete unnecessary fields from test

* tidy

* fix tests

* simplify

* separate error check in API

* apply nits
2023-06-08 13:59:51 +02:00
Horst Gutmann
f4c04d4055
Alerting: Update patch for #865 after #68898 (#890) 2023-06-06 13:38:37 +02:00
dsotirakis
f9c310dbaf
Require alert.notifications:write permissions to test receivers and templates (#865)
# Conflicts:
#	pkg/services/ngalert/api/authorization.go
2023-06-06 13:33:56 +02:00
Matthew Jacobson
c16f1f5e99
Alerting: Fix provisioned templates being ignored by alertmanager (#69485)
* Alerting: Fix provisioned templates being ignored by alertmanager

Template provisioning sets the template in cfg.TemplateFiles while a recent change
made it so that alertmanager reads cfg.AlertmanagerConfig.Templates instead.

This change fixes the issue on both ends, by having provisioning set both fields and
reverting the change on the alertmanager side so that it uses cfg.TemplateFiles.
2023-06-02 15:47:43 -04:00
Arati R
6cb1a5e368
Nested folders: Add alert rule counts and deletion to folder registry (#67259)
* Let alert rule service implement registry service
* Add count method to RuleStore interface
* Add implementation for deletion of alert rules
* Rename uid to folderUID in registry methods
* Check forceDeleteRule value for registry deletion
* Register alerting store with folder service
* Move folder test functions to separate package
* Add testing for alert rule counting, deletion
* Remove redundant count method
* Fix deleteChildrenInFolder signature
* Update pkg/services/ngalert/store/alert_rule.go
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
* Add tests for nested folder deletion
* Refactor TestIntegrationNestedFolderService
* Add rules store as parameter for alertng provider

---------

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
2023-06-02 16:38:02 +02:00
Ieva
d8b66d5c4b
RBAC: remove some IsDisabled checks (#69272)
* remove some access control IsDisabled() checks

* cleaning up tests

* update tests

* linting
2023-05-31 09:58:57 +01:00
Alexander Weaver
0ed5d3bdf2
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes" (#69265)
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)"

This reverts commit 72a187b0be.
2023-05-30 11:33:33 -05:00
Alexander Weaver
0f88b117dc
Alerting: Skip flaky test TestRouteGetRuleStatuses (#69258)
Skip TestRouteGetRuleStatuses
2023-05-30 09:48:02 -05:00
Santiago
72a187b0be
Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)
* (WIP) Refactor the ImageStore interface to work with our latest alerting repository

* update alerting package

* refactor, new URLExists method in ImageProvider

* tests for the new methods

* fix linter warnings

* use alertingImages as an alias for grafana/alerting/images

* logs about image uris and not found images

* nerf image not found logs

* extract duplicated code to getImageFromURI() method

* refactor getImageFromURI()

* add index on url

* add comment about migration log

* sync generated files
2023-05-30 11:25:55 -03:00
Ieva
d98813796c
RBAC: Remove legacy AC from HasAccess permission check (#68995)
* remove unused HasAdmin and HasEdit permission methods

* remove legacy AC from HasAccess method

* remove unused function

* update alerting tests to work with RBAC
2023-05-30 14:39:09 +01:00
Matthew Jacobson
97ae6ae6ef
Alerting: Fix flaky TestIntegrationUpdateAlertRules (#69106)
Prevents duplicate alert rule ids and 0 value for BaseInterval and IntervalSeconds
2023-05-25 16:00:06 -04:00
Yuri Tseretyan
b57ef1f2c7
Alerting: Fix TestIntegration_GetAlertRulesForScheduling to make sure rules are created in different org (#69088)
make sure rules are created in different org
2023-05-25 13:51:38 -04:00
Sladyn
a06a5a7393
Alerting: Improve log messages (#67688)
* Rename base logger and capitalize messages
* Remove cflogger from config.go
2023-05-25 18:55:01 +03:00
Yuri Tseretyan
e00260465b
Alerting: Fix provenance guard checks for Alertmanager configuration to not cause panic when comparing nested objects (#69009)
* fix current settings parsed as new
* replace map comparison with cmp.Diff and log the diff
2023-05-25 11:41:11 -04:00
Jean-Philippe Quéméner
5717d8954f
Alerting: Return empty list on export if no rules exist (#69023) 2023-05-25 14:12:18 +02:00
Ieva
4980b64274
RBAC: Remove legacy ac from authorization middleware (#68898)
remove legacy AC fallback from RBAC middleware, and some unused auth logic
2023-05-24 09:49:42 +01:00
Yuri Tseretyan
ab5a3820d5
Alerting: Fix status code of successful response POST /api/alertmanager/grafana/api/v2/silences in swagger specs (#67951)
* update status code to reflect reality

* update docs
2023-05-15 11:23:30 -04:00
Emil Tullstedt
23a9963507
Chore: Upgrade Prometheus to 2.43.0 (#67853)
- github.com/prometheus/prometheus => 2.43.0 (aka 0.43.0)
- github.com/prometheus/client_golang => 1.15.1
2023-05-10 14:09:49 +02:00
Virginia Cepeda
e1ff434917
Alerting: Change text on cloud AM email addresses for contact points (#68143) 2023-05-10 10:44:17 +02:00
Matthew Jacobson
5422609fb1
Alerting: Fix broken integration test (#68140)
From https://github.com/grafana/grafana/pull/68122
2023-05-09 22:27:40 +03:00
Jean-Philippe Quéméner
8bb62a8316
Alerting: Add option for memberlist label (#67982) 2023-05-09 10:32:23 +02:00
Matthew Jacobson
91471ac7ae
Alerting: Template Testing API (#67450) 2023-04-28 15:56:59 +01:00
Yuri Tseretyan
9eb10bee1f
Alerting: Scheduler use rule fingerprint instead of version (#66531)
* implement calculation of fingerprint for ruleWithFolder
* update scheduler to use fingerprint instead of rule's version
2023-04-28 10:42:16 -04:00
Uwe Sommerlatt
dfc99cdd19
Alerting: Fix misleading status code in provisioning API (#67331)
Fixes #66249
2023-04-27 09:25:34 +01:00
Santiago
b0881daf23
Alerting: Use URLs in image annotations (#66804)
* use tokens or urls in image annotations

* improve tests, fix some comments

* fix empty tokens

* code review changes, check for url before checking for token (support old token formats)
2023-04-26 13:06:18 -03:00
Yuri Tseretyan
a8b4a4bb45
Alerting: Update alerting module to 20230418161049-5f374e58cb32 + refactoring (#66622)
* update to alerting 20230418161049-5f374e58cb32
* rename renamed structs in https://github.com/grafana/alerting/pull/73
* update ValidateContactPoint to use BuildReceiverConfiguration
* update logger factory according to changes
* rewrite integration builder
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-04-25 13:39:46 -04:00
Alexander Weaver
117636e8ca
Alerting: Fix panic when reparenting receivers to groups following an attempted rename via Provisioning (#67167) 2023-04-24 21:23:23 -04:00
Steve Simpson
9effb9a708
Alerting: Allow hooking into request handler functions. (#67000)
* Alerting: Allow hooking into request handler functions.

Adds a facility to AlertNG for hooking into API handlers, allowing the
replacement of request handlers for specific paths. One of the goals of this
approach was to allow hooking as late as possible in the request, e.g.
after all middleware has been applied, to simplify usage.

* Update pkg/services/ngalert/api/hooks.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/services/ngalert/api/hooks.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/services/ngalert/ngalert.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Fixes to review comments

* Fix passing logger in

---------

Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-04-24 18:18:44 +02:00
Jean-Philippe Quéméner
bbce69f295
Alerting: Use configured headers for external alertmanager (#63819) 2023-04-21 16:16:27 +02:00
Matthew Jacobson
eddd4f4508
Alerting: Add totalsFiltered to RuleResponse for hidden by filters count (#66883)
Alerting: Add totalsFiltered to RuleResponse to facilitate hidden by filters count

Currently, when both a limit_alerts and a matcher/state filter are applied, there is not enough information to determine how many alert instances were hidden by the filters; there is only enough to determine the total hidden by the limit and filter combined.

This change adds a separate totalsFiltered field alongside the AlertRule totals that will contain the count of instances after filters but before limits.
2023-04-21 09:35:12 +01:00
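The relationship between the totals in the commit above (#66883) can be sketched as follows. This is an illustrative model only: the `alert` type, `summarize`, and the way the hidden counts are derived are assumptions, not the API's implementation. It counts instances before filtering (`totals`), after filtering but before the limit (`totalsFiltered`), and then applies the limit.

```go
package main

import "fmt"

// alert is a hypothetical stand-in for an alert instance.
type alert struct {
	State string
}

// countByState tallies alerts per state.
func countByState(alerts []alert) map[string]int {
	out := make(map[string]int)
	for _, a := range alerts {
		out[a.State]++
	}
	return out
}

// sum adds up all per-state counts.
func sum(m map[string]int) int {
	n := 0
	for _, v := range m {
		n += v
	}
	return n
}

// summarize applies the filter first and the limit second, recording the
// per-state counts at each stage. A negative limit means "no limit".
func summarize(alerts []alert, keep func(alert) bool, limit int) (map[string]int, map[string]int, []alert) {
	totals := countByState(alerts)

	filtered := make([]alert, 0, len(alerts))
	for _, a := range alerts {
		if keep(a) {
			filtered = append(filtered, a)
		}
	}
	totalsFiltered := countByState(filtered)

	shown := filtered
	if limit >= 0 && len(shown) > limit {
		shown = shown[:limit]
	}
	return totals, totalsFiltered, shown
}

func main() {
	alerts := []alert{{"alerting"}, {"alerting"}, {"alerting"}, {"normal"}, {"normal"}}
	onlyAlerting := func(a alert) bool { return a.State == "alerting" }

	totals, totalsFiltered, shown := summarize(alerts, onlyAlerting, 2)
	fmt.Println("hidden by filters:", sum(totals)-sum(totalsFiltered))
	fmt.Println("hidden by limit:", sum(totalsFiltered)-len(shown))
}
```

With only `totals` and the returned alerts, a client could compute just the combined difference; the extra `totalsFiltered` count is what lets the two hidden counts be separated.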
George Robinson
35342a3c76
Alerting: Fix DatasourceUID and RefID missing for DatasourceNoData alerts (#66733)
This commit fixes a bug where DatasourceUID and RefID annotations are
missing for DatasourceNoData alerts in Grafana 9.5. This bug affects
datasource plugins that have moved to using the data plane contract.
2023-04-20 14:38:20 +01:00
George Robinson
883dcc81c0
Alerting: Add tests for Evaluate (#66739) 2023-04-20 11:24:40 +01:00
Alexander Weaver
3634079b8f
Alerting: Attach hash of instance labels to state history log lines (#65968)
* Add instanceID which is hash of labels

* Rename field to fingerprint

* Move to prometheus style signature

* Appease linter
2023-04-19 14:22:19 -05:00
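A label-set fingerprint like the one mentioned in the commit above (#65968) can be sketched with the standard library alone. This is an assumption-laden illustration, not the function Grafana or Prometheus actually uses: it sorts the label names, then feeds names and values into an FNV-1a 64-bit hash with a separator byte, so that the same label set always yields the same value regardless of map iteration order.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// fingerprint returns an order-independent hash of a label set.
// The separator byte and hash choice are assumptions for this sketch.
func fingerprint(labels map[string]string) uint64 {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order makes the hash deterministic

	h := fnv.New64a()
	sep := []byte{0xff} // not valid in UTF-8 text, so it cannot appear in a label
	for _, name := range names {
		h.Write([]byte(name))
		h.Write(sep)
		h.Write([]byte(labels[name]))
		h.Write(sep)
	}
	return h.Sum64()
}

func main() {
	labels := map[string]string{"alertname": "HighCPU", "grafana_folder": "infra"}
	fmt.Printf("%016x\n", fingerprint(labels))
}
```

The separator keeps `{"a": "bc"}` and `{"ab": "c"}` from hashing to the same byte stream.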
Jean-Philippe Quéméner
bc11a484ed
Alerting: Add support for running HA using Redis (#65267)
Co-authored-by: Steve Simpson <steve.simpson@grafana.com>
2023-04-19 17:05:26 +02:00
Alexander Weaver
a384194e15
Alerting: Use default page size of 5000 when querying Loki for state history (#66315)
Always specify limit of 5000
2023-04-18 14:31:29 -05:00
Alexander Weaver
cf7157f683
Alerting: Capture refID of rule's condition expression in Loki state history entries (#66419)
* Capture condition from rule

* Add test
2023-04-18 14:21:28 -05:00
Alex Moreno
f64a89727e
Alerting: Allow provenance disable in alerting provisioning API (#63650)
* Allow provenance None in alert rule update and rule group replace

* Allow provenance None in contact point update

* Allow updating policies to none by sending x-disable-provenance header

* Allow mute timings to disable provenance with x-disable-provenance header

* Allow disabling provenance by using x-disable-provenance header

* Add provenance helper to lower the cyclomatic complexity

* Do not downgrade provenance except in ReplaceRuleGroup

* Add function explanation and change error handling

* Add docs for x-disable-provenance changes (#66300)

* Add docs for x-disable-provenance changes

* Apply suggestions from code review

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update _index.md

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/set-up/provision-alerting-resources/_index.md

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Add error message check in tests

* Change docs

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-04-18 15:10:36 +02:00
Kyle Brandt
840fb32ad8
SSE: (Instrumentation) Add Tracing (#66700)
spans are prefixed `SSE.`
2023-04-18 08:04:51 -04:00
Kyle Brandt
2f13c851e4
SSE: (Chore/Instrumentation) Add ds_queries_total metric and move met… (#66695)
* SSE: (Chore/Instrumentation) Add ds_queries_total metric and move metrics to service
2023-04-17 16:12:44 -07:00
George Robinson
19ebb079ba
Alerting: Add limits and filters to Prometheus Rules API (#66627)
This commit adds support for limits and filters to the Prometheus Rules
API.

Limits:

It adds a number of limits to the Grafana flavour of the Prometheus Rules
API:

- `limit` limits the maximum number of Rule Groups returned
- `limit_rules` limits the maximum number of rules per Rule Group
- `limit_alerts` limits the maximum number of alerts per rule

It sorts Rule Groups and rules within Rule Groups such that data in the
response is stable across requests. It also returns summaries (totals)
for all Rule Groups, individual Rule Groups and rules.

Filters:

Alerts can be filtered by state with the `state` query string. An example
of an HTTP request asking for just firing alerts might be
`/api/prometheus/grafana/api/v1/rules?state=alerting`.

A request can filter by two or more states by adding additional `state`
query strings to the URL. For example `?state=alerting&state=normal`.

Like the alert list panel, the `firing`, `pending` and `normal` states are
first compared against the state of each alert rule. All other states are
ignored. If the alert rule matches then its alert instances are filtered
against states once more.

Alerts can also be filtered by labels using the `matcher` query string.
Like `state`, multiple matchers can be provided by adding additional
`matcher` query strings to the URL.

The match expression should be parsed using the existing regular expression
and sent to the API as URL-encoded JSON in the format:

{
    "name": "test",
    "value": "value1",
    "isRegex": false,
    "isEqual": true
}

The `isRegex` and `isEqual` options work as follows:

| IsEqual | IsRegex  | Operator |
| ------- | -------- | -------- |
| true    | false    |    =     |
| true    | true     |    =~    |
| false   | true     |    !~    |
| false   | false    |    !=    |
2023-04-17 17:45:06 +01:00
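The matcher format and operator table from the commit above (#66627) can be sketched on the client side as follows. The JSON field names and the endpoint path come from the commit message; the `matcher` type, its `operator` method, and the `rulesQuery` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// matcher mirrors the JSON shape shown in the commit message.
type matcher struct {
	Name    string `json:"name"`
	Value   string `json:"value"`
	IsRegex bool   `json:"isRegex"`
	IsEqual bool   `json:"isEqual"`
}

// operator maps the two booleans to the operator in the table above.
func (m matcher) operator() string {
	switch {
	case m.IsEqual && !m.IsRegex:
		return "="
	case m.IsEqual && m.IsRegex:
		return "=~"
	case !m.IsEqual && m.IsRegex:
		return "!~"
	default:
		return "!="
	}
}

// rulesQuery builds a request path with repeated `state` and `matcher`
// query parameters; each matcher is sent as URL-encoded JSON.
func rulesQuery(states []string, matchers []matcher) (string, error) {
	q := url.Values{}
	for _, s := range states {
		q.Add("state", s)
	}
	for _, m := range matchers {
		b, err := json.Marshal(m)
		if err != nil {
			return "", err
		}
		q.Add("matcher", string(b))
	}
	return "/api/prometheus/grafana/api/v1/rules?" + q.Encode(), nil
}

func main() {
	m := matcher{Name: "test", Value: "value1", IsEqual: true}
	fmt.Println(m.Name + m.operator() + m.Value)

	path, err := rulesQuery([]string{"alerting", "normal"}, []matcher{m})
	if err != nil {
		panic(err)
	}
	fmt.Println(path)
}
```

`url.Values.Add` is used rather than `Set` so that multiple `state` and `matcher` parameters are preserved, matching the "additional query strings" behaviour the commit describes.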
Arati R
cab3ba519a
NestedFolders: Add folder service registry with dashboard service implementation (#65033)
* Delete folders, dashboards with registry service
Co-authored-by: Serge Zaitsev <hello@zserge.com>
* Update signature of ProvideDashboardServiceImpl
* Regenerate mockery file
* Add test for DeleteInFolder
* Add test for DeleteDashboardsInFolder
* Delete child dashboard associations via registry
* Add validation of folder uid and org id

---------

Co-authored-by: Serge Zaitsev <hello@zserge.com>
2023-04-14 11:17:23 +02:00
Yuri Tseretyan
afd52d0866
Alerting: use alerting GrafanaReceiver and BuildReceiverConfiguration in Grafana (#65224)
* replace receiver errors with one from alerting
* add the converter to alerting models
* update buildReceiverIntegration to accept GrafanaReceiver
---------

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-04-13 12:25:32 -04:00
gotjosh
2bbf0c9de4
Alerting: Allow Rules to Schedule to be filtered by Rule Group (#59990)
* Alerting: Allow Rules to Schedule to be filtered by Rule Group
2023-04-13 12:55:42 +01:00
Michael Mandrus
5626461b3c
Caching: Refactor enterprise query caching middleware to a wire service (#65616)
* define initial service and add to wire

* update caching service interface

* add skipQueryCache header handler and update metrics query function to use it

* add caching service as a dependency to query service

* working caching impl

* propagate cache status to frontend in response

* beginning of improvements suggested by Lean - separate caching logic from query logic.

* more changes to simplify query function

* Decided to revert renaming of function

* Remove error status from cache request

* add extra documentation

* Move query caching duration metric to query package

* add a little bit of documentation

* wip: convert resource caching

* Change return type of query service QueryData to a QueryDataResponse with Headers

* update codeowners

* change X-Cache value to const

* use resource caching in endpoint handlers

* write resource headers to response even if it's not a cache hit

* fix panic caused by lack of nil check

* update unit test

* remove NONE header - shouldn't show up in OSS

* Convert everything to use the plugin middleware

* revert a few more things

* clean up unused vars

* start reverting resource caching, start to implement in plugin middleware

* revert more, fix typo

* Update caching interfaces - resource caching now has a separate cache method

* continue wiring up new resource caching conventions - still in progress

* add more safety to implementation

* remove some unused objects

* remove some code that I left in by accident

* add some comments, fix codeowners, fix duplicate registration

* fix source of panic in resource middleware

* Update client decorator test to provide an empty response object

* create tests for caching middleware

* fix unit test

* Update pkg/services/caching/service.go

Co-authored-by: Arati R. <33031346+suntala@users.noreply.github.com>

* improve error message in error log

* quick docs update

* Remove use of mockery. Update return signature to return an explicit hit/miss bool

* create unit test for empty request context

* rename caching metrics to make it clear they pertain to caching

* Update pkg/services/pluginsintegration/clientmiddleware/caching_middleware.go

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>

* Add clarifying comments to cache skip middleware func

* Add comment pointing to the resource cache update call

* fix unit tests (missing dependency)

* try to fix mystery syntax error

* fix a panic

* Caching: Introduce feature toggle to caching service refactor (#66323)

* introduce new feature toggle

* hide calls to new service behind a feature flag

* remove licensing flag from toggle (misunderstood what it was for)

* fix unit tests

* rerun toggle gen

---------

Co-authored-by: Arati R. <33031346+suntala@users.noreply.github.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
2023-04-12 12:30:33 -04:00
Kyle Brandt
e78be44e1a
SSE: Dataplane Compliance (#65927)
Takes a specific code path for data that identifies itself as dataplane instead of "guessing" what the data is.

The data must identify itself as dataplane by having both of the following frame metadata properties:

- TypeVersion property that is greater than 0.0
- 'Type' property

The disableSSEDataplane flag disables this functionality and uses the old code path for all queries regardless.

See https://github.com/grafana/grafana-plugin-sdk-go/blob/main/data/contract_docs/contract.md for dataplane details.
2023-04-12 12:24:34 -04:00
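The detection rule in the commit above (#65927) can be sketched as a small predicate. The `frameMeta` type here is a hypothetical stand-in for a data frame's metadata, with the type version modelled as a major/minor pair, and `choosePath` is an invented name; only the two conditions and the `disableSSEDataplane` flag name come from the commit message.

```go
package main

import "fmt"

// frameMeta is a hypothetical stand-in for frame metadata.
type frameMeta struct {
	Type        string
	TypeVersion [2]uint // major, minor
}

// isDataplane reports whether the frame identifies itself as dataplane:
// it must carry a Type and a TypeVersion greater than 0.0.
func isDataplane(m *frameMeta) bool {
	if m == nil || m.Type == "" {
		return false
	}
	return m.TypeVersion[0] > 0 || m.TypeVersion[1] > 0
}

// choosePath picks the handling path; the flag forces the old code path
// for all queries regardless of what the frame declares.
func choosePath(m *frameMeta, disableSSEDataplane bool) string {
	if disableSSEDataplane || !isDataplane(m) {
		return "legacy"
	}
	return "dataplane"
}

func main() {
	// "timeseries-wide" is used here only as an example type name.
	declared := &frameMeta{Type: "timeseries-wide", TypeVersion: [2]uint{0, 1}}
	fmt.Println(choosePath(declared, false))
	fmt.Println(choosePath(declared, true))
	fmt.Println(choosePath(&frameMeta{}, false))
}
```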
Matthew Jacobson
63187fae0c
Alerting: Remove and revert flag alertingBigTransactions (#65976)
* Alerting: Remove and revert flag alertingBigTransactions

This is a partial revert of #56575 and a removal of the `alertingBigTransactions` flag.

Real-world use has seen no clear performance incentive to maintain this flag. Lowered db connection count
came at the cost of significant increase in CPU usage and query latency.

* Fix lint backend

* Removed last bits of alertingBigTransactions

---------

Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2023-04-06 18:06:25 +02:00
gotjosh
1c3ce0735f
Alerting: Tiny refactor on the eval and schedule packages (#66130)
* Alerting: Tiny refactor on the eval and schedule packages

two very small things:

- We had a constructor on something called a `Context` which is not a `context.Context` so let's just name that constructor `NewContext`
- The user that we use to run query evaluations is the same (with some variation), so abstract it into a function so that it can be reused when necessary.

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>

---------

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2023-04-06 16:02:28 +01:00
Matthew Jacobson
85f738cdf9
Alerting: Add endpoint to revert to a previous alertmanager configuration (#65751)
* Alerting: Add endpoint to revert to a previous alertmanager configuration

This endpoint is meant to be used in conjunction with /api/alertmanager/grafana/config/history to
revert to a previously applied alertmanager configuration. This is done by ID instead of raw config
string in order to avoid secure field complications.
2023-04-05 14:10:03 -04:00
Alexander Weaver
fb520edd72
Alerting: Use a completely isolated context for state history writes (#64989)
* Add fresh context with timeout and same log properties, re-derive logger

* Unify timeout constants

* Move ctx after shortcut that got added through rebasing

* Unify timeouts

* Port opentracing's SpanFromContext and ContextFromSpan to the grafana tracing package

* Support both opentracing and otel variants

* Better document why we're creating a new ctx

* Add new func to FakeSpan which was added after rebase

* Support grafana-specific traceID key in both tracer implementations
2023-04-04 16:41:46 -05:00
George Robinson
bd29071a0d
Revert "Alerting: Add limits to the Prometheus Rules API" (#65842) 2023-04-03 15:20:37 +00:00
George Robinson
d96b0a71d3
Alerting: Add limits to the Prometheus Rules API (#65169)
This commit adds a number of limits to the Grafana flavor of the
Prometheus Rules API:

1. `limit` limits the maximum number of Rule Groups returned
2. `limit_rules` limits the maximum number of rules per Rule Group
3. `limit_alerts` limits the maximum number of alerts per rule

It sorts Rule Groups and rules within Rule Groups such that data in the
response is stable across requests. It also returns summaries (totals) for
all Rule Groups, individual Rule Groups and rules.
2023-04-03 10:17:02 +01:00
Santiago
aba91d3053
Alerting: Fetch all applied alerting configurations (#65728)
* WIP

* skip invalid historic configurations instead of erroring

* add warning log when bad historic config is found

* remove unused custom marshaller for GettableHistoricUserConfig

* add id to historic user config, move limit check to store, fix typo

* swagger spec
2023-03-31 17:43:04 -03:00
Alexander Weaver
da4832724e
Alerting: Delete stub for SQL alert state history backend (#65667)
Delete stub for SQL backend
2023-03-31 11:15:56 -05:00
Matthew Jacobson
b9dc04139a
Alerting: Respect "For" Duration for NoData alerts (#65574)
* Alerting: Respect "For" Duration for NoData alerts

This change modifies `resultNoData` to be more in line with the logic of the other state handlers.

The main effects of this are:

1) NoData states with NoDataState config set to Alerting will respect "For" duration.
2) Prevents zero value in StartsAt and EndsAt for alerts that have only ever been in normal state. This includes state transitions from NoDataState=OK and ExecErrState=OK.
3) Better state transition logging.
2023-03-31 19:05:15 +03:00
Steve Simpson
04336d53a9
Alerting: Update prometheus version (#65688) 2023-03-31 16:34:35 +02:00
Yuri Tseretyan
622c23716a
Alerting: Use logger with context in the state cache (#65663) 2023-03-31 10:11:30 -04:00
Alexander Weaver
b2abb63286
Alerting: Introduce proper feature toggles for common state history backend combinations (#65497)
* define 3 feature toggles for rollout phases

* Pass feature toggles along

* Implement first feature toggle

* Try a different strategy with fall-throughs to specific configurations

* Apply toggle overrides once outside of backend composition

* Emit log messages when we coerce backends

* Run code generator for feature toggle files

* Improve wording in flag descs

* Re-run generator

* Use code-generated constants instead of plain strings

* Use converted enum values rather than strings for pre-parsing
2023-03-30 13:53:21 -05:00
Alexander Weaver
5e87ea745d
Alerting: Fix and re-enable filters instance labels in log line test (#65618)
Fix and reenable test
2023-03-30 09:02:18 -05:00
Dimitris Sotirakis
e758b017d0
Alerting: Disable filters instance labels in log line test (#65610)
* Disable filters instance labels in log line test

* Add drone reference
2023-03-30 16:04:29 +03:00
Yuri Tseretyan
9eaffdf5a8
Alerting: Remove dependency on alerting package in definitions (#65390)
* move export rules to definitions package
* move provisioning contact point methods to provisioning package
* move AlertRuleGroupWithFolderTitle to ngalert models and adapter functions to api's compat
* move rule_types files back to where they were before.
2023-03-29 13:34:59 -04:00
Alexander Weaver
a416100abc
Alerting: No longer index state history log streams by instance labels (#65474)
* Remove private labels

* No longer index by instance labels

* Labels are now invariant, only build them once

* Remove bucketing since everything is in a single stream

* Refactor statesToStreams to only return a single unified log stream

* Don't query on labels that no longer exist

* Move selector logic to loki layer, genericize client to work in terms of straight logQL

* Add support for line-level label filters in query

* Combine existing selector tests for better parallelism

* Tests for logQL construction

* Underscore instead of dot for unwrapping labels in logql
2023-03-29 11:52:11 -05:00
Santiago
7b92849508
Alerting: Add CustomDetails field in PagerDuty contact point (#64860)
* Alerting: Add CustomDetails for PagerDuty

* fix default value for 'severity' from 'error' to 'critical'

* minimal docs for notifiers, specifying config for PagerDuty

* replace notifier -> integration

* replace notifier -> integration
2023-03-29 10:35:01 -03:00
Alexander Weaver
de1637afe5
Alerting: Add alert instance labels to Loki log lines in addition to stream labels (#65403)
Add instance labels to log line
2023-03-28 08:57:51 -05:00
Alexander Weaver
dd04757fc9
Alerting: Add "backend" label to state history writes metrics (#65395)
* Add backend label to state history writes metrics

* Update test expectations
2023-03-28 08:49:51 -05:00
Giuseppe Guerra
a89202eab2
Plugins: Improve instrumentation by adding metrics and tracing (#61035)
* WIP: Plugins tracing

* Trace ID middleware

* Add prometheus metrics and tracing to plugins updater

* Add TODOs

* Add instrumented http client

* Add tracing to grafana update checker

* Goimports

* Moved plugins tracing to middleware

* goimports, fix tests

* Removed X-Trace-Id header

* Fix comment in NewTracingHeaderMiddleware

* Add metrics to instrumented http client

* Add instrumented http client options

* Removed unused function

* Switch to contextual logger

* Refactoring, fix tests

* Moved InstrumentedHTTPClient and PrometheusMetrics to their own package

* Tracing middleware: handle errors

* Report span status codes when recording errors

* Add tests for tracing middleware

* Moved fakeSpan and fakeTracer to pkg/infra/tracing

* Add TestHTTPClientTracing

* Lint

* Changes after PR review

* Tests: Made "ended" in FakeSpan private, allow calling End only once

* Testing: panic in FakeSpan if span already ended

* Refactoring: Simplify Grafana updater checks

* Refactoring: Simplify plugins updater error checks and logs

* Fix wrong call to checkForUpdates -> instrumentedCheckForUpdates

* Tests: Fix wrong call to checkForUpdates -> instrumentedCheckForUpdates

* Log update checks duration, use Info log level for check succeeded logs

* Add plugin context span attributes in tracing_middleware

* Refactor prometheus metrics as httpclient middleware

* Fix call to ProvidePluginsService in plugins_test.go

* Propagate context to update checker outgoing http requests

* Plugin client tracing middleware: Removed operation name in status

* Fix tests

* Goimports tracing_middleware.go

* Goimports

* Fix imports

* Changed span name to plugins client middleware

* Add span name assertion in TestTracingMiddleware

* Removed Prometheus metrics middleware from grafana and plugins updatechecker

* Add span attributes for ds name, type, uid, panel and dashboard ids

* Fix http header reading in tracing middlewares

* Use contexthandler.FromContext, add X-Query-Group-Id

* Add test for RunStream

* Fix imports

* Changes from PR review

* TestTracingMiddleware: Changed assert to require for didPanic assertion

* Lint

* Fix imports
2023-03-28 11:01:06 +02:00
Serge Zaitsev
0beb768427
Chore: Remove result fields from ngalert (#65410)
* remove result fields from ngalert

* remove duplicate imports
2023-03-28 10:34:35 +02:00
Yuri Tseretyan
ec4152c7e5
Alerting: Remove dependency on secrets in definitions package (#65391) 2023-03-27 16:35:54 -04:00
Yuri Tseretyan
52a0f59706
Alerting: introduce AlertQuery in definitions package (#63825)
* copy AlertQuery from ngmodels to the definition package
* replaces usages of ngmodels.AlertQuery in API models
* create a converter between models of AlertQuery
---------

Co-authored-by: Alex Moreno <alexander.moreno@grafana.com>
2023-03-27 11:55:13 -04:00
Alexander Weaver
07368dec74
Alerting: Fix attachment of external labels to Loki state history log streams (#65140)
Fix attachment of external labels, add tests
2023-03-21 18:00:59 -05:00
Alexander Weaver
bf54f2672e
Alerting: Switch to snappy-compressed-protobuf for outgoing push requests to Loki (#65077)
* Encode with snappy, always

* JSON encoder type

* Headers

* Copy labels formatter from promtail

* Implement snappy-proto encoding

* Create encoder interface, test both encoders, choose snappy-proto by default

* Make encoder configurable at the LokiCfg level

* Export both encoders

* Touch up comment and tests

* Drop unnecessary conversions after move to plain strings to appease linter
2023-03-21 13:38:42 -05:00
Alexander Weaver
cc7e5ce62e
Alerting: Fix ambiguous handling of equals in labels when bucketing Loki state history streams (#65013)
* Use JSON instead of data.Labels string format as label repr

* Drop debug log line
2023-03-21 12:33:27 -05:00
Alexander Weaver
e39d7f44c9
Alerting: Elide requests to Loki if nothing should be recorded (#65011)
Exit early if no log streams or annotations
2023-03-21 09:30:56 -05:00
Alexander Weaver
40c5713cbd
Vendor errors.Join from Go standard library to avoid version incompatibilities (#64985)
Vendor errors.Join from std lib
2023-03-17 14:07:58 -05:00
Alexander Weaver
a31672fa40
Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once (#64774)
* Rename RecordStatesAsync to Record

* Rename QueryStates to Query

* Implement fanout writes

* Implement primary queries

* Simplify error joining

* Add test for query path

* Add tests for writes and error propagation

* Allow fanout backend to be configured

* Touch up log messages and config validation

* Consistent documentation for all backend structs

* Parse and normalize backend names more consistently against an enum

* Touch-ups to documentation

* Improve clarity around multi-record blocking

* Keep primary and secondaries more distinct

* Rename fanout backend to multiple backend

* Simplify config keys for multi backend mode
2023-03-17 12:41:18 -05:00
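A minimal sketch of the "multiple backend" idea from #64774 above: writes go to a primary and to every secondary, and the resulting errors are joined. The `Backend` interface, the `MultipleBackend` struct, and the `fakeBackend` helper are hypothetical names for illustration, not the actual Grafana types. It assumes Go 1.20 or later for `errors.Join`.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical stand-in for a state history backend.
type Backend interface {
	Record(entry string) error
}

// MultipleBackend (hypothetical) writes to one primary and any number of secondaries.
type MultipleBackend struct {
	Primary     Backend
	Secondaries []Backend
}

// Record dispatches the entry to every backend and joins whatever errors come back.
// errors.Join drops nil values and returns nil when every write succeeded.
func (m MultipleBackend) Record(entry string) error {
	errs := []error{m.Primary.Record(entry)}
	for _, s := range m.Secondaries {
		errs = append(errs, s.Record(entry))
	}
	return errors.Join(errs...)
}

// fakeBackend records entries in memory and can be told to fail.
type fakeBackend struct {
	name string
	fail bool
	got  []string
}

func (f *fakeBackend) Record(entry string) error {
	if f.fail {
		return fmt.Errorf("%s: write failed", f.name)
	}
	f.got = append(f.got, entry)
	return nil
}

func main() {
	primary := &fakeBackend{name: "primary"}
	secondary := &fakeBackend{name: "secondary", fail: true}
	m := MultipleBackend{Primary: primary, Secondaries: []Backend{secondary}}

	// The primary write still lands even though the secondary fails.
	fmt.Println("error:", m.Record("rule A: Normal -> Alerting"))
	fmt.Println("primary entries:", len(primary.got))
}
```

In this sketch a failing secondary does not stop the primary write; whether the real backend blocks on or ignores secondary failures is not stated in the commit message.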
Alexander Weaver
9bcf8819d3
Alerting: Handful of small adjustments to log levels and parameters (#64572)
Calculate duration earlier in scheduler
2023-03-17 12:15:49 +00:00
gotjosh
02a8f62021
Alerting: Fix stats that display alert count when using unified alerting (#64852)
* Alerting: Fix stats when using unified alerting
2023-03-17 11:19:18 +00:00
George Robinson
0b506b4ccc
Alerting: Update github.com/grafana/alerting (#64882) 2023-03-16 13:59:35 +00:00
Yuri Tseretyan
85a954cd81
Alerting: Update scheduler to get updates only from database (#64635)
* stop using the scheduler's Update and Delete methods; all communication must be via the database
* update scheduler's registry to calculate diff before re-setting the cache
* update fetcher to return the diff generated by registry
* update processTick to update rule eval routine if the rule was updated and it is not going to be evaluated at this tick.
* remove references to the scheduler from api package
* remove unused methods in the scheduler
2023-03-14 18:02:51 -04:00
Alexander Weaver
faef3a8258
Alerting: Log error but don't fail initialization if state history connection test fails (#64699)
Don't return init error if ping fails, add tests
2023-03-13 15:54:46 -05:00
Andrej Ocenas
6647217208
Phlare: Use enum config to send deduplicated func and filenames (#64435) 2023-03-13 11:06:04 +01:00
Jean-Philippe Quéméner
fb5ed0b0b3
Alerting: fix flaky cache test (#64499) 2023-03-09 06:08:05 -05:00
Ryan McKinley
42e7ec9fe4
Chore: cleanup dashboard service names (#64442) 2023-03-08 14:37:45 -05:00
George Robinson
0c8876c3a2
Alerting: Return errors when expanding templates (#63662)
This commit changes the state package so that errors encountered while
expanding templates for custom labels and annotations are returned
from the function. This is not used at present, but will be used in the
future as we look at how to offer better feedback to users who don't
have access to logs, for example our customers who use Hosted Grafana.
2023-03-08 12:25:02 +00:00
Alexander Weaver
4a1c18abf6
Alerting: Fix intermittency when seeding database in rule store tests (#64322)
Force unique IDs when seeding database
2023-03-07 09:40:55 -05:00
George Robinson
ed71012ced
Alerting: Fix Classic Conditions $values variable (#64243)
This commit fixes a bug in the $values variable in notification
templates when using Classic Conditions. Since Classic Conditions
are not multi-dimensional, the values of each series that exceeded
the condition should be available as a RefID and offset. For example,
B0, B1, etc. However, this bug meant that instead just a single
condition would be printed as B, not B0.
2023-03-06 12:08:00 -05:00
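The RefID-plus-offset naming described in #64243 above can be sketched as follows. `valueKeys` is a hypothetical helper written for illustration; it is not the function that the fix changed.

```go
package main

import "fmt"

// valueKeys (hypothetical) names each series value of a Classic Condition
// by its RefID followed by the series offset: B0, B1, and so on.
func valueKeys(refID string, values []float64) map[string]float64 {
	out := make(map[string]float64, len(values))
	for i, v := range values {
		out[fmt.Sprintf("%s%d", refID, i)] = v
	}
	return out
}

func main() {
	keys := valueKeys("B", []float64{81.2, 93.7})
	fmt.Println(keys["B0"], keys["B1"])
}
```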
Alexander Weaver
19d01dff91
Alerting: Expose Prometheus metrics for persisting state history (#63157)
* Create historian metrics and dependency inject

* Record counter for total number of state transitions logged

* Track write failures

* Track current number of active write goroutines

* Record histogram of how long it takes to write history data

* Don't copy the registerer

* Adjust naming of write failures metric

* Introduce WritesTotal to complement WritesFailedTotal

* Measure TransitionsFailedTotal to complement TransitionsTotal

* Rename all to state_history

* Remove redundant Total suffix

* Increment totals all the time, not just on success

* Drop ActiveWriteGoroutines

* Drop PersistDuration in favor of WriteDuration

* Drop unused gauge

* Make writes and writesFailed per org

* Add metric indicating backend and a spot for future metadata

* Drop _batch_ from names and update help

* Add metric for bytes written

* Better pairing of total + failure metric updates

* Few tweaks to wording and naming

* Record info metric during composition

* Create fakeRequester and simple happy path test using it

* Blocking test for the full historian and test for happy path metrics

* Add tests for failure case metrics

* Smoke test for full annotation persistence

* Create test for metrics on annotation persistence, both happy and failing paths

* Address linter complaints

* More linter complaints

* Remove unnecessary whitespace

* Consistency improvements to help texts

* Update tests to match new descs
2023-03-06 10:40:37 -06:00
gotjosh
5422f7cf56
Alerting: Add metrics for active receiver and integrations (#64050)
* Alerting: Add metrics for active receiver and integrations

Introduces metrics that allow us to track the number of configured receivers and integrations in the Alertmanager for all orgs.

As a bonus, I realised that the alert reception metrics were not being exported nor collected. This does that too.
2023-03-06 16:37:07 +00:00
Serge Zaitsev
0bdb105df2
Chore: Remove xorcare/pointer dependency (#63900)
* Chore: remove pointer dependency

* fix type casts

* deprecate xorcare/pointer library in linter

* rookie mistake
2023-03-06 05:23:15 -05:00
Ieva
a52999a886
Access Control: revert to using folder store from the scope resolvers (#64132)
* revert to using folder store from the resolvers

* fixing tests after revert

* api test fixes

---------

Co-authored-by: Kristin Laemmert <mildwonkey@users.noreply.github.com>
2023-03-03 10:56:33 -05:00
Yuri Tseretyan
e760f22402
Alerting: Use background context for maintenance function (#64065) 2023-03-02 14:19:52 -05:00
Emil Tullstedt
10ee900beb
Errors: Remove direct dependencies on github.com/pkg/errors (#64026)
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
2023-03-02 16:28:10 +01:00
Kristin Laemmert
bb798e24f3
chore(services): replace dependencies on dashboard store with dashboard service (#63937)
* chore(services): replace dependencies on dashboard store with dashboard service

This continues the backend service/store split by replacing dashboard store dependencies with service dependencies. The folder service remains the single exception for now; otherwise we'd have a dependency cycle between the folder and dashboard services. I have some ideas for that, but I'll take care of all the easy parts first.

While doing this, I identified and removed a number of unused arguments from the following functions:

NewFolderNameScopeResolver
NewFolderIDScopeResolver
NewFolderUIDScopeResolver
NewDashboardIDScopeResolver
NewDashboardUIDScopeResolver
resolveDashboardScope

I have a small enterprise PR to support this commit.

* lingering fmt
2023-03-02 08:09:57 -05:00
Yuri Tseretyan
5e2a661dec
Alerting: update API models to use NoDataState and ExecutionErrorState from definitions instead of models (#63824) 2023-02-28 16:21:41 -05:00
Yuri Tseretyan
f561e71de8
Alerting: decouple api models from domain\dto models: separate Provenance status + converters (#63594)
* move conversions of domain models to api models and reverse from definition package to api package
2023-02-27 17:57:15 -05:00
Alexander Weaver
e77621649d
Alerting: Instrument outgoing state history requests using weaveworks/common (#63600)
* Loki backend and client depend on a requester

* Instrument all requests to loki using weaveworks TimedClient

* Construct collector in metrics package
2023-02-23 17:52:02 -06:00
Yuri Tseretyan
98e1aeaebd
Alerting: Fix client to external Alertmanager to correctly build URL for Mimir Alertmanager (#63676) 2023-02-23 13:55:26 -05:00
Emil Tullstedt
3abaf32cf2
Chore: Upgrade golangci-lint to v1.51.2 (#63630)
Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
2023-02-23 15:10:03 +01:00
Alex Moreno
f60dc4441f
Alerting: Add status label to GroupRules metric (#63454)
* Add status label to GroupRules metric

* Add state (active and paused) label to GroupRules

* Add active/paused metrics tests
2023-02-23 12:38:27 +01:00
George Robinson
f93a9c794d
Alerting: Fix incorrect comment in eval.go (#63510)
This commit fixes an incorrect comment in the Result struct in eval.go
that I had written some time ago. The comment now documents the
actual behaviour and content of this field.
2023-02-21 15:42:04 +00:00
bla2ej
56c8661929
Alerting: Get alert rules on faults (#61248) (#63051)
* Alerting: get alert rules on faults (#61248)

Two functions used to fetch alert rules from DB are updated:
- GetAlertRulesForScheduling
- ListAlertRules
Rows are scanned one by one so good ones are returned.
A common error is logged indicating how many
rules failed deserialization.

Resolved: #61248

* updates from review comments
2023-02-21 08:54:20 -06:00
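The row-by-row approach described in #61248 above can be sketched like this: each row is decoded on its own, bad rows are counted and skipped, and the good ones are returned. The `rule` struct, the JSON row format, and `decodeRules` are assumptions made for illustration; the real functions read from the database.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rule is a hypothetical, trimmed-down alert rule.
type rule struct {
	UID   string `json:"uid"`
	Title string `json:"title"`
}

// decodeRules (hypothetical) decodes rows one by one so that a single
// malformed row does not fail the whole fetch.
func decodeRules(rows []string) (good []rule, failed int) {
	for _, raw := range rows {
		var r rule
		if err := json.Unmarshal([]byte(raw), &r); err != nil {
			failed++ // skip the bad row but remember that it happened
			continue
		}
		good = append(good, r)
	}
	return good, failed
}

func main() {
	rows := []string{
		`{"uid":"a","title":"CPU high"}`,
		`{broken`,
		`{"uid":"b","title":"Disk full"}`,
	}
	good, failed := decodeRules(rows)
	if failed > 0 {
		// One common log line instead of one per bad row.
		fmt.Printf("failed to deserialize %d rule(s)\n", failed)
	}
	fmt.Printf("loaded %d rule(s)\n", len(good))
}
```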
George Robinson
9f2fb3fa27
Alerting: Add filter and remove funcs for custom labels and annotations (#63437)
This commit adds filterLabels, filterLabelsRe, removeLabels, and
removeLabelsRe functions to templates for custom labels and annotations.
It allows for use cases such as removing all private labels.
2023-02-20 14:40:26 +00:00
George Robinson
c637a5543e
Alerting: Rename caps to captures as cap is a reserved word (#63432) 2023-02-20 10:08:36 +00:00
George Robinson
aacf9da969
Alerting: Change Data to use Labels instead of map[string]string (#63431)
This commit changes the Data struct in template.go to use Labels
instead of map[string]string. It changes how labels are printed
when using {{ .Labels }} from map[foo:bar bar:baz] to
foo=bar, bar=baz.
2023-02-20 10:08:23 +00:00
George Robinson
0a01391ebe
Alerting: Small readability improvements to template.go (#63422)
* Alerting: Small readability improvements to template.go

* Fix lint
2023-02-20 09:24:11 +00:00
George Robinson
0659134793
Alerting: Better printing of labels (#63348)
This commit changes how labels are printed in templates for custom
annotations and labels from map[foo:bar bar:baz] to foo=bar, bar=baz.
Labels are comma separated, and sorted in increasing order.
2023-02-16 12:04:15 -05:00
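The label format described in #63348 above (comma separated, sorted in increasing order) can be sketched as below. `formatLabels` is a hypothetical helper, and sorting by label name is an assumption; note that with name-sorted output the example map from the commit message would print as `bar=baz, foo=bar`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLabels (hypothetical) renders labels as "k=v" pairs, sorted by name
// and separated by ", ", instead of Go's default map[k:v ...] form.
func formatLabels(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, name+"="+labels[name])
	}
	return strings.Join(pairs, ", ")
}

func main() {
	fmt.Println(formatLabels(map[string]string{"foo": "bar", "bar": "baz"}))
	// prints: bar=baz, foo=bar
}
```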
George Robinson
9e86916d48
Alerting: Move templating to template package (#63347)
This commit moves templating from the state package to a sub-package
called template. This sub-package will be the logical package for
future ease-of-use improvements to templating custom annotations
and labels.
2023-02-16 17:16:36 +01:00
Alexander Weaver
958fb2c50a
Alerting: Unify structs in Loki client and make them more consistent with Prometheus (#63055)
* Use existing row struct instead of [2]string, add deserialization helper

* Replace Stream struct with stream struct which is exactly the same

* Drop unused status field

* Don't export queryRes and queryData

* Tests for custom marshalling

* Rename row fields to T and V for consistency with prometheus samples

* Rename row to sample
2023-02-11 05:17:44 -06:00
George Robinson
1f984409a2
Alerting: Fix a bug taking screenshots with Dashboard UID (#63220)
This commit fixes a bug where Grafana would fail to take a screenshot if
the same Dashboard UID was present across two or more different orgs.
2023-02-09 15:23:01 -05:00
Steve Simpson
4d1a2c3370
Alerting: Move rule_groups_rules metric from State to Scheduler. (#63144)
The `rule_groups_rules` metric is currently defined and computed by `State`.
It makes more sense for this metric to be computed off of the configured rule
set, not based on the rule evaluation state. There could be an edge condition
where a rule does not have a state yet, and so is uncounted.

Additionally, we would like this metric (and others), to have a `rule_group`
label, and this is much easier to achieve if the metric is produced from the
`Scheduler` package.
2023-02-09 17:05:19 +01:00
suntala
49b3027049
Chore: Remove Result field from datasources (#63048)
* Remove Result field from AddDataSourceCommand
* Remove DatasourcesPermissionFilterQuery Result
* Remove GetDataSourceQuery Result
* Remove GetDataSourcesByTypeQuery Result
* Remove GetDataSourcesQuery Result
* Remove GetDefaultDataSourceQuery Result
* Remove UpdateDataSourceCommand Result
2023-02-09 15:49:44 +01:00
Alexander Weaver
f80bf11782
Alerting: Make time range query parameters not required when querying Loki (#62985)
* Make from and to not required

* Move default range calculation up to loki.go
2023-02-07 14:26:43 -06:00
Santiago
955c7b13ea
Alerting: Change error log to warning and apply correct format when updating historic config (#62973)
* change error to warning, apply correct format

* Update pkg/services/ngalert/store/alertmanager.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* different wording and data

---------

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-02-07 11:04:05 -03:00
Joey Tawadrous
121260e0dd
Parca: Use data query schema (#62840)
* Parca data query schema

* Remove groupBy
2023-02-07 09:56:21 +00:00
Alexander Weaver
0efb84617e
Alerting: Create benchmarking test for state.ProcessEvalResults (#62041)
* Create benchmark for ProcessEvalResults

* Simplify the test
2023-02-03 15:38:08 -06:00
Yuri Tseretyan
f066e8cdcd
Alerting: Update to alerting 20230203015918-0e4e2675d7aa (after refactoring) (#62823)
* add alerting prefix to some packages from alerting that have similar names in prometheus alertmanager
2023-02-03 11:36:49 -05:00
idafurjes
982939111b
Rename Id to ID for annotation models (#62886)
* Rename Id to ID for annotation models

* Add xorm tags

* Rename Id to ID for API key models

* Add xorm tags
2023-02-03 17:23:09 +01:00
Alexander Weaver
9eeea8f5ea
Alerting: Add label query parameters to state history endpoint (#62831)
* Allow equality-only matching of arbitrary labels via query params

* Pre-initialize map
2023-02-02 16:52:08 -06:00
Jean-Philippe Quéméner
1694a35e0c
Alerting: implement loki query for alert state history (#61992)
* Alerting: implement loki query for alert state history

* extract selector building

* add unit tests for selector creation

* backup

* give selectors their own type

* build dataframe

* add some tests

* small changes after manual testing

* use struct client

* golint

* more golint

* Make RuleUID optional for Loki implementation

* Drop initial assumption that we only have one series

* Pare down to three columns, fix timestamp overflows, improve failure cases in loki responses

* Embed structured log lines in the dataframe as objects rather than json strings

* Include state history label filter

* Remove dead code

---------

Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
2023-02-02 16:31:51 -06:00
Matthew Jacobson
f9ec16e74f
Alerting: Fix template validation in provisioning api (#62530)
* Alerting: Fix template validation in provisioning api

Fix issue where provisioning API accepts a malformed template having extra
text outside of definition block and template name matching definition name.
2023-02-02 15:26:39 -05:00
Alexander Weaver
647f73ddc5
Alerting: Add static label to all state history entries (#62817)
* Add static label to all state history entries

* Separate label and value visually
2023-02-02 13:25:26 -06:00
Santiago
ba731f7865
Alerting: Mark AM configuration as applied (#61330)
* Mark AM configuration as applied

* add missing checks, make linter happy

* fix deadlock, mark as valid on save and on load

* mark configurations only if needed

* check error after applyConfig()

* code review comments

* code review changes

* more code review changes

* clean HistoricConfigFromAlertConfig function
2023-02-02 14:45:17 -03:00
Alexander Weaver
6ad1cfef38
Alerting: Add endpoint for querying state history (#62166)
* Define endpoint and generate

* Wire up and register endpoint

* Cleanup, define authorization

* Forgot the leading slash

* Wire up query and SignedInUser

* Wire up timerange query params

* Add todo for label queries

* Drop comment

* Update path to rules subtree
2023-02-02 11:34:00 -06:00
Alexander Weaver
9fa28c11c5
Alerting: Usability adjustments to Loki representation of state history values (#62643)
* Extract label merge, add test file

* Extract error/NoData to first class fields, remove a layer from values

* Include dashUID and panelID as line-level fields

* Drop unnecessary object receiver

* Add tests for stream building

* Drop NoData field from log lines
2023-02-02 10:54:10 -06:00
idafurjes
23c27cffb3
Chore: Rename Id to ID in alerting models (#62777)
* Chore: Rename Id to ID in alerting models

* Add xorm tags for datasource

* Add xorm tag for uid
2023-02-02 17:22:43 +01:00
Sonia Aguilar
753c84f825
Alerting: Pass yaml as a query param in export request (#62751)
* Set YAML as default value for exporting alert rules

* use YAML format for rule list export

Co-authored-by: Sonia Aguilar <33540275+soniaAguilarPeiron@users.noreply.github.com>

* lint

* Add new format query param to swagger+docs

* Fix broken test

---------

Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>
2023-02-02 16:10:02 +00:00
Steve Simpson
c44e9f6b71
Alerting: Add metrics around notification delivery. (#62778)
This change exposes more metrics from the embedded Alertmanager, which are
valuable for troubleshooting Alertmanager operation particularly in HA setups.

```
grafana_alerting_notifications_total
grafana_alerting_notifications_failed_total
grafana_alerting_notification_requests_total
grafana_alerting_notification_requests_failed_total
grafana_alerting_notification_latency_seconds
grafana_alerting_nflog_gc_duration_seconds
grafana_alerting_nflog_snapshot_duration_seconds
grafana_alerting_nflog_snapshot_size_bytes
grafana_alerting_nflog_queries_total
grafana_alerting_nflog_query_errors_total
grafana_alerting_nflog_query_duration_seconds
grafana_alerting_nflog_gossip_messages_propagated_total
grafana_alerting_dispatcher_aggregation_groups
grafana_alerting_dispatcher_alert_processing_duration_seconds
```

Note that `alertmanager_dispatcher_aggregation_group_limit_reached_total` is
explicitly not exposed, as the group limit metrics are not enabled.
2023-02-02 14:44:20 +01:00
Alexander Weaver
fcecf4d3cb
Alerting: Refactor away a layer of indirection around the goroutine in Loki state history (#62644)
Inline recordStreamsAsync in loki backend
2023-02-01 11:57:29 -06:00
Gilles De Mey
26866953c1
Alerting: hide "silence" button for external AM setups (#62133) 2023-02-01 15:51:05 +01:00
Sofia Papagiannaki
f143b0a5b2
Chore: Move folder store interface, implementation and test under pkg/services/folder (#62586)
* Chore: Move folder store into folder service package

* Split folder and dashboard store implementations
2023-02-01 15:43:21 +02:00
Alex Moreno
53945afedf
Alerting: Allow alert rule pausing from API (#62326)
* Add is_paused attr to the POST alert rule group endpoint

* Add is_paused to alerting API POST alert rule group

* Fixed tests

* Add is_paused to alerting gettable endpoints

* Fix integration tests

* Alerting: allow to pause existing rules (#62401)

* Display Pause Rule switch in Editing Rule form

* add isPaused property to form interface and dto

* map isPaused prop with is_paused value from DTO

Also update test snapshots

* Append '(Paused)' text on alert list state column when appropriate

* Change Switch styles according to discussion with UX

Also adding a tooltip with info what this means

* Adjust styles

* Fix alignment and isPaused type definition

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>

* Fix test

* Fix test

* Fix RuleList test

---------

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>

* wip

* Fix tests and add comments to clarify AlertRuleWithOptionals

* Fix one more test

* Fix tests

* Fix typo in comment

* Fix alert rule(s) cannot be paused via API

* Add integration tests for alerting api pausing flow

* Remove duplicated integration test

---------

Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-02-01 13:15:03 +01:00
Alexander Weaver
03f3fbec0d
Alerting: Fix handling of special floating-point cases when writing observed values to annotations (#61074)
* Fix json serialization of state values

* Simplify two of the tests

* Fix linter complaint

* Don't return error if we fail to look up dashboard, just log it and move on

* Address linter complaint
2023-01-31 15:27:51 -06:00
gotjosh
55e7cf1aed
Alerting: Introduce Metric Aggregation starting with Silences (#62512)
* Alerting: Introduce Metric Aggregation starting with Silences
---------

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2023-01-31 19:54:38 +00:00
gotjosh
178f290f0c
Update dskit to the latest main (#62616)
* Update dskit to the latest main

* Break free from a cortex dependency
2023-01-31 19:05:49 +00:00
ismail simsek
91221bc436
Expressions: Fixes the issue showing expressions editor (#62510)
* Use suggested value for uid

* update the snapshot

* use __expr__

* replace all -100 with __expr__

* update snapshot

* more changes

* revert redundant change

* Use expr.DatasourceUID where it's possible

* generate files
2023-01-31 18:50:10 +01:00
Alexander Weaver
e7ace4ed62
Alerting: Allow separate read and write path URLs for Loki state history (#62268)
Extract config parsing and add tests
2023-01-30 16:30:05 -06:00
Alexander Weaver
b4682fe3cb
Alerting: Configurable externalLabels for Loki state history (#62404)
* Add config option for external labels

* Remove redundant nilcheck
2023-01-30 14:24:45 -06:00
Alex Moreno
7a465f42a6
Alerting: Allow pausing alerts from provisioning (#62263)
* Allow pausing alerts from provisioning

* Update swagger

* Add IsPaused to provision export endpoints

* Add pause field in sample.yml

* Add exception for reset state in first loop iteration of scheduler if rule is paused

* Update provision definition and swagger docs

* Fix provisioning export tests

* Suggestion: Simplify if condition

* Add more context to a comment
2023-01-30 16:29:05 +01:00
Ieva
ee3d742c7d
RBAC: inherit folder permissions when resolving managed permissions (#62244)
* add nested folder scope inheritance to managed permission services

* add a more specific error

* remove circular dependencies

* use errutil for returning error

* fix tests

* fix tests

* define a new error in ac package
2023-01-30 14:19:42 +00:00
idafurjes
3bda112c5f
Chore: Move search model from models package to search service (#62215)
* Chore: Move search model from models package to search service

* Remove unused imports

* Cleanup after merge
2023-01-30 15:17:53 +01:00
Serge Zaitsev
d6d4097567
Chore: Fix goimports grouping in alerting (#62424)
* fix goimports

* fix goimports order
2023-01-30 09:55:35 +01:00
Yuri Tseretyan
0c4671e31f
Alerting: Update historian to ignore transitions from Normal Paused and Updated (#62267) 2023-01-27 16:26:22 -05:00
gotjosh
3c616da83f
Alerting: Refactor metrics/ngalert.go into separate files (#62362)
* Alerting: Refactor metrics/ngalert.go into separate files
2023-01-27 18:49:49 +00:00
Matthew Jacobson
c006df375a
Alerting: Create endpoints for exporting in provisioning file format (#58623)
This adds provisioning endpoints for downloading alert rules and alert rule groups in a 
format that is compatible with file provisioning. Each endpoint supports both json and 
yaml response types via Accept header as well as a query parameter 
download=true/false that will set Content-Disposition to recommend initiating a download 
or inline display.

This also makes some package changes to keep structs with potential to drift closer 
together. Eventually, other alerting file structs should also move into this new file 
package, but the rest require some refactoring that is out of scope for this PR.
2023-01-27 11:39:16 -05:00
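The response negotiation described in #58623 above (format chosen via the Accept header, and `download=true/false` switching Content-Disposition between download and inline display) can be sketched as a small pure function. `exportHeaders`, the `export.<ext>` file name, the `application/yaml` media type, and the YAML fallback are assumptions for illustration, not the endpoint's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// exportHeaders (hypothetical) picks the response headers for an export request.
// accept is the request's Accept header; download mirrors the download query parameter.
func exportHeaders(accept string, download bool) (contentType, disposition string) {
	format := "yaml" // assumed fallback when JSON is not requested
	if strings.Contains(accept, "application/json") {
		format = "json"
	}
	contentType = "application/" + format

	kind := "inline" // display in the browser
	if download {
		kind = "attachment" // recommend initiating a download
	}
	disposition = fmt.Sprintf(`%s; filename="export.%s"`, kind, format)
	return contentType, disposition
}

func main() {
	ct, cd := exportHeaders("application/json", true)
	fmt.Println(ct)
	fmt.Println(cd)
}
```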
George Robinson
b6db1ed524
Revert "Alerting: Add is_paused attr to the POST alert rule group endpoint" (#62310)
Revert "Alerting: Add is_paused attr to the POST alert rule group endpoint (#62253)"

This reverts commit 3ccafe3a5a.
2023-01-27 13:41:36 +01:00
Alex Moreno
3ccafe3a5a
Alerting: Add is_paused attr to the POST alert rule group endpoint (#62253)
Add is_paused attr to the POST alert rule group endpoint
2023-01-27 10:50:06 +01:00
Yuri Tseretyan
05bf241952
Alerting: Update state manager to return StateTransitions when Delete or Reset (#62264)
* update Delete and Reset methods to return state transitions

this will be used by notifier code to decide whether alert needs to be sent or not.

* update scheduler to provide reason to delete states and use transitions

* update FromAlertsStateToStoppedAlert to accept StateTransition and filter by old state

* fixup

* fix tests
2023-01-27 09:46:21 +01:00
idafurjes
6c5a573772
Chore: Move ReqContext to contexthandler service (#62102)
* Chore: Move ReqContext to contexthandler service

* Rename package to contextmodel

* Generate ngalert files

* Remove unused imports
2023-01-27 08:50:36 +01:00
Alex Moreno
531b439cf1
Alerting: Add alert pausing feature (#60734)
* Add field in alert_rule model, add state to alert_instance model, and state to eval

* Remove paused state from eval package

* Skip paused alert rules in scheduler

* Add migration to add is_paused field to alert_rule table

* Convert to postable alerts only if not normal, pending, or paused

* Handle paused eval results in state manager

* Add Paused state to eval package

* Add paused alerts logic in scheduler

* Skip alert on scheduler

* Remove paused status from eval package

* Apply suggestions from code review

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Remove state

* Rethink schedule and manager for paused alerts

* Change return to continue

* Remove unused var

* Rethink alert pausing

* Paused alerts storing annotations

* Only add one state transition

* Revert boolean method renaming refactor

* Revert take image refactor

* Make registry errors public

* Revert method extraction for getting a folder title

* Revert variable renaming refactor

* Undo unnecessary changes

* Revert changes in test

* Remove IsPause check in PatchPartialAlertRule function

* Use SetNormal to set state

* Fix text by returning to old behaviour on alert rule deletion

* Add test in schedule_unit_test.go to test ticks with paused alerts

* Add comment to clarify usage of context.Background()

* Add comment to clarify resetStateByRuleUID method usage

* Move rule get to a more limited scope

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* run gofmt on pkg/services/ngalert/schedule/schedule.go

* Remove defer cancel for context

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/testing.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/schedule/schedule_unit_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/models/instance_test.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* skip scheduler rule state clean up on paused alert rule

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Fix mock in test

* Add (hopefully) final suggestions

* Use error channel from recordAnnotationsSync to cancel context

* Run make gen-cue

* Place pause alert check in channel update after version check

* Reduce branching in update channel select

* Add if for error and move code inside if in state manager ResetStateByRuleUID

* Add reason to logs

* Update pkg/services/ngalert/schedule/schedule.go

Co-authored-by: George Robinson <george.robinson@grafana.com>

* Do not delete alert rule routine, just exit on eval if is paused

* Reduce branching and create-close a channel to avoid deadlocks

* Separate state deletion and state reset (includes history saving)

* Add current pause state in rule route in scheduler

* Split clearState and bring errCh closer to RecordStatesAsync call

* Change rule to ruleMeta in RecordStatesAsync

* copy state to be able to modify it

* Add timeout to context creation

* Shorten the timeout

* Use resetState if rule is paused and deleteState if rule is not paused

* Remove Empty state reason

* Save every rule change in historian

* Add tests for DeleteStateByRuleUID and ResetStateByRuleUID

* Remove useless line

* Remove outdated comment

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2023-01-26 18:29:10 +01:00
Kristin Laemmert
e8b8a9e276
chore: move dashboard_acl models into dashboard service (#62151) 2023-01-26 08:46:30 -05:00
gotjosh
0bfe150928
Alerting: Fix Test Receivers when settings are non-strings (#62156)
* Alerting: Fix Test Receivers when settings are non-strings

As part of the Alerting extraction, we want to make sure we don't have circular dependencies. As such, I had to move `PostableGrafanaReceiver` to a new struct in `grafana/alerting` called `GrafanaReceiver`.

`PostableGrafanaReceiver` has an attribute called `Settings` that uses a Grafana-proprietary struct called `RawMessage`; this struct shadows `json.RawMessage`.

When I created `GrafanaReceiver`, I turned settings into a `map[string]string` thinking all settings would end up as strings. This was a mistake, and this test proves that it doesn't work, and breaks the API.
2023-01-26 12:54:03 +00:00
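The failure mode described in the commit above can be reproduced with the standard library alone. The following is a minimal sketch, not the actual `grafana/alerting` code: `stringSettings` and `rawSettings` are hypothetical stand-ins for a receiver type, and the payload is invented for illustration. It shows that decoding non-string settings into a `map[string]string` returns an error, while `json.RawMessage` keeps the bytes untouched for later decoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringSettings and rawSettings are hypothetical stand-ins for a receiver
// type; they are NOT the real grafana/alerting structs.
type stringSettings struct {
	Settings map[string]string `json:"settings"`
}

type rawSettings struct {
	Settings json.RawMessage `json:"settings"`
}

// decodeAsStrings returns an error when any setting is not a JSON string.
func decodeAsStrings(payload []byte) error {
	var r stringSettings
	return json.Unmarshal(payload, &r)
}

// decodeAsRaw keeps the settings as undecoded JSON bytes.
func decodeAsRaw(payload []byte) (json.RawMessage, error) {
	var r rawSettings
	err := json.Unmarshal(payload, &r)
	return r.Settings, err
}

func main() {
	// Invented payload: one numeric and one boolean setting.
	payload := []byte(`{"settings":{"retries":3,"enabled":true}}`)

	fmt.Println("map[string]string error:", decodeAsStrings(payload))

	raw, err := decodeAsRaw(payload)
	fmt.Println("raw settings:", string(raw), "error:", err)
}
```

Deferring the decode this way leaves each notifier free to unmarshal the raw bytes into its own typed settings struct.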
George Robinson
a7eab8e46e
Alerting: Support context.Context in Loki interface (#61979)
This commit adds support for cancelable contexts in the Loki
interface.
2023-01-26 09:31:20 +00:00
Sofia Papagiannaki
cd27562c76
Access control: Modify dashboard/folder resolvers so that they also return the inherited scopes (#62025)
* Access Control: Add folder service dependency to the dashboard/folder resolvers

* Expose the function fetching parents to folder interface

* Add generic prepend utility

* Modify dashboard resolvers to return inherited scopes
2023-01-26 10:21:10 +02:00
Alexander Weaver
eb1293ebb1
Alerting: Re-generate swagger definitions (#62154)
Regenerate swagger, add body binding so parameters work
2023-01-25 15:01:42 -06:00
Alexander Weaver
046a9bb7c1
Alerting: Copy rule definitions into state history (#62032)
* Copy rules instead of accepting pointer

* Deep-copy the rule, for even more guarantees

* Create struct just for needed fields

* Move RuleMeta to historian/model package, iron out package dependencies

* Move tests for dash ID parsing to model package along with code
2023-01-25 11:29:57 -06:00
idafurjes
b54b80f473
Chore: Remove Result from dashboard models (#61997)
* Chore: Remove Result from dashboard models

* Fix lint tests

* Fix dashboard service tests

* Fix API tests

* Remove commented out code

* Chore: Merge main - cleanup
2023-01-25 10:36:26 +01:00
idafurjes
421976e919
Chore: Remove folders from models pkg (#61853) 2023-01-25 09:14:32 +01:00
Santiago
e5920c211e
Chore: Fix random indices for slices in test files (#61884)
* Fix random indices for slices in test files

* Empty commit
2023-01-24 15:07:37 -03:00
George Robinson
239d94205a
Alerting: Return chan <-error for #61811 (#61858) 2023-01-24 15:41:38 +00:00
Alexander Weaver
7ccc845187
Alerting: Push state history entries to Loki (#61724)
* Implement push endpoint

* Drop duplicated struct

* Genericize auth/tenant headers and improve logging in error case

* Flesh out the data model

* Drop dead code

* Drop log line entirely

* Drop unused arg

* Rename a few type manipulation functions

* Extract label keys as constants

* Improve logs when loki responds with error

* Inline lokiRepresentation function
2023-01-23 16:31:03 -06:00
Sofia Papagiannaki
c7a7ebd3e0
Chore: Drop search service dependency from folder service (#61789)
* Chore: Drop search service dependency from folder service
2023-01-23 14:09:09 +02:00
gotjosh
0be920e61c
Alerting: Remove unused code after importing from grafana/alerting (#61869)
* Alerting: Remove unused code after importing from grafana/alerting
2023-01-23 10:30:10 +00:00
gotjosh
511dab3b4b
Update grafana/alerting to the latest main (#61810)
* Update `grafana/alerting` to the latest main

Also updates prometheus-alertmanager since we use that one directly for some structs.
2023-01-19 20:44:49 +00:00
Sofia Papagiannaki
c104cc7020
Chore: Split folder store and dashboard store interfaces (#61655)
* update folder store mock

* Split folder store and dashboard store interfaces
2023-01-19 18:38:07 +02:00
Alexander Weaver
c10713ea76
Alerting: Create query interface for state history along with annotation-based implementation (#61646) 2023-01-19 10:45:31 +01:00
Yuri Tseretyan
2c46f46d37
Alerting: Rule evaluator to get cached data source info (#61305)
do not skip cache when get data source info
2023-01-18 14:25:11 -05:00
Jean-Philippe Quéméner
44b11d3228
Alerting: support basic auth for the state history loki client (#61696) 2023-01-18 20:24:40 +01:00
George Robinson
d4256b352d
Docs: Rename Message templates to Notification templates (#59477)
This commit renames "Message templates" to "Notification templates"
in the user interface, because the old name suggests that these templates
cannot be used to template anything other than the message. However, these
templates are much more general and can also be used to template other
fields, such as the subject of an email or the title of a Slack message.
2023-01-18 17:26:34 +00:00
Sofia Papagiannaki
b80c9bb974
Chore: Drop dashboard service dependency from folder service (#61614)
* Chore: Drop dashboard dependency from folder service
2023-01-18 17:47:59 +02:00
Santiago
b5fa9e3501
Chore: Fix "manger" typo (#61649)
fix mangers -> managers
2023-01-17 23:13:27 +00:00
Alexander Weaver
1ac89ea040
Alerting: Add client configuration for remote Loki historian backend and test connection (#61114)
* Create loki client type and ping method

* Expose TestConnection on client

* Configure and ping Loki URL

* Close response body reader if present

* Add 30 second timeout

* Remove duplicate close
2023-01-17 13:58:52 -06:00
Kristin Laemmert
f6e3252c00
chore: move notifications models into notifications service (#61638) 2023-01-17 14:47:31 -05:00
Matthew Jacobson
23e05373a7
Alerting: Fix flaky TestIntegrationUpdateAlertRules (#61641)
Prevents random OrgID=0 in test alert generation causing invalid alert rule.
2023-01-17 19:09:46 +00:00
Alexander Weaver
4f1bdc0607
Alerting: Skip flaky test in TestIntegrationUpdateAlertRules (#61627)
* Skip flaky test

* Add comment
2023-01-17 10:39:16 -06:00
Denis Limarev
e6dee8a723
Performance: Preallocate slices (#61580) 2023-01-17 11:50:17 +00:00
idafurjes
7c2522c477
Chore: Move dashboard models to dashboard pkg (#61458)
* Copy dashboard models to dashboard pkg

* Use some models from current pkg instead of models

* Adjust api pkg

* Adjust pkg services

* Fix lint
2023-01-16 16:33:55 +01:00
Jo
dcfeab2c73
AuthN: User Quota (#61540)
* remove reqContext from quota checks in login

* add guards for nil ScopeParams
2023-01-16 11:54:15 +01:00
Yuri Tseretyan
9d57b1c72e
Alerting: Do not persist noop transition from Normal state. (#61201)
* add feature flag `alertingNoNormalState`
* update instance database to support exclusion of state in list operation
* do not save normal state and delete transitions to normal
* update get methods to filter out normal state
2023-01-13 18:29:29 -05:00
Ghazanfar
d553a016cc
Alerting: UI changes required to support v3 and Auth in Kafka Contact Point (#61123) 2023-01-13 17:45:43 -05:00
Alexander Weaver
b289b8ac6e
Alerting: Set error annotation on EvaluationError regardless of underlying error type (#61506)
Set error annotation regardless of underlying error type
2023-01-13 13:58:02 -06:00
gotjosh
e7cd6eb13c
Alerting: Use alerting.GrafanaAlertmanager instead of initialising Alertmanager components directly (#61230)
* Alerting: Use `alerting.GrafanaAlertmanager` instead of initialising Alertmanager components directly
2023-01-13 12:54:38 -04:00
gotjosh
8f72893076
Alerting: Document not supporting inhibition rules (#61313)
* Alerting: Document not supporting inhibition rules

* Update docs/sources/alerting/manage-notifications/create-silence.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/manage-notifications/alertmanager.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2023-01-13 15:06:06 +01:00
gotjosh
49ae1bbe63
Introduce AlertingConfiguration that implements alerting.Configuration (#61427)
* Introduce `AlertingConfiguration` that implements `alerting.Configuration`
2023-01-12 16:03:07 -04:00
gotjosh
fd6f107ded
Alerting Unification: Use the errors from grafana/alerting in Alerts (#61425) 2023-01-12 15:23:34 -04:00
gotjosh
ddb85ad6ad
Use the ClusterPeer interface from grafana/alerting (#61409)
* Use the Cluster interface from grafana/alerting
2023-01-12 14:47:22 -04:00
gotjosh
2d1faae0b5
Alerting Unification: Use alerting.MaintenanceOptions to configure silences and nflog (#61384) 2023-01-12 12:31:38 -04:00
gotjosh
39e429a14b
Alerting Unification: Use the errors from grafana/alerting in Silences (#61334) 2023-01-12 10:03:49 -04:00
gotjosh
f85a948214
Alerting Unification: Use the State interface from the alerting package (#61333) 2023-01-11 19:50:45 -04:00
Denis Limarev
90badc8729
Performance: Add preallocation for some slices (#59593) 2023-01-11 18:03:37 +01:00
Yuri Tseretyan
b4e1e1871f
Alerting: Fix evaluation timeout (#61303) 2023-01-11 10:52:54 -05:00
Yuri Tseretyan
86b5fbbf60
Alerting: Introduce state manager config structure (#61249) 2023-01-10 16:26:15 -05:00
George Robinson
2a291afbae
Alerting: Use consts from alerting package (#61241) 2023-01-10 19:59:13 +00:00
George Robinson
d19d8c6625
Alerting: Update Alerting and Alertmanager to v0.25.1 (#61233)
Update Alerting and Alertmanager to v0.25.1
2023-01-10 16:17:07 +00:00
Yuri Tseretyan
da18c89e91
Alerting: Scheduler to call DeleteAlertRule once when it stops deleted rules (#61189)
scheduler to call DeleteAlertRule once when it stops deleted rules
2023-01-09 14:39:32 -05:00
Yuri Tseretyan
48f1db63ff
Alerting: Add support for tracing to alerting scheduler (#61057) 2023-01-06 21:21:43 -05:00
Alexander Weaver
eb960d9725
Alerting: Add un-documented toggle for changing state history backend, add shells for remote loki and sql (#61072)
* Add toggle for state history backend and shells

* Extract some shared logic and add tests
2023-01-06 12:06:01 -06:00
Alexander Weaver
8c3a5f6da0
Alerting: Allow state history to be disabled through configuration (#61006)
* Add configuration option for if state history should be enabled

* Inject no-op when history is disabled
2023-01-05 12:21:07 -06:00
George Robinson
9af7adef76
Alerting: Support customizable timeout for screenshots (#60981)
This commit adds a customizable timeout for screenshots called
capture_timeout. The default value is 10 seconds, and the maximum
value is 30 seconds. This timeout should be less than the minimum
Interval of all Evaluation Groups to avoid back pressure on alert
rule evaluation.
2023-01-05 16:07:46 +00:00
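The default and maximum described for `capture_timeout` could be enforced along these lines. This is a sketch under stated assumptions: the function name `resolveCaptureTimeout` is hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption, since the commit message only states the default of 10 seconds and the maximum of 30 seconds.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Values taken from the commit message above.
	defaultCaptureTimeout = 10 * time.Second
	maxCaptureTimeout     = 30 * time.Second
)

// resolveCaptureTimeout is a hypothetical helper. Falling back to the default
// for unset values and clamping to the maximum are assumptions made for this
// sketch, not confirmed Grafana behaviour.
func resolveCaptureTimeout(configured time.Duration) time.Duration {
	switch {
	case configured <= 0:
		return defaultCaptureTimeout
	case configured > maxCaptureTimeout:
		return maxCaptureTimeout
	default:
		return configured
	}
}

func main() {
	for _, d := range []time.Duration{0, 20 * time.Second, time.Minute} {
		fmt.Printf("configured=%v effective=%v\n", d, resolveCaptureTimeout(d))
	}
}
```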
Alexander Weaver
0e7640475f
Alerting: Store alertmanager configuration history in a separate table in the database (#60492)
* Update config store to split between active and history tables

* Migrations to fix up indexes

* Implement migration from old format to new

* Move add migrations call

* Delete duplicated rows

* Explicitly map fields

* Quote the column name because it's a reserved word

* Lift migrations to top

* Use XORM for nearly everything, avoid any non trivial raw SQL

* Touch up indexes and zero out IDs on move

* Drop TODO that's already completed

* Fix assignment of IDs
2023-01-04 10:43:26 -06:00
Yuri Tseretyan
4d989860fb
Alerting: Fix conversion of alert state from db state during manager warmup (#60933) 2023-01-04 09:40:04 -05:00
Alexander Weaver
b88b8bc291
Alerting: Fix missing dashboard/panelID links in annotations (#60926)
Assign thru ref
2023-01-03 14:12:27 -06:00
Santiago
05c9af5110
Extract custom template functions (#60695)
extract custom template functions and export the FuncMap
2022-12-22 17:31:40 -03:00
Yuri Tseretyan
f990be58cb
Alerting: Use all notifiers from alerting repository (#60655) 2022-12-22 09:27:18 -05:00
Marcus Efraimsson
c35c689a96
Plugins: Automatically forward plugin request HTTP headers in outgoing HTTP requests (#60417)
Automatically forward core plugin request HTTP headers in outgoing HTTP requests. 
Core datasource plugin authors don't have to specifically handle forwarding of HTTP 
headers, e.g. do not have to "hardcode" the header-names in the datasource plugin, 
if not having custom needs.

Fixes #57065
2022-12-21 13:25:58 +01:00
Yuri Tseretyan
dc2ca80f4d
Alerting: Refactor email notifier (#60602)
* refactor email to not use simplejson

* add tests

* split integration test and unit test + more unit-tests

* Remove outdated comment

Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
2022-12-21 02:03:15 -05:00
Yuri Tseretyan
4a3097f52a
Alerting: Update Discord receiver to use encoding/json to build a webhook message + truncate long message (#60592)
* replace simplejson with models
* truncate too long messages

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-12-20 14:20:42 -05:00
Yuri Tseretyan
aaa55b4252
Alerting: Update Kafka receiver to use encoding/json to build messages (#60593)
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-12-20 14:20:09 -05:00
Yuri Tseretyan
a0bf62cc9e
Alerting: Update receivers to use app version from factory config (#60585) 2022-12-20 11:23:10 -05:00
Yuri Tseretyan
ec45c9c990
Alerting: update dingding, discord, googlechat, kafka, line notifiers to use encoding/json to parse settings (#60542)
also, rename Content to Message to match JSON name for Discord and GoogleChat
2022-12-20 09:46:13 -05:00
Yuri Tseretyan
35090c376c
Alerting: Replace VictorOps receiver with the one from alerting repository (#60543)
* replace victorops with one from alerting

* update other usages
2022-12-20 10:55:41 +01:00
Alexander Weaver
ca3f8ba6f4
Alerting: Refactor alertmanager notifier to use encoding/json to parse settings instead of simplejson (#55507)
* replace basic auth header with method call

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2022-12-19 15:12:49 -05:00
Yuri Tseretyan
f0cabe14d5
Alerting: import Grafana alerting package and update usages (#60490)
* update remaining notifiers to use alerting package
2022-12-19 10:53:58 -05:00
Yuri Tseretyan
92d12fdefa
Alerting: Remove fake secret service in tests (#60488) 2022-12-16 15:01:41 -05:00
Yuri Tseretyan
9ad45aedcf
Alerting: replace usage of simplejson to json.RawMessage in NotificationChannelConfig (#60423)
* introduce alias for json.RawMessage with name RawMessage. This is needed to keep raw JSON and implement a marshaler for YAML, which does not seem to be used but there are tests that fail.
* replace usage of simplejson with RawMessage in NotificationChannelConfig
* remove usage of simplejson in tests
* change migration code to convert simplejson to raw message
2022-12-16 13:01:06 -05:00
Alexander Weaver
91bd1cdb41
Revert "Alerting: Store alertmanager configuration history in a separate table in the database" (#60470)
Revert "Alerting: Store alertmanager configuration history in a separate table in the database (#60197)"

This reverts commit ec80f38c34.
2022-12-16 10:07:44 -05:00
Alex Moreno
174c61b949
Alerting: Set Dashboard and Panel IDs on rule group replacement (#60374)
* Set Dashboard and Panel IDs on rule group replacement

* fix comments and abbreviate test variable name

* Update pkg/services/ngalert/provisioning/alert_rules.go

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>

Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
2022-12-16 11:47:25 +01:00
Alexander Weaver
ec80f38c34
Alerting: Store alertmanager configuration history in a separate table in the database (#60197)
* Update config store to split between active and history tables

* Migrations to fix up indexes

* Implement migration from old format to new

* Move add migrations call

* Delete duplicated rows

* Explicitly map fields

* Quote the column name because it's a reserved word

* Lift migrations to top
2022-12-15 17:35:00 -06:00
Yuri Tseretyan
6637333748
Alerting: refactor notifiers to use package specific Logger interface (#60361)
* introduce Logger interface local to channels + implementation that wraps the Grafana logger
* make NewFactoryConfig accept LoggerFactory
* add logger field to FactoryConfig
* update usages of log.Logger to internal interface
2022-12-15 11:10:31 -05:00
Sofia Papagiannaki
11d8bcbea9
Guardian: Introduce additional constructors (#59577)
* Guardian: Use dashboard UID instead of ID

* Apply suggestions from code review

Introduce several guardian constructors and each time use
the most appropriate one.
2022-12-15 16:34:17 +02:00
Yuri Tseretyan
0e7c95a4d2
Alerting: Remove reference to global models package in channels package (#60358)
* remove intermediate struct to create Base struct
* fix alertmanager
2022-12-14 16:21:55 -05:00
Kristina
5a7f38053b
Remove explore compact URLs (#59686)
* Remove explore compact URLs

* Remove two explore link builders that create compact URLs

* Fix merge conflict
2022-12-14 12:57:53 -06:00
Yuri Tseretyan
de008005ce
Alerting: isolate ImageStore in notify package (#60353) 2022-12-14 13:20:20 -05:00
Yuri Tseretyan
7c3ab4a715
Alerting: Remove dependency on Grafana notifications package in alerting notifiers (#60271)
* create sender service interface and bridge to grafana notifier service
* update notifiers to use local sender interface
2022-12-14 10:59:37 -05:00
Yuri Tseretyan
07b5043222
Alerting: Add support for settings parse_mode and disable_notifications to Telegram receiver (#60198) 2022-12-14 10:44:39 -05:00
Yuri Tseretyan
ad09feed83
Alerting: rule backtesting API (#57318)
* Implement backtesting engine that can process regular rule specification (with queries to datasource) as well as special kind of rules that have data frame instead of query.
* declare a new API endpoint and model
* add feature toggle `alertingBacktesting`
2022-12-14 09:44:14 -05:00
Alexander Weaver
821614fb43
Alerting: Align notifier truncation and logging with prometheus/alertmanager (#59339)
* Move truncation code to util to mirror upstream

* Resolve merge conflicts

* Align logging of alert key

* Update tests and fix field passing bug

* Remove superfluous newline in test now that we trim whitespace

* Uptake minor log changes from upstream
2022-12-13 19:50:24 -06:00
Alexander Weaver
e97b43cd58
Alerting: Add provisioning endpoint to fetch all rules (#59989)
* Domain layer api for fetching all rules

* Add endpoint for getting all rules
2022-12-13 11:54:08 +01:00
Alexander Weaver
595e623c28
Alerting: Additional tests for the config store (#60130)
Additional tests for the config store
2022-12-12 11:11:18 -06:00
Yuri Tseretyan
df7f636759
Alerting: Fix slack receiver to close file descriptors when they're not needed anymore (#60178) 2022-12-12 11:19:02 -05:00
Yuri Tseretyan
4374966987
Alerting: Replace hardcoded <no value> to [no value] in label expansion (#60129)
* replace hardcoded <no value> to [no value] in label expansion
2022-12-12 10:12:30 -05:00
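For context on the placeholder mentioned in the commit above: Go's `text/template` prints `<no value>` when a key is missing from a map whose element type is an interface. One way to swap it for `[no value]` is sketched below. The `expand` helper is hypothetical and the use of `strings.ReplaceAll` is an assumption about the approach; the actual Grafana implementation may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// expand is a hypothetical helper: it renders tmpl against labels and swaps
// the template engine's "<no value>" placeholder for "[no value]".
func expand(tmpl string, labels map[string]interface{}) (string, error) {
	t, err := template.New("label").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, labels); err != nil {
		return "", err
	}
	return strings.ReplaceAll(buf.String(), "<no value>", "[no value]"), nil
}

func main() {
	// "team" is deliberately absent from the label set.
	labels := map[string]interface{}{"instance": "web-1"}
	out, err := expand("host={{ .instance }} team={{ .team }}", labels)
	fmt.Println(out, err)
}
```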
Joe Blubaugh
1a8d0e2736
Alerting: Speed up unit and integration tests. (#60067)
This change marks tests in the `sender` package that use an external
process as integration tests instead of unit tests. This speeds up the
package's unit tests by about 20 seconds.

This change also reduces the number of alert instances in the `store`
package's bulk write integration test from 20_000 to 10_000. This is
still enough to exercise the bulk-write code but speeds up the package
tests from about 250s to 130s.

Put together, integration tests go to about 160s while also speeding up
unit tests by 20s.
2022-12-12 14:21:06 +08:00
George Robinson
76601f3ae7
Alerting: Better define how we set states (#59977)
This commit better defines how we set states in resultNormal,
resultAlerting, resultError and resultNoData. It changes the existing
code to call methods such as SetAlerting, SetPending, SetNormal,
SetError and NoData instead of assigning values to each individual field
whenever the state is changed. This should make it easier to understand
what fields should be set for which states and avoid cases where states are
missing, or have additional unexpected fields.
2022-12-08 20:12:13 +00:00
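The idea of setting states through dedicated methods, rather than assigning individual fields at each call site, can be illustrated as follows. This is a simplified sketch: `alertState` and its fields are hypothetical and much smaller than Grafana's real state type, and only two of the setters named in the commit are shown.

```go
package main

import (
	"fmt"
	"time"
)

// alertState is a hypothetical, simplified state record used only for this
// illustration.
type alertState struct {
	Name     string
	Reason   string
	StartsAt time.Time
	EndsAt   time.Time
	Err      error
}

// SetAlerting sets every field that belongs to the Alerting state and clears
// the ones that do not, so callers cannot leave stale values behind.
func (s *alertState) SetAlerting(reason string, startsAt, endsAt time.Time) {
	s.Name = "Alerting"
	s.Reason = reason
	s.StartsAt = startsAt
	s.EndsAt = endsAt
	s.Err = nil
}

// SetNormal does the same for the Normal state.
func (s *alertState) SetNormal(reason string, startsAt, endsAt time.Time) {
	s.Name = "Normal"
	s.Reason = reason
	s.StartsAt = startsAt
	s.EndsAt = endsAt
	s.Err = nil
}

func main() {
	now := time.Now()
	s := &alertState{}
	s.SetAlerting("threshold exceeded", now, now.Add(5*time.Minute))
	fmt.Println(s.Name, s.Reason)
	s.SetNormal("", now, now)
	fmt.Println(s.Name, s.Reason == "")
}
```

Keeping the field assignments inside one method per state makes it harder for a transition to forget a field or carry over an error from a previous state.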
Yuri Tseretyan
316870c658
Alerting: PagerDuty receiver to let user configure fields Source, Client and Client URL (#59895)
* add support for source field
* add client_url
* use real host name for source placeholder
2022-12-08 11:49:27 -05:00
Joe Blubaugh
e6743a7e9a
Alerting: Use the QuotaTargetSrv instead of the QuotaTarget in quota check (#60026)
Before this change, the alerting provisioning system incorrectly used
the QuotaTarget to check if alerting's request quota had been reached.
The quota service requires the QuotaTargetSrv, which is what's
registered with the service at startup time. This is leading to errors
in the provisioning system.
2022-12-08 22:34:46 +08:00
Yuri Tseretyan
c5ee4e4ae1
Alerting: Improve rule validation to check if rule uses backend datasources (#58986)
* validate if rule uses backend datasources

* add backend datasource to test

* fix tests

* another forgotten import

* remove unused var
2022-12-08 10:44:02 +01:00
George Robinson
6359dab040
Alerting: Change resultError in preparation for supporting ForError duration (#59894) 2022-12-07 10:45:56 +00:00
Serge Zaitsev
43f40e6c7c
Chore: Replace yaml.v2 with yaml.v3 (#59897)
* replace yaml.v2 with yaml.v3

* fix a few tests due to the yaml.v3 api changes

* and another goconvey mistake in tests
2022-12-06 21:17:17 +01:00
George Robinson
3c249e1b99
Fix incorrect start time for DatasourceError alerts (#59903) 2022-12-06 18:44:06 +00:00
Yuri Tseretyan
abb49d96b5
Alerting: update state manager to return StateTransition instead of State (#58867)
* improve test for stale states
* update state manager return StateTransition
* update scheduler to accept state transitions
2022-12-06 13:07:39 -05:00
Yuri Tseretyan
a85adeed96
Alerting: Update state history service to filter states transitions (#58863)
* rename the method to better reflect its behavior
* make historian filter transition on itself
* call historian with all changes
2022-12-06 12:33:15 -05:00
Yuri Tseretyan
eeb57cd520
Alerting: Refactor PagerDuty and OpsGenie notifiers to use encoding/json to parse settings (#58925)
* update pagerduty and opsgenie to deserialize settings using standard JSON library
* update pagerduty truncation to use a function from Alertmanager package
* update opsgenie to use payload model (same as in Alertmanager)
2022-12-05 11:38:50 -05:00
Yuri Tseretyan
866aea0db2
Alerting: fix UI element for PagerDuty's severity field configuration (#58927)
* make severity a regular text field
* add logs + fallback to critical if empty
2022-12-05 11:02:20 -05:00
Alexander Weaver
9977c7ea43
Alerting: Simplify scheduler configuration and remove dependency on Grafana-wide settings (#59735)
* Make scheduler not depend directly on grafana-wide settings

* Re-add missing interval
2022-12-02 16:02:07 -06:00
George Robinson
ec1d93c8ab
Alerting: Upload images to Slack via files.upload (#59163)
This commit makes a number of changes to how images work in Slack
notifications.

It adds support for uploading images to Slack via the files.upload
API when the contact point has a token. Images are no longer linked
via a URL if a token is present.

Each image uploaded to Slack is posted as a reply to the original
notification. Up to maxImagesPerThreadTs images can be posted as
replies before a final message is sent with:

  There are more images than can be shown here. To see the panels for
  all firing and resolved alerts please check Grafana

Incoming Webhooks cannot upload files via files.upload and so webhooks
require the image to be uploaded to cloud storage and linked via URL.
2022-12-02 09:41:24 +00:00
Alexander Weaver
1481ace528
Alerting: Fix swallowing of errors when attaching images to notifications (#59432)
* Break out image logic and add logging

* Attach alert log context to image attachment

* Fix capitalization
2022-11-29 13:18:47 -06:00
Sofia Papagiannaki
02b6b09121
Nested Folders: Set user in the API level (#59148) 2022-11-23 11:13:47 +02:00
Denis Limarev
4d8287b319
Performance: add preallocation for some slice/map (#57860)
This change preallocates slices and maps where the size of the data is known before the object is created.

Co-authored-by: Joe Blubaugh <joe.blubaugh@grafana.com>
2022-11-22 20:24:36 +08:00
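Preallocation as described in the commit above typically looks like the following in Go. The function names `squares` and `indexByName` are invented for this sketch and do not come from the Grafana code base; the point is only the capacity argument passed to `make` when the final size is known in advance.

```go
package main

import "fmt"

// squares reserves capacity for n elements up front, so the appends below
// never need to grow the backing array.
func squares(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// indexByName sizes the map from the known number of entries.
func indexByName(names []string) map[string]int {
	idx := make(map[string]int, len(names))
	for i, name := range names {
		idx[name] = i
	}
	return idx
}

func main() {
	s := squares(4)
	fmt.Println(s, len(s), cap(s))
	fmt.Println(indexByName([]string{"a", "b"}))
}
```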
Sasha Melentyev
c02003af3c
Refactor time durations (#58484)
This change uses `time.Second` in place of `1000 * time.Millisecond` and `time.Minute` in place of `60*time.Second`.
2022-11-22 15:09:15 +08:00
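The substitution in the commit above is behaviour-preserving because the `time` package defines `Second` and `Minute` as exact multiples of the smaller units. A small check, with `sameDurations` as a name invented for this sketch:

```go
package main

import (
	"fmt"
	"time"
)

// sameDurations reports whether the verbose spellings equal the named
// constants that replace them.
func sameDurations() bool {
	return 1000*time.Millisecond == time.Second &&
		60*time.Second == time.Minute
}

func main() {
	fmt.Println(sameDurations())
}
```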
Bart Peeters
57d6adbc7c
Alerting: Support Prometheus durations in Provisioning API (#58293)
Provisioning API should support Prometheus durations
2022-11-21 18:58:25 +00:00
Yuri Tseretyan
b57689e07e
Alerting: Add header X-Grafana-Org-Id to evaluation requests (#58972) 2022-11-21 10:13:44 +01:00
Yuri Tseretyan
8c72d19bcc
Alerting: Refactor MS teams, Pushover and Webhook notifiers to use encoding/json to parse settings (#56834)
* update teams
* update sensugo
* update pushover
* update webhook to use json.Number
2022-11-18 09:24:12 -05:00
Karl Persson
fef1e1d5bc
Auth: Refactor auth package (#58920)
* Auth: move interface to its own file

* Auth: move to test package

* Auth: move quota consts to auth file

* Auth: move service to impl package

* Auth: move interfaces and related models to auth package

* Auth: Create sub package and type alias to avoid circular dependency
2022-11-18 09:56:06 +01:00
Gilles De Mey
ea27eca147
Email: Use MJML email templates (#57751)
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-11-17 21:41:46 +01:00
matt abrams
74010fd05d
Admin: Fix broken links to image assets in email templates (#58729)
fix broken links to image assets
2022-11-16 14:17:39 +01:00
Torkel Ödegaard
84a69135a7
Scene: Variables and support for declaring variable dependencies and getting notified or re-rendered when they change (#58299)
* Component that can cache and extract variable dependencies

* Component that can cache and extract variable dependencies

* Updates

* Refactoring

* Lots of refactoring and iterations of supporting both re-rendering and query re-execution

* Updated SceneCanvasText

* Updated name of file

* Updated

* Refactoring a bit

* Added back getName

* Added comment

* minor fix

* Minor fix

* Merge fixes

* Merge fixes

* Some review fixes

* Updated comment

* Added forceRender function

* Add back fail on console log
2022-11-15 12:54:24 +01:00
Sofia Papagiannaki
9855e74b92
Chore: Refactor quota service (#58643)
Chore: Refactor quota service (#57586)

* Chore: refactor quota service

* Apply suggestions from code review
2022-11-14 21:08:10 +02:00
Yuri Tseretyan
28d39d35fd
Alerting: Update state manager to save state transitions in one batch (#58358)
* change stale results handler to not update database but return transitions
* save all transitions in one call
2022-11-14 10:57:51 -05:00
Alex Moreno
78bb8c10ce
Alerting: Allow none provenance alert rule creation from provisioning API (#58410) 2022-11-11 19:58:45 +01:00
gotjosh
d748979048
Alerting: Implement the Webex notifier (#58480)
* Alerting: Implement the Webex notifier

Closes https://github.com/grafana/grafana/issues/11750

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2022-11-11 17:27:13 +00:00
idafurjes
080ea88af7
Nested Folders: Support getting of nested folder in folder service wh… (#58597)
* Nested Folders: Support getting of nested folder in folder service when feature flag is set

* Fix lint

* Fix some tests

* Fix ngalert test

* ngalert fix

* Fix API tests

* Fix some tests and lint

* Fix lint 2

* Fix library elements and panels

* Add access control to get folder

* Cleanup and minor test change
2022-11-11 14:28:24 +01:00
Alex Moreno
45facbba11
Alerting: Remove url based external alertmanagers config (#57918)
* Remove URL-based alertmanagers from endpoint config

* WIP

* Add migration and alertmanagers from admin_configuration

* Empty comment removed

* set BasicAuth true when user is present in url

* Remove Alertmanagers from GET /admin_config payload

* Remove URL-based alertmanager configuration from UI

* Fix new uid generation in external alertmanagers migration

* Fix tests for URL-based external alertmanagers

* Fix API tests

* Add more tests, move migration code to separate file, and remove possible am duplicate urls

* Fix edge cases in migration

* Fix imports

* Remove useless fields and fix created_at/updated_at retrieval

Co-authored-by: George Robinson <george.robinson@grafana.com>
Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>
2022-11-10 16:34:13 +01:00
George Robinson
c5ae1bcfe0
Alerting: Fix logging pointer address of DashboardUID and PanelID variables (#58539) 2022-11-10 09:58:38 +00:00
George Robinson
68600c224b
Alerting: Log when alert rule cannot be screenshot to help debugging (#58537) 2022-11-10 09:41:31 +00:00
Sofia Papagiannaki
bf5a08e039
API: Support creating a nested folder (#58508)
* API: Support nested folder creation

* Update swagger

* fixup

* Update pkg/api/dtos/folder.go

Co-authored-by: Serge Zaitsev <serge.zaitsev@grafana.com>

* Fix some tests

* create legacy folder url from title and uid

Co-authored-by: idafurjes <36131195+idafurjes@users.noreply.github.com>
Co-authored-by: Serge Zaitsev <serge.zaitsev@grafana.com>
Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
2022-11-10 04:41:03 -05:00
Alexander Weaver
2bfdda5b68
Alerting: Break dependency between state and image packages (#58381)
* Refactor state and manager to not depend directly on image interface

* Move generic errors to models package

* Move NotAvailableImageService to state as its only references are in state tests

* Move NoopImageService to state package

* Move mock to state package

* Fix linter error

* Fix comment styling

* Fix a couple added references introduced by rebase

* Empty commit to kick build
2022-11-09 15:06:49 -06:00
Yuri Tseretyan
bad4f28d0d
Alerting: update test TestAlertingTicker to not rely on clock (#58544)
* extract method processTick
* make processTick return scheduled rules
* move state manager tests to state manager
* update test
* move all tests into one file
* remove unused fields
2022-11-09 15:08:57 -05:00
George Robinson
7e852720e3
Alerting: Fix images cached on rule instead of dashboard panel signature (#58510) 2022-11-09 17:01:48 +00:00
George Robinson
b92a0223e3
Alerting: Improve debug logs in image service (#58507) 2022-11-09 16:32:58 +00:00
George Robinson
1290951b65
Alerting: Small improvements to staleResultsHandler (#58007) 2022-11-09 11:08:32 +00:00
George Robinson
c646ff0ce3
Alerting: Fix screenshots were not cached (#58493) 2022-11-09 01:52:16 +00:00
George Robinson
ad9ac85ee0
Alerting: Use hash of opts in singleflight (#58474) 2022-11-08 22:37:49 +00:00
Kristin Laemmert
a255c32e1a
nested folders: support creation of nested folders in folder service when feature flag is set (#58364)
* nested folders: support creation of nested folders in folder service when feature flag is set
2022-11-08 08:59:55 -05:00
Kristin Laemmert
ef7145e4aa
feat(nested folders): Add CountAlertRulesInFolder to ngalert store (#58269)
* chore: refactor CountDashboardsInFolder to use the more efficient Count() sql function

* feat(nested folders): Add CountAlertRulesInFolder to ngalert store

This commit adds CountAlertRulesInFolder and a new model for the CountAlertRulesQuery. It returns a count of alert rules associated with a given orgID and parent folder UID. (the namespace referenced inside alert rules is the parent folder).

I'm not sure where this belongs in the ngalert service, so that will come in a future commit.
2022-11-08 11:51:00 +01:00
Sofia Papagiannaki
96cdf77995
Revert "Chore: Refactor quota service (#57586)" (#58394)
This reverts commit 326ea86a57.
2022-11-08 11:52:07 +02:00
Sofia Papagiannaki
326ea86a57
Chore: Refactor quota service (#57586)
* Chore: refactor quota service

* Apply suggestions from code review
2022-11-08 10:25:34 +02:00
George Robinson
8353f307aa
Alerting: Fix test fails in some environments (#58251) 2022-11-07 16:34:37 +00:00
Yuri Tseretyan
3621cf5a12
Alerting: Update handling of stale state (#58276)
* delete all stale states in one lock
* do not use touched states to detect stale states; rely only on a correctly maintained LastEvaluationTime
* fix tests to use correct eval time
* delete unused method
2022-11-07 11:03:53 -05:00
Neel
db1fd10ff1
Alerting: Append org ID to alert notification URLs (#57123) 2022-11-07 16:03:25 +00:00
Yuri Tseretyan
623de12e35
Alerting: Create AlertInstanceKey in one place (#58278)
* use method GetAlertInstanceKey
* do not add key if error
2022-11-07 09:35:29 -05:00
Yuri Tseretyan
f9c88e72ae
Alerting: Update saveAlertStates in state manager to not return results (#58279) 2022-11-07 09:09:19 -05:00
Yuri Tseretyan
978f1119d7
Alerting: Run state manager as regular sub-service (#58246) 2022-11-04 17:06:47 -04:00
Ryan McKinley
e6a9fa1cf9
ServiceAccounts: enable service accounts after IsRealUser change (#58263)
* support service accounts

* add: IsServiceAccount to scheduleUser in scheduler

Co-authored-by: eleijonmarck <eric.leijonmarck@gmail.com>
2022-11-04 15:53:35 -04:00
Yuri Tseretyan
dce8879145
Alerting: Update state manager to accept rule store as Warm method argument (#58244) 2022-11-04 14:23:08 -04:00
Will Jordan
d581b368bd
Alerting: Remove duplicate Slack notification title (#58107)
Move mentions to a markdown-formatted pretext field
to prevent issues mixing blocks and legacy-attachment content.
2022-11-04 17:09:24 +01:00
Alexander Weaver
cc8c1380e2
Alerting: Persist annotations from multidimensional rules in batches (#56575)
* Reduce piecemeal state fields

* Read data directly off state instead of rule

* Unify state and context into single struct

* Expose contextual information to layer above setNextState

* Work in terms of ContextualState and call historian in batches

* Call annotations service in batches

* Export format state and reason and remove workaround in unrelated test package

* Add new method to annotation service for batch inserting

* Fix loop variable aliasing bug caught by linter, didn't change behavior

* Incl timerange on annotation tests

* Insert one at a time if tags are present

* Point to rule from ContextualState rather than copy fields

* Build annotations and copy data prior to starting goroutine

* Rename to StateTransition

* Use new bulk-insert utility

* Remove rule from StateTransition and pass in directly to historian

* Simplify annotations logic since we have only one rule

* Fix logs and context, nilcheck, simplify method name

* Regenerate mock
2022-11-04 10:39:26 -05:00
Dan Cech
9ea6a43089
Build: clean up and document integration test convention (#58170)
* clean up and document integration test convention

* clarify integration test conventions

* clean up integration tests that don't follow convention

* mark testIntegration* functions as helpers to avoid confusion
2022-11-04 10:14:21 -04:00
Eric Leijonmarck
72d0c6b428
Auth: add IsServiceAccount to IsRealUser (#58015)
* add: IsServiceAccount to SignedInUser and IsRealUser

* fix: linting error

* refactor: add function IsServiceAccountUser()

The new function IsServiceAccountUser() is used in HasUniqueID() to
identify service accounts, since caching is built on having a unique ID;
see comment: https://github.com/grafana/grafana/pull/58015#discussion_r1011361880
2022-11-04 12:39:54 +00:00
Alex Moreno
3558cadb7e
Alerting: Add title and description to Webhook contact point (#58058)
* Add title and description to Webhook contact point

* Remove deprecation message
2022-11-03 10:52:07 +01:00
Alex Moreno
ba15d675e7
Alerting: Add values to annotations (#57738)
* Add values to annotations

* Fix imports

* Use State attrs instead of Result attrs

* Remove unnecessary variable
2022-11-03 10:35:34 +01:00
George Robinson
f2e4cb7c4e
Alerting: Fix feedback (#57922) 2022-11-02 22:36:14 +00:00
George Robinson
215ffee437
Alerting: Fix screenshot is not taken for stale series (#57982) 2022-11-02 22:14:22 +00:00
Yuriy Tseretyan
e3a4bde622
Alerting: Condition evaluator with cached pipeline (#57479)
* create rule evaluator
* load header from the context
* init one factory
* update scheduler
2022-11-02 10:13:39 -04:00
George Robinson
4c581b5f85
Alerting: Fix response is not returned for invalid Duration in Provisioning API (#58046) 2022-11-02 08:21:23 -04:00
George Robinson
b0a927b138
Alerting: Add debug logs in validateAndGetPrefix (#57002) 2022-10-31 16:40:28 +00:00
Yuriy Tseretyan
3294918e9f
Alerting: Update state manager to support nil stores and metrics (#57791) 2022-10-28 13:10:28 -04:00
Yuriy Tseretyan
d848cc629b
Alerting: Refactor rule interval validation to be reusable (#57792) 2022-10-28 14:40:11 +00:00
Alex Moreno
c08c14f8dd
Alerting: Add custom title to pushover contact point (#57530)
* Add custom title to pushover contact point

* Update pkg/services/ngalert/notifier/channels/pushover.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Use simplejson

* Use more verbose variable names

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 19:07:01 +02:00
Alex Moreno
10fdfa8583
Alerting: Change handling of settings to pagerduty contact point (#57524)
* Add custom title to pagerduty contact point

* Fix tests by saving decrypted key

* Use simplejson
2022-10-27 16:20:10 +02:00
Alex Moreno
f8d12af021
Add custom title to googlechat contact point (#57517)
* Add custom title to googlechat contact point

* Update pkg/services/ngalert/notifier/channels/googlechat.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Use simplejson

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 16:19:48 +02:00
Alex Moreno
3d437117ad
Alerting: Add custom title to discord contact point (#57506)
* Add custom title to discord contact point

* Update pkg/services/ngalert/notifier/channels/discord.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Use simplejson

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 16:17:18 +02:00
Alex Moreno
1ab0af1eb2
Alerting: Add custom title to DingDing contact point (#57498)
* Add custom title to DingDing contact point

* Update pkg/services/ngalert/notifier/channels_config/available_channels.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/notifier/channels/dingding.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Add error checking before URL templating

* Remove comment

* Use simplejson

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 16:16:36 +02:00
Alex Moreno
fb62660df7
Alerting: Add title and description to VictorOps contact point (#57458)
* Add title and description to VictorOps contact point

* Update pkg/services/ngalert/notifier/channels_config/available_channels.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 16:12:14 +02:00
Alex Moreno
73a9e2a115
Add title and description to Threema contact point (#57429) 2022-10-27 16:11:38 +02:00
Alex Moreno
6839154720
Alerting: Add missing custom title and description to Line contact point (#57388)
* Add title and description to Line receiver

* Fix label names for LINE contact point
2022-10-27 15:27:04 +02:00
Alex Moreno
1dcc432537
Alerting: Add missing custom title and description fields in Kafka contact point (#57361)
* Add description and details to Kafka notifier

* Fixed testing and add new logic testing

* Add proper description to kafka contact point UI

* Update pkg/services/ngalert/notifier/channels_config/available_channels.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

* Update pkg/services/ngalert/notifier/channels_config/available_channels.go

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>

Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2022-10-27 15:25:54 +02:00
Santiago
cdb5d4230a
Alerting: Fix "Not Implemented" responses (#57710)
* fix swagger spec, return 404 instead of 501 when an endpoint does not exist

* update number of paths in authorization_test.go
2022-10-26 23:35:52 -03:00
Yuriy Tseretyan
0a4121cef8
Alerting: Contextual log provider for rule key (#57476)
* create contextual log context provider
* use contextual provider in scheduler
* init logger in the package
* use context for log context
* use context in state manager
2022-10-26 19:16:02 -04:00
Yuriy Tseretyan
2d20c8db7b
Chore: Expression engine to support relative time range (#57474)
* make TimeRange interface and add relative range
* make Execute methods support the current time
* update resample to support relative time range
* update DSNode to support relative time range
* update query service to create queries with absolute time
* make alerting evaluator create relative time ranges
2022-10-26 16:13:58 -04:00
Galen Kistler
f93c3acc51
Prometheus: Flavor/version configuration (#57554)
* Revert "Revert "Prometheus: Type and flavor configuration (#56496)" (#57552)"
This reverts commit 2432ce619a.
* Adds new fields and documentation for Prometheus datasource configuration: prometheus type, and version
2022-10-24 14:53:11 -05:00
Galen Kistler
2432ce619a
Revert "Prometheus: Type and flavor configuration (#56496)" (#57552)
This reverts commit 7ecbc98b3e.
2022-10-24 12:33:11 -05:00
Galen Kistler
7ecbc98b3e
Prometheus: Type and flavor configuration (#56496)
* Adding two new fields to the data JSON in the prometheus datasource configuration: prometheusType, and prometheusVersion.
* Version field will attempt to auto-detect via buildinfo API when prometheus Type is selected
2022-10-24 09:26:32 -05:00
Alexander Weaver
de46c1b002
Alerting: Improve logs in state manager and historian (#57374)
* Touch up log statements, fix casing, add and normalize contexts

* Dedicated logger for dashboard resolver

* Avoid injecting logger to historian

* More minor log touch-ups

* Dedicated logger for state manager

* Use rule context in annotation creator

* Rename base logger and avoid redundant contextual loggers
2022-10-21 16:16:51 -05:00
Alexander Weaver
5ee4744d62
Alerting: Improve operational logs in sender package (#57134)
* Audit logs in sender package

* Fix casing and touch up a few key names

* Avoid logging entire alert struct

* Log configuration ID being applied

* Revert change to errorf rather than log

* Tune levels further and remove some redundancies

* Adjust logger naming and standardize log context

* Adjust logger naming in router

* Move log and get rid of dead error handling code
2022-10-20 14:19:04 -05:00
Yuriy Tseretyan
f3c219a980
Alerting: update format of logs in scheduler (#57302)
* Change the severity level of the log messages
2022-10-20 13:43:48 -04:00
Alexander Weaver
3ddb28bad9
Find-and-replace 'err' logs to 'error' to match log search conventions (#57309) 2022-10-19 17:36:54 -04:00
Yuriy Tseretyan
3e6bc28de5
Alerting: Change severity level of fetcher log messages (#57299) 2022-10-19 16:00:47 -04:00
Alexander Weaver
4eb8e4ff66
Alerting: Add traceability headers for alert queries (#57127)
* Define EvaluationContext

* Refactor ConditionEval to use new context struct

* Refactor QueriesAndExpressionsEval to use EvaluationContext

* Remove dead field from AlertExecCtx

* Refactor Validate to use EvaluationContext

* Get rid of privately used AlertExecCtx

* Move EvaluationContext to new file and add helper

* Add builder pattern and bind rule info to context

* Extract header logic and add rule UID header

* Fix missing call
2022-10-19 14:19:43 -05:00
Santiago
85cda0db69
Alerting: Templated URLs for webhook type contact points (#57296)
* templated URLs for webhooks

* clear tmplErr before using tmpl() again
2022-10-19 16:14:53 -03:00
Kristin Laemmert
05709ce411
chore: remove sqlstore & mockstore dependencies from (most) packages (#57087)
* chore: add alias for InitTestDB and Session

Adds an alias for the sqlstore InitTestDB and Session, and updates tests using these to reduce dependencies on the sqlstore.Store.

* next pass of removing sqlstore imports
* last little bit
* remove mockstore where possible
2022-10-19 09:02:15 -04:00
aimuz
c0cc85b5f1
Alerting: Add support for wecom apiapp (#55991)
This change adds new functionality to the wecom alerting contact point. In addition to a webhook address, you can now send alerts to the wecom apiapp endpoint.

Based on https://github.com/grafana/grafana/discussions/55883

Signed-off-by: aimuz <mr.imuz@gmail.com>
2022-10-19 12:17:37 +08:00
ying-jeanne
ed98d7bc27
Chore: remove busmock (#57170) 2022-10-18 13:31:56 +00:00
Santiago
6ad405e256
fix swagger spec for receivers API response (#57124) 2022-10-17 16:58:55 -03:00
Yuriy Tseretyan
888bdfd4ad
Alerting: Use correct response body for silence post API (#57114) 2022-10-17 15:43:37 -04:00
Alexander Weaver
129a28919b
Alerting: Cache result of dashboard ID lookups (#56587)
* Create caching dashboard resolver

* A couple tests for dashboard resolving

* Log warning on not found

* Additional polish + review nits

* Move to singleflight instead of a plain mutex

* Store errors instead of -1 in cache and use reflection when reading

* Address linter error

* One more linter error
2022-10-14 15:48:02 -05:00
Kristin Laemmert
c61b5e85b4
chore: replace sqlstore.Store with db.DB (#57010)
* chore: replace sqlstore.SQLStore with db.DB

* more post-sqlstore.SQLStore cleanup
2022-10-14 15:33:06 -04:00
George Robinson
2f85172718
Alerting: Remove blank comment (#56889) 2022-10-14 13:28:41 +01:00
Santiago
3c56fd8da0
Fix duplicated receivers in API response (#56829) 2022-10-13 10:01:28 -03:00
Joe Blubaugh
c7c640d903
Alerting: Fix email image embedding on Windows. (#56766)
The email notifier was incorrectly handling Windows filepaths. This is
fixed by using the `path/filepath` package.
2022-10-13 10:24:00 +08:00
Matt
26bb139470
Fixes 48972 - Exposes channels.WebhookMessage (#56140) 2022-10-12 09:50:28 +01:00
Armand Grillet
74a79b517d
Update Alerting changelog (#56684)
Now simpler to use.
2022-10-11 10:55:18 +00:00
George Robinson
52965de369
Alerting: Add doc comments to state struct and normalize fields (#56647) 2022-10-11 09:30:33 +01:00
Serge Zaitsev
53baecd71f
Chore: Move folder service into a separate package (#56591)
* Chore: move folder service interface into a separate package

* copy implementation into a standalone package

* move implementation and tests to the new folder package

* remove leftovers from wire

* add test doubles for folder service

* fix tests in library panels/elements

* fix provideservice in ngalert
2022-10-10 21:47:53 +02:00
George Robinson
802d67eeca
Alerting: Support values in notification templates (#56457)
We have received a lot of feedback regarding the ValueString in alert notifications. Perhaps one of the most frequent complaints about ValueString is that it is difficult to read because it contains a lot of information, and the information is shown as a JSON-like string. Users have often asked how it can be templated and the answer is that it can't.

Until now users have been able to add custom annotations to their alert rules which contain values via the $values variable added in previous versions of Grafana. However, these custom annotations must be added for each of the user's alert rules, instead of once in a template that all of their alerts can be notified via.

This commit then adds the much-requested feature to support values in notification templates. Users can then create a single template that prints the annotations, labels and values of their alerts in a format of their choice!
2022-10-10 13:40:21 +01:00
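
As a rough illustration of the idea in the commit above, the sketch below renders labels and values through a single Go `text/template`. Everything in it is an assumption made for the example: the `alert` struct, its field names, the template text, and the sample data are hypothetical and do not reflect the actual data model that Grafana exposes to notification templates.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alert is a hypothetical stand-in for the data passed to a notification
// template. It is not Grafana's real template data type.
type alert struct {
	Labels      map[string]string
	Annotations map[string]string
	Values      map[string]float64
}

// tmplText is one shared template that prints the alert name followed by
// every value. text/template visits map keys in sorted order.
const tmplText = `{{ .Labels.alertname }}: {{ range $k, $v := .Values }}{{ $k }}={{ $v }} {{ end }}`

// render executes the shared template against a single alert.
func render(a alert) (string, error) {
	t, err := template.New("msg").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, a); err != nil {
		return "", err
	}
	return strings.TrimSpace(sb.String()), nil
}

func main() {
	out, err := render(alert{
		Labels: map[string]string{"alertname": "HighCPU"},
		Values: map[string]float64{"B": 81.5, "C": 1},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "HighCPU: B=81.5 C=1"
}
```

The point of the sketch is that the formatting lives in one template rather than in a custom annotation repeated on every alert rule.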
Joe Blubaugh
7312a2dab0
Alerting: Mark all tests that interact with the database as Integration tests. (#54875)
Previously, two tests were not explicitly marked as integration tests
and so were not run against all 3 supported databases in the CI
environment.
2022-10-10 01:54:54 -04:00
Yuriy Tseretyan
e2f1201382
Alerting: Fix migration to not add label "alertname" (#56509)
* do not add label alertname because it is overridden in state manager anyway
* update state manager to not consider labels with same value as dupe
2022-10-07 15:06:53 -04:00