grafana

mirror of https://github.com/grafana/grafana.git synced 2024-11-29 04:04:00 -06:00

Author	SHA1	Message	Date
Fayzal Ghantiwala	543f0ae37e	Alerting: Update ListAlertRulesQuery to take a slice of RuleGroups (#88385 ) * Change ListAlertRulesQuery to take RuleGroup slice instead * Change func name * Change func name * Fix fakes * Fix function arg	2024-05-29 11:50:33 +01:00
Matthew Jacobson	8418aca823	Alerting: Add single rule checks to alert rule access control (#88307 ) * Alerting: Add single rule checks to alert rule access control Modifies ruler api single rule read to no longer fetch entire groups and instead use the new single rule ac check. Simplifies provisioning api getAlertRuleAuthorized logic to always load a single rule instead of conditionally loading the entire group when provisioning permissions are not present. * Swap out Has/AuthorizeAccessToRule for Has/AuthorizeAccessInFolder	2024-05-28 10:49:24 -04:00
Alexander Weaver	65793440d3	Alerting: Test infrastructure for recording rules (#88200 ) * Add test rule generator support for recording rules * Remove accidental add * Recording rules appear in GetRulesForScheduling * A couple more tests, updates, count * No need to capture rule defs	2024-05-23 16:27:07 -05:00
Alexander Weaver	49c8deb1ea	Alerting: Add recording rules to ruler API and validation (#87779 ) * Read path, main API * Define record field for incoming requests * Refactor several alerting specific validators into two paths * Refactor validateCondition actually contain all the condition validation logic * Move condition validation inside rule path * Validators for recording rules * Wire feature flag through to validators * Test for accepting a valid recording rule * Tests for negative case, no UID * Test for ignoring alerting fields * Build conditions based on recording rules as well * Regenerate swagger docs * Fix CRUD test to cover the right thing * Re-generate swagger docs with backdated v0.30.2 version * Regenerate base spec * Regenerate ngalert specs * Regenerate top level specs * Comment and rename * Return struct instead of modifying ref	2024-05-21 14:39:28 -05:00
Alexander Weaver	1badcf4b63	Alerting: Allow NoData and ExecErrState to be fully blank on recording rules (#87868 ) * Allow empty NoData and ExecErrState on recording rules * remove TODO about this	2024-05-15 09:35:54 -05:00
Alexander Weaver	b8a284fb81	Alerting: Fix xorm serialization of Record field struct, add tests for storing and reading (#87857 ) Fix sub struct ser and deser, add tests	2024-05-14 14:50:06 -05:00
Alexander Weaver	36ef611cf4	Alerting: Add database migration for recording rule fields (#87012 ) * Create recording rule fields in model * Add migration * Write to database, support in version table * extend fingerprint * Force fields to be empty on validate * Another storage spot, tests for fingerprint * Explicitly set defaults in provisioning API * Tests for main API validation * Add diff tests even though fields are unpopulated for now * Use struct tag approach instead of FromDB/ToDB hooks as it better handles nulls when deserializing * test for deser * Backout RecordTo for now since it's not decided in the doc * back out of migration too * Drop datasourceref for now * address linter complaints * Try a single outer struct with all fields embedded	2024-05-09 12:12:44 -05:00
Matthew Jacobson	babfa2beac	Alerting: Hook up GMA silence APIs to new authentication handler (#86625 ) This PR connects the new RBAC authentication service to existing alertmanager API silence endpoints.	2024-05-03 15:32:30 -04:00
Yuri Tseretyan	052082a927	Alerting: Refactor Alert Rule Generators (#86813 )	2024-04-29 21:52:15 -04:00
Yuri Tseretyan	dff7cb9afb	Alerting: Move alertmanager api silence code to separate files (#86947 ) * Move alertmanager api silence code to separate files unchanged * Replace with silence model instead interface --------- Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>	2024-04-25 15:20:37 -04:00
Yuri Tseretyan	9735a8a080	Alerting: Distinguish conflict violation errors (#86634 ) * update generator to set ID = 0 and do not set 0 if unique is needed * return proper message when the constraint violation	2024-04-22 12:28:46 -04:00
Matthew Jacobson	a20197229e	Alerting: Prevent simplified routing zero duration GroupInterval and RepeatInterval (#86561 ) Prevent zero duration GroupInterval and RepeatInterval	2024-04-18 21:08:38 -04:00
Matthew Jacobson	533bed6d94	Alerting: Fix simplified routes '...' groupBy creating invalid routes (#86006 ) * Alerting: Fix simplified routes '...' groupBy creating invalid routes There were a few ways to go about this fix: 1. Modifying our copy of upstream validation to allow this 2. Modify our notification settings validation to prevent this 3. Normalize group by on save 4. Normalized group by on generate Option 4. was chosen as the others have a mix of the following cons: - Generated routes risk being incompatible with upstream/remote AM - Awkward FE UX when using '...' - Rule definition changing after save and potential pitfalls with TF With option 4. generated routes stay compatible with external/remote AMs, FE doesn't need to change as we allow mixed '...' and custom label groupBys, and settings we save to db are the same ones requested. In addition, it has the slight benefit of allowing us to hide the internal implementation details of `alertname, grafana_folder` from the user in the future, since we don't need to send them with every FE or TF request. * Safer use of DefaultNotificationSettingsGroupBy * Fix missed API tests	2024-04-16 12:14:39 -04:00
Alexander Weaver	03114e7602	Alerting: Return better error for invalid time range on alert queries (#85611 ) * Return better error for invalid time range * drop comment	2024-04-05 09:20:21 -05:00
William Wernert	6d16cf2699	Alerting: Marshal incoming json.RawMessage in diff (#84692 ) This will ensure the encoding is correct when comparing to the existing rule.	2024-03-20 13:10:39 -04:00
Yuri Tseretyan	cfc3957894	Alerting: move store.ErrAlertRuleGroupNotFound to models package (#84308 ) move ErrAlertRuleGroupNotFound to models to avoid future circular dependencies	2024-03-12 15:38:21 -04:00
William Wernert	10dc6c6d75	Alerting: Add "Keep Last State" backend functionality (#83940 ) * Implement keep last state for state transitions * Respect For duration when keeping state * Only keep transition from recording an annotation * Add keep last state option for nodata/error in UI	2024-03-12 10:00:43 -04:00
Yuri Tseretyan	1eebd2a4de	Alerting: Support for simplified notification settings in rule API (#81011 ) * Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping * Support validation of notification settings when rules are updated * Implement route generator for Alertmanager configuration. That fetches all notification settings. * Update multi-tenant Alertmanager to run the generator before applying the configuration. * Add notification settings labels to state calculation * update the Multi-tenant Alertmanager to provide validation for notification settings * update GET API so only admins can see auto-gen	2024-02-15 09:45:10 -05:00
Alexander Weaver	ccb4533a86	Alerting: Remove unused AlertRuleVersionWithPauseStatus (#82383 ) Remove unused AlertRuleVersionWithPauseStatus	2024-02-13 10:56:24 -06:00
Matthew Jacobson	dd0ca1263b	Alerting: Include rule uid, title, namespace in unique constraint errors (#82011 ) * Alerting: Include rule_uid, title, namespace_uid in unique constraint errors	2024-02-07 12:55:48 -05:00
Yuri Tseretyan	47546a4c72	Alerting: Update API to use folders' full paths (#81214 ) * update GetUserVisibleNamespaces to use FolderService * update GetNamespaceByUID to use FolderService.GetFolders * update GetAlertRulesForScheduling to use FolderService.GetFolders * Update API and GetAlertRulesForScheduling to use the folder's full path * get full path of folder in RouteTestGrafanaRuleConfig * fix escaping of titles for MySQL	2024-02-06 17:12:13 -05:00
William Wernert	2ab7d3c725	Alerting: Receivers API (read only endpoints) (#81751 ) * Add single receiver method * Add receiver permissions * Add single/multi GET endpoints for receivers * Remove stable tag from time intervals See end of PR description here: https://github.com/grafana/grafana/pull/81672	2024-02-05 20:12:15 +02:00
William Wernert	7e939401dc	Alerting: Introduce initial common receiver service (#81211 ) * Create locking config store that mimics existing provisioning store * Rename existing receivers(_test).go * Introduce shared receiver group service * Fix test * Move query model to models package * ReceiverGroup -> Receiver * Remove locking config store * Move convert methods to compat.go * Cleanup	2024-02-01 14:42:59 -05:00
Yuri Tseretyan	131c72d655	Alerting: Fix scheduler to group folders by the unique key (orgID and UID) (#81303 )	2024-01-30 17:14:11 -05:00
Matthew Jacobson	71e70c424f	Alerting: During legacy migration reduce the number of created silences (#78505 ) * Alerting: During legacy migration reduce the number of created silences During legacy migration every migrated rule was given a label rule_uid=<uid>. This was used to silence DatasourceError/DatasourceNoData alerts for migrated rules that had either ExecutionErrorState/NoDataState set to keep_state, respectively. This could potentially create a large amount of silences and a high cardinality label. Both of these scenarios have poor outcomes for CPU load and latency in unified alerting. Instead, this change creates one label per ExecutionErrorState/NoDataState when they are set to keep_state as well as two silence rules, if rules with said labels were created during migration. These silence rules are: - __legacy_silence_error_keep_state__ = true - __legacy_silence_nodata_keep_state__ = true This will drastically reduce the number of created silence rules in most cases as well as not create the potentially high cardinality label `rule_uid`.	2024-01-24 15:56:19 -05:00
Alexander Weaver	00a260effa	Alerting: Add setting to distribute rule group evaluations over time (#80766 ) * Simple, per-base-interval jitter * Add log just for test purposes * Add strategy approach, allow choosing between group or rule * Add flag to jitter rules * Add second toggle for jittering within a group * Wire up toggles to strategy * Slightly improve comment ordering * Add tests for offset generation * Rename JitterStrategyFrom * Improve debug log message * Use grafana SDK labels rather than prometheus labels	2024-01-18 12:48:11 -06:00
Sofia Papagiannaki	d1dab5828d	Alerting: Update rule API to address folders by UID (#74600 ) * Change ruler API to expect the folder UID as namespace * Update example requests * Fix tests * Update swagger * Modify FIle field in /api/prometheus/grafana/api/v1/rules * Fix ruler export * Modify folder in responses to be formatted as <parent UID>/<title> * Add alerting test with nested folders * Apply suggestion from code review * Alerting: use folder UID instead of title in rule API (#77166) Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com> * Drop a few more latent uses of namespace_id * move getNamespaceKey to models package * switch GetAlertRulesForScheduling to use folder table * update GetAlertRulesForScheduling to return folder titles in format `parent_uid/title`. * fi tests * add tests for GetAlertRulesForScheduling when parent uid * fix integration tests after merge * fix test after merge * change format of the namespace to JSON array this is needed for forward compatibility, when we migrate to full paths * update EF code to decode nested folder --------- Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com> Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com> Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com> Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com> Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>	2024-01-17 11:07:39 +02:00
Yuri Tseretyan	f6a46744a6	Alerting: Support hysteresis command expression (#75189 ) Backend: * Update the Grafana Alerting engine to provide feedback to HysteresisCommand. The feedback information is stored in state.Manager as a fingerprint of each state. The fingerprint is persisted to the database. Only fingerprints that belong to Pending and Alerting states are considered as "loaded" and provided back to the command. - add ResultFingerprint to state.State. It's different from other fingerprints we store in the state because it is calculated from the result labels. - add rule_fingerprint column to alert_instance - update alerting evaluator to accept AlertingResultsReader via context, and update scheduler to provide it. - add AlertingResultsFromRuleState that implements the new interface in eval package - update getExprRequest to patch the hysteresis command. * Only one "Recovery Threshold" query is allowed to be used in the alert rule and it must be the Condition. Frontend: * Add hysteresis option to Threshold in UI. It's called "Recovery Threshold" * Add test for getUnloadEvaluatorTypeFromCondition * Hide hysteresis in panel expressions * Refactor isInvalid and add test for it * Remove unnecesary React.memo * Add tests for updateEvaluatorConditions --------- Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>	2024-01-04 11:47:13 -05:00
Matthew Jacobson	0424d44b39	Alerting: In migration, create one label per channel (#76527 ) * In migration, create one label per channel This PR changes how routing is done by the legacy alerting migration. Previously, we created a single label on each alert rule that contained an array of contact point names. Ex: __contact__="slack legacy testing","slack legacy testing2" This label was then routed against a series of regex-matching policies with continue=true. Ex: __contacts__ =~ ."slack legacy testing". In the case of many contact points, this array could quickly become difficult to manage and difficult to grok at-a-glance. This PR replaces the single __contact__ label with multiple __legacy_c_{contactname}__ labels and simple equality-matching policies. These channel-specific policies are nested in a single route under the top-level route which matches against __legacy_use_channels__ = true for ease of organization. This should improve the experience for users wanting to keep the default migrated routing strategy but who also want to modify which contact points an alert sends to.	2023-12-19 13:25:13 -05:00
Santiago	57e0d6bcb5	Chore: Simplify function signature for GetLatestAlertmanagerConfiguration (#79392 )	2023-12-12 13:49:54 +01:00
Matthew Jacobson	ce90a1f2be	Alerting: Apply query optimization to eval endpoints (#78566 ) * Alerting: Apply query optimization to eval endpoints Previously, query optimization was applied to alert queries when scheduled but not when ran through `api/v1/eval` or `/api/v1/rule/test/grafana`. This could lead to discrepancies between preview and scheduled alert results.	2023-11-28 19:44:28 -05:00
Matthew Jacobson	893839d27b	Alerting: Move general alert rule validation from db-layer to model (#78325 ) Alerting: Move general alert rule validation to model	2023-11-17 11:20:50 -05:00
Yuri Tseretyan	7cec741bae	Alerting: Extract alerting rules authorization logic to a service (#77006 ) * extract alerting authorization logic to separate package * convert authorization logic to service	2023-11-15 18:54:54 +02:00
Jo	580477bf8e	NGAlerting: Use identity.Requester interface instead of SignedInUser (#76360 ) * unfurl SignedInUserAttrs services * replace signedInUser with Requester replace signedInUser with requester * fix tests * linting --------- Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>	2023-11-14 14:47:34 +00:00
Yuri Tseretyan	85425b2194	Alerting: Fix flaky test TestExportRules (#77519 ) * fix test to correctly mock data store * Update pkg/services/ngalert/api/api_ruler_export_test.go Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com> * Update pkg/services/ngalert/api/api_ruler_export_test.go --------- Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>	2023-11-01 21:35:04 +02:00
Alexander Weaver	acee3efcf9	Alerting: Use common StateReason values for NoData/Error mapped states (#76781 ) Fix hardcoded state reasons	2023-10-18 17:26:41 -05:00
Yuri Tseretyan	2497db4bd6	Alerting: Add UID of rules to response that were affected by update group request (#75985 ) * update storage's method InsertRules to return ids of added rules as slice to keep the same order as rules in the argument * schematize response of update rule group endpoint, add created, updated, deleted fields that contain UID of affected rules. * update integration tests to use the new fields	2023-10-07 01:11:24 +03:00
Yuri Tseretyan	027bd9356f	Alerting: Rule Modify Export APIs (#75322 ) * extend RuleStore interface to get namespace by UID * add new export API endpoints * implement request handlers * update authorization and wire handlers to paths * add folder error matchers to errorToResponse * add tests for export methods	2023-10-02 11:47:59 -04:00
Yuri Tseretyan	237ce5ea82	Alerting: Extract methods for fetching rule groups with authorization (#75375 ) * extract methods for fetching rule groups with authorization and refactor the request handlers. * add logging to delete handler	2023-09-26 12:45:22 -04:00
Ryan McKinley	025b2f3011	Chore: use any rather than interface{} (#74066 )	2023-08-30 18:46:47 +03:00
Yuri Tseretyan	0053b07885	Alerting: Refactor of state manager tests (#72849 ) * calculate cacheID instead of literals * use mocked clocks * advance clocks with the eval results * use clearer timestamp aliases * make expected state labels be more clear to read Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>	2023-08-04 13:39:49 -04:00
Alexander Weaver	8c8b3ecb5b	Alerting: Add dashboardUID and panelID query parameters for loki state history (#72119 ) * read query parameters * Generate loki query from params	2023-07-24 23:46:46 -05:00
Alexander Weaver	f94fb765b5	Alerting: Add limit query parameter to Loki-based ASH api, drop default limit from 5000 to 1000, extend visible time range for new ASH UI (#70769 ) * Add limit query parameter * Drop copy paste comment * Extend history query limit to 30 days and 250 entries * Fix history log entries ordering * Update no history message, add empty history test --------- Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>	2023-06-28 13:32:28 -05:00
Yuri Tseretyan	842f33580e	SSE: Add functions that determine NodeType by UID and construct a data source struct from NodeType (#70106 ) * add NodeTypeFromDatasourceUID and DataSourceModelFromNodeType() * deprecate expr.DataSourceModel * replace usages of IsDataSource to NodeTypeFromDatasourceUID * replace usages of DataSourceModel to DataSourceModelFromNodeType()	2023-06-16 13:05:06 -04:00
Yuri Tseretyan	b57ef1f2c7	Alerting: Fix TestIntegration_GetAlertRulesForScheduling to make sure rules are created in different org (#69088 ) make sure rules are created in different org	2023-05-25 13:51:38 -04:00
George Robinson	19ebb079ba	Alerting: Add limits and filters to Prometheus Rules API (#66627 ) This commit adds support for limits and filters to the Prometheus Rules API. Limits: It adds a number of limits to the Grafana flavour of the Prometheus Rules API: - `limit` limits the maximum number of Rule Groups returned - `limit_rules` limits the maximum number of rules per Rule Group - `limit_alerts` limits the maximum number of alerts per rule It sorts Rule Groups and rules within Rule Groups such that data in the response is stable across requests. It also returns summaries (totals) for all Rule Groups, individual Rule Groups and rules. Filters: Alerts can be filtered by state with the `state` query string. An example of an HTTP request asking for just firing alerts might be `/api/prometheus/grafana/api/v1/rules?state=alerting`. A request can filter by two or more states by adding additional `state` query strings to the URL. For example `?state=alerting&state=normal`. Like the alert list panel, the `firing`, `pending` and `normal` state are first compared against the state of each alert rule. All other states are ignored. If the alert rule matches then its alert instances are filtered against states once more. Alerts can also be filtered by labels using the `matcher` query string. Like `state`, multiple matchers can be provided by adding additional `matcher` query strings to the URL. The match expression should be parsed using existing regular expression and sent to the API as URL-encoded JSON in the format: { "name": "test", "value": "value1", "isRegex": false, "isEqual": true } The `isRegex` and `isEqual` options work as follows: \| IsEqual \| IsRegex \| Operator \| \| ------- \| -------- \| -------- \| \| true \| false \| = \| \| true \| true \| =~ \| \| false \| true \| !~ \| \| false \| false \| != \|	2023-04-17 17:45:06 +01:00
gotjosh	2bbf0c9de4	Alerting: Allow Rules to Schedule to be filtered by Rule Group (#59990 ) * Alerting: Allow Rules to Schedule to be filtered by Rule Group	2023-04-13 12:55:42 +01:00
gotjosh	1c3ce0735f	Alerting: Tiny refactor on the eval and schedule packages (#66130 ) * Alerting: Tiny refactor on the eval and schedule packages two very small things: - We had a constructor on something called a `Context` which is not a `context.Context` so let's just name that constructor `NewContext` - The user that we use to run query evaluations is the same (with some variation) abstract it to a function so that it can be re-used when necessary. * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com> * Update pkg/services/ngalert/schedule/schedule.go Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com> --------- Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>	2023-04-06 16:02:28 +01:00
George Robinson	bd29071a0d	Revert "Alerting: Add limits to the Prometheus Rules API" (#65842 )	2023-04-03 15:20:37 +00:00
George Robinson	d96b0a71d3	Alerting: Add limits to the Prometheus Rules API (#65169 ) This commit adds a number of limits to the Grafana flavor of the Prometheus Rules API: 1. `limit` limits the maximum number of Rule Groups returned 2. `limit_rules` limits the maximum number of rules per Rule Group 3. `limit_alerts` limits the maximum number of alerts per rule It sorts Rule Groups and rules within Rule Groups such that data in the response is stable across requests. It also returns summaries (totals) for all Rule Groups, individual Rule Groups and rules.	2023-04-03 10:17:02 +01:00


185 Commits
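
The message of commit 19ebb079ba (#66627) describes `state` and `matcher` query strings for `/api/prometheus/grafana/api/v1/rules`, with each matcher sent as URL-encoded JSON and the operator expressed through the `isEqual` and `isRegex` flags. The following is a minimal client-side sketch of that encoding, based only on what the commit message states. The names `OPERATORS`, `matcher_flags`, and `build_rules_url` are hypothetical helpers invented for illustration and are not part of Grafana.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Operator table as given in the commit message: (isEqual, isRegex) -> operator.
OPERATORS = {
    (True, False): "=",
    (True, True): "=~",
    (False, True): "!~",
    (False, False): "!=",
}

# Path quoted in the commit message.
RULES_PATH = "/api/prometheus/grafana/api/v1/rules"


def matcher_flags(operator):
    """Return the (isEqual, isRegex) pair for a label-matcher operator."""
    for flags, op in OPERATORS.items():
        if op == operator:
            return flags
    raise ValueError(f"unsupported operator: {operator!r}")


def build_rules_url(states=(), matchers=(), limit=None):
    """Build a rules query URL (hypothetical helper, not a Grafana API).

    states   -- iterable of state names, one `state` parameter each
    matchers -- iterable of (name, operator, value) tuples
    limit    -- optional maximum number of Rule Groups
    """
    params = [("state", state) for state in states]
    for name, operator, value in matchers:
        is_equal, is_regex = matcher_flags(operator)
        payload = {
            "name": name,
            "value": value,
            "isRegex": is_regex,
            "isEqual": is_equal,
        }
        # urlencode below percent-encodes the JSON text.
        params.append(("matcher", json.dumps(payload)))
    if limit is not None:
        params.append(("limit", str(limit)))
    return RULES_PATH + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    url = build_rules_url(
        states=["alerting", "normal"],
        matchers=[("test", "=", "value1")],
    )
    print(url)
    # Decode again to show the round trip.
    print(parse_qs(urlsplit(url).query))
```

Repeating the `state` or `matcher` parameter, rather than packing several values into one, follows the commit message's description of adding additional query strings to the URL.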