Commit Graph

20 Commits

Author SHA1 Message Date
Alexander Weaver
a8fb01a502
Swap weaveworks/common utilities for equivalents in grafana/dskit (#80051)
* Replace histogram collector and grpc injectors

* Extract request timing utility

* Also vendor test file

* Suppress erroneous linter warn
2024-01-05 10:08:38 -06:00
William Wernert
f7bf818527
Alerting: Make alert state history Loki http client public (#78291)
* Make state history Loki client public

* Make historian metrics subsystem configurable
2023-11-27 09:20:50 -05:00
Alexander Weaver
f94fb765b5
Alerting: Add limit query parameter to Loki-based ASH api, drop default limit from 5000 to 1000, extend visible time range for new ASH UI (#70769)
* Add limit query parameter

* Drop copy paste comment

* Extend history query limit to 30 days and 250 entries

* Fix history log entries ordering

* Update no history message, add empty history test

---------

Co-authored-by: Konrad Lalik <konrad.lalik@grafana.com>
2023-06-28 13:32:28 -05:00
Alexander Weaver
a384194e15
Alerting: Use default page size of 5000 when querying Loki for state history (#66315)
Always specify limit of 5000
2023-04-18 14:31:29 -05:00
Alexander Weaver
fb520edd72
Alerting: Use a completely isolated context for state history writes (#64989)
* Add fresh context with timeout and same log properties, re-derive logger

* Unify timeout constants

* Move ctx after shortcut that got added through rebasing

* Unify timeouts

* Port opentracing's SpanFromContext and ContextFromSpan to the grafana tracing package

* Support both opentracing and otel variants

* Better document why we're creating a new ctx

* Add new func to FakeSpan which was added after rebase

* Support grafana-specific traceID key in both tracer implementations
2023-04-04 16:41:46 -05:00
Alexander Weaver
a416100abc
Alerting: No longer index state history log streams by instance labels (#65474)
* Remove private labels

* No longer index by instance labels

* Labels are now invariant, only build them once

* Remove bucketing since everything is in a single stream

* Refactor statesToStreams to only return a single unified log stream

* Don't query on labels that no longer exist

* Move selector logic to loki layer, genericize client to work in terms of straight logQL

* Add support for line-level label filters in query

* Combine existing selector tests for better parallelism

* Tests for logQL construction

* Underscore instead of dot for unwrapping labels in logql
2023-03-29 11:52:11 -05:00
Alexander Weaver
07368dec74
Alerting: Fix attachment of external labels to Loki state history log streams (#65140)
Fix attachment of external labels, add tests
2023-03-21 18:00:59 -05:00
Alexander Weaver
bf54f2672e
Alerting: Switch to snappy-compressed-protobuf for outgoing push requests to Loki (#65077)
* Encode with snappy, always

* JSON encoder type

* Headers

* Copy labels formatter from promtail

* Implement snappy-proto encoding

* Create encoder interface, test both encoders, choose snappy-proto by default

* Make encoder configurable at the LokiCfg level

* Export both encoders

* Touch up comment and tests

* Drop unnecessary conversions after move to plain strings to appease linter
2023-03-21 13:38:42 -05:00
Alexander Weaver
a31672fa40
Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once (#64774)
* Rename RecordStatesAsync to Record

* Rename QueryStates to Query

* Implement fanout writes

* Implement primary queries

* Simplify error joining

* Add test for query path

* Add tests for writes and error propagation

* Allow fanout backend to be configured

* Touch up log messages and config validation

* Consistent documentation for all backend structs

* Parse and normalize backend names more consistently against an enum

* Touch-ups to documentation

* Improve clarity around multi-record blocking

* Keep primary and secondaries more distinct

* Rename fanout backend to multiple backend

* Simplify config keys for multi backend mode
2023-03-17 12:41:18 -05:00
Alexander Weaver
19d01dff91
Alerting: Expose Prometheus metrics for persisting state history (#63157)
* Create historian metrics and dependency inject

* Record counter for total number of state transitions logged

* Track write failures

* Track current number of active write goroutines

* Record histogram of how long it takes to write history data

* Don't copy the registerer

* Adjust naming of write failures metric

* Introduce WritesTotal to complement WritesFailedTotal

* Measure TransitionsFailedTotal to complement TransitionsTotal

* Rename all to state_history

* Remove redundant Total suffix

* Increment totals all the time, not just on success

* Drop ActiveWriteGoroutines

* Drop PersistDuration in favor of WriteDuration

* Drop unused gauge

* Make writes and writesFailed per org

* Add metric indicating backend and a spot for future metadata

* Drop _batch_ from names and update help

* Add metric for bytes written

* Better pairing of total + failure metric updates

* Few tweaks to wording and naming

* Record info metric during composition

* Create fakeRequester and simple happy path test using it

* Blocking test for the full historian and test for happy path metrics

* Add tests for failure case metrics

* Smoke test for full annotation persistence

* Create test for metrics on annotation persistence, both happy and failing paths

* Address linter complaints

* More linter complaints

* Remove unnecessary whitespace

* Consistency improvements to help texts

* Update tests to match new descs
2023-03-06 10:40:37 -06:00
Alexander Weaver
e77621649d
Alerting: Instrument outgoing state history requests using weaveworks/common (#63600)
* Loki backend and client depend on a requester

* Instrument all requests to loki using weaveworks TimedClient

* Construct collector in metrics package
2023-02-23 17:52:02 -06:00
Alexander Weaver
958fb2c50a
Alerting: Unify structs in Loki client and make them more consistent with Prometheus (#63055)
* Use existing row struct instead of [2]string, add deserialization helper

* Replace Stream struct with stream struct which is exactly the same

* Drop unused status field

* Don't export queryRes and queryData

* Tests for custom marshalling

* Rename row fields to T and V for consistency with prometheus samples

* Rename row to sample
2023-02-11 05:17:44 -06:00
Alexander Weaver
f80bf11782
Alerting: Make time range query parameters not required when querying Loki (#62985)
* Make from and to not required

* Move default range calculation up to loki.go
2023-02-07 14:26:43 -06:00
Jean-Philippe Quéméner
1694a35e0c
Alerting: implement loki query for alert state history (#61992)
* Alerting: implement loki query for alert state history

* extract selector building

* add unit tests for selector creation

* backup

* give selectors their own type

* build dataframe

* add some tests

* small changes after manual testing

* use struct client

* golint

* more golint

* Make RuleUID optional for Loki implementation

* Drop initial assumption that we only have one series

* Pare down to three columns, fix timestamp overflows, improve failure cases in loki responses

* Embed structred log lines in the dataframe as objects rather than json strings

* Include state history label filter

* Remove dead code

---------

Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
2023-02-02 16:31:51 -06:00
Alexander Weaver
e7ace4ed62
Alerting: Allow separate read and write path URLs for Loki state history (#62268)
Extract config parsing and add tests
2023-01-30 16:30:05 -06:00
Alexander Weaver
b4682fe3cb
Alerting: Configurable externalLabels for Loki state history (#62404)
* Add config option for external labels

* Remove redundant nilcheck
2023-01-30 14:24:45 -06:00
George Robinson
a7eab8e46e
Alerting: Support context.Context in Loki interface (#61979)
This commit adds support for canceleable contexts in the Loki
interface.
2023-01-26 09:31:20 +00:00
Alexander Weaver
7ccc845187
Alerting: Push state history entries to Loki (#61724)
* Implement push endpoint

* Drop duplicated struct

* Genericize auth/tenant headers and improve logging in error case

* Flesh out the data model

* Drop dead code

* Drop log line entirely

* Drop unused arg

* Rename a few type manipulation functions

* Extract label keys as constants

* Improve logs when loki responds with error

* Inline lokiRepresentation function
2023-01-23 16:31:03 -06:00
Jean-Philippe Quéméner
44b11d3228
Alerting: support basic auth for the state history loki client (#61696) 2023-01-18 20:24:40 +01:00
Alexander Weaver
1ac89ea040
Alerting: Add client configuration for remote Loki historian backend and test connection (#61114)
* Create loki client type and ping method

* Expose TestConnection on client

* Configure and ping Loki URL

* Close response body reader if present

* Add 30 second timeout

* Remove duplicate close
2023-01-17 13:58:52 -06:00