mirror of
https://github.com/nosqlbench/nosqlbench.git
synced 2024-11-21 16:27:51 -06:00
157 lines
9.3 KiB
Markdown
157 lines
9.3 KiB
Markdown
|
# NoSQLBench 5.21
|
||
|
|
||
|
__release notes preview / work-in-progress__
|
||
|
|
||
|
The 5.21 series of NoSQLBench marks a significant departure from the earlier versions. The platform
|
||
|
is rebased on top of [Java 21 LTS](https://openjdk.org/projects/jdk/21/). The core architecture
|
||
|
has been adapted to suit more advanced workflows, particularly around dimensional metrics, test
|
||
|
parameterization and labeling, and automated analysis. Support for the hierarchic naming methods of
|
||
|
graphite have been removed and the core metrics logic has been rebuilt around dimensional metric
|
||
|
labeling.
|
||
|
|
||
|
## Java 21 LTS
|
||
|
|
||
|
The release of [Java 21](https://openjdk.org/projects/jdk/21/) is significant to the NoSQLBench
|
||
|
project for several reasons.
|
||
|
|
||
|
### Java 21 Performance & Threading Model
|
||
|
|
||
|
For systems like NoSQLBench, the runtime threading model in Java 21 is much improved. Virtual
|
||
|
threads offer a distinctly better solution for the kinds of workloads where you need to emulate
|
||
|
request-per-thread behavior efficiently. While virtual threads are not advised as a general
|
||
|
replacement in every case, they are particularly suited to the agnostic APIs within NoSQLBench
|
||
|
which wrap a myriad of different system driver types. In NB 5.21, virtual threads will be
|
||
|
enabled further as corner cases which cause pinning and other side-effects are removed.
|
||
|
|
||
|
The performance improvements are deeper than just the threading model by itself. The
|
||
|
built-in concurrent libraries which have evolved to work along-side virtual threads offer some of
|
||
|
the best opportunities for streamlining and simplifying concurrent code. A key example of this
|
||
|
is the rate limiter implementation in 5.21 which simply does not have the previous limitations
|
||
|
of the 5.17 implementation. It is based directly on java.util.concurrent.Semaphore, which
|
||
|
provides a character of scaling over cores and configurations which is surprisingly good.
|
||
|
|
||
|
## Component Tree
|
||
|
|
||
|
Contrary to the metrics system which is moving from a hierarchic model to a dimensional model,
|
||
|
the core runtime structure of NoSQLBench is moving from a flat model to a hierarchic model. This
|
||
|
may seem counter-intuitive at first, but these two structural systems work together to provide a
|
||
|
more direct and fool-proof way of identifying test data, metrics, lifecycles, configuration, etc.
|
||
|
|
||
|
This approach is called the "Component Tree" in NoSQLBench. It simply reflects that each phase,
|
||
|
each parameter, each measurement that is in a NoSQLBench test design has a specific beginning
|
||
|
and end point which is well-defined, _within the scope of its parent_, and that all these aspects
|
||
|
live together on the component they pertain to.
|
||
|
|
||
|
Here are some of the basic features of the component tree:
|
||
|
|
||
|
* Each component has a parent except for the root component, which has no parent.
|
||
|
* Each component registers with its parent upon creation, and is scoped to its parent's
|
||
|
lifecycle. i.e., when a parent component goes out of scope, it takes its attached
|
||
|
sub-components with it.
|
||
|
* All functions and side effects that a component may provide happen naturally within that
|
||
|
component's lifecycle, whether that is upon attachment, detachment, or in-between. No
|
||
|
component is considered valid outside of these boundaries.
|
||
|
* Each component may provide a set of component-specific labels and label values at time of
|
||
|
construction which _uniquely_ describe its context within then parent component. Overriding a
|
||
|
label which is already set is not allowed, nor is providing a label set which is already known
|
||
|
within a parent component. Each component has a labels property which is the logical sum of all
|
||
|
the labels on it and all parents. This provides unique labels at every level which are compatible
|
||
|
with dimensional metrics, annotation, and logging systems.
|
||
|
* Basic services, like metrics registration are provided within the component API
|
||
|
orthogonally and attached directly to components. Thus, the view of all metrics within the
|
||
|
runtime is simply the sum of all metrics registered on all components.
|
||
|
|
||
|
Here's a sketch of a typical NoSQLBench 5.21 session:
|
||
|
|
||
|
```
|
||
|
[CLI]
|
||
|
\
|
||
|
Session {session="s20231123_123456.123"}
|
||
|
┗━ Scenario {scenario="default"}
|
||
|
┣━ Activity {activity="schema"}
|
||
|
┃ ┗━ metric timer {name="cycles"}
|
||
|
┣━ Activity {activity="rampup"]
|
||
|
┃ ┗━ metric timer {name="cycles"}
|
||
|
┗━ Activity {activity="testann",k="100",dimensions="1000"}
|
||
|
┗━ metric timer {name="cycles"}
|
||
|
```
|
||
|
|
||
|
This shows the tree structure of the runtime and the implied lifecycle bounds of each type:
|
||
|
|
||
|
* The Command Line Interface is not a component, but it is used to configure global session
|
||
|
settings and launch a session.
|
||
|
* The Session is the root component. It has a single label under the name `session`. It has
|
||
|
three attached activities with distinct labels, each with an attached metric.
|
||
|
* `activity=schema`
|
||
|
* `activity=rampup`
|
||
|
* `activity=testann`
|
||
|
|
||
|
This contrived example demonstrates very simply the mechanisms of the component tree at work.
|
||
|
Each metric has a set of labels which uniquely identify it:
|
||
|
|
||
|
* timer with labels `{session="s20231123_123456.123",scenario="default",activity="schema",
|
||
|
name="cycles"}`
|
||
|
* timer with labels `{session="s20231123_123456.123",scenario="default",activity="rampup",
|
||
|
name="cycles"}`
|
||
|
* timer with
|
||
|
labels `{session="s20231123_123456.123",scenario="default",activity="testann",k="100",dimensions="1000",name="cycles"}`
|
||
|
|
||
|
## Dimensional Metrics
|
||
|
|
||
|
Backstory and motivation for this change is captured in [^1].
|
||
|
|
||
|
Beginning in NoSQLBench 5.21, the primary metrics transport will be client-push using the
|
||
|
[openmetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md)
|
||
|
exposition format. As well, the Victoria Metrics [community edition](https://victoriametrics.com/products/open-source/)
|
||
|
is open source and provides all the necessary telemetry features needed, it is the preferred
|
||
|
collector, database and query engine which the NoSQLBench project will integrate with by default.
|
||
|
That doesn't mean that others will be or are not supported, but it does mean that they will not get
|
||
|
prioritized for implementation unless there is a specific user need which doesn't compromise the
|
||
|
basic integrity of the dimensional metrics system.
|
||
|
|
||
|
Further, the reliance on the original metrics library has become more problematic over time. The
|
||
|
next version, which promised support for dimensional labels in metrics is officially
|
||
|
["on pause"](https://github.com/dropwizard/metrics#metrics). As such, the NB project will seek
|
||
|
to pivot off this library to something more current and supported going forward, as options permit.
|
||
|
|
||
|
### Native Analysis Methods
|
||
|
|
||
|
The scenario scripting layer in NoSQLBench hasn't gone away, but it will be considered secondary
|
||
|
to the Java-native way of writing scenario logic, especially for more sophisticated scenarios.
|
||
|
Tools like findmax, stepup, and optimo will become more prevalent as the primary way that users
|
||
|
leverage NoSQLBench. These advanced analysis methods were mostly functional in previous versions,
|
||
|
but they were nigh un-maintainable in their un-debuggable script form. This meant that they
|
||
|
couldn't be reliably leveraged across testing efforts to remove subjective and interpretive
|
||
|
human logic from advanced testing scenario. The new capability emulates the scenario fixtures of
|
||
|
before, but with a native context for all the APIs, wherein all component services can be
|
||
|
accessed directly.
|
||
|
|
||
|
### Parameterization
|
||
|
|
||
|
The changes described above hint at a capability that is nascent in the NB project: testing
|
||
|
within parameter spaces. In order to support the kinds of automated and advanced testing needed
|
||
|
for today's systems, this is a must-have. Specifically, we need the ability to describe a set of
|
||
|
parameters (what some may describe as _hyper-parameters_), and to have the testing system apply
|
||
|
an optimization or search algorithm to determine a local or global maxima. These parameters and
|
||
|
their results must be visible in a tangible form for technologists and diagnosticians to make
|
||
|
sense of them. This is why they are surfaced in NB 5.21 as labeled measurements, episodic and
|
||
|
real-time, over (labeled) parameter spaces. There will be more to come on this as we prove out
|
||
|
the analysis methods.
|
||
|
|
||
|
### Footnotes
|
||
|
|
||
|
[^1]: The original metrics library used with NoSQLBench was the --Coda Hale-- --Yammer--
|
||
|
DropWizard metrics library which adopted the hierarchic naming scheme popular with systems like
|
||
|
graphite. While useful at the time, telemetry systems moved on to dimensional metrics with the
|
||
|
adoption of Prometheus. The combination of graphite naming structure and data flow and
|
||
|
Prometheus was tenuous in practice. For a time, NoSQLBench wedged data from the hierarchic naming
|
||
|
schemed into dimensional form for Prometheus by using the graphite exporter and pattern matching
|
||
|
based name and label extraction. This was incredibly fragile and prevented workload modeling and
|
||
|
metrics capture around test parameters and other important details. Further, the _prometheus way_ of
|
||
|
gathering metrics imposed an onerous requirement on users that the metrics system was actively in
|
||
|
control of all data flows. (Yes you could use the external gateway, but that was yet another moving
|
||
|
part.) This further degraded the quality of metrics data by taking the time and cadence putting the
|
||
|
timing and cadence of metrics flows out of control of the client. It also put metrics flow behind
|
||
|
two polling mechanisms which degraded the immediacy of the metrics.
|
||
|
|