# YAML Organization

It is best to keep every workload self-contained within a single YAML file, including schema, data rampup, and the main phase of testing. The phases of testing are controlled by tags as described below.

**NOTE:**
The phase names described below have been adopted as a convention within the built-in workloads. It is strongly advised that new workload YAMLs use the same tagging scheme so that workloads are more pluggable across YAMLs.

## Schema phase

The schema phase is simply a phase of your test which creates the necessary schema on your target system. For CQL, this generally consists of a keyspace and one or more table statements. There is no special schema layer in nosqlbench. All statements executed are simply statements. This provides the greatest flexibility in testing since every activity type is allowed to control its DDL and DML using the same machinery.

The schema phase is normally executed with defaults for most parameters. This means that statements will execute in the order specified in the YAML, in serialized form, exactly once. This is a welcome side-effect of how the initial parameters like _cycles_ are set from the statements which are activated by tagging.

You can mark statements as schema phase statements by adding this set of tags to the statements, either directly, or by block:

    tags:
      phase: schema

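As an illustration, a block tagged this way could look like the following sketch. The block layout follows the block/statement structure described in this guide, but the keyspace, table, and statement names are hypothetical rather than taken from a specific built-in workload.

```yaml
blocks:
  - name: schema
    tags:
      phase: schema
    statements:
      - create-keyspace: |
          create keyspace if not exists baselines
          WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
      - create-table: |
          create table if not exists baselines.keyvalue (
            key bigint,
            value bigint,
            PRIMARY KEY (key)
          );
```
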
## Rampup phase

When you run a performance test, it is very important to be aware of how much data is present. Higher density tests are more realistic for systems which accumulate data over time, or which have a large working set of data. Ideally, the amount of data on the system you are testing should recreate a realistic amount of data that you would have in production. In general, there is a triangular trade-off between service time, op rate, and data density.

It is the purpose of the _rampup_ phase to create the backdrop data on a target system that makes a test meaningful for some level of data density. Data density is normally discussed as average per node, but it is also important to consider the distribution of data as it varies from the least dense to the most dense nodes.

Because it is useful to be able to add data to a target cluster in an incremental way, the bindings which are used with a _rampup_ phase may actually be different from the ones used for a _main_ phase. In most cases, you want the rampup phase to create data in a way that incrementally adds to the population of data in the cluster. This allows you to add some data to a cluster with `cycles=0..1M` and then decide whether to continue adding data using the next contiguous range of cycles, with `cycles=1M..2M` and so on.

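For instance, two successive rampup runs could cover contiguous cycle ranges like this; the driver, workload, and host values are placeholders for whatever your workload actually uses:

```text
./nb run driver=cql workload=my-workload tags=phase:rampup host=<host-or-ip> cycles=0..1M
./nb run driver=cql workload=my-workload tags=phase:rampup host=<host-or-ip> cycles=1M..2M
```
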
You can mark statements as rampup phase statements by adding this set of tags to the statements, either directly, or by block:

    tags:
      phase: rampup

## Main phase

The main phase of a nosqlbench scenario is the one during which you really care about the metrics. This is the actual test that everything else has prepared your system for.

You can mark statements as main phase statements by adding this set of tags to the statements, either directly, or by block:

    tags:
      phase: main

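As a sketch of what a tagged main block could contain, the example below mixes a read and a write statement with equal ratios; the binding, statement bodies, and table name are hypothetical placeholders:

```yaml
blocks:
  - name: main
    tags:
      phase: main
    bindings:
      key: Identity()
    statements:
      - main-read: select value from baselines.keyvalue where key={key};
        ratio: 5
      - main-write: insert into baselines.keyvalue (key, value) values ({key},{key});
        ratio: 5
```
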
# Multi-Docs

The YAML spec allows for multiple yaml documents to be concatenated in the same file with a separator:

```yaml
---
```

This offers an additional convenience when configuring activities. If you want to parameterize or tag a set of statements with their own bindings, params, or tags, but alongside another set of uniquely configured statements, you need only put them in separate logical documents, separated by a triple-dash.

For example:

```text
doc2.number eight
doc1.form1 doc1.1
```

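A sketch of how such a two-document workload file could be laid out is shown below; the bindings, statement bodies, and tags here are hypothetical placeholders:

```yaml
bindings:
  doc1.number: NumberNameToString()
statements:
  - doc1.form1: "doc1.1 {doc1.number}\n"
tags:
  docgroup: one
---
bindings:
  doc2.number: NumberNameToString()
statements:
  - doc2.form1: "doc2.1 {doc2.number}\n"
tags:
  docgroup: two
```
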
This shows that you can use the power of blocks and tags together at one level and also allow statements to be broken apart into a whole other level of partitioning if desired.

**WARNING:**
The multi-doc support is there as a ripcord when you need it. However, it is strongly advised that you keep your YAML workloads simple to start and only use features like the multi-doc when you absolutely need it. For this, blocks are generally a better choice. See examples in the standard workloads.

Docs, Blocks, and Statements can all have names:

```yaml
name: doc1
blocks:
  - name: block1
    statements:
      - stmt1: statement1
      - name: st2
        stmt: statement2
---
name: doc2
...
```

This provides a layered naming scheme for the statements themselves. It is not usually important to name things except for documentation or metric naming purposes.

If no names are provided, then names are automatically created for blocks and statements. Statements assigned at the document level are assigned to "block0". All other statements are named with the format `doc#--block#--stmt#`.

For example, the full name of statement1 above would be `doc1--block1--stmt1`.

**NOTE:**
If you anticipate wanting to get metrics for a specific statement in addition to the other metrics, then you will want to adopt the habit of naming all your statements something basic and descriptive.

# Named Scenarios

There is one final element of a yaml that you need to know about: _named scenarios_.

**Named Scenarios allow anybody to run your testing workflows with a single command.**

You can provide named scenarios for a workload like this:

```yaml
# contents of myworkloads.yaml
scenarios:
  default:
    - run driver=diag cycles=10 alias=first-ten
    - run driver=diag cycles=10..20 alias=second-ten
  longrun:
    - run driver=diag cycles=10M
```

This provides a way to specify more detailed workflows that users may want to run without them having to build up a command line for themselves.

A couple of other forms are supported in the YAML, for terseness:

```yaml
scenarios:
  oneliner: run driver=diag cycles=10
  mapform:
    part1: run driver=diag cycles=10 alias=part1
    part2: run driver=diag cycles=20 alias=part2
```

These forms simply provide finesse for common editing habits, but they are automatically read internally as a list. In the map form, the names are discarded, but they may be descriptive enough for use as inline docs for some users. The order is retained as listed, since the names have no bearing on the order.

## Scenario selection

When a named scenario is run, it is *always* named, so that it can be looked up in the list of named scenarios under your `scenarios:` property. The only exception to this is when an explicit scenario name is not found on the command line, in which case it is automatically assumed to be _default_.

Some examples may be more illustrative:

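As an illustrative sketch, assuming the `myworkloads.yaml` file shown above:

```text
# runs the scenario named 'default' from myworkloads.yaml
nb myworkloads

# runs the scenario named 'longrun' from myworkloads.yaml
nb myworkloads longrun

# runs 'longrun', but overrides one of its parameters
nb myworkloads longrun cycles=20M
```
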
## Workload selection

The examples above contain no reference to a workload (formerly called _yaml_). They don't need to, as they refer to themselves implicitly. You may add a `workload=` parameter to the command templates if you like, but this is never needed for basic use, and it is error prone to keep the filename matched to the command template. Just leave it out by default.

_However_, if you are doing advanced scripting across multiple systems, you can actually provide a `workload=` parameter specifically to use another workload description in your test.

**NOTE:**
This is a powerful feature for workload automation and organization. However, it can get unwieldy quickly. Caution is advised for deep-linking too many scenarios in a workspace, as there is no mechanism for keeping them in sync when small changes are made.

## Named Scenario Discovery

For named scenarios, there is a way for users to find all the named scenarios that are currently bundled or in view of their current directory. A couple of simple rules must be followed by scenario publishers in order to keep things simple:

1. Workload files in the current directory `*.yaml` are considered.
2. Workload files under the relative path `activities/` with name `*.yaml` are considered.
3. The same rules are used when looking in the bundled nosqlbench, so built-ins come along for the ride.
4. Any workload file that contains a `scenarios:` tag is included, but all others are ignored.

This doesn't mean that you can't use named scenarios for workloads in other locations. It simply means that when users use the `--list-scenarios` option, these are the only ones they will see listed.

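To see which named scenarios are visible from your current directory, you can use the option mentioned above (shown here as a plain invocation; your binary name may differ):

```text
nb --list-scenarios
```
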
## Parameter Overrides

You can override parameters that are provided by named scenarios. Any parameter that you specify on the command line after your workload and optional scenario name will be used to override or augment the commands that are provided for the named scenario.

This is powerful, but it also means that you can sometimes munge user-provided activity parameters on the command line with the named scenario commands in ways that may not make sense. To solve this, the parameters in the named scenario commands may be locked. You can lock them silently, or you can provide a verbose locking that will cause an error if the user even tries to adjust them.

Silent locking is provided with a form like `param==value`. Any silently locked parameters will reject overrides from the command line, but will not interrupt the user.

Verbose locking is provided with a form like `param===value`. Any time a user provides a parameter on the command line for the named parameter, an error is thrown and they are informed that this is not possible. This level is provided for cases in which you would not want the user to be unaware of an unset parameter which is germane and specific to the named scenario.

All other parameters provided by the user will take the place of the same-named parameters provided in *each* command template, in the order they appear in the template. Any other parameters provided by the user will be added to *each* of the command templates in the order they appear on the command line.

This is a little counter-intuitive at first, but once you see some examples it should make sense.

## Parameter Override Examples

Consider a simple workload with three named scenarios:

```yaml
# basics.yaml
scenarios:
  s1: run driver=stdout cycles=10
  s2: run driver=stdout cycles==10
  s3: run driver=stdout cycles===10

bindings:
  c: Identity()

statements:
  - A: "cycle={c}\n"
```

Running this with no options prompts the user to select one of the named scenarios:

```text
$ nb basics
```

### Basic Override example

If you run the first scenario `s1` with your own value for `cycles=7`, it does as you ask:

```text
$ nb basics s1 cycles=7
cycle=0
cycle=1
cycle=2
cycle=3
cycle=4
cycle=5
cycle=6
$
```

### Silent Locking example

If you run the second scenario `s2` with your own value for `cycles=7`, then it does what the locked parameter `cycles==10` requires, without telling you that it is ignoring the specified value on your command line.

```text
$ nb basics s2 cycles=7
cycle=0
cycle=1
cycle=2
cycle=3
cycle=4
cycle=5
cycle=6
cycle=7
cycle=8
cycle=9
$
```

Sometimes, this is appropriate, such as when specifying settings like `threads==` for schema phases.

### Verbose Locking example

If you run the third scenario `s3` with your own value for `cycles=7`, then you will get an error telling you that this is not possible. Sometimes you want to make sure that the user knows a parameter should not be changed, and that if they want to change it, they'll have to make their own custom version of the scenario in question.

```text
$ nb basics s3 cycles=7
ERROR: Unable to reassign value for locked param 'cycles===7'
$
```

Ultimately, it is up to the scenario designer when to lock parameters for users. The built-in workloads offer some examples on how to set these parameters so that the right values are locked in place without bothering the user, but some values are made very clear in how they should be set. Please look at these examples for inspiration when you need them.

## Forcing Undefined (default) Parameters

If you want to ensure that any parameter in a named scenario template remains unset in the generated scenario script, you can assign it a value of UNDEF. The locking behaviors described above apply to this one as well. Thus, for schema commands which rely on the default sequence length (which is based on the number of active statements), you can set cycles==UNDEF to ensure that when a user passes a cycles parameter the schema phase doesn't break with too many cycles.

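As a sketch under these rules, a schema step in a named scenario could pin cycles back to its default while the rampup step leaves cycles overridable; the scenario layout here is hypothetical:

```yaml
scenarios:
  default:
    - run driver=cql tags=phase:schema threads==1 cycles==UNDEF
    - run driver=cql tags=phase:rampup cycles=10M
```
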
## Automatic Parameters

Some parameters are already known due to the fact that you are using named scenarios.

### workload

The `workload` parameter is, by default, set to the logical path (fully qualified workload name) of the yaml file containing the named scenario. However, if the command template contains this parameter, it may be overridden by users as any other parameter depending on the assignment operators as explained above.

### alias

The `alias` parameter is, by default, set to the expanded name of WORKLOAD_SCENARIO_STEP, which means that each activity within the scenario has a distinct and symbolic name. This is important for distinguishing metrics from one another across workloads, named scenarios, and steps within a named scenario. The above words are interpolated into the alias as follows:

- WORKLOAD - The simple name part of the fully qualified workload name. For example, with a workload (yaml path) of foo/bar/baz.yaml, the WORKLOAD name used here would be `baz`.

- SCENARIO - The name of the scenario as provided on the command line.

- STEP - The name of the step in the named scenario. If you used the list or string forms to provide a command template, then the steps are automatically named as a zero-padded number representing the step in the named scenario, starting from `000`, per named scenario. (The numbers are not globally assigned)

Because it is important to have uniquely named activities for the sake of sane metrics and logging, any alias provided when using named scenarios which does not include the three tokens above will cause a warning to be issued to the user explaining why this is a bad idea.

**NOTE:**
UNDEF is handled before alias expansion above, so it is possible to force the default activity naming behavior above with `alias===UNDEF`. This is generally recommended, and will inform users if they try to set the alias in an unsafe way.

# Example Commands

Let's run a simple test against a cluster to establish some basic familiarity with NoSQLBench.

## Create a Schema

We will start by creating a simple schema in the database. From your command line, go ahead and execute the following command, replacing the `host=<host-or-ip>` with that of one of your database nodes.

```text
./nb run driver=cql workload=cql-keyvalue tags=phase:schema host=<host-or-ip>
```

Let's break down each of those command line options.

`run` tells nosqlbench to run an activity.

`driver=...` is used to specify the activity type (driver). In this case we are using `cql`, which tells nosqlbench to use the DataStax Java Driver and execute CQL statements against a database.

`workload=...` is used to specify the workload definition file that defines the activity.

In this example, we use `cql-keyvalue` which is a pre-built workload that is packaged with nosqlbench.

`tags=phase:schema` tells nosqlbench to run the yaml block that has the `phase:schema` defined as one of its tags.

In this example, that is the DDL portion of the `cql-keyvalue` workload. `host=...` tells nosqlbench how to connect to your database; only one host is necessary.

If you like, you can verify the result of this command by describing your keyspace in cqlsh or DataStax Studio with `DESCRIBE KEYSPACE baselines`.

## Load Some Data

Before running a test of typical access patterns where you want to capture the results, you need to make the test more interesting than loading an empty table. For this, we use the rampup phase.

Before sending our test writes to the database, we will use the `stdout` activity type so we can see what nosqlbench is generating for CQL statements.

Go ahead and execute the following command:
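
A minimal sketch of such a command, assuming the `stdout` driver is pointed at the same `cql-keyvalue` workload (the exact flags may differ), would be:

```text
./nb run driver=stdout workload=cql-keyvalue cycles=10
```

The last of the generated statements from such a run are shown below.
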
```text
insert into baselines.keyvalue (key, value) values (8,296173906);
insert into baselines.keyvalue (key, value) values (9,97405552);
```

NoSQLBench deterministically generates data, so the generated values will be the same from run to run.

Now we are ready to write some data to our database. Go ahead and execute the following from your command line:

    ./nb run driver=cql workload=cql-keyvalue tags=phase:rampup host=<host-or-ip> cycles=100k --progress console:1s

Note the differences between this and the command that we used to generate the schema.

`tags=phase:rampup` is running the yaml block in `cql-keyvalue` that has only INSERT statements.

`cycles=100k` will run a total of 100,000 operations, in this case, 100,000 writes. You will want to pick an appropriately large number of cycles in actual testing to make your main test meaningful.

**NOTE:**
The cycles parameter is not just a quantity. It is a range of values. The `cycles=n` format is short for `cycles=0..n`, which makes cycles a zero-based range. For example, cycles=5 means that the activity will use cycles 0,1,2,3,4, but not 5. The reason for this is explained in detail in the Activity Parameters section.

These parameters are explained in detail in the section on _Activity Parameters_.

`--progress console:1s` will print the progression of the run to the console every 1 second.

You should see output that looks like this:

    cql-keyvalue: 100.00%/Finished (details: min=0 cycle=100000 max=100000)

## Run the main test phase

Now that we have a base dataset of 100k rows in the database, we will run a mixed read / write workload; by default this runs a 50% read / 50% write workload.

    ./nb run driver=cql workload=cql-keyvalue tags=phase:main host=<host-or-ip> cycles=100k cyclerate=5000 threads=50 --progress console:1s

We have a few new command line options here:

`tags=phase:main` is using a new block in our activity's yaml that contains both read and write queries.

`threads=50` is an important one. The default for nosqlbench is to run with a single thread. This is not adequate for workloads that will be running many operations, so threads is used as a way to increase concurrency on the client side.

`cyclerate=5000` is used to control the operations per second that are initiated by nosqlbench. This command line option is the primary means to rate limit the workload, and here we are running at 5000 ops/sec.

## Now What?

Note in the above output, we see `Logging to logs/scenario_20190812_154431_028.log`.

By default, nosqlbench records the metrics from the run in this file. We will go into detail about these metrics in the next section, Viewing Results.

# Example Results

We just ran a very simple workload against our database. In that example, we saw that nosqlbench writes to a log file and it is in that log file where the most basic form of metrics are displayed.

## Log File Metrics

For our previous run, we saw that nosqlbench was writing to `logs/scenario_20190812_154431_028.log`.

Even when you don't configure nosqlbench to write its metrics to another location, it will periodically report all the metrics to the log file. At the end of a scenario, before nosqlbench shuts down, it will flush the partial reporting interval again to the logs. This means you can always look in the logs for metrics information.

**WARNING:**
If you look in the logs for metrics, be aware that the last report will only contain a partial interval of results. When looking at the last partial window, only metrics which average over time or which compute the mean for the whole test will be meaningful.

Below is a sample of the log that gives us our basic metrics. There is a lot to digest here; for now we will only focus on a subset of the most important metrics.

```text
2019-08-12 15:46:00,274 INFO [main] i.e.c.ScenarioResult [ScenarioResult.java:48] -- BEGIN METRICS DETAIL --
...
2019-08-12 15:46:01,703 INFO [main] i.e.c.ScenarioResult [ScenarioResult.java:56] -- END METRICS DETAIL --
```

The log contains lots of information on metrics, but this is obviously _not_ the most desirable way to consume metrics from nosqlbench.

We recommend that you use one of these methods, according to your environment or tooling available:

1. `--docker-metrics` with a local docker-based grafana dashboard (See the section on Docker Based Metrics)
2. Send your metrics to a dedicated graphite server with `--report-graphite-to graphitehost`
3. Record your metrics to local CSV files with `--report-csv-to my_metrics_dir`
4. Record your metrics to HDR logs with `--log-histograms my_hdr_metrics.log`

See the command line reference for details on how to route your metrics to a metrics collector or format of your preference.

# Activity Parameters

Activity parameters are passed as named arguments for an activity, either on the command line or via a scenario script. On the command line, these take the form of

    <paramname>=<paramvalue>

Some activity parameters are universal in that they can be used with any driver type. These parameters are recognized by nosqlbench whether or not they are recognized by a particular driver implementation. These are called _core parameters_. Only core activity parameters are documented here.

**NOTE:**
To see what activity parameters are valid for a given activity type, see the documentation for that activity type with `nb help <activity type>`.

When starting out, you want to familiarize yourself with these parameters. The most important ones to learn about first are driver, cycles and threads.

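For instance, a single invocation might combine several of these core parameters; the workload and host values here are placeholders:

```text
./nb run driver=cql workload=cql-keyvalue host=<host-or-ip> threads=auto cycles=1M cyclerate=5000
```
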
## driver

For historic reasons, you can also use `type`. They both mean the same thing for now, but `driver` is more descriptive. The `type` parameter will continue to be supported in this major version (3.x), but it will be an error to use it in 4.x and newer.

- `driver=<activity type>`
- _default_: inferred from `alias` or `yaml` parameters, or unset

Every activity is powered by a named ActivityType. Thus, you must set the `type` parameter. If you do not specify this parameter, it will be inferred from a substring match against the alias and/or yaml parameters. If there is more than one valid match for a valid type value, then you must set the type parameter directly.

Telling nosqlbench what type of an activity will be run also determines what other parameters are considered valid and how they will be used. So in this way, the type parameter is actually the base parameter for any activity. When used with scenario commands like `run` or `start`, an activity of the named type will be initialized, and then further activity parameters on the command line will be used to configure it before it is started.

## alias

You *should* set the _alias_ parameter when you have multiple activities, when you want to name metrics per-activity, or when you want to control activities via scripting.

Each activity can be given a symbolic name known as an _alias_. It is good practice to give all your activities an alias, since this determines the name used in logging, metrics, and even scripting control.

_default value_ : The name of any provided YAML filename is used as the basis for the default alias. Otherwise, the activity type name is used. This is a convenience for simple test scenarios only.

## threads

You *should* set the _threads_ parameter when you need to ramp up a workload.

Each activity can be created with a number of threads. It is important to adjust this setting to the system types used by nosqlbench.

_default value_ : For now, the default is simply *1*. Users must be aware of this setting and adjust it to a reasonable value for their workloads.

`threads=auto` : When you set `threads=auto`, it will set the number of threads to 10x the number of cores in your system. There is no distinction here between full cores and hardware threads. This is generally a reasonable number of threads to tap into the processing power of a client system.

`threads=__x` : When you set `threads=5x` or `threads=10x`, you will set the number of threads to some multiplier of the logical CPUs in the local system.

**NOTE:**
The threads parameter will work slightly differently for activities using the async parameter. For example, when `async=500` is provided, then the number of async operations is split between all configured threads, and each thread will juggle a number of in-flight operations asynchronously. Without the async parameter, threads determines the logical concurrency level of nosqlbench in the classic 'request-per-thread' mode. Neither mode is strictly correct, and both modes can be used for more accurate testing depending on the constraints of your environment.

A good rule of thumb for setting threads for maximum effect is to set it relatively high, such as 10XvCPU when running synchronous workloads (when not providing the async parameter), and to 5XvCPU for all async workloads. Variation in system dynamics makes it difficult to peg an ideal number, so experimentation is encouraged while you dial in your settings initially.

## cycles

- _dynamic_: no

The cycles parameter determines the starting and ending point for an activity. It determines the range of values which will act as seed values for each operation. For each cycle of the test, a statement is built from a statement template and executed as an operation.

If you do not set the cycles parameter, then it will automatically be set to the size of the sequence. The sequence is simply the length of the op sequence that is constructed from the active statements and ratios in your activity YAML.

You *should* set the cycles for every activity except for schema-like activities, or activities which you run just as a sanity check of active statements.

The `cycles=<cycle count>` form specifies the total number of cycles, and is equivalent to `cycles=0..<cycle max>`. In both cases, the max value is not the actual number of the last cycle. This is because all cycle parameters define a closed-open interval. In other words, the minimum value is either zero by default or the specified minimum value, but the maximum value is the first value *not* included in the interval. This means that you can easily stack intervals over subsequent runs while knowing that you will cover all logical cycles without gaps or duplicates. For example, given `cycles=1000` and then `cycles=1000..2000`, and then `cycles=2000..5K`, you know that all cycles between 0 (inclusive) and 5000 (exclusive) have been specified.

## stride

- `stride=<stride>`
- _default_: same as op sequence length
- _required_: no

Usually, you don't want to provide a setting for stride, but it is still important to understand what it does. Within nosqlbench, each time a thread needs to allocate a set of cycles to operate on, it takes a contiguous range of values from a shared atomic value. Thus, the stride is the unit of micro-batching within nosqlbench. It also means that you can use stride to optimize a workload by setting the value higher than the default. For example, if you are running a single-statement workload at a very high rate, it doesn't make sense for threads to allocate one op at a time from a shared atomic value. You can simply set `stride=1000` to cause (ballpark estimation) about 1000X less internal contention.

The stride is initialized to the calculated sequence length. The sequence length is simply the number of operations in the op sequence that is planned from your active statements and their ratios.

You usually do not want to set the stride directly. If you do, make sure
|
||||
it is a multiple of what it would normally be set to if you need to ensure
|
||||
that sequences are not divided up differently. This can be important when
|
||||
simulating the access patterns of applications.
|
||||
|
||||
:::info
|
||||
When simulating multi-op access patterns in non-async mode, the
|
||||
stride metric can tell you how long it took for a whole group of
|
||||
operations to complete.
|
||||
:::
|
||||
|
||||
**NOTE:**
|
||||
When simulating multi-op access patterns in non-async mode, the stride
|
||||
metric can tell you how long it took for a whole group of operations to
|
||||
complete.
|
||||
|
||||
## async

@@ -185,21 +188,20 @@ operations to complete.
- _required_: no
- _dynamic_: no

The `async=<ops>` parameter puts an activity into an asynchronous dispatch
mode and configures each thread to juggle a proportion of the operations
specified. If you specify `async=500 threads=10`, then each of 10 threads
will manage execution of 50 operations at a time. With async mode, a
thread will always prepare and send operations if there are fewer in
flight than it is allotted before servicing any pending responses.

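As a sketch of the example above, the hypothetical invocation below gives
each of 10 threads a budget of 50 in-flight operations (the driver and
workload names are placeholders):

    nb run driver=cql workload=myworkload tags=phase:main async=500 threads=10 cycles=10M
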
Async mode also puts threads into a different sequencing behavior. When in
async mode, responses from an operation may arrive in a different order
than they are sent, and thus linearized operations can't be guaranteed as
with the non-async mode. This means that sometimes you want to avoid
async mode when you are intentionally simulating access patterns with
multiple linearized operations per user as you may see in your
application.

The absence of the async parameter leaves the activity in the default
non-async mode, where each thread works through a sequence of ops one
@@ -217,23 +219,23 @@ The cyclerate parameter sets a maximum op rate for individual cycles
within the activity, across the whole activity, irrespective of how many
threads are active.

**NOTE:**
The cyclerate is a rate limiter, and can thus only throttle an activity to
be slower than it would otherwise run. Rate limiting is also an invasive
element in a workload, and will always come at a cost. For extremely high
throughput testing, consider carefully whether your testing would benefit
more from concurrency-based throttling as with async or the striderate
described below.

When the cyclerate parameter is provided, two additional metrics are
tracked: the wait time and the response time. See the 'Reference|Timing
Terms' section for more details on these metrics.

_default_: None. When the cyclerate parameter is not provided, an activity
runs as fast as it can given how fast operations can complete.

Examples:

- `cyclerate=1000` - set the cycle rate limiter to 1000 ops/s and a
  default burst ratio of 1.1.
- `cyclerate=1000,1.0` - same as above, but with burstrate set to 1.0
@@ -242,15 +244,16 @@ Examples:
  50% burst allowed)

Synonyms:

- `rate`
- `targetrate`

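Putting these pieces together, a hypothetical invocation that pins an
activity to a target rate could look like the sketch below (driver and
workload names are placeholders):

    nb run driver=cql workload=myworkload tags=phase:main threads=100 cycles=50M cyclerate=5000,1.1
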
### burst ratio

This is only an optional part of the cyclerate as shown in the examples
above. If you do not specify it when you initialize a cyclerate, then it
defaults to 1.1. The burst ratio is only valid as part of a rate limit and
cannot be specified by itself.

* _default_: `1.1`
* _dynamic_: yes
@@ -259,31 +262,31 @@ The nosqlbench rate limiter provides a sliding scale between strict rate
limiting and average rate limiting. The difference between them is
controlled by a _burst ratio_ parameter. When the burst ratio is 1.0
(burst up to 100% relative rate), the rate limiter acts as a strict rate
limiter, disallowing faster operations from using time that was previously
forfeited by prior slower operations. This is a "use it or lose it" mode
that means things like GC events can steal throughput from a running
client as a necessary effect of losing time in a strict timing sense.

When the burst ratio is set to higher than 1.0, faster operations may
recover lost time from previously slower operations. For example, a burst
ratio of 1.3 means that the rate limiter will allow bursting up to 130% of
the base rate, but only until the average rate is back to 100% relative
speed. This means that any valleys created in the actual op rate of the
client can be converted into plateaus of throughput above the strict rate,
but only at a speed that fits within (op rate * burst ratio). This allows
for workloads to approximate the average target rate over time, with
controllable bursting rates. This ability allows for near-strict behavior
while allowing clients to still track truer to rate limit expectations, so
long as the overall workload is not saturating resources.

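To make the trade-off concrete, here are two hypothetical settings side by
side. The first enforces strict pacing, so time lost to pauses such as GC
is never recovered; the second allows catching up at up to 130% of the
base rate until the average is back on target:

    # strict rate limiting, no recovery of lost time
    cyclerate=10000,1.0

    # bursty rate limiting, recover lost time at up to 13000 ops/s
    cyclerate=10000,1.3
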
**NOTE:**
The default burst ratio of 1.1 makes testing results slightly more stable
on average, but can also hide some short-term slow-downs in system
throughput. It is set at the default to fit most testers' expectations for
averaging results, but it may not be strict enough for your testing
purposes. However, a strict setting of 1.0 nearly always adds cold/startup
time to the result, so if you are testing for steady state, be sure to
account for this across test runs.

## striderate

@@ -295,23 +298,24 @@ cold/startup time to the result, so if you are testing for steady state, be sure

The `striderate` parameter allows you to limit the start of a stride
according to some rate. This works almost exactly like the cyclerate
parameter, except that it blocks a whole group of operations from starting
instead of a single operation. The striderate can use a burst ratio just
as the cyclerate.

This sets the target rate for strides. In nosqlbench, a stride is a group
of operations that are dispatched and executed together within the same
thread. This is useful, for example, to emulate application behaviors in
which some outside request translates to multiple internal requests. It is
also a way to optimize a client runtime for more efficiency and
throughput. The stride rate limiter applies to the whole activity
irrespective of how many threads it has.

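As a sketch, if a stride of 5 operations stands in for one outside
request, the striderate can pace whole requests rather than individual
ops. The names below are placeholders, and the stride value assumed here
is illustrative only:

    nb run driver=cql workload=myworkload tags=phase:main stride=5 striderate=200,1.1
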
**WARNING:**
When using the cyclerate and striderate options together, operations are
delayed based on both rate limiters. If the relative rates are not
synchronized with the size of a stride, then one rate limiter will
artificially throttle the other. Thus, it usually doesn't make sense to
use both of these settings in the same activity.

## seq

@@ -321,8 +325,8 @@ other. Thus, it usually doesn't make sense to use both of these settings in the
- _dynamic_: no

The `seq=<bucket|concat|interval>` parameter determines the type of
sequencing that will be used to plan the op sequence. The op sequence is a
look-up-table that is used for each stride to pick statement forms
according to the cycle offset. It is simply the sequence of statements
from your YAML that will be executed, but in a pre-planned, and highly
efficient form.
@@ -335,14 +339,13 @@ might expect wil happen: those statements will occur multiple times to
meet their ratio in the op mix. You can customize the op mix further by
changing the seq parameter to concat or interval.

**NOTE:**
The op sequence is a look up table of statement templates, *not*
individual statements or operations. Thus, the cycle still determines the
uniqueness of an operation as you would expect. For example, if statement
form ABC occurs 3x per sequence because you set its ratio to 3, then each
of these would manifest as a distinct operation with fields determined by
distinct cycle values.

There are three schemes to pick from:

@@ -366,20 +369,21 @@ frequency over a unit interval of time, and apportions the associated
operation to occur evenly over that time. When two operations would be
assigned the same time, then the order of appearance establishes
precedence. In other words, statements appearing first win ties for the
same time slot. The ratios A:4 B:2 C:1 would yield the sequence A B C A A
B A. This occurs because, over the unit interval (0.0,1.0), A is assigned
the positions `A: 0.0, 0.25, 0.5, 0.75`, B is assigned the
positions `B: 0.0, 0.5`, and C is assigned position `C: 0.0`. These
offsets are all sorted with a position-stable sort, and then the
associated ops are taken as the order.

In detail, the rendering appears
as `0.0(A), 0.0(B), 0.0(C), 0.25(A), 0.5(A), 0.5(B), 0.75(A)`, which
yields `A B C A A B A` as the op sequence.

This sequencer is most useful when you want a stable ordering of
operations from a rich mix of statement types, where each operation is
spaced as evenly as possible over time, and where it is not important to
control the cycle-by-cycle sequencing of statements.

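For example, the sequencer can be chosen per activity with the `seq`
parameter. The sketch below uses placeholder driver and workload names:

    # bucket sequencing is the default
    nb run driver=cql workload=myworkload tags=phase:main cycles=1M

    # interleave statements as evenly as possible according to their ratios
    nb run driver=cql workload=myworkload tags=phase:main cycles=1M seq=interval
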
## hdr_digits

@@ -388,10 +392,12 @@ sequencing of statements.
- _required_: no
- _dynamic_: no

This parameter determines the number of significant digits used in all HDR
histograms for metrics collected from this activity. The default of 4
allows 4 significant digits, which means *up to* 10000 distinct histogram
buckets per named metric, per histogram interval. This does not mean that
there _will be_ 10000 distinct buckets, but it means there could be if
there is significant volume and variety in the measurements.

If you are running a scenario that creates many activities, then you can
set `hdr_digits=1` on some of them to save client resources.

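For example, a background activity that only needs coarse latency
resolution could be started with a lower setting. This is a sketch, and
the workload name is a placeholder:

    nb run driver=cql workload=background_load tags=phase:main cycles=100M hdr_digits=1
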
@@ -5,39 +5,44 @@ weight: 06

# Core Statement Parameters

Some statement parameters are recognized by the nosqlbench runtime and can
be used on any statement in a YAML file.

## *ratio*

A statement parameter called _ratio_ is supported by every workload. It
can be attached to a statement, a block, or a document-level parameter
block. It sets the relative ratio of a statement in the op sequence before
an activity is started.

When an activity is initialized, all of the active statements are combined
into a sequence based on their relative ratios. By default, all statement
templates are initialized with a ratio of 1 if none is specified by the
user.

For example, consider the statements below:

```yaml
statements:
  - s1: "select foo,bar from baz where ..."
    ratio: 1
  - s2: "select bar,baz from foo where ..."
    ratio: 2
  - s3: "select baz,foo from bar where ..."
    ratio: 3
```

If all statements are activated (there is no tag filtering), then the
activity will be initialized with a sequence length of 6. In this case,
the relative ratio of statement "s3" will be 50% overall. If you filtered
out the first statement, then the sequence would be 5 operations long. In
this case, the relative ratio of statement "s3" would be 60% overall. It
is important to remember that statement ratios are always relative to the
total sum of the active statements' ratios.

**NOTE:**
Because the ratio works so closely with the activity parameter `seq`, the
description for that parameter is included below.

### *seq* (activity level - do not use on statements)

@@ -46,52 +51,65 @@ below.
- _required_: no
- _dynamic_: no

The `seq=<bucket|concat|interval>` parameter determines the type of
sequencing that will be used to plan the op sequence. The op sequence is a
look-up-table that is used for each stride to pick statement forms
according to the cycle offset. It is simply the sequence of statements
from your YAML that will be executed, but in a pre-planned, and highly
efficient form.

An op sequence is planned for every activity. With the default ratio on
every statement as 1, and the default bucket scheme, the basic result is
that each active statement will occur once in the order specified. Once
you start adding ratios to statements, the most obvious thing that you
might expect will happen: those statements will occur multiple times to
meet their ratio in the op mix. You can customize the op mix further by
changing the seq parameter to concat or interval.

**NOTE:**
The op sequence is a look up table of statement templates, *not*
individual statements or operations. Thus, the cycle still determines the
uniqueness of an operation as you would expect. For example, if statement
form ABC occurs 3x per sequence because you set its ratio to 3, then each
of these would manifest as a distinct operation with fields determined by
distinct cycle values.

There are three schemes to pick from:

### bucket

This is a round robin planner which draws operations from buckets in
circular fashion, removing each bucket as it is exhausted. For example,
the ratios A:4, B:2, C:1 would yield the sequence A B C A B A A. The
ratios A:1, B:5 would yield the sequence A B B B B B.

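Applying this rule to the s1/s2/s3 example above (ratios 1, 2, and 3)
should give the planned sequence sketched below. The interleaving shown is
derived from the description here rather than captured from a run, and the
driver and workload names are placeholders:

    # seq=bucket is the default; shown explicitly for clarity
    nb run driver=cql workload=myworkload tags=phase:main seq=bucket cycles=6
    # round 1 draws one op from each bucket: s1 s2 s3  (s1 is exhausted)
    # round 2 draws from the remaining buckets: s2 s3  (s2 is exhausted)
    # round 3 draws the last op: s3
    # planned op sequence: s1 s2 s3 s2 s3 s3
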
### concat

This simply takes each statement template as it occurs in order and
duplicates it in place to achieve the ratio. The ratios above (A:4, B:2,
C:1) would yield the sequence A A A A B B C for the concat sequencer.

### interval

This is arguably the most complex sequencer. It takes each ratio as a
frequency over a unit interval of time, and apportions the associated
operation to occur evenly over that time. When two operations would be
assigned the same time, then the order of appearance establishes
precedence. In other words, statements appearing first win ties for the
same time slot. The ratios A:4 B:2 C:1 would yield the sequence A B C A A
B A. This occurs because, over the unit interval (0.0,1.0), A is assigned
the positions `A: 0.0, 0.25, 0.5, 0.75`, B is assigned the positions
`B: 0.0, 0.5`, and C is assigned position `C: 0.0`. These offsets are all
sorted with a position-stable sort, and then the associated ops are taken
as the order.

In detail, the rendering appears
as `0.0(A), 0.0(B), 0.0(C), 0.25(A), 0.5(A), 0.5(B), 0.75(A)`, which
yields `A B C A A B A` as the op sequence.

This sequencer is most useful when you want a stable ordering of
operations from a rich mix of statement types, where each operation is
spaced as evenly as possible over time, and where it is not important to
control the cycle-by-cycle sequencing of statements.

@@ -5,16 +5,20 @@ weight: 2

# Grafana Metrics

NoSQLBench comes with a built-in helper to get you up and running quickly
with client-side testing metrics. This functionality is based on docker,
and a built-in method for bringing up a docker stack, automated by
NoSQLBench.

**WARNING:**
This feature requires that you have docker running on the local system and
that your user is in a group that is allowed to manage docker. Using
the `--docker-metrics` option *will* attempt to manage docker on your
local system.

To ask nosqlbench to stand up your metrics infrastructure using a local
docker runtime, use this command line option with any other nosqlbench
commands:

    --docker-metrics

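For instance, a hypothetical run that also stands up the dockerized
metrics stack might look like the sketch below, with placeholder driver
and workload names:

    nb run driver=cql workload=myworkload tags=phase:main cycles=10M --docker-metrics
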
@@ -53,18 +53,22 @@ Each activity metric for a given activity alias is available at this name. This
directly. Some metrics objects have also been enhanced with wrapper logic
to provide simple getters and setters, like `.p99ms` or `.p99ns`, for
example.

Interaction with the nosqlbench runtime and the activities therein is made
easy by the above variables and objects. When an assignment is made to any
of these variables, the changes are propagated to internal listeners. For
changes to _threads_, the thread pool responsible for the affected
activity adjusts the number of active threads (AKA slots). Other changes
are further propagated directly to the thread harnesses and components
which implement the ActivityType.

**WARNING:**
Assignment to the _workload_ and _alias_ activity parameters has no
special effect, as you can't change an activity to a different driver once
it has been created.

You can make use of more extensive Java or Javascript libraries as needed,
mixing them with the runtime controls provided above.

## Enhanced Metrics for Scripting

@@ -5,10 +5,10 @@ weight: 13

# Advanced Testing

**NOTE:**
Some of the features discussed here are only for advanced testing
scenarios. First-time users should become familiar with the basic options
first.

## Hybrid Rate Limiting