mirror of
https://github.com/nosqlbench/nosqlbench.git
synced 2025-02-25 18:55:28 -06:00
doc updates for docsite testing
@@ -0,0 +1,42 @@
|
||||
# Workload Specification
|
||||
|
||||
This directory contains the testable specification for workload definitions used by NoSQLBench.
|
||||
All the content blocks in this section have been validated with the latest NoSQLBench build.
|
||||
|
||||
Usually, users will not need to delve too deeply into this section. It is useful as a detailed
|
||||
guide for contributors and driver developers. If you are using a driver which leaves you
|
||||
wondering what a good op template example looks like, then the driver needs better examples in
|
||||
its documentation!
|
||||
|
||||
# Synopsis
|
||||
|
||||
There are two primary views of workload definitions that we care about:
|
||||
|
||||
1. The User View of **op templates**
|
||||
1. Op templates are simply the schematic recipes for building an operation once you know the
|
||||
cycle it is for.
|
||||
2. Op templates are provided by users in YAML or JSON or even directly via runtime API. This
|
||||
is called a workload template, which contains op templates.
|
||||
3. Op templates can be provided with optional metadata which serves to label, group,
|
||||
parameterize or otherwise make the individual op templates more manageable.
|
||||
4. A variety of forms are supported which are self-evident, but which allow users to have
|
||||
some flexibility in how they structure their YAML, JSON, or runtime collections. **This
|
||||
specification is about how these various forms are allowed, and how they relate to a
|
||||
fully-qualified and de-normalized op template view.**
|
||||
2. The Developer View of the ParsedOp API. This is the view of an op template which presents the
|
||||
developer with a very high-level toolkit for building op synthesis functions.
|
||||
|
||||
# Details
|
||||
|
||||
The documentation in this directory serves as a testable specification for all the above. It
|
||||
shows specific examples of all the valid op template forms in both YAML and JSON, as well as how
|
||||
the data is normalized to feed the developer's view of the ParsedOp API.
|
||||
|
||||
## Related Reading
|
||||
|
||||
If you want to understand the rest of this document, it is crucial that you have a working knowledge
|
||||
of the standard YAML format and several examples from the current drivers. You can learn this from
|
||||
the main documentation which demonstrates step-by-step how to build a workload. Reading further in
|
||||
this document will be most useful for core NB developers, or advanced users who want to know all
|
||||
the possible ways of building workloads.
|
||||
|
||||
@@ -0,0 +1,205 @@
|
||||
# ParsedOp API
|
||||
|
||||
In the workload template examples, we show statements as being formed from a string value. This is a
|
||||
specific type of statement form, although it is possible to provide structured op templates as well.
|
||||
|
||||
**The ParsedOp API is responsible for converting all valid op template forms into a consistent and
|
||||
unambiguous model.** Thus, the rules for mapping the various forms to the command model must be
|
||||
precise. Those rules are the substance of this specification.
|
||||
|
||||
## Op Synthesis
|
||||
|
||||
Executable operations are _created_ on the fly by NoSQLBench via a process called _Op Synthesis_.
|
||||
This is done incrementally in stages. The following narrative describes this process in logical
|
||||
stages. (The implementation may vary from this, but it explains the effects, nonetheless.)
|
||||
|
||||
Everything here happens **during** activity initialization, before the activity starts running
|
||||
cycles:
|
||||
|
||||
1. *Template Variable Expansion* - If there are template variables, such as
|
||||
`TEMPLATE(name,defaultval)` or `<<name:defaultval>>`, then these are expanded
|
||||
according to their defaults and any overrides provided in the activity params. This is a macro
|
||||
substitution only, so the values are simply interposed into the character stream of the document.
|
||||
2. *Jsonnet Evaluation* - If the source file was in jsonnet format (the extension was `.jsonnet`)
|
||||
then it is interpreted by sjsonnet, with all activity parameters available as external variables.
|
||||
3. *Structural Normalization* - The workload template (yaml, json, or data structure) is loaded
|
||||
into memory and transformed into a standard format. This means taking various list and map
|
||||
forms at every level and converting them to a singular standard form in memory.
|
||||
4. *Auto-Naming* - All elements which do not already have a name are assigned a simple name like
|
||||
`block2` or `op3`.
|
||||
5. *Auto-Tagging* - All op templates are given standard tag values under reserved tag names:
|
||||
- **block**: the name of the block containing the op template. For example: `block2`.
|
||||
- **name**: the name of the op template, prefixed with the block value and `--`. For example,
|
||||
`block2--op1`.
|
||||
6. *Property De-normalization* - Default values for all the standard op template properties are
|
||||
copied from the doc to the block layer unless the same-named key exists. Then the same
|
||||
method is applied from the block layer to the op template layer. **At this point, the op
|
||||
templates are effectively an ordered list of data structures, each containing all necessary
|
||||
details for use.**
|
||||
7. *Tag Filtering* - The activity's `tags` param is used to filter all the op templates
|
||||
according to their tag map.
|
||||
8. *Bind Points and Capture Points* - Each op template is now converted into a ParsedOp, which is
a Swiss Army knife of op template introspection and function generation. It is the direct
programmatic API that driver adapters use in subsequent steps.
|
||||
- Any string sequences with bind points like `this has a {bindpoint}` are automatically
|
||||
converted to a long -> string function.
|
||||
- Any direct references with no surrounding text like `{bindpoint}` are automatically
|
||||
converted to direct binding references.
|
||||
- Any other string form is cached as a static value.
|
||||
- The same process is applied to Lists and Maps, allowing structural templates which read
|
||||
like JSON with bind points in arbitrary places.
|
||||
9. *Op Mapping* - Using the ParsedOp API, each op template is categorized by the active `driver`
according to that driver's documented examples and type-matching rules. Once the op mapper
determines what op type a user intended, it uses this information and the associated op
fields to create an *Op Dispenser*.
10. *Op Sequencing* - The op dispensers are kept as an internal sequence, and installed into a
|
||||
[LUT](https://en.wikipedia.org/wiki/Lookup_table) according to their ratios and the specified
|
||||
(or default) sequencer. By default, round-robin with bucket exhaustion is used. The ratios
|
||||
specified are used directly in the LUT.
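
Step 1 above is plain text substitution, which can be sketched in a few lines. The example below is illustrative only: NoSQLBench itself is written in Java, Python is used here for brevity, and the function name `expand_template_vars` and both regular expressions are assumptions made for this sketch (they ignore nesting and escaping, which the real parser may handle differently).

```python
import re

# Patterns for TEMPLATE(name,defaultval) and <<name:defaultval>>.
# Assumed for illustration; nesting and escaping are not handled.
_PATTERNS = [
    re.compile(r"TEMPLATE\((?P<name>\w+),(?P<default>[^)]*)\)"),
    re.compile(r"<<(?P<name>\w+):(?P<default>[^>]*)>>"),
]


def expand_template_vars(text, params):
    """Replace each template variable with its override from params, else its default."""

    def substitute(match):
        return str(params.get(match.group("name"), match.group("default")))

    for pattern in _PATTERNS:
        text = pattern.sub(substitute, text)
    return text


raw = "keyspace: TEMPLATE(keyspace,baselines)\nrf: <<rf:1>>"
expanded = expand_template_vars(raw, {"rf": "3"})
```

Here `keyspace` falls back to its default because no override is given, while `rf` takes the value supplied in the activity params.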
|
||||
|
||||
When this is complete, you are left with an efficient lookup table which indexes into a set of
|
||||
OpDispensers. The length of this lookup table is called the _sequence length_, and that value is
|
||||
used, by default, to set the _stride_ for the activity. This stride determines the size of
|
||||
per-thread cycle batching, effectively turning each sequence into a thread-safe set of
|
||||
operations which are serialized, and thus suitable for testing linearized operations with
|
||||
suitable dependency and error-handling mechanisms. (But wait, there's more!)
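
The relationship between ratios, the lookup table, and the stride can be sketched as follows. This is one plausible reading of "round-robin with bucket exhaustion", written in Python for brevity; the names `bucket_sequence` and `op_for_cycle` are hypothetical and do not refer to NoSQLBench classes.

```python
def bucket_sequence(ratios):
    """Build a lookup table by drawing one op from each non-empty bucket in turn."""
    remaining = dict(ratios)  # op name -> draws left in that op's bucket
    lut = []
    while any(count > 0 for count in remaining.values()):
        for name in ratios:
            if remaining[name] > 0:
                lut.append(name)
                remaining[name] -= 1
    return lut


lut = bucket_sequence({"read": 3, "write": 1})
stride = len(lut)  # by default, the stride equals the sequence length


def op_for_cycle(cycle):
    """Select the op for a cycle by indexing into the lookup table."""
    return lut[cycle % stride]
```

With ratios of 3 and 1, the sequence length is 4, so each thread would take cycles in batches of 4 and walk the table once per batch.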
|
||||
|
||||
## Special Cases
|
||||
|
||||
Drivers are assigned to op templates individually, meaning you can specify the driver within an
op template without assigning a default for the activity at all. Further, certain drivers are able to
|
||||
fill in missing details for op templates, like the `stdout` driver which only requires bindings.
|
||||
|
||||
This means that there are distinct cases for configuration which are valid, and these are
|
||||
checked at initialization time:
|
||||
|
||||
- A `driver` must be selected for each op template either directly or via activity params.
|
||||
- If the whole workload template provided does not include actual op templates **AND** a
|
||||
default driver is provided which can create synthetic op templates, it is given the raw
|
||||
workload template, incomplete as it is, and asked to provide op templates which have all
|
||||
normalization, naming, etc. already done. This is injected before the tag-filtering phase.
|
||||
- In any case that an actual non-zero list of op templates is provided and tag filtering removes
|
||||
them all, an error is thrown.
|
||||
- If, after tag filtering, no op templates are in the active list, an error is thrown.
|
||||
|
||||
# The ParsedOp
|
||||
|
||||
The components of a fully-parsed op template (AKA a ParsedOp) are:
|
||||
|
||||
## name
|
||||
|
||||
Each ParsedOp knows its name, which is simply the op template name that it was made from. This
|
||||
is useful for diagnostics, logging, and metrics.
|
||||
|
||||
## description
|
||||
|
||||
Every named element of a workload may be given a description.
|
||||
|
||||
## tags
|
||||
|
||||
Every op template has tags, even if they are auto-assigned from the block and op template names.
|
||||
If you assign explicit tags to an op template, the standard tags are still provided. Thus, it is
|
||||
an error to directly provide a tag named `block` or `name`.
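
The tagging rule can be sketched like this. The function name `standard_tags` and the choice of `ValueError` are assumptions for illustration, written in Python for brevity; only the reserved tag names and the `block--op` naming pattern come from the text above.

```python
RESERVED_TAGS = ("block", "name")


def standard_tags(block_name, op_name, user_tags=None):
    """Return the tag map for one op template, adding the reserved standard tags."""
    tags = dict(user_tags or {})
    clashes = [tag for tag in RESERVED_TAGS if tag in tags]
    if clashes:
        # Users may not set the reserved tag names themselves.
        raise ValueError(f"reserved tag names may not be set directly: {clashes}")
    tags["block"] = block_name
    tags["name"] = f"{block_name}--{op_name}"
    return tags


tags = standard_tags("block2", "op1", {"phase": "main"})
```

User-provided tags such as the hypothetical `phase` are kept, and the two standard tags are always added alongside them.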
|
||||
|
||||
## bindings
|
||||
|
||||
Although bindings are usually defined as a workload-template-level property, they can also be
|
||||
provided directly as an op field property.
|
||||
|
||||
## op fields
|
||||
|
||||
The **op** property of an op template or ParsedOp is the root of the op fields. This is a map of
|
||||
specific fields specified by the user.
|
||||
|
||||
### static op fields
|
||||
|
||||
Some op fields are simply static values. Since these values are not generated per cycle, they are
|
||||
kept separate as reference data. Knowing which fields are static and which are not makes it
|
||||
possible for developers to optimize op synthesis.
|
||||
|
||||
### dynamic op fields
|
||||
|
||||
Other fields may be specified as recipes, with the actual value to be filled-in once the cycle
|
||||
value is known. All such fields are known as _dynamic op fields_, and are provided to the op
|
||||
dispenser as a long function, where the input is always the cycle value and the output is a
|
||||
type-specific value as determined by the associated binding recipe.
|
||||
|
||||
### bind points
|
||||
|
||||
This is how dynamic values are indicated. Each bind point in an op template results in some type of
|
||||
procedural generation binding. These can be references to named bindings elsewhere in the
|
||||
workload template, or they can be inline.
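
The three cases for string fields (static value, direct binding reference, and text with embedded bind points) can be sketched as below. This is a simplified model in Python, not the ParsedOp implementation: `compile_field` is a hypothetical name, and the `bindings` map uses plain lambdas as stand-ins for real binding recipes.

```python
import re

BIND_POINT = re.compile(r"\{(\w+)\}")


def compile_field(template, bindings):
    """Classify a string op field and return (kind, function of cycle)."""
    direct = BIND_POINT.fullmatch(template)
    if direct:
        # A bare "{name}" refers directly to the binding and keeps its value type.
        return "direct", bindings[direct.group(1)]
    if BIND_POINT.search(template):
        # Text with embedded bind points becomes a long -> string function.
        def render(cycle):
            return BIND_POINT.sub(lambda m: str(bindings[m.group(1)](cycle)), template)

        return "template", render
    # No bind points: the value is static and identical for every cycle.
    return "static", lambda cycle: template


bindings = {"userid": lambda cycle: cycle * 10}  # stand-in for a binding recipe
kind, fn = compile_field("select * from users where userid={userid}", bindings)
```

Calling `fn` with a cycle value renders the statement for that cycle, while a field containing only `{userid}` would return the binding's own value without converting it to a string.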
|
||||
|
||||
### capture points
|
||||
|
||||
Names of result values to save, and the variable names they are to be saved as. The names represent
|
||||
the name as it would be found in the native driver's API, such as the name `userid`
|
||||
in `select userid from ...`. In string form statements, users can specify that the userid should be
|
||||
saved as the thread-local variable named *userid* simply by tagging it
|
||||
like `select [userid] from ...`. They can also specify that this value should be captured under a
|
||||
different name with a variation like `select [userid as user_id] from ...`. This is the standard
|
||||
variable capture syntax for any string-based statement form.
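
The capture syntax can be sketched with a single regular expression. The pattern and the name `parse_captures` are assumptions for illustration, written in Python for brevity; the real parser may accept more forms than the two shown in the text.

```python
import re

# Matches [field] and [field as alias].
CAPTURE = re.compile(r"\[(\w+)(?:\s+as\s+(\w+))?\]")


def parse_captures(statement):
    """Return (statement without capture markers, {result name: variable name})."""
    captures = {}

    def strip_marker(match):
        field, alias = match.group(1), match.group(2)
        captures[field] = alias or field
        return field  # leave the plain field name in the statement

    return CAPTURE.sub(strip_marker, statement), captures


stmt, captures = parse_captures("select [userid as user_id] from db.users")
```

The cleaned statement is what would be handed to the native driver, and the map records which result field is saved under which variable name.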
|
||||
|
||||
### params
|
||||
|
||||
A backwards-compatible feature called op params is still available. This is another root
|
||||
property within an op template which can be used to accessorize op fields. By default, any op
|
||||
field which is not explicitly rooted under the `op` property is put there anyway. This is also
true when there is an explicit `params` property. However, if the `op` property is provided, then
all non-reserved fields are given to the `params` property instead. If both the `op` and the
`params` op properties are specified, then no non-reserved op fields are allowed outside of these
|
||||
root values. Thus it is possible to still support params, but it is **highly** recommended that
|
||||
new driver developers avoid using this field, and instead allow all fields to be automatically
|
||||
anchored under the `op` property. This keeps configs terse and simple going forward.
|
||||
|
||||
Params may not be dynamic.
|
||||
|
||||
# Mapping Rules
|
||||
|
||||
A ParsedOp does not necessarily describe a specific low-level operation to be performed by
|
||||
a native driver. It *should* do so, but it is up to the user to provide a valid op template
|
||||
according to the documented rules of op construction for that driver type. These rules should be
|
||||
clearly documented by the driver developer as examples in markdown that is required for every
|
||||
driver. With this documentation, users can use `nb5 help <driver>` to see exactly how
|
||||
to create op templates for a given driver.
|
||||
|
||||
## String Form
|
||||
|
||||
Basic operations are made from a statement in some type of query language:
|
||||
|
||||
```yaml
ops:
  - stringform: select [userid] from db.users where user='{username}';
    bindings:
      username: NumberNameToString()
```
|
||||
|
||||
# Reserved op fields
|
||||
|
||||
The property names `ratio`, `driver`, and `space` are considered reserved by the NoSQLBench runtime.
|
||||
These are extracted and handled specially by the core runtime.
|
||||
|
||||
# Base OpDispenser fields
|
||||
|
||||
The BaseOpDispenser, which <s>is</s> will be required as the base implementation of any op
|
||||
dispenser going forward, provides cross-cutting functionality. These include `start-timers`,
|
||||
`stop-timers`, `instrument`, and likely will include more as future cross-driver functionality is
|
||||
added. These fields will be considered reserved property names.
|
||||
|
||||
# Optimization
|
||||
|
||||
It should be noted that the op mapping process, where user intentions are mapped from op templates to
|
||||
op dispensers, is not something that needs to be done quickly. This occurs at _initialization_
|
||||
time. Instead, it is more important to focus on user experience factors, such as flexibility,
|
||||
obviousness, robustness, correctness, and so on. Thus, priority of design factors in this part
|
||||
of NB is placed more on clear and purposeful abstractions and less on optimizing for speed. The
|
||||
clarity and detail which is conveyed by this layer to the driver developer will then enable
|
||||
them to focus on building fast and correct op dispensers. These dispensers are also constructed
|
||||
before the workload starts running, but are used at high speed while the workload is running.
|
||||
|
||||
In essence:
|
||||
- Any initialization code which happens before or in the OpDispenser constructor should not be
|
||||
concerned with careful performance optimization.
|
||||
- Any code which occurs within the OpDispenser#apply method should be as lightweight as is
|
||||
reasonable.
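
The split between initialization cost and per-cycle cost can be sketched as follows. `StatementDispenser` is a hypothetical class written in Python for brevity; it only mirrors the shape of an op dispenser (construct once, apply per cycle) and is not the NoSQLBench `OpDispenser` API.

```python
import re


class StatementDispenser:
    """Hypothetical dispenser: parse once in the constructor, stay cheap in apply."""

    def __init__(self, template, bindings):
        # Initialization time: clarity matters more than speed here.
        # re.split with one capture group alternates literal text and bind point names.
        pieces = re.split(r"\{(\w+)\}", template)
        self.literals = pieces[0::2]
        self.functions = [bindings[name] for name in pieces[1::2]]

    def apply(self, cycle):
        # Cycle time: only call the precomputed functions and join the results.
        out = [self.literals[0]]
        for function, literal in zip(self.functions, self.literals[1:]):
            out.append(str(function(cycle)))
            out.append(literal)
        return "".join(out)


dispenser = StatementDispenser(
    "insert into t (k, v) values ({key}, '{value}')",
    {"key": lambda cycle: cycle, "value": lambda cycle: f"v{cycle}"},
)
```

All template parsing and binding lookup happens once in the constructor, so `dispenser.apply(cycle)` does no parsing on the hot path.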
|
||||
|
||||
@@ -1,99 +0,0 @@
|
||||
# Workload Specification
|
||||
|
||||
This directory contains the testable specification for workload definitions used by NoSQLBench.
|
||||
|
||||
## Op Templates vs Developer API
|
||||
There are two primary views of workload definitions that we care about:
|
||||
|
||||
1. The User View of **op templates**
|
||||
1. Op templates are simply the schematic recipes for building an operation.
|
||||
2. Op templates are provided by users in YAML or JSON or even directly via runtime API.
|
||||
3. Op templates can be provided with optional metadata which serves to label, group or
|
||||
otherwise make the individual op templates more manageable.
|
||||
4. A variety of forms are supported which are self-evident, but which allow users to have
|
||||
some flexibility in how they structure their YAML, JSON, or runtime collections.
|
||||
2. The Developer View of the ParsedOp API -- All op templates, regardless of the form they are
|
||||
provided in, are processed into a normalized internal data structure.
|
||||
1. The detailed documentation for the ParsedOp API is in javadoc.
|
||||
|
||||
The documentation in this directory serves as a testable specification for all the above. It
|
||||
shows specific examples of all the valid op template forms in both YAML and JSON, as well as how
|
||||
the data is normalized to feed the developer's view of the ParsedOp API.
|
||||
|
||||
If you are a new user, it is recommended that you read the basic docs first before delving into
|
||||
these specification-level docs too much. The intro docs show normative and simple ways to
|
||||
specify workloads without worrying too much about all the possible forms.
|
||||
|
||||
## Templating Language
|
||||
|
||||
When users want to specify a set of operations to perform, they do so with the workload templating
|
||||
format, which includes document level details, block level details, and op level details.
|
||||
Specific reserved words like `block` or `ops` are used in tandem with nesting structure to
|
||||
define all valid workload constructions. Because of this, workload definitions are
|
||||
essentially data structures comprised of basic collection types and primitive values. Any on-disk
|
||||
format which can be loaded as such can be a valid source of workload definitions.
|
||||
|
||||
- [SpecTest Formatting](spectest_formatting.md) - A primer on the example formats used here
|
||||
- [Workload Structure](workload_structure.md) - Overall workload structure, keywords, nesting
|
||||
features
|
||||
- [Op Template Basics](op-template-basics.md) - Basic Details of op templating
|
||||
- [Op Template Variations](op_template_variations.md) - Additional op template variants
|
||||
and corner cases
|
||||
- [Template Variables](template_variables.md) - Textual macros and default values
|
||||
|
||||
## ParsedOp API
|
||||
|
||||
After a workload template is loaded into an activity, it is presented to the driver in an API which
|
||||
is suitable for building executable ops in the native driver.
|
||||
|
||||
- [ParsedOp API](parsed_op_api.md) - Defines the API which developers see after a workload is fully
|
||||
loaded.
|
||||
|
||||
## Related Reading
|
||||
|
||||
If you want to understand the rest of this document, it is crucial that you have a working knowledge
|
||||
of the standard YAML format and several examples from the current drivers. You can learn this from
|
||||
the main documentation which demonstrates step-by-step how to build a workload. Reading further in
|
||||
this document will be most useful for core NB developers, or advanced users who want to know all
|
||||
the possible ways of building workloads.
|
||||
|
||||
## Op Mapping Stages
|
||||
|
||||
The process of loading a workload definition occurs in several discrete steps during a NoSQLBench
|
||||
session:
|
||||
|
||||
1. The workload file is loaded.
|
||||
2. Template variables from the activity parameters are interposed into the raw contents of the
|
||||
file.
|
||||
3. The file is deserialized from its native form into a raw data structure.
|
||||
4. The raw data structure is transformed into a normalized data structure according to the Op
|
||||
Template normalization rules.
|
||||
5. Each op template is then denormalized as a self-contained data
|
||||
structure, containing all the provided bindings, params, and tags from the upper layers of the
|
||||
doc structure.
|
||||
6. The data is provided to the ParsedOp API for use by the developer.
|
||||
7. The DriverAdapter is loaded which understands the op fields provided in the op template.
|
||||
8. The DriverAdapter uses its documented rules to determine which types of native driver operations
|
||||
each op template is intended to represent. This is called **Op Mapping**.
|
||||
9. The DriverAdapter (via the selected Op Mapper) uses the identified types to create dispensers of
|
||||
native driver operations. This is called **Op Dispensing**.
|
||||
10. The op dispensers are arranged into an indexed bank of op sources according to the specified
|
||||
ratios and/or sequencing strategy. From this point on, NoSQLBench has the ability to
|
||||
construct an operation for any given cycle at high speed.
|
||||
|
||||
These specifications are focused on steps 2-5. The DriverAdapter focuses on the developer's use of
|
||||
the ParsedOp API, and as such is documented in javadoc primarily. Some details on the ParsedOp
|
||||
API are shared here for basic awareness, but developers should look to the javadoc for the full
|
||||
story.
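
Step 5 above, where each op template becomes self-contained, can be sketched as a layered merge. This is a simplified model in Python: the function name `denormalize` and the dictionary layout (a doc with `blocks`, each holding `ops`) are assumptions for illustration rather than the actual normalized format.

```python
def denormalize(doc):
    """Flatten a normalized workload into self-contained op templates."""
    inherited = ("bindings", "params", "tags")
    ops = []
    for block in doc.get("blocks", []):
        for op in block.get("ops", []):
            flat = dict(op)
            for key in inherited:
                merged = {}
                # Later updates win, so the most specific layer overrides the others.
                for layer in (doc, block, op):
                    merged.update(layer.get(key, {}))
                flat[key] = merged
            ops.append(flat)
    return ops


doc = {
    "bindings": {"userid": "ToString()"},
    "tags": {"phase": "main"},
    "blocks": [
        {
            "name": "block1",
            "tags": {"phase": "rampup"},
            "ops": [{"name": "op1", "op": "select * from users where userid={userid}"}],
        }
    ],
}
ops = denormalize(doc)
```

In this example the op inherits the doc-level binding, while the block-level `phase` tag overrides the doc-level one.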
|
||||
|
||||
## Mapping vs Running
|
||||
|
||||
It should be noted that the Op Mapping stage, where user intentions are mapped from op templates to
|
||||
native operations, is not something that needs to be done quickly. This occurs at
|
||||
_initialization_ time. Instead, it is more important to focus on user experience factors, such as
|
||||
flexibility, obviousness, robustness, correctness, and so on. Thus, priority of design factors in
|
||||
this part of NB is placed more on clear and purposeful abstractions and less on optimizing for
|
||||
speed. The clarity and detail which is conveyed by this layer to the driver developer will then
|
||||
enable them to focus on building fast and correct op dispensers. These dispensers are also
|
||||
constructed before the workload starts running, but are used at high speed while the workload
|
||||
is running.
|
||||
@@ -1,301 +0,0 @@
|
||||
# ParsedOp API
|
||||
|
||||
In the workload template examples, we show statements as being formed from a string value. This is a
|
||||
specific type of statement form, although it is possible to provide structured op templates as well.
|
||||
|
||||
**The ParsedOp API is responsible for converting all valid op template forms into a consistent and
|
||||
unambiguous model.** Thus, the rules for mapping the various forms to the command model must be
|
||||
precise. Those rules are the substance of this specification.
|
||||
|
||||
## Op Synthesis
|
||||
|
||||
The method of turning an op template, some data generation functions, and some seed values into an
|
||||
executable operation is called *Op Synthesis* in NoSQLBench. This is done in incremental stages:
|
||||
|
||||
1. During activity initialization, NoSQLBench parses the workload template and op templates
|
||||
contained within. Each active op template (after filtering) is converted to a parsed command.
|
||||
2. The NB driver uses the parsed command to guide the construction of an OpDispenser<T>. This is a
|
||||
dispenser of operations that can be executed by the driver's Action implementation.
|
||||
3. When it is time to create an actual operation to be executed, unique with its own procedurally
|
||||
generated payload and settings, the OpDispenser<T> is invoked as a LongFunction<T>. The input
|
||||
provided to this function is the cycle number of the operation. This is essentially a seed that
|
||||
determines the content of all the dynamic fields in the operation.
|
||||
|
||||
This process is non-trivial in that it is an incremental creational pattern, where the resultant
|
||||
object is contextual to some native API. The command API is intended to guide and enable op
|
||||
synthesis without tying developers' hands.
|
||||
|
||||
## Command Fields
|
||||
|
||||
A command structure is intended to provide all the fields needed to fully realize a native
|
||||
operation. Some of these fields will be constant, or *static* in the op template, expressed simply
|
||||
as strings, numbers, lists or maps. These are parsed from the op template as such and are cached in
|
||||
the command structure as statics.
|
||||
|
||||
Other fields are only prescribed as recipes. This comes in two parts: 1) The description for how to
|
||||
create the value from a binding function, and 2) the binding point within the op template. Suppose
|
||||
you have a string-based op template like this:
|
||||
|
||||
```yaml
|
||||
ops:
|
||||
- op1: select * from users where userid={userid}
|
||||
bindings:
|
||||
userid: ToString();
|
||||
```
|
||||
|
||||
In this case, there is only one op in the list of ops, having a name `op1` and a string form op
|
||||
template of `select * from users where userid={userid}`.
|
||||
|
||||
## Parsed Command Structure
|
||||
|
||||
Once an op template is parsed into a *parsed command*, it has the state shown in the data structure
|
||||
schematic below:
|
||||
|
||||
```json
|
||||
|
||||
{
|
||||
"name": "some-map-name",
|
||||
"statics": {
|
||||
"s1": "v1",
|
||||
"s2": {
|
||||
"f1": "valfoo"
|
||||
}
|
||||
},
|
||||
"dynamics": {
|
||||
"d1": "NumberNameToString()"
|
||||
},
|
||||
"captures": {
|
||||
"resultprop1": "asname1"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
If either an **op** or **stmt** field is provided, then the same structure as above is used:
|
||||
|
||||
```json
|
||||
|
||||
{
|
||||
"name": "some-string-op",
|
||||
"statics": {
|
||||
},
|
||||
"dynamics": {
|
||||
"op": "select username from table where name userid={userid}"
|
||||
},
|
||||
"captures": {
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The parts of a parsed command structure are:
|
||||
|
||||
### command name
|
||||
|
||||
Each command knows its name, just like an op template does. This can be useful for diagnostics and
|
||||
metric naming.
|
||||
|
||||
### static fields
|
||||
|
||||
The field names which are statically assigned and their values of any type. Since these values are
|
||||
not generated per-op, they are kept separate as reference data. Knowing which fields are static and
|
||||
which are not makes it possible for developers to optimize op synthesis.
|
||||
|
||||
### dynamic fields
|
||||
|
||||
Named binding points within the op template. These values will only be known for a given cycle.
|
||||
|
||||
### variable captures
|
||||
|
||||
Names of result values to save, and the variable names they are to be saved as. The names represent
|
||||
the name as it would be found in the native driver's API, such as the name `userid`
|
||||
in `select userid from ...`. In string form statements, users can specify that the userid should be
|
||||
saved as the thread-local variable named *userid* simply by tagging it
|
||||
like `select [userid] from ...`. They can also specify that this value should be captured under a
|
||||
different name with a variation like `select [userid as user_id] from ...`. This is the standard
|
||||
variable capture syntax for any string-based statement form.
|
||||
|
||||
# Resolved Command Structure
|
||||
|
||||
Once an op template has been parsed into a command structure, the runtime has everything it needs to
|
||||
know in order to realize a specific set of field values, *given a cycle number*. Within a cycle, the
|
||||
cycle number is effectively a seed value that drives the generation of all dynamic data for that
|
||||
cycle.
|
||||
|
||||
However, this seed value is only known by the runtime once it is time to execute a specific cycle.
|
||||
Thus, it is the developer's job to tell the NoSQLBench runtime how to map from the parsed structure
|
||||
to a native type of executable operation suitable for execution with that driver.
|
||||
|
||||
# Interpretation
|
||||
|
||||
A command structure does not necessarily describe a specific low-level operation to be performed by
|
||||
a native driver. It *should* do so, but it is up to the user to provide a valid op template
|
||||
according to the documented rules of op construction for that driver type. These rules should be
|
||||
clearly documented by the driver developer.
|
||||
|
||||
Once the command structure is provided, the driver takes over and maps the fields into an executable
|
||||
op -- *almost*. In fact, the driver developer defines the ways that a command structure can be
|
||||
turned into an executable operation. This is expressed as a *Function<CommandTemplate,T>* where T is
|
||||
the type used in the native driver's API.
|
||||
|
||||
How a developer maps a structure like the above to an operation is up to them. The general rule of
thumb is to use the most obvious and familiar representation of an operation as it would appear to a
user. If this is CQL or SQL, then use that as the statement form. If it is GraphQL, use that.
|
||||
In both of these cases, you have access to
|
||||
|
||||
## String Form
|
||||
|
||||
Basic operations are made from a statement in some type of query language:
|
||||
|
||||
```yaml
|
||||
ops:
|
||||
- stringform: select [userid] from db.users where user='{username}';
|
||||
bindings:
|
||||
username: NumberNameToString()
|
||||
```
|
||||
|
||||
## Structured Form
|
||||
|
||||
Some operations can't be easily represented by a single statement. Some operations are built from a
|
||||
set of fields which describe more about an operation than the basic statement form. These types of
|
||||
operations are expressed to NoSQLBench in map or *object* form, where the fields within the op can
|
||||
be specified independently.
|
||||
|
||||
```yaml
|
||||
ops:
|
||||
- structured1:
|
||||
stmt: select * from db.users where user='{username}';
|
||||
prepared: true
|
||||
consistency_level: LOCAL_QUORUM
|
||||
bindings:
|
||||
username: NumberNameToString();
|
||||
- structured2:
|
||||
cmdtype: "put"
|
||||
namespace: users
|
||||
key: { userkey }
|
||||
body: "User42 was here"
|
||||
bindings:
|
||||
userkey: FixedValue(42)
|
||||
```
|
||||
|
||||
In the first case, the op named *structured1* is provided as a string value within a map structure.
|
||||
The *stmt* field is a reserved word (synonymous with op and operation). When you are reading an op
from the command API, these will be represented in exactly the same way as the stringform example
|
||||
above.
|
||||
|
||||
In the second case, the op named *structured2* is provided as a map. Both of these examples would
|
||||
make sense to a user, as they are fairly self-explanatory.
|
||||
|
||||
Op templates may specify an op as either a string or a map. No other types are allowed. However,
|
||||
there are no restrictions placed on the elements below a map.
|
||||
|
||||
The driver developer should not have to parse all the possible structural forms that users can
|
||||
provide. There should be one way to access all of these in a consistent and unambiguous API.
|
||||
|
||||
## Command Structure
|
||||
|
||||
Here is an example data structure which illustrates all the possible elements of a parsed command:
|
||||
|
||||
```json
|
||||
|
||||
{
|
||||
"statics": {
|
||||
"prepared": "true",
|
||||
"consistency_level'"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Users provide a template form of an operation in each op template. This contains a sketch of what an
|
||||
operation might look like, and includes the following optional parts:
|
||||
|
||||
- properties of the operation, whether meta (like a statement) or payload content
|
||||
- the binding points where generated field values will be injected
|
||||
- The names of values to be extracted from the result of a successful operation.

## Statement Forms

Sometimes operations are derived from a query language, and are thus self-contained in a string
form.

When mapping the template of an operation provided by users to an executable operation in some
native driver with specific values, you have to know:

* The s
* The substance of the operation: The name and values of the fields that the user provides as part
  of the operation

Command templates are the third layer of workload templating. As described in other spec documents,
the layers are:

1. [Workload level templates](templated_workloads.md) - This specification covers the basics of a
   workload template, including the valid properties and structure.
2. [Operation level templates](templated_operations.md) - This specification covers how operations
   can be specified, including semantics and structure.
3. Command level templates, explained below. These are the detailed views of what goes into an op
   template, parsed and structured in a way that allows for efficient use at runtime.

Users do not create command templates directly. Instead, these are the *parsed* form of op templates
as seen by the NB driver. The whole point of a command template is to provide crisp semantics and
structure about what a user is asking a driver to do.

Command templates are essentially schematics for an operation. They are a structural interpretation
of the content provided by users in op templates. Each op template provided can be converted into a
command template. In short, the op template is the form that users tend to edit in YAML or provide
as a data structure via scripting. **Command templates are the view of an op template as seen by an
NB driver.**

```
### Command Templates

Command templates are part of the workload API.

There exists a need to provide op templates to a myriad of runtime APIs,
and thus the op template format has to be flexible enough to serve them all.

1. In some cases, an operation is based on a query language where the
   query language itself encodes everything needed for a specific operation.
   SQL queries are like this. This is a nice simplification, but it is not
   realistic for systems built on modern distributed principles.
2. In most cases, you have both an operation and some qualifying rules
   about how the operation should be handled, such as consistency level.
   Thus, there is a need to provide parameters which can decorate
   operations.
3. In some cases, you have a payload for your operation which is not based
   on a query language, but instead on an object with fields, or a verb
   which determines what other fields are needed. This structure is better
   described as a *command* than a *statement*.
4. Finally, you must support both of the latter cases together, where the
   command or operation is defined in some pseudo-structured way, but it
   also has *separately* a set of qualifying parameters which are
   considered orthogonal, or at least separate from the meaning of the
   operation itself.

To address the full set of these mapping requirements, a type has been
added to NB which provides a structured and pre-baked version of a
resolvable command -- the CommandTemplate.

This type provides a view to the driver builder of all the fields
specified by the user, whether encoded as a string, such
as `select row from ...`, or by a set of properties such
as `{"verb":"get", "id":"2343"}`. It also exposes the parameters
separately if provided.

### Static vs Dynamic command fields

Further, for each field in the command template, the driver implementor
knows whether this was provided as a static value or one that can only be
realized for a specific cycle (seed data). Thus, it is possible for
advanced op mapping implementations to optimize the way that new
operations are synthesized for efficiency.

For example, if you know that you have a command which has no dynamic
fields in its command template, then it is possible to create a singleton
op template which can simply be re-used. A fully dynamic command template,
in contrast, may need to be realized dynamically for each cycle, given
that you don't know the value of the fields in the command until you know
the cycle value.
```
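
The static versus dynamic distinction described in the block above can be sketched as follows. This is only an illustration under stated assumptions: the class `CommandSketch`, its methods, and the `{cycle}` anchor used to mark a dynamic field are all hypothetical and are not taken from the NoSQLBench API. The point is that a command with no dynamic fields can be built once and re-used, while a dynamic one has to be resolved for each cycle.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical sketch for illustration only; not part of the NoSQLBench API. */
final class CommandSketch {
    private CommandSketch() {}

    /** Assumption: a field is dynamic if it contains the made-up {cycle} anchor. */
    static boolean isDynamic(String value) {
        return value.contains("{cycle}");
    }

    /** Build a cycle-to-op function; fully static commands share one prebuilt map. */
    static LongFunction<Map<String, String>> opFunction(Map<String, String> fields) {
        boolean anyDynamic = fields.values().stream().anyMatch(CommandSketch::isDynamic);
        if (!anyDynamic) {
            // No dynamic fields: resolve once and hand back the same instance every cycle.
            Map<String, String> singleton =
                Collections.unmodifiableMap(new LinkedHashMap<>(fields));
            return cycle -> singleton;
        }
        // At least one dynamic field: resolve a fresh op for each cycle value.
        return cycle -> {
            Map<String, String> op = new LinkedHashMap<>();
            fields.forEach((k, v) -> op.put(k, v.replace("{cycle}", Long.toString(cycle))));
            return op;
        };
    }
}
```

A real op mapper would resolve fields through bindings rather than plain string replacement, but the same decision applies: check once, at setup time, whether any field depends on the cycle.
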