docs structure updates

This commit is contained in:
Jonathan Shook
2020-03-28 01:52:17 -05:00
parent 7a75408423
commit 0908f53494
29 changed files with 196 additions and 39 deletions

View File

@@ -89,9 +89,10 @@ Note the differences between this and the command that we used to generate the s
appropriately large number of cycles in actual testing to make your main test meaningful.
:::info
The cycles parameter is not just a quantity. It is a range of values. The `cycles=n` format is short for
`cycles=0..n`, which makes cycles a zero-based range. For example, cycles=5 means that the activity will use cycles
0,1,2,3,4, but not 5. The reason for this is explained in detail in the Activity Parameters section.
:::
These parameters are explained in detail in the section on _Activity Parameters_.

View File

@@ -31,12 +31,12 @@ parameter. This is a way of templating a workload and make it multi-purpose or a
## Experimentation Friendly
Because the workload YAML format is generic across driver types, it is possible to ask one driver type to interpret the
statements that are meant for another. This isn't generally a good idea, but it becomes extremely handy when you want to
have a high-level driver type like `stdout` interpret the syntax of another driver like `cql`. When you do this, the
stdout driver _plays_ the statements to your console as they would be executed in CQL, data bindings and all. This means
you can empirically and substantively demonstrate and verify access patterns, data skew, and other dataset details
before you change back to cql mode and turn up the settings for a higher-scale test. It takes away the guesswork about
what your test is actually doing, and it works for all drivers.
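As a rough sketch of what this looks like in practice, consider a workload shaped like the one below. The keyspace,
table, statement, and binding function names are illustrative assumptions; the point is that the same YAML can be
pointed at the `stdout` driver to print rendered statements, or at the `cql` driver to execute them.

```yaml
# Minimal workload sketch (names are illustrative). With the stdout driver,
# each cycle prints the statement with its bindings filled in instead of
# executing it against a cluster.
bindings:
  userid: HashRange(0,1000000)
  username: NumberNameToString()
statements:
  - |
    insert into examplekeyspace.users (userid, username)
    values ({userid},'{username}');
```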

View File

@@ -106,6 +106,7 @@
        <include>META-INF/functions</include>
        <include>data/**</include>
        <include>docs-for-virtdata/**</include>
        <include>docs/**</include>
      </includes>
    </resource>
  </resources>

View File

@@ -1,30 +0,0 @@
package io.nosqlbench.virtdata.userlibs.apps.docsapp;

import io.nosqlbench.virtdata.annotations.Category;
import io.nosqlbench.virtdata.processors.DocCtorData;

import java.util.*;

public class FunctionDoc {

    private String funcName;
    private String classDocs;
    private Set<Category> categories = new HashSet<>();
    private List<DocCtorData> ctors = new ArrayList<>();

    public FunctionDoc(String funcName) {
        this.funcName = funcName;
    }

    public void setClassDocs(String distinctClassDocs) {
        this.classDocs = distinctClassDocs;
    }

    public void addCategories(Category[] categories) {
        this.categories.addAll(Arrays.asList(categories));
    }

    public void addCtor(DocCtorData ctor) {
        this.ctors.add(ctor);
    }
}

View File

@@ -0,0 +1,10 @@
---
title: collection functions
weight: 40
---
Collection functions allow you to construct Java Lists, Maps or Sets.
These functions often take the form of a higher-order function, where
the inner function definitions are called to determine the size of
the collection, the individual values to be added, etc.
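For instance, a binding for a collection-typed field might look like the sketch below. The specific function names and
signatures here are assumptions for illustration; check the generated function docs in this category for the exact
collection functions and their constructor forms.

```yaml
bindings:
  # Hypothetical higher-order collection binding: the first inner function
  # chooses the list size for each cycle, the second produces each element.
  tags: ListSized(HashRange(1,5),NumberNameToString())
```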

View File

@@ -0,0 +1,8 @@
---
title: conversion functions
weight: 30
---
Conversion functions simply allow values of one type
to be converted to another type in an obvious way.
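A typical use is to adapt the output of one function to the type a field actually needs, as in this sketch. The
function names are illustrative, and the trailing `-> String` type hint is optional.

```yaml
bindings:
  # Generate a number, then convert it to a String for a text-typed field.
  user_tag: HashRange(0,999999); ToString() -> String
```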

View File

@@ -0,0 +1,12 @@
---
title: datetime functions
weight: 20
---
Functions in this category know about times and dates, datetimes, epoch times in seconds or milliseconds, and so forth.
Some of the functions in this category allow testing of UUID types, which are usually designed to avoid determinism.
This makes it possible to test systems which depend on UUIDs but which require determinism in test data.
This is strictly for testing use. Breaking the universally-unique properties of UUIDs in production systems is a bad
idea. Yet, in testing, this determinism is quite useful.
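A sketch of deterministic time-oriented bindings is shown below. The function names are illustrative assumptions; the
generated docs for this category list the exact datetime and time-UUID functions available.

```yaml
bindings:
  # Both values are derived purely from the cycle number, so re-running the
  # same cycles reproduces the same dates and time-UUIDs.
  event_date: ToDate()
  event_uuid: ToEpochTimeUUID()
```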

View File

@@ -0,0 +1,7 @@
---
title: diagnostic functions
weight: 40
---
Diagnostic functions can be used to help you construct the right VirtData recipe.

View File

@@ -0,0 +1,92 @@
---
title: distribution functions
weight: 30
---
All of the distributions that are provided in the Apache Commons Math
project are supported here, in multiple forms.
## Continuous or Discrete
These distributions break down into two main categories:
### Continuous Distributions
These are distributions over real numbers like 23.4323, with
continuity across the values. Each of the continuous distributions can
provide samples that fall on an interval of the real number line.
Continuous probability distributions include the *Normal* distribution,
and the *Exponential* distribution, among many others.
### Discrete Distributions
Discrete distributions, also known as *integer distributions* have only
whole-number valued samples. These distributions include the *Binomial*
distribution, the *Zipf* distribution, and the *Poisson* distribution,
among others.
## Hashed or Mapped
### hashed samples
Generally, you will want to "randomly sample" from a probability distribution.
This is handled automatically by the functions below if you do not override the
defaults. **The `hash` mode is the default sampling mode for probability
distributions.** This is accomplished by computing an internal hash of the input and then using the resulting value,
as a unit interval variate, to map into the sampling curve. This is called the `hash` sampling mode by VirtData. You
can put `hash`
into the modifiers as explained below if you want to document it explicitly.
### mapped samples
The method used to sample from these distributions depends on a mathematical
function called the cumulative probability function, or more specifically
the inverse of it. Having this function computed over some interval allows
one to sample the shape of a distribution progressively if desired. In
other words, it allows for some *percentile-like* view of values within
a given probability distribution. This mode of using the inverse cumulative distribution function is known as the
`map` mode in VirtData, as it allows one
to map a unit interval variate in a deterministic way to a density
sampling curve. To enable this mode, simply pass `map` as one of the
function modifiers for any function in this category.
## Interpolated or Computed Samples
When sampling from mathematical models of probability densities, performance
between different densities can vary drastically. This means that you may
end up perturbing the results of your test in an unexpected way simply
by changing parameters of your testing distributions. Even worse, some
densities have painful corner cases in performance, like 'Zipf', which
can make tests unbearably slow and flawed as they chew up CPU resources.
### Interpolated Samples
For this reason, interpolation is built into these sampling functions.
**The default mode is `interpolate`.** This means that the sampling
function is pre-computed over 1000 equidistant points in the unit interval,
and the result is shared among all threads as a look-up-table for
interpolation. This makes all statistical sampling functions perform nearly
identically at runtime (after initialization, a one-time cost).
This does have the minor side effect of a little loss in accuracy, but
the difference is generally negligible for nearly all performance testing
cases.
### Computed Samples
Conversely, `compute` mode sampling calls the sampling function every
time a sample is needed. This affords a little more accuracy, but is generally
not preferable to the default interpolated mode. You'll know if you need
computed samples. Otherwise, it's best to stick with interpolation so that
you spend more time testing your target system and less time testing
your data generation functions.
## Input Range
All of these functions take a long as the input value for sampling. This
is similar to how the unit interval (0.0,1.0) is used in mathematics
and statistics, but more tailored to modern system capabilities. Instead
of using the unit interval, we simply use the interval of all positive
longs. This provides more compatibility with other functions in VirtData,
including hashing functions.
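Putting the above together, distribution bindings might look like the sketch below. The distribution names follow the
Apache Commons Math forms, and the `'map'` and `'compute'` modifiers are shown as extra string arguments, which is an
assumption about the modifier syntax; the defaults (`hash` and `interpolate`) apply when no modifiers are given.

```yaml
bindings:
  # Continuous distribution with default hash + interpolate sampling.
  latency_ms: Normal(100.0,15.0)
  # Same distribution, but mapped (percentile-like) and computed per sample.
  latency_sweep: Normal(100.0,15.0,'map','compute')
  # Discrete (integer) distribution.
  popularity_rank: Zipf(10000,1.2)
```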

View File

@@ -0,0 +1,6 @@
---
title: flow functions
weight: 40
---
These functions help combine other functions into higher-order functions when needed.

View File

@@ -0,0 +1,7 @@
---
title: general functions
weight: 20
---
These functions have no particular category, so they ended up here by default.

View File

@@ -0,0 +1,13 @@
---
title: null functions
weight: 40
---
These functions can generate null values. When using nulls in your binding recipes, ensure that you don't generate them
in-line as inputs to other functions. This will lead to errors which interrupt your test. If you must use functions that
generate null values, ensure that they are the only or last function in a chain.
If you need to mark a field to be undefined, but _not set to null_, then use the functions which know how to yield a
VALUE.UNSET, which is a sigil constant within the VirtData runtime. These functions are correctly interpreted by
conformant drivers like the SQL driver so that they will avoid injecting the named field into an operation if it has
this special value.
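The rule above can be sketched as follows. The null- and unset-yielding function names here are hypothetical
placeholders; substitute whichever functions this category actually provides.

```yaml
bindings:
  # OK: the (hypothetical) null-yielding function is the last link in the chain.
  middle_name: NumberNameToString(); NullOrPass()
  # Not OK: a null produced mid-chain becomes the input to ToString() and will
  # interrupt the test with errors.
  # broken_name: NullOrPass(); ToString()
```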

View File

@@ -0,0 +1,11 @@
---
title: pre-made functions
weight: 20
---
Functions in this category are meant to provide easy grab-and-go functions that are tailored for real-world simulation.
This library will grow over time. These functions are often built directly on top of other functions in the core
libraries. However, they are provided here for simplicity in workload construction. They perform exactly the same as
their longer-form equivalents.

View File

@@ -0,0 +1,19 @@
---
title: state functions
weight: 30
---
Functions in the state category allow you to do things with side-effects in the function flow. Specifically, they allow
you to save or load values of named variables to thread-local registers. These work best when used with non-async
activities, since the normal statement grouping allows you to share data between statements in the sequence. It is not
advised to use these with async activities.
When using these functions, be aware that a binding's side effects only occur when that binding is actually used by a
statement. For example, if you have account records and transaction records, and you want to save the account
identifier for use within the transaction inserts, you must ensure that each account binding is used within the thread
first.
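A sketch of that account/transaction case is shown below. The `Save` and `Load` function names, the statement text, and
the register name are illustrative assumptions; the key point is that the saving binding (`account_id`) must be used by
a statement in the thread before the loading binding (`acct_ref`) is used.

```yaml
bindings:
  # Save() stores the generated value in a thread-local register named 'acct'
  # and passes it through; Load() reads it back for a later statement.
  account_id: HashRange(0,1000000); Save('acct')
  acct_ref: Load('acct')
  amount: Uniform(1,10000)
statements:
  - insert into app.accounts (id) values ({account_id});
  - insert into app.transactions (account, amount) values ({acct_ref},{amount});
```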