more docs for doc system

2025-02-25 18:55:28 -06:00 · 2020-08-06 09:50:47 -05:00
parent 6b5d4c3a68
commit 459b84c8c0
1 changed files with 186 additions and 59 deletions
--- a/devdocs/docstructure/bundled_docs.md
+++ b/devdocs/docstructure/bundled_docs.md
@@ -1,48 +1,59 @@
-# Bundled Docs
+# NBDocs - NoSQLBench Docs

-In order to keep the structure of NoSQLBench modular enough to allow for easy extension by contributors, yet cohesive in
-how it presents documentation and features to users, it is necessary to provide internal services which aggregate
-content by subject matter into a consumable whole that can be used by the documentation system.
+__THIS IS A WORK IN PROGRESS__

-# MarkdownDocs Service
+In order to keep the structure of NoSQLBench modular enough to allow for easy extension by
+contributors, yet cohesive in how it presents documentation and features to users, it is necessary
+to provide internal services which aggregate content by subject matter into a consumable whole that
+can be used by the documentation system.
+
+## MarkdownDocs Service

 The primary markdown service that is meant to be consumed by the documetnation system is known simply as

    MarkdownDocs

-Static methods on this class will provide all of the markdown content in pre-baked and organized form. The markdown
-service is responsible for reading all the raw markdown sources and organizing their content into a single cohesive
-structure. MardownDocs finds all content that is provided by individual MarkdownProvider services, as described below.
+Static methods on this class will provide all of the markdown content in pre-baked and organized
+form. The markdown service is responsible for reading all the raw markdown sources and organizing
+their content into a single cohesive structure. MardownDocs finds all content that is provided by
+individual MarkdownProvider services, as described below.

-All of the rules for how raw markdown content is to be combined are owned by the MarkdownDocs service.
+All of the rules for how raw markdown content is to be combined are owned by the MarkdownDocs
+service.

-The MarkdownDocs service relies on SPI published services which provide raw markdown sources as described below.
+The MarkdownDocs service relies on SPI published services which provide raw markdown sources as
+described below.

-# RawMarkdownSource Services
+## RawMarkdownSource Services

-The `RawMarkdownSource` service is responsible for bundling the raw markdown for a path within a NoSQLBench module. Each
-module that wishes to publish markdown docs to users must provide one or more RawMarkdownSource services via SPI. This
-is most easily done with a `@Service(RawMarkdownSource.class)` annotation.
+The `RawMarkdownSource` service is responsible for bundling the raw markdown for a path within a
+NoSQLBench module. Each module that wishes to publish markdown docs to users must provide one or
+more RawMarkdownSource services via SPI. This is most easily done with a
+`@Service(RawMarkdownSource.class)` annotation.

 ## RawMarkdownSource endpoints provide Content

-Each instance of a RawMarkdownSource service provides all of the individual markdown files it finds indirectly as
-io.nosqlbench.nb.api.content.Content, which allows the internal file content to be read appropriately regardless of
-whether it comes from a classpath resource stream, a file on disk, or even a dynamic source like function metadata.
+Each instance of a RawMarkdownSource service provides all of the individual markdown files it finds
+indirectly as io.nosqlbench.nb.api.content.Content, which allows the internal file content to be
+read appropriately regardless of whether it comes from a classpath resource stream, a file on disk,
+or even a dynamic source like function metadata.

 ## RawMarkdownSources Aggregator

-A service aggregator called RawMarkdownSources provides easy access to all raw markdown sources provided by all
-published instances of the service.
+A service aggregator called RawMarkdownSources provides easy access to all raw markdown sources
+provided by all published instances of the service.

-# Front Matter Interpretation
+## Front Matter Interpretation

-There is a set of rules observed by MarkdownDocs for repacking markdown for structured display. These rules are largely
-driven by front matter.
+There is a set of rules observed by MarkdownDocs for repacking markdown for structured display.
+These rules are largely driven by front matter.

-## Doc Scope
+### Doc Scope

-There are three doc scopes that can be added to source markdown via the `scopes` front matter.
+The `scope` front matter property determines where content should be visible. This is useful for
+providing documentation about a topic or concept that can be pulled into multiple places. There are
+three doc scopes that can be added to source markdown via the `scopes` front matter. This is a
+multi-valued property

 * `cli` - The source content should be included for command-line searching and viewing.
 * `web` - The source content should be included for static web documentation.
@@ -51,49 +62,165 @@ There are three doc scopes that can be added to source markdown via the `scopes`

 If no scopes are provided, then a special scope `any` is assigned to source content.

-__ THIS IS A WORK IN PROGRESS __
+**Examples**

-## Topic Names
+```yaml
+---
+scopes: all
+---
+```

-The `topic` property determines
+```yaml
+---
+scopes: cli,web
+---
+```

-1. Front matter may be sanity checked for unrecognized properties.
-2. All front matter that is considered is required to have at least one topic value.
-3. Topic values which contain `, ` or `; ` patterns are auto-split into multiple topics.
-4. Topics can be hierarchical. Topics in the form of `cat1/cat2/topicfoo` are considered nested topics, with the
-   containing layer being considered a category. The right most word is considered the basic topic name. This means that
-   in the above topic name, `cat1` is a topic category containing the `cat2` topic category, which contains the topic
-   `topicfoo`.
-5. *Topic Expansion* - A topic entry which starts with a caret `^`, contains either of '.*', '.+', or ends with a `$` is
-   considered a wildcard topic. It will be treated as a topic pattern which will be compared to known topics. When it
-   matches another topic, the matched topic is added to the virtualized topic list of the owning item.
-6. `aggregations` are used to physically aggregate content from matching topics onto a markdown source:
-   1. Each aggregation is a pattern that is tested against all topics after topic expansion.
-   2. When a source item is matched to an aggregation,
-   3. wildcards, except that they
-      cause all matching topics to be aggregated onto the body of the owning markdown source.
-   4. All topics (after topicin order determined by weight. Aggregations are indicated with an `aggregation` property.
-   regations are split on commas and semicolons as above, and are always considered patterns for matching. Thus,
-   aggregation with none of the regex indicators above will only match topics with the same literal pattern.
+### Topic Names

-7. Front matter will be provided with topical aggregations included, with the following conditions:
-   * aggregations properties are elided from the repacked view. Instead, an `included` header is added which lists all
-     of the included topics.
+The `topic` front-matter property determines the assocation between a documentation fragment and the
+ways that a user might name or search for it. All content within NBDocs falls within one or more
+nested topics. That is, raw content could be homed under a simple topic name, or it could be homed
+under a topic which has a another topic above.

+1. All front matter that is considered is required to have at least one topic value.
+2. Topic values which contain `, ` or `; ` patterns are auto-split into multiple topics.
+3. Topics can be hierarchical. Topics in the form of `cat1/cat2/topicfoo` are considered nested
+   topics. Topics which contain other topics are called topic categories, but they are also topics.
+4. Topics can be literal values or they can be patterns which match other topics.

-## Composite Markdown
+**examples**

-When aggregations occur, the resulting markdown that is produces is simply a composite of all of the included markdown
-sources. The front matter of the including markdown source becomes the first element, and all other included are added
-after this. The front matter of the including markdown becomes the representative front matter for the composite
-markdown.
+```yaml
+---
+topics: cli, parameters
+---
+```

-## Indexing Data
+### Topic Aggregation

-Indexing data should be provided in two forms:
+Topics can be placeholders for matching other topics. When a topic name starts with a caret `^`,
+contains either of `.*`, or `.+`, or ends with a `$`, it is called a topic pattern.

-1. The basic metadata index which includes topics, titles, and other basic info and logical path info. This view is used
-   to build menus for traversal and other simple views of topics as needed for direct presence check, or lookup.
-2. A FTS index which includes a basic word index with stemming and other concerns pre-baked. This view is used as a
-   cache-friendly searchable index into the above metadata.
+The patterns are regular expressions, even though the patterns above are used explicitly to enable
+pattern detection.

+This allows for content to be included in other places. When source content is marked with a topic
+pattern, any other source content that is matched by that pattern is included in views of its raw
+content. Further, the topics which are matched are added to an `included` property in the matching
+content's front matter. The matched content is not affected in any other way and is still visible
+under the matched topic names.
+
+It is considered an error to create a circular reference between different topics. This condition is
+checked explicitly by the topic mapping logic.
+
+**examples**
+
+```yaml
+---
+title: owning topic
+topics: ^cli$, foobarbaz
+---
+
+# Main Content
+
+main content
+```
+
+```yaml
+---
+title: included topic
+topics: cli, parameters
+---
+
+## Included content
+
+included content
+```
+
+Taking the two source documents above, the markdown loading system would present the following
+document which globs the second one onto the first:
+
+```yaml
+---
+title: owning topic
+topics: cli, foobarbaz, parameters
+---
+
+# Main Content
+
+main content
+
+## Included Content
+
+included content
+
+```
+
+Within an aggregate source like the above, all included sections will be ordered according to the
+weight front-matter property first, and then by the title, and then by the name of the interior
+top-level heading of source item.
+
+## Logical Structure
+
+According to the rules and mechanisms above, it is possible to organize all the provided content
+into a clean and consistent strucure for search and presentation.
+
+The example below represents a schematic of what content sources will be provided in every document
+after loading and processing:
+
+```
+---
+title: <String>
+scope: <Set<String>>
+topics: <Set<String>>
+included: <Set<String>>
+weight: <Number>
+---
+<optional content>
+```
+
+In specific:
+
+**title** will be provided as a string, even if it is empty. **scope** will be provided as a set of
+strings, possibly an empty set. **topics** will be provided as a set of strings, possibly an empty
+set. **included** will be provided as a set of strings, possibly an empty set. **weight** will be
+provided as a number, possibly 0.
+
+Headings and content may or may not be provided, depending on how the content is aggregated and what
+topics are matched for content aggregation.
+
+## Repackaged Artifacts
+
+The documentation aggregator system may be asked to store repackaged forms. These are suggested:
+
+1. A metadata.json which includes the basic metadata of all the documentation entries, including:
+   1. module path of logical document
+   2. title
+   3. topics
+   4. included
+   5. weight
+2. A manifest.json which includes:
+   1. module path of logical document
+   2. title of document
+   3. heading structure of document
+3. A topics.json which includes the topic structure of all topics and headings
+4. A full-text search index.
+
+With these artifacts provides as services, clients can be efficient in discovering content and
+making it searchable for users.
+
+## Topic Mapping Logic
+
+Topics may be transitive. There is no clear rationale for avoiding this, and at least one good  
+reason to allow it. The method for validating and resolving topic aggregations, where there may be
+layers of dependencies, is explained here:
+
+__BEING REWRITTEN__
+1. All content sources are put into a linked list in no particular order.
+2. The list is traversed from head to tail repeatedly.
+3. When the head of the list is an aggregating source, and all of the matching elements are
+   non-aggregating, then the element converted to a non-aggregating element.
+4. If the head of the list is a non-aggregating source, it is moved to the tail.
+5. When all of the elements of the list are non-aggregating, the mapping is complete.
+6. When the