The termination logic would sometimes need to scan the full line to see
if some terminating condition was found inside quotes. Plenty of
comments in a file start on the first character of a line, meaning this
scan is unnecessary.
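A minimal sketch of the fast path, assuming "--" starts a comment and
single quotes delimit strings. The helper name strip_comment is
hypothetical and not taken from the parser:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns the part of `line` before any "--" comment,
// ignoring "--" that appears inside single quotes.
inline std::string_view strip_comment(std::string_view line) {
    // Fast path: the comment starts on the first character, so there is
    // nothing to keep and no need to scan for quotes.
    if (line.size() >= 2 && line[0] == '-' && line[1] == '-')
        return line.substr(0, 0);

    bool quoted = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == '\'') {
            quoted = !quoted;
        } else if (!quoted && line[i] == '-' &&
                   i + 1 < line.size() && line[i + 1] == '-') {
            return line.substr(0, i);
        }
    }
    return line;
}
```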
Implementing these checks as function objects improves performance
slightly (5% or so according to my measurements), probably due to the
functions being inlined rather than reduced to function pointers.
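To illustrate the difference, a sketch with hypothetical names
(is_separator, is_separator_fn, count_matching). With the function
object the predicate's type is part of the template instantiation, so
the compiler can usually inline the call; with a function pointer it
may have to call through the pointer:

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Function object: the call target is known from the type alone.
struct is_separator {
    bool operator()(char c) const { return c == ' ' || c == '\t' || c == ','; }
};

// Equivalent free function, to be passed as a function pointer.
inline bool is_separator_fn(char c) { return is_separator{}(c); }

// Counts the characters in `s` that satisfy `pred`.
template <typename Pred>
std::size_t count_matching(std::string_view s, Pred pred) {
    return static_cast<std::size_t>(std::count_if(s.begin(), s.end(), pred));
}
```

Both forms give the same result; any difference in speed depends on the
compiler and should be measured.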
The internal bookkeeping of the string class is unnecessary here and is
performed often (once per line in the input). Copying manually without
that bookkeeping (essentially using the string as a raw char* buffer)
improves performance, and the gain scales with the number of lines in
each deck.
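Roughly the idea, as a sketch rather than the actual implementation:
size the destination once, write through a raw pointer, and fix up the
length at the end. The function name and the choice of dropping '\r'
are assumptions made for illustration:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Copies `in` while dropping '\r', using the string as a raw buffer.
inline std::string copy_without_cr(std::string_view in) {
    std::string out(in.size(), '\0');   // single allocation, upper bound
    char* const begin = &out[0];
    char* dst = begin;
    for (char c : in) {
        if (c != '\r') *dst++ = c;      // no per-character size updates
    }
    out.resize(static_cast<std::size_t>(dst - begin));
    return out;
}
```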
Some decks use comma to separate items in a record, but the comma adds
no extra information and is essentially ignored. Post-process the file
after cleaning it and replace all commas with whitespace.
Given the record "foo bar , , , baz /", baz will silently be moved to
the third item, but handling that situation is a **lot** more difficult
and time consuming, and not worth the effort for now.
Additionally, we have yet to receive decks that fail in such a manner,
nor are we sure if it is even a valid record in Eclipse.
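The post-processing step amounts to something like the following
sketch (commas_to_spaces is a hypothetical name). Note how the empty
items between commas disappear, which is the caveat described above:

```cpp
#include <algorithm>
#include <string>

// Replaces every comma in the cleaned deck with a space, in place.
inline void commas_to_spaces(std::string& deck) {
    std::replace(deck.begin(), deck.end(), ',', ' ');
}
```

After this step "foo bar , , , baz /" contains three whitespace
separated items, so baz is read as the third item.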
Save some implementation by using std::stack as interface and
implementation for InputStack. Remove the vector pair of file and file*
and just use file directly.
When the TITLE keyword was present in the deck but no parameter was
given, the parser would consume the next keyword as the simulation TITLE.
Override this by writing a default TITLE if it's unset.
Internal names are deprecated, and instead added ParserKeyword instances
are maintained and kept for the lifetime of the ParserKeyword instance.
Querying keyword existence from Python now matches on deck names, which
are expected to always be the intended case, instead of internal names.
By allowing getline to assume that the next line always starts after the
found \n *or* that after the found newline is the \0, we can avoid a
branch and potential stall.
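A sketch of what that assumption buys, using std::string_view as a
stand-in for the parser's own view type. The precondition (non-empty
input that ends in '\n') and the name next_line are assumptions for
this example:

```cpp
#include <string_view>

// Precondition: `input` is non-empty and its last character is '\n'.
// Because a newline is guaranteed to exist, the position can be advanced
// unconditionally, without testing for "not found".
inline std::string_view next_line(std::string_view& input) {
    const auto pos = input.find('\n');
    const std::string_view line = input.substr(0, pos);
    input.remove_prefix(pos + 1);
    return line;
}
```

The caller is responsible for checking that input remains before each
call, for example with while (!input.empty()).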
This function is not an obvious member of Parser, as it is just as
reliant on ParserState, which is source-file private to Parser. Moves it
to the source file only, with no externally visible private symbol table
entry.
tryParseKeyword and createRawKeyword don't use anything non-public and
do not rely on any non-public parser state, so they are now implemented
as static functions private to Parser.cpp.
A getline implementation that carries mostly the same assumptions as
istream::getline. Requires that ParserState::done() is always checked
(and returns false) before the next line can be retrieved safely.
Enables tryParseKeyword and friends to maintain only a high-level view
of the parsing procedure, and not deal with stream positioning.
Moves the storage/lifetime components of ParserState into its own
helper utility class. Splits the implementation responsibility so that
all input data and state is handled by the InputStack, and "global"
information is managed directly by ParserState.
Perform a pre-pass over the input and remove everything that isn't
interesting. Preserves empty lines to provide accurate line number for
diagnostics/error output.
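A simplified sketch of such a pre-pass. It assumes "--" starts a
comment and, unlike a real implementation, ignores quoting. The name
clean is hypothetical:

```cpp
#include <string>
#include <string_view>

// Removes comments and trailing whitespace but emits exactly one '\n' per
// input line, so line numbers in the output match the original file.
inline std::string clean(std::string_view input) {
    constexpr auto npos = std::string_view::npos;
    std::string out;
    out.reserve(input.size());
    while (!input.empty()) {
        const auto nl = input.find('\n');
        std::string_view line = input.substr(0, nl);
        input.remove_prefix(nl == npos ? input.size() : nl + 1);

        line = line.substr(0, line.find("--"));        // drop the comment
        const auto last = line.find_last_not_of(" \t\r");
        line = (last == npos) ? std::string_view{} : line.substr(0, last + 1);

        out.append(line);
        out.push_back('\n');                           // keep the line count
    }
    return out;
}
```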
The use of string_view for keys allows keywords from the file buffers
to be compared directly, instead of having to go via std::string. This
removes the need for a series of (inherently) unnecessary copies.
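As an illustration, a sketch that uses std::string_view in place of the
parser's own view type. KeywordTable and its members are hypothetical.
The names are owned by a std::deque so the views used as keys stay
valid when more names are added:

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <string_view>
#include <utility>

class KeywordTable {
public:
    KeywordTable() = default;
    // Copying would leave the views pointing into the source object.
    KeywordTable(const KeywordTable&) = delete;
    KeywordTable& operator=(const KeywordTable&) = delete;

    void add(std::string name) {
        names_.push_back(std::move(name));   // deque keeps elements in place
        index_[names_.back()] = names_.size() - 1;
    }

    // Looks up a slice of the file buffer without building a std::string.
    bool has(std::string_view name) const { return index_.count(name) != 0; }

private:
    std::deque<std::string> names_;
    std::map<std::string_view, std::size_t> index_;
};
```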
Several inner parser functions modified to use string_view, to reduce
unnecessary copying (and, indirectly, allocation) related to passing
strings around.
Replaces the parser's dependence on streams with string_view, which
won't copy to its internal buffers. Involves hand-rolling std::getline
for string_view to preserve stream-like behaviour.
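A hand-rolled getline over a view could look roughly like this sketch,
again with std::string_view standing in for the parser's view type and
getline_sv as a hypothetical name. It returns false at end of input so
it can drive a while loop the way std::getline does:

```cpp
#include <string_view>

// Extracts the next line (without its '\n') from `input` into `line` and
// advances `input` past it. Returns false when no input remains.
inline bool getline_sv(std::string_view& input, std::string_view& line) {
    if (input.empty()) return false;
    const auto pos = input.find('\n');
    line = input.substr(0, pos);             // npos: take the rest
    input.remove_prefix(pos == std::string_view::npos ? input.size()
                                                      : pos + 1);
    return true;
}
```

No characters are copied; line is a view into the same buffer as input.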
Replaces the external stream-support by reading the entire contents of
input files at once, rather than lazily through streams. Uses
stringstreams internally, but keeps the entire file in memory through
std::string.
openString/File has been renamed to loadString/File.
Rewrites ParseState to use an explicit stack of open file streams
instead of spawning multiple ParseState objects. While recursion is
*the* most elegant way to solve a typical include-file system, the
global nature of Eclipse declarations makes this clumsy (at best). Instead,
maintain an explicit stack of open files and logically parse the input
as if it was all concatenated into one large file.
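The shape of the approach, as a sketch under simplifying assumptions:
files are held in memory in a map, and an include is written as
"INCLUDE name" on a single line, which is not the real deck syntax.
The function flatten is hypothetical:

```cpp
#include <map>
#include <stack>
#include <string>
#include <string_view>
#include <vector>

// Yields the lines of `root` with included files spliced in, using an
// explicit stack of partially read inputs instead of recursion.
inline std::vector<std::string>
flatten(const std::map<std::string, std::string>& files,
        const std::string& root) {
    constexpr std::string_view kw = "INCLUDE ";
    std::vector<std::string> out;
    std::stack<std::string_view> inputs;
    inputs.push(files.at(root));

    while (!inputs.empty()) {
        std::string_view& top = inputs.top();
        if (top.empty()) { inputs.pop(); continue; }  // resume the parent

        const auto nl = top.find('\n');
        const std::string_view line = top.substr(0, nl);
        top.remove_prefix(nl == std::string_view::npos ? top.size() : nl + 1);

        if (line.substr(0, kw.size()) == kw) {
            inputs.push(files.at(std::string(line.substr(kw.size()))));
            continue;
        }
        out.emplace_back(line);
    }
    return out;
}
```

State that is global to the deck can then live in one place, since only
one loop ever reads lines.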
Also adds some convenient ways of accessing the current (and
interesting) top of the stack.
Restructures createRawKeyword to use multiple return statements instead
of if-else and updating variables, reducing indentation and the number
of concurrent branches.
Changes the control flow of Parser::createRawKeyword to return on
failing to meet preconditions, rather than if-else block. Reduces
indentation and concurrent code paths.
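The pattern in miniature, with a hypothetical function and hypothetical
preconditions (non-empty line, leading letter, at most 8 characters).
This is not the actual createRawKeyword:

```cpp
#include <cctype>
#include <optional>
#include <string_view>

// Returns the keyword name at the start of `line`, or nothing if the line
// does not satisfy the preconditions. Each failed precondition returns
// immediately, so the main path stays at one indentation level.
inline std::optional<std::string_view> keyword_name(std::string_view line) {
    if (line.empty()) return std::nullopt;
    if (!std::isalpha(static_cast<unsigned char>(line.front())))
        return std::nullopt;

    const std::string_view name = line.substr(0, line.find_first_of(" \t"));
    if (name.size() > 8) return std::nullopt;

    return name;
}
```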
Makes the actual effects of the various conditions clearer by
emphasising the return of the function rather than some possible action
after the loop.
Restructures the per-iteration check of whether or not to continue
parsing to happen first and be an early-break, instead of this flow
being handled by if-else.