opm-common

Author	SHA1	Message	Date
Joakim Hove	af7693ef6b	Add parse option keep '/' in data sections The records are normally terminated on the first unquoted '/', but for the UDQ and ACTIONX keywords the data sections of a keyword contain mathematical expressions which can contain '/' literals. This commit adds a per-keyword ability to terminate the records on the last '/' instead of the first '/'.	2019-03-05 07:15:29 +01:00
Jørgen Kvalsvik	e884b0664c	Redesign cmake Tune the makefile according to new principles, which adds a few bells and whistles and for clarity. Synopsis: * The dependency on opm-common is completely gone. This is reflected in travis and appveyor as well. No non-kitware cmake modules are used. * Directories are flattened, quite a bit - source code is located in the lib/ directory if it belongs to opm-parser, and external/ if third party. * The sibling build feature is implemented through cmake's export(PACKAGE) rather than implicitly looking through source files. * Targets explicitly set required public and private include directories, compile options and definitions, which cmake will handle and propagate * opm-parser-config.cmake for downstream users is now provided. * Dependencies are set up using targets. In the future, when cmake 3.x+ can be used, these should be either targets from newer Find modules, or interface libraries. * Fewer system specific assumptions are coded in, instead we assume cmake or users set up system specific details. * All module wide configuration and looking up libraries is handled in the root makefile - all sub directories only set up libraries and compile options for the module in question. * Targets are defined and links handled transitively because cmake now is told about them. ${module_LIBRARIES} variables are gone. This is largely guided by the principles outlined in https://rix0r.nl/blog/2015/08/13/cmake-guide/ Most source files are just moved - if they have some content change then it's nothing more than include fixes or similar in order to make them compile.	2017-06-01 15:29:23 +02:00
Jørgen Kvalsvik	0c1dae7016	Check for separator/quote table overflow Just relying on the char data type is not sufficient to guard against overflows, and several input decks would invoke undefined behaviour. This code path is extremely hot, so we're essentially only reading the least significant 7 bits to achieve branchless lookup.	2016-10-21 10:33:28 +02:00
Jørgen Kvalsvik	37c04328ca	Remove shared_ptr typedefs	2016-10-19 20:38:28 +02:00
Jørgen Kvalsvik	a51127f0c8	Make character lookup table boolean	2016-10-06 13:17:06 +02:00
Jørgen Kvalsvik	d9443c7355	Replace number parser with boost::spirit::qi The hand-written number parser functions implemented using strtod and friends were rather slow (profiling indicates that typically 30% of the program is spent inside of strtod internals). By using boost::spirit::qi, which we already depend on through boost-filesystem and others, this portion typically seems to be reduced to 20% (via instruction count) and with somewhat better cache performance. Rudimentary measuring indicates ~15% speedup overall. Additionally, the intention is a lot clearer this way, so readability received a boost. Compilation time of StarToken goes through the roof.	2016-08-28 11:08:30 +02:00
Jørgen Kvalsvik	db33a9cc55	Prefer function objects in Parser::clean Implementing these checks as function objects improves performance slightly (5% or so according to my measurements), probably due to the functions being inlined rather than reduced to function pointers.	2016-08-08 09:42:41 +02:00
Jørgen Kvalsvik	f571f21171	is_separator includes comma This deprecates the comma replace function in the reader.	2016-08-04 16:05:53 +02:00
Jørgen Kvalsvik	7e9158d319	Anon namespace, removal of unused string constant	2016-08-04 16:05:53 +02:00
Jørgen Kvalsvik	b2f206d54a	Replace RawRecord expanded items; reuse view Reuses the original record's string_view rather than expanding to the same std::string; we save some allocations, memory cache misses and simplify the class slightly. Additionally, the uninteresting add-multiple-identical-records logic ParserItem did before has been moved into RawRecord and is now performed by std::deque (which also means it can allocate better for itself). The addition of prepend deprecates push_front.	2016-08-04 16:05:53 +02:00
Jørgen Kvalsvik	11b4bc2dcd	Lookup tables in is_separator/quote The use of lookup tables reduces branching and seems to improve performance by a few percent.	2016-08-04 16:05:53 +02:00
Jørgen Kvalsvik	33a87a1ced	Fix warnings in StarToken	2016-07-13 23:40:09 +02:00
Jørgen Kvalsvik	06c90c4bc7	RawKeyword uses uppercase from util	2016-06-14 17:01:50 +02:00
Magne Sjaastad	85e3ae61b3	VS2015 : Added missing include to cctype	2016-05-25 10:39:19 +02:00
Magne Sjaastad	393bdb42f2	VS2015 : Added missing include to ssize_t	2016-05-25 10:39:19 +02:00
Jørgen Kvalsvik	1f2c2ba98d	Name cases where TITLE is unset When the TITLE keyword was present in the deck, but no parameter was given the parser would consume the next keyword as the simulation TITLE. Override this by writing a default TITLE if it's unset.	2016-05-03 15:42:30 +02:00
Jørgen Kvalsvik	0966a9cb8c	Fix RawKeyword tests to reflect assumptions	2016-05-03 13:39:36 +02:00
Jørgen Kvalsvik	120a30e94b	Replace std::isspace in parser; add \r to is_sep	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	784a1a5d78	Replace std::isspace with hand-rolled version Profiling indicates isspace isn't inlined properly, so we replace it with a hand-rolled easier-to-inline version.	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	8a4eb5279c	Splitting records with string_view; test updates The splitting of RawRecords into individual symbols uses string_view. Also updates tests since RawRecord now assumes that the record string it receives is complete and does not contain the terminating slash.	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	c2b5da457c	Re-implement is_separator to use isspace	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	07eea89c34	Remove redundant overloads These overloads were written to allow testing, but with string_view accepting char* they're unnecessary and confusing.	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	6b64796d49	Add char* constructor to string_view Mostly relevant for testing, this enables string_view to work as expected with string literals.	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	bfa3f799b9	Improved fast path in number parsing Since string_view uses char* for representation, there is no longer a need to copy into a local char array for most code paths (only when the input must be modified due to fortran float formatting).	2016-05-03 09:16:28 +02:00
Jørgen Kvalsvik	b648719513	Base string_view on char, not string::iterator By representing string_view as char instead of std::string::const_iterator the string_view class possibly brings slightly lower overhead, but mostly enables some optimisations.	2016-05-03 09:16:27 +02:00
Jørgen Kvalsvik	a1ae0e0067	Updated various functions to accept string_view To take advantage of parser's internal string_view representation of keyword names, several signatures have been updated to accept string_view.	2016-05-03 09:16:27 +02:00
Jørgen Kvalsvik	096e843d08	Moved special-casing of TITLE to RawKeyword Replaces the special-casing of the TITLE keyword where a terminating slash is added onto the record with a non-mutating version.	2016-05-03 09:16:27 +02:00
Jørgen Kvalsvik	7d29d63bea	Replace string with string_view in inner parser Several inner parser functions modified to use string_view, to reduce unnecessary copying (and indirectly allocating) related to passing strings around.	2016-05-03 09:16:27 +02:00
Jørgen Kvalsvik	db6bb58f60	Remove redundant application of uppercase() When considering if a keyword is valid, the parser procedures convert the same string to uppercase twice, with copies. This behaviour has been changed, and ParserKeyword now assumes it will be given a correctly-formatted keyword to look up.	2016-05-03 09:15:46 +02:00
Jørgen Kvalsvik	9ce28fb7ea	Restructure addRawRecordString Reduces indentation slightly and adds multiple return to better communicate that some conditions actually logically terminate at several points.	2016-05-03 09:15:45 +02:00
Jørgen Kvalsvik	4b4d2c02c0	Reimplementation of stripComments The implementation has been rewritten to use iterators and renamed for internal use. The public static function still exists for testability and easy verification, but should be considered an internal part of the parser.	2016-05-03 09:15:22 +02:00
Jørgen Kvalsvik	bfb7f6ec0c	RawRecord internals use ranges over string methods Rewrites the internal string manipulation functions of RawRecord to use std::algorithm and ranges over string methods.	2016-05-03 09:15:22 +02:00
Joakim Hove	bdb1313f41	RawKeyword getRecord( int ) -> getFirstRecord() The general loop through all raw records should be based on the iterator interface of the RawKeyword, but to resolve INCLUDE statements we have implemented a special case method to get the first record.	2016-04-04 16:21:52 +02:00
Joakim Hove	e57d61468d	Added non const record iterators to RawKeyword.	2016-04-03 22:00:30 +02:00
Jørgen Kvalsvik	8e4e9b15d8	ReadValueToken correctly parses numbers Changes the implementation of number parsing from std::atoi/f to std::strtod/l. These support setting the optional end-of-string pointer, which is used to determine if a parsing was successful or not. This has the nice side effect of greatly simplifying the logic, at the expense of some C-style details. Tests added to verify that the different edge cases are handled properly.	2016-04-01 10:13:37 +02:00
Jørgen Kvalsvik	046afdd3be	RawKeyword iterator support Since RawRecords now has automatic storage, managed by std::list, offering iterators is feasible. The random access RawKeyword::getRecord's real use was accessing the records in order, which now is handled via iterators.	2016-03-30 12:47:33 +02:00
Jørgen Kvalsvik	83ae276d67	Fix string_view iterator invalidation bug By changing the underlying storage of RawKeyword to std::list, we ensure that the RawRecords aren't reallocated and moved, preserving the validity of string_view's. This changes the access complexity of RawRecord from O(1) to O(n).	2016-03-30 12:31:00 +02:00
Jørgen Kvalsvik	9e76ec5f78	Inline hot-but-trivial functions These functions are called a lot and are trivial accessors to the underlying containers. By opening them for inlining we get a decent performance benefit "for free" via optimisation opportunities.	2016-03-22 14:45:17 +01:00
Jørgen Kvalsvik	0da5cadc75	RawRecords auto store, strings moved to RawRecord The accumulated strings are moved into RawRecords, which reduces execution time (rough measurements indicate 4-8%). To facilitate this, RawRecords are stored directly in the vector in favour of via shared_ptrs.	2016-03-22 08:58:48 +01:00
Jørgen Kvalsvik	aa064a9050	Fix buffer overflow vulnerability An attacker using very long decimal integers as input could trigger a buffer overflow write during int/double parsing. The vulnerability has been fixed and raw buffer boundaries are checked. Additionally, integer buffer size is determined by platform 'int' width. 'double' uses a heuristic to support both pure decimal formats (up to 64 characters long) and float formats.	2016-03-15 16:42:02 +01:00
Atgeirr Flø Rasmussen	78e0870bad	Use std::list instead of std::vector to fix push_front(). The push_front() method can cause reallocation of expanded_items, thereby invalidating iterators already stored in m_recordItems. Switching to std::list fixes this.	2016-03-14 14:54:29 +01:00
Atgeirr Flø Rasmussen	f23af386cf	Add missing include, remove unused function.	2016-03-14 13:22:22 +01:00
Jørgen Kvalsvik	55b46da658	Moved RawRecord::isTerminator out of interface This feature is internal to the raw records and is removed from its public interface.	2016-03-14 08:29:54 +01:00
Jørgen Kvalsvik	2a650d5972	RawRecord refactoring Some simple refactoring to remove a redundant check and clean up some initialisation routines.	2016-03-14 08:29:54 +01:00
Jørgen Kvalsvik	dc094cbb16	More efficient findTerminatingSlash Uses some heuristics and quick exits to avoid always paying worst case cost for finding terminating slash.	2016-03-14 08:29:54 +01:00
Jørgen Kvalsvik	28eb195ac3	readValueToken< double > split into fast/slow path. readValueToken spent almost half its time dealing with weirdly formed or broken floats. Now has a shorter path that can early return a successfully parsed float and only do slow handling of cases that need it (notably zero, fortran style exponent and errors).	2016-03-14 08:29:54 +01:00
Jørgen Kvalsvik	38f88b4e14	RawKeyword::isTerminator uses is_separator	2016-03-14 08:29:54 +01:00
Jørgen Kvalsvik	1d1715b421	RawConsts::is_separator function This replaces the inefficient RawConsts::separators.find( char ) with an available, efficient and inlinable is_separator.	2016-03-14 08:29:53 +01:00
Jørgen Kvalsvik	93b7c0739b	Replace boost::lexical_cast<> with std functions The boost-provided lexical casts are inefficient and are shown to be a slowdown in the inner loop. Replaces them with std::atoi/std::atof and some simple correctness checking.	2016-03-14 08:29:53 +01:00
Jørgen Kvalsvik	e4ddf884f1	Using operator+ and stream operators	2016-03-14 08:29:53 +01:00
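
The most recent commit in this listing (af7693ef6b) describes terminating a record on the last '/' rather than on the first unquoted one. A minimal sketch of that idea follows. The name `find_terminating_slash` and its `keep_slashes` flag are hypothetical, chosen for illustration, and are not the opm-common API; skipping quoted sections in the last-slash mode is also an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the opm-common API): returns the index of the
// '/' that terminates `record`, or std::string::npos if there is none.
// A '/' inside single or double quotes is treated as data and skipped.
// With keep_slashes == false the first unquoted '/' terminates the record;
// with keep_slashes == true the last unquoted '/' does, so earlier slashes
// (for example division in a UDQ expression) stay part of the data.
std::size_t find_terminating_slash(const std::string& record, bool keep_slashes) {
    std::size_t found = std::string::npos;
    char quote = '\0';  // active quote character, '\0' while outside quotes

    for (std::size_t i = 0; i < record.size(); ++i) {
        const char c = record[i];
        if (quote != '\0') {
            if (c == quote) quote = '\0';  // closing quote
        } else if (c == '\'' || c == '"') {
            quote = c;                     // opening quote
        } else if (c == '/') {
            found = i;
            if (!keep_slashes) break;      // default mode: stop at the first one
        }
    }
    return found;
}
```

For a record such as `WOPR/2 > 1 /`, the default mode stops at the slash inside the expression (index 4), while the last-slash mode returns the final slash (index 11) and keeps the division in the data section.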


137 Commits