Keywords is the most important aspect of the ECLIPSE datafile parser. 1. The structure of a keyword ----------------------------- A keyword is the fundamental unit when parsing. Keywords are added to the parser by schema definitions. The schema definition of the keywords are given as Json files under the opm/share/keywords directory. Json can be thought of as a lean alternative to XML, you will find it described here: http://www.json.org/ As part of the build process these keyword definitions are compiled to ParserKeyword instances. 1.1 Starting a keyword ---------------------- The keywords are defined as follows: 1. The keyword should start in column 0; everything beyond character 8 is ignored. 2. The keyword should start with a alphabet character, subsequent characters should be alphanumeric or in the set [-,_,+]. 3. We think ECLIPSE is case insensitive. This is cehcked by the static method: ParserKeyword::validDeckName(). An important part of the parsing of keywords is to detect when the keyword specification is complete. For most keywords we can detect that either by a terminating '/' or the keywords have a predefined size, but for some odd keywords we can not reliably detect the end-of-keyword condition and instead we terminate keyword1 when we find a string which corresponds to the start of a new keyword. Examples of such oddball keywords include: VFPPROD and VFPINJ. 1.2 Records ----------- The data content of a keyword comes as a collection of records. Records are a collection of 'data', terminated by a '/'. Here are three examples of records: 'METRES' / 1 'OFF' 100 '*' 24.0 / 0.26 0.27 0.26 0.78 0.82 0.66 0.27 0.78 0.76 0.56 0.23 0.67 / From these examples we see that: 1. One record can contain a mix of integer, float and string values. 2. Records typically correspond to one line in the data-file, but that is purely convention; the records can be sprinkled with newlines. 3. Each record is terminated with a '/' 1.3 How many records in a keyword --------------------------------- One of the first structural elements which must be configured into the the parser keywords is the number of records in the different keywords, this is closely related to how the keyboard is terminated. Here comes some typical examples of keywords: GRID WCONHIST ... / ... / ... / / ---\ EQLDIMS | .... / | | EQUIL | .... / | .... / | ---/ VFPPROD A .. / B... / .... / .... / PVGO / / In the list above here the GRID keyword has zero records, i.e. no data at all. The WCONHIST keyword has three records of data, the EQLDIMS keyword has one record, the EQUIL keyword has two records and finally the VFPPROD keyword has four records. The number of records, or how to infer it, must be configured with the "size" attribute in the JSON configuration. When it comes to the number of records and termination we currently have five different categories: 1. Keywords with a fixed number of records. Both the GRID keyword and the EQLDIMS keyword have a fixed number of records, zero and one respectively. These keywords are therefor not explicitly terminated in any way, and the "size" attribute has the numerical explicitly: {"name" : "EQLDIMS" , "size" : 1 , ....} {"name" : "GRID" , "size" : 0, .... } 2. Keywords with a variable number of records like the WCONHIST. Becase the number of records is not known in advance this keyword must be explicitly terminated with a '/'. This is the most common configuration and for keywords of this type it is not necessary to specify a size attribute at all: {"name" : "WCONHIST" , .... } 3. Keywords where the number of records is inferred from the content of another keyword; this is the case with EQUIL which reads the number of records from the xxx item of the EQLDIMS keyword. Since the number of records is known in advance (indirectly through the EQLDIMS keyword) the EQUIL keyword is not explicitly terminated with a '/'. In the json file this is specified with the "size" attribute being an object containing the name and item of keyword which should be consulted to infer the size; so for the EQUIL keyword the size attribute looks like: {"name" : "EQUIL" , "size" : {"keyword" : "EQLDIMS" , "item" : "NTEQUL"} , ... When parsing the EQUIL keyword the parser will consult the already parsed content of the 'EQLDIMS' keyword (i.e. a DeckKeyword instance) and get the numerical value of the 'NTEQUL' item. 4. For some keywords the number of records should be calculated run-time based based on the content of the first records in the keyword - this at least applies to VFPPROD and VFPINJ. Since the size of the keyword is deterministic - given the first few records - the keyword is not slash terminated. To infer the number of records in the keyword based on an internal calculation is not supported, hence for these keywords size is given as unknown, and the keywords are terminated when the next valid keyword is found: {"name" : "VFPPROD" , "size" : "UNKNOWN", .... 5. Tables PVTG and PVTO: The two tables PVTG and PVTO are special cased. The special casing should probably be removed, and the "size" : "UNKNOWN" could be used for these two keyword. 1.4 The content of a record: items ---------------------------------- A record consist of one or several items. An item can consist of one or several values from the record, for items with more than one value it is not possible to specify the exact number of values - the item will just consume the remaining values in the input stream. An item has a name, a data type and optionally a default value. For instance the WCONHIST keyword has the the following items specification: "items": [{"name" : "WELL" , "value_type" : "STRING"}, {"name" : "STATUS" , "value_type" : "STRING" , "default" : "OPEN"}, {"name" : "CMODE" , "value_type" : "STRING"}, {"name" : "ORAT" , "value_type" : "DOUBLE", "default" : 0.0, "dimension" : "LiquidSurfaceVolume/Time"}, {"name" : "WRAT" , "value_type" : "DOUBLE" , "default" : 0.0, "dimension" : "LiquidSurfaceVolume/Time"}, {"name" : "GRAT" , "value_type" : "DOUBLE" , "default" : 0.0, "dimension" : "GasSurfaceVolume/Time"}, {"name" : "VFP_TABLE" , "value_type" : "INT" , "default" : 0.0 , "comment":"The default is a state variable"}, {"name" : "LIFT" , "value_type" : "DOUBLE" , "default" : 0.0 , "comment":"The default is a state variable"}, {"name" : "THP" , "value_type" : "DOUBLE" , "default" : 0.0 , "dimension" : "Pressure"}, {"name" : "BHP" , "value_type" : "DOUBLE" , "default" : 0.0 ,"dimension" : "Pressure"}, {"name" : "NGLRAT" , "value_type" : "DOUBLE" , "default" : 0.0 ,"dimension" : "LiquidSurfaceVolume/Time"}]} Here we can see the following: 1. The items can be of types string, integer and float, the type is specified with the "value_type" attribute which must equal "STRING", "DOUBLE" or "INT". 2. You can optionally specify a default value, see the discussion of the parsing workflow below for the treatment of defaults. 3. For items of type double you can specify a dimension, see XXXX for the available dimensions. For quantities with a dimension the parser will convert to SI units, and the DeckDoubleItem class has a getSIDouble() and getRawDouble() method. Items consuming the rest of the record -------------------------------------- Most of the items will consume exactly one value from the input deck, but it is also possible that the last item consumes the remaining items in the input deck, these typically correspond to table keywords or lists of memnonics. In the input deck the PVTG keyword will typically appear like this: PVTG 200 0.15 0.15 10 0.20 0.20 12 0.25 0.20 15 / 250 0.05 0.05 20 0.15 0.10 40 / ... In the manual this is described as two tables with three columns, one with three rows and one with two rows. The leading values of 200 and 250 are the pressure values where the two tables apply. The visual formatting in the deck, and also the written desciption in the manual, strongly hints at a table structure - however from a parsing point of view this corresponds to just two records of different length. Both records start with a pressure value, and then follows 3n consecutive values. The item configuration of PVTG looks like this: "items" : [ {"name":"GAS_PRESSURE", "value_type" : "DOUBLE", "dimension":"Pressure" }, {"name":"DATA", "size_type" : "ALL" , "value_type":"DOUBLE" , "dimension" : ["OilDissolutionFactor","OilDissolutionFactor","Viscosity"]} ] I.e. first we consume one value from the input deck and assign it to the GAS_PRESSURE item, then the DATA item has "size_type" : "ALL" - meaning that this item will consume the the rest of the values in the input record. Also observe that for the "DATA" item the dimension is a vector of three elements, when converting to SI the dimension applied to element i is given as: dim[i] = dimension[i % 3] In addition to tables the grid property keywords use items which consume the rest of the record. For instance the PORO keyword will typcially look like this in the input deck: PORO 0.14 0.15 0.0 0.10 0.16 0.25 0.1 0.11 0.14 0.15 0.0 0.09 ... 0.21 0.07 0.1 0.13 / From a parsing point of view this is one single record; which contains one item consuming all of the values in the input deck. In the configuration this could have been configured as: "items" : [{"name" : "DATA", "value_type" : "DOUBLE" , "size_type" : "ALL" , "default" : 0 , "dimension" : "1"}] However, since keywords containing large data arrays, like e.g. COORD and PERMX are quite common a shortcut has been created for such keywords, for instance the PORO keyword is configures as: {"name" : "PORO" , "sections" : ["GRID"], "data" : {"value_type" : "DOUBLE" , "default" : 0 , "dimension":"1"}} i.e. the "data" attribute is used as shorthand to configure a keyword with one record and one item which consumes all the data of the input deck. Multirecord keyword configuration --------------------------------- Units and dimensions -------------------- The values given in the input dataset are generally dimensionfull, and before the simulator can start we must convert to internal SI values based on the unit system used in the input deck. In the input deck the different physical quantities are generally expressed with per-quantity units. The unit system is *not* based on selecting a unit for the fundamental dimensions length, mass and time and then deriving composite units based on the dimension of the composite quantity. As a consequence the list of dimensions supported by the parser is long, and growing. The current list can be found in the source file: opm/parser/eclipse/Units/UnitSystem.cpp Default values -------------- Classes: -------- The library contains classes along two dimensions: +----------------+ +----------------+ +----------------+ | Parser | | RawDeck | | Deck | +----------------+ +----------------+ +----------------+ +----------------+ +----------------+ +----------------+ | ParserKeyword | | Rawkeyword | | DeckKeyword | +----------------+ +----------------+ +----------------+ +----------------+ +----------------+ +----------------+ | ParserRecord | | RawRecord | | DeckRecord | +----------------+ +----------------+ +----------------+ +----------------+ +----------------+ | ParserItem | | DeckItem | +----------------+ +----------------+