opm-common/docs/keywords.txt

Keywords are the most important aspect of the ECLIPSE datafile
parser.

1. The structure of a keyword
-----------------------------
A keyword is the fundamental unit when parsing. Keywords are added to
the parser by schema definitions. The schema definitions of the
keywords are given as JSON files under the opm/share/keywords
directory. JSON can be thought of as a lean alternative to XML; it is
described here: http://www.json.org/

As part of the build process these keyword definitions are compiled
into ParserKeyword instances.
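
To illustrate where these ParserKeyword instances end up, here is a
minimal sketch of driving the parser from C++. The header paths and
the parseString()/hasKeyword() calls follow the library layout
referenced elsewhere in this document, but the exact signatures should
be treated as assumptions:

  #include <opm/parser/eclipse/Parser/Parser.hpp>
  #include <opm/parser/eclipse/Deck/Deck.hpp>

  #include <iostream>
  #include <string>

  int main() {
      // The Parser instance is populated with the ParserKeyword objects
      // generated from the JSON definitions during the build.
      Opm::Parser parser;

      // A tiny deck fragment: EQLDIMS has one record.
      const std::string input = "RUNSPEC\nEQLDIMS\n 2 /\n";

      // parseString() looks up the ParserKeyword schema for every keyword
      // it recognizes and builds a Deck of DeckKeyword instances.
      Opm::Deck deck = parser.parseString(input);
      std::cout << "EQLDIMS present: " << deck.hasKeyword("EQLDIMS") << "\n";
      return 0;
  }
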
1.1 Starting a keyword
----------------------
Keywords are recognized according to the following rules:

1. The keyword should start in column 0; everything beyond character
   8 is ignored.

2. The keyword should start with an alphabetic character; subsequent
   characters should be alphanumeric or in the set [-,_,+].

3. We think ECLIPSE is case insensitive.

These rules are checked by the static method
ParserKeyword::validDeckName().

An important part of the parsing of keywords is to detect when the
keyword specification is complete. For most keywords we can detect
that either by a terminating '/' or because the keyword has a
predefined size, but for some odd keywords we cannot reliably detect
the end-of-keyword condition; instead we terminate the keyword when we
find a string which corresponds to the start of a new keyword.
Examples of such oddball keywords include VFPPROD and VFPINJ.
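
A small sketch of probing these rules through validDeckName(); the
header path is an assumption based on the source layout referenced
later in this document, and the expected results simply restate the
rules above:

  #include <opm/parser/eclipse/Parser/ParserKeyword.hpp>

  #include <cassert>

  int main() {
      // Starts with a letter; later characters alphanumeric or -,_,+.
      assert(  Opm::ParserKeyword::validDeckName("WCONHIST") );
      assert(  Opm::ParserKeyword::validDeckName("GAS-PROP") );

      // Starts with a digit - not a valid keyword name.
      assert( !Opm::ParserKeyword::validDeckName("1WELL") );
      return 0;
  }
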
1.2 Records
-----------
The data content of a keyword comes as a collection of records. A
record is a collection of data values, terminated by a '/'. Here are
three examples of records:
'METRES' /
1 'OFF' 100 '*' 24.0 /
0.26 0.27 0.26 0.78
0.82 0.66 0.27 0.78
0.76 0.56 0.23 0.67 /

From these examples we see that:

1. One record can contain a mix of integer, float and string values.

2. Records typically correspond to one line in the data file, but
   that is purely a convention; a record can be sprinkled with
   newlines (see the example after this list).

3. Each record is terminated with a '/'.
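
For instance, the second record above could equally well have been
written across two lines:

  1 'OFF' 100
       '*' 24.0 /
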
1.3 How many records in a keyword
---------------------------------
One of the first structural elements which must be configured into
the parser keywords is the number of records in the different
keywords; this is closely related to how the keyword is
terminated. Here are some typical examples of keywords:

GRID
WCONHIST
... /
... /
... /
/
---\
EQLDIMS |
.... / |
|
EQUIL |
.... / |
.... / |
---/
VFPPROD
A .. /
B... /
.... /
.... /
PVTO
/
/

In the list above the GRID keyword has zero records, i.e. no data at
all. The WCONHIST keyword has three records of data, the EQLDIMS
keyword has one record, the EQUIL keyword has two records and finally
the VFPPROD keyword has four records. The number of records, or how to
infer it, must be configured with the "size" attribute in the JSON
configuration. When it comes to the number of records and termination
we currently have five different categories:

1. Keywords with a fixed number of records. Both the GRID keyword
   and the EQLDIMS keyword have a fixed number of records, zero and
   one respectively. These keywords are therefore not explicitly
   terminated in any way, and the "size" attribute gives the number
   of records explicitly:

     {"name" : "EQLDIMS" , "size" : 1 , ....}
     {"name" : "GRID" , "size" : 0, .... }

2. Keywords with a variable number of records, like WCONHIST.
   Because the number of records is not known in advance this
   keyword must be explicitly terminated with a '/'. This is the
   most common configuration, and for keywords of this type it is
   not necessary to specify a size attribute at all:

     {"name" : "WCONHIST" , .... }

3. Keywords where the number of records is inferred from the content
   of another keyword; this is the case with EQUIL, which reads the
   number of records from the NTEQUL item of the EQLDIMS keyword.
   Since the number of records is known in advance (indirectly
   through the EQLDIMS keyword) the EQUIL keyword is not explicitly
   terminated with a '/'. In the JSON file this is specified with the
   "size" attribute being an object containing the name and item of
   the keyword which should be consulted to infer the size; so for
   the EQUIL keyword the size attribute looks like:

     {"name" : "EQUIL" ,
      "size" : {"keyword" : "EQLDIMS" , "item" : "NTEQUL"} , ...

   When parsing the EQUIL keyword the parser will consult the
   already parsed content of the 'EQLDIMS' keyword (i.e. a
   DeckKeyword instance) and get the numerical value of the 'NTEQUL'
   item (see the sketch after this list).

4. For some keywords the number of records must be calculated at
   runtime based on the content of the first records in the
   keyword - this at least applies to VFPPROD and VFPINJ. Since the
   size of the keyword is deterministic - given the first few
   records - the keyword is not slash terminated.

   Inferring the number of records from such an internal calculation
   is not supported, hence for these keywords the size is given as
   unknown, and the keywords are terminated when the next valid
   keyword is found:

     {"name" : "VFPPROD" , "size" : "UNKNOWN", ....

5. Tables PVTG and PVTO: the two tables PVTG and PVTO are special
   cased. The special casing should probably be removed, and
   "size" : "UNKNOWN" could be used for these two keywords.
1.4 The content of a record: items
----------------------------------
A record consists of one or several items. An item can consist of one
or several values from the record; for items with more than one value
it is not possible to specify the exact number of values - the item
will just consume the remaining values in the input stream. An item
has a name, a data type and optionally a default value. For instance
the WCONHIST keyword has the following items specification:
"items":
[{"name" : "WELL" , "value_type" : "STRING"},
{"name" : "STATUS" , "value_type" : "STRING" , "default" : "OPEN"},
{"name" : "CMODE" , "value_type" : "STRING"},
{"name" : "ORAT" , "value_type" : "DOUBLE", "default" : 0.0, "dimension" : "LiquidSurfaceVolume/Time"},
{"name" : "WRAT" , "value_type" : "DOUBLE" , "default" : 0.0, "dimension" : "LiquidSurfaceVolume/Time"},
{"name" : "GRAT" , "value_type" : "DOUBLE" , "default" : 0.0, "dimension" : "GasSurfaceVolume/Time"},
{"name" : "VFP_TABLE" , "value_type" : "INT" , "default" : 0.0 , "comment":"The default is a state variable"},
{"name" : "LIFT" , "value_type" : "DOUBLE" , "default" : 0.0 , "comment":"The default is a state variable"},
{"name" : "THP" , "value_type" : "DOUBLE" , "default" : 0.0 , "dimension" : "Pressure"},
{"name" : "BHP" , "value_type" : "DOUBLE" , "default" : 0.0 ,"dimension" : "Pressure"},
{"name" : "NGLRAT" , "value_type" : "DOUBLE" , "default" : 0.0 ,"dimension" : "LiquidSurfaceVolume/Time"}]}

Here we can see the following:

1. The items can be of types string, integer and float; the type is
   specified with the "value_type" attribute, which must equal
   "STRING", "DOUBLE" or "INT".

2. You can optionally specify a default value; see the discussion of
   the parsing workflow below for the treatment of defaults.

3. For items of type double you can specify a dimension, see the
   section 'Units and dimensions' below for the available
   dimensions. For quantities with a dimension the parser will
   convert to SI units, and the DeckDoubleItem class has a
   getSIDouble() and getRawDouble() method (see the sketch after
   this list).
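
The sketch referred to in item 3: reading the ORAT item of the first
WCONHIST record both in deck units and converted to SI. Whether the
accessors live on DeckItem or a DeckDoubleItem subclass has varied
over time, so treat the exact class and method names as an
assumption:

  #include <opm/parser/eclipse/Deck/Deck.hpp>
  #include <opm/parser/eclipse/Deck/DeckKeyword.hpp>
  #include <opm/parser/eclipse/Deck/DeckRecord.hpp>
  #include <opm/parser/eclipse/Deck/DeckItem.hpp>

  #include <iostream>

  // Print the oil rate from the first WCONHIST record, both as entered
  // in the deck (deck units, e.g. SM3/day for METRIC) and converted to
  // SI (m3/s).
  void printOilRate(const Opm::Deck& deck) {
      const auto& orat = deck.getKeyword("WCONHIST").getRecord(0).getItem("ORAT");
      std::cout << "raw: " << orat.get<double>(0) << "\n"    // value as written
                << "SI : " << orat.getSIDouble(0) << "\n";   // converted to SI
  }
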
Items consuming the rest of the record
--------------------------------------
Most items will consume exactly one value from the input deck, but it
is also possible that the last item consumes all the remaining values
in the record; this typically corresponds to table keywords or lists
of mnemonics. In the input deck the PVTG keyword will typically appear
like this:
PVTG
    200   0.15  0.15  10
          0.20  0.20  12
          0.25  0.20  15 /
    250   0.05  0.05  20
          0.15  0.10  40 /
...

In the manual this is described as two tables with three columns, one
with three rows and one with two rows. The leading values of 200 and
250 are the pressure values at which the two tables apply. The visual
formatting in the deck, and also the written description in the
manual, strongly hint at a table structure - however, from a parsing
point of view this corresponds to just two records of different
length. Both records start with a pressure value, followed by 3n
consecutive values. The item configuration of PVTG looks like this:
"items" : [
{"name":"GAS_PRESSURE", "value_type" : "DOUBLE", "dimension":"Pressure" },
{"name":"DATA", "size_type" : "ALL" , "value_type":"DOUBLE" ,
"dimension" : ["OilDissolutionFactor","OilDissolutionFactor","Viscosity"]}
]

I.e. first we consume one value from the input deck and assign it to
the GAS_PRESSURE item; then the DATA item has "size_type" : "ALL",
meaning that this item will consume the rest of the values in the
input record. Also observe that for the "DATA" item the dimension is a
vector of three elements; when converting to SI the dimension applied
to element i is given as:

   dim[i] = dimension[i % 3]
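
A purely illustrative helper (not part of the parser API) showing the
cyclic rule applied to the values consumed by a "size_type" : "ALL"
item:

  #include <cstddef>
  #include <string>
  #include <vector>

  // Map every value consumed by an "ALL" item, such as PVTG's DATA item,
  // to the dimension that is applied when converting it to SI.
  std::vector<std::string> dimensionPerValue(const std::vector<double>& values,
                                             const std::vector<std::string>& dims) {
      std::vector<std::string> result(values.size());
      for (std::size_t i = 0; i < values.size(); ++i)
          result[i] = dims[i % dims.size()];   // dim[i] = dimension[i % 3] for PVTG
      return result;
  }
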
In addition to tables, the grid property keywords use items which
consume the rest of the record. For instance the PORO keyword will
typically look like this in the input deck:
PORO
0.14 0.15 0.0 0.10
0.16 0.25 0.1 0.11
0.14 0.15 0.0 0.09
...
0.21 0.07 0.1 0.13
/

From a parsing point of view this is one single record, which contains
one item consuming all of the values in the input deck. This could
have been configured as:
"items" : [{"name" : "DATA",
"value_type" : "DOUBLE" ,
"size_type" : "ALL" ,
"default" : 0 ,
"dimension" : "1"}]

However, since keywords containing large data arrays, like e.g. COORD
and PERMX, are quite common, a shortcut has been created for such
keywords. For instance the PORO keyword is configured as:
{"name" : "PORO" , "sections" : ["GRID"],
"data" : {"value_type" : "DOUBLE" , "default" : 0 , "dimension":"1"}}
i.e. the "data" attribute is used as shorthand to configure a keyword
with one record and one item which consumes all the data of the input
deck.
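
Once parsed, such a data keyword ends up as a single record with a
single item holding the whole array. A minimal sketch of reading it
back, assuming a getSIDoubleData() convenience accessor on DeckKeyword
(treat the exact name as an assumption):

  #include <opm/parser/eclipse/Deck/Deck.hpp>
  #include <opm/parser/eclipse/Deck/DeckKeyword.hpp>

  #include <iostream>
  #include <vector>

  // Fetch the complete PORO array; porosity has dimension "1", so the
  // SI and raw values coincide.
  void printPoroSize(const Opm::Deck& deck) {
      const std::vector<double>& poro = deck.getKeyword("PORO").getSIDoubleData();
      std::cout << "PORO has " << poro.size() << " values\n";
  }
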
Multirecord keyword configuration
---------------------------------
Units and dimensions
--------------------
The values given in the input deck are generally dimensionful, and
before the simulator can start we must convert them to internal SI
values based on the unit system used in the input deck. In the input
deck the different physical quantities are generally expressed with
per-quantity units. The unit system is *not* based on selecting a unit
for the fundamental dimensions length, mass and time and then deriving
composite units based on the dimension of the composite quantity. As a
consequence the list of dimensions supported by the parser is long,
and growing. The current list can be found in the source file:

   opm/parser/eclipse/Units/UnitSystem.cpp
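
As a concrete example of what the conversion involves: with the METRIC
unit system a pressure is entered in bars and multiplied by 1.0e5 to
obtain the internal SI value in Pascal. A purely illustrative snippet,
not using the parser classes at all:

  #include <iostream>

  int main() {
      // Illustration only: METRIC pressure is entered in bars, internally
      // we work in Pascal; the factor for the "Pressure" dimension is 1.0e5.
      const double deck_value_bars = 250.0;                    // value as written in the deck
      const double si_value_pascal = deck_value_bars * 1.0e5;  // value handed to the simulator
      std::cout << deck_value_bars << " bars = " << si_value_pascal << " Pa\n";
      return 0;
  }
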
Default values
--------------
Classes:
--------
The library contains classes along two dimensions:
+----------------+ +----------------+ +----------------+
| Parser | | RawDeck | | Deck |
+----------------+ +----------------+ +----------------+
+----------------+ +----------------+ +----------------+
| ParserKeyword  | | RawKeyword     | | DeckKeyword    |
+----------------+ +----------------+ +----------------+
+----------------+ +----------------+ +----------------+
| ParserRecord | | RawRecord | | DeckRecord |
+----------------+ +----------------+ +----------------+
+----------------+ +----------------+
| ParserItem | | DeckItem |
+----------------+ +----------------+