farnDict

Description

A farnDict is a file in dictIO dict file format, used with farn.

A farnDict

  • defines the layers, the parameters varied per layer and the related sampling used to create the samples per layer

  • after sampling, the layers, parameters and generated samples per layer make up the designspace that farn traverses

  • farn creates one distinct case folder for each sample, making up a nested case folder structure

  • nest levels and -dimensions of the case folder structure follow the sequence of layers as defined in the farn dict
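As an illustration of how the sequence of layers maps onto a nested case folder structure, the following Python sketch enumerates folder paths for a small design space. It is not farn's implementation: the helper name `case_paths` and the folder naming scheme (layer key plus a zero-based index) are assumptions made for this example only.

```python
from itertools import product
from pathlib import PurePosixPath


def case_paths(samples_per_layer: dict[str, int]) -> list[str]:
    """Enumerate nested leaf case folder paths, one nest level per layer.

    `samples_per_layer` maps each layer key to its number of samples, in the
    order the layers are defined (dicts preserve insertion order).
    The '<layer>_<index>' naming is an assumption made for illustration.
    """
    levels = [
        [f"{layer}_{i}" for i in range(n)]
        for layer, n in samples_per_layer.items()
    ]
    # The Cartesian product over all levels yields one path per leaf case.
    return [str(PurePosixPath(*combo)) for combo in product(*levels)]


paths = case_paths({"gp": 2, "cp": 3})
print(len(paths))   # 6 leaf case folders
print(paths[0])     # gp_0/cp_0
print(paths[-1])    # gp_1/cp_2
```

Changing the order of the layers in the input changes which layer forms the outer nest level, mirroring the rule that nest levels follow the sequence of layers.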

Elements

element / key

type

Description

_environment

dict

[optional] dict with system variables set at runtime, commonly containing folder paths, which can subsequently be referenced in shell commands defined in _commands

_always

dict

dict defining all objects (as variables, dicts and arrays) which have to be distributed to all farn layers in case no additional farn layer shall be generated

_layers

dict

dict defining all layers. Each layer represents one nest level in the folder structure that will be generated by farn.

 <LAYER>

dict

unique key defining a layer. It serves as basename for all case folders in the nest level corresponding with that layer.

  _sampling

dict

dict defining sampling-type and -parameters of a layer

   _type

string

sampling type. Choices currently implemented are {‘fixed’, ‘linSpace’, ‘uniformLhs’, ‘normalLhs’, ‘sobol’, ‘hilbertCurve’}

   _names

list[string]

list naming all variables / parameters being varied in this layer. For each variable / parameter named here, sampled values will be generated.

   _values

list[list[*float]]

(required for sampling type ‘fixed’): List containing lists of fixed values. For each parameter name defined in _names, one list of fixed values must exist, i.e. the number of lists in _values must match the number of parameter names defined in _names. The number of values can freely be chosen. However, all lists in _values must have the same number of values.

   _ranges

list[list[float, float]]

(required for sampling types ‘linSpace’, ‘uniformLhs’ and ‘hilbertCurve’): List containing ranges. A range is defined through the lower and upper boundary value for the related parameter name, given as tuple (minimum, maximum). For each parameter name defined in _names, one range tuple must exist.

   _numberOfSamples

int

(required for sampling types ‘linSpace’, ‘uniformLhs’ and ‘hilbertCurve’): Number of samples to be generated. In case of ‘linSpace’, boundary values are included if an odd number of samples is given. In case of ‘uniformLhs’, the given number of samples will be generated within range (=between lower and upper boundary), excluding the boundaries themselves.

   _includeBoundingBox

bool

(optional, for sampling types ‘uniformLhs’, ‘sobol’ and ‘hilbertCurve’): Defines whether the lower and upper boundary values of each parameter name shall be added as additional samples. If missing, defaults to False.

   _iterationDepth

int

(optional, for sampling type ‘hilbertCurve’): Defines the Hilbert iteration depth. If missing, defaults to 10.

  _condition

dict

(optional) a condition defines a filter expression used to include or exclude specific samples (see Filtering of Cases).

   _filter

string

filter expression (see Filter Expression)

   _action

string

(optional) defines the action triggered when the filter expression evaluates to True. Choices: ‘include’, ‘exclude’. If missing, defaults to ‘exclude’. (see Action)

  _samples

dict

dict containing all samples. The ‘_samples’ section of a layer is generated by farn when run with option --sample.

  _commands

dict

(optional) dict defining command sets that can be executed in a layer.

   <COMMAND>

list[string]

unique key defining a command element. A command element contains one or more shell commands, saved as a list of strings. When farn is called with -e <COMMAND> argument, all shell commands listed in <COMMAND> will be executed in the given sequence, in all case folders corresponding to the layer the command element is defined in.
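The consistency rules stated above for sampling type ‘fixed’ (one list in _values per name in _names, all lists of equal length) can be expressed as a small check. This is a minimal sketch, not farn's own validation; the function name `check_fixed_sampling` and the use of a plain Python dict to represent the `_sampling` element are assumptions.

```python
def check_fixed_sampling(sampling: dict) -> None:
    """Raise ValueError if a 'fixed' _sampling dict breaks the rules above."""
    names = sampling["_names"]
    values = sampling["_values"]
    # Rule 1: exactly one list of fixed values per parameter name.
    if len(values) != len(names):
        raise ValueError(
            f"_values has {len(values)} lists, but _names has {len(names)} entries"
        )
    # Rule 2: all lists must contain the same number of values.
    lengths = {len(v) for v in values}
    if len(lengths) > 1:
        raise ValueError(f"lists in _values differ in length: {sorted(lengths)}")


# Mirrors layer 'mp' of the Example section: two names, two lists of three values.
check_fixed_sampling(
    {
        "_type": "fixed",
        "_names": ["cpMul", "ppMul"],
        "_values": [[1.5, 2.0, 3.5], [1.5, 2.0, 3.5]],
    }
)
print("ok")
```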

Example

The example below shows a typical farnDict file. In the example, a multi-dimensional design space is spawned, organised in 5 layers:

  1. ‘gp’ level 0 (root layer) no. of parameters: 1 sampling: fixed (Example ‘gp’ indicating e.g. a hypothetical grid parameter)

  2. ‘lhsvar’ level 1 (nested layer) no. of parameters: 3 sampling: uniformLhs (Example 3 dimensional sub design space, LHS sampled)

  3. ‘cp’ level 2 (nested layer) no. of parameters: 1 sampling: linSpace (Example ‘cp’ indicating e.g. a hypothetical compute parameter such as a solver setting or version)

  4. ‘hilbert’ level 3 (nested layer) no. of parameters: 3 sampling: hilbertCurve (Example 3 dimensional sub design space, sampled along a Hilbert curve)

  5. ‘mp’ level 4 (leaf layer) no. of parameters: 2 sampling: fixed (Example ‘mp’ indicating e.g. hypothetical multipliers for solver and postprocessing)

/*---------------------------------*- C++ -*----------------------------------*\
filetype dictionary; coding utf-8; version 0.1; local --; purpose --;
\*----------------------------------------------------------------------------*/
_environment
{
    CASEDIR                   cases;
    DUMPDIR                   dump;
    LOGDIR                    logs;
    RESULTDIR                 results;
    TEMPLATEDIR               template;
}
_always
{
    coeff_0                   0.0;
    coeff_1                   10.0;
}
_layers
{
    gp                                          // unique key defining a layer. Will be used by farn as basename for all case folders in the corresponding nest level.
    {
        _sampling
        {
            _type fixed;                        // Fixed values. Note: Each sampling type has its own set of required arguments.
            _names(mpGrid);                     // list with names, each representing one variable or parameter.
            _values((0.9 1.3));                 // list containing list with fixed values, one list for each parameter name. Required for sampling type 'fixed'.
        }
    }
    lhsvar
    {
        _sampling
        {
            _type uniformLhs;                   // Latin-Hypercube-Sampling. Future options might also include lognormlhs etc. (currently not implemented)
            _names(param1 param2 param3);
            _ranges((-10 10)(0 3.5)(0 1.1));    // list containing ranges. A range is defined through the lower and upper boundary value for the related parameter name, given as tuple (minimum, maximum). For each parameter name, one range tuple must exist.
            _includeBoundingBox True;           // [optional] defines whether the lower and upper boundary values of each parameter name shall be added as additional samples. If missing, defaults to False.
            _numberOfSamples 100;               // number of samples to be generated. The given number of samples will be generated within range (=between lower and upper boundary), excluding the boundaries themselves.
        }
    }
    cp
    {
        _sampling
        {
            _type linSpace;                     // Linearly spaced sampling.
            _names(relFactor);
            _ranges((0.5 0.8));
            _numberOfSamples 5;
        }
        _condition                              // a condition defines a filter expression to include or exclude specific samples.
        {
            _filter  'param2 >= param3 and param1 >= 0'; // filter expression.
            _action  exclude;                            // [optional] defines the action triggered when the filter expression evaluates to True. choices: 'include', 'exclude'. If missing, defaults to 'exclude'.
        }
    }
    hilbert
    {
        _sampling
        {
            _type hilbertCurve;                 // Hilbert-Sampling. (derived from Hilbert's space-filling curve for dimensions >= 2)
            _names(param1 param2 param3);
            _ranges((-5. 5.)(0. 10.)(-10 10.)); // A list, containing ranges. A range is defined through the lower and upper boundary value for the related parameter name, given as tuple (minimum, maximum). For each parameter name, one range tuple must exist.
            _includeBoundingBox True;           // [optional] defines whether the lower and upper boundary values of each parameter name shall be added as additional samples. It is not recommended to use this option in hilbert sampling because at least two samples will be coincident.
                                                // If missing, defaults to False. If it is given, start and end points of Hilbert distribution coincide with bb points for technical reasons.
            _numberOfSamples 20;                // number of samples to be generated. The given number of samples will be generated within range (=between lower and upper boundary), excluding the boundaries themselves. Resampling, keeping the already done cases, is possible in a subsequent farn call by adding (_numberOfSamples-1) to itself.
                                                // a new sampling iteration on the same data set should be done using _numberOfSamples_n+1 = _numberOfSamples_n + _numberOfSamples_n - 1 to retain the lowest possible correlations between samples.
            _iterationDepth   5;                // iteration depth of Hilbert's algorithm: this example generates 2**(3*5) = 32768 Hilbert points (H. length).
        }
    }
    mp
    {
        _sampling
        {
            _type fixed;
            _names(cpMul ppMul);                // (just exemplary parameter names; imagine e.g. multipliers for solver and postprocessing.)
            _values((1.5 2.0 3.5)(1.5 2.0 3.5));
        }
        _commands                               // commands. Each <COMMAND> element contains a list with one or more shell commands.
        {
            prepare                             // command 'prepare' (contains 3 shell commands)
                (
                    'copy %TEMPLATEDIR%/caseDict'   // shell command 1
                    'rem parsed.caseDict'           // shell command 2
                    'dictParser --quiet caseDict'   // shell command 3
                );
            run                                 // command 'run' (contains 1 shell command)
                (
                    'cosim.exe run OspSystemStructure.xml -b 0 -d 20 --real-time -v'
                );
        }
    }
}
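
The two pieces of arithmetic mentioned in the comments of the ‘hilbert’ layer above can be checked with a few lines of Python. The function names are hypothetical and chosen for this sketch only; the formulas themselves are taken from the comments in the example.

```python
def hilbert_points(dimensions: int, iteration_depth: int) -> int:
    """Number of points on the Hilbert curve: 2**(dimensions * iteration_depth)."""
    return 2 ** (dimensions * iteration_depth)


def next_number_of_samples(n: int) -> int:
    """Resampling rule from the example comment: n_next = n + n - 1."""
    return n + n - 1


# Three parameters and _iterationDepth 5, as in the 'hilbert' layer.
print(hilbert_points(3, 5))   # 32768

# Successive values of _numberOfSamples when resampling, starting at 20.
n = 20
for _ in range(3):
    n = next_number_of_samples(n)
    print(n)                  # 39, then 77, then 153
```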

Filtering of Cases

Filtering of cases is possible using the ‘_condition’ element. The _condition element is evaluated with every call to farn, regardless of whether farn is run with option --sample, --generate or --execute. (The effect of filtering differs, however, depending on the option farn is called with. See Effect of farn options on Filtering)

The structure of the _condition element is as follows:

Structure of the _condition Element

_condition
{
    _filter  FILTER_EXPRESSION;
    _action  VALUE;
}

Any _condition element must contain exactly one _filter expression and can optionally contain an _action value. The _condition element is layer specific, meaning it must be defined inside a <LAYER> element. Each layer can contain at most one _condition element.

Filter Expression

The value of the _filter element is expected to be a string-formatted expression containing a relational statement or list comparison. Valid examples include:

Filter Expression

Comment

"param1 > 3"

with param1 being a user defined key within the current scope (layer)

"param1 < 0 or param2 == 1"

with param1 & param2 being keys within the current level, combined using the boolean operators “and”, “or” and “not”

"param1 * sqrt(param2)"

with param1 & param2 being keys within current level in combination with mathematical operators and constants

"param1 not in [4, 5, 7]"

with param1 being compared against a list

"CASE_ATTRIBUTE in ['case_00', 'case_01']"

with CASE_ATTRIBUTE being one of the case attributes made available by farn during runtime (see Case Attributes below)

or any combination of the above.
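
How such an expression might be evaluated can be sketched in Python. This is not farn's actual implementation: the helper `evaluate_filter` and the choice of `eval` with a restricted namespace are assumptions for illustration. The variable names are taken from the examples above, and `case` is one of the case attributes described under Case Attributes.

```python
import math


def evaluate_filter(expression: str, variables: dict) -> bool:
    """Evaluate a filter expression against the variables of one case."""
    # Expose only a few math helpers plus the case's own variables.
    # Note: eval is not a security boundary; use it on trusted input only.
    namespace = {"__builtins__": {}, "sqrt": math.sqrt, "pi": math.pi}
    namespace.update(variables)
    return bool(eval(expression, namespace))


case_vars = {"param1": 4, "param2": 1, "case": "case_00"}
print(evaluate_filter("param1 > 3", case_vars))                      # True
print(evaluate_filter("param1 < 0 or param2 == 1", case_vars))       # True
print(evaluate_filter("param1 not in [4, 5, 7]", case_vars))         # False
print(evaluate_filter("case in ['case_00', 'case_01']", case_vars))  # True
```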

Action

The optional _action element defines the action triggered in case the filter expression evaluates to True.

Currently supported values are the string literals

  • ‘include’ and

  • ‘exclude’.

If the _action element is missing, action defaults to ‘exclude’.

farn evaluates the _filter expression and the _action value for each case and with every run of farn.


A case is considered valid if one of the following two conditions is met:

_action is ‘exclude’ (default) and _filter expression evaluates to False

-> case is considered valid and will be included

_action is ‘include’ and _filter expression evaluates to True

-> case is considered valid and will be included


Correspondingly, a case is considered invalid if either of the two complementing conditions is met:

_action is ‘exclude’ (default) and _filter expression evaluates to True

-> case is considered invalid and will be excluded

_action is ‘include’ and _filter expression evaluates to False

-> case is considered invalid and will be excluded
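
The four combinations above reduce to a small decision function. The following Python sketch is illustrative only; the function name `is_valid` is hypothetical and not part of farn.

```python
def is_valid(filter_result: bool, action: str = "exclude") -> bool:
    """Return True if the case is kept, False if it is filtered out."""
    if action == "include":
        return filter_result
    if action == "exclude":
        return not filter_result
    raise ValueError(f"unsupported _action: {action!r}")


# Print the full truth table for both supported actions.
for action in ("exclude", "include"):
    for filter_result in (False, True):
        print(action, filter_result, "->", is_valid(filter_result, action))
```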

Case Attributes

The following case attributes can be used in filter expressions. They are made available as variables by farn during runtime.

Case Attribute

Description

case

Name of the case

layer

Name of the layer

level

Level (integer, zero-based)

index

Index (integer, zero-based)

path

case folder path

is_leaf

Indication whether or not a case is a leaf case (bool)

If unsure, change farn’s log level from INFO to DEBUG to see a list of available attributes for each processed case.

The benefit of this approach is that filtering can be used to e.g. process small chunks of specific cases, simply by adapting a filter expression in the sampled.farnDict file. This can come in handy when e.g. drilling down on “problem” cases, see the following example:

_condition
{
    _filter  "index in [0, 200, 201]";
    _action  include;
}

Effect of farn options on Filtering

It is important to note that, depending on the commandline option farn is called with, filtering results in different effects:

farn option

Effect on Filtering

--sample

cases will appear or disappear in the _samples section in sampled.farnDict file written by farn
(meaning the corresponding cases are either available or not available right from the start)

--generate

case folders will be generated or not

--execute

cases will be executed or not

Filter expressions in the sampled.farnDict file can be modified as needed, at any time, when working with farn.

If, however, farn is called with the --execute option and the folder of a case to be executed does not exist, farn will log a warning stating that the respective case folder does not exist and needs to be generated first. This most commonly happens when a filter expression was changed between generating the case folder structure and executing command sets therein. If so, simply generate the missing cases by calling farn with option --generate once again and then retry executing the command set with option --execute.
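
The behaviour described above amounts to a guard before executing commands in a case folder. The sketch below shows the idea in Python; it is not taken from farn's source, and the function name `can_execute` as well as the wording of the warning are assumptions.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("execute_sketch")


def can_execute(case_folder: Path) -> bool:
    """Return True if the case folder exists; otherwise log a warning."""
    if not case_folder.is_dir():
        logger.warning(
            "case folder %s does not exist and needs to be generated first",
            case_folder,
        )
        return False
    return True


# Usage: an existing folder passes, a missing one triggers the warning.
with tempfile.TemporaryDirectory() as tmp:
    print(can_execute(Path(tmp)))              # True
    print(can_execute(Path(tmp) / "missing"))  # False
```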

Erroneous filter expressions

If a filter expression cannot be evaluated successfully by farn, e.g. because parameter names are used in the filter expression which are not (yet) defined or accessible in the current scope (layer), a warning is logged and the case is considered invalid.
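
A minimal Python sketch of this fallback is given below. It is an assumption about how such behaviour could be implemented, not farn's code: the function name `safe_filter` is hypothetical, and `eval` stands in for whatever evaluation mechanism farn actually uses.

```python
import logging

logger = logging.getLogger("filter_sketch")


def safe_filter(expression: str, variables: dict, action: str = "exclude") -> bool:
    """Return True if the case is valid; treat evaluation errors as invalid."""
    try:
        result = bool(eval(expression, {"__builtins__": {}}, dict(variables)))
    except Exception as error:  # e.g. NameError for an undefined parameter
        logger.warning("filter %r could not be evaluated: %s", expression, error)
        return False
    # Apply the action: 'include' keeps matches, 'exclude' drops them.
    return result if action == "include" else not result


print(safe_filter("param1 > 3", {"param1": 1}))  # True  (filter is False, case not excluded)
print(safe_filter("param9 > 3", {"param1": 1}))  # False (param9 undefined, warning logged)
```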