Chapter 30. Measure Service

30.1. Concepts

The measure service is LogicBlox's implementation of online analytical processing (OLAP), which is used for analyzing multidimensional data. In contrast to LogiQL's normal relational model, OLAP is intended to support convenient and efficient roll-up operations that aggregate values belonging to semantically related keys. For instance, an analyst might start with fine-grained Sales data keyed by intersection (sku, store, week) and roll up by aggregating time to view Sales keyed by (sku, store, season). She might also roll up all products to yield Sales at (store, season).
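As a rough illustration of the roll-up just described, the following Python sketch models Sales as a dictionary keyed by (sku, store, week) positions and aggregates it first to (sku, store, season) and then to (store, season). The data values, the week_to_season map, and the roll_up helper are all hypothetical and are not part of the measure service. The sketch only shows the arithmetic that a roll-up performs.

```python
from collections import defaultdict

# Hypothetical fine-grained Sales data keyed by (sku, store, week).
sales = {
    ("shirt", "s1", "w1"): 10,
    ("shirt", "s1", "w2"): 5,
    ("shirt", "s1", "w14"): 7,
    ("socks", "s1", "w1"): 3,
}

# Hypothetical level map: every week belongs to exactly one season.
week_to_season = {"w1": "winter", "w2": "winter", "w14": "spring"}


def roll_up(measure, rekey):
    """Sum the values of all positions that map to the same coarser key."""
    result = defaultdict(int)
    for position, value in measure.items():
        result[rekey(position)] += value
    return dict(result)


# Aggregate time: (sku, store, week) -> (sku, store, season).
by_season = roll_up(sales, lambda p: (p[0], p[1], week_to_season[p[2]]))

# Aggregate away all products: (sku, store, season) -> (store, season).
by_store_season = roll_up(by_season, lambda p: (p[1], p[2]))
```

In the measure service the same result is obtained declaratively, by naming the target intersection rather than writing the loop by hand.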

In the LogicBlox approach to OLAP, the three primary concepts are levels, dimensions, and measures. Intuitively, a level is a set of values that are intended to serve as keys and to support roll-up, and a measure is a collection whose keys are drawn from zero or more levels. In the example above, Sales is a measure and each of sku, store, week, and season is a level. A dimension is a group of related levels whose members roll up to each other; for instance, week and season are part of a Calendar dimension.

For context, it's worth understanding that the measure service provides an OLAP view of data stored in normal LogiQL predicates, and that it is implemented by installing logic in a workspace. The advantage of using the measure service, as opposed to hand-implementing calculations in LogiQL, is that measure queries and definitions are substantially shorter and more direct than their LogiQL equivalents. As we will see, building an OLAP model requires both providing LogiQL definitions and constructing a measure model that explains how to interpret these definitions as OLAP concepts.

30.1.1. Levels, dimensions, and intersections

A level is a set of points, intended to serve as keys, and a dimension is a set of related levels. Common examples include a Location dimension with levels store, city, and state; a Product dimension with levels sku and class; and a Calendar dimension with levels day, week, season, month, etc. Roll-ups are implemented by specifying a total mapping from the members of a lower level to the members of a higher level.

Typically, levels are implemented by LogiQL entity types. For example, the Product.sku level might be backed by a myapp:sku entity type, with each member of the level being an entity of that type. Additionally, there are special dimensions corresponding to primitive types, such as the dimension Int, which contains a single level, also named Int. Finally, roll-ups between the levels are implemented by functional LogiQL predicates. For example, the simple calendar dimension illustrated in Figure 30.1, “Calendar dimension level relationships” may be built based on the LogiQL definitions shown in the following (partial) lb script:

addblock <doc>
// Entity types represent levels.
Day(d), dayId(d:s) -> string(s).
Month(m), monthId(m:s) -> string(s).
Year(y), yearId(y:s) -> string(s).
</doc>

exec <doc>
// Entity values represent members.
+dayId[_] = "1-1-2014".
+dayId[_] = "1-2-2014".
+dayId[_] = "2-1-2014".
+dayId[_] = "2-2-2014".

+monthId[_] = "January2014".
+monthId[_] = "February2014".

+yearId[_] = "2014".
</doc>

addblock <doc>
// Level maps specify how levels roll up
dayToMonth[d] = m -> Day(d), Month(m).
monthToYear[m] = y -> Month(m), Year(y).

dayToMonth[d] = m <- dayId[d] = "1-1-2014", monthId[m] = "January2014".
dayToMonth[d] = m <- dayId[d] = "1-2-2014", monthId[m] = "January2014".
dayToMonth[d] = m <- dayId[d] = "2-1-2014", monthId[m] = "February2014".
dayToMonth[d] = m <- dayId[d] = "2-2-2014", monthId[m] = "February2014".

monthToYear[m] = y <- monthId[m] = "January2014",  yearId[y] = "2014".
monthToYear[m] = y <- monthId[m] = "February2014", yearId[y] = "2014".
</doc>

Figure 30.1. Calendar dimension level relationships


It is also possible to relate members of the Day level directly to the Year level, but this is unnecessary as long as such a relationship commutes with the ones from Day to Month and from Month to Year.
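To make "commutes" concrete: a direct Day-to-Year map adds nothing when, for every day, it agrees with going through Month first. The Python sketch below mirrors the level maps of the lb script above as dictionaries. The day_to_year map and the commutes helper are hypothetical illustrations, not something defined by the script or by the measure service.

```python
# Level maps mirroring the lb script above, written as Python dicts.
day_to_month = {
    "1-1-2014": "January2014",
    "1-2-2014": "January2014",
    "2-1-2014": "February2014",
    "2-2-2014": "February2014",
}
month_to_year = {"January2014": "2014", "February2014": "2014"}

# Hypothetical direct map from Day to Year (not defined in the script).
day_to_year = {day: "2014" for day in day_to_month}


def commutes(direct, first, second):
    """True if direct[x] == second[first[x]] for every x in first."""
    return all(direct.get(x) == second[first[x]] for x in first)


# The direct map is redundant exactly when it commutes with the two-step path.
redundant = commutes(day_to_year, day_to_month, month_to_year)
```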

Figure 30.2. Calendar dimension level relationships extended


However, dimensions need not be strictly linear. For example, we could add a season level to the dimension and provide additional relationships between month and season and season and year.

Figure 30.3. Calendar dimension extended with Season level


For some dimensions, it also makes sense to provide a mapping from a level to itself.

Example 30.1. 

Consider a dimension for measuring data with respect to different parts in a manufacturing process. It is natural for the members of a "Parts" level to be related to other parts in the same level. For example, a "Gear" may be part of an "Engine".

Figure 30.4. Example of dimension with a self-relationship


Because we allow relationships between any two levels in a dimension, they can be thought of as a graph. However, we do place some structural limitations on the graph to ensure that it has the properties needed for well-defined OLAP queries:

  1. The first requirement is that the graph must not contain any cycles that involve more than one node. This allows relationships like the one for "Parts", described above, but rules out mappings that form larger cycles. For example, the Vatican is a country within the city of Rome, which is within the region of Lazio, which is within the country of Italy. We can illustrate the relationship between members as follows:

    Figure 30.5. Location dimension levels and members


    However, to establish those relationships between levels, we would need a dimension that looks like this:

    Figure 30.6. Disallowed relationship among Location dimension levels


    This dimension's structure has a cycle involving three nodes, which is not allowed.

  2. The second requirement is that the transitive closure of the directed edge relationship must form a meet-semilattice. That is, when graph reachability is treated as a "less than or equals" relation (≤), then that relation must be a partial order, and there must exist a least element (level) ⊥ such that, for every level l in the dimension, ⊥ ≤ l. Furthermore, for every pair of levels l1 and l2, there must exist a meet (i.e., greatest lower bound).
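The two requirements can be checked mechanically on a small dimension graph. The following Python sketch is only an illustration under stated assumptions: levels are plain strings, an edge (a, b) means that level a rolls up to level b, and the function names are hypothetical. It is not how the measure service validates a model.

```python
from itertools import product


def reachable(edges, levels):
    """Reflexive-transitive closure: reach[a] holds every level b with a <= b."""
    reach = {a: {a} for a in levels}
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            for src in levels:
                if a in reach[src] and not reach[b] <= reach[src]:
                    reach[src] |= reach[b]
                    changed = True
    return reach


def has_large_cycle(edges, levels):
    """True if two distinct levels reach each other (self-loops are allowed)."""
    reach = reachable(edges, levels)
    return any(a != b and b in reach[a] and a in reach[b]
               for a, b in product(levels, repeat=2))


def is_meet_semilattice(edges, levels):
    """True if reachability is a partial order with a least element and all meets."""
    if has_large_cycle(edges, levels):
        return False  # antisymmetry fails, so this is not a partial order
    reach = reachable(edges, levels)

    def leq(a, b):
        return b in reach[a]

    # There must be a least level that is below every level.
    if not any(all(leq(bottom, l) for l in levels) for bottom in levels):
        return False
    for l1, l2 in product(levels, repeat=2):
        lower = [l for l in levels if leq(l, l1) and leq(l, l2)]
        # A meet is a lower bound that every other lower bound is below.
        if not any(all(leq(x, m) for x in lower) for m in lower):
            return False
    return True


calendar_levels = ["day", "month", "season", "year"]
calendar_edges = [("day", "month"), ("month", "year"),
                  ("month", "season"), ("season", "year")]

location_levels = ["country", "city", "region"]
location_edges = [("country", "city"), ("city", "region"), ("region", "country")]
```

Here the calendar graph satisfies both requirements, while the location graph is rejected by the first one because its three levels form a cycle.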

Generally, dimensions are also described in terms of what are called hierarchies. A hierarchy can be thought of as a named path through the dimension graph. Hierarchies provide a useful modeling option for dimensions and they can also be used to direct some operations that involve dimensions to use a specific path.

Attributes are another important concept in modeling with dimensions. Attributes can be thought of as functions or properties of the members of a level. An attribute is generally used to allow metadata concerning a member to be queried. For example, the name and the label of a member might be two separate attributes of a level.

An intersection is a set of labeled levels. For example, given the level day and the level state, the labeled pair (myDay:day, myState:state) is an intersection. The measure service allows you to omit a label, in which case it uses the level's corresponding dimension as the default label. For instance, the following are four ways of writing down the same intersection:

(day, state)
(Calendar:day, state)
(day, Location:state)
(Calendar:day, Location:state)

The order in which labels appear in an intersection can influence the measure service's choice of key ordering (see ???), but it is not important when writing queries or updates. In most contexts (day, state) and (state, day) are equivalent. A single level may occur multiple times in an intersection; for instance, the intersection (before:sku, after:sku) might be used when modeling how likely one product is to be purchased after another.
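The labeling rules above can be summarized in a few lines of Python. This is a sketch under stated assumptions: the DIMENSION_OF table and the parse_intersection helper are hypothetical, and the parser handles only the informal parenthesized notation used in this section, not the full Intersection syntax of the measure expression grammar.

```python
# Hypothetical mapping from each level to its dimension.
DIMENSION_OF = {"day": "Calendar", "state": "Location", "sku": "Product"}


def parse_intersection(text):
    """Turn '(Calendar:day, state)' into a frozenset of (label, level) pairs."""
    pairs = set()
    for item in text.strip("() ").split(","):
        item = item.strip()
        if not item:
            continue  # the empty intersection "()" has no levels
        if ":" in item:
            label, level = (part.strip() for part in item.split(":", 1))
        else:
            # No label written: default to the level's dimension.
            label, level = DIMENSION_OF[item], item
        pairs.add((label, level))
    return frozenset(pairs)


forms = ["(day, state)", "(Calendar:day, state)",
         "(day, Location:state)", "(Calendar:day, Location:state)"]
parsed = {parse_intersection(form) for form in forms}
```

Because an intersection is modeled as a set of (label, level) pairs, the four spellings collapse to a single value, the order of the levels does not matter, and the same level can appear twice under different labels.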

Finally, a position is a record (unordered tuple) of level members corresponding to an intersection. For instance (June2014, Waffle) and (September2012, Toast) might be positions of the (month, breakfastFood) intersection.

30.1.2. Measures

A measure is a map from the positions of some intersection to a value or, less frequently, a set of values or no values at all. The canonical OLAP example is the Sales measure, which gives a decimal data value for each position of the intersection (Sku, Store, Week).

Every measure is defined by a measure expression. Measure expressions are discussed below, but for intuition it's worth considering two kinds of measures. Metrics are measures defined by the contents of a provided LogiQL predicate. You can think of metrics as the input data to the measure service. In contrast, an aggregation defines a measure by adding up values in other measures, which might be metrics. Putting this together, we might define Sales at (Sku, Store, Week) as a metric referencing LogiQL predicate companydata:sales and use an aggregation measure expression to roll up sales figures to the measure at intersection (Sku, Region, Year).

It is not strictly necessary that a measure be a function from positions to values. It is legal to have a set of values for each position. These measures can be used in queries, but our current reporting mechanism can only handle functional measures. Therefore, some filtering or aggregation is necessary to obtain a report from these relational measures.

Furthermore, it is not necessary that a measure contain data. A measure can consist entirely of a set of positions. This is isomorphic to having a dense measure measuring boolean values, but more space efficient. While such measures may be used in queries, given that there is no data within them, it is not possible to directly query them in a report.
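The distinction between relational and position-only measures can be pictured with ordinary Python collections. The sketch below is purely illustrative: the data, the measure names, and the helper functions are hypothetical and say nothing about how the measure service stores or evaluates measures.

```python
# Relational measure: a set of values per position (hypothetical data).
promo_codes = {
    ("sku1", "store1"): {"SPRING", "CLEARANCE"},
    ("sku2", "store1"): {"SPRING"},
}

# Position-only measure: just a set of positions, with no values.
stocked = {("sku1", "store1"), ("sku2", "store1")}


def count_values(relational):
    """Aggregate a relational measure into a functional one by counting."""
    return {pos: len(values) for pos, values in relational.items()}


def as_dense_boolean(position_only, all_positions):
    """The equivalent dense boolean measure: one entry for every position."""
    return {pos: pos in position_only for pos in all_positions}


all_positions = [("sku1", "store1"), ("sku2", "store1"), ("sku3", "store1")]
promo_count = count_values(promo_codes)
stocked_dense = as_dense_boolean(stocked, all_positions)
```

Counting is just one way to make the relational measure functional so that it could appear in a report. The dense boolean version needs an entry for every position, which is why the position-only form is the more space-efficient of the two.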

Finally, it is also possible for measures to be parameterized so that their behavior can be adjusted on a per-query basis. Because the parameterization mechanism is closely tied to implementation details, we do not cover it in depth here.

Note

The remainder of this chapter is under construction.

The measure service and CubiQL (a language that provides a high-level interface to the measure service) are still being actively developed. For the time being we provide only the current grammar.

30.2. Measure Expression Grammar

In many contexts the measure service allows one to concisely specify a measure query expression in a form described by the following grammar. (See chapter Grammar for a description of the notation. Additionally, we use % to begin a comment that extends to the end of the line.)
%%%% Basic elements

NonDQuote is any character other than a double quote (i.e., '"').
NonBQuote is any character other than a back quote (i.e., '`').
Letter    is any Unicode alphabetic character.

NonzeroDigit = '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' .
Digit        = '0' | NonzeroDigit .

Ident = Letter { Letter | Digit }
      | '`' { NonBQuote } '`' .       % an escaped identifier


%%%% Queries

Label = Ident .

Type = 'string' | 'int' | 'float' | 'decimal' | 'boolean' | Ident
     | '(' [ Types ] ')' .                                         % tuple type

Types = Type { ',' Type } .

Level = Ident                         % Unqualified level name
      | Ident '.' Ident               % Level name qualified with dimension
      | Ident '.' Ident '.' Ident .   % Level name qualified with dimension and
                                      % hierarchy

LLevel = [Label ':' ] Level .         % Labeled level

Intersection = '(' Intersection ')'                    % parentheses
             |  '{' [ LLevels ] '}'                    % i.e., may be nullary
             |  'interof' '(' Expr ')'                 % indirect intersection
             |  Intersection { '&' Intersection }      % meet of intersections
             |  Intersection { '|' Intersection }      % join of intersections
             |  Intersection '!' Label                 % restriction by a label
             |  Intersection '!' LabelSet              %    and a set of labels
             |  Ident .                                % intersection variable

LLevels = LLevel { ',' LLevel } .
LabelSet = '<' Label { ',' Label } '>' .

BaseSignature = Intersection                              % position only
              | Intersection '=>' Type                    % single-valued
              | Intersection '=>' 'Set' '(' Type ')' .    % multi-valued

IntegerLiteral    = '0' | NonzeroDigit [ Digits ] .
FractionalLiteral = IntegerLiteral '.' Digits .

Digits = Digit { Digit } .

ScalarLiteral = IntegerLiteral                                 % integer literal
              | IntegerLiteral 'd' | FractionalLiteral ['d' ]  % decimal literal
              | IntegerLiteral 'f' | FractionalLiteral 'f'     % float literal
              | FractionalLiteral ('E' | 'e') [ Sign ] IntegerLiteral [ 'f' ]
              | 'true' | 'false'                               % boolean literal
              |  '"' { NonDQuote } '"' .                       % string literal

Sign = '+' | '-' .

LiteralColumn = '[' [ ScalarLiterals ] ']' .
LiteralTuple  = '(' [ ScalarLiterals ] ')' .

Literal = '{' [ LiteralTuples ] '}' ':' BaseSignature .

ScalarLiterals = ScalarLiteral { ',' ScalarLiteral } .

LiteralTuples = LiteralTuple { ',' LiteralTuple } .

Expr = [ '{{' [ Annotations ] '}}' ] ExprNoAnnotations .

Annotations = Annotation { ',' Annotation } .

Annotation = Ident '=' ScalarLiteral
           | Ident '=' LiteralColumn.

ExprNoAnnotations =
     '(' Expr ')'                        % parentheses
   | Ident                               % metric or expression variable
   |  LLevel '.' Ident                   % attribute
   |  '-' Expr                           % sugar for negate(<expr>)
   |  Expr '+' Expr                      % sugar for add(<expr>, <expr>)
   |  Expr '-' Expr                      % sugar for subtract(<expr>, <expr>)
   |  Expr '*' Expr                      % sugar for multiply(<expr>, <expr>)
   |  Expr '/' Expr                      % sugar for divide(<expr>, <expr>)
   |  Expr '@' Intersection              % widen
   |  '#' Expr                           % drop values making position only
   |  'demote' Label 'in' Expr           % convert a dimension into the value
                                         %   of the expression
   |  'promote' [Label 'in'] Expr        % convert the value of the expression
                                         %   into a dimension
   |  AggMethod Expr '@' Intersection    % aggregation to an intersection
   |  AggMethod Expr [ Groupings ]       % aggregation by grouping
   |  'headersort' Expr 'by' LLevel      % headersort
   |  'filter' Expr 'by' Comparisons     % filter
   |  'dice' Expr 'by' Dicers            % dice
   |  'fun' ArgBindings 'in' Expr        % function
   |  Ident '(' [ Exprs ] ')'            % operator
   |  Expr AppBindings                   % application
   |  'let' AppBindings 'in' Expr        % let binding
   |  'split' LabelMap 'in' Expr         % dimension splitting
   |  'relabel' LabelMap 'in' Expr       % dimension relabeling
   |  Expr '++' Expr { '++' Expr }       % override
   |  Expr '|' Expr { '|' Expr }         % union
   |  Expr '&' Expr { '&' Expr }         % intersection
   |  Expr '\' Expr                      % difference
   |  Expr 'as' Type                     % cast expression to type
   |  ScalarLiteral                      % sugar for a literal at the top
                                         %   intersection
   |  Literal .                          % literal expression

Exprs = Expr { ',' Expr } .

ArgBindings = '[' [ InterArgBindings ] ']' '(' [ ExprArgBindings ] ')'
            | '[' [ InterArgBindings ] ']'
            |                              '(' [ ExprArgBindings ] ')' .

InterArgBindings = InterArgBinding { ',' InterArgBinding } .

InterArgBinding = Ident                         % simple argument
                | Ident '=' Intersection .      % argument with default

ExprArgBindings = ExprArgBinding { ',' ExprArgBinding } .

ExprArgBinding = Ident                          % simple argument
               | Ident '=' Expr.                % argument with default

AppBindings = '[' [ InterAppBindings ] ']' '(' [ ExprAppBindings ] ')'
            | '[' [ InterAppBindings ] ']'
            |                              '(' [ ExprAppBindings ] ')' .

InterAppBindings =
     Intersections                                    % positional only
   | Intersections { ',' InterNamedBinding }          % positional + named
   | InterNamedBinding { ',' InterNamedBinding } .    % named only

Intersections = Intersection { ',' Intersection } .

InterNamedBinding = Ident '=' Intersection .

ExprAppBindings =
     Exprs                                            % positional only
   | Exprs { ',' ExprNamedBinding }                   % positional + named
   | ExprNamedBinding { ',' ExprNamedBinding } .      % named only

ExprNamedBinding = Ident '=' Expr .

LabelMap = Ident 'to' Ident { ',' Ident 'to' Ident } .

Comparisons = Comparison { 'and' Comparison }
            | Comparison { 'or' Comparison } .

Dicers = Expr { 'and' Expr }
       | Expr { 'or' Expr } .

Comparison = Compop Expr .

Compop = '='       % equal
       | '!='      % not equal
       | '<'       % less than
       | '<='      % less than or equal
       | '>'       % greater than
       | '>='      % greater than or equal
       | '~' .     % POSIX extended regular expression comparison

AggMethod = 'average' | 'count' | 'total' | 'collect' | 'ambig'
          | 'min' | 'max' | 'mode' | 'count_distinct' | 'histogram' | 'sort' .

Groupings = 'by' Grouping { Grouping } .

Grouping = 'all' Ident            % project dimension
         | 'to' LLevel            % rollup
         |  'slide' MapName .     % rollup across dimensions


%%%% Updates

Update = 'do' AtomicUpdates .

%% In the future this might be extended to
%%   Update = 'do' UpdateExpr .

AtomicUpdateExpr = '(' AtomicUpdateExpr ')'
                 | 'remove' Expr [ Transformations ] 'from' Target
                 | 'spread' Expr [ Transformations ] 'into' Target .

AtomicUpdates = AtomicUpdateExpr { 'and' AtomicUpdateExpr } .

%% Only conjunctive AtomicUpdateExprs are supported at this time.
%% The following is a possible future extension.
%%
%%   UpdateExpr = AtomicUpdates { 'or' AtomicUpdates }
%%              | 'if' AtomicUpdates 'then' UpdateExpr
%%                  { 'else' 'if' AtomicUpdates 'then' UpdateExpr }
%%                    [ 'else' AtomicUpdates ] .

Target = Ident                     % metric
       | LLevel                    % level
       | LLevel '.' Ident .        % attribute

%% In the future Target might be extended by yet another case:
%%        | LLevel '=>' LLevel .      % level map

Transformations = 'via' Transformation { 'then' Transformation } .

Transformation =
    ('even' | 'ratio' | 'replicate' | 'query' Expr) Distribution .

Distribution = { 'down' Level } .
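
As a small worked example of reading the grammar, the ScalarLiteral production above can be transcribed into regular expressions. The Python sketch below is an illustration only: the classify function is hypothetical, and treating the exponent form as a float literal is an assumption, since the grammar does not annotate that alternative.

```python
import re

# IntegerLiteral = '0' | NonzeroDigit [ Digits ] .
INT = r"(?:0|[1-9][0-9]*)"
# FractionalLiteral = IntegerLiteral '.' Digits .
FRAC = INT + r"\.[0-9]+"

# Ordered (kind, pattern) pairs, one per group of ScalarLiteral alternatives.
PATTERNS = [
    ("integer", INT),
    ("decimal", f"(?:{INT}d|{FRAC}d?)"),
    # Assumption: the exponent form is classified as a float literal.
    ("float", f"(?:{INT}f|{FRAC}f|{FRAC}[Ee][+-]?{INT}f?)"),
    ("boolean", r"(?:true|false)"),
    ("string", r'"[^"]*"'),
]


def classify(text):
    """Return the kind of scalar literal that text spells, or None."""
    for kind, pattern in PATTERNS:
        if re.fullmatch(pattern, text):
            return kind
    return None
```

Two consequences of the grammar show up when this is tried out. A literal has no sign of its own, so -1 is the negation operator applied to the literal 1. An exponent is only allowed after a fractional literal, so 1e5 is not a scalar literal while 1.0e5 is.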