Chapter 37. Workbooks

37.1. Overview

The workbook framework uses the branching capabilities of the database to allow users to extract portions of the database to a branch, work in isolation on this branch, restrict access to this branch to certain users and merge changes to manage changes in different branches.

37.1.1. Partitioning

Workbooks are typically defined for a given partition of a database. Partitions can be defined by one or several levels (which must be specified within a hierarchy) of the measure model and one or several members for each of those levels.

A partitioned workbook will only contain:

  • the level members at the partition levels;

  • the children and descendants of those levels along the hierarchy that is used for the partitioning;

  • as well as the parent and ancestors of those levels along all hierarchies.

Example 37.1. 

If a workbook is partitioned at the Product:Department level, for a hierarchy which contains the typical Sku, Subclass, Class and Department relations, a workbook for the department "men's clothing" would only contain this department and only those classes, subclasses and SKUs that roll up to it. If an alternative hierarchy is defined, for example one in which Sku rolls up to a Brand, then the workbook will include only those brands that are rolled up to from some SKU that belongs to the "men's clothing" class.

Unfiltered Hierarchies

The behavior illustrated in the previous example with regard to the filtering of the hierarchies for partitioned workbooks is the default behavior. It is possible to configure a hierarchy to be "unfiltered", that is, to be fully included in the workbook. This can be achieved by specifying a list of unfiltered_hierarchies when specifying the level(s) to be used for partitioning.

37.1.2. BRANCH and CLONE methods

There are two methods to create a workbook.

BRANCH

The BRANCH method will spawn a branch from the master branch and use it as the workbook. This is most useful for workbooks which are not partitioned.

CLONE

The CLONE method will spawn a branch containing the schema and no data and will populate the partition associated with this workbook through a refresh. Note that in order to use this method you must create a no_data branch which contains the schema but no data.

37.1.3. Merge method

Data can be moved back and forth between a workbook and the master using a merge procedure: a commit merges workbook data into the master, whereas a refresh merges master data into a workbook. Users can define commit and refresh groups which specify which parts of the data model are transferred.

The merge method configuration allows users to select between different semantics for the merge procedure. The default merge method is FULL where all the data in the partition is transferred from the source to the target. Another option is DELTA merge, which only transfers data that changed in the source since the last merge (commit or refresh) was performed. This includes entity creation (often level members), entity retraction and changes in values. Note that both methods overwrite pre-existing changes in the target branch, but DELTA limits the scope of the data that is transferred.

The default merge method is FULL, but it can be configured at the workbook handler level by setting the merge_method option on the handler config file. This value will apply to any workbook or template created by the handler unless it gets overridden by the creation message (see Section 37.2, “Creating and deleting workbooks” for the creation message of workbooks and Section 37.3, “Templates” for templates). Note that once the merge method is set for a template, all workbooks created from that template will use the same merge method.

Example 37.2. 

In the following we compare the behavior of DELTA merge against FULL merge with a simple example.

  • The following shows the initial values for the Sales predicate in both the workbook and the master:

    SKU    | STORE   | WEEK   | Sales
    apples | atlanta | week-1 | 50
    apples | atlanta | week-2 | 0 
  • In the workbook the user makes the following update:

    SKU    | STORE   | WEEK   | Sales
    apples | atlanta | week-1 | 100 
  • Meanwhile, the data in the master is also updated:

    SKU    | STORE   | WEEK   | Sales
    apples | atlanta | week-1 | 75
    apples | atlanta | week-2 | 200 
  • When the user commits the workbook, the merge method defines which data is transferred. If merge method is FULL, then all data in the workbook is sent to the master, which includes the tuple for week-2 that has not been changed. The result would be that all changes in the master would be overwritten:

    SKU    | STORE   | WEEK   | Sales
    apples | atlanta | week-1 | 100
    apples | atlanta | week-2 | 0 

    On the other hand, if the merge method is DELTA, then only Sales at the intersection that was updated in the workbook will be transferred to the master, resulting in the following:

    SKU    | STORE   | WEEK   | Sales
    apples | atlanta | week-1 | 100
    apples | atlanta | week-2 | 200 

    Notice how the value for week-1 was replaced by the value from the workbook, while the value for week-2 was left intact.

The behavior is the same for the refresh action: FULL transfers all data from the master in to the workbook, while DELTA transfers only changes in the master.

37.1.4. Setup

In order to create a workbook, you must install the lb_web_workbooks library in your workspace and load the workbook-action handler provided by lb-web-workbooks.jar. You must then populate the predicate lb:web:workbooks:config:app_prefix[]=prefix -> string(prefix). This is usually done by augmenting the LogiQL code installed by the application with a fact for this predicate.

      lb:web:workbooks:config:app_prefix[] = "my_app". 

37.2. Creating and deleting workbooks

37.2.1.  Creating

In order to create a standalone workbook you must send to the /workbooks service a POST request with a JSON message that conforms to the Workbook message type in the workbooks.proto protobuf specification. A typical message would be

Example 37.3. 

{
   "checkout_method" : "CLONE",               // could be CLONE or BRANCH
   "merge_method" :    "FULL",                // could be FULL or DELTA
   "parent_workbook" : {
      "workspace_name" : "templates-auth",
      "clone_branch_name" : "no_data"         // name of the branch to clone
   },
   "olap_model" : {
      "measure_access" : {
        // Specifies the measure which will be populated in the workbook
        // (by default all measures):
        "measures": ["sales","return"],
        // If 'exclude' is true, the list will be interpreted as the list of
        // measures to exclude (all other measures will be included):
        "exclude": false
      },
      "position_access" : {
         "position_access" : [      // specifies the level at which we partition
            {
               "position" : {
                  "level" : {
                     "level" : "Region",
                     "hierarchy" : {
                        "dimension" : "Location",
                        "hierarchy" : "default"
                     }
                  },
                  // Specifies the list of unfiltered hierarchies:
                  "unfiltered_hierarchies": ["clustering"]
                  // Specifies the member of this level:
                  "value" : "region0"
               }
            }
         ]
      }
   },
   "name" : "wb"                                   // name of the workbook
} 

37.2.2. Deleting

You can delete workbooks by sending a DELETE request to the same service, /workbooks, providing either the id of the workbook you want to delete or all=true. Note that workbook deletion deletes the branch corresponding to a workbook, but keeps metadata information about the deleted workbook for reference purposes.

37.2.3. Querying workbooks

Sending a GET to this same service will return information about existing workbooks. By default this returns a list of all workbooks. This service takes a number of URL parameters:

  • active_only=true (or false) does not include deleted workbooks in the list.
  • template_name= ... only includes workbooks for a given template.
  • workbook_name= ... only includes workbooks with a given name.
  • workbook-id= ... only includes workbooks with a given id.
  • verbose=true (or false) returns all internal information about a workbook.
  • user_id= ... only includes workbooks reachable by the given user.

37.2.4.  Command Line

For development purposes the workbook framework provides a command line interface which will call the corresponding service. The typical usage for creating a workbook is as follows:

lb workbook create -j w.json 

where w.json is the workbook specification message.

And for deletion

lb workbook delete -w workspace --id workbook-id  

or

lb workbook delete -w workspace --name workbook-name  

Querying workbook is done with

lb workbook list -w workspace  

This command line, in addition to options corresponding to the options you can pass to the service, will parse the output and turn it into a csv for you. For instance,

lb workbook list -w workspace --csv id name template_name 

will produce 3 columns with the id, name and template names of workbooks.

37.3. Templates

Creating a workbook installs logic to filter the partition and export the data for each refresh group in the master. If you needed several workbooks partitioned at the same level, this logic would be installed for each workbook. In order to avoid such schema bloat you can create a Workbook Template to specify the partition level, and instantiate the template whenever you need a workbook for a member of this level.

37.3.1. Creating a template

Creating a template is similar to creating a workbook, except that the service you POST to is /workbook-templates and the message type is WorkbookTemplate.

JSON Example

Example 37.4. 

A typical WorkbookTemplate message contains the same information as a Workbook message, but the level member is left out.

{
   "parent_workbook" : {
      "clone_branch_name" : "no_data",
      "workspace_name" : "templates-auth"
   },
   "checkout_method" : "CLONE",
   "merge_method" :    "DELTA",
   "name" : "template",
   "position_access_templates" : {
       "position_access" : [
         {
            "level" : {
               "hierarchy" : {
                  "dimension" : "Location",
                  "hierarchy" : "default"
               },
               "level" : "Region"
            },
            // Specifies the list of unfiltered hierarchies:
            "unfiltered_hierarchies":["clustering"]
         }
      ]
   }
}

Command Line

The command line provides an easy way to call the service:

lb workbook create-template -w ws -j t.json 

where t.json is the workbook specification message.

37.3.2. Instantiation

Instantiating a template creates a workbook from an existing template. This is achieved by making a POST request to the /workbook-template-instantiation service, using the TemplateInstantiation message.

Note that the instantiation does not override any configuration from the template declaration. All workbooks instantiated from a template share the exact same configuration for different partitions at the same level.

Example JSON

Example 37.5. 

The instantiation message specifies the member of the partition level which should be used, the name of the template, and the name of the workbook to be created.

{
   "values" : [
      {
         "level" : {
            "hierarchy" : {
               "hierarchy" : "default",
               "dimension" : "Location"
            },
            "level" : "Region"
         },
         "value" : "region0"
      }
   ],
   "template_name" : "template",
   "workbook_name" : "instantiation_region0"
} 

Note that it is possible to instantiate the template for several members of this level:

{
   "values" : [
      {
         "level" : {
            "hierarchy" : {
               "hierarchy" : "default",
               "dimension" : "Location"
            },
            "level" : "Region"
         },
         "value" : "region0"
      },
      {
         "level" : {
            "hierarchy" : {
               "hierarchy" : "default",
               "dimension" : "Location"
            },
            "level" : "Region"
         },
         "value" : "region1"
      }
   ],
   "template_name" : "template",
   "workbook_name" : "instantiation_region0_region1"
} 

Command Line

The command line provides an easy way to call the service:

lb workbook instantiate -w ws -j i.json

where i.json is the template instantiation message.

37.4. Using services in workbooks

LB-web services can be hosted by workbooks, but their specification and declaration differs slightly from the specification of services in regular workspaces. Services on a workbook are declared in the master branch of the workspace, but are targeted at their own branch. The workbook framework offers some facilities to make declaration of workbooks easier:

  • lb:web:workbooks:interface:active_id(id) -> string id. provides the ids of all active workbooks.
  • lb:web:workbooks:config:service_wb_id[service] = id -> lb:web:config:service:workspace_service(service),string(id). when set, will instruct lb-web to target the workbook with the given workbook-id for the given service.

Example 37.6. 

This is a simplified version of the code the modeler uses to get one instance of the measure service running on each workbook.

 service_by_group[prefix[]+"/measure-"+id, "lb:web:public"] = x,
    protobuf_service(x) {
      service_wb_id[] = id,
      custom_handler[] = "measure-service",
      protobuf_protocol[] = "lb:web:measure_service",
    } <- active_id(id). 

37.5. Commit and refresh, usage and configuration

37.5.1. Usage

The workbook framework provides 2 services to commit or refresh a workbook: /commit and /refresh.

commit (refresh) takes a JSON message that specifies the id of the workbook to commit (refresh) and the name of the commit (refresh) group to use.

Example 37.7. 

{
  "name":"user_refresh",
  "workbook_id":"workbook_id"
}

37.5.2. Configuration

Commit and refresh groups are configured when creating the workbook or the template. By default each workbook is provided with a commit group and a refresh group named default which commit all measures and all dimensions of the OLAP model. This group is used at creation when using the CLONE method and can be overriden by using the name default when specifying a group.

Groups are specified in the Workbook or WorkbookTemplate message

Example 37.8. 

This defines a template with an extra commit and refresh group which only include the sales measure.

{
   "parent_workbook" : {
      "workspace_name" : "templates-auth",
      "clone_branch_name" : "no_data"
   },
   "position_access_templates" : {
      "position_access" : [
         {
            "level" : {
               "level" : "Region",
               "hierarchy" : {
                  "hierarchy" : "default",
                  "dimension" : "Location"
               }
            }
         }
      ]
   },
   "refresh_groups" : [
      {
         "measures" : [
            "sales"
         ],
         "include_all_level_maps" : true,
         "include_all_levels" : true,
         "name" : "sales-only"
      }
   ],
   "name" : "template",
   "checkout_method" : "CLONE",
   "commit_groups" : [
      {
         "measures" : [
            "sales"
         ],
         "include_all_level_maps" : true,
         "include_all_levels" : true,
         "name" : "sales-only"
      }
   ]
}

The commit and refresh groups are specified by an OLAPCommitRefreshGroup message. One can specify:

  • The list of measure to be included (all measures by default). If exclude_measures is true, all measures are included except those in the list.

    "measures" : [
       "sales"
    ],
    "exclude_measures" : false
  • The list of levels to be included, specified either by the same level message as in the position access, or by include_all_levels.

    "levels: [ {
          "level" : "Region",
          "hierarchy" : {
             "hierarchy" : "default",
             "dimension" : "Location"
          }],

    or

    "include_all_levels" : true.
  • The list of level_maps to be included, specified by the predicate which implements them, or by include_all_level_maps.

    "level_maps" : ["model:product:Sku_subclass"], 

    or

    "include_all_level_maps" : true,
  • The name of a block to be executed after refresh.

    "post_refresh_inactive_block" : "model:workbooks:prune",
  • A list of predicates to be omitted. This is most useful when we want to include all level maps except for just a few.

    "omitted_predicates" : ["model:product:Sku_subclass"], 
  • Policies to specify whether new entities should be added, and deleted entities deleted when including a level. These can be expressed for all levels, for a given dimension, or for a given level.

    "level_policy":[{"level":{"level":"Sku","hierarchy":{"hierarchy":"default","dimension":"Product"}},
        "policy":{
           "delete_entity":True
        }
    }],

    or

    "dimension_policy":[{"dimension_name":"Product",
        "policy":{
            "delete_entity":True
        }
    }]

    or

    "policy":{"delete_entity":True}

37.5.3. default Refresh group

The Refresh group named default is used for the initial refresh when using the CLONE method. Configuring this refresh group provides a way to configure the measures contained in a workbook.

37.6. Authorizing users

The workbook framework manages user authorization to control which user is allowed to access which workbook.

37.6.1. Adding, removing users, setting users list

The workbook framework provides 3 services for basic user management: /workbooks/addusers, /workbooks/deleteusers and /workbooks/setusers. Each acts on one workbook, and adds a list of authorized users, removes a list of authorized users, or sets the list of authorized users (which can result in both removals and additions in the same transaction). The input is a message of type AddUsersRequest, DeleteUsersRequest or SetUsersRequest.

37.6.2. Bulk Users management

Updating user permissions for several workbooks can be done in one transaction through the /workbooks/bulkusersmanagement service which takes as input a list for each type of users request.

37.6.3. Command Line

The CLI provides access to these services:

$ lb workbook add-users -h
usage: lb workbook add-users [-h] -w WORKSPACE [-u USERS [USERS ...]]
                             (--id WORKBOOK-ID | --name WORKBOOK-NAME)

optional arguments:
  -h, --help            show this help message and exit
  -w WORKSPACE, --workspace WORKSPACE
                        master workspace
  -u USERS [USERS ...], --users USERS [USERS ...]
                        users list
  --id WORKBOOK-ID      workbook id
  --name WORKBOOK-NAME  workbook name
$ lb workbook delete-users -h
usage: lb workbook delete-users [-h] -w WORKSPACE [-u USERS [USERS ...]]
                                (--id WORKBOOK-ID | --name WORKBOOK-NAME)

optional arguments:
  -h, --help            show this help message and exit
  -w WORKSPACE, --workspace WORKSPACE
                        master workspace
  -u USERS [USERS ...], --users USERS [USERS ...]
                        users list
  --id WORKBOOK-ID      workbook id

37.7. Building workbooks in workflows

The lb-workflow library provides a number of tasks meant to handle creation, deletion and management of workbooks in a batch context. Those tasks are documented in detail in the lb-workflow API documentation.

One important caveat is that the workflow lb.wb.CreateWorkbook is actually not creating a workbook from scratch but instantiating a template.

37.8.  Generating template instantiation data with workbook util

Generating template instantiation data can be challenging, as a separate JSON instantiation message is required for each position for which a workbook has to be created. To simplify this task, the workbook framework provides services to query template_instantiation entities (which can be generated from the application model) and obtain the data needed to produce the workbook instantiation messages, as well as workflows to use them.

37.8.1. Util model

The template_instantiation entities are entities modeling template instantiation and are computed from the business model.

For each instantiation of a template, it is mandatory to specify an id (template_instantiation_id), a name (workbook_name) and the name of the template from which the workbook should be instantiated (template_name). Additionally, it is possible to specify the list of authorized users (using the workbook_access predicate). The workbook_position relation can be used to specify the level members that are to be contained in the workbook.

The workbook framework provides a service which queries those entities for a given template_name, as well as services that query the position identifier and the list of users for a given template_instantiation entity: these are needed to create the workbook and assign users.

Below you can find the definition of the template_instantiation entity and all the related predicates, which are included in the lb:web:workbooks:util:model module. In the next section you can find an example of how these predicates can be populated from the business logic.

export(`{
  /**
   * This is the abstract entity that models a Template Instantiation.
   */
  template_instantiation(_) -> .

  /*
   * These 3 predicates should have a value for every template_instantiation:
   */
  template_instantiation_id[ti] = id -> string(id), template_instantiation(ti).
  workbook_name[ti] = name -> string(name), template_instantiation(ti).
  workbook_template_name[ti] = type -> template_instantiation(ti), string(type).

  /*
   * This specifies which user will be authorized to access the workbook
   * generated from this template_instantiation:
   */
  workbook_access(ti, user_id) -> template_instantiation(ti), string(user_id).

  /*
   * This specifies at which position the Template will be instantiated.
   * A template can access several positions, so each value must be
   * associated with the level of which it is a member.
   */
  workbook_position(ti, level, value) ->
     template_instantiation(ti), level(level), string(value).

  /*
   * This provides a way to tag a template_instantiation entity.
   * The tags can be used to filter queries through the template_instantiation
   * service.
   */
  tag(ti, tag) -> template_instantiation(ti), string(tag).


  /*
   * This entity represents a level of the measure model.
   * It is used to specify which level a value is for.
   */
  level(_) -> .
  level_name[level] = name -> level(level), string(name).
  level_dimension[level] = dim -> level(level), string(dim).
  level_hierarchy[level] = h -> level(level), string(h).

}),
sealed(`{
  /*
   * This will be populated when workbook 'id' is instantiated with
   * instantiation information 'ti'.
   */
  workbook_id(ti, id) -> template_instantiation(ti), string(id).
}), 

37.8.2.  Example of generated template_instantiation entities

Example 37.9. 

In this example we generate a template instantiation for each existing Product:Class in the model, compute the workbook_name from the class label, and setup the value for the class level, then assign users to each of those instantiations.

block(`workbook_config){

  alias(`lb:web:workbooks:util:model, `wb_model),

  export(`{
    class_instantiation[class] = ti ->
      Product:Class(class),
      wb_model:template_instantiation(ti).
  }),
  sealed(`{
    class_level[] = l -> wb_model:level(l).
  }),
  clauses(`{
    /**
    * Constructing instantiations for workbooks partitioned by Product:Class.
    */
    lang:constructor(`class_instantiation).

    // We create an entity for each existing class.
    class_instantiation[class] = ti,
    wb_model:template_instantiation(ti),
    wb_model:workbook_template_name[ti] = "class_template",
    wb_model:workbook_name[ti] = "Class workbook with class " + name,
    wb_model:template_instantiation_id[ti] = "ti_" + name <-
      Product:Class:label[class] = name.

    // All users of the application are given access to all instantiations.
    wb_model:workbook_access(ti, user_id) <-
      class_instantiation[_] = ti,
      system:app:User:name[_] = user_id.

    // We define here the entity that represents the Class level in the measure
    // model. We then use it to define at which position the template should
    // be instantiated.
    lang:constructor(`class_level).

    level(l),
    class_level[] = l,
    wb_model:level_name[l] = "Class",
    wb_model:level_hierarchy[l] = "Default",
    wb_model:level_dimension[l] = "Product".

    wb_model:workbook_position(ti, class_level[], name) <-
      class_instantiation[class] = ti,
      Product:Class:id[class] = name.
  })
} <-- .

37.8.3. Workbook util services

The workbook framework provides the following services for template instantiation:

  • /[app_prefix]/workbooks/util/template_instantiation: This queries the template_instantiation entities for a given template name.
  • /[app_prefix]/workbooks/util/positions: This queries the position value(s) for a given instantiation.
  • /[app_prefix]/workbooks/util/permission: This queries the list of users for a given template_instantiation.

37.8.4. Workflows for template instantiation

lb-workflow provides a number of workflows that use the template instantiation entities and the associated services described earlier. Please refer to the lb-workflow API documentation for details. An example of how they can be used to instantiate the template from our earlier example can be found below.

Example 37.10. 

In this example, we define a workflow workbooks.create_class_wb to instantiate the class_template, for which we defined the template instantiation entity earlier. The workflow first queries the template instantiation entity for the specified template and then calls the lb.wb.util.CreateWorkbook workflow, to create the workbooks and assign users to it.

workflow workbooks.create_class_wb(app_prefix) [template_instantiation* ] {
	($template_instantiation) <~ lb.wb.util.GetTemplateInstantiationRefByTemplate(
       app_prefix = $app_prefix,
       template_name = "class_template")
     ;
     forall<max=3>(ti in $template_instantiation)
       lb.wb.util.CreateWorkbook(
         app_prefix = $app_prefix,
         template_instantiation = $ti)
 }

37.8.5. Tagged template_instantiation

In some cases it is useful to query only a subset of all the possible instantiations of a template. The easiest way to do this is to add logic to set a tag on some template_instantiation and use this tag to query the pertinent subset when needed.

Example 37.11. 

The following rule tags the template instantiations that have not been instantiated at the time of the query.

      m:tag(ti, "no_instantiation") <-
        region_instantiation[_] = ti,
        !m:workbook_id(ti, _). 

The template instantiations can now be queried by adding a tag field to the JSON query message, or by using workflows with the tagged version of the query workflow (lb.wb.util.GetTaggedTemplateInstantiationRefByTemplate).