Chapter 40. Extract Example

Typically, an application has a complicated and lengthy build process, along with a possibly long data load, which makes investigating the performance of one or a few rules difficult. One technique for debugging or analyzing such a set of rules is to create a simple workspace that includes only the subset of data needed to run them, and then run the rules in isolation. Accomplishing this requires exporting all the direct dependencies of the rules into CSV files, and importing them into a workspace that contains declarations for only those direct dependencies. LogicBlox releases from 4.4 onward include an extract-example command as part of the lb utility that helps automate this process.

Producing an isolated, runnable example from an existing workspace requires extracting a set of LogiQL rules, along with the data needed to trigger them. The rules to extract are found by collecting all derivation rules for a specified IDB predicate. A transitive option recursively finds the derivation rules of any IDB predicates those rules depend on and adds them to the example rule set.

Additionally, sometimes it is desirable to investigate rule performance during incremental updates. In these cases, the desired extracted example should have some initial data load of the base predicates, followed by one or more transactions to load the updates under investigation. Producing these updates is often difficult too, so this command provides a method to instrument the original workspace to capture these updates.

40.1. Command Overview

The extract-example command is run against a source workspace and writes a set of data files, LogiQL files, and shell scripts into an output directory. The generated files are self-sufficient and can be used to create an example workspace for testing or benchmarking. When using options for incremental updates, the output directory contains files necessary to instrument the source workspace, to export the monitored updates from the source workspace, and to import these updates in the example workspace.

A shell script called example.sh in the output directory uses the lb command to simplify building and loading data into the example workspace, as well as instrumenting the source workspace with data monitoring rules and exporting transaction data sets from the source workspace.

40.2. Command Options

The following options can be used to control the behavior of the lb extract-example command.


--outdir dir-path

Directory to store all generated shell scripts, LogiQL files, and data files. Any existing files in the directory may be overwritten. The directory will be created if necessary.

--predicate predicate_name

This required parameter names the predicate whose derivation rules should be used as the basis for the example. The derivation rules will be extracted to a separate LogiQL file, allowing installation in the example workspace.

--transitive

Recursively include all IDB rules used in the derivation rules of the specified predicate, as if they were also specified.

--monitor

Generates LogiQL files containing delta rules that can capture data changes to the source predicates used in the example, as well as LogiQL rules to export these updates from the source workspace and import them into the example workspace.

--defun predicate_name

If this option is used and the specified predicate is functional, it will be translated to a non-functional predicate in the extracted example. This is useful when trying to analyze functional dependency errors.
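
For instance, given the functional declaration below, the translated non-functional form might look as follows. This is an illustrative sketch in ordinary LogiQL syntax, not the tool's actual generated output:

```
// Original functional declaration: at most one value v per key sk,
// so two conflicting derivations raise a functional dependency error.
net[sk] = v -> sku(sk), int(v).

// Non-functional translation: multiple values per key are permitted,
// so conflicting derivations show up as extra rows that can be
// inspected with lb print instead of aborting the transaction.
net(sk, v) -> sku(sk), int(v).
```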

40.3. Running the Extracted Example

The example.sh shell script in the output directory takes one of the arguments listed below to perform operations on the source or example workspaces.

build

Build the example workspace. Does not import data. Will add installed LogiQL rules if the extracted example was based on database lifetime rules.

import

Import the base data into the example workspace.

export_source

Export base data from the source workspace.

query

Run the extracted rules as a query against the example workspace. This assumes the extracted rules were transaction lifetime rules.

instrument

Instrument the source workspace with rules that will capture relevant data changes for the example.

export_transactions

Export the captured data changes from the source workspace.

import_transactions

Import the exported transaction data files into the example workspace.

The specified output directory contains the following files and directories:

data/

Contains exported base data, and possibly updates related to the monitor option.

model.logic

The core schema for the example, not including the rules being analyzed.

rules.logic

The LogiQL rules being tested: the derivation rules for the specified predicate, along with any transitively included rules.

export.logic

LogiQL rules to export base data from the source workspace.

import.logic

LogiQL rules to import base data into the example workspace.

instrument.logic

Delta rules that will capture changes to relevant predicates in the source workspace.

export-transactions.logic

LogiQL rules to export all data changes from the source workspace captured by the instrumentation rules.

import-transactions.logic

LogiQL rules to import captured data changes into the example workspace.
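
As a rough sketch of what the instrumentation does (the predicate name sales_log and the exact shape of the generated rules are assumptions, not the actual generated code), a delta rule in instrument.logic that captures upserts to a sales predicate might resemble:

```
// Hypothetical monitoring rule: whenever sales[sk] is upserted to v
// during a transaction, record the change in a log predicate so that
// export-transactions.logic can later write it out as CSV.
+sales_log[sk] = v <- ^sales[sk] = v.
```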

40.4. Tutorials

Below are some detailed examples of the lb extract-example command.

40.4.1. Simple Net Sales

Create a Test Workspace

To experiment with the extract-example command, first create a small test workspace by running the following commands:

$ lb create net_ws --overwrite
$ lb addblock -f schema.logic net_ws
$ lb addblock -f rules.logic net_ws
$ lb exec -f data.logic net_ws

where schema.logic contains

sku(sk), sku:id(sk:s) -> string(s).
sales[sk] = v -> sku(sk), int(v).
returns[sk] = v -> sku(sk), int(v).
net[sk] = v -> sku(sk), int(v).

and rules.logic contains

net[sk] = sales[sk] - returns[sk].

and data.logic contains

+sku(sk), +sku:id(sk:sid), ^sales[sk] = s, ^returns[sk] = r <-
   sid="sku1", s=10, r=1 ;
   sid="sku2", s=20, r=2 ;
   sid="sku3", s=30, r=3.

This will create a simple workspace with one rule computing net sales from sales and returns. The workspace will have a few data points that you can see with the lb print command.

$ lb print net_ws sales
[10000000004] "sku2" 20
[10000000005] "sku1" 10
[10000000007] "sku3" 30

$ lb print net_ws returns
[10000000004] "sku2" 2
[10000000005] "sku1" 1
[10000000007] "sku3" 3

$ lb print net_ws net
[10000000004] "sku2" 18
[10000000005] "sku1" 9
[10000000007] "sku3" 27

Extract Rules and Data

Run the following command to extract the net rule and associated data from the test workspace.

$ lb extract-example net_ws --predicate net --outdir net_sales_example

If the command succeeds, it will produce a set of files and directories in a local directory called net_sales_example. Descriptions of the produced files are above.

The lb extract-example command currently doesn't do much error checking. If you see that all of the above files are empty (notably if the rules.logic file is empty), the most likely cause is that the predicate name used with the --predicate argument does not exist in the workspace or the predicate specified is not an IDB predicate (is not calculated via an installed LogiQL rule).

Running the Extracted Example

One of the files produced by lb extract-example is a shell script called example.sh. This script can be used to create a new workspace containing schema and rules extracted from the original workspace and can also be used to load extracted test data into the new workspace. Details are discussed above.

The example.sh script currently assumes it is being executed from the directory containing the script. It also provides no option for specifying the name of the workspace it creates; changing the name requires manually editing the file to change the DEST_WORKSPACE variable. By default, the workspace it creates is called example.
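
Until the script accepts a workspace name, one workaround is to patch the variable with sed. The snippet below is a sketch: example_stub.sh stands in for the generated script, since only the DEST_WORKSPACE line matters here and the rest of the real script's contents differ.

```shell
# Create a stand-in for the generated example.sh; the real script sets
# DEST_WORKSPACE=example near the top.
cat > example_stub.sh <<'EOF'
DEST_WORKSPACE=example
echo "building workspace: $DEST_WORKSPACE"
EOF

# Rewrite the variable so the script targets my_test_ws instead.
sed -i 's/^DEST_WORKSPACE=.*/DEST_WORKSPACE=my_test_ws/' example_stub.sh

sh example_stub.sh   # prints: building workspace: my_test_ws
```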

Run the following sequence of commands to export data from the source workspace, build a new example workspace (overwriting if it existed previously), load data into the new example workspace, and print predicates to verify the data and rules loaded correctly.

$ cd net_sales_example
$ ./example.sh export_source
$ ./example.sh build
$ ./example.sh import

$ lb print example sales
[10000000004] "sku2" 20
[10000000005] "sku1" 10
[10000000007] "sku3" 30

$ lb print example returns
[10000000004] "sku2" 2
[10000000005] "sku1" 1
[10000000007] "sku3" 3

$ lb print example net
[10000000004] "sku2" 18
[10000000005] "sku1" 9
[10000000007] "sku3" 27

40.4.2. Transitive Net Sales

The extract-example command will find and extract the rules used to compute some predicate. By default, all the inputs to those calculations are treated as EDB input predicates in the resulting extracted files. If some of the input predicates in the original workspace are themselves calculated (i.e. IDB), you can transitively extract all their rules along with the rules of the originally specified predicate by using the --transitive option to the command.

To see an example of this, change the schema.logic file from the Simple Net Sales Example above to contain

sku(sk), sku:id(sk:s) -> string(s).
sales[sk] = v -> sku(sk), int(v).
returns[sk] = v -> sku(sk), int(v).
net[sk] = v -> sku(sk), int(v).
net_p1[sk] = v -> sku(sk), int(v).

and change rules.logic to contain

net[sk] = sales[sk] - returns[sk].
net_p1[sk] = net[sk] + 1.

Note the addition of the declaration and rule to compute net_p1 from the net predicate.

Create a workspace with these schema changes and load some test data.

$ lb create net_ws --overwrite
$ lb addblock -f schema.logic net_ws
$ lb addblock -f rules.logic net_ws
$ lb exec -f data.logic net_ws

Now transitively extract all the rules needed to compute the new net_p1 predicate.

$ lb extract-example net_ws --predicate net_p1 --outdir net_sales_trans --transitive
$ cd net_sales_trans
$ ./example.sh export_source
$ cd ..

Inspect the generated net_sales_trans/rules.logic file and you will see both the rule to compute the net predicate and the rule to compute the net_p1 predicate. The net_sales_trans/data/ directory will contain CSV files for returns and sales data that are the initial inputs to this chain of rules.
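
Concretely, the transitively extracted net_sales_trans/rules.logic should contain both rules from the original workspace:

```
net[sk] = sales[sk] - returns[sk].
net_p1[sk] = net[sk] + 1.
```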

If you run the extraction without the --transitive option

$ lb extract-example net_ws --predicate net_p1 --outdir net_sales_no_trans
$ cd net_sales_no_trans
$ ./example.sh export_source
$ cd ..

you will see that the generated net_sales_no_trans/rules.logic file contains only the rule for the net_p1 predicate and the net_sales_no_trans/data/ directory contains a CSV file to load the net predicate instead of sales and returns. In this case, the calculated net predicate in the original workspace is converted to a non-computed EDB predicate in the extracted example.
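
That is, net_sales_no_trans/rules.logic should contain only the single rule, with net now acting as an input:

```
net_p1[sk] = net[sk] + 1.
```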

40.4.3. Incremental Data Capture

The extract-example command can be used to augment an existing workspace to capture and export sets of data edits that can later be reloaded into test workspaces. Building on the Simple Net Sales example from above and using the same files, run the following commands to create a new workspace:

$ lb create net_ws --overwrite
$ lb addblock -f schema.logic net_ws
$ lb addblock -f rules.logic net_ws
$ lb exec -f data.logic net_ws

This will create a workspace with a few initial rules and a bit of data. Next you need to use the --monitor option of the extract-example command to generate an initial set of rule files and scripts from the workspace.

$ lb extract-example net_ws --predicate net --outdir net_sales_inc_data --monitor

You should see the same rule files and shell scripts generated as before. The net_sales_inc_data/data/ directory, however, will be empty and you will see a new net_sales_inc_data/instrument.logic file that can be added to an existing source workspace to track changes that occur to input predicates for the extracted rules. Add these rules to your workspace by running:

$ cd net_sales_inc_data
$ ./example.sh instrument

Note again that the example.sh script must be executed in the directory containing the script for it to find the necessary files.

Once the workspace has been instrumented, you can use the application, run batch scripts, etc. to cause data changes to the workspace. Any changes to input predicates for the extracted rules will be recorded. For this tutorial, execute something like the following to add a couple of new sales and returns data points.

$ lb exec net_ws '+sku(sk), +sku:id(sk:sid), ^sales[sk] = s, ^returns[sk] = r <- sid="sku11", s=110, r=11.'
$ lb exec net_ws '+sku(sk), +sku:id(sk:sid), ^sales[sk] = s, ^returns[sk] = r <- sid="sku22", s=220, r=22 ; sid="sku33", s=330, r=33.'

Next use the example.sh script generated by the extract-example command to export the data changes.

$ ./example.sh export_transactions

This will produce a new set of CSV files in the data/ directory. These files contain data changes keyed by a transaction ID so that batches of changes made to the original workspace can be loaded in the same manner in which they were created.

To create a new testing workspace containing only the extracted rules and add the monitored data sets to it:

$ touch data/sku.csv
$ ./example.sh build
$ ./example.sh import

$ lb print example sales
[10000000004] "sku22" 220
[10000000005] "sku11" 110
[10000000007] "sku33" 330

$ lb print example returns
[10000000004] "sku22" 22
[10000000005] "sku11" 11
[10000000007] "sku33" 33

$ lb print example net
[10000000004] "sku22" 198
[10000000005] "sku11" 99
[10000000007] "sku33" 297

Note that the touch data/sku.csv command should not be necessary; however, if you have not previously created a sku.csv file (for example by using the export_source command), the import command will fail. This is a bug that needs to be fixed in the export_transactions command.

You should see that the testing workspace contains only data that was created after instrumentation was added to the original workspace.