TL;DR
Although LB doesn’t yet have a built-in facility for browsing data in a workspace, you aren’t limited to just using lb print to dump the contents of individual predicates. You can use commands like
lb exec WORKSPACE --print _ --exclude-ids --csv -f some_file.logic
to run the same complex queries over and over, easily load the results into a spreadsheet, and share queries with your friends. This approach is much easier than using lb print to dump predicates, and is simpler than building TDX services to export data, or introducing maintained logic into a project just for doing ad hoc investigation.
Introduction
This note describes some tools for doing ad hoc querying of an LB workspace. This will allow you to view the data in a workspace without writing a full UI application, or even TDX services. This can be useful as you are developing a system, or when you need to troubleshoot problems.
The LB command line tool contains several tools for retrieving data from an LB workspace. You can run each of these with the –help option to get more information:
- lb list : list all the predicates in a workspace
- lb predinfo : print the structure of a single predicate
- lb print : print the contents of a single predicate
- lb query : execute a single query against a workspace
- lb exec : execute LogiQL statements, which can be a simple as a single query.
Often just printing the contents of a single predicate does not provide enough information – you need to combine information from multiple predicates with queries. Both lb exec and lb query can be used to execute a query and return the results. Since lb exec has a couple options that lb query doesn’t have, let’s focus only on lb exec for now.
If you will need to run the same query more than once, it is a good practice to save the query in a file. lb exec has an option to execute logic from a file. Sharing such query files within a project team can be helpful, allowing others to get a better understanding of the overall data model as well as help with troubleshooting. You may want to accumulate a set of such files over the course of a project, and keep them in the project code repository so the entire team has access to them. Make sure to include comments in the query files so others understand their purpose.
Here are the options of lb exec that are most relevant for our purposes:
$ lb exec --help
usage: lb exec [-f FILE] [--raw] [--exclude-ids]
[--csv] [--print [PRED]]
WORKSPACE
positional arguments:
WORKSPACE name of workspace
optional arguments:
-f FILE, --file FILE logic file to execute
--raw print query result without escaping
--exclude-ids output only refmode values of entities
--csv print query result to csv files
--print [PRED] print local predicate PRED. Default '_'
Setup
Here is the query file used in the following examples.
$ cat collect_vendors.query
// lb query bn -f collect_vendors.query
_( v, vendor_id, name, city, state ) <-
vendor:vendor_id[v] = vendor_id,
vendor:name[v] = name,
vendor:city[v] = city,
vendor:state[v] = state,
vendor:collect_vendors(v) // Prefix this line with ! to get the opposite set of vendors.
As you might guess, the data model looks like
vendor(v), vendor_id(v:id) -> string(id).
name[v] = s -> vendor(v), string(s).
city[v] = c -> vendor(v), string(c).
state[v] = s -> vendor(v), string(s).
collect_vendors(v) -> vendor(v). // A list of specific vendors.
Examples
Here’s the simplest case, execute a query with the results going to stdout.
$ lb exec bn --print _ -f collect_vendors.query
/--------------- _ ---------------\
[10000089034] "1089655" "1089655" "VENDOR A" "CAMBRIDGE" "MA"
[10000089275] "1126036" "1126036" "VENDOR B" "PORTLAND" "OR"
[10000089287] "1126432" "1126432" "VENDOR C" "ATLANTA" "GA"
[10000089798] "12079" "12079" "VENDOR D" "AUSTIN" "TX"
[10000089824] "121219" "121219" "VENDOR E" "COLUMBUS" "MO"
\--------------- _ ---------------/
Notice the header and footer. Although they are not very helpful for these particular examples, they are helpful as separators if the results contain multiple predicates. Note that the result predicate name, _, is embedded in the separators.
Add –raw to get rid of the quotes around each field.
$ lb exec bn --print _ --raw -f collect_vendors.query
/--------------- _ ---------------\
[10000089034] 1089655 1089655 VENDOR A CAMBRIDGE MA
[10000089275] 1126036 1126036 VENDOR B PORTLAND OR
[10000089287] 1126432 1126432 VENDOR C ATLANTA GA
[10000089798] 12079 12079 VENDOR D AUSTIN TX
[10000089824] 121219 121219 VENDOR E COLUMBUS MO
\--------------- _ ---------------/
Add –exclude-ids to get drop entity IDs, and only print their refmode values:
$ lb exec bn --print _ --raw --exclude-ids -f collect_vendors.query
/--------------- _ ---------------\
1089655 1089655 VENDOR A CAMBRIDGE MA
1126036 1126036 VENDOR B PORTLAND OR
1126432 1126432 VENDOR C ATLANTA GA
12079 12079 VENDOR D AUSTIN TX
121219 121219 VENDOR E COLUMBUS MO
\--------------- _ ---------------/
Add –csv to get the result written to a CSV text file, without the dashed header and footer row.
$ lb exec bn --print _ --raw --exclude-ids --csv -f collect_vendors.query
/--------------- _ ---------------\
Output written to _.csv
\--------------- _ ---------------/
$ cat _.csv
1089655,1089655,VENDOR A ,CAMBRIDGE ,MA
1126036,1126036,VENDOR B ,PORTLAND ,OR
1126432,1126432,VENDOR C ,ATLANTA ,GA
12079 ,12079 ,VENDOR D ,AUSTIN ,TX
121219 ,121219 ,VENDOR E ,COLUMBUS ,MO
$
If you need to load your results into a spreadsheet your best bet may be to use csv mode but exclude IDs. (I’m not sure how quotes within a data value are handled.)
$ lb exec bn --print _ --exclude-ids --csv -f collect_vendors.query
/--------------- _ ---------------\
Output written to _.csv
\--------------- _ ---------------/
$ cat _.csv
"1089655","1089655","VENDOR A" ,"CAMBRIDGE" ,"MA"
"1126036","1126036","VENDOR B" ,"PORTLAND" ,"OR"
"1126432","1126432","VENDOR C" ,"ATLANTA" ,"GA"
"12079" ,"12079" ,"VENDOR D" ,"AUSTIN" ,"TX"
"121219" ,"121219" ,"VENDOR E" ,"COLUMBUS" ,"MO"
$
Summary
Saving queries in files, and using lb exec, can be a big time saver, and reduce duplicate work across a project team. You may even want to bundle relevant queries in deployments, at least during development and testing phases of a project, so that useful queries are available when troubleshooting on remote servers.