Chapter 4. Monitoring and Intervention

4.1. Review Current Executing Transactions

Transactions are generally either waiting in a queue for execution or actively being processed. LogicBlox processes read-only transactions in parallel, using all available cores on the machine, while read-write transactions are serialized.

To list transactions that are currently being processed for a specific workspace, such as testworkspace, use:

lb status testworkspace --active

To also list all requested transactions that are currently waiting in a queue, use:

lb status testworkspace --all

Note that you must specify the workspace for which the detailed status is requested. To list all transactions that are in progress or queued for any workspace, use for example:

lb workspaces | xargs -n 1 lb status --all
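
A small shell loop along the same lines can make the output easier to read by printing each workspace name before its status report. This is an illustrative sketch that relies only on the lb workspaces and lb status commands shown above:

for ws in $(lb workspaces); do
    echo "=== $ws ==="
    lb status "$ws" --all
done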

Additional information, such as how many workspaces are currently open in the lb-server, can be obtained by adding --debug to the status command. Note that the information provided by --debug might change in future releases.

4.2. Logfiles

In the following, we describe logfiles and their default locations for installations that are managed via lb services.

LogicBlox daemons create logfiles in $LB_DEPLOYMENT_HOME/logs. In general, these files contain valuable information for troubleshooting the LogicBlox system. The following gives a brief overview of the individual logfiles and the content they provide.

lb-compiler

The lb-compiler.log logfile contains a status entry once the compiler has started, indicating the TCP ports on which the compiler is listening for connections. Additional entries indicate unexpected problems that should be addressed.

lb-server

The lb-server.log logfile contains information according to the configured log-level. During start-up, all configuration settings and significant environment variables are printed. At log-level info, only major database events such as workspace creation or deletion are logged. At log-level perf, all activity of the lb-server is logged: received requests; started and finished transactions; evaluation of meta-rules that manage the schema-level information stored in workspaces; evaluation of regular LogiQL rules; and any other activities. Details about fixpoint evaluations and optimizer decisions are also logged at log-level perf.
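
For example, lb-server activity can be watched live while a transaction runs by following the logfile with standard tools. The path below assumes the default layout under $LB_DEPLOYMENT_HOME/logs; adjust it to your installation:

$ tail -f $LB_DEPLOYMENT_HOME/logs/current/lb-server.log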

Tip

Individual requests can be executed at a log-level that differs from the default log-level for the lb-server. To do so, use the command-line option --loglevel for lb commands. The log for such requests can then also be returned to the lb command by using the option --log.
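
For example, a single block might be added at the perf log-level and its log printed back to the client as sketched below. The flag placement and the block itself are purely illustrative and may differ per release:

$ lb addblock testworkspace 'y(i) <- int:range(1,100,1,i).' --loglevel perf --log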

lb-web

The lb-web component creates multiple logfiles: lb-web-server.log and files such as lb-web-server-access-2014_11_18.log.

The former contains information about the general status of the lb-web server. During start-up, all workspaces are scanned to gather the services defined within them; the discovered services are then summarized in the logfile. Additional information in this logfile should be sparse and is generally indicative of an error condition.

The access logfiles, for example lb-web-server-access-2014_11_18.log, record accesses to the lb-web server. Each request from a client is logged using the standard Combined Log Format, a widely used extension of the Common Log Format (NCSA Common log format). Entries in this log format look like:

127.0.0.1 -  -  [18/Nov/2014:17:10:35 -0800] "POST /lb-web/admin HTTP/1.1" 200 132 "-" "Python-urllib/2.7" 12

Each field is explained below:

127.0.0.1

The IP address of the client.

-

Here, the RFC 1413 identity of the client would be written. The hyphen indicates that this information is not available.

-

Here, the user id of the program sending the request is written, if available. As with the previous field, this information is usually not available and, moreover, not reliable.

[18/Nov/2014:17:10:35 -0800]

The date and time when the server finished processing the request.

"POST /lb-web/admin HTTP/1.1"

The request line of the client program, containing the method (e.g., GET or POST), the requested URL, and the protocol version.

200

The status code sent back to the client. The code is one of the standard HTTP status codes; the full list is in RFC 2616, section 10. Success is indicated by a code starting with 2, a redirection starts with 3, an error caused by the client with 4, and an error caused by the server with 5.

132

The size of the response sent to the client, in bytes.

"Python-urllib/2.7"

The string by which the client identifies itself. Here, the client was a Python program.

12

The last field logs the latency of the request in milliseconds. Here, it took 12 milliseconds from receiving the request to sending the response.

Tip

Standard log-analyzing tools such as GoAccess or AWStats can be used to analyze lb-web access logfiles since they are in the standard Combined Log Format.
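
For quick ad-hoc checks without a dedicated analyzer, standard shell tools work as well. The sketch below assumes the whitespace-separated field layout shown above, where the status code is the 9th field and the latency in milliseconds is the last field:

# count responses per HTTP status code
$ awk '{ print $9 }' lb-web-server-access-2014_11_18.log | sort | uniq -c

# show requests that took longer than one second
$ awk '$NF > 1000' lb-web-server-access-2014_11_18.log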

4.3. Performance Monitoring

Performance can be monitored at the lb-server or lb-web level. Database transactions are best monitored by following the lb-server logfile at the perf log-level. Web-service performance is best monitored by following the access logfile of lb-web; the last column in the lb-web access log is the request duration in milliseconds.

By default, the database logs information in the lb-server log about rules that take more than 10 seconds to evaluate. This threshold can be changed using the LB_MONITOR_RULE_TIME environment variable or the lb-server configuration option logging.monitor_rule_time. The value is expected in milliseconds.
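
For example, to lower the threshold to 2 seconds, the environment variable could be set in the environment from which the lb-server is started (the specific value is chosen purely for illustration):

$ export LB_MONITOR_RULE_TIME=2000   # value in milliseconds; 2000 = 2 seconds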

4.4. Resource Consumption

4.4.1. Disk Usage of Workspaces

LogicBlox workspaces are stored on disk within the workspace directory, usually within $LB_DEPLOYMENT_HOME/workspaces. Workspace names are encoded there to allow special symbols to occur in them. Use lb filepath testworkspace to obtain the specific directory in which workspace testworkspace is stored. For an overview of how much space each workspace occupies, use for example the following bash snippet:

$ for ws in `lb workspaces`; do echo -n "$ws : "; du -sh `lb filepath "$ws"`; done
w2 : 1.7M	/home/q/lb_deployment/workspaces/dzI
ws : 52M	/home/q/lb_deployment/workspaces/d3M
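
A variation of the same loop sorts the workspaces by size, which helps to quickly spot the largest ones. This is an illustrative sketch that relies only on the lb workspaces and lb filepath commands used above:

for ws in $(lb workspaces); do
    echo "$(du -sh "$(lb filepath "$ws")" | cut -f1) $ws"
done | sort -h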

It is also possible to obtain fine-grained information about how disk storage is used within a single workspace. Unfortunately, this currently requires exclusive access to the workspace. It is thus necessary to either export the workspace and investigate the copy, or to shut down the lb-server while performing the investigation.

For example, given a workspace stored in the directory /tmp/ws that is not concurrently accessed by the lb-server, use the following dlbatch commands to investigate which components contribute to the workspace size:

$ dlbatch
<blox> open /tmp/ws
<blox> profileDiskSpace -S -V

This will print a report similar to the following:

<blox> open /tmp/ws
<blox> profileDiskSpace -S -V
Visited 1609 pages (52723712 bytes)

          Pages           Bytes Thing
--------------------------------------------------------------------------------
              5         163,840 refcounts
             68       2,228,224 x{block_1Z2F4REW:1(1)--1(32)#1Z2FCC2R}#0: (int)
             84       2,752,512 x$sip: (int)
            663      21,725,184 x{block_1Z2OWOZV:1(1)--1(33)#1Z2P7OYG}#0: (int)
            731      23,953,408 x: (int)
          1,546      50,659,328 oodb::TreeBase
          1,586      51,970,048 objects
          1,609      52,723,712 resource

          Pages           Bytes   Removed pages       New pages Version
--------------------------------------------------------------------------------
              0               0               0               0 2014-11-20 01:39:54,870397+00:00
             78       2,555,904               0              78 2014-11-20 01:40:19,385577+00:00
          1,468      48,103,424              78           1,468 2014-11-20 01:40:36,059848+00:00
<blox> 

To better understand the report, it is useful to know how the database was created:

$ lb create ws
created workspace 'ws'
$ lb addblock ws 'x(i) <- int:range(1,1000000,1,i).'
added block 'block_1Z2F4REW'
$ lb addblock ws 'x(i) <- int:range(1,10000000,1,i).'
added block 'block_1Z2OWOZV'
$ lb export-workspace ws /tmp/ws
     52,756,480 bytes received
    377,805,823 bytes/s
     86,376,448 pagefile size
     33,619,968 unused space
0.412384s (0.139639s transmit, 0.272745s fsync)
exported workspace 'ws' to directory '/tmp/ws'

We see that the database stores three versions of itself: the empty database just after creation, the database after the first block block_1Z2F4REW was added, and the database after adding the second block.

Note

LogicBlox keeps some old versions of the database to allow investigation of the history or even reverting to an older state. There is a maximum number of such "automatic" backups that are maintained. Once this number has been reached, older versions are removed such that backups are denser for the recent history and sparser further back. For example, the system might keep 10 of the 64 most recent versions, 20 of the 128 most recent versions, 30 of the 256 most recent versions, and so on.

Returning to the disk profile, we see that 2MB are used for the predicate x in the first block we added, while around 22MB are used for the predicate x in the second block. These two predicates are then merged into the single predicate x, using 24MB of space. The entries oodb::TreeBase, objects, and resource count the same data in a different way; they should be ignored in the analysis. The total size of this workspace is around 51MB. Predicates that have $sip in their name are "sampling and indexing" predicates used internally by the LogicBlox engine.

4.4.2. Memory Usage of Daemons

Standard tools such as ps, top, or htop can be used to investigate and monitor the memory usage of LogicBlox daemons. This section discusses the most important information relating to memory usage of LogicBlox daemons. A more detailed introduction to memory analysis on Linux systems in general is available in Chapter 9, Linux Memory Analysis.

Of the three daemons (lb-compiler, lb-web-server, lb-server), the only interesting one is the lb-server. The lb-compiler and lb-web-server processes are both Java processes with a configured maximum heap size that normally does not need to be changed. The memory usage of these processes is independent of the size of the database. The lb-server is the daemon that has the database open, and the memory configuration of this process is important for the performance of the application. For non-trivial databases, the bulk of a system's resident memory should be used by the lb-server.

The resident memory usage of the lb-server consists of:

  • The size of the buffer pool, which is an in-memory cache of pages of the database. Virtually every database system has some kind of buffer pool (sometimes called a buffer cache). As with other databases, the buffer pool should be as large as possible, because a larger pool means the system has to go to disk less frequently to obtain data. The size of the buffer pool should only be constrained by the memory needs of other daemons running on the server. The LB_MEM environment variable configures the size of the buffer pool (see Section 3.3.2, “Configuration Variables”).

  • The memory used by the database engine for the database schema, active logic, and query evaluation. Normally, this is a fairly limited amount of memory, but for databases with an extremely large schema (thousands of predicates and rules), it can amount to a few gigabytes.

Generally, the resident memory usage of the lb-server process should be on the order of the size of the buffer pool. For extremely large schemas, the buffer pool size might have to be reduced to leave enough memory for the non-paged resident memory of the database engine and other daemons.

The total resident memory of the lb-server can be determined using standard Linux tools. For example:

$ top -b -n 1 -p `cat $LB_DEPLOYMENT_HOME/logs/current/lb-server.pid` | grep PID -A 2
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 8429 q         20   0 3179116  66808  26212 S   0.0  0.4   0:01.44 lb-server

Here, VIRT shows the total amount of virtual memory used, in KiB. This value can grow very large (beyond the available physical memory and even swap space) and is usually not of concern. RES shows the amount of physical memory used, in KiB, excluding memory that is swapped out to the system swap space or written back to memory-mapped files.

The generic tools of the operating system cannot provide a breakdown of the resident memory into the part used by the buffer pool and the remainder used by the database engine. The buffer pool grows gradually up to the configured limit, so it is not sufficient to simply subtract the configured buffer pool size from the total resident memory. The BloxPagerStat tool can be used to determine how much of the resident memory of the lb-server is used by the buffer pool. This portion will not exceed LB_MEM, but can be significantly smaller depending on the size of the database. The remaining resident memory is what the database uses to keep the schema and evaluation infrastructure. The value in bytes can be obtained using:

$ BloxPagerStat -property residentMemory
residentMemory=959479808
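
Combining this value with the resident set size reported by the operating system gives a rough breakdown into buffer-pool memory and the remainder used by the database engine. The following sketch is illustrative only; the pid file location follows the top example above and may differ per installation:

pid=$(cat $LB_DEPLOYMENT_HOME/logs/current/lb-server.pid)
rss_kib=$(ps -o rss= -p "$pid")                                     # total resident memory in KiB
pool_bytes=$(BloxPagerStat -property residentMemory | cut -d= -f2)  # buffer-pool share in bytes
echo "engine (non-buffer-pool) resident memory: $(( rss_kib * 1024 - pool_bytes )) bytes"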

4.5. Aborting Transactions

Long-running transactions can be aborted. To do so, first obtain the request identifier using lb status --active workspacename for an active transaction, or lb status --all workspacename for running or queued transactions. To abort the transaction with a request identifier of, e.g., 10000000004, issue lb aborttransaction workspacename 10000000004. Aborts are performed asynchronously, that is, the aborttransaction command returns immediately, possibly before the transaction has been aborted. The return message of the original request indicates whether the transaction was successfully aborted or had already been committed.
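
An illustrative session might look as follows; the output of lb status is omitted here, and the request identifier would be looked up in that output:

$ lb status testworkspace --active
...
$ lb aborttransaction testworkspace 10000000004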

Note that simply issuing CTRL-C on an lb addblock, lb exec, or similar command does not abort the transaction; it only aborts the lb client.