Chapter 6. Primitive Types

LogiQL supports the following primitive types: string, int, float, decimal, datetime, boolean and int128.

string

A string value is a finite sequence of Unicode characters. Valid Unicode characters are listed in the section called “String Literals”. For example, "hello" and "Ce ça" are strings.

There are two ways of writing strings. In the first, a string is a sequence of Unicode characters (other than line breaks) between two quote characters. Two adjacent strings are concatenated (strings are considered adjacent even when they are separated by a sequence of whitespace characters, including line breaks). For example, writing "foo" "bar" is equivalent to writing a single string literal, "foobar".

The second form of string is any sequence of Unicode characters, including line breaks, between two groups of three quote characters. (More details are provided in this Note.)

Example 6.1. Multi-line string literal

"""foo
bar"""
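Python happens to follow the same two conventions for string literals, so the rules above can be tried out there. This is only an analogy in another language, not LogiQL code, and it assumes that the line break inside a triple-quoted LogiQL string is kept as a newline character.

```python
# Adjacent literals separated only by whitespace form a single string.
adjacent = "foo" "bar"

# A triple-quoted literal may contain line breaks.
multi_line = """foo
bar"""

print(adjacent)          # foobar
print(repr(multi_line))  # 'foo\nbar'
```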

int

Values of type int are the usual mathematical integers. These values are internally represented as 64-bit two's complement binary numbers. They must therefore be in the range -(2^63) through 2^63 - 1, that is, -9223372036854775808 through 9223372036854775807.

Operations on int values use the integer arithmetic of the underlying hardware. So, for example, a result greater than the maximum value may be silently converted to a negative number. (Arithmetic on values of type decimal is different: an overflow on decimal computations causes logical failure.)

Note

By saying that an operation "causes logical failure" (or "fails") we mean that the atom or comparison that contains the expression will not be true. (See also this Note.)
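The silent wraparound of int arithmetic can be modelled outside LogiQL. The following Python sketch assumes the usual two's complement behaviour of 64-bit hardware; the helper names wrap_int64 and add_int64 are hypothetical and are not LogiQL built-ins.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1

def wrap_int64(value: int) -> int:
    """Reduce an unbounded Python integer to its 64-bit two's complement value."""
    return (value + 2 ** 63) % 2 ** 64 - 2 ** 63

def add_int64(a: int, b: int) -> int:
    """Model of 64-bit addition: an overflow wraps around instead of failing."""
    return wrap_int64(a + b)

print(add_int64(INT64_MAX, 1))  # -9223372036854775808
```

Values that already fit in 64 bits pass through wrap_int64 unchanged, so the model agrees with ordinary integer arithmetic whenever no overflow occurs.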

float

Values of type float are 64-bit binary floating-point numbers, represented according to the IEEE Standard for Floating-Point Arithmetic (IEEE 754).

As described in the section called “Binary Floating-Point Literals”, a floating-point literal is written with an integer part, a decimal fractional part with a dot prefix and/or an exponent over base 10 with prefix E, and a suffix f. For example, a floating-point number can be written as 2.71f with a decimal part, 2E3f with an exponent part (equivalent to 2000.0f), or 2.71E3f with both decimal and exponent parts (equivalent to 2710.0f).

The internal representation of floating-point numbers uses base 2.

If an arithmetic operation produces NaN (for instance, through division by 0), the value is not stored and the operation fails.

If an arithmetic operation produces a -0, it is converted and stored as a +0.

If an arithmetic operation results in a number that cannot be stored using a 64-bit representation, it is stored as either positive infinity +inf or negative infinity -inf. Two +inf values, even if resulting from different computations, are considered equal. The same holds for two -inf values. Note that LogiQL does not provide any literal representation for infinite float values, nor is there any explicit way to check whether a value is infinite.

Example 6.2. Positive and negative infinity

The lb script shown below (see Section 21.1, “Preliminaries”)

create --unique

addblock <doc>
  p(x) -> float(x).
  q(x) -> float(x).
  r(x) -> float(x).
  s(x) -> boolean(x).
  t(x) -> boolean(x).

  p(x) <- float:pow[10.0f, 2000f] = x.
  q(x) <- float:pow[ 5.0f, 2001f] = x.
  r(x) <- float:pow[-5.0f, 2001f] = x.

  s(true)  <- p(x),  q(y), x = y.
  s(false) <- p(x),  q(y), x != y.

  t(true)  <- q(x),  r(y), x = y.
  t(false) <- q(x),  r(y), x != y.
</doc>
print p
print q
print r
print s
print t

close --destroy

will produce the following output:

inf
inf
-inf
true
false
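The three storage rules for float results (NaN fails, -0 becomes +0, overflow becomes an infinity) can be summarised in a small Python model. This is a sketch only: store_float is a hypothetical helper, None stands in for logical failure, and Python's own IEEE 754 doubles play the role of LogiQL floats.

```python
import math
from typing import Optional

def store_float(result: float) -> Optional[float]:
    """Model of what is stored for the result of a float operation."""
    if math.isnan(result):
        return None      # NaN is never stored: the operation fails
    if result == 0.0:
        return 0.0       # -0.0 compares equal to 0.0 and is stored as +0.0
    return result        # finite values and infinities are stored as they are

big = 1e308
pos_inf = store_float(big * 10.0)       # too large for 64 bits: +inf
also_pos_inf = store_float(big * big)   # a different computation, also +inf
neg_inf = store_float(-big * 10.0)      # -inf

print(pos_inf == also_pos_inf)              # True
print(pos_inf == neg_inf)                   # False
print(store_float(math.inf - math.inf))     # None
```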

decimal

decimal is a numeric type that uses a fixed-point decimal representation. Values of type decimal have up to 18 digits before the decimal point and up to 18 digits after the decimal point. So decimal can represent numbers in increments of 10^-18 in the range from -10^18 + 10^-18 through 10^18 - 10^-18.

This means that the smallest number that can be represented is -999,999,999,999,999,999.999,999,999,999,999,999 and the largest is 999,999,999,999,999,999.999,999,999,999,999,999. Between 1 * 10^-18 and 3 * 10^-18, there is exactly one decimal value: 2 * 10^-18. Note that this is different from floating-point numbers, which are very dense near 0, but very imprecise for large numbers.

Decimal arithmetic is exact within the 10^-18 resolution. This means that addition and subtraction follow the usual laws of associativity and commutativity (as long as the results of all operations are within range). Multiplication by integers is also exact: associativity, commutativity, and distributive laws apply (again, as long as all intermediate results stay within range). Note that this is not the case for floating-point numbers.

Example 6.3. Floating-point addition and subtraction are not exact

The following LogiQL code shows that addition and subtraction of float values are subject to rounding error, so the usual algebraic laws (associativity in particular) do not hold. While 0.1 + 0.2 - 0.2 - 0.1 should be zero, it is not.

I[] = 0.1f.
II[] = 0.2f.
I_plus_II[] = I[] + II[].
I_plus_II_minus_II[] = I_plus_II[] - II[].
zero[] = I_plus_II_minus_II[] - I[].
// zero[] =  2.7755575615628914e-17 but should be 0.0f 

Due to these characteristics, decimal is usually preferable for financial applications, whereas float is generally better for scientific applications.
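The computation in Example 6.3 can be reproduced in any language that uses IEEE 754 doubles. The Python sketch below evaluates it step by step with binary floats, then with a different grouping, and finally with Python's decimal module, which stands in here for an exact decimal type (it is not the LogiQL decimal implementation).

```python
from decimal import Decimal

# Same order of evaluation as in Example 6.3, using binary floating point.
float_result = ((0.1 + 0.2) - 0.2) - 0.1

# Grouping the operations differently changes the outcome.
regrouped = (0.1 + (0.2 - 0.2)) - 0.1

# Exact decimal arithmetic gives zero regardless of grouping.
decimal_result = ((Decimal("0.1") + Decimal("0.2")) - Decimal("0.2")) - Decimal("0.1")

print(float_result)    # a tiny nonzero residue, on the order of 1e-17
print(regrouped)       # 0.0
print(decimal_result)  # 0.0
```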

When an arithmetic operation on decimal values overflows (i.e., the resulting value is out of range), then this is handled as logical failure. This means that the computation does not have a result, but the overflow is not an error that aborts the transaction. For example, computing 999999999999999999.0 + 1.0 does not result in a value. Logical failure is also used for conversion operations, such as string:decimal:convert, float:decimal:convert, and int:decimal:convert (see Section 7.10, “Conversions”).
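Logical failure on overflow can be pictured with a fixed-point model in which a decimal value is held as an integer count of 10^-18 units. The Python sketch below is an illustration under that assumption; add_decimal is a hypothetical helper, and None stands for "the computation has no result".

```python
from typing import Optional

SCALE = 10 ** 18   # one whole unit, expressed in steps of 10^-18
LIMIT = 10 ** 36   # scaled values must stay strictly below 10^18 whole units

def add_decimal(a_scaled: int, b_scaled: int) -> Optional[int]:
    """Add two scaled decimal values; return None if the sum is out of range."""
    total = a_scaled + b_scaled
    if abs(total) >= LIMIT:
        return None    # logical failure: no value, but also no error
    return total

largest_whole = 999_999_999_999_999_999 * SCALE   # 999999999999999999.0
one = 1 * SCALE                                   # 1.0

print(add_decimal(largest_whole, one))            # None
print(add_decimal(largest_whole, -one) // SCALE)  # 999999999999999998
```

Note that the failing case does not raise an exception, which mirrors the statement that overflow does not abort the transaction.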

When the result of an arithmetic operation on decimal numbers has more than 18 digits after the decimal point, then it is rounded "half towards zero", as illustrated in the following example:

Example 6.4. 

 12345678.95d / 100000000000000000d  =   0.000000000123456789
 12345678.96d / 100000000000000000d  =   0.00000000012345679
-12345678.95d / 100000000000000000d  =  -0.000000000123456789
-12345678.96d / 100000000000000000d  =  -0.00000000012345679
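The four results above can be checked with Python's decimal module, whose ROUND_HALF_DOWN mode rounds to the nearest value with ties going towards zero. This is a sketch, not the LogiQL implementation: divide_decimal is a hypothetical helper, and the working precision of 60 digits is an assumption chosen to be comfortably larger than the 36 digits of a decimal value.

```python
from decimal import Decimal, ROUND_HALF_DOWN, localcontext

RESOLUTION = Decimal("1e-18")   # 18 digits after the decimal point

def divide_decimal(a: str, b: str) -> Decimal:
    """Divide with ample working precision, then round half towards zero."""
    with localcontext() as ctx:
        ctx.prec = 60
        quotient = Decimal(a) / Decimal(b)
        return quotient.quantize(RESOLUTION, rounding=ROUND_HALF_DOWN)

divisor = "100000000000000000"
for dividend in ("12345678.95", "12345678.96", "-12345678.95", "-12345678.96"):
    # format(..., "f") avoids exponent notation when printing small values
    print(dividend, "->", format(divide_decimal(dividend, divisor), "f"))
```

The printed values carry trailing zeros up to 18 fractional digits, so 0.00000000012345679 appears as 0.000000000123456790; the two are the same number.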

Note

The lb tool currently exhibits the following behaviour:

  • It will print small decimals by using exponent notation. For example

    addblock <doc>
      q(0.0000001).
    </doc>
    print q

    will print out 1e-7, even though q(1e-7). would not be accepted by the compiler.

  • It will print only 9 decimal digits of a decimal number, rounding the result. For example,

    addblock <doc>
      q(999999999999999999.999999999999999999).
    </doc>
    print q

    will print out 1000000000000000000.000000000, even though that number is outside the range of decimal values.

datetime

datetime is a built-in type whose values are points in time (with a resolution of one microsecond). datetime values are always stored as UTC values. Many of the built-in predicates have parameters for specifying the time zone when creating or displaying datetime values.

Note

Before LogicBlox version 4.4.5 the resolution of datetime values was 1 second.
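The idea of storing a point in time in UTC and applying a time zone only when creating or displaying it can be sketched with Python's datetime module, which also has microsecond resolution. The fixed offset of five hours and the sample timestamp are made up for illustration, and no LogiQL built-in predicates are used.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local time zone: a fixed offset of UTC-05:00.
local_zone = timezone(timedelta(hours=-5))

# A local wall-clock time with microsecond precision.
local = datetime(2020, 1, 15, 9, 30, 0, 123456, tzinfo=local_zone)

# "Creating" the value: normalise to UTC for storage.
stored = local.astimezone(timezone.utc)

# "Displaying" the value: convert the stored UTC value back to a zone.
shown = stored.astimezone(local_zone)

print(stored.isoformat())
print(shown.isoformat())
```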

boolean

Type boolean has only two values: true and false.

int128

int128 is the type of 128-bit integers. The range is -170141183460469231731687303715884105728 through 170141183460469231731687303715884105727.

Note

The primary purpose of this type is to serve as an efficient replacement for strings that are Universally Unique Identifiers (UUID). This sort of use is supported by two special operations:

  • int128:from_uuid_string[s] converts the UUID string s to a 128-bit integer;

  • int128:to_uuid_string[i] converts a 128-bit integer created by int128:from_uuid_string back to the original UUID string.

(See also Section 29.1.1, “File Definition” and Section 29.1.2, “File Binding”.)
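A UUID carries exactly 128 bits, which is why it fits into an int128. The Python sketch below shows one plausible round trip. The function names are hypothetical, and the encoding (reinterpreting the 128 bits as a signed two's complement integer) is an assumption: the actual mapping used by int128:from_uuid_string is not described here.

```python
import uuid

def uuid_string_to_int128(s: str) -> int:
    """Assumed encoding: read the UUID's 128 bits as a signed integer."""
    unsigned = uuid.UUID(s).int                # 0 .. 2^128 - 1
    return unsigned - 2 ** 128 if unsigned >= 2 ** 127 else unsigned

def int128_to_uuid_string(i: int) -> str:
    """Inverse of uuid_string_to_int128, in canonical lowercase form."""
    return str(uuid.UUID(int=i % 2 ** 128))

original = "123e4567-e89b-12d3-a456-426614174000"
as_int = uuid_string_to_int128(original)
print(int128_to_uuid_string(as_int) == original)  # True
```

The round trip returns the original string only when it is already in canonical lowercase, hyphenated form; other spellings of the same UUID are normalised by Python's uuid module.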

Integer intervals

Note

The integer interval type is a recent addition to LogiQL and is still somewhat experimental. Some elements of the LogicBlox system may as yet be unable to handle values of this type.

A value of type interval can be thought of as a set of contiguous integer values between a specified minimum value and a specified maximum value.

To use an interval type, mention it in a type declaration, just like any other type:

price[sku, period] = price -> Sku(sku), interval(period), decimal(price).

In many cases you can rely on type inference.

An interval literal consists of an integer followed by the symbol .. and by another integer, which must not be smaller than the first one. For instance, the literal -17..35 represents all integers from -17 through 35. This can also be written as -17 .. 35, but there must be no whitespace between the two dots.

To construct an interval value from two integer values you can use the built-in function mk_interval. For example, the value of mk_interval[x, y] is the interval from the value of x through the value of y, provided that x <= y (if that is not the case, a fatal error will be reported). Of course, mk_interval[-13, 35] = -13..35, and the two are interchangeable, but the literal syntax accepts only integer literals: we cannot write x..y.

The total ordering among intervals is a lexicographic one. On a machine with two-bit integers the ordered list of all integer intervals would be the following:

-2..-2, -2..-1, -2..0, -2..1, -1..-1, -1..0, -1..1, 0..0, 0..1, 1..1.

(Notice that only legal intervals are featured here: for example, -1..-2 is not an interval.)

This ordering is used in the usual manner. For example, both -2..1 < 0..0 and lt_2(-2..1, 0..0) succeed, while the value of lt_3[-2..1, 0..0] is true.
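The lexicographic ordering can be checked mechanically by modelling an interval as a pair (minimum, maximum); Python compares tuples lexicographically. In this sketch the Python function mk_interval only mimics the built-in of the same name (raising an exception where LogiQL reports a fatal error), and all_intervals is a hypothetical helper.

```python
from typing import List, Tuple

Interval = Tuple[int, int]   # (minimum, maximum), both inclusive

def mk_interval(lo: int, hi: int) -> Interval:
    """Build an interval; the minimum must not exceed the maximum."""
    if lo > hi:
        raise ValueError(f"{lo}..{hi} is not an interval")
    return (lo, hi)

def all_intervals(min_int: int, max_int: int) -> List[Interval]:
    """All legal intervals over the given integer range, in sorted order."""
    return sorted(
        mk_interval(lo, hi)
        for lo in range(min_int, max_int + 1)
        for hi in range(lo, max_int + 1)
    )

# Two-bit integers range from -2 through 1.
two_bit = all_intervals(-2, 1)
print(", ".join(f"{lo}..{hi}" for lo, hi in two_bit))
# -2..-2, -2..-1, -2..0, -2..1, -1..-1, -1..0, -1..1, 0..0, 0..1, 1..1

print(mk_interval(-2, 1) < mk_interval(0, 0))  # True
```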

Operations specific to integer intervals are listed in Section 7.8, “Operations on Integer Intervals”.

The type was introduced to help solve some real-life problems that led to excessively large predicates. An outline of this sort of problem is discussed in the example below.

Example 6.5. The interval type could be useful

Consider an application that supports the operations of a large supply chain. The prices for various items are not changed every day, but less frequently (say, once a week, or even more rarely). So the predicate

price[sku, store, day] = price -> sku(sku), store(store),
                                  day(day), decimal(price).

(which could be huge if materialized) can be expressed as derived only, in terms of a much smaller predicate

price_event[sku, store, day_interval] = price  -> sku(sku), store(store),
                                                  interval(day_interval),
                                                  decimal(price).

as follows

price[sku, store, day] = price_event[sku, store, day_to_interval[day]].

In order to avoid irrelevant complications in the example, let us assume that days are represented by integers.

To keep things simple, let us begin with the assumption that the day intervals are given and are universally valid, i.e., do not depend on sku or store. It would be natural to have a predicate that stores the intervals:

day_interval(day_interval) -> interval(day_interval).

We would then presumably also have a constraint that ensures that the used intervals are “legal”:

price_event[_, _, day_interval] -> day_interval(day_interval).

In order to compute price with the rule given above, we would need the functional predicate day_to_interval. This is easily expressed as an IDB predicate:

day_to_interval[day] = day_interval -> int(day),  interval(day_interval).
day_to_interval[day] = day_interval <- day_interval(day_interval),
                                       interval_member(day_interval, day).

It should be obvious that, given our simplifying assumptions, the number of days and day intervals will not be very large. In particular, the computation of day_to_interval will be reasonably cheap.

Now, a very welcome side effect of this approach is that two different values stored in day_interval must be disjoint. If that is not the case, we will immediately get an FDV (functional dependency violation) when populating day_to_interval.
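The disjointness check that comes for free with day_to_interval can be modelled in a few lines of Python. This is a sketch of the reasoning only: intervals are pairs of integers, the exception class stands in for a functional dependency violation, and all names other than day_to_interval are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Interval = Tuple[int, int]   # (first day, last day), both inclusive

class FunctionalDependencyViolation(Exception):
    """Raised when one key would be mapped to two different values."""

def build_day_to_interval(
    day_intervals: Iterable[Interval], days: Iterable[int]
) -> Dict[int, Interval]:
    """Map each day to the interval containing it; overlaps are rejected."""
    intervals = list(day_intervals)
    mapping: Dict[int, Interval] = {}
    for day in days:
        for interval in intervals:
            lo, hi = interval
            if lo <= day <= hi:
                if day in mapping and mapping[day] != interval:
                    raise FunctionalDependencyViolation(
                        f"day {day} lies in both {mapping[day]} and {interval}"
                    )
                mapping[day] = interval
    return mapping

# Two disjoint weekly intervals: every day maps to exactly one of them.
weekly = build_day_to_interval([(0, 6), (7, 13)], range(14))
print(weekly[3], weekly[7])  # (0, 6) (7, 13)
```

Calling build_day_to_interval with overlapping intervals such as (0, 6) and (5, 13) raises the exception for day 5, which corresponds to the FDV reported when day_to_interval is populated.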

In practice the day intervals may not be so regular: they may have to be derived from price_event.

In the worst case, the intervals may be quite irregular, and may depend on particular combinations of store and sku. We would then have:

day_to_interval[day, store, sku] = day_interval
       -> int(day), store(store), sku(sku), interval(day_interval).
day_to_interval[day, store, sku] = day_interval
       <- price_event[sku, store, day_interval] = _,
          interval_member(day_interval, day).

price[sku, store, day] =
          price_event[sku, store, day_to_interval[day, store, sku]].

(Notice that there would no longer be a need for predicate day_interval.) In this most general case the cardinality of day_to_interval would be the same as the cardinality of price, so the whole exercise would be pointless.

In practice we may have some sort of regularity. For example, a price event in a particular store would affect a whole category of store-keeping units, such as produce, cleaning supplies, etc. If the number of categories is, say, two orders of magnitude smaller than the number of store-keeping units, then it might be worthwhile to apply our approach with the following modifications:

day_to_interval[day, store, category] = day_interval
    -> int(day), store(store), sku_category(category), interval(day_interval).
day_to_interval[day, store, category] = day_interval
    <- price_event[sku, store, day_interval] = _,
       sku_to_category[sku] = category,
       interval_member(day_interval, day).

price[sku, store, day] =
    price_event[sku, store, day_to_interval[day, store, sku_to_category[sku]]].

The additional predicates sku_to_category and day_to_interval would be significantly smaller than price. One can also hope that when price_event or sku_to_category is updated, the changes to day_to_interval would be relatively minor, so the overall amortized cost of having such a predicate will not be excessive. The good news is that the disjointness of intervals (for any particular combination of store and category) would still be assured.