Chapter 15. Linear recursion

Basic concepts

As is well-known, the Fibonacci numbers are given by the following recipe:

  • fib(1) = 1;
  • fib(2) = 1;
  • for any i > 2, fib(i) = fib(i - 1) + fib(i - 2).

This is easily translated directly to LogiQL. We must, of course, take care to limit the range of Fibonacci numbers to be computed, as they form an infinite set. (Additionally, since we are using ordinary integers, we should be aware that the representation of fib(93) would already exceed 63 bits and show up as a negative number.)

Example 15.1. Simple recursion (Fibonacci numbers)

create --unique

addblock <doc>
  fib[i] = n -> int(i), int(n).

  fib[1] = 1.
  fib[2] = 1.
  fib[i] = fib[i - 1] + fib[i - 2] <-  2 < i <= 20.
</doc>
print fib

close --destroy

As expected, this yields the following results:

created workspace 'unique_workspace_2017-10-07-18-21-35'
added block 'block_4LDQPDSB'
1  1
2  1
3  2
4  3
5  5
6  8
7  13
8  21
9  34
10 55
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
deleted workspace 'unique_workspace_2017-10-07-18-21-35'

Such a recursive predicate is computed quite quickly for this simple example, but in general recursive computations may be unacceptably slow. This is because the predicate is produced bottom-up during the process of maintenance (see Section 19.4.1, “Maintenance”). Given the facts fib(1) and fib(2), the LogicBlox system uses the recursive rule to produce fib(3). The contents of a predicate have changed, so this triggers another round of maintenance, producing fib(4), which triggers yet another round...

If the recursive predicate is large, or if there are large predicates that are dependent on the recursive one, each such round of maintenance may involve significant computation. (For example, all interim versions of predicates are materialised, new samples are taken to construct indices, the differences between each predicate and its previous version are computed...)

Unlike the general recursive rules in Example 11.12, “IDB rules and general recursion”, the rule for computing fib has a very special form: each value depends only on a strictly limited number of immediately preceding values, and this limited number is known in advance. Another well-known function with this property is the factorial: n! = n (n - 1)!.

Even though defined recursively, such functions are routinely computed without applying recursion, by using a technique that is sometimes called a "sliding window". The function creates a list of results, and the computation of each new member of the list can be based upon k most recent results, where k is an integer constant that is determined at compile time. One can think of the computation in terms of a list chasing its own tail.

LogiQL provides an efficient mechanism for computing such functions, which are often referred to as "linear recursive".

Note

Apart from being grammatically suspect (shouldn't it be linearly recursive?), the term is used somewhat loosely. One widely-recognised "definition" is: A linear recursive function is a function that only makes a single call to itself each time the function runs. This very imprecisely formulated quasi-definition would presumably describe the factorial, but not the Fibonacci function.

Before describing the mechanism in more detail, let us take a quick peek at how it can be used in our example.

Example 15.2. Simple "linear recursion" (Fibonacci numbers)

create --unique

addblock <doc>

  fib[i] = n -> int(i), int(n).

  first(i)   -> int(i).
  next(i, n) -> int(i), int(n).

  first(1).
  next(i, n) <- int:range(1, 19, 1, i), n = i + 1.

  fib[_] = _ <-
     linear_recursion<<
        lang:pragma:baseCase(`first).
        lang:pragma:recursiveCase(`next).

        fib[i1] = 1 <- first(i1).
        fib[i2] = 1 <- first(i1), next(i1, i2).
        fib[ n] = i <- i = fib[s1] + fib[s2], next(s2, s1), next(s1, n).
     >> first(_), next(_, _).
</doc>
print fib

close --destroy

A more readable version of the same logic uses functional predicates for first and next:

create --unique

addblock <doc>

  fib[i] = n -> int(i), int(n).

  first[] = i -> int(i).
  next[i] = n -> int(i), int(n).

  first[] = 1.
  next[i] = i + 1 <- int:range(1, 19, 1, i).

  fib[_] = _ <-
     linear_recursion<<
        lang:pragma:baseCase(`first).
        lang:pragma:recursiveCase(`next).

        fib[first[]]       = 1.
        fib[next[first[]]] = 1.
        fib[next[next[s]]] = fib[next[s]] + fib[s].
     >> _ = first[], _ = next[_].
</doc>
print fib

close --destroy

In both cases the results are the same as in Example 15.1, “Simple recursion (Fibonacci numbers)”.

As we see from the example, the "linear recursive" predicate is defined by means of a construct somewhat similar to agg (Chapter 12, Aggregations) or list (Section 13.2, “list).

The recursive predicate (which must be a single-valued functional predicate) is mentioned in the head of a rule whose premise (right-hand side) is of the form linear_recursion<<...>> body, where body is a formula that must mention two auxiliary predicates: one that defines the first key (the "base case"), and one that defines the progression of keys (the "recursive case"). In Example 15.2, “Simple "linear recursion" (Fibonacci numbers)” these predicates happen to be named first and next, respectively.

Each atom in the body of linear_recursion should reference a materialised predicate (in particular, it should not be a built-in such as =). Its arguments should all be variables.

The recursive predicate cannot be local.

The contents of the linear_recursion<<...>> construct must begin with declarations of the two auxiliary predicates, as shown in the example. This is followed by the rules that actually define the value of the recursive predicate for various keys, either in terms of the "base case", or in terms of "earlier" keys, where the progression of keys is defined by the "recursive case".

The names of the variables in the body are often ignored. In our example it would be OK to have a body like

     _ = next[F], F = first[].

or the equivalent

     _ = next[first[]].

However, the variable that is the value argument of the "recursive case" should not have other occurrences in the body.

If the body contains atoms that refer to any other predicates, their arguments are treated as "group-by" variables: see below.

The variables in the head are essentially ignored, except when it is impossible to give them types.

We recommend the style used in our examples, i.e., that all these variables be made anonymous, except when there is a good reason to name them.

Entity keys, grouping

The keys may be entities: the "recursive case" predicate will effectively order them. The "base case" predicate and the "recursive case" predicate may have additional arguments that are used for grouping (similar to the one described in Section 13.1, “seq and illustrated in Example 13.3, “Sorting and grouping”). These additional arguments must satisfy the following requirements:

  • They must precede the arguments used for determining the first key and the key sequence.

  • They must all be of entity types.

  • The sequence of extra arguments in the "base case" predicate must be of the same length as the sequence of extra arguments in the "recursive case" predicate, and the corresponding extra arguments from both predicates must be of identical types.

  • For the mandatory occurrences of the "base case" predicate and the "recursive case" predicate in the body of the linear_recursion construct, the extra arguments must be variables that are pairwise identical in both predicates.

These requirements sometimes force us to write code that is a little more convoluted than we would like, as illustrated by FirstDayForFruit and NextDayForFruit in the following example.

Example 15.3. "Linear recursion" with entities and grouping

The example illustrates the structure of a toy application that sums up the sales of each fruit in our friendly neighbourhood fruit stall. The linear_recursion construct is used to compute accumulated sales separately for each kind of fruit, thanks to the additional grouping argument f in the body of the construct.

create --unique

addblock <doc>

  // Entity types.
  Fruit(f) -> .
  FruitName[nm] = f -> string(nm), Fruit(f).
  lang:constructor(`FruitName).

  FruitName["apple"    ] = _.
  FruitName["avocado"  ] = _.
  FruitName["persimmon"] = _.

  WeekDay(d) -> .
  WeekDayName[nm] = d -> string(nm), WeekDay(d).
  lang:constructor(`WeekDayName).

  WeekDayName["Monday"   ] = _.
  WeekDayName["Tuesday"  ] = _.
  WeekDayName["Wednesday"] = _.
  WeekDayName["Thursday" ] = _.
  WeekDayName["Friday"   ] = _.

  // The succession of days in a week.
  firstDay[] = WeekDayName["Monday"].

  nextDay[WeekDayName["Monday"   ]] = WeekDayName["Tuesday"  ].
  nextDay[WeekDayName["Tuesday"  ]] = WeekDayName["Wednesday"].
  nextDay[WeekDayName["Wednesday"]] = WeekDayName["Thursday" ].
  nextDay[WeekDayName["Thursday" ]] = WeekDayName["Friday"   ].

  // How many pounds of each fruit sold in each day of the week?
  Sales[fruit, day] = weight -> Fruit(fruit), WeekDay(day), decimal(weight).

  Sales[apple    , Mon] = 100d,
  Sales[apple    , Tue] = 150d,
  Sales[apple    , Wed] = 120d,
  Sales[apple    , Thu] =  90d,
  Sales[apple    , Fri] =  10d,
  Sales[persimmon, Mon] =   0d,
  Sales[persimmon, Tue] =  10d,
  Sales[persimmon, Wed] =  20d,
  Sales[persimmon, Thu] =  15d,
  Sales[persimmon, Fri] = 100d
  <- apple     = FruitName["apple"],
     persimmon = FruitName["persimmon"],
     Mon = WeekDayName["Monday"],    Tue = WeekDayName["Tuesday"],
     Wed = WeekDayName["Wednesday"], Thu = WeekDayName["Thursday"],
     Fri = WeekDayName["Friday"].

  // Somewhat silly auxiliaries for linear recursion.
  FirstDayForFruit[f]   = firstDay[]  <- Fruit(f).
  NextDayForFruit[f, d] = nextDay[d]  <- Fruit(f).

  // Accumulated sales of this fruit by the end of this day.
  AccSales[fruit, day] = weight -> Fruit(fruit), WeekDay(day), decimal(weight).

  AccSales[_, _] = _  <- linear_recursion <<
     lang:pragma:baseCase(`FirstDayForFruit).
     lang:pragma:recursiveCase(`NextDayForFruit).

     AccSales[f,  d] = Sales[f, d] <- d = FirstDayForFruit[f].
     AccSales[f, nd] = AccSales[f, d] + Sales[f, nd]
       <- nd = NextDayForFruit[f, d].

  >> FirstDayForFruit[f] = _, NextDayForFruit[f, _] = _.

  // How many pounds of each fruit sold in a week?
  WeeklySales[fruit] = AccSales[fruit, WeekDayName["Friday"]].
</doc>
echo SALES
print Sales
echo WEEKLY SALES
print WeeklySales

close --destroy

The output is:

SALES
[10000000001] [10000000000] 0
[10000000001] [10000000002] 10
[10000000001] [10000000003] 15
[10000000001] [10000000007] 100
[10000000001] [10000000013] 20
[10000000004] [10000000000] 100
[10000000004] [10000000002] 150
[10000000004] [10000000003] 90
[10000000004] [10000000007] 10
[10000000004] [10000000013] 120
WEEKLY SALES
[10000000001] 145
[10000000004] 470

This may be somewhat less informative then what we would like to see, so we may extend the example as follows:

addblock <doc>
  ShowSales[fruitNm, dayNm] = n <- Sales[fruit, day] = n,
                                   FruitName[fruitNm] = fruit,
                                   WeekDayName[dayNm] = day.

  ShowWeeklySales[fruitNm] = n <- WeeklySales[fruit] = n,
                                  FruitName[fruitNm] = fruit.
</doc>

echo SALES
print ShowSales
echo WEEKLY SALES
print ShowWeeklySales

The output is now:

SALES
"apple"     "Friday"    10
"apple"     "Monday"    100
"apple"     "Thursday"  90
"apple"     "Tuesday"   150
"apple"     "Wednesday" 120
"persimmon" "Friday"    100
"persimmon" "Monday"    0
"persimmon" "Thursday"  15
"persimmon" "Tuesday"   10
"persimmon" "Wednesday" 20
WEEKLY SALES
"apple"     470
"persimmon" 145

Additional features

The rules that can appear within the angle brackets in a linear_recursion constructs must be rather simple. For example, they can only perform straightforward lookup: all the necessary joins must be carried out in the body. A more severe restriction is that the variables used to extract data from other predicates must also appear in the body. This is one of the reasons we were forced to define FirstDayForFruit and NextDayForFruit in our example.

Example 15.4. An incorect variant of Example 15.3, “"Linear recursion" with entities and grouping”.

If we try to replace the linear_recursion construct in our example by a variant that seems more natural, to wit,

  AccSales[_, _] = _  <- linear_recursion <<
     lang:pragma:baseCase(`firstDay).
     lang:pragma:recursiveCase(`nextDay).

     AccSales[f,  d] = Sales[f, d] <- d = firstDay[].
     AccSales[f, nd] = AccSales[f, d] + Sales[f, nd]
       <- nd = nextDay[d].

  >> firstDay[] = _, nextDay[_] = _.

we will get an error message

Error: Expected all reads from non-local predicates to be variables equivalent to P2P body variables, got AccSales[f,d]=__a in rule Forall __a::decimal,d::WeekDay,f::Fruit .
AccSales[f,d]=__a <-
   firstDay[]=d,
   Sales[f,d]=__a.

There is a more direct way to get around this particular difficulty. Rules within the angle brackets of a linear_recursion construct can refer to the "current" version of a body variable V by means of current:V[]. Moreover, the grouping variables can be specified in a separate atom in the body. So the rule for AccSales can also be written as in the following example.

Example 15.5. Using current: and a separate grouping predicate

  AccSales[_, _] = _   <- linear_recursion <<
     lang:pragma:baseCase(`firstDay).
     lang:pragma:recursiveCase(`nextDay).

     AccSales[f,  d] = Sales[f, d] <- f = current:f[], d = current:day[].
     AccSales[f, nd] = AccSales[f, d] + Sales[f, nd]
       <- f = current:f[], d = current:day[], nd = nextDay[d].

  >> Fruit(f), firstDay[] = day, nextDay[_] = _.

There is even a feature that allows us to replace current: with something else, e.g., magic:. All that is needed is another pragma at the beginning of the part in angular brackets:

    lang:pragma:prefix(`magic).

Finally, one should be aware that the same linear_recursion construct can be used to populate more than one recursive predicate. For example, our simple program for the neighbourhood fruit stall can be extended to compute the income for each fruit, based on a daily price.

Example 15.6. Linear recursion with two recursive predicates

Here are the relevant fragments of Example 15.3, “"Linear recursion" with entities and grouping”, as modified in Example 15.5, “Using current: and a separate grouping predicate ”, but with an additional recursive predicate. In this example the two recursive predicates share rules (inside the angular brackets), but in general this need not be the case.

  // How many pounds of each fruit sold in each day of the week?
  // What was the daily price?
  Sales[fruit, day] = weight -> Fruit(fruit), WeekDay(day), decimal(weight).
  Price[fruit, day] = price  -> Fruit(fruit), WeekDay(day), decimal(price).

  Sales[apple    , Mon] = 100d,   Price[apple    , Mon] = 0.50d,
  Sales[apple    , Tue] = 150d,   Price[apple    , Tue] = 0.60d,
  Sales[apple    , Wed] = 120d,   Price[apple    , Wed] = 0.70d,
  Sales[apple    , Thu] =  90d,   Price[apple    , Thu] = 0.90d,
  Sales[apple    , Fri] =  10d,   Price[apple    , Fri] = 1.10d,
  Sales[persimmon, Mon] =   0d,   Price[persimmon, Mon] = 1.50d,
  Sales[persimmon, Tue] =  10d,   Price[persimmon, Tue] = 2.00d,
  Sales[persimmon, Wed] =  20d,   Price[persimmon, Wed] = 2.50d,
  Sales[persimmon, Thu] =  15d,   Price[persimmon, Thu] = 2.50d,
  Sales[persimmon, Fri] = 100d,   Price[persimmon, Fri] = 1.00d
  <- apple     = FruitName["apple"],
     persimmon = FruitName["persimmon"],
     Mon = WeekDayName["Monday"],    Tue = WeekDayName["Tuesday"],
     Wed = WeekDayName["Wednesday"], Thu = WeekDayName["Thursday"],
     Fri = WeekDayName["Friday"].

  // Accumulated sales of/income for this fruit by the end of this day.
  AccSales [fruit, day] = weight -> Fruit(fruit), WeekDay(day), decimal(weight).
  AccIncome[fruit, day] = amount -> Fruit(fruit), WeekDay(day), decimal(amount).

  AccSales[_, _] = _, AccIncome[_, _] = _   <-
  linear_recursion <<
     lang:pragma:baseCase(`firstDay).
     lang:pragma:recursiveCase(`nextDay).

     AccSales [f, d] = Sales[f, d],
     AccIncome[f, d] = Sales[f, d] * Price[f, d]
         <- f = current:f[], d = current:day[].

     AccSales [f, nd] = AccSales[f, d]  + Sales[f, nd],
     AccIncome[f, nd] = AccIncome[f, d] + Sales[f, nd] * Price[f, nd]
         <- f = current:f[], d = current:day[], nd = nextDay[d].

  >> Fruit(f), firstDay[] = day, nextDay[_] = _.

  // How many pounds of each fruit sold in a week, and what was the income
  // for that fruit?
  WeeklySales [fruit] = AccSales [fruit, Fri],
  WeeklyIncome[fruit] = AccIncome[fruit, Fri] <- Fri = WeekDayName["Friday"].

Caveats

It should be noted that the instantiations of the grouping variables in the recursive rule are determined "behind the scenes", which may sometimes seem a little surprising. For example, if the recursive rule in Example 15.3, “"Linear recursion" with entities and grouping” is replaced with

     AccSales[f, nd] = AccSales[ff, d] + Sales[f, nd]
       <- nd = NextDayForFruit[f, d].

then f and ff will effectively be treated as the same variable.

In the interest of protecting the unwary, in some cases the system will not allow such usage. For example, if we wrote

  AccSales[f, nd] = AccSales[ff, d] + AccSales[f, d] + Sales[f, nd]
    <- nd = NextDayForFruit[f, d].

then the computation would terminate with

Error: Can't have two different key variables in the same position of
two occurrences of a recursive predicate in the same rule.
The variables are `f` and `ff`.
The current atom is: `AccSales[f,d]=__b`.
The current rule is:
Forall __e::decimal,d::WeekDay,f::Fruit,ff::Fruit,nd::WeekDay .
AccSales[f,nd]=__e <-
   Exists __a::decimal,__b::decimal,__c::decimal,__d::decimal .
      NextDayForFruit[f,d]=nd,
      decimal:add[__c,__d]=__e,
      decimal:add[__a,__b]=__c,
      AccSales[ff,d]=__a,
      AccSales[f,d]=__b,
      Sales[f,nd]=__d.

Developers of large applications should be aware that rules within the angle brackets access predicates only for lookup, so the usual optimisation known as "leapfrog tree join" does not apply.

Another thing to be aware of is that the list-chase operation carried out within the linear_recursion construct is applied to every grouping body join, even if there is logic inside the body that would restrict the number of actual groups to something far smaller. This can be circumvented by factoring out the grouping predicate, as in Example 15.5, “Using current: and a separate grouping predicate ”. Here is how we can avoid computing AccSales for those kinds of fruit that were not sold at all during this week:

  FruitSold(f) <- Fruit(f), Sales[f, _] != 0d.

  AccSales[_, _] = _   <- linear_recursion <<
     lang:pragma:baseCase(`firstDay).
     lang:pragma:recursiveCase(`nextDay).

     AccSales[f,  d] = Sales[f, d] <- f = current:f[], d = current:day[].
     AccSales[f, nd] = AccSales[f, d] + Sales[f, nd]
       <- f = current:f[], d = current:day[], nd = nextDay[d].

  >> FruitSold(f), firstDay[] = day, nextDay[_] = _.

This trick can, of course, be applied to more interesting cases, where there is more than one grouping variable and the variables come from a join of two or more predicates.

Note

The implementation of the linear_recursion construct is somewhat brittle, and is based on a number of assumptions that might sometimes seem unnatural. To give just one instance, in Example 15.2, “Simple "linear recursion" (Fibonacci numbers)” we cannot replace fib[next[first[]]] = 1. with the equivalent fib[2] = 1.: this would cause an error.

If you want to apply a pattern that is different from the ones shown here, we recommend that you test it on a small example.