## Chapter 13. Sorting

LogiQL provides two special types of rules to support the sorting of values: `seq`, and `list`. The result of sorting by `seq` is an array-like representation, where the nth member in the sorted set can be looked up by n. The result of sorting by `list` is a linked-list-like representation, where members in the sorted set must be navigated to through first- and next-like operations.

The choice between `seq` and `list` depends on which representation is more suitable for your problem.

## 13.1. `seq`

A `seq<<>>` rule supports the sorting of a predicate into an array-like functional predicate. Sequence rules have the following form:

```Sequence =
Atom "<-" "seq" "<<" [ Identifier "=" SortKeys ] ">>" Formula "." .

SortKeys = Identifier
| "(" Identifier { "," Identifier } ")".```

### Note

The optional text between the pair of angle brackets (`<<...>>`) is ignored. It is supported for backward compatibility with previous versions of LogiQL.

The formula in the body must consist of one atom. The head atom must contain all the variables that occur in the body atom, and an additional integer variable that does not occur in the body: that variable must be a key in the predicate named by the atom.

If this integer key is the only key of the predicate, then the tuples of the body atom are sorted, and the key indicates the position of each value tuple in the sorted sequence (starting at `0`). This is best explained by a couple of simple examples.

Example 13.1. Sorting integers

Let `a` be the following unary predicate of integers:

`a(x) -> int(x).`

We can sort the contents of `a` by using the following sequence clause:

```a_seq[i] = x -> int(i), int(x).
a_seq[i] = x <- seq<<>> a(x).```

If `a` contains the tuples ```{(20), (60), (40)}```, then the evaluation of the above rule results in `a_seq=20`, `a_seq=40`, and `a_seq=60`.

The predicate `a_seq` must be explicitly declared, as in the example above. (The name of the predicate is not important, and may be arbitrary.)

Example 13.2. Sorting a binary predicate

```b(x, y) -> string(x), string(y).
b_sort(i; x, y) -> int(i), string(x), string(y).

b_sort(i; x, y) <- seq<< >> b(x, y).```

If predicate `b` contains the tuples `{("a", "ab"), ("a", "aa"), ("b", "c")}`, then `b_sort` will contain `{(0, "a", "aa"), (1, "a", "ab"), (2, "b", "c")}`.

Note the semicolon in `b_sort(i; x, y)`: `b_sort` is a multi-valued functional predicate (see the section called “Multi-valued functional predicates”). This example illustrates a situation in which such predicates cannot be easily avoided.

The key of the sorted predicate can consist of several variables besides the indexing variable. If it does consist of more than one variable, then the fields that correspond to variables on the left of the indexing variable are sorted, forming groups: the remaining fields within each group are sorted separately.

All this is illustrated by the following examples.

Example 13.3. Sorting and grouping

```create --unique

produce(item, kind) -> string(item), string(kind).

produce("carrot" , "vegetable").
produce("apple"  , "fruit"    ).
produce("parsley", "vegetable").
produce("melon"  , "fruit"    ).
produce("celery" , "vegetable").
produce("mango"  , "fruit"    ).

items(i; x, y) -> int(i), string(x), string(y).
items(i; x, y) <- seq <<>> produce(x, y).

by_kind(y, i; x) -> int(i), string(x), string(y).
by_kind(y, i; x) <- seq <<>> produce(x, y).
</doc>
echo --- produce: ---
print produce
echo --- items: ---
print items
echo --- by_kind: ---
print by_kind

close --destroy```

The output is:

```created workspace 'unique_workspace_2017-10-07-17-22-41'
--- produce: ---
"apple"   "fruit"
"carrot"  "vegetable"
"celery"  "vegetable"
"mango"   "fruit"
"melon"   "fruit"
"parsley" "vegetable"
--- items: ---
0 "apple"   "fruit"
1 "carrot"  "vegetable"
2 "celery"  "vegetable"
3 "mango"   "fruit"
4 "melon"   "fruit"
5 "parsley" "vegetable"
--- by_kind: ---
"fruit"     0 "apple"
"fruit"     1 "mango"
"fruit"     2 "melon"
"vegetable" 0 "carrot"
"vegetable" 1 "celery"
"vegetable" 2 "parsley"
deleted workspace 'unique_workspace_2017-10-07-17-22-41'```

Example 13.4.  An illustration of the flexibility of `seq`

```create --unique

b(x, y, z) -> string(x), string(y), string(z).

b("a", "ab", "abc").
b("a", "aa", "bac").
b("b", "cb", "cab").
b("b", "bc", "abc").
b("b", "bc", "aaa").
</doc>
echo --- b:
print b

c0_sort(i; x, y, z)  -> int(i), string(x), string(y), string(z).
c1_sort(i; x, y, z)  -> int(i), string(x), string(y), string(z).
c2_sort(i; x, y, z)  -> int(i), string(x), string(y), string(z).
c3_sort(i; x, y, z)  -> int(i), string(x), string(y), string(z).
d0_sort(x, i; y, z)  -> int(i), string(x), string(y), string(z).
d1_sort(i, x; y, z)  -> int(i), string(x), string(y), string(z).
e0_sort[x, y, i] = z -> int(i), string(x), string(y), string(z).
e1_sort[x, i, y] = z -> int(i), string(x), string(y), string(z).

c0_sort(i; x, y, z)  <- seq<<>> b(x, y, z).
c1_sort(i; y, z, x)  <- seq<<>> b(x, y, z).
c2_sort(i; y, x, z)  <- seq<<>> b(x, y, z).
c3_sort(i; z, x, y)  <- seq<<>> b(x, y, z).
d0_sort(x, i; y, z)  <- seq<<>> b(x, y, z).
d1_sort(i, x; y, z)  <- seq<<>> b(x, y, z).
e0_sort[x, y, i] = z <- seq<<>> b(x, y, z).
e1_sort[x, i, y] = z <- seq<<>> b(x, y, z).
</doc>
echo --- c0_sort:
print c0_sort
echo --- c1_sort:
print c1_sort
echo --- c2_sort:
print c2_sort
echo --- c3_sort:
print c3_sort
echo --- d0_sort:
print d0_sort
echo --- d1_sort:
print d1_sort
echo --- e0_sort:
print e0_sort
echo --- e1_sort:
print e1_sort

close --destroy```

The output is as follows. It might be instructive to notice the difference between `c0_sort` and `c1_sort`, as well as the similarity between `d0_sort` and `e1_sort`.

```created workspace 'unique_workspace_2016-10-28-00-42-46'
--- b:
"a" "aa" "bac"
"a" "ab" "abc"
"b" "bc" "aaa"
"b" "bc" "abc"
"b" "cb" "cab"
--- c0_sort:
0 "a" "aa" "bac"
1 "a" "ab" "abc"
2 "b" "bc" "aaa"
3 "b" "bc" "abc"
4 "b" "cb" "cab"
--- c1_sort:
0 "aa" "bac" "a"
1 "ab" "abc" "a"
2 "bc" "aaa" "b"
3 "bc" "abc" "b"
4 "cb" "cab" "b"
--- c2_sort:
0 "aa" "a" "bac"
1 "ab" "a" "abc"
2 "bc" "b" "aaa"
3 "bc" "b" "abc"
4 "cb" "b" "cab"
--- c3_sort:
0 "aaa" "b" "bc"
1 "abc" "a" "ab"
2 "abc" "b" "bc"
3 "bac" "a" "aa"
4 "cab" "b" "cb"
--- d0_sort:
"a" 0 "aa" "bac"
"a" 1 "ab" "abc"
"b" 0 "bc" "aaa"
"b" 1 "bc" "abc"
"b" 2 "cb" "cab"
--- d1_sort:
0 "a" "aa" "bac"
1 "a" "ab" "abc"
2 "b" "bc" "aaa"
3 "b" "bc" "abc"
4 "b" "cb" "cab"
--- e0_sort:
"a" "aa" 0 "bac"
"a" "ab" 0 "abc"
"b" "bc" 0 "aaa"
"b" "bc" 1 "abc"
"b" "cb" 0 "cab"
--- e1_sort:
"a" 0 "aa" "bac"
"a" 1 "ab" "abc"
"b" 0 "bc" "aaa"
"b" 1 "bc" "abc"
"b" 2 "cb" "cab"
deleted workspace 'unique_workspace_2016-10-28-00-42-46'
```

## 13.2. `list`

A `list` rule can be used to sort a set of values into two result predicates, a first, and a next, thus providing a representation that is somewhat similar to a linked list.

A `list` rule has the following syntactic form:

```List =
Atom "," Atom "<-" "list" "<<" [ GroupBy ] ">>" Formula "." .

GroupBy = "group-by" "(" Identifiers ")" .```

A `list` rule derives into two predicates: one representing the first tuple in the sort, and the other the next relation. If specified, the `group-by` results in a nested sort, wherein the variables in `group-by` are sorted first, and the remaining variables sorted within each value of `group-by`.

Example 13.5. Sorting using `list` without `group-by`

```a(x) -> int(x).

first_a(x) -> int(x).
next_a(x, y) -> int(x), int(y).

first_a(x), next_a(x, y) <- list<<>> a(x).```

This example illustrates the sorting of `a` into two predicates:

• `first_a`, which contains the value of the first element in the sorted set;
• `next_a`, which contains binary tuples of the form `(x,y)`, where `y` is the successor of `x` in the sorted set.

If `a` contains `{(20), (30), (25)}`, then `first_a` contains `{(20)}`, and `next_a` contains `{(20, 25), (25, 30)}`

The optional `group-by` parameter can be used to perform sorting by group, in analogy with grouping in aggregate functions.

Example 13.6. Sorting with `group-by`

```b(x, y)         -> int(x), int(y).
first_b(x, y)   -> int(x), int(y).
next_b(x, y, z) -> int(x), int(y), int(z).

first_b(x, y), next_b(x, y, z) <-
list<< group-by(x) >> b(x, y).```

Using `group-by(x)`, the above `list` rule sorts only the values in the second argument position of `b`, for each unique `x` in the first argument position of `b`.

The predicate `first_b` contains sorted tuples of the form `(x,y)`, such that

• `(x,y)` belongs to `b`;
• if `b` were sorted, `(x,y)` would be the first tuple in `b` that has `x` in the first position.

In other words, `y` is the first value in the second position in the sorted sequence of tuples that have `x` in the first position.

The predicate `next_b` contains tuples of the form `(x,y,z)` such that `(x,z)` would immediately follow `(x,y)` in `b`, if `b` were sorted.

If `b` contains `{(1,2), (1,3), (1,4), (2,10), (2,11), (3,20)}`, then `first_b` contains `{(1,2), (2,10), (3,20)}`, and `next_b` contains `{(1,2,3), (1,3,4), (2,10,11)}`.

It is possible to group by more than one variable. For example:

```b(x, y, z)         -> int(x), int(y), int(z).
first_b(x, y, z)   -> int(x), int(y), int(z).
next_b(x, y, z, v) -> int(x), int(y), int(z), int(v).

first_b(x, y, z), next_b(x, y, z, v) <-
list<< group-by(x, y) >> b(x, y, z).```

If `b` contains `{(1,2,0), (1,2,1), (1,3,0), (1,4,0), (1,4,1), (2,10,100)}`, then `first_b` contains `{(1,2,0), (1,3,0), (1,4,0), (2,10,100)}`, and `next_b` contains `{(1,2,0,1), (1,4,0,1)}`.

Example 13.7. Sorting with multiple values

An element of the list may contain more than just one value. In the following example we group by the value of `x`, and additionally store pairs of numbers. `next_b(x, y, z, ny, nz)` should be read, roughly, as "for this value of `x`, the next pair after `y` and `z` is `ny` and `nz`.

```b(x, y, z)              -> int(x), int(y), string(z).
first_b(x, y, z)        -> int(x), int(y), string(z).
next_b(x, y, z, ny, nz) -> int(x), int(y), string(z), int(ny), string(nz).

first_b(x, y, z), next_b(x, y, z, ny, nz) <- list<< group-by(x) >> b(x, y, z).```

If `b` contains ```{(1,2,"3"), (1,3,"4"), (1,4,"5"), (2,10,"12"), (2,11,"13"), (3,20,"23")}```, then `first_b` contains `{(1,2,"3"), (2,10,"12"), (3,20,"23")}`, and `next_b` contains `{(1,2,"3",3,"4"), (1,3,"4",4,"5"), (2,10,"12",11,"13")}`.

The `group-by` parameters, if present, must be a strict prefix of the key variables of the relation to be sorted. The following does not represent a valid use of `group-by` and will cause a run-time exception:

```first_b(y, x), next_b(y, x, z) <-
list<< group-by(y) >> b(x, y).```

Here is another example that does not satisfy this condition:

```c[x, y] = z -> int(x), int(y), int(z).
first_c(x, y, z) -> int(x), int(y), int(z).
next_c(x, y, z, u) -> int(x), int(y), int(z), int(u).

first_c(x, y, z), next_c(x, y, z, u) <-
list<< group-by(x, y) >> c[x, y] = z.```

In this case `(x,y)` is a prefix of the key variables ```(x,y) ``` of `c`, but not a strict prefix. (The rule would therefore be asking the system to sort groups containing exactly one value each, which is a bit silly.)

More generally, the variable order of the `first` and `next` predicates must agree with the variable order of the predicate to be sorted. Thus, even with an empty `group-by`, a rule like the following would be rejected:

`first(y, x), next(y, x, u, v) <- list<< >> b(x, y).`

The required effect can be achieved by using an auxiliary predicate:

```d(y, x) <- b(x, y).
first(y, x), next(y, x, u, v) <- list<< >> d(y, x).```

### Sorting Entities

Entities cannot be used as sorting variables. (They can, however, be used as `group-by` arguments.) Thus a rule like the following is illegal:

```e(_) -> .
first(x), next(x, y) <- list<< >> e(x).```

### Note

The diagnostics for such an example might be somewhat difficult to interpret. They would look something like this:
```Error: List P2P mapping cannot contain entity types, for P2P mapping defined at block_1Z1B37OB:22(1)--22(38):
Forall x::e,y::e .
first(x),next(x,y) <-
list<< >>
e(x).```

Entities can be sorted indirectly via their constructor arguments.

Example 13.8. Sorting entities using their constructor predicates

```e(_) -> .
cons[x] = y -> string(x), e(y).
lang:constructor(`cons).

arg(x) <- cons[x] = _.
arg:first(x) -> string(x).
arg:next(x, y) -> string(x), string(y).

arg:first(x), arg:next(x, y) <- list<< >> arg(x).

e:first(x) <- arg:first(y), cons[y] = x.
e:next(x, y) <- arg:next(u, v), cons[u] = x, cons[v] = y.```