## Chapter 9. Expressions

An expression is a syntactic construct that represents a value. For example, `3+4` is an expression that adds the value `4` to the value `3`: the value of the expression is `7`. Another example is `x+1`, which adds `1` to whatever value is in `x`, and produces a new value.

This chapter describes the different kinds of expressions you can write, as well as how those expressions compute a value.

In general, it is also possible for an expression not to have a value. For example, `5/0` doesn't compute any result.

## 9.1. Literals

Expressions may be literals of the different primitive types supported by LogiQL:

```Expression    = StringLiteral | BooleanLiteral | NumberLiteral .
NumberLiteral = DecimalLiteral | FloatLiteral | IntegerLiteral .```

Example 9.1. Literals

```"hello"
12
3.141529f```

## 9.2. Variables

`      Expression = BasicIdentifier .`

Syntactically, a variable (or, more precisely, the name of a variable) is just an identifier (without a colon), such as `x`, `cost` or `Jane`. The semantics, however, require an explanation.

### Note

The identifier `_` (i.e., a single underscore) is special: it represents a so-called anonymous variable. An anonymous variable has no other occurrences, that is, every occurrence of `_` is equivalent to a brand new, unique identifier.

When a variable has only one occurrence by design, it is a good idea to make it anonymous, and thus avoid a warning from the compiler. The compiler reports non-anonymous variables with single occurrences, as these sometimes arise because of trivial typos, and the resulting errors would otherwise be difficult to catch.

If you'd like to keep the name of a variable meaningful even if it has only a single occurrence, just begin its name with an underscore, e.g., `_sales`. (The only effect of the underscore is to suppress the warning, otherwise the name is like any other: unlike in the case of anonymous variables, two occurrences in the same scope will refer to the same variable.)

### The concept of "a variable"

In imperative programming languages (such as C, Java, Python...) a variable essentially represents a memory location that may have some contents, and the contents may change over time. For example, in many such languages `x = x + 2` is not an equality, but an assignment statement. Execution of the statement usually increases (by 2) the contents of a memory location represented by `x`, and thus the "value of variable `x`". (We say "usually", because the effect may sometimes be different, due to the vagaries of finite integer arithmetic.)

This should be contrasted with high-school algebra or physics, where a variable represents some concrete value: its function is somewhat similar to a pronoun in natural language.

In that context, `t = s / v` represents the relation between some concrete time `t`, some concrete distance `s` and some concrete average speed `v`. Once we "plug in" concrete values (data from today's trip, say), the variables disappear, and we are left with an equality between concrete arithmetic expressions.

Variables in LogiQL are much more like the variables of algebra than like those of imperative programming languages. The closest counterparts are the variables of classical first-order logic, where we deal with statements such as the following:

1. "For any integers `x` and `y`, `x > y`."
2. "For any integer `x` there exists an integer `y` such that `x > y`."
3. "`x > y`"

The first of the above statements happens to be false, the second happens to be true, and the third may be false or true, depending on how we choose to instantiate `x` and `y`, i.e., depending on what concrete values are represented by these variables.

That the last statement is neither true nor false is due to the fact that the variables are free. By contrast, in the other statements they are bound by the quantifiers "for any" and "there exists". Indeed, the choice of a particular integer cannot affect the truth of a statement that purports to hold for any integer: if that particular integer is a counterexample, then the statement is simply false.

"For any" (or, equivalently, "for all") is the universal quantifier; "there exists" is the existential quantifier. It is worth noticing that

• there does not exist an `x` such that `p(x)` holds is equivalent to for every `x`, `p(x)` does not hold;
• for every `x`, `p(x)` holds is equivalent to there does not exist an `x` such that `p(x)` does not hold.

The above is a generalisation of deMorgan's rules:

• not (`p` and `q`) = not `p` or not `q`;
• not (`p` or `q`) = not `p` and not `q`.

Variables in LogiQL are like the variables of logic.

It should be noted that:

• A LogiQL variable is local to the rule in which it occurs (see Chapter 11, Rules). This means that occurrences of the same variable name in different rules refer to different variables. (Variables can also occur in facts, but a fact can be considered as a degenerate rule. Moreover, variables in facts are not useful: see Section 10.1.1, “Atoms as facts”.)
• Additionally, each variable is quantified by a universal or existential quantifier: see the section called “Rules and the quantification of variables”. (The quantifier may further limit the scope of the variable: see Example 11.4, “A quantifier limits the scope of a variable”.)
• The closest counterparts to global variables are scalars, i.e., functional predicates without arguments (see Section 8.2, “Functional Predicates”). For example, we might model the maximum delay between placing an order and receiving the shipment by introducing the predicate `max_delay[] = x -> int(x).`, and making sure it has some content, e.g., `max_delay[] = 3.` Such a predicate can contain at most one tuple, which can be easily accessed at any place in the program.

### Bound variables and their instantiations

A variable such as `x` is just a placeholder. In order to use it in a computation, we must instantiate it, i.e., replace it with a concrete value. The value is called the variable's instantiation. (An instantiation is often referred to as a binding, but the latter term can be confusing, as it is often used for quite diverse purposes in different contexts.)

There are several ways in which a LogiQL variable can be instantiated:

• If the variable occurs in an atom in the body of a rule (Section 10.1.3, “Atoms in bodies of rules”), and the atom refers to an EDB or IDB predicate (Chapter 8, Predicates, Chapter 11, Rules): the variable can then become instantiated to a value derived from the values in the predicate.
• If the variable occurs on one side of an equality such as `x = y + z * 2`, and all the variables on the other side are bound (see below).
• Sometimes the compiler is able to determine that a variable is bound in more complicated cases. For example, if `x` and `z` are known to be bound in `x = y + z * 2`, then `y` is also bound. Sometimes, however, the current compiler is not so sophisticated: for example, if `x` is known to be boolean, then `x != true` is not recognised as bound.

If a variable occurs in either of such contexts, then that occurrence is called a binding occurrence, and the variable is said to be bound. (This usage of the word "bound" is quite different from the conventional usage in logic: in the conventional sense, every variable in a LogiQL rule is bound by an implicit quantifier.)

In general, a bound variable should be positively bound, i.e., it should have a binding occurrence that is not in the scope of a negation. In the section called “Negation” we discuss the somewhat subtle exception to this rule.

Please notice that the second case above does not constitute a circular definition (of the notion of being bound): the definition is recursive, but there is a base case. Ultimately, all instantiations of variables are derived from the database or from equations with values that are derived from literals (e.g., `x = 2 * 3 / 4`).

A variable that is not bound is called unbound. A legal rule (or fact) in LogiQL must be syntactically correct, but it also must not contain occurrences of unbound variables.

It should be noted that the fact that a variable is bound does not ensure that it will actually be instantiated to anything. If the variable is bound by an occurrence in a predicate, the predicate may be empty. If the variable is bound by an equality, the other side of the equality might not evaluate to a value, as in `x = y/0` (or, indeed, as in `x = y`, if `y` is not actually instantiated to anything).

In order to deal with such cases, and in order to understand the semantics of LogiQL declaratively (i.e., without reference to the details of the computation, which can be quite involved), it is sometimes best to think of a variable as representing the set of all its instantiations. For a bound variable this set will always be finite, though sometimes it might be empty. (Treating a variable as a set is actually only a first approximation: see Section 10.1.3, “Atoms in bodies of rules”.)

### Note

One consequence of such a viewpoint is that an expression such as `x + y * 2` represents not a value, but a set of values (that is, all the different values that would be produced by systematically replacing variables with each of their instantiations). In this manual we often talk about "the value" of an expression, simply because explaining everything in terms of sets would make the text too stilted. One should, however, keep this perspective in mind, especially when mention is made of the evaluation of an expression ending in failure (e.g., during division by zero): this simply means that the set of values is empty.

Example 9.2. Bound and unbound variables

`p(x, 7).`

The variable `x` is unbound, so this is not a legal fact (see Section 10.1.1, “Atoms as facts”).

`p(x, y) <- x != y.`

The variables `x` and `y` are unbound (they cannot be instantiated by the disequality), so this is not a legal rule.

`p(x, y) <- x != y, q(x), r(y).`

If `q` and `r` are unary (i.e., of arity one) predicates of the same type, then the variables `x` and `y` are bound. (The predicates must be of the same type, because the variables are used in a disequality.)

The last rule effectively declares `p` to be a predicate that contains all pairs `(x, y)`, such that value `x` is in predicate `q`, value `y` is in predicate `r`, and the two values are different. See Section 10.1.2, “Atoms in heads of rules”.

## 9.3. Arithmetic Operations

`Expression = Expression ("+" | "-" | "*" | "/") Expression .`

The multiplication (`*`) and division (`/`) operators have higher precedence than addition (`+`) and subtraction (`-`) operators, and otherwise operations associate to the left.

Example 9.3.

`x + y * z` is equivalent to `x + (y * z)`, because `*` has higher precedence than `+`. Otherwise, ambiguities are resolved by associating to the left. For example, `x - y + z` is equivalent to `(x - y) + z`.

An arithmetic expression with one operator evaluates to its usual numeric value if both the operands are numbers and the value can be represented.

The expression `3+4` evaluates to the number `7`.

Example 9.5. Division by zero

If the value of `y` is `0`, then the expression `x/y` will not have a value: its evaluation will fail.

For example, given the following logic,

```p(0).  p(1).  p(2).  p(3).

q(20). q(10).

r(z) <- p(y), q(x), z = x / y.```

`r` will contain the values `3`, `5`, `6`, `10` and `20`.

As an exception, the result of dividing two integers is always an integer. If numeric division results in a fraction, then the value of the expression is that fraction rounded toward zero.

Example 9.6. Integer division

The value of `-4/-3` is `1`. The value of `4/-3` is `-1`.

Additionally, `+` can be used for strings. If both operands are strings, then the value of the expression is the string that is their concatenation.

Example 9.7.

`("abc" + "def")` evaluates to the string `"abcdef"`.

If a floating-point primitive produces a `NaN` value, no value is produced. If a floating-point operation produces `-0.0`, it is silently converted to `0.0`.

Example 9.8.

`+F(0.0f/0.0f).` will not result in anything being inserted to `F`, since `0.0f/0.0f` produces `NaN`.

Arithmetic operators must be used with expressions of the same type. The operation will result in a value of the same type, as well. It is a compile error for the right-hand-side of an operator to be a different type than the left-hand-side.

Example 9.9.

`5d * 3.0d` results in value `1.6667d`, a decimal value.

Example 9.10.

`5d * 3.0f` results in a compiler error, `TYPE_ERROR`.

## 9.4. Function applications

```Expression          = FunctionApplication
| ExternalDiffFunctionApplication .
FunctionApplication = PredicateDescriptor "[" [ ArgumentList ] "]" .

ExternalDiffFunctionApplication =
(PredicateDescriptor \ PredicateDescriptor) "[" [ ArgumentList ] "]" .

PredicateDescriptor = PredicateName [ "@" Suffix ].
PredicateName       = Identifier.
ArgumentList        = Expression { "," Expression } .```

The optional suffix in `PredicateDescriptor` can be a reference to a transaction stage (Section 21.4.4, “Stage suffixes”) or the name of a branch (Chapter 43, Database Branching).

`ExternalDiffFunctionApplication` is an application of an external diff predicate, which is allowed when the predicate whose versions are compared happens to be functional. See Section 8.11, “External Diff Predicates” and Section 10.1.5, “External diff atoms” for more information.

Function applications can be written only for predicates that are functional. It is an error to write an application for a non-functional predicate.

The expressions between square brackets (`[` and `]`) are called arguments of the application. The number of arguments supplied must be exactly the number of key arguments of the relevant functional predicate.

An application expression applies a functional predicate to a series of zero or more arguments. The expression evaluates to the value in the value position of that tuple of the functional predicate that matches the arguments.

Example 9.11.

Given a database where the predicate `sold` contains tuples: ```{ ("squids", 1995, 100), ("salmon", 1995, 20) }```, the application ```sold["squids", 1995]``` evaluates to `100`.

Here are a few example applications:

Example 9.12.

```cost["squids", 1995]
cost[item, year]
cost[bestseller[year], year + 1]```

It is worth noting that an application is purely a notational convenience, and can always be rewritten to something else. See Section 10.1, “Atoms” and Section 10.2, “Comparisons”.

Example 9.13. Functional notation is just a convenience

If `area_of_room` is a functional predicate, then the following two lines are equivalent in all respects (but the first one is to be preferred on styslistic grounds).

```area_of_room[12, 10] = a
area_of_room(12, 10, a)```

If the identifiers `new_var_1` and `new_var_2` are unused elsewhere in the current rule, then

`... <- ... cost[bestseller[year], year + 1] < can_spend ...`

is equivalent to

```... <- ... bestseller(year, new_var_1)
cost(new_var_1, year + 1, new_var_2),
new_var_2 < can_spend
...```

Similarly, the following

`p(cost[x]) <- purchased_item(x).`

is equivalent to

`p(c) <- c = cost(x, c), purchased_item(x).`

## 9.5. Parenthesized expressions

`Expression = "(" Expression ")" .`

An expression may be enclosed in parentheses. The value of such an expression is the same as the value of the enclosed expression. Parenthesized expressions are useful for overcoming the usual order of precedence.

Example 9.14.

`2 * 3 + 4` is not the same as `2 * (3 + 4)`, because the parentheses cause the summation to be evaluated before the multiplication.