We recently had occasion to study and admire the join algorithm of Ngo, Porat, Ré and Rudraa (henceforth NPRR). Given bounds on the input relation sizes, the running time of NPRR is bounded by the largest possible result size, which may be determined using the fractional edge cover method. Our commercial Datalog-based system LogicBlox employs a novel join algorithm (leapfrog triejoin) which performs conspicuously well over diverse benchmarks; we were curious how it would compare. We establish that leapfrog triejoin is also worst-case optimal for fully conjunctive queries, and in fact satisfies a more exacting optimality criterion. Our algorithm may offer a practical competitor to NPRR, being easy to absorb, simple to implement, and having a concise optimality proof.
Leapfrog Triejoin: A Worst-case Optimal Join Algorithm
Posted in Publications