We recently had occasion to study and admire the join algorithm of Ngo, Porat, Ré and Rudraa (henceforth NPRR). Given bounds on the input relation sizes, the running time of NPRR is bounded by the largest possible result size, which may be determined using the fractional edge cover method. Our commercial Datalog-based system LogicBlox employs a novel join algorithm (leapfrog triejoin) which performs conspicuously well over diverse benchmarks; we were curious how it would compare. We establish that leapfrog triejoin is also worst-case optimal for fully conjunctive queries, and in fact satisfies a more exacting optimality criterion. Our algorithm may offer a practical competitor to NPRR, being easy to absorb, simple to implement, and having a concise optimality proof.
Leapfrog Triejoin: A Worst-case Optimal Join Algorithm
-
SecureBlox: Customizable Secure Distributed Data Processing
Shan Shan Huang, , Publications, 0
The application of Datalog to the area of security, both in analysis and in the specification of rules, has...
-
In LogiQL: Relating Constraint Handling Rules to Datalog
Shan Shan Huang, , Publications, 0
One of the advantages of using LogiQL is the unification of programming model such that prescriptive rules of computation...
-
Scalable Analysis of Conceptual Data Models
Shan Shan Huang, , Publications, 0
Abstract Conceptual data models describe information systems without the burden of implementation details, and are increasingly used to generate...
-
ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution
Shan Shan Huang, , Publications, 0
Entity resolution — determining multiple distinct pieces of data as identifiers for the same real-world object — is a...
-
Algebraic Structures for Capturing the Provenance of SPARQL Queries
Shan Shan Huang, , Publications, 0
Congratulations to LogicBlox team member Grigoris Karvounarakis on his recent publication in ICDT 2013: 16thInternational Conference on Database Theory....
-
Win-move is Coordination Free… Sometimes
Shan Shan Huang, , Publications, 0
Coordination barriers are a major source of inefficiency in distributed query evaluation. Identifying queries that can be evaluated in...
-
Rewriting Guarded Negation Queries
Shan Shan Huang, , Publications, 0
Query optimizations are notoriously difficult for queries involving negation. Together with academic collaborator Professor Michael Benedikt of University of...
-
LogicBlox, Platform and Language: a Tutorial
Shan Shan Huang, , Publications, 0
Datalog is the mathematical foundation to LogiQL, the LogicBlox query language. We here take a keen interest in fostering...