Scalable Test Data Generation from Multi-dimensional Models

Congratulations to Emina Torlak, a researcher at the University of California, Berkeley and former LogicBlox team member, on the acceptance of her paper to FSE 2012! FSE is the premier venue for publishing fundamental research results in software engineering. Emina’s research in test model generation is instrumental to the way LogicBlox tests its decision support software. We are proud to have supported her work in this area!

Abstract: Multidimensional data models form the core of modern decision support software. The need for this kind of software is significant, and it continues to grow with the size and variety of datasets being collected today. Yet real multidimensional instances are often unavailable for testing and benchmarking, and existing data generators can only produce a limited class of such structures. In this paper, we present a new framework for scalable generation of test data from a rich class of multidimensional models. The framework provides a small, expressive language for specifying such models, and a novel solver for generating sample data from them. While the satisfiability problem for the language is NP-hard, we identify a polynomially solvable fragment that captures most practical modeling patterns. Given a model and, optionally, a statistical specification of the desired test dataset, the solver detects and instantiates a maximal subset of the model within this fragment, generating data that exhibits the desired statistical properties. We use our framework to generate a variety of high-quality test datasets from real industrial models, which cannot be correctly instantiated by existing data generators, or as effectively solved by general-purpose constraint solvers.



