Papers by Mohamed Soliman

Proceedings of the Fifth International Workshop on Testing Database Systems, 2012
The accuracy of a query optimizer is intricately connected with a database system's performance and its operational cost: the more accurate the optimizer's cost model, the better the resulting execution plans. Database application programmers and other practitioners have long provided anecdotal evidence that database systems differ widely with respect to the quality of their optimizers, yet to date no formal method is available to database users to assess or refute such claims. In this paper, we develop a framework to quantify an optimizer's accuracy for a given workload. We make use of the fact that optimizers expose switches or hints that let users influence the plan choice and generate plans other than the default plan. Using these implements, we force the generation of multiple alternative plans for each test case, time the execution of all alternatives, and rank the plans by their effective costs. We compare this ranking with the ranking of the estimated cost and compute a score for the accuracy of the optimizer. We present initial results of an anonymized comparison of several major commercial database systems, demonstrating that there are in fact substantial differences between systems. We also suggest ways to incorporate this knowledge into the commercial development process.
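The ranking comparison described in the abstract above can be illustrated with a small sketch. The abstract does not specify the scoring function, so this example assumes a plain Kendall tau rank correlation between estimated cost and measured execution time; the function name `rank_agreement` and its inputs are hypothetical.

```python
from itertools import combinations


def rank_agreement(estimated_costs, measured_times):
    """Kendall tau-a between estimated cost and measured time.

    Assumption: this is a stand-in score, not the metric from the paper.
    Each index identifies one alternative plan for the same query.
    Returns a value in [-1, 1]; 1 means both rankings agree on every pair.
    """
    n = len(estimated_costs)
    if n != len(measured_times):
        raise ValueError("need one measured time per estimated cost")
    if n < 2:
        raise ValueError("need at least two alternative plans")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        est_diff = estimated_costs[i] - estimated_costs[j]
        time_diff = measured_times[i] - measured_times[j]
        sign = est_diff * time_diff
        if sign > 0:
            concordant += 1   # optimizer ordered this pair correctly
        elif sign < 0:
            discordant += 1   # optimizer ordered this pair incorrectly
        # ties in either ranking count as neither

    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs
```

A score near 1 would suggest that the cost model orders the forced alternative plans the same way their measured run times do; a score near -1 would suggest the opposite ordering.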
Proceedings of the 2007 ACM …, 2007
Top-k processing in uncertain databases is semantically and computationally different from traditional top-k processing. The interplay between query scores and data uncertainty makes traditional techniques inapplicable. We introduce URank, a system that processes new probabilistic formulations of top-k queries in uncertain databases. The new formulations are based on a marriage of traditional top-k semantics with possible worlds semantics. URank encapsulates a new processing framework that leverages existing query processing capabilities, and implements efficient search strategies that integrate ranking on scores with ranking on probabilities, to obtain meaningful answers for top-k queries.
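To make the combination of top-k and possible worlds semantics concrete, here is a brute-force sketch. It is not URank's algorithm: it assumes independent tuples with distinct scores, enumerates every possible world (exponential in the number of tuples), and reports the probability that each tuple appears among the k highest-scoring tuples. The function name and the tuple layout are hypothetical.

```python
from itertools import product


def topk_membership_probabilities(tuples, k):
    """Probability that each tuple is in the top-k, by world enumeration.

    tuples: list of (name, score, probability) with independent existence.
    Assumption: scores are distinct, so each world has a unique top-k.
    """
    result = {name: 0.0 for name, _, _ in tuples}

    # Each possible world is one choice of present/absent per tuple.
    for present in product([False, True], repeat=len(tuples)):
        world_prob = 1.0
        for (_, _, prob), here in zip(tuples, present):
            world_prob *= prob if here else 1.0 - prob

        # Rank the tuples that exist in this world by score, descending.
        world = [t for t, here in zip(tuples, present) if here]
        world.sort(key=lambda t: t[1], reverse=True)

        # Credit this world's probability to its top-k members.
        for name, _, _ in world[:k]:
            result[name] += world_prob

    return result
```

For example, with tuples `("a", 90, 0.5)`, `("b", 80, 1.0)`, and `("c", 70, 1.0)` and `k=1`, tuple `a` is the top answer only in the worlds where it exists, so `a` and `b` each receive probability 0.5 and `c` receives 0.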

Proceedings of the VLDB …, 2008
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications. In this paper, we study the problem of finding objects with the highest marginal probability of being the nearest neighbors to a query object. We adopt a general uncertainty model allowing for data and query uncertainty. Under this model, we define new query semantics, and provide several efficient evaluation algorithms. We analyze the cost factors involved in query evaluation, and present novel techniques to address the trade-offs among these factors. We give multiple extensions to our techniques including handling dependencies among data objects, and answering threshold queries. We conduct an extensive experimental study to evaluate our techniques on both real and synthetic data.
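The notion of a marginal nearest-neighbor probability can be illustrated with a small exhaustive sketch. It is not one of the paper's evaluation algorithms and simplifies the model considerably: it assumes a certain (non-uncertain) query point, independent objects, and a discrete set of candidate locations per object. The function name and the input layout are hypothetical.

```python
from itertools import product
from math import dist


def nn_probabilities(objects, query):
    """Probability that each uncertain object is the nearest neighbor.

    objects: dict mapping name -> list of (point, probability) alternatives;
             the probabilities of one object's alternatives sum to 1.
    query:   a fixed point (assumption: no query uncertainty here).
    Ties in distance are broken in favor of the object listed first.
    """
    names = list(objects)
    result = dict.fromkeys(names, 0.0)

    # Enumerate every joint choice of one location per object.
    for combo in product(*(objects[name] for name in names)):
        world_prob = 1.0
        for _, prob in combo:
            world_prob *= prob

        # Find the object closest to the query in this instantiation.
        nearest = min(range(len(names)),
                      key=lambda i: dist(combo[i][0], query))
        result[names[nearest]] += world_prob

    return result
```

With object `a` located at `(1, 0)` or `(5, 0)` with probability 0.5 each, object `b` fixed at `(2, 0)`, and the query at the origin, `a` is nearest only when it is at `(1, 0)`, so both objects receive probability 0.5.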