Papers by Mohamed Soliman
Top-k processing in uncertain databases is semantically and computationally different from traditional top-k processing. The interplay between query scores and data uncertainty makes traditional techniques inapplicable. We introduce URank, a system that processes new probabilistic formulations of top-k queries in uncertain databases. The new formulations are based on a marriage of traditional top-k semantics with possible worlds semantics. URank encapsulates a new processing framework that leverages existing query processing capabilities, and implements efficient search strategies that integrate ranking on scores with ranking on probabilities, to obtain meaningful answers for top-k queries.
Top-k processing in uncertain databases is semantically and computationally different from traditional top-k processing. The interplay between score and uncertainty makes traditional techniques inapplicable. We introduce new probabilistic formulations for top-k queries. Our formulations are based on a "marriage" of traditional top-k semantics and possible worlds semantics. In light of these formulations, we construct a framework that encapsulates a state space model and efficient query processing techniques to tackle the challenges of uncertain data settings. We prove that our techniques are optimal in terms of the number of accessed tuples and materialized search states. Our experiments show the efficiency of our techniques under different data distributions, with orders of magnitude improvement over naïve materialization of possible worlds.
Proceedings of the VLDB Endowment, 2008
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications.
ACM Transactions on Database Systems, 2008
Probabilistic Top-k and Ranking-Aggregate Queries. Mohamed A. Soliman and Ihab F. Ilyas, University of Waterloo, and Kevin Chenchuan Chang, University of Illinois at Urbana-Champaign. Ranking and ...
ACM Computing Surveys, 2008
Efficient processing of top-k queries is a crucial requirement in many interactive environments that involve massive amounts of data. In particular, efficient top-k processing in domains such as the Web, multimedia search, and distributed systems has shown a great impact on performance. In this survey, we describe and classify top-k processing techniques in relational databases. We discuss different design dimensions in the current techniques, including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions. We show the implications of each dimension on the design of the underlying techniques. We also discuss top-k queries in the XML domain, and show their connections to relational approaches.
Large databases with uncertain information are becoming more common in many applications including data integration, location tracking, and Web search. In these applications, ranking records with uncertain attributes needs to handle new problems that are fundamentally different from conventional ranking. Specifically, uncertainty in records' scores induces a partial order over records, as opposed to the total order that is assumed in the conventional ranking settings.