1993, ACM Transactions on Database Systems
…
27 pages
Many algorithms have been devised for minimizing the costs associated with obtaining the answer to a single, isolated query in a distributed database system. However, if more than one query may be processed by the system at the same time and if the arrival times of the queries are unknown, the determination of optimal query-processing strategies becomes a stochastic optimization problem. In order to cope with such problems, a theoretical state-transition model is presented that treats the system as one operating under a stochastic load. Query-processing strategies may then be distributed over the processors of a network as probability distributions, in a manner which accommodates many queries over time. It is then shown that the model leads to the determination of optimal query-processing strategies as the solution of mathematical programming problems, and analytical results for several examples are presented. Furthermore, a divide-and-conquer approach is introduced for decomposing ...
ACM Transactions on Database Systems, 1986
A state transition model for the optimization of query processing in a distributed database system is presented. The problem is parameterized by means of a state describing the amount of processing that has been performed at each site where the database is located. A state transition occurs each time a new join or semijoin is executed. Dynamic programming is used to compute recursively the costs of the states and the globally optimal solution, taking into account communication and local processing costs. The state transition model is general enough to account for the possibility of parallel processing among the various sites, as well as for redundancy in the database. The model also permits significant reductions of the necessary computations by taking advantage of simple additivity and site-uniformity properties of a cost model, and of clever strategies that improve on the basic dynamic programming algorithm.
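As a rough illustration of the dynamic-programming idea described above, the following sketch treats a state as the set of relations joined so far and a transition as the execution of one more join. It is a deliberately simplified, single-site stand-in, not the paper's model: the cardinalities in `CARD`, the selectivity divisor `SEL_DIV`, and the cost function `result_size` are all hypothetical, and communication costs, semijoins, parallelism, and redundancy are ignored.

```python
from functools import lru_cache

# Hypothetical relation cardinalities and a uniform join selectivity of 1/100.
CARD = {"R": 1000, "S": 100, "T": 10}
SEL_DIV = 100


def result_size(rels):
    """Estimated size of the join of all relations in `rels` (assumed model)."""
    size = 1
    for r in rels:
        size *= CARD[r]
    return size // SEL_DIV ** (len(rels) - 1)


@lru_cache(maxsize=None)
def best_plan(state):
    """Return (cost, join order) of the cheapest way to reach `state`.

    `state` is a frozenset of the relations joined so far; each transition
    adds one relation and costs the size of the resulting intermediate result.
    """
    if len(state) == 1:
        return 0, tuple(state)
    best = None
    for last in sorted(state):
        prev_cost, prev_order = best_plan(state - {last})
        cost = prev_cost + result_size(state)
        if best is None or cost < best[0]:
            best = (cost, prev_order + (last,))
    return best


cost, order = best_plan(frozenset(CARD))
print(cost, order)
```

Memoizing on the state means each subset of relations is costed once, which is the usual saving of dynamic programming over enumerating every join order separately.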
2003
The general stochastic query optimization (GSQO) problem for a multiple join (a join of p relations stored at p different sites) is presented. The GSQO problem leads to a special kind of nonlinear programming problem (P). Problem (P) is solved by a constructive method: a sequence converging to the solution of the optimization problem is built. Two algorithms for solving optimization problem (P) are proposed.
This paper addresses the processing of a query in distributed database systems using a sequence of semijoins. The objective is to minimize the intersite data traffic incurred by a distributed query. A method is developed which accurately and efficiently estimates the size of an intermediate result of a query. This method provides the basis of the query optimization algorithm. Since the distributed query optimization problem is known to be intractable, a heuristic algorithm is developed to determine a low-cost sequence of semijoins. The cost comparison with an existing algorithm is provided. The complexity of the main features of the algorithm is analytically derived. The scheduling time for sequences of semijoins is measured for example queries using the PASCAL program which implements the algorithm.
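To make the role of a semijoin concrete, here is a minimal sketch of how reducing a relation before shipping it can cut intersite traffic without changing the join result. The relations `orders` and `vip`, the attribute name `cust`, and the helper functions are hypothetical and are not taken from the paper. The paper's size-estimation method and heuristic are not reproduced here.

```python
def semijoin(r, s, attr):
    """Rows of r whose value of `attr` also occurs in s."""
    keys = {row[attr] for row in s}
    return [row for row in r if row[attr] in keys]


def join(r, s, attr):
    """Equi-join of r and s on `attr` using a hash index on s."""
    index = {}
    for row in s:
        index.setdefault(row[attr], []).append(row)
    return [{**a, **b} for a in r for b in index.get(a[attr], [])]


# Hypothetical relations stored at two different sites.
orders = [{"cust": c, "item": i} for c, i in [(1, "a"), (2, "b"), (3, "c"), (4, "d")]]
vip = [{"cust": 2, "tier": "gold"}, {"cust": 4, "tier": "silver"}]

# Only the join column of `vip` has to travel to the site holding `orders`;
# the reduced `orders` is then shipped instead of the full relation.
reduced = semijoin(orders, vip, "cust")
print(len(orders), "rows before reduction,", len(reduced), "after")
```

The reduction pays off when the projected join column plus the reduced relation is smaller than the full relation, which is why a good estimate of intermediate result sizes matters when choosing a semijoin sequence.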
International Journal of Computer Applications, 2013
Query optimization in distributed database systems is one of the dominant subjects in database theory. Depending on the placement of the data, a query can be described as centralized or distributed. Processing a distributed query differs fundamentally from processing a centralized one because the data is spread over a number of sites. The Decision Support System Query (DSSQ) is an important type of distributed query. DSS queries are complex and time-consuming. Because of the decentralization of data and the complexity of the query, it becomes necessary to optimize DSS queries in a distributed database system. In this work, an effort is made to find an optimal DSS subquery allocation plan in a distributed environment stochastically, using a Genetic Algorithm. The queries are designed on the basis of the TPC-DS benchmark for DSS queries. The DSS queries are optimized on the basis of Total Cost. The use of the Genetic Algorithm significantly expedited DSS query optimization. The effect of varying communication cost on the Total Cost of system resources is also observed.
IEEE Transactions on Computers, 2000
A model is developed for determining the optimal policy for processing a given relational model query. The model is based on operating cost (processing cost and communication cost), which is a function of selection of sites for processing query operations, sequence of operations, file size, and data reduction functions. The optimal policy specifies the site selection and sequence of operations that yield minimum operating cost. The query is first decomposed into a set of relational algebra operations whose precedence relationships are expressed as a query tree. Additional query trees may be generated by permuting these operations. A set of query processing graphs is then generated for a given query tree. Each node of a query processing graph represents the execution of a set of operations at a single site. Since the neighboring nodes represent distinct processing sites, the arcs between nodes represent the communication cost among sites. Theorems based on the cost model and the query processing graphs are developed for determining the optimal sites for processing the operations and for selecting the local optimal graphs from the set of query processing graphs. Use of these theorems greatly reduces the computation requirements in determining the optimal query processing policy. An example is given to illustrate the model. Index Terms: distributed database, local operation group, optimal query processing, query operating cost, query processing graph, query tree, relational algebra, relational database.
Information Processing Letters, 1980
I would like to thank my supervisor Dr Dan Olteanu for his incredible level of enthusiasm and encouragement throughout the project. I am also very grateful for the continuous level of feedback and organisation as well as the amount of time he has devoted to answering my queries. I feel that I now approach complex and unknown problems with enthusiasm instead of apprehension as I used to. I couldn't have had a better supervisor.
2018
A distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing uses a communication network to transmit data between sites, and it is one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, but its complexity and cost increase with the number of relations in the query. Mariposa, query trading, and query trading with processing task-trading strategies were developed for autonomous distributed database systems, but they incur a high optimization cost because all nodes are involved in generating an optimal plan. In this paper, we propose a modification of the autonomous strategy K-QTPT that gradually gives higher priority to the seller nodes with the lowest cost, in order to reduce the optimization time. We implement our proposed strategy and present the results and an analysis based on those results. Keywords: A...
Mathematical and Computer Modelling, 1995
The uncertainty inherent in the distributed environment poses new challenges to the efficient utilization of system resources in managing database transactions. In response to this realization, the execution of a join query in a system with probabilistic resource and cost parameters is contemplated, leading to the development of stochastic programming models. Information in the form of relational tables and scattered amongst the sites of a distributed database system is to be collated and presented to the appropriate user, in response to an issued request. Performing this task demands the usage of limited resources; the ultimate goal is the determination of an execution strategy incurring minimal cost to the system. The actual state of any network component at the moment of its exploitation cannot be exactly ascertained in advance. Any interrogation of a distant element must be communicated by the network, and this involves a delay, as perceived by the questioner, during which the state of the system may change. Indeed, the time at which a task is assigned to any particular component cannot itself be precisely predicted, even if the future state of the component could be known definitively. By considering the uncertain nature of the distributed environment, the earlier model of join query evaluation presented in [1] can be modified in different ways to account for system parameters known only in a stochastic sense. This new level of subjectivity is a revelation of the many different attitudes that may be taken towards the chance of infeasibility in the solution, for the major issue in dealing with uncertainty is the choice of an appropriate measure of risk.
Journal of Heuristics, 1997
The query optimizer is the DBMS (data base management system) component whose task is to find an optimal execution plan for a given input query. Typically, optimization is performed using dynamic programming. However, in distributed execution environments, this approach becomes intractable, due to the increase in the search space incurred by distribution. We propose the use of the tabu search metaheuristic for distributed query optimization. A hashing-based data structure is used to keep track of the search memory, simplifying significantly the implementation of tabu search. To validate this proposal, we implemented the tabu search strategy in the scope of an existing optimizer, which runs several search strategies. We focus our attention on the more difficult problems in terms of the query execution space, in which the solution space includes bushy execution plans and Cartesian products, which are not dealt with very often in the literature. Using a real-life application, we show the effectiveness of tabu search when compared to other strategies.
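The general shape of a tabu search with a hash-based memory can be sketched as follows. This is a toy version under stated assumptions, not the optimizer described in the abstract: it searches only left-deep join orders, uses a hypothetical cost model (`CARD`, `SEL_DIV`, `plan_cost`), and remembers every visited plan in a Python `set` rather than using a bounded tabu list with aspiration criteria.

```python
from itertools import combinations

# Hypothetical cardinalities and a uniform join selectivity of 1/10.
CARD = {"A": 1000, "B": 50, "C": 200, "D": 5}
SEL_DIV = 10


def plan_cost(order):
    """Sum of intermediate result sizes for a left-deep join order."""
    size, total = CARD[order[0]], 0
    for rel in order[1:]:
        size = size * CARD[rel] // SEL_DIV
        total += size
    return total


def neighbors(order):
    """All join orders obtained by swapping two positions."""
    for i, j in combinations(range(len(order)), 2):
        swapped = list(order)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        yield tuple(swapped)


def tabu_search(start, iterations=20):
    current = best = tuple(start)
    visited = {current}  # hash-based search memory: plans already explored
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in visited]
        if not candidates:
            break
        # Move to the best non-tabu neighbor, even if it is worse than the
        # current plan; this is what lets the search leave local optima.
        current = min(candidates, key=plan_cost)
        visited.add(current)
        if plan_cost(current) < plan_cost(best):
            best = current
    return best, plan_cost(best)


best_order, best_cost = tabu_search(("A", "B", "C", "D"))
print(best_order, best_cost)
```

Because plans are tuples, membership checks against `visited` are hash lookups, which keeps the tabu test cheap as the memory grows.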