Scaling XML query processing: distribution, localization and pruning

Patrick Kling; M. Tamer Özsu; Khuzaima Daudjee

2011

Abstract

Distributing data collections by fragmenting them is an effective way of improving the scalability of a database system. While the distribution of relational data is well understood, the unique characteristics of XML data and its query model present challenges that require different distribution techniques. In this paper, we show how XML data can be fragmented horizontally and vertically.

Key takeaways

With horizontal fragmentation, it is possible to evaluate a query by computing the union of all fragments and then executing a centralized query plan over the result.
Since the local plans can be evaluated independently of each other in parallel, we can model the cost of a query $q$ as $\mathrm{cost}(q) = \max\{\mathrm{cost}(p_j) \mid p_j \in P\}$, where $P$ is the set of local plans (after pruning) corresponding to $q$ for a given vertical fragmentation schema.
So far, for simplicity, we have focused on identifying a fragmentation schema for a single query.
Distributed query execution over the hybrid fragmentation yields even better results.
Figure 23 shows the throughput rates achieved by centralized query execution (which are vanishingly low in some of the cases shown), as well as distributed query execution (with and without pruning) on a balanced fragmentation consisting of 2, 4 and 8 fragments and on the skewed fragmentation.
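
The parallel cost model stated in the takeaways above can be illustrated with a minimal sketch. The function name `query_cost`, the plan labels and the cost values are hypothetical and chosen only for illustration. Treating a query whose local plans have all been pruned as having cost 0 is also an assumption, not something the source specifies.

```python
from typing import Mapping


def query_cost(local_plan_costs: Mapping[str, float]) -> float:
    """Cost of a query whose local plans run independently in parallel.

    Because the plans execute concurrently, the slowest local plan
    determines the overall cost: cost(q) = max{cost(p_j) | p_j in P}.
    """
    # Assumption: if pruning removed every local plan, report cost 0.
    if not local_plan_costs:
        return 0.0
    return max(local_plan_costs.values())


# Hypothetical per-plan cost estimates remaining after pruning.
costs_after_pruning = {"p1": 12.0, "p2": 30.5, "p3": 7.25}
print(query_cost(costs_after_pruning))  # 30.5
```

Under this model, pruning lowers the query cost only when it removes or shrinks the most expensive local plan. Removing cheaper plans frees resources but leaves the maximum unchanged.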
