2018, HAL (Le Centre pour la Communication Scientifique Directe)
15 pages
Dealing with large amounts of data has always been a key focus of the Multidimensional Database (MDB) community, especially in the current era, when data volumes grow ever more rapidly. In this paper, we outline a conceptual modeling solution for reducing data in MDBs. An MDB after reduction is modeled with multiple states. Each state is valid during a period of time and aggregates data from a more recent state. We propose three alternatives for modeling a reduced MDB at the logical level: (i) the flat modeling integrates all states into one single table, (ii) the horizontal modeling converts each state into a fact table and some dimension tables associated with a temporal interval, and (iii) the vertical modeling breaks down a reduced MDB into separate tables, each of which includes data from one or several states. We evaluate query execution efficiency in MDBs with and without data reduction. The results show that data reduction is an interesting solution, since it decreased execution costs by 98.96% in our experimental assessments.
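The idea of states can be illustrated with a minimal Python sketch. It is not the paper's implementation: the sample rows, the cutoff date, the choice of aggregating older rows per (year, product), and the names `reduce_to_states` and `flat_table` are all assumptions made for illustration. It keeps recent data at day granularity, aggregates older data into a coarser state, and then lays both states out in one table, loosely in the spirit of the flat modeling.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale_date, product, amount).
ROWS = [
    (date(2016, 3, 14), "pen", 10.0),
    (date(2016, 3, 20), "pen", 5.0),
    (date(2016, 7, 2), "ink", 7.0),
    (date(2018, 1, 9), "pen", 3.0),
    (date(2018, 1, 9), "ink", 4.0),
]


def reduce_to_states(rows, cutoff):
    """Split rows into a detailed recent state and an aggregated older state.

    Rows on or after `cutoff` stay at day granularity. Older rows are summed
    per (year, product), so the older state holds fewer tuples.
    """
    recent = [r for r in rows if r[0] >= cutoff]
    older = defaultdict(float)
    for day, product, amount in rows:
        if day < cutoff:
            older[(day.year, product)] += amount
    return {"recent": recent, "older": dict(older)}


def flat_table(states):
    """Flat-style layout (assumed): every state in one table, tagged by state."""
    table = [("recent", d.isoformat(), p, a) for d, p, a in states["recent"]]
    table += [
        ("older", str(year), p, a)
        for (year, p), a in sorted(states["older"].items())
    ]
    return table


states = reduce_to_states(ROWS, cutoff=date(2018, 1, 1))
flat = flat_table(states)
```

In this toy data the three 2016 rows collapse into two aggregated tuples, while the total amount across the table is unchanged. A horizontal or vertical layout would distribute the same states over several tables instead of tagging rows.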
Lecture Notes in Computer Science, 2014
OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible. This is an author-deposited version published in: http://oatao.univ-toulouse.fr/ Eprints ID: 13214
Lecture Notes in Computer Science, 2000
It is commonly agreed that multidimensional data cubes form the basic logical data model for OLAP applications. Still, there seems to be no agreement on a common model for cubes. In this paper we propose a logical model for cubes based on the key observation that a cube is not a self-existing entity, but rather a view over an underlying data set. We accompany our model with syntactic characterisations for the problem of cube usability. To this end, we have developed algorithms to check whether (a) the marginal conditions of two cubes are appropriate for a rewriting, in the presence of aggregation hierarchies and (b) an implication exists between two selection conditions that involve different levels of aggregation of the same dimension hierarchy. Finally, we present a rewriting algorithm for the cube usability problem.
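A small Python sketch may help make the "cube as a view" observation and the usability question concrete. It is only an illustration under stated assumptions: the base rows, the `cube` and `rollup_day_to_month` helpers, and the day-to-month hierarchy (derived here from the first seven characters of an ISO date string) are hypothetical, and the sketch does not implement the paper's algorithms for marginal or selection conditions. It shows one simple case: a cube at (month, city) can be rewritten over a cube at (day, city) because SUM is distributive and day rolls up to month.

```python
from collections import defaultdict

# Hypothetical underlying data set: (day, month, city, sales).
BASE = [
    ("2000-01-03", "2000-01", "Athens", 5),
    ("2000-01-17", "2000-01", "Athens", 7),
    ("2000-01-17", "2000-01", "Patras", 2),
    ("2000-02-01", "2000-02", "Athens", 4),
]

# Column position of each aggregation level in a base row.
LEVEL_INDEX = {"day": 0, "month": 1, "city": 2}


def cube(rows, levels):
    """Evaluate a cube as a view: SUM(sales) grouped by the given levels."""
    out = defaultdict(int)
    for row in rows:
        key = tuple(row[LEVEL_INDEX[level]] for level in levels)
        out[key] += row[3]
    return dict(out)


def rollup_day_to_month(day_city_cube):
    """Rewrite a (month, city) cube over an existing (day, city) cube."""
    out = defaultdict(int)
    for (day, city), total in day_city_cube.items():
        out[(day[:7], city)] += total  # assumed hierarchy: day -> month
    return dict(out)


fine = cube(BASE, ["day", "city"])
coarse_direct = cube(BASE, ["month", "city"])
coarse_rewritten = rollup_day_to_month(fine)
```

Here `coarse_rewritten` equals `coarse_direct`, so the finer cube is usable for answering the coarser one without touching the base data. The reverse direction does not hold, since monthly totals cannot be split back into days.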
Lecture Notes in Computer Science, 2000
In the context of multidimensional databases implemented on relational DBMSs through star schemes, the most effective technique to enhance performance consists of materializing redundant aggregates called views. In this paper we investigate the problem of vertical fragmentation of views aimed at minimizing the workload response time. Each view includes several measures which are not necessarily always requested together; thus, the system performance may be increased by partitioning the views into smaller tables. On the other hand, drill-across queries involve measures taken from two or more views; in this case the access costs may be decreased by unifying these views into larger tables. After formalizing the fragmentation problem as a 0-1 integer linear programming problem, we define a cost function and outline a branch-and-bound algorithm to minimize it. Finally, we demonstrate the usefulness of our approach by presenting a set of experimental results based on the TPC-D benchmark.
International Journal of Decision Support System Technology, 2015
The authors' aim is to provide a solution for reducing multidimensional data warehouses based on analysts' needs, which specify an aggregated schema applicable over a period of time and retain only the data useful for decision support. Firstly, they describe a conceptual model for multidimensional data warehouses. A multidimensional data warehouse's schema is composed of a set of states. Each state is defined as a star schema composed of one fact and its related dimensions. The derivation between states is carried out through a combination of reduction operators. Secondly, they present a meta-model which allows managing the different states of a multidimensional data warehouse. The definition of reduced and unreduced multidimensional data warehouse schemas can be carried out by instantiating the meta-model. Finally, they describe their experimental assessments and discuss their results. Evaluating their solution implies executing different queries in various contexts:...
Very Large Data Bases, 1997
We present a multi-dimensional database model, which we believe can serve as a conceptual model for On-Line Analytical Processing (OLAP)-based applications. Apart from providing the functionalities necessary for OLAP-based applications, the main feature of the model we propose is a clear separation between structural aspects and the contents. This separation of concerns allows us to
Our aim is to define a framework supporting analysis in multidimensional data warehouses (MDW) with reductions. Firstly, we describe a modeling solution for a reduced MDW. The schema of a reduced MDW is composed of states. Each state is defined as a star schema composed of one fact and its related dimensions, valid for a certain period of time. Secondly, we present a multi-state analysis framework. Extensions of the classical drill-down and roll-up operators are defined to support multi-state analyses. Finally, we present a prototype of our framework that aims to demonstrate the feasibility of the concept. By implementing our extended operators, the prototype automatically generates appropriate SQL queries over metadata and reduced data.
The relational data model, which was introduced by Codd in 1970 and earned him the Turing Award a decade later, was the foundation of today’s multi-billion-dollar database industry. During the 1990s, a new type of data model, the multidimensional data model, emerged and took over from the relational model where the objective is to analyze data rather than to perform on-line transactions. Multidimensional data models are designed expressly to support data analyses. A number of such models have been proposed by researchers from academia and industry. In academia, formal mathematical models have been proposed, while the industrial proposals have typically been specified more or less implicitly by the concrete software tools that implement them. Briefly, multidimensional models categorize data as being either facts with associated numerical measures, or as being dimensions that characterize the facts and are mostly textual. For example, in a retail business, products are sold to customers at certain times in certain amounts and at certain prices. A typical fact would be a purchase. Typical measures would be the amount and price of the purchase. Typical dimensions would be the location of the purchase, the type of product being purchased, and the time of the purchase. Queries then aggregate measure values over ranges of dimension values to produce results such as the total sales per month and product type...
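The retail example above can be sketched in a few lines of Python. The `Purchase` class, the sample rows, and the reading of "price" as a unit price (so that sales = amount × unit price) are assumptions made for illustration, not part of any model described in the text. Each purchase is a fact with two measures and three dimensions, and the query aggregates a measure over month and product type.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Purchase:
    # Dimensions: characterize the fact.
    location: str
    product_type: str
    day: date
    # Measures: numerical values attached to the fact.
    amount: int
    unit_price: float  # assumption: "price" is interpreted as a unit price


# Hypothetical purchase facts.
PURCHASES = [
    Purchase("Aalborg", "dairy", date(2001, 5, 2), 2, 1.5),
    Purchase("Aalborg", "dairy", date(2001, 5, 20), 1, 1.5),
    Purchase("Aarhus", "bakery", date(2001, 5, 20), 3, 2.0),
    Purchase("Aarhus", "dairy", date(2001, 6, 1), 4, 1.5),
]


def total_sales_by_month_and_type(purchases):
    """Aggregate amount * unit_price over (month, product type)."""
    totals = defaultdict(float)
    for p in purchases:
        month = (p.day.year, p.day.month)
        totals[(month, p.product_type)] += p.amount * p.unit_price
    return dict(totals)


totals = total_sales_by_month_and_type(PURCHASES)
```

The location dimension is simply ignored by this query, which corresponds to aggregating over all of its values.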
2011
This work used a matrix decomposition algorithm to obtain a new dataset of genetic epistasis as a surrogate for a multidimensional dataset, transforming a multidimensional database into a 2-dimensional database. It employed decomposition algorithms based on Boyce-Codd Normal Form to minimize anomalies. The decomposition and reversible algorithms were applied to the relationships among object attributes and were implemented. The implemented program ran on sample genetic epistasis datasets of up to 10 dimensions, and it was shown that multidimensional datasets can be reduced to two dimensions. It was established that the time taken to generate a sequence of tuples from a multidimensional database into a 2-dimensional dataset was directly proportional to the number of genes considered. The result showed that the reduced 2-dimensional database did not require any built-in functions, which take a long processing time to generate query results, in contrast to querying the multidimensional dataset...
2014
Decision-making processes need to reflect changes in the business world in a multidimensional way. This also includes a similar way of viewing the data when making key decisions that ensure the competitiveness of the business. In this paper we focus on the Business Intelligence system as the main toolset that helps in making complex decisions and that requires a multidimensional view of data for this purpose. We propose a novel experimental approach to the design of a multidimensional data model that uses principles of the anchor modeling technique. The proposed approach is expected to bring several benefits, such as better query execution performance and better support for temporal querying. In this paper we assess the approach mainly from the perspective of query execution performance. The emphasis is placed on the assessment of this technique as a potential innovative approach for the field of the data warehousing with some implicit principles that cou...
Data & Knowledge Engineering, 2004
The most effective technique to enhance performances of multidimensional databases consists in materializing redundant aggregates called views. In the classical approach to materialization, each view includes all and only the measures of the cube it aggregates. In this paper we investigate the benefits of materializing views in vertical fragments, aimed at minimizing the workload response time. We formalize the fragmentation problem as a 0-1 integer linear programming problem, which is then solved by means of a standard integer programming solver to determine the optimal fragmentation for a given workload. Finally, we demonstrate the usefulness of fragmentation by presenting a large set of experimental results based on the TPC-H benchmark.
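The fragmentation problem can be illustrated on a toy instance in Python. This sketch does not use an integer programming solver and is not the paper's formulation: it simply enumerates every partition of a view's measures and picks the cheapest one. The measure names, the workload, and the cost function (each fragment accessed costs a fixed key overhead plus one unit per measure it stores) are all assumptions chosen to keep the example small.

```python
# Hypothetical view with three measures.
MEASURES = ["qty", "revenue", "discount"]

# Hypothetical workload: (set of measures requested, query frequency).
WORKLOAD = [
    ({"qty"}, 5),
    ({"revenue", "discount"}, 3),
]

KEY_COST = 2  # assumed per-fragment overhead for repeating the key columns


def partitions(items):
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new group of its own.
        yield [[first]] + part


def workload_cost(partition, workload):
    """Sum, over all queries, the cost of every fragment the query touches."""
    total = 0
    for requested, frequency in workload:
        for fragment in partition:
            if requested & set(fragment):
                total += frequency * (KEY_COST + len(fragment))
    return total


all_partitions = list(partitions(MEASURES))
best = min(all_partitions, key=lambda p: workload_cost(p, WORKLOAD))
best_cost = workload_cost(best, WORKLOAD)
unfragmented_cost = workload_cost([MEASURES], WORKLOAD)
```

With these assumed numbers, storing `qty` separately from `revenue` and `discount` costs 27 units against 40 for the unfragmented view. Exhaustive enumeration grows very quickly with the number of measures, which is why an optimization formulation is needed for realistic views.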
Parallel Processing Letters, 1999
Proceedings of the Int. Conf. on Very Large Data …
International Journal of Data Warehousing and Mining, 2016
Information Systems, 2001
International Conference on Enterprise Information Systems, 2003
Handbook of Data Intensive Computing, 2011
Information Systems, 2004
Distributed and Parallel Databases, 1995