2003
The model we define organises data in a constellation of facts and dimensions with multiple hierarchies. In order to ensure data consistency and reliable data manipulation, we extend this constellation model with intra- and inter-dimension constraints. The intra-dimension constraints allow the definition of exclusions and inclusions between hierarchies of the same dimension. The inter-dimension constraints relate hierarchies of different dimensions. We also study the effects of these constraints on multidimensional operations and describe their integration within the GEDOOH prototype.
2001
A number of models for conceptual design of multidimensional databases have been proposed but only a few of them support the notion of integrity constraints.
Very Large Data Bases, 1997
We present a multi-dimensional database model, which we believe can serve as a conceptual model for On-Line Analytical Processing (OLAP)-based applications. Apart from providing the functionalities necessary for OLAP-based applications, the main feature of the model we propose is a clear separation between structural aspects and the contents. This separation of concerns allows us to
Lecture Notes in Computer Science, 2000
It is commonly agreed that multidimensional data cubes form the basic logical data model for OLAP applications. Still, there seems to be no agreement on a common model for cubes. In this paper we propose a logical model for cubes based on the key observation that a cube is not a self-existing entity, but rather a view over an underlying data set. We accompany our model with syntactic characterisations for the problem of cube usability. To this end, we have developed algorithms to check whether (a) the marginal conditions of two cubes are appropriate for a rewriting, in the presence of aggregation hierarchies and (b) an implication exists between two selection conditions that involve different levels of aggregation of the same dimension hierarchy. Finally, we present a rewriting algorithm for the cube usability problem.
The relational data model, which was introduced by Codd in 1970 and earned him the Turing Award a decade later, was the foundation of today’s multi-billion-dollar database industry. During the 1990s, a new type of data model, the multidimensional data model, has emerged, which has taken over from the relational model where the objective is to analyze data, rather than to perform on-line transactions. Multidimensional data models are designed expressly to support data analyses. A number of such models have been proposed by researchers from academia and industry. In academia, formal mathematical models have been proposed, while the industrial proposals have typically been specified more or less implicitly by the concrete software tools that implement them. Briefly, multidimensional models categorize data as being either facts with associated numerical measures, or as being dimensions that characterize the facts and are mostly textual. For example, in a retail business, products are sold to customers at certain times in certain amounts and at certain prices. A typical fact would be a purchase. Typical measures would be the amount and price of the purchase. Typical dimensions would be the location of the purchase, the type of product being purchased, and the time of the purchase. Queries then aggregate measure values over ranges of dimension values to produce results such as the total sales per month and product type........
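The retail example in the abstract above can be sketched in a few lines of Python. This is only an illustration, not code from the paper: the `purchases` rows, the field names, and the `total_sales` helper are hypothetical, and `price` is assumed to be the total price of a purchase.

```python
from collections import defaultdict

# Hypothetical fact rows: each purchase (fact) has dimension values
# (location, product_type, date) and measures (amount, price).
# Assumption: "price" is the total price paid for the purchase.
purchases = [
    {"location": "Aalborg", "product_type": "dairy", "date": "2001-03-14", "amount": 2, "price": 3.0},
    {"location": "Aalborg", "product_type": "dairy", "date": "2001-03-20", "amount": 3, "price": 4.5},
    {"location": "Odense", "product_type": "bakery", "date": "2001-03-02", "amount": 1, "price": 2.5},
    {"location": "Odense", "product_type": "dairy", "date": "2001-04-01", "amount": 1, "price": 1.5},
]


def total_sales(facts):
    """Aggregate the price measure per (month, product type)."""
    totals = defaultdict(float)
    for fact in facts:
        month = fact["date"][:7]  # roll the date up to the month level
        totals[(month, fact["product_type"])] += fact["price"]
    return dict(totals)


result = total_sales(purchases)
```

Here the time dimension is rolled up from day to month, while the location dimension is aggregated away entirely.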
2008
This paper introduces a Conceptual Data Model for Data Warehouses that includes multidimensional aggregation. It is based on the Entity-Relationship data model. The conceptual data model gracefully extends the standard Entity-Relationship data model with multidimensional aggregated entities. The model has a clear mathematical semantics grounded in standard ER semantics and the GMD logic-based multidimensional data model. The aim of this work is not to propose yet another conceptual data model, but to find the most general and precise formalism covering all the proposals for a conceptual data model in the data warehouse field, thereby making possible a formal comparison of the differences between the models in the literature and a study of the formal properties or extensions of such data models.
Advanced Topics in Database Research, Volume 2
Multidimensional (MD) modeling is the basis for Data warehouses (DW), multidimensional databases (MDB) and On-Line Analytical Processing (OLAP) applications. In this chapter, we present how the Unified Modeling Language (UML) can be successfully used to represent both structural and dynamic properties of these systems at the conceptual level. The structure of the system is specified by means of a UML class diagram that considers the main properties of MD modeling with minimal use of constraints and extensions of the UML. If the system to be modeled is too complex, thereby leading us to a considerable number of classes and relationships, we sketch out how to use the package grouping mechanism provided by the UML to simplify the final model. Furthermore, we provide a UML-compliant class notation (called cube class) to represent OLAP initial user requirements. We also describe how we can use the UML state and interaction diagrams to model the behavior of a data warehouse system. We believe that our innovative approach provides a theoretical foundation for simplifying the conceptual design of multidimensional systems and our examples illustrate the use of our approach.
Problems and Solutions, 2003
Journal of Computing Science and Engineering
Data Warehousing and OLAP (On-Line Analytical Processing) have turned into the key technology for comprehensive data analysis. Originally developed for the needs of decision support in business, data warehouses have proven to be an adequate solution for a variety of non-business applications and domains, such as government, research, and medicine. The analytical power of OLAP technology comes from its underlying multidimensional data model, which allows users to see data from different perspectives. However, this model displays a number of deficiencies when applied to non-conventional scenarios and analysis tasks. This paper presents an attempt to systematically summarize various extensions of the original multidimensional data model that have been proposed by researchers and practitioners in recent years. The presented concepts are arranged into a formal classification consisting of fact types, factual and fact-dimensional relationships, and dimension types, supplied with explanatory examples from real-world usage scenarios. Both the static elements of the model, such as types of fact and dimension hierarchy schemes, and dynamic features, such as support for advanced operators and derived elements, are covered. We also propose a semantically rich graphical notation called X-DFM that extends the popular Dimensional Fact Model by refining and modifying the set of constructs so as to make it coherent with the formal model. An evaluation of our framework against a set of common modeling requirements summarizes the contribution.
2014
All data warehouse and data mart designers eventually come across the dimension they wish they could represent as a recursive hierarchy. It may be that it has too many levels or uneven branches, or it may be dynamic with complex segmentation rules not known at design time. The dilemma the designer faces is to force-fit the dimension into some fixed depth hierarchy, or to live with the performance implications of recursive SQL, such as the recursive union of DB2/UDB or Oracle’s “CONNECT BY”. Fortunately, there is another alternative. It exploits the concept of topological ordering and shifts most of the costly work to be done once in a preparatory step very similar to indexing. It is based on node enumeration applied at the time of data maintenance that enables transitive closure to be evaluated in a single SQL query. This technique particularly favors dimensions that are hierarchical, relatively stable in time, and are periodically maintained.
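The abstract above does not spell out its enumeration scheme, so the following sketch substitutes a well-known relative, depth-first interval numbering, to show how numbering nodes at maintenance time lets a single non-recursive SQL query return a whole subtree. The `parent` edges, the `dim` table, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical uneven hierarchy, given as child -> parent edges.
parent = {"Europe": "World", "Asia": "World", "France": "Europe",
          "Paris": "France", "Japan": "Asia"}


def enumerate_nodes(parent):
    """Assign each node an interval (lo, hi) covering its whole subtree."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    roots = [node for node in children if node not in parent]
    numbering = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        lo = counter
        for child in sorted(children.get(node, [])):
            visit(child)
        numbering[node] = (lo, counter)  # hi = last number used in the subtree

    for root in sorted(roots):
        visit(root)
    return numbering


# Preparatory step, done once when the dimension is maintained.
numbering = enumerate_nodes(parent)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim (name TEXT PRIMARY KEY, lo INTEGER, hi INTEGER)")
conn.executemany("INSERT INTO dim VALUES (?, ?, ?)",
                 [(name, lo, hi) for name, (lo, hi) in numbering.items()])


def descendants(conn, name):
    """Return a node and all its descendants with one non-recursive query."""
    rows = conn.execute(
        "SELECT d.name FROM dim AS a JOIN dim AS d "
        "ON d.lo BETWEEN a.lo AND a.hi WHERE a.name = ? ORDER BY d.lo",
        (name,))
    return [row[0] for row in rows]
```

The trade-off matches the one the abstract names: queries become cheap range joins, but the numbering must be recomputed when the hierarchy changes, which favours dimensions that are relatively stable.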
In this paper, we propose a conceptual multidimensional data model that facilitates a precise and rigorous conceptualization for OLAP. First, our approach is grounded in mathematics through a newly defined concept, the H-set. This mathematical soundness provides a foundation for handling natural hierarchical relationships among data elements within dimensions whose structures have many levels of complexity. The multidimensional data model organizes data in the form of metacubes, a concept that generalizes other cube models. In addition, a metacube is associated with a set of groups, each of which contains a subset of the metacube domain, which is an H-set of data cells. Furthermore, metacube operators (e.g. jumping, rollingUp and drillingDown) are defined in a very elegant manner.
Hawaii International Conference on System Sciences, 2010
Multidimensional data are the foundation for OLAP applications. They can be provided in several ways: relational OLAP, multidimensional OLAP, or hybrid OLAP. The usage of the underlying technology, which is well understood and in most cases formally defined, does not resolve the issue of a missing vocabulary for multidimensional data on a conceptual level. Some basic definitions are broadly used;
Sigmod Record, 2000
In the last few years, numerous proposals for modelling and querying Multidimensional Databases (MDDB) have been made. A rigorous classification of the different types of hierarchies is still an open problem. In this paper we propose and discuss several types of hierarchies within a single dimension of a cube. These hierarchies divide a single dimension into different levels of aggregation. Depending on them, we discuss the characterization of some OLAP operators that refer to hierarchies in order to maintain data cube consistency. Moreover, we propose a set of operators for changing the hierarchy structure. The issues discussed provide modelling flexibility during the schema design phase and support correct data analysis.
Proceedings of the 4th ACM international workshop on Data warehousing and OLAP - DOLAP '01, 2001
Enhancing multidimensional database models with aggregation hierarchies allows viewing data at different levels of aggregation. Usually, hierarchy instances are represented by means of so-called rollup functions. Rollups between adjacent levels in the hierarchy are given extensionally, while rollups between connected nonadjacent levels are obtained by means of function composition. In many real-life cases, this model cannot accurately capture the meaning of common situations, particularly when exceptions arise. Exceptions may appear due to corporate policies, unreliable data or uncertainty, and their presence may make the notion of rollup composition unsuitable for representing real relationships in the aggregation hierarchies. In this paper we present a language that augments traditional extensional rollup functions with intensional knowledge. We call this language IRAH (Intensional Redefinition for Aggregation Hierarchies). Programs in IRAH consist of intensional rules, which can be regarded as patterns for: (a) overriding natural composition between rollup functions on adjacent levels in the concept hierarchy, (b) canceling the effect of rollup functions for specific values. Our proposal is presented as a stratified default theory. We show that a unique model for the underlying theory always exists, and can be computed in a bottom-up fashion. Finally, we present an algorithm that computes the revised dimension in polynomial time, although under more realistic assumptions, complexity becomes linear in the number of paths in the hierarchy of the dimension instance.
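The idea of composing extensional rollups and then revising the result for exceptions can be sketched as follows. This is plain Python, not IRAH syntax or the paper's algorithm; the level names, the data, and the `overrides` convention (a value of `None` cancels a rollup) are assumptions made for illustration.

```python
# Hypothetical extensional rollups between adjacent levels.
store_to_city = {"s1": "Lyon", "s2": "Lyon", "s3": "Nice"}
city_to_region = {"Lyon": "East", "Nice": "South"}


def compose(lower, upper):
    """Rollup between non-adjacent levels, obtained by function composition."""
    return {x: upper[y] for x, y in lower.items() if y in upper}


# Hypothetical exceptions: s2 is assigned to South by policy (override),
# and the rollup of s3 is cancelled (None).
overrides = {"s2": "South", "s3": None}


def revised_rollup(lower, upper, overrides):
    """Apply overrides and cancellations on top of the natural composition."""
    result = compose(lower, upper)
    for member, target in overrides.items():
        if target is None:
            result.pop(member, None)  # cancel the rollup for this value
        else:
            result[member] = target   # override the composed value
    return result


natural = compose(store_to_city, city_to_region)
revised = revised_rollup(store_to_city, city_to_region, overrides)
```

With these inputs, `natural` sends `s2` to `East` through its city, whereas `revised` sends it to `South` and drops `s3` from the store-to-region rollup.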
2003
In this paper we introduce a novel data model for multidimensional information, GMD, generalising the MD data model first proposed in Cabibbo et al (EDBT-98). The aim of this work is not to propose yet another multidimensional data model, but to find the general precise formalism encompassing all the proposals for a logical data model in the data warehouse field. Our proposal is compatible with all these proposals, making therefore possible a formal comparison of the differences of the models in the literature, and to study formal properties or extensions of such data models. Starting with a logic-based definition of the semantics of the GMD data model and of the basic algebraic operations over it, we show how the most important approaches in DW modelling can be captured by it. The star and the snowflake schemas, Gray’s cube, Agrawal’s and Vassiliadis’ models, MD and other multidimensional conceptual data can be captured uniformly by GMD. In this way it is possible to formally understand the real differences in expressivity of the various models, their limits, and their potentials.
1999
Recently, numerous proposals for modelling and querying Multidimensional Databases (MDDB) have been made. A rigorous classification of the different types of hierarchies is among the problems that remain open. In this paper we propose and discuss several types of hierarchies within a single dimension of a cube. These hierarchies divide a single dimension into different levels of aggregation. Depending on them, we discuss the characterization of some OLAP operators which refer to hierarchies in order to maintain data cube consistency.
Journal of Decision Systems, 2014
Many solutions have been defined for multidimensional database modelling. These proposals use the same aggregation function to determine the values of an indicator at different levels of granularity in the multidimensional space. We provide a more flexible conceptual model that supports multiple differentiated aggregations. Multiple aggregations allow associating different aggregation functions with the same measure for each dimension and for each hierarchy. Differentiated aggregation allows specific aggregations at each level (parameter). Our model is based on a double graphical formalism, expressive enough to control the validity of aggregation functions. We also study the consequences of this conceptual modelling for building lattices of pre-computed aggregates in a relational online analytical processing (R-OLAP) environment.
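A minimal sketch of the "multiple aggregations" idea, assuming a hypothetical stock-level measure that is summed across stores but averaged across months. The `stock` cells, the `aggregators` mapping, and `roll_up` are illustrative names, not part of the paper's model.

```python
from statistics import mean

# Hypothetical measure: stock level per (store, month) cell.
stock = {
    ("s1", "2014-01"): 10, ("s1", "2014-02"): 20,
    ("s2", "2014-01"): 30, ("s2", "2014-02"): 50,
}

# Assumed choice: a different aggregation function per dimension
# for the same measure (sum over stores, average over months).
aggregators = {"store": sum, "month": mean}
axis_of = {"store": 0, "month": 1}


def roll_up(cells, dimension):
    """Aggregate away one dimension using the function tied to it."""
    axis = axis_of[dimension]
    groups = {}
    for key, value in cells.items():
        rest = key[:axis] + key[axis + 1:]  # coordinates that remain
        groups.setdefault(rest, []).append(value)
    return {rest: aggregators[dimension](values)
            for rest, values in groups.items()}


per_month = roll_up(stock, "store")  # summed across stores
per_store = roll_up(stock, "month")  # averaged across months
```

Differentiated aggregation per hierarchy level would extend the `aggregators` mapping with a level key; that extension is left out here.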
Lecture Notes in Computer Science, 2000
In the context of multidimensional databases implemented on relational DBMSs through star schemes, the most effective technique to enhance performance consists of materializing redundant aggregates called views. In this paper we investigate the problem of vertical fragmentation of views aimed at minimizing the workload response time. Each view includes several measures which are not necessarily always requested together; thus, the system performance may be increased by partitioning the views into smaller tables. On the other hand, drill-across queries involve measures taken from two or more views; in this case the access costs may be decreased by unifying these views into larger tables. After formalizing the fragmentation problem as a 0-1 integer linear programming problem, we define a cost function and outline a branch-and-bound algorithm to minimize it. Finally, we demonstrate the usefulness of our approach by presenting a set of experimental results based on the TPC-D benchmark.
1999
Linear constraint databases are a powerful framework to model spatial and temporal data. The use of constraint databases should be supported by access data structures that make effective use of secondary storage and reduce query processing time. Such structures should be able to store both finite and infinite objects and perform both containment (ALL) and intersection (EXIST) queries.
Modeling of semantics is one of the most difficult tasks in database design. Constraints are used to express database semantics. They are used differently in database models. They express domain restrictions, specify relationships between components and state database behavior. The utilization depends on the richness of the type system used in the model. The relational model is using a simple type system and has a very large set of integrity constraints. Semantical models are using richer type systems which express also different types of integrity constraints. At the same time, the theory of integrity constraints is more complex. Object-oriented models use either a simple type system or type systems like the semantical models. The theory of integrity constraints is still under development. This overview tries to give a unifying framework on integrity constraints.