Unit 4 :OLAP
Online Analytical Processing
NOTE : THIS PRESENTATION SHOULD BE CONSIDERED AS
SUPPORTING MATERIAL ONLY. FOR DETAILED STUDY STUDENTS
MUST REFER THE TEXT BOOKS AND REFRENCE BOOKS MENTIONED
IN SYLLABUS.
1 9/16/2020
What and Why OLAP?
OLAP is the dynamic analysis and consolidation of large
volumes of multi-dimensional data.
OLAP is the term that describes a technology that uses
multi-dimensional view of aggregate data to provide quick
access to strategic information for the purposes of advanced
analysis.
2 9/16/2020
What and Why OLAP?
OLAP enables users to gain a deeper understanding and
knowledge about various aspects of their corporate data
through fast, consistent, interactive access to a variety of
possible views of data.
The types of analysis available from OLAP range from basic
navigation and browsing (referred to as ‘slicing’ and dicing’) , to
calculations, to more complex analysis such as time series and
complex modeling.
3 9/16/2020
Need for Online Analytical Processing
The warehouse must have the capability to perform complex
analysis in less time.
Need for Multidimensional Analysis
Fast Access and Powerful Calculations
Limitations of other Analysis Methods
4 9/16/2020
Need for Multidimensional Analysis
For planning and making strategic decisions, managers and executives probe
into business data through scenarios.
Sample query:How much revenue did the new Product X generate during the
last three months, broken down by individual months, in the South Central
territory, by individual stores, broken down by promotions, compared to
estimates, and compared to the previous version of the product?
For effective analysis, users must have easy methods of performing complex
analysis along several business dimensions.
They need an environment that presents a multidimensional view of data,
providing the foundation for analytical processing through easy and flexible
access to information.
Decision makers must be able to analyze data along any number of dimensions,
at any level of aggregation, with the capability of viewing results in a variety of
ways.
They must have the ability to drill down and roll up along the hierarchies of
every dimension.
5 9/16/2020
Fast Access and Powerful Calculations
6 9/16/2020
List of typical calculations that get included in the
query requests:
Roll-ups to provide summaries and aggregations along the
hierarchies of the dimensions.
Drill-downs from the top level to the lowest along the
hierarchies of the dimensions, in combinations among the
dimensions.
Simple calculations, such as computation of margins (sales
minus costs).
Share calculations to compute the percentage of parts to the
whole.
Algebraic equations involving key performance indicators.
Moving averages and growth percentages.
Trend analysis using statistical methods.
7 9/16/2020
Limitations of Other Analysis Methods
In a real-world analysis session, many queries follow one after the
other.
Each query may translate into a number of intricate SQL
statements, with each of the statements likely to invoke full table
scans, multiple joins, aggregations, groupings, and sorting.
Analysis of the type we are discussing requires complex
calculations and handling time series data.
SQL is notably weak in these areas.
Even if you can imagine an analyst accurately formulating such
complex SQL statements, the overhead on the systems would
seriously impact the response times.
8 9/16/2020
OLAP Applications
Finance: Budgeting, activity-based costing, financial performance
analysis, and financial modeling.
Sales: Sales analysis and sales forecasting.
Marketing: Market research analysis, sales forecasting, promotions
analysis, customer analysis, and market/customer segmentation.
Manufacturing: Production planning and defect analysis.
9 9/16/2020
OLAP Key Features
Multi-dimensional views of data.
Support for complex calculations.
Time Intelligence.
10 9/16/2020
OLAP Benefits
Increased productivity of business end-users, IT developers,
and consequently the entire organization.
Reduced backlog of applications development for IT staff by
making end-users self-sufficient enough to make their own
schema changes and build their own models.
11 9/16/2020
OLAP Benefits
Retention of organizational control over the integrity of
corporate data as OLAP applications are dependent on data
warehouses and OLTP systems to refresh their source data
level.
Improved potential revenue and profitability by enabling the
organization to respond more quickly to market demands.
12 9/16/2020
OLTP vs OLAP
Already discussed in unit 1.
13 9/16/2020
OLAP and Multidimensional Analysis
OLAP database servers use multi-dimensional structures
to store data and relationships between data.
Multi-dimensional structures are best-visualized as
cubes of data, and cubes within cubes of data. Each side
of a cube is a dimension.
14 9/16/2020
OLAP and Multidimensional Analysis
OLAP is an approach to answering multi-dimensional
analytical (MDA) queries swiftly.
OLAP is part of the broader category of business
intelligence, which also encompasses relational database,
report writing and data mining.
15 9/16/2020
Representation of Multi-Dimensional Data
16 9/16/2020
Representation of Multi-Dimensional Data
Multi-dimensional databases are a compact and easy-to-understand way of visualizing and
manipulating data elements that have many inter-relationships.
The cube can be expanded to include another dimension, for example, the number of
sales staff in each city.
The response time of a multi-dimensional query depends on how many cells have to be
added on-the-fly.
As the number of dimensions increases, the number of cube’s cells increases
exponentially.
17 9/16/2020
Representation of Multi-Dimensional Data
18 9/16/2020
OLAP operations in Multi-dimensional
data Model
Multi-dimensional OLAP supports common analytical
operations, such as:
Consolidation: involves the aggregation of data such as
‘roll-ups’ or complex expressions involving interrelated
data. For example, branch offices can be rolled up to
cities and finally rolled up to countries.
19 9/16/2020
OLAP Operations
Drill-Down: is the reverse of consolidation and involves
displaying the detailed data that comprises the consolidated
data.
Slicing and dicing: refers to the ability to look at the data
from different viewpoints. Slicing and dicing is often
performed along a time axis in order to analyze trends and
find patterns.
20 9/16/2020
Other OLAP Operations
Pivot (rotate):
reorient the cube, visualization, 3D to series of 2D planes.
Drill across: involving (across) more than one fact table
Drill through: through the bottom level of the cube to its back-
end relational tables (using SQL)
21 9/16/2020
Roll up and drill down
22 9/16/2020
Slice and dice
23 9/16/2020
OLAP Tools - Features
In 1993, E.F. Codd formulated twelve rules as the basis for selecting
OLAP tools:
Multi-dimensional conceptual view
Transparency
Accessibility
Consistent reporting performance
Client-server architecture
Generic dimensionality
24 9/16/2020
OLAP Tools - Features
Multi-user support
cross-dimensional operations
Intuitive data manipulation
Flexible reporting
Unlimited dimensions and aggregation levels
25 9/16/2020
OLAP Models - Categories
OLAP models are categorized according to the architecture
used to store and process multi-dimensional data.
There are four main categories of OLAP models as defined by
Berson and Smith (1997) and Pends and Greeth (2001)
including:
Multi-dimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)
Desktop OLAP (DOLAP)
26 9/16/2020
Multi-dimensional OLAP (MOLAP)
MOLAP tools use specialized data structures and multi-dimensional
database management systems (MDDBMS) to organize, navigate,
and analyze data.
To enhance query performance the data is typically aggregated and
stored according to predicted usage.
MOLAP data structures use array technology and efficient storage
techniques that minimize the disk space requirements through sparse
data management.
27 9/16/2020
Multi-dimensional OLAP (MOLAP)
The development issues associated with MOLAP:
Only a limited amount of data can be efficiently stored and analyzed.
Navigation and analysis of data are limited because the data is designed according to
previously determined requirements.
MOLAP products require a different set of skills and tools to build and maintain the
database.
28 9/16/2020
Relational OLAP (ROLAP)
ROLAP is the fastest-growing type of OLAP tools.
ROLAP supports RDBMS products through the use of a metadata layer, thus
avoiding the requirement to create a static multi-dimensional data structure.
This facilitates the creation of multiple multi-dimensional views of the two-
dimensional relation.
To improve performance, some ROLAP products have enhanced SQL engines to
support the complexity of multi-dimensional analysis, while others recommend, or
require, the use of highly denormalized database designs such as the star schema.
29 9/16/2020
Relational OLAP (ROLAP)
The development issues associated with ROLAP technology:
Performance problems associated with the processing of complex queries that require multiple
passes through the relational data.
Development of middleware to facilitate the development of multi-dimensional applications.
Development of an option to create persistent multi-dimensional structures, together with facilities
to assist in the administration of these structures.
30 9/16/2020
Hybrid OLAP (HOLAP)
HOLAP tools provide limited analysis capability, either directly against RDBMS
products, or by using an intermediate MOLAP server.
HOLAP tools deliver selected data directly from DBMS or via MOLAP server to the
desktop (or local server) in the form of data cube, where it is stored, analyzed, and
maintained locally is the fastest-growing type of OLAP tools.
31 9/16/2020
Hybrid OLAP (HOLAP)
The issues associated with HOLAP tools:
The architecture results in significant data redundancy and may cause problems for
networks that support many users.
Ability of each user to build a custom data cube may cause a lack of data consistency
among users.
Only a limited amount of data can be efficiently maintained.
32 9/16/2020
Desktop OLAP (DOLAP)
DOLAP tools store the OLAP data in client-based files and support multi-dimensional
processing using a client multi-dimensional engine. DOLAP requires that relatively
small extracts of data are held on client machines. This data may be distributed in
advance or on demand (possibly through the Web).
The administration of a DOLAP database is typically performed by a central server or
processing routine that prepares data cubes or sets of data for each user.
The development issues associated with DOLAP are as follows:
Provision of appropriate security controls to support all parts of the DOLAP
environment.
Reduction in the effort involved in deploying and maintaining the DOLAP tools.
33 9/16/2020
Desktop OLAP (DOLAP)
34 9/16/2020