0% found this document useful (0 votes)
6 views24 pages

Data Warehouse

The document discusses the differences between databases and data warehouses, highlighting that data warehouses support business intelligence while databases handle real-time information. It outlines the characteristics of structured, semi-structured, and unstructured data, as well as the ETL process for data warehousing. Additionally, it explains OLAP operations such as drill down, roll up, slice, and dice for querying multidimensional data.

Uploaded by

Hitanshu Khanna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views24 pages

Data Warehouse

The document discusses the differences between databases and data warehouses, highlighting that data warehouses support business intelligence while databases handle real-time information. It outlines the characteristics of structured, semi-structured, and unstructured data, as well as the ETL process for data warehousing. Additionally, it explains OLAP operations such as drill down, roll up, slice, and dice for querying multidimensional data.

Uploaded by

Hitanshu Khanna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 24

Data Warehouse &

Data Mining
Database and Datawarehouse
• DWH constitute entire information base for all time
• DB constitute real time information
• DWH supports DB and Business Intelligence
• DB is used to run the business
• DWH is how to run the business
Database and Big data
Database Big Data

Volume GB TB or PB

Generate rate Per day, per hour Per second and per minisecond

Structure Structured Unstructured, semistructured

Data Source Centralized Distributed

Data integration easy Difficult

Data Store RDBMS HDFS, noSQL

Data Structure Static Schema Dynamic Schema


A producer wants to Know
Data, Data everywhere yet…
What is a Datawarehouse?
Data Warehousing- A process
What is a data warehousing
Structured Data
• Fixed Format
• Can be accessed from database by simple search engine algorithm.
• Ex: Employee Table: Emp details, salary, job position etc can be found.
Semistructured
• Personal Data in XML file
Unstructured Data
• Image
• Video
• Audio
• CCTV Survillience
Characteristics of Datawarehouse
Data Marts: Structured and
Unstructured Data
Data Query
Making A Data warehouse:
Extraction
ETL: Transformation Process
ETL Process: Loading
Data Warehouse Architecture
OLAP Vs OLTP
OLAP Operations
• OLAP databases are
divided into one or more
cubes and these cubes
are known as Hyper-
cubes.
• It is based on
multidimensional data
model and allows the user
to query on multi-
dimensional data (eg.
Delhi -> 2018 -> Sales
data).
OLAP Operations
• Drill Down
• In drill-down operation, the less
detailed data is converted into
highly detailed data. It can be
done by:Moving down in the
concept hierarchy
• Adding a new dimension
• In the cube given in overview
section, the drill down operation
is performed by moving down in
the concept hierarchy
of Time dimension (Quarter ->
Month).
OLAP Operations
• Roll Up
• It is just opposite of the drill-
down operation. It performs
aggregation on the OLAP cube.
It can be done by:Climbing up in
the concept hierarchy
• Reducing the dimensions
• In the cube given in the
overview section, the roll-up
operation is performed by
climbing up in the concept
hierarchy of Location dimension
(City -> Country).
OLAP Operations
• Slice
• It selects a single
dimension from the OLAP
cube which results in a
new sub-cube creation. In
the cube given in the
overview section, Slice is
performed on the
dimension Time = “Q1”.
OLAP Operations
• Dice
• It selects a sub-cube from the
OLAP cube by selecting two or
more dimensions. In the cube
given in the overview section,
a sub-cube is selected by
selecting following
dimensions with
criteria:Location = “Delhi” or
“Kolkata”
• Time = “Q1” or “Q2”
• Item = “Car” or “Bus”

You might also like