Chapter 03 DSS 2025 Part1

Chapter 3 of 'Business Intelligence and Analytics: Systems for Decision Support' discusses data warehousing, defining it as a repository for integrated, subject-oriented databases that support decision-making. Key topics include the characteristics of data warehouses, the ETL process, and various architectural frameworks for data warehousing. The chapter also covers the development approaches, including the Inmon and Kimball models, and considerations for hosted data warehouses.

Uploaded by

Maram Abdelgayed

Business Intelligence and Analytics: Systems for Decision Support
(10th Edition)

Chapter 3:
Data Warehousing
Main Data Warehousing Topics
◼ DW definition
◼ Characteristics of DW
◼ Data Marts
◼ DW Framework
◼ DW Architecture & ETL Process
◼ DW Development
◼ DW Issues

3-2 Copyright © 2014 Pearson Education, Inc.


What is a Data Warehouse?
◼ A physical repository where relational data
are specially organized to provide
enterprise-wide, cleansed data in a
standardized format
◼ “The data warehouse is a collection of
integrated, subject-oriented databases
designed to support DSS functions, where
each unit of data is non-volatile and
relevant to some moment in time”
A Historical Perspective to
Data Warehousing
1970s
◼ Mainframe computers
◼ Simple data entry
◼ Routine reporting
◼ Primitive database structures
◼ Teradata incorporated

1980s
◼ Mini/personal computers (PCs)
◼ Business applications for PCs
◼ Distributed DBMS
◼ Relational DBMS
◼ Teradata ships commercial DBs
◼ Business Data Warehouse coined

1990s
◼ Centralized data storage
◼ Data warehousing was born
◼ Inmon, Building the Data Warehouse
◼ Kimball, The Data Warehouse Toolkit
◼ EDW architecture design

2000s
◼ Exponentially growing Web data
◼ Consolidation of DW/BI industry
◼ Data warehouse appliances emerged
◼ Business intelligence popularized
◼ Data mining and predictive modeling
◼ Open source software
◼ SaaS, PaaS, Cloud Computing

2010s
◼ Big Data analytics
◼ Social media analytics
◼ Text and Web Analytics
◼ Hadoop, MapReduce, NoSQL
◼ In-memory, in-database



Characteristics of DWs
◼ Not normalized
◼ Subject oriented
◼ Integrated
◼ Time-variant (time series)
◼ Nonvolatile
◼ Metadata
◼ Web based, relational/multi-dimensional
◼ Client/server, real-time/right-time/active...



Subject Oriented
◼ Data are organized by detailed subject, such as sales,
products, or customers, containing only information
relevant for decision support.
◼ Subject orientation enables users to determine not only
how their business is performing, but why.
◼ A data warehouse differs from an operational database
in that most operational databases have a product
orientation and are tuned to handle transactions that
update the database.
◼ Subject orientation provides a more comprehensive view
of the organization.



Integrated
◼ Integration is closely related to subject orientation.
◼ Data warehouses must place data from different sources
into a consistent format. To do so, they must deal with
naming conflicts and discrepancies among units of
measure.
◼ A data warehouse is presumed to be totally integrated.
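
A minimal sketch of what integration involves, assuming two hypothetical source systems that name the same fields differently and record weight in different units. The source names (`erp`, `pos`), the field names, and the mapping table are illustrative assumptions, not part of the chapter.

```python
# Hypothetical example: harmonizing field names and units from two sources.
LB_TO_KG = 0.45359237  # definition of the international pound in kilograms

# Assumed per-source mapping from source field name to warehouse field name.
FIELD_MAP = {
    "erp": {"cust_id": "customer_id", "weight_kg": "weight_kg"},
    "pos": {"CustomerNo": "customer_id", "weight_lb": "weight_lb"},
}


def integrate(record: dict, source: str) -> dict:
    """Rename fields to the warehouse names and convert weight to kilograms."""
    renamed = {FIELD_MAP[source][name]: value for name, value in record.items()}
    if "weight_lb" in renamed:
        # Resolve the unit discrepancy: pounds become kilograms.
        renamed["weight_kg"] = round(renamed.pop("weight_lb") * LB_TO_KG, 3)
    return renamed


erp_row = integrate({"cust_id": 17, "weight_kg": 2.5}, "erp")
pos_row = integrate({"CustomerNo": 17, "weight_lb": 10.0}, "pos")
```

After the call, both rows use the same `customer_id` and `weight_kg` fields, so they can be stored and compared in one table.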



Time Variant (Time Series)
◼ A warehouse maintains historical data.
◼ The data do not necessarily provide current status
(except in real-time systems).
◼ They detect trends, deviations, and long-term
relationships for forecasting and comparisons, leading to
decision making.
◼ Every data warehouse has a temporal quality.
◼ Time is the one important dimension that all data
warehouses must support.
◼ Data for analysis from multiple sources contains multiple
time points (e.g., daily, weekly, monthly views).
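
The daily, weekly, and monthly views mentioned above can be illustrated with a small roll-up. This is a sketch under assumptions: the `daily_sales` figures and the function names are made up for illustration, and ISO week numbering is one possible choice of weekly calendar.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales facts: (sale date, amount).
daily_sales = [
    (date(2014, 1, 30), 100.0),
    (date(2014, 1, 31), 50.0),
    (date(2014, 2, 1), 70.0),
    (date(2014, 2, 3), 30.0),
]


def monthly_view(rows):
    """Roll daily amounts up to (year, month) totals."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


def weekly_view(rows):
    """Roll daily amounts up to ISO (year, week) totals."""
    totals = defaultdict(float)
    for day, amount in rows:
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += amount
    return dict(totals)
```

The same historical facts yield different time series depending on the grain chosen, which is why time has to be stored with every fact.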



Nonvolatile
◼ After data are entered into a data warehouse, users
cannot change or update the data.
◼ Obsolete data are discarded, and changes are recorded
as new data.
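
One way to picture "changes are recorded as new data" is an append-only store in which a price change adds a row instead of overwriting one. The class and field names below are hypothetical, and the sketch does not model the discarding of obsolete data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stored row cannot be modified in place
class PriceRecord:
    product_id: int
    price: float
    valid_from: date


class AppendOnlyStore:
    """Hypothetical store: rows are only ever added, never updated."""

    def __init__(self):
        self._rows = []

    def record(self, row: PriceRecord) -> None:
        self._rows.append(row)

    def as_of(self, product_id: int, day: date):
        """Return the latest row recorded on or before `day`, or None."""
        candidates = [
            r for r in self._rows
            if r.product_id == product_id and r.valid_from <= day
        ]
        return max(candidates, key=lambda r: r.valid_from, default=None)


store = AppendOnlyStore()
store.record(PriceRecord(1, 9.99, date(2014, 1, 1)))
store.record(PriceRecord(1, 12.49, date(2014, 6, 1)))  # a change is a new row
```

Because the old row is kept, a query such as `store.as_of(1, date(2014, 3, 1))` can still answer what the price was before the change.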



Additional Characteristics
◼ Web based: Data warehouses are typically designed to
provide an efficient computing environment for Web-
based applications.
◼ Relational/multidimensional: A data warehouse uses
either a relational structure or a multidimensional
structure.
◼ Client/server: A data warehouse uses the client/server
architecture to provide easy access for end users.
◼ Real time: Newer data warehouses provide real-time, or
active, data-access and analysis capabilities.
◼ Include metadata: A data warehouse contains metadata
(data about data) about how the data are organized and
how to effectively use them.
Data Mart
A departmental small-scale “DW” that
stores only limited/relevant data

◼ Dependent data mart


A subset that is created directly from a data
warehouse

◼ Independent data mart


A small data warehouse designed for a
strategic business unit or a department
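
A dependent data mart can be sketched as a table derived directly from a warehouse table. The example below uses Python's built-in `sqlite3` module with an in-memory database; the table names (`warehouse_sales`, `marketing_mart`), columns, and rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical warehouse table covering every department.
    CREATE TABLE warehouse_sales (
        sale_id INTEGER PRIMARY KEY,
        department TEXT NOT NULL,
        region TEXT NOT NULL,
        amount REAL NOT NULL
    );
    INSERT INTO warehouse_sales (department, region, amount) VALUES
        ('marketing', 'north', 120.0),
        ('marketing', 'south', 80.0),
        ('finance', 'north', 300.0);

    -- Dependent data mart: a subset derived directly from the warehouse.
    CREATE TABLE marketing_mart AS
        SELECT region, SUM(amount) AS total_amount
        FROM warehouse_sales
        WHERE department = 'marketing'
        GROUP BY region;
    """
)

mart_rows = conn.execute(
    "SELECT region, total_amount FROM marketing_mart ORDER BY region"
).fetchall()
```

An independent data mart, by contrast, would be loaded from source systems without passing through `warehouse_sales` at all.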



Other DW Components
◼ Operational data stores (ODS)
A type of database often used as an interim
area for a data warehouse
◼ Oper marts: Operational data marts.
◼ Enterprise data warehouse (EDW)
A data warehouse for the enterprise.
◼ Metadata: Data about data.
In a data warehouse, metadata describe the
contents of a data warehouse and the manner
of its acquisition and use
A Generic DW Framework
[Figure: generic DW framework, read left to right]
◼ Data sources: ERP, legacy, POS, other OLTP/Web, external data
◼ ETL process: select, extract, transform, integrate, load
◼ Enterprise data warehouse, with metadata and replication
◼ Data marts (marketing, engineering, finance, ...), or the
no data marts option
◼ API/middleware
◼ Access and applications (visualization): routine business
reporting, data/text mining, OLAP, dashboard, Web, custom-built
applications


DW Architecture
◼ Three-tier architecture
1. Data acquisition software (back-end)
2. The data warehouse that contains the data &
software
3. Client (front-end) software that allows users to
access and analyze data from the warehouse
◼ Two-tier architecture
The first two tiers of the three-tier architecture are
combined into one
… sometimes there is only one tier?



DW Architectures

◼ Three-tier: Tier 1: Client workstation; Tier 2: Application
server; Tier 3: Database server
◼ Two-tier: Tier 1: Client workstation; Tier 2: Application &
database server



Data Warehousing Architectures
◼ Issues to consider when deciding which
architecture to use:
◼ Which database management system (DBMS)
should be used?
◼ Will parallel processing and/or partitioning be
used?
◼ Will data migration tools be used to load the data
warehouse?
◼ What tools will be used to support data retrieval
and analysis?



A Web-Based DW Architecture

[Figure: Client (Web browser) <-> Internet/Intranet/Extranet <->
Web server (Web pages) <-> Application server <-> Data warehouse]



Alternative DW Architectures
(a) Independent Data Marts Architecture
Source systems -> Staging area (ETL) -> Independent data marts
(atomic/summarized data) -> End user access and applications

(b) Data Mart Bus Architecture with Linked Dimensional Data Marts
Source systems -> Staging area (ETL) -> Dimensionalized data marts
linked by conformed dimensions (atomic/summarized data) -> End user
access and applications

(c) Hub and Spoke Architecture (Corporate Information Factory)
Source systems -> Staging area (ETL) -> Normalized relational
warehouse (atomic data) -> Dependent data marts (summarized/some
atomic data) -> End user access and applications

Alternative DW Architectures
(d) Centralized Data Warehouse Architecture
Source systems -> Staging area (ETL) -> Normalized relational
warehouse (atomic/some summarized data) -> End user access and
applications

(e) Federated Architecture
Existing data warehouses, data marts, and legacy systems -> Data
mapping/metadata (logical/physical integration of common data
elements) -> End user access and applications

◼ Each architecture has advantages and disadvantages!
◼ Which architecture is the best?
Ten factors that potentially affect the
architecture selection decision

1. Information interdependence between organizational units
2. Upper management’s information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors



Alternative DW Architectures
Data Integration and the Extraction,
Transformation, and Load Process
◼ ETL = Extract Transform Load
◼ Data integration
Integration that comprises three major processes: data
access, data federation, and change capture.
◼ Enterprise application integration (EAI)
A technology that provides a vehicle for pushing data
from source systems into a data warehouse
◼ Enterprise information integration (EII)
An evolving tool space that promises real-time data
integration from a variety of sources, such as relational
or multidimensional databases, Web services, etc.
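
The slide names change capture as one of the three data integration processes but does not say how it is done. One simple approach, assumed here purely for illustration, is to compare two snapshots of a source table keyed by primary key. The function name and the sample rows are hypothetical.

```python
def capture_changes(previous: dict, current: dict) -> dict:
    """Compare two snapshots keyed by primary key and classify the differences."""
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    updated = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical customer snapshots: primary key -> row.
yesterday = {1: {"city": "Cairo"}, 2: {"city": "Giza"}, 3: {"city": "Luxor"}}
today = {1: {"city": "Cairo"}, 2: {"city": "Alexandria"}, 4: {"city": "Aswan"}}

changes = capture_changes(yesterday, today)
```

Only the rows in `changes` need to be moved to the warehouse, which is usually far less work than reloading the whole source table. Other approaches, such as reading database logs, are outside this sketch.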



Data Integration and the Extraction,
Transformation, and Load Process

[Figure: Sources (packaged application, legacy system, other
internal applications) -> Extract -> Transform -> Cleanse -> Load
-> Data warehouse and data mart; a transient data source is also
shown]
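
The Extract, Transform, Cleanse, Load sequence can be sketched with the Python standard library. This is a minimal illustration, not a production ETL tool: the CSV content, the `orders` table, and the cleansing rule (drop rows with no amount) are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical extract from a legacy system, delivered as CSV text.
LEGACY_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert types and standardize the customer name."""
    return [
        {
            "order_id": int(r["order_id"]),
            "customer": r["customer"].strip().title(),
            "amount": float(r["amount"]) if r["amount"].strip() else None,
        }
        for r in rows
    ]


def cleanse(rows):
    """Cleanse: drop rows that lack a usable amount."""
    return [r for r in rows if r["amount"] is not None]


def load(rows, conn):
    """Load: write the cleansed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :customer, :amount)", rows
    )
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(cleanse(transform(extract(LEGACY_CSV))), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

Keeping each stage as its own function mirrors the boxes in the figure and makes it easy to test or replace one stage without touching the others.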



ETL (Extract, Transform, Load)
◼ Issues affecting the purchase of an ETL tool
◼ Data transformation tools are expensive
◼ Data transformation tools may have a long learning
curve
◼ Important criteria in selecting an ETL tool
◼ Ability to read from and write to an unlimited number
of data sources/architectures
◼ Automatic capturing and delivery of metadata
◼ A history of conforming to open standards
◼ An easy-to-use interface for the developer and the
functional user
Data Warehouse Development
Data warehouse development approaches
◼ Inmon Model: EDW approach (top-down)
◼ Kimball Model: Data mart approach
(bottom-up)
◼ Which model is best?
◼ Table 3.3 provides a comparative analysis
of the EDW and data mart approaches
◼ One alternative is the hosted warehouse



Additional DW Considerations
Hosted Data Warehouses
◼ Benefits:
◼ Requires minimal investment in infrastructure
◼ Frees up capacity on in-house systems
◼ Frees up cash flow
◼ Makes powerful solutions affordable
◼ Enables solutions that provide for growth
◼ Offers better quality equipment and software
◼ Provides faster connections
◼ … more in the book
End of the Chapter

◼ Questions, comments



All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior written permission of the publisher. Printed in the
United States of America.
