Data Warehouse Introduction

What is a Data Warehouse?

A Data Warehouse (DWH) is an essential component of a business intelligence system that enables
users to analyze large volumes of data. It typically consolidates data from multiple sources, such
as transaction applications and log files. A DWH is generally considered an organization’s single
source of truth.

It can be described as a dedicated data store that supports management’s decision-making process
by holding large volumes of data in one place, so that the organization can make informed decisions
by analyzing it.

Characteristics of a Data Warehouse:

There are four characteristics of a DWH. They allow analysts to make informed decisions by
analyzing the data.

1. Subject-oriented:

The data in a DWH is organized around a specific subject or topic, such as sales or customers,
rather than around an organization’s ongoing operations. This gives a clear and concise view of the
subject by excluding information that isn’t relevant to the decision-making process.

2. Time-variant:

Data in the DWH is loaded at regular intervals (monthly, weekly, hourly, etc.) and is consistent
within each period: once loaded, the data for a period does not change. Every record is therefore
tied to a specific point in time.
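
As a rough illustration of time-variant data, the sketch below keeps one snapshot per load date and answers "what did the data look like as of a given day?". The `snapshots` dictionary, the `as_of` helper, and the sample figures are all made up for this example; a real DWH would hold this in dated tables.

```python
from datetime import date

# Hypothetical periodic snapshots: load date -> stock level per product.
snapshots = {
    date(2024, 1, 31): {"widget": 120},
    date(2024, 2, 29): {"widget": 95},
}

def as_of(query_date):
    """Return the most recent snapshot loaded on or before query_date, or None."""
    eligible = [loaded_on for loaded_on in snapshots if loaded_on <= query_date]
    if not eligible:
        return None
    return snapshots[max(eligible)]

mid_february = as_of(date(2024, 2, 15))   # still the January snapshot
early_march = as_of(date(2024, 3, 1))     # the February snapshot
```

Between two loads the answer stays the same, which is the sense in which the data is consistent within a period.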

3. Non-volatile:

When you add new data, the previous data is not erased or overwritten. Since a DWH is separate from
an operational database, routine changes in the operational database do not alter data already in
the DWH. Because of this lack of volatility, analysts can see what happened and when, which makes
the results of an analysis easier to interpret.
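
The contrast between an operational store that is updated in place and a warehouse that only receives new rows might be sketched as follows. The `operational` dictionary, the `warehouse` list, and the `load_batch` helper are hypothetical names used only for illustration.

```python
# Operational store: holds only the current state and is overwritten in place.
operational = {"C1": {"city": "Berlin"}}

# Warehouse: append-only list of loaded rows (hypothetical in-memory stand-in).
warehouse = []

def load_batch(batch_label):
    """Copy the current operational state into the warehouse as new rows."""
    for customer_id, record in operational.items():
        warehouse.append({"batch": batch_label, "customer_id": customer_id, **record})

load_batch("2024-01")
operational["C1"]["city"] = "Munich"   # routine operational update
load_batch("2024-02")                  # adds a new row; the 2024-01 row is untouched
```

After both loads the warehouse holds two rows for the same customer, so the earlier city is still available for analysis.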

4. Integrated:

In a DWH, integration entails establishing consistent names and units of measurement for all
related data from various databases. This reduces data redundancy: if the same data has different
names in different sources, the DWH will identify all of them under one name.
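
A minimal sketch of this idea, assuming two source systems that name the customer key differently and record weights in different units. The field names, the `FIELD_ALIASES` and `TO_KG` tables, and the `integrate` function are illustrative assumptions, not part of any particular product.

```python
# Hypothetical mapping from source-specific field names to one warehouse name.
FIELD_ALIASES = {"cust_id": "customer_id", "CustomerNo": "customer_id"}

# Conversion factors from each source unit to the warehouse unit (kilograms).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

def integrate(record):
    """Return a copy of record with unified field names and weight in kilograms."""
    unified = {}
    for name, value in record.items():
        unified[FIELD_ALIASES.get(name, name)] = value
    unit = unified.pop("weight_unit", "kg")   # assume kilograms when no unit is given
    if "weight" in unified:
        unified["weight_kg"] = unified.pop("weight") * TO_KG[unit]
    return unified

crm_row = {"CustomerNo": 7, "weight": 2000, "weight_unit": "g"}
erp_row = {"cust_id": 7, "weight": 2.0, "weight_unit": "kg"}

crm_unified = integrate(crm_row)
erp_unified = integrate(erp_row)
```

Both rows end up with a `customer_id` field and a `weight_kg` field, so they can be compared or combined directly.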

Key Functions of a Data Warehouse:

A DWH serves as a central data repository. Keeping data in one place can lower the cost of storage
and backup at the organizational level. It also maintains information about its tables, including
those holding highly granular transaction-level data, which helps define the data warehousing
approach.

Here are some functions of a DWH:

Data consolidation: It refers to collecting data from all the sources in an organization, cleaning
it, and combining it in a single location.

Data cleaning: It is a process that ensures the correctness of data. The accuracy, integration, and
consistency are checked.

Data integration: It combines data from different sources in an organization under a single view to
the users.

Data extraction: It is the process of extracting data from an organization for further use.

Data transformation: It is the process of converting some data into a usable format.

Data loading: It is the process of loading the data into a storage system.

Refreshing: It is the process of updating the data.

Important: Data cleaning and transformation are crucial to improving data quality and data
mining results.
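
The extraction, cleaning, transformation, and loading functions listed above can be sketched end to end with Python's standard library. The sample CSV, the `orders` table, and the `extract`, `transform`, and `load` helpers are assumptions made for this example, and an in-memory SQLite database stands in for the warehouse storage system.

```python
import csv
import io
import sqlite3

# Made-up source data with untidy names and one missing amount.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""

def extract(text):
    """Extraction: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cleaning and transformation: drop incomplete rows, normalize names and types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append((int(row["order_id"]), row["customer"].strip().title(), float(amount)))
    return cleaned

def load(conn, rows):
    """Loading and refreshing: insert rows, replacing any row with the same order_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Running `load` again with corrected rows for the same `order_id` values would refresh the stored data rather than duplicate it.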

Implementation of a Data Warehouse:

Implementing a DWH entails building and deploying a DWH system to collect and organize company
data for analytical querying and reporting.

Data warehousing is one of the most critical components of an organization’s business intelligence
process, and its deployment requires a set of steps that must be followed meticulously:

1. Planning:

Planning produces the road map we must follow to reach the goals and objectives we have set.
Without good planning, the project is likely to fail. While planning, we need to consider the
technical, launch, and user requirements. Some important considerations are as follows:

 – Data backup
 – Restoration and recovery of data
 – Data sources to be used
 – Combining data
 – Storing data
 – Cost estimations

2. Data Gathering:

Although data is readily available, not all of it is useful to a business. Data gathering is the
process of collecting data from various sources so that it can be used for analysis and reporting.
Some things to consider are:

 – Data quality
 – Data joining
 – The health of the ETL tools
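
As one way to make the data quality consideration concrete, the sketch below counts rows with missing required fields and rows whose key repeats. The `quality_report` function, its parameters, and the sample rows are hypothetical; real projects usually rely on the checks built into their ETL tooling.

```python
def quality_report(rows, key, required):
    """Count rows with missing required fields and rows whose key was already seen."""
    seen = set()
    missing = 0
    duplicates = 0
    for row in rows:
        if any(row.get(field) in (None, "") for field in required):
            missing += 1
        if row.get(key) in seen:
            duplicates += 1
        seen.add(row.get(key))
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

# Made-up sample: one row lacks an email, and id 1 appears twice.
sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 1, "email": "a@example.com"},
]
report = quality_report(sample, key="id", required=["email"])
```

A report like this can be produced for each source before its data is joined with the others.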

3. Data Analysis:

Data analysis is the process of extracting relevant insights from the collected data. Most of the
considerations from data gathering also apply to this step. Additional ones are:

 – Sources from which the data is collected
 – ETL tools and process
 – Processes to analyze the data
 – Testing

4. Business Actions:

The more insights the analysis yields, the better informed business decisions become, and these
decisions will shape the organization’s future. Some things to consider are:

 – Efficiency of data gathering
 – Efficiency of data analysis
 – Metadata
 – Factors affecting the decisions
