0% found this document useful (0 votes)
218 views7 pages

Cloud Data Warehouse

An enterprise data warehouse (EDW) centralizes business data from multiple sources for analysis and reporting. Traditionally run on in-house infrastructure, EDWs are now commonly hosted in the cloud to provide benefits like lower costs, higher scalability, and faster analytics. Implementing an EDW in the cloud involves assessing workloads, evaluating ETL tools, gaining cloud expertise, estimating migration costs, and conducting a phased approach of workshops, assessment, minimum viable cloud build, migration, and operations.

Uploaded by

Naveen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
218 views7 pages

Cloud Data Warehouse

An enterprise data warehouse (EDW) centralizes business data from multiple sources for analysis and reporting. Traditionally run on in-house infrastructure, EDWs are now commonly hosted in the cloud to provide benefits like lower costs, higher scalability, and faster analytics. Implementing an EDW in the cloud involves assessing workloads, evaluating ETL tools, gaining cloud expertise, estimating migration costs, and conducting a phased approach of workshops, assessment, minimum viable cloud build, migration, and operations.

Uploaded by

Naveen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 7

How a Cloud-Hosted Data Warehouse For an Enterprises Works

In simple terms, data warehousing is a process used to collect and analyze data from multiple
sources. The importance of data warehouses has been boosted with the availability of large
volumes of business data that is now available for analysis and reporting among business
users.

However, data warehousing is not a recent development and has been in the making since the
late 1980s. Before the evolution of cloud technology, traditional data warehouses were
operated on in-house infrastructure which was more expensive.

Additionally, these data warehouses contained data from selected sources including relational
database tables and data files. Today, data warehouses include these data sources and
additional sources including CRM & ERP systems, social media platforms, IoT devices, big
data, and much more.

So, how has data warehousing for the enterprise evolved since the 80s? How does a data
warehouse on the cloud compared with the traditional model? Let’s examine this and much
more in the remaining sections.

What is an Enterprise Data Warehouse?

An Enterprise Data Warehouse (or EDW) is a centralized database (or a collection of


databases) that stores business data from multiple data sources. This business data is then
made available for data analytics and for deriving valuable insights.

In today’s age of big data, an enterprise warehouse is a central repository that collects data
from multiple sources (as illustrated in the figure). This includes structured, unstructured, and
partially structured data that goes through an ETL (extraction, transformation, and loading)
process before being stored in the data warehouse.

According to Oracle, the enterprise data warehouse is a data management system that has
been designed for performing business intelligence or BI.

With the availability of all the business data in a centralized warehouse, the data can now be
analyzed using data analysis or BI tools for more accurate insights. For example, a data
warehouse in an airline company can be used to determine the current crew assignment,
frequent flyer programs, and air route analysis.

In recent years, companies have moved their enterprise data warehouse architecture from
traditional in-house warehousing to cloud-based architecture. An enterprise cloud data
warehouse provides the following business benefits as compared to the traditional
warehousing model:

 Eliminates the need to purchase any in-house hardware for data warehousing.
 Offer lower upfront costs as compared to traditional warehouses.
 It offers higher scalability with an increase in available data.
 Faster processing of analytics and queries thanks to its massively parallel processing
(or MPP) capabilities.

Next, we shall discuss the Enterprise data warehouse architecture along with its key
components.

Enterprise Data Warehouse – Architecture & Key Components

First, we shall discuss the key components of the enterprise data warehouse architecture.

Typically, a three-tier enterprise data warehouse architecture consists of the following


components, namely:

 Data sources or the databases that are storing the business data.
 Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT), referred to as the
ETL or ELT tools that are used for data extraction, transformation, and loading into
the EDW. The main difference between ETL and ELT is that the ETL process
includes a staging area where the data transformation is executed.
 Staging area component that is included only when ETL tools are deployed. This is
the component where the extracted data is loaded and transformed before being stored
in the data warehouse.
 Data warehousing database where the extracted data is loaded and transformed (only
in the case of ELT) into the storage space.
 OLAP Server component used for online analytical processing of the data. This
component allows business users to slice & dice the generated data for compiling
business reports.
 Reporting components or tools that provide users with the BI interface for visualizing
the data and report generation.

As a technological innovation, an enterprise data warehouse can support many workloads in


any business enterprise, including manufacturing, financial reporting, supply chains, and
customer satisfaction analysis.

Among the primary reason to migrate the data enterprise to the cloud, data scalability can be
a major challenge as organizations keep adding new or more complex workloads. This
growth can create multiple workloads including analytical and operational workloads.

For proper planning, businesses must consider the following key categories of EDW
workloads:

 Auditable workloads, that include business legal requirements and operations for any
company. These workloads include data for business compliance, risk, and
government regulations.
 Sustained workloads, that are typically used by the management team to check key
measures such as customer satisfaction, CRM, and other business-driving metrics.
 Variable workloads, that include any financial and operational reporting data that do
not use EDW on a continuous basis. Ideal for cloud-powered EDW, these workloads
are used for sales planning and compensation.
 Organizational reporting workloads, that comprise of metrics and KPIs, which
measure business performance. This workload is ideal for an enterprise data
warehouse in the cloud platform.
 Business unit-specific workloads, that are department-specific data and are among the
first workloads that can be safely migrated to the cloud.

Now that we have an understanding of the EDW components and workload categories, let’s
move on to the process of implement the EDW on a cloud platform.

How to Implement EDW on the Cloud

With the emergence of cloud-powered technologies, business enterprises are implementing


(or migrating) their enterprise data warehouse in cloud platforms. Cloud-based data
warehousing services are offered on technologies such as Amazon Redshift, Google
BigQuery, and IBM DB2.

Before migrating your EDW to a cloud platform, any business must assess a few migration
considerations. This is crucial to ensure that the cloud-based data warehousing workloads are
integrated seamlessly with the in-premise workloads, thus minimizing any downtime. 

Each of the following considerations can be used to determine the priority of the workloads
to be moved to the cloud:

Define the required data sets and data volumes


For large data volumes, migrating the data warehouse from the premise to the cloud can be a
long and time-consuming process. Hence, you must define the data sets early in the process.
This ensures smooth connectivity for data movement and the design of a project schedule for
a time-bound data migration process.

Evaluate your ETL tools for cloud operation

A cloud-powered EDW requires all your ETL processes and data to be enabled beyond in-
premise warehousing. As a consideration, evaluate if your existing ETL tools are valid for
cloud operation and can be integrated with cloud-based EDW technologies.

Assess your development team’s expertise in cloud technologies

Cloud-powered EDWs offer capabilities that can be a challenge for developers who are used
to working on in-premise warehouses. Through proper training and learning, ensure that your
project developers gain the necessary expertise and resources required to adopt a cloud EDW.

Estimate the overall costs of data movement

Business costs of moving data from an on-premise warehouse to the cloud can overshoot in
the event of an inefficient migration process. As a business, evaluate the overall costs of
migration before proceeding with the same.

Implementing data warehousing for enterprise on the cloud requires a phase-wise process
comprising of new deployments as well as the migration of the existing workloads to the
cloud.

Let’s discuss each of these phases in detail:

1. Workshop phase

This is the initial phase where you can define the business metrics and KPIs for measuring
the EDW project deliverables. For defining target workloads, engage and brainstorm with
your team members including business analysts, BI programmers, and database
administrators. You can also identify the use cases for the cloud-powered EDW.

2. Assessment and planning phase

During the assessment phase, list and map out your workloads to data sets, database tables,
and other structures. For better governance and compliance, define the necessary security
controls. Additionally, you can define the roadmap for a minimum viable cloud (MVC) as
well as the staffing for executing the MVC build and other EDW operations.

3. MVC Build phase

In the third phase, design and implement the MVC build including elements such as
connectivity, routing, access controls, and other deployment tools. Additionally, you should
configure separate environments for development, testing, and production activities. Along
with building database instances, you can configure the staging environment for data
replication and transformation.

4. Migration phase

Finally, we come to the migration phase where you can move your datasets to the cloud
EDW platform. Additionally, you can implement the ETL processes during this phase to
ensure integrity between the on-premise and cloud premise workloads. Identify the individual
workloads (for development, testing, production) and migrate them along with the data
workflows. At last, reconfigure your data analytical models for execution on the cloud.

5. Operation phase

The final phase is the operational phase for executing an operational and automated response
to any system events. During this phase, identify the operation team personnel and if
necessary, train them to identify and respond to system events.

With this phase-wise implementation, you can complete the migration of your EDW system
to the cloud. Next, we shall look at how a cloud-powered enterprise data warehouse benefits
any organization.

Business Benefits of EDW on the Cloud

Cloud-hosted data warehousing for enterprise offer multiple business benefits including the
following: 

Lower operational costs

Any on-premise infrastructure involves multiple cost avenues including license costs,
hardware, manpower, and even deployment and maintenance-related costs. For cloud-
powered EDW, you only pay for what you consume. Additionally, cloud platforms free up
your internal resources for other work which would otherwise, have been allocated to
managing your data warehousing activities.

Higher business agility

Capabilities like effective data analytics and business intelligence are no longer just an IT
domain but are now used by business users and departments. In this environment, it’s no
longer feasible for business functions to wait for IT personnel to set up a data warehouse for
them.

On the other hand, cloud deployments offer the advantage of faster and agile data
warehousing for the enterprise and business users.

Enhanced security and protection

Cloud deployments have often raised business concerns about data security and privacy. This
has been addressed by cloud-powered tools like Microsoft Azure and Google Cloud
Storage that place a high value on their security measures. For example, measures like data
encryption on the Google Cloud Platform has improved data security.

Conclusion

For any industry or business domain, data warehousing for an enterprise has its share of
benefits and challenges. To simplify decisions, you must first understand your data-related
needs and the feasibility of migrating your data warehouse to the cloud.

With its customized services in cloud engineering and analytics, Countants has enabled its
global clients to move to the cloud platform and reduce their operational costs.

Want to leverage your business from our BI and cloud-powered services? Get in touch now
with your business queries and needs.

Here are key differences between the two data associated terms in the
mentioned aspects:

Parameters Data Lake Data Warehouse


A data warehouse will consist of data
In the data lake, all data is kept
that is extracted from transactional
irrespective of the source and its
systems or data which consists of
Storage structure. Data is kept in its raw form. It
quantitative metrics with their
is only transformed when it is ready to
attributes. The data is cleaned and
be used.
transformed
Big data technologies used in data Data warehouse concept, unlike big
History
lakes is relatively new. data, had been used for decades.
Captures all kinds of data and
Captures structured information and
Data structures, semi-structured and
organizes them in schemas as
Capturing unstructured in their original form from
defined for data warehouse purposes
source systems.
Data lakes can retain all data. This
includes not only the data that is in use In the data warehouse development
Data
but also data that it might use in the process, significant time is spent on
Timeline
future. Also, data is kept for all time, to analyzing various data sources.
go back in time and do an analysis.
Users Data lake is ideal for the users who The data warehouse is ideal for
indulge in deep analysis. Such users operational users because of being
include data scientists who need well structured, easy to use and
advanced analytical tools with understand.
capabilities such as predictive modeling
Parameters Data Lake Data Warehouse
and statistical analysis.
Data storing in big data technologies
Storage Storing data in Data warehouse is
are relatively inexpensive then storing
Costs costlier and time-consuming.
data in a data warehouse.
Data lakes can contain all data and data
Data warehouses can provide insights
types; it empowers users to access data
Task into pre-defined questions for pre-
prior the process of transformed,
defined data types.
cleansed and structured.
Data lakes empower users to access
data before it has been transformed, Data warehouses offer insights into
Processing cleansed and structured. Thus, it allows pre-defined questions for pre-defined
time users to get to their result more quickly data types. So, any changes to the
compares to the traditional data data warehouse needed more time.
warehouse.
Typically schema is defined before
Typically, the schema is defined after
data is stored. Requires work at the
Position of data is stored. This offers high agility
start of the process, but offers
Schema and ease of data capture but requires
performance, security, and
work at the end of the process
integration.
Data warehouse uses a traditional
Data Data Lakes use of the ELT (Extract
ETL (Extract Transform Load)
processing Load Transform) process.
process.
The chief complaint against data
Data is kept in its raw form. It is only warehouses is the inability, or the
Complain
transformed when it is ready to be used. problem faced when trying to make
change in in them.
They integrate different types of data to
come up with entirely new questions as
Key
these users not likely to use data
Benefits
warehouses because they may need to
go beyond its capabilities.

You might also like