Internship

The document outlines the architecture of data warehouses, detailing its three layers: bottom (database server), middle (OLAP server), and top (front-end tools). It also describes the ETL process, OLAP capabilities, data schemas (star and snowflake), and data marts, along with data preprocessing, clustering, and anomaly detection techniques in data mining. Additionally, it highlights the applications of data mining in various industries and differentiates between data warehousing and data mining.

Uploaded by

sandhiyamurthi05

DATA MINING AND WAREHOUSE

t. Dilly babu

1. Data Warehouse Architecture :

❖ A data warehouse is a system used for reporting and data analysis.

❖ Its architecture typically has three tiers.

Bottom Tier:
A database server where data is stored.

Middle Tier:
An OLAP server that helps analyze data quickly.

Top Tier:
The front-end tools used by users to get reports or analyze data.
2. ETL (Extract, Transform, Load) Process :

❖ ETL stands for Extract, Transform, and Load.

❖ It is the process of:
➢ Extracting data from different sources.
➢ Transforming it into a proper format (cleaning and organizing it).
➢ Loading it into a data warehouse.

❖ This process is essential for making sure the data is accurate and ready for analysis.
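
The three ETL steps can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the inline CSV text, the `orders` table, and the cleaning rules (trim names, skip rows with a missing amount) are assumptions made for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (inline so the sketch is self-contained).
RAW_CSV = """order_id,customer,amount
1, alice ,100.50
2,BOB,
3,carol,75
"""

def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalise names, skip rows with a missing amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # incomplete record, skipped in this sketch
        cleaned.append((int(row["order_id"]), row["customer"].strip().title(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into a warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

# In-memory SQLite database standing in for the data warehouse (an assumption).
warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), warehouse)
loaded_orders = warehouse.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded_orders)
```

Only the two complete rows reach the warehouse. A real pipeline would normally log or repair the rejected row instead of silently dropping it.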
3. OLAP (Online Analytical Processing) :

❖ OLAP stands for Online Analytical Processing.

❖ OLAP is a tool that helps users analyze data quickly in many ways.

❖ It allows users to view data along different dimensions, such as time, product, or region.

❖ OLAP makes it easier to find patterns or trends in large datasets.
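
Viewing data in different dimensions can be illustrated with two common OLAP operations, roll-up and slice. The sketch below uses plain Python rather than a real OLAP server. The `SALES` records and the helper names `roll_up` and `slice_cube` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: region, product, quarter.
SALES = [
    {"region": "North", "product": "Pen", "quarter": "Q1", "amount": 100},
    {"region": "North", "product": "Book", "quarter": "Q1", "amount": 250},
    {"region": "South", "product": "Pen", "quarter": "Q1", "amount": 80},
    {"region": "South", "product": "Pen", "quarter": "Q2", "amount": 120},
]

def roll_up(facts, dimensions):
    """Aggregate the amount over the chosen dimensions (roll-up)."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)

def slice_cube(facts, dimension, value):
    """Keep only the facts where one dimension has a fixed value (slice)."""
    return [fact for fact in facts if fact[dimension] == value]

# The same facts viewed along different dimensions.
by_region = roll_up(SALES, ["region"])
by_region_product = roll_up(SALES, ["region", "product"])
q1_by_product = roll_up(slice_cube(SALES, "quarter", "Q1"), ["product"])
print(by_region)
print(q1_by_product)
```

An OLAP server typically precomputes or indexes such aggregates so that these views return quickly on large datasets.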

4. Star and Snowflake Schemas :

❖ These are two ways to organize data in a data warehouse:

Star Schema:
➢ A simple structure where every dimension table connects directly to a central fact table.

Snowflake Schema:
➢ A more complex version where dimensions are split into smaller, normalized tables.
➢ Both help improve the speed and efficiency of data queries.
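
A star schema can be sketched with SQLite from Python's standard library. The table names (`fact_sales`, `dim_product`, `dim_store`), the columns, and the sample rows are assumptions chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/where" of each sale.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
-- The central fact table holds the measurements and points at each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Novel', 'Books');
INSERT INTO dim_store VALUES (1, 'Chennai'), (2, 'Madurai');
INSERT INTO fact_sales VALUES (1, 1, 20.0), (1, 2, 30.0), (2, 1, 150.0);
""")

# One join from the fact table to a dimension answers "sales per category".
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

In a snowflake version of the same design, `category` would move out of `dim_product` into its own table, so the query would need one more join.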
5. Data Marts :
❖ A data mart is a smaller, focused part of a data warehouse.
❖ It stores data related to one specific department or subject, like sales, marketing, or finance.
❖ It helps that department quickly access the data it needs without searching the whole data warehouse.

Types of Data Marts:

1. Dependent Data Mart :

Gets its data from the main data warehouse.

2. Independent Data Mart :

Built directly from different data sources (not from a warehouse).

3. Hybrid Data Mart :

Uses both warehouse and other sources.
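
As a rough sketch of a dependent data mart, the example below builds a small sales mart from a warehouse table. The table `warehouse_facts`, its columns, and the sample values are hypothetical, and SQLite is used only as a stand-in for a real warehouse.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL);
INSERT INTO warehouse_facts VALUES
    ('sales', 'revenue', 1200.0),
    ('sales', 'orders', 45.0),
    ('finance', 'expenses', 800.0),
    ('marketing', 'leads', 130.0);
-- Dependent data mart: holds only the rows the sales department needs.
CREATE TABLE sales_mart AS
    SELECT metric, value FROM warehouse_facts WHERE department = 'sales';
""")

sales_mart_rows = warehouse.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(sales_mart_rows)
```

An independent data mart would fill `sales_mart` straight from source systems, and a hybrid one would combine both routes.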
6. Data preprocessing :
❖ Data preprocessing is the process of cleaning and preparing raw data before it is used in data mining or machine learning.

❖ Raw data is often incomplete, inconsistent, noisy (contains errors), or not in the right format.

❖ Preprocessing makes the data accurate, clean, and ready for analysis.
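
A small Python sketch shows what preprocessing can look like in practice. The records in `RAW` and the three cleaning rules (tidy the city name, remove duplicates, fill a missing age with the mean) are assumptions made for this example; other datasets will need other rules.

```python
from statistics import mean

# Hypothetical raw records: one missing age, one duplicate, inconsistent city spelling.
RAW = [
    {"name": "Asha", "age": 30, "city": "chennai "},
    {"name": "Ravi", "age": None, "city": "Madurai"},
    {"name": "Asha", "age": 30, "city": "chennai "},
    {"name": "Mala", "age": 40, "city": "MADURAI"},
]

def preprocess(records):
    # 1. Fix inconsistent formatting (extra spaces, mixed case).
    tidy = [{**r, "city": r["city"].strip().title()} for r in records]
    # 2. Remove exact duplicates while keeping the original order.
    seen, unique = set(), []
    for r in tidy:
        key = (r["name"], r["age"], r["city"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # 3. Fill missing ages with the mean of the known ages (one common choice).
    known = [r["age"] for r in unique if r["age"] is not None]
    fill = mean(known)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in unique]

clean = preprocess(RAW)
for row in clean:
    print(row)
```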

7. Clustering :
❖ Clustering is a data mining technique used to group similar data items together based on their features or patterns.

❖ Unlike classification, clustering does not use predefined labels. It tries to discover natural groupings in the data.

❖ It is used to understand hidden patterns in data.
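
The notes do not name a particular algorithm, so the sketch below uses k-means, one widely used clustering method, on one-dimensional data. The function name `k_means_1d`, the starting centroids, and the `spend` values are assumptions for illustration.

```python
def k_means_1d(values, centroids, iterations=10):
    """Assign each number to its nearest centroid, then move each centroid to its group mean."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per customer; no labels are given to the algorithm.
spend = [10, 12, 11, 95, 100, 98]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
print(centroids)
print(clusters)
```

The two groups (low spenders and high spenders) emerge from the data alone, which is the difference from classification noted above.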
8. Anomaly Detection :
❖ Anomaly detection is a technique used to identify unusual or unexpected data that does not follow the normal pattern.

❖ These unusual items are called anomalies, outliers, or exceptions.

❖ It helps detect fraud, errors, or rare events.

❖ It helps identify problems early, before they become serious.

❖ It is useful in security, healthcare, finance, and monitoring systems.
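
One simple way to flag anomalies is the z-score: how many standard deviations a value lies from the mean. The sketch below is a minimal example of that idea, not the only approach; the threshold of 2.0 and the `transactions` values are assumptions.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical transaction amounts with one unusually large payment.
transactions = [52, 48, 50, 51, 49, 50, 47, 53, 500]
outliers = find_anomalies(transactions)
print(outliers)
```

With very small samples a single extreme value inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.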
9. Applications of Data Mining :
❖ Data mining is the process of finding patterns, trends, or useful information in large sets of data.

❖ This information is then used to make better decisions, predict future outcomes, and improve performance in many industries.

Marketing:
To find target customers.

Healthcare:
To predict diseases.

Banking:
To detect fraud.
10. Difference between data warehouse and data mining :

Aspect | Data Warehousing | Data Mining
Purpose | Store and manage large amounts of data | Extract patterns and insights from the data
Function | Collect, clean, and organize data | Analyze data to discover trends, predictions, etc.
Process | Data loading, transformation, and storage | Pattern recognition, classification, clustering
Output | A well-structured database or repository | Models, rules, predictions, and knowledge
Focus | Data integration and storage | Data analysis and knowledge discovery
Users | Database administrators, IT teams | Data analysts, data scientists, business analysts
Example | A centralized system holding sales data from stores | Finding customer buying patterns from sales data
thank you
