DM Week 2 Des

The KDD (Knowledge Discovery in Databases) process involves a multi-step approach to extract useful knowledge from large datasets, including data selection, preprocessing, transformation, mining, evaluation, and representation. Each step focuses on improving data quality, applying algorithms to discover patterns, and presenting findings in an understandable format. The process emphasizes the importance of handling data properly to ensure meaningful insights are derived.

Uploaded by

pattisollaithattathey

Explain the KDD process in detail.

The KDD (Knowledge Discovery in Databases) process is a multi-step procedure used to extract useful knowledge from large datasets. It is often associated with data mining but encompasses more than just the data mining step itself.

Here is a step-by-step explanation of the KDD process:

1. Data Selection

Purpose: Identify the relevant data sources from potentially many heterogeneous databases.

Details:

Choose data that is relevant to the analysis goals.

May involve multiple data sources (e.g., databases, data warehouses, flat files).

Ensures the dataset is focused and manageable.
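
As a rough illustration of selection, the sketch below keeps only the columns and rows that matter for a stated goal. The two in-memory "sources", their field names, and the select helper are all hypothetical stand-ins for real databases or files.

```python
# Hypothetical records from two sources; every name here is illustrative.
crm_rows = [
    {"customer_id": 1, "region": "north", "age": 34, "fax": None},
    {"customer_id": 2, "region": "south", "age": 51, "fax": None},
]
sales_rows = [
    {"customer_id": 1, "amount": 120.0, "year": 2023},
    {"customer_id": 2, "amount": 80.0, "year": 2019},
    {"customer_id": 1, "amount": 45.5, "year": 2024},
]


def select(rows, columns, predicate=lambda row: True):
    """Keep rows that satisfy the predicate, and only the named columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# Assumed analysis goal: recent purchasing behaviour by region and age.
customers = select(crm_rows, ["customer_id", "region", "age"])
recent_sales = select(
    sales_rows, ["customer_id", "amount"], lambda r: r["year"] >= 2023
)

print(customers)
print(recent_sales)
```

Dropping the unused fax column and the 2019 sale is what keeps the dataset focused and manageable for the later steps.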

2. Data Preprocessing (Cleaning)

Purpose: Remove noise and inconsistencies to improve data quality.

Details:

Handle missing values, noisy data, and inconsistencies.

Examples: Removing duplicates, correcting wrong entries, dealing with outliers.

This is critical since poor-quality data leads to poor mining results.
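
A minimal cleaning sketch, assuming a single numeric field: it drops exact duplicates, treats out-of-range values as missing, and fills missing values with the median. The records, the 0 to 120 plausibility range, and the choice of median imputation are assumptions made for illustration; real projects pick these rules per attribute.

```python
from statistics import median

# Hypothetical raw records with typical quality problems.
raw = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # missing value
    {"id": 2, "age": None},  # exact duplicate of the previous row
    {"id": 3, "age": 29},
    {"id": 4, "age": 410},   # implausible entry (assumed to be an error)
    {"id": 5, "age": 41},
]


def clean(rows, low=0, high=120):
    # 1. Drop exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Treat values outside the plausible range as missing.
    for row in unique:
        if row["age"] is not None and not (low <= row["age"] <= high):
            row["age"] = None

    # 3. Fill missing values with the median of the remaining valid ones.
    fill = median(r["age"] for r in unique if r["age"] is not None)
    for row in unique:
        if row["age"] is None:
            row["age"] = fill
    return unique


cleaned = clean(raw)
print(cleaned)
```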

3. Data Transformation (Integration and Reduction)

Purpose: Convert data into appropriate formats for mining.

Details:

Integration: Combine data from different sources into a coherent dataset.

Transformation: Normalize or aggregate data (e.g., scaling numeric values, encoding categorical data).

Reduction: Reduce the data volume but keep relevant information (e.g., feature selection, dimensionality reduction).
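
The sketch below shows one simple instance of each idea: min-max scaling for numeric values, one-hot encoding for categorical values, and dropping constant columns as a very crude form of feature selection. The helper names and sample values are hypothetical, and dropping constant columns stands in for the more capable reduction methods used in practice.

```python
def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 indicator rows."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def drop_constant_columns(table):
    """Crude reduction: remove columns that carry no variation."""
    return {name: col for name, col in table.items() if len(set(col)) > 1}


ages = [20, 30, 40]
regions = ["north", "south", "north"]

scaled_ages = min_max_scale(ages)
categories, encoded_regions = one_hot(regions)
reduced = drop_constant_columns({"age": ages, "country": ["IN", "IN", "IN"]})

print(scaled_ages)
print(categories, encoded_regions)
print(list(reduced))
```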

4. Data Mining

Purpose: Apply algorithms to extract patterns or models from prepared data.

Details:

Use techniques like classification, clustering, association rule mining, regression, etc.

This is the core step where intelligent methods are applied.

Output could be patterns, trends, relationships, or predictive models.
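
To make the mining step concrete, here is a small sketch of one of the listed techniques, association rule mining, reduced to its first stage: finding item pairs that occur together often. It counts pairs by brute force rather than using a pruning algorithm such as Apriori, and the transactions and the 0.5 support threshold are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "eggs"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (share of baskets) meets the threshold."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each pair one canonical (alphabetical) form.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


patterns = frequent_pairs(transactions, min_support=0.5)
print(patterns)
```

Here only the pair (bread, milk) appears in at least half of the baskets, so it is the single pattern passed on to evaluation.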

5. Pattern Evaluation

Purpose: Identify the truly interesting, useful, and valid patterns.

Details:

Assess the discovered patterns for relevance and novelty.

Remove redundant or insignificant patterns.

Criteria used: statistical significance, usefulness, understandability.
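
One common way to score candidate association rules is with support, confidence, and lift; the sketch below computes them and filters out weak rules. These three measures are an assumed choice of evaluation criteria, not the only ones, and the transactions, candidate rules, and thresholds are hypothetical.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "eggs"},
]


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    a = sum(1 for t in transactions if antecedent <= t)
    c = sum(1 for t in transactions if consequent <= t)
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = both / n
    confidence = both / a if a else 0.0
    lift = confidence / (c / n) if c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


candidates = [({"bread"}, {"milk"}), ({"butter"}, {"eggs"})]

# Keep rules that are frequent enough and better than chance (lift > 1).
interesting = []
for antecedent, consequent in candidates:
    m = rule_metrics(transactions, antecedent, consequent)
    if m["support"] >= 0.5 and m["lift"] > 1.0:
        interesting.append((antecedent, consequent, m))

for antecedent, consequent, m in interesting:
    print(sorted(antecedent), "->", sorted(consequent), m)
```

The rule butter -> eggs has a high lift but is supported by a single basket, so the support threshold removes it as insignificant.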

6. Knowledge Representation (Visualization)

Purpose: Present the mined knowledge in a user-friendly way.

Details:

Use visualization tools like charts, graphs, dashboards, or reports.

Helps stakeholders understand and act on the findings.

Often includes interactive interfaces for exploring results.
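
As a stand-in for a real charting or dashboard tool, the sketch below renders pattern scores as a plain-text bar chart, which is enough to show the idea of turning numbers into something readable at a glance. The function name, the labels, and the scores are hypothetical, and the function assumes a non-empty input.

```python
def text_bar_chart(values, width=20):
    """Render label -> score pairs as text bars, largest score first."""
    top = max(values.values())
    lines = []
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    for label, value in ranked:
        # Bar length is proportional to the largest score.
        bar = "#" * round(width * value / top) if top else ""
        lines.append(f"{label:<14}{bar} {value:.2f}")
    return "\n".join(lines)


# Hypothetical support values for two discovered patterns.
chart = text_bar_chart({"butter, eggs": 0.25, "bread, milk": 0.75})
print(chart)
```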
