Data Mining

Data mining, or knowledge discovery in data (KDD), involves extracting valuable patterns and information from large datasets, which can aid businesses in marketing, fraud detection, and decision-making. Key techniques include anomaly detection, association rule learning, clustering, classification, and regression, all aimed at uncovering insights and making predictions. The process encompasses data collection, cleaning, integration, selection, transformation, and applying algorithms, with supervised learning relying on labeled data for training.

Uploaded by

chauhanabhishekmr


• Data mining, also known as knowledge discovery in data (KDD), is the process of uncovering patterns and other valuable information from large data sets.

• Data mining can be used by corporations for everything from learning about what customers
are interested in or want to buy to fraud detection and spam filtering.

• It can help them to develop more effective marketing strategies, increase sales, and decrease
costs. Data mining relies on effective data collection, warehousing, and computer processing.

• It is also a market research tool that helps reveal the sentiment or opinions of a given group
of people.

• Social media companies use data mining techniques to commodify their users in order to
generate profit.

• Data mining is the process of discovering patterns, trends, and insights from large datasets
using various techniques from statistics, machine learning, and database systems.

• It involves extracting useful information from data, often with the goal of making informed
decisions or predictions.

Key properties of data mining

• Automatic discovery of patterns

• Prediction of likely outcomes

• Creation of actionable information

• Focus on large datasets and databases

Techniques of Data Mining

• Anomaly detection (outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or of data errors that require further investigation.

• Association rule learning (dependency modelling) – Searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.

• Clustering – is the task of discovering groups and structures in the data that are in some way
or another "similar", without using known structures in the data.

• Classification – is the task of generalizing known structure to apply to new data. For
example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".

• Regression – attempts to find a function which models the data with the least error.

• Summarization – providing a more compact representation of the data set, including visualization and report generation.
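
The market basket example above can be made concrete with a small sketch. The transactions, item names, and the helper functions `pair_support` and `confidence` are hypothetical and chosen only for illustration. The sketch counts how often two items appear in the same basket (support) and how often one item appears given that another is present (confidence). Real association rule miners such as Apriori do considerably more than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` is bought when `antecedent` is bought."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
```

In this toy data, bread and butter share a basket in half of all transactions, and every basket containing butter also contains bread, which is the kind of pattern a supermarket could act on.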

How Data Mining Works

1. Data is collected and loaded into data warehouses on site or on a cloud service.

2. Business analysts, management teams, and information technology professionals access the
data and determine how they want to organize it.

3. Custom application software sorts and organizes the data.

4. The end user presents the data in an easy-to-share format, such as a graph or table.

Knowledge Discovery in Databases (KDD)

• Some people treat data mining as synonymous with knowledge discovery, while others view
data mining as an essential step in the process of knowledge discovery.

1. Data Cleaning - In this step noise and inconsistent data are removed.

2. Data Integration - In this step multiple data sources are combined.

3. Data Selection - In this step data relevant to the analysis task are retrieved from the database.

4. Data Transformation - In this step data are transformed or consolidated into forms
appropriate for mining by performing summary or aggregation operations.

5. Estimate the model - The selection and implementation of the appropriate data-mining
technique is the main task in this phase.

6. Interpret the model and draw conclusions.
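
Steps 1 to 4 above can be sketched on toy records. This is a minimal illustration, not a prescribed implementation: the two data sources, the field names, the rule that a negative amount counts as noise, and the helper functions `clean`, `integrate`, `select`, and `transform` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sales records from two sources; None marks a missing value.
store_sales = [
    {"customer": "a", "amount": 10.0, "channel": "store"},
    {"customer": "b", "amount": None, "channel": "store"},
]
web_sales = [
    {"customer": "a", "amount": 5.0, "channel": "web"},
    {"customer": "c", "amount": -3.0, "channel": "web"},  # assumed to be noise
]


def clean(records):
    """Step 1: drop records with a missing or negative amount."""
    return [r for r in records if r["amount"] is not None and r["amount"] >= 0]


def integrate(*sources):
    """Step 2: combine several data sources into one list."""
    return [r for source in sources for r in source]


def select(records, fields):
    """Step 3: keep only the fields relevant to the analysis task."""
    return [{f: r[f] for f in fields} for r in records]


def transform(records):
    """Step 4: aggregate the amount per customer."""
    totals = defaultdict(float)
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


prepared = transform(
    select(integrate(clean(store_sales), clean(web_sales)), ["customer", "amount"])
)
```

Only customer "a" survives cleaning here, so `prepared` holds a single aggregated total that a mining technique (step 5) could then consume.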

Data mining process

• Setting objectives

• Data gathering

• Data preparation

  1. Outlier detection (and removal) – Outliers are unusual data values that are not consistent with most observations.

  2. Scaling, encoding, and selecting features – Data preprocessing includes several steps, such as variable scaling and different types of encoding. For example, a feature with the range [0, 1] and another with the range [−100, 1000] will not carry the same weight in the applied technique, so they will influence the final data-mining results differently.

• Applying data mining algorithms

  • Supervised learning

    • Classification

    • Regression

  • Unsupervised learning

• Evaluating results
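
The two data preparation steps can be sketched as follows. The z-score rule with a threshold of 2 and min-max scaling are common choices assumed here for illustration; the notes do not name a specific method, and the function names and sample values are hypothetical.

```python
import statistics


def remove_outliers(values, threshold=2.0):
    """Drop values whose z-score exceeds `threshold` in absolute value."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mean) / spread <= threshold]


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - low) / (high - low) for v in values]


# Hypothetical measurements with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 100]
cleaned = remove_outliers(readings)

# A feature spanning [-100, 1000] is brought onto the same [0, 1] scale
# as a feature that already lies in [0, 1].
scaled = min_max_scale([-100, 450, 1000])
```

After scaling, both features vary over the same interval, so a distance-based technique no longer lets the wide-range feature dominate purely because of its units.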

Supervised learning

• Supervised learning is an approach to machine learning that uses labeled data sets to train
algorithms in order to properly classify data and predict outcomes.
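
As a minimal sketch of learning from labeled data, the example below classifies an e-mail by copying the label of its closest labeled example (a 1-nearest-neighbour rule). The two features, the training examples, and the `predict` function are hypothetical; many other supervised algorithms would serve equally well.

```python
import math

# Hypothetical labeled training set: (features, label).
# Features are (number of links, number of exclamation marks) in an e-mail.
training = [
    ((0, 0), "legitimate"),
    ((1, 0), "legitimate"),
    ((7, 5), "spam"),
    ((9, 8), "spam"),
]


def predict(features, labeled_examples):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(labeled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]
```

Calling `predict((8, 6), training)` returns the label of the nearest labeled e-mail. Without the labels in `training` the function would have nothing to return, which is what distinguishes this approach from unsupervised learning.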

Challenges of supervised learning

• Supervised learning models can require certain levels of expertise to structure accurately.

• Training supervised learning models can be very time-intensive.

• Datasets can have a higher likelihood of human error, resulting in algorithms learning
incorrectly.

• Unlike unsupervised learning models, supervised learning cannot cluster or classify data on
its own.
