Data Mining Notes

Data Mining, also known as Knowledge Discovery in Databases (KDD), involves extracting useful patterns from large datasets through a series of steps including data cleaning and transformation. It encompasses various tasks such as descriptive and predictive analytics, with applications in market analysis, healthcare, and fraud detection. Key techniques include classification, regression, clustering, and association rule mining, supported by evaluation metrics to assess their effectiveness.

Uploaded by

sameekshavishwakarma16

Data Mining – Exam Notes (Easy Language)

1. Introduction
• Data Mining is the process of discovering useful patterns and knowledge from large datasets.
• It is also called Knowledge Discovery in Databases (KDD).
• Steps of KDD: Data Cleaning → Data Integration → Data Selection → Data Transformation → Data Mining → Pattern Evaluation → Knowledge Presentation.
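
The KDD steps listed above can be pictured as a chain of small functions, each feeding the next. The following is only a minimal sketch under stated assumptions: the two "stores", the `spend` field, and every function name are hypothetical and chosen for illustration, and the "mining" step is deliberately trivial (it flags above-average values) rather than a real algorithm.

```python
from statistics import mean

# Hypothetical toy data from two sources (illustrative values only).
store_a = [{"customer": "c1", "spend": 120.0}, {"customer": "c2", "spend": None}]
store_b = [{"customer": "c3", "spend": 80.0}, {"customer": "c4", "spend": 400.0}]


def clean(records):
    """Data Cleaning: drop records whose spend value is missing."""
    return [r for r in records if r["spend"] is not None]


def integrate(*sources):
    """Data Integration: combine records from multiple sources."""
    return [r for source in sources for r in source]


def select(records):
    """Data Selection: keep only the attribute needed for the analysis."""
    return [r["spend"] for r in records]


def transform(values):
    """Data Transformation: min-max normalize the values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal, so there is nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mine(values):
    """Data Mining: a trivial stand-in pattern that flags above-average values."""
    avg = mean(values)
    return [v > avg for v in values]


def evaluate(flags):
    """Pattern Evaluation: fraction of records that were flagged."""
    return sum(flags) / len(flags)


# Run the steps in the order given in the notes.
records = integrate(clean(store_a), clean(store_b))
normalized = transform(select(records))
flags = mine(normalized)
flagged_share = evaluate(flags)

# Knowledge Presentation: report the result in readable form.
print(f"{flagged_share:.0%} of customers spend above average")
```

In practice each step would be far more involved; the point of the sketch is only the ordering, where the output of one step becomes the input of the next.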

2. Types of Data
• Structured Data – tables, rows, columns.
• Unstructured Data – text, images, videos.
• Semi-structured Data – XML, JSON.

3. Data Mining Tasks
a) Descriptive – Find patterns that describe data (clustering, association rules).
b) Predictive – Predict future outcomes (classification, regression).

4. Data Preprocessing
• Data Cleaning – remove noise and handle missing values.
• Data Integration – combine data from multiple sources.
• Data Transformation – normalization, aggregation.
• Data Reduction – reduce size using PCA, sampling.

5. Classification
• Predicts a category/class label.
• Algorithms: Decision Tree, Naive Bayes, KNN, SVM, Random Forest.
• Example: Email → spam or not spam.

6. Regression
• Predicts continuous values.
• Algorithms: Linear Regression, Polynomial Regression.
• Example: Predicting house prices.

7. Clustering
• Groups similar data objects without labels.
• Algorithms: K-Means, Hierarchical Clustering, DBSCAN.
• Example: Customer segmentation.

8. Association Rule Mining
• Finds relationships among items.
• Algorithm: Apriori.
• Example: “If a customer buys bread, they also buy butter.”

9. Outlier Detection
• Identifies data points that are very different from the rest.
• Useful in fraud detection.

10. Evaluation Metrics
• Classification: Accuracy, Precision, Recall, F1-score.
• Clustering: Silhouette Score, SSE (Sum of Squared Errors).

11. Applications of Data Mining
• Market Basket Analysis
• Healthcare diagnosis
• Fraud detection (banking)
• Recommendation systems (Netflix, Amazon)
• Customer segmentation in marketing
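
As a worked illustration of the classification metrics named in section 10, the sketch below computes accuracy, precision, recall, and F1-score by hand for the spam example from section 5. The ten labels are made-up data, and the function name `classification_metrics` is an assumption introduced here for illustration; it uses the standard definitions based on true/false positives and negatives.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    tn = len(pairs) - tp - fp - fn                               # correct non-spam

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for ten emails (illustrative data, not from the notes).
actual = ["spam"] * 5 + ["not spam"] * 5
predicted = ["spam", "spam", "spam", "not spam", "not spam",
             "spam", "not spam", "not spam", "not spam", "not spam"]

metrics = classification_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these labels, 3 spam emails are caught, 2 are missed, and 1 normal email is wrongly flagged, giving accuracy 0.7, precision 0.75, recall 0.6, and an F1-score of about 0.667. Precision answers "of the emails flagged as spam, how many really were?", while recall answers "of the real spam, how much was caught?".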
