UNIT - II
Data Mining
Data mining is the process of finding useful patterns, trends, or knowledge from large sets of
data. Think of it like searching for hidden treasure in a big pile of information. Businesses,
researchers, and organizations collect huge amounts of data every day — from sales records,
website visits, customer feedback, social media, and more. But just having data isn’t enough.
Data mining helps turn that data into something meaningful. For example, a supermarket might
use data mining to find that people who buy bread often buy butter too. This kind of information
can help in making better decisions, like how to arrange products on shelves or what products to
advertise together.
Data mining involves many steps. First, the data is collected from different sources. Then, it’s
cleaned to remove errors or missing values. After that, tools and algorithms are used to find
patterns or trends. Some of the common tasks in data mining include classification (putting
things into categories), clustering (grouping similar items), association (finding relationships
between items), and prediction (guessing future outcomes based on past data).
Data Mining Task Primitives
Data mining task primitives are the basic steps or building blocks that help define what exactly a
user wants to do with the data. Think of them as settings or instructions you give to the data
mining system so that it knows how to carry out the task correctly. These task primitives help
guide the system by telling it what kind of patterns to look for, where to look, and under what
conditions.
Here are the main types of task primitives:
1. The kind of data to be mined: This tells the system what type of data you are working
with. It could be a database, data warehouse, spreadsheets, or even text files. You also
define what specific attributes (columns or fields) you want to analyze.
2. The kind of patterns to be discovered: Data mining can be used for many purposes.
You need to specify what you’re looking for. For example:
○ Classification: Sorting data into predefined groups or classes (like spam or not
spam).
○ Clustering: Grouping similar items together (like customer segments).
○ Association: Finding relationships between items (like "people who buy chips
often buy soda").
○ Prediction: Guessing future values based on current data (like predicting next
month's sales).
3. Background knowledge to be used: This includes any existing information or rules that
the system can use to improve its results. For example, a store might already know some
sales rules that help guide the mining process.
4. Interestingness measures: Not all patterns found are useful. You can set rules to tell the
system which results are interesting or valuable to you. These rules help filter out the less
useful patterns and focus only on what matters.
5. Use of constraints: Sometimes, you want to limit the mining to a certain part of the data
or focus only on specific results. For example, you might only want results related to a
particular product or customer group.
Data, Information, and Knowledge
Data is raw, unprocessed facts and figures. It has no meaning on its own and is just a collection
of values or numbers. For example, a list of numbers like 30, 45, 50, 42 doesn’t tell us anything
unless we know what they represent. In a retail store, data could include things like sales
amounts, product names, dates, customer IDs, or locations. Data is the starting point in data
mining, and it must be cleaned, organized, and analyzed to become useful.
Information is data that has been processed and given context so that it has meaning. For
example, if we say that 30, 45, 50, 42 are the number of items sold in the past four days,
now we have useful information. It tells us something about how the business is performing.
Information answers basic questions like who, what, when, and where. In data mining,
information helps identify what is happening in the data, such as which product sells the most.
Knowledge is the deeper understanding or insight gained from analyzing information. It answers
questions like why or how, and helps in making decisions or predictions. For example, if data
mining reveals that "sales increase on weekends for snacks and drinks," that's knowledge. It
combines information with patterns and trends to give meaning that can help in decision-making.
Knowledge allows businesses to plan better, such as increasing stock on weekends or offering
discounts on certain items.
Attribute Types in Data Mining
In data mining, attributes (also called features or variables) are the properties or characteristics of
the data that you're analyzing. Depending on the nature of the values they hold, attributes are
categorized into different types. The type of attribute determines what kind of operations and
analyses can be performed on it.
1. Nominal Attributes
Nominal attributes are categorical attributes that represent names or labels without any order or
ranking between them. They are used simply to identify or classify data into categories.
● Example: Gender (Male, Female), Color (Red, Blue, Green), City (Delhi, Mumbai,
Chennai)
● Key Point: You cannot say one value is greater than or less than another.
● Operations allowed: Equality (=), inequality (≠); no
mathematical operations.
In data mining, nominal data is often converted into numerical codes using
techniques like one-hot encoding so that algorithms can process it.
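As an illustration (not part of the original notes), here is a minimal one-hot encoding sketch using pandas; the column name "Color" and its values are invented examples:

# One-hot encoding of a nominal attribute (illustrative sketch using pandas).
import pandas as pd

df = pd.DataFrame({"Color": ["Red", "Blue", "Green", "Red"]})
encoded = pd.get_dummies(df, columns=["Color"])   # creates Color_Blue, Color_Green, Color_Red
print(encoded)

Each category becomes its own 0/1 column, so algorithms that expect numbers can still use the nominal attribute without implying any order between the categories.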
2. Binary Attributes
Binary attributes are a special type of nominal attribute that can have only two possible values,
typically representing yes/no or true/false situations.
● Example:
○ Is student? (Yes or No)
○ Light switch (On or Off)
○ Has account? (1 or 0)
Binary attributes are of two types:
● Symmetric binary: Both outcomes are equally important (e.g., gender).
● Asymmetric binary: One outcome is more important (e.g., disease = present/absent).
Binary attributes are common in classification tasks like spam detection, fraud
detection, or medical diagnosis.
3. Ordinal Attributes
Ordinal attributes represent categories that have a meaningful order or ranking, but the
differences between the values are not measurable.
● Example:
○ Education level (High school < Bachelor < Master < PhD)
○ Customer satisfaction (Poor < Average < Good < Excellent)
○ Movie rating (1 star to 5 stars)
● Key Point: You can compare values using greater than or less than, but you can’t do
arithmetic on them (e.g., "Excellent" is not 2× "Good").
Ordinal attributes are useful when data has a natural ranking, but we don’t know the
exact difference between the ranks.
4. Numeric Attributes
Numeric attributes are quantitative and represent numbers that can be measured and calculated.
They allow for mathematical operations like addition, subtraction, average, etc. Numeric
attributes are further divided into:
a. Interval Attributes
● These attributes have values with equal intervals, but no true zero point.
● Example: Temperature in Celsius or Fahrenheit.
● Key Point: You can say the difference between 30°C and 40°C is 10°C, but you can’t say
40°C is twice as hot as 20°C.
b. Ratio Attributes
● These have equal intervals and a true zero point, which means you can perform all
mathematical operations including ratios.
● Example: Height, weight, age, salary, distance.
● Key Point: You can say "20 kg is twice as heavy as 10 kg" because 0 kg means no weight
at all.
Introduction to Data Preprocessing
Data preprocessing is one of the most important steps in the data mining process. It involves
preparing and cleaning the raw data so that it can be used effectively for analysis and mining.
Real-world data is rarely perfect — it often contains errors, missing values, duplicates,
inconsistencies, or irrelevant information. If this data is used directly, it can lead to poor results.
Data preprocessing helps solve these problems by converting raw data into a clean, consistent,
and usable format.
Data preprocessing is considered a data preparation phase before the actual data mining begins.
Data Cleaning
Data cleaning is the process of fixing or removing incorrect, incomplete, duplicate, or irrelevant
data from a dataset. It is one of the most important steps in data preprocessing because real-
world data is often messy. Without cleaning, inaccurate data can lead to wrong conclusions, poor
decision-making, or unreliable results in data mining or machine learning.
Common Problems
1. Missing Values
● Sometimes data is not recorded, which leads to blanks or null values.
● Example: A customer forgot to enter their phone number in a form.
● Solution: Fill missing values using:
○ Mean, median, or mode of the column
○ Predicting the value using other data
○ Removing the record (if it’s not important)
2. Noisy Data (Random or Incorrect Values)
● Noise refers to errors or random values in data that don’t make sense.
● Example: A person's age entered as 500 or a salary as -10000
● Solution: Use techniques like:
○ Binning (grouping similar values)
○ Smoothing (average out values)
○ Manual correction or using software
3. Duplicate Records
● When the same data appears more than once.
● Example: A customer appears twice in a database with slightly different names ("Jon"
and "John").
● Solution: Use deduplication tools or algorithms to merge or delete duplicates.
4. Inconsistent Data
● This happens when data formats vary across sources.
● Example: Dates written as "03/08/2025" and "August 3, 2025"
● Solution: Standardize data formats across the dataset.
5. Irrelevant Data
● Sometimes, extra data is collected that has no value for the analysis.
● Example: Including the "profile picture size" while analyzing customer purchase
behavior.
● Solution: Remove irrelevant columns or fields.
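The fixes listed above can be sketched in a few lines of pandas (an assumption; the tiny dataset and column names below are invented for illustration):

# Illustrative data-cleaning sketch: duplicates, noisy values, and missing values.
import pandas as pd

df = pd.DataFrame({
    "CustomerID": [1, 2, 2, 3],
    "Age": [23, 500, 500, None],      # 500 is noise, None is a missing value
    "City": ["Delhi", "Mumbai", "Mumbai", "Chennai"],
})

df = df.drop_duplicates()                         # remove exact duplicate records
df.loc[df["Age"] > 120, "Age"] = None             # treat impossible ages as missing
df["Age"] = df["Age"].fillna(df["Age"].median())  # fill missing values with the median
print(df)

Near-duplicates such as "Jon" and "John" need fuzzy matching rather than drop_duplicates; the sketch only covers the simple cases.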
Data Integration
Data integration is the process of combining data from multiple sources into a single, unified
view. It helps in building a complete dataset that can be used for effective data mining and
analysis. When data comes from different departments, systems, or files, it may be inconsistent,
redundant, or conflicting. Integration helps resolve these issues and creates a coherent data
structure.
One key step during data integration is correlation analysis. This is used to identify
relationships between different attributes (columns or variables) in the combined dataset. If
two attributes are strongly related, they may carry similar information. This can help reduce
redundancy and improve the quality of the integrated data.
What is Correlation Analysis?
Correlation analysis is a statistical method used to measure the strength and direction of a
relationship between two attributes. The result is called the correlation coefficient and
usually ranges from -1 to +1:
● +1 → Perfect positive correlation (as one increases, the other increases)
● -1 → Perfect negative correlation (as one increases, the other decreases)
● 0 → No correlation (no relationship between the attributes)
In data integration, correlation analysis helps in deciding whether two attributes
from different sources are closely related (and possibly duplicate) or completely
independent.
Why is Correlation Analysis Useful in Data Integration?
● To detect redundancy: If two attributes are highly correlated, one may be removed to
reduce complexity.
● To align similar attributes: Sometimes, two sources use different names for similar data
(e.g., "income" and "salary"). Correlation can help match them.
● To improve data consistency: Helps in resolving conflicts by understanding how
attributes interact.
What is a Correlation Coefficient?
The correlation coefficient is a statistical measure that describes the strength and direction
of a relationship between two variables. The most commonly used type is the Pearson
correlation coefficient, also called Pearson’s r.
● If r = +1 → perfect positive correlation
● If r = -1 → perfect negative correlation
● If r = 0 → no correlation
Pearson Correlation Coefficient Formula
r = Σ(x − x̄)(y − ȳ) / √( Σ(x − x̄)² × Σ(y − ȳ)² )
Let’s calculate the correlation coefficient for the following small dataset:
X (Hours Studied) Y (Marks Scored)
2 50
4 60
6 65
8 80
10 85
Step 1: Find the mean
x̄ = (2 + 4 + 6 + 8 + 10) / 5 = 6
ȳ = (50 + 60 + 65 + 80 + 85) / 5 = 68
Step 2: Apply the formula
First, create a table for calculations:
x y x−x̄ y−ȳ (x−x̄)(y−ȳ) (x−x̄)² (y−ȳ)²
2 50 -4 -18 72 16 324
4 60 -2 -8 16 4 64
6 65 0 -3 0 0 9
8 80 2 12 24 4 144
10 85 4 17 68 16 289
Now sum the columns:
Σ(x − x̄)(y − ȳ) = 72 + 16 + 0 + 24 + 68 = 180
Σ(x − x̄)² = 16 + 4 + 0 + 4 + 16 = 40
Σ(y − ȳ)² = 324 + 64 + 9 + 144 + 289 = 830
Step 3: Plug into the formula
r = 180 / √(40 × 830) = 180 / √33200 ≈ 180 / 182.21 ≈ 0.988
Result: r ≈ 0.988
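The same calculation can be reproduced as a small numpy sketch (an assumption, not part of the original notes), using the hours-studied / marks-scored data above:

# Recomputing Pearson's r for the example dataset.
import numpy as np

x = np.array([2, 4, 6, 8, 10])
y = np.array([50, 60, 65, 80, 85])

num = np.sum((x - x.mean()) * (y - y.mean()))
den = np.sqrt(np.sum((x - x.mean())**2) * np.sum((y - y.mean())**2))
r = num / den
print(round(r, 3))   # ≈ 0.988, matching the hand calculation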
Data Transformation
Data transformation is a preprocessing step that converts data into a suitable format for mining.
It’s especially important when attributes have different scales, which can negatively affect
algorithms like K-Means, k-NN, or neural networks.
Definition:
Min-max normalization rescales the data to a fixed range, typically [0, 1]. It preserves the
relationships among the original data values.
x′ = (x − min(x)) / (max(x) − min(x))
Where:
● x = original value
● min(x) = minimum value in the column
● max(x) = maximum value in the column
● x′ = normalized value
📊 Example:
Original data: [50, 60, 70, 80, 90]
● Min = 50, Max = 90
Normalize value 70:
x′ = (70 − 50) / (90 − 50) = 20 / 40 = 0.5
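A minimal Python sketch of the same rescaling (illustrative only, not tied to any particular library):

# Min-max normalization of the example column to the range [0, 1].
def min_max_normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([50, 60, 70, 80, 90]))   # [0.0, 0.25, 0.5, 0.75, 1.0]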
Z-score normalization, also known as standardization, is a crucial data preprocessing technique
in machine learning and statistics. It transforms the data so that it follows a standard normal
distribution, ensuring that all features are on the same scale. This prevents features with large
value ranges from dominating others, which can significantly impact the performance of machine
learning models.
v′ = (v − Ā) / σA
Here v′ and v are the new and old values of each entry, and σA and Ā are the standard deviation
and mean of attribute A, respectively.
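A short numpy sketch of z-score normalization (an assumption; it uses the population standard deviation, matching the formula above):

# Standardizing a numeric column: v' = (v - mean) / standard deviation.
import numpy as np

a = np.array([50, 60, 70, 80, 90])
z = (a - a.mean()) / a.std()   # np.std uses the population standard deviation by default
print(z.round(3))              # [-1.414 -0.707  0.     0.707  1.414]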
Decimal Scaling Method
This method normalizes by moving the decimal point of the data values. Each value is divided by a
power of ten determined from the maximum absolute value in the data. A data value vi is
normalized to vi′ using the formula below:
vi′ = vi / 10^j
where j is the smallest integer such that max(|vi′|) < 1.
Example:
Let the input data be: -10, 201, 301, -401, 501, 601, 701. To normalize the above data:
Step 1: The maximum absolute value in the given data is 701.
Step 2: Divide each value by 1000 (i.e., j = 3).
Result: The normalized data is: -0.01, 0.201, 0.301, -0.401, 0.501, 0.601, 0.701
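A small sketch of decimal scaling (illustrative only; the digit-count shortcut below assumes integer-valued data as in the example):

# Decimal-scaling normalization of the example data above.
def decimal_scaling(values):
    j = len(str(int(max(abs(v) for v in values))))   # digits in the max absolute value
    return [v / (10 ** j) for v in values]

print(decimal_scaling([-10, 201, 301, -401, 501, 601, 701]))
# [-0.01, 0.201, 0.301, -0.401, 0.501, 0.601, 0.701]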
Data Reduction:
● Data Reduction refers to techniques that reduce the volume of data while maintaining its
integrity and analytical value.
● It helps in improving efficiency of data processing and mining by summarizing or
compressing the data.
● Common methods include sampling, dimensionality reduction, discretization, and
aggregation.
Data Cube Aggregation:
Data Cube Aggregation is a multidimensional data reduction technique used primarily in
Online Analytical Processing (OLAP).
A data cube organizes data in a multi-dimensional structure, with each dimension representing a
different attribute (e.g., time, location, product).
Aggregation means summarizing data along one or more dimensions by applying aggregation
functions like SUM, COUNT, AVG, MIN, or MAX.
The result is a reduced dataset with fewer rows but more meaningful insights at higher
abstraction levels.
Steps to perform Data Cube Aggregation
Step 1. Identify Dimensions and Measures
● Dimensions: Day, Product, Region
● Measure: Sales (the numeric value)
Step 2. Define the Aggregation Function
● Choose which Aggregation function to apply on Measures.
Step 3. Decide the level of aggregation (granularity)
● The raw data is at the finest granularity.
● Aggregation means summarizing over one or more dimensions.
Possible aggregation levels:
● (Day, Product) → aggregated over Region
● (Day, Region) → aggregated over Product
● (Product, Region) → aggregated over Day
Step 4. Perform the group-by aggregations
Step 5. Store the aggregated result
● Create a separate data cube structure that stores the aggregated data for each grouping.
Step 6. Use the aggregated data to answer queries (see the sketch below).
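As an illustration of steps 4–5 (not part of the original notes), a minimal group-by aggregation sketch using pandas; the sales records are invented:

# Data-cube-style aggregation: summarize Sales over different dimension subsets.
import pandas as pd

sales = pd.DataFrame({
    "Day":     ["Mon", "Mon", "Tue", "Tue"],
    "Product": ["Bread", "Butter", "Bread", "Butter"],
    "Region":  ["North", "South", "North", "South"],
    "Sales":   [100, 80, 120, 90],
})

# (Day, Product) level: aggregated over Region
by_day_product = sales.groupby(["Day", "Product"], as_index=False)["Sales"].sum()

# (Product, Region) level: aggregated over Day
by_product_region = sales.groupby(["Product", "Region"], as_index=False)["Sales"].sum()

print(by_day_product)
print(by_product_region)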
Attribute Subset Selection:
Attribute Subset Selection is a data preprocessing technique used to reduce the number of input
variables (attributes/features) in a dataset by selecting only the most relevant ones.
This technique is used:
● To eliminate irrelevant, redundant, or noisy features
● To reduce computation time and complexity
● To improve model accuracy by removing features that may confuse the learning algorithm
● To avoid overfitting by simplifying the model
Steps in Attribute Subset Selection:
1. Start with the Full Set of Attributes
The original dataset may have many features (e.g., age, income, job type, city, etc.)
2. Evaluate Each Attribute's Relevance
Use statistical, information-theoretic, or machine learning-based methods to assess
how much each attribute contributes to the prediction.
3. Common metrics:
○ Information Gain
○ Chi-Square test
○ Correlation Coefficient
○ Mutual Information
○ Gini Index
4. Select the Best Subset of Attributes
Choose attributes that give the highest predictive power and remove others.
5. Selection techniques:
○ Filter Methods: Use ranking based on statistical scores
○ Wrapper Methods: Use a predictive model to test different subsets
○ Embedded Methods: Perform selection during model training (e.g., decision trees, LASSO)
6. Validate the Resulting Subset
Use cross-validation to check that the reduced feature set performs well on unseen data.
Suppose you’re building a model to predict whether a customer will buy a product.
Original Attributes:
● Age
● Gender
● Email Address
● Phone Number
● Annual Income
● Product View Count
● Number of Clicks
● Purchase History
After Attribute Subset Selection, you may find:
● Email and Phone Number are irrelevant to purchase prediction
● Age, Income, Click Count, and Purchase History are highly predictive
Final selected subset = {Age, Annual Income, Product View Count, Purchase History}
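A filter-method sketch of steps 2–4 using scikit-learn (an assumption; the feature matrix and labels below are random placeholders, not the customer data above):

# Keep the k attributes with the highest mutual information with the class label.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((100, 5))                 # 5 candidate attributes
y = (X[:, 0] + X[:, 3] > 1).astype(int)  # label actually depends on columns 0 and 3

selector = SelectKBest(score_func=mutual_info_classif, k=2).fit(X, y)
print(selector.get_support())            # boolean mask of the selected attributes

Wrapper and embedded methods follow the same pattern but score candidate subsets with an actual predictive model instead of a statistic.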
Sampling:
Sampling is the process of selecting a subset of data from a large dataset to analyze and draw
conclusions about the entire data.
In data mining, sampling is especially useful when:
● The dataset is too large to process efficiently
● You need quick insights or want to build prototypes
● You want to improve performance without sacrificing much accuracy
Why Sampling is Used in Data Mining?
● Reduces computational cost
● Speeds up data analysis and model training
● Helps in testing and validating models
● Enables visualization and exploration of massive datasets
Types of Sampling Techniques:
1. Random Sampling
Every record has an equal chance of being selected.
Best for unbiased, general-purpose sampling.
Example: Randomly picking 1,000 records from a million.
2. Stratified Sampling
The dataset is divided into strata (groups) (e.g., by class label), and samples are drawn
from each stratum proportionally.
Useful when some classes are rare.
Example: Sampling 20% from each income bracket.
3. Systematic Sampling
Selects every k-th record from a sorted list.
Example: From 10,000 records, pick every 10th record → total 1,000 records.
4. Cluster Sampling
Divide data into clusters (groups), and then randomly select entire clusters.
Example: Choose a few cities and include all customers from those cities.
5. Reservoir Sampling
Useful for data streams or when data size is unknown in advance.
Maintains a sample of fixed size from a stream of unknown length.
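Reservoir sampling is simple enough to sketch directly (illustrative Python, not part of the original notes):

# Reservoir sampling: keep a fixed-size random sample from a stream
# whose total length is not known in advance.
import random

def reservoir_sample(stream, k):
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)        # fill the reservoir with the first k items
        else:
            j = random.randint(0, i)      # each later item is kept with probability k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

print(reservoir_sample(range(1, 10001), 5))   # 5 records from a 10,000-item "stream"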
Example:
Scenario: Online Retail Store
You work as a data analyst for an online retail store with a dataset containing 1 million customer
transactions. Each record includes:
CustomerID   Age   Gender   Country   PurchaseAmount   ProductCategory   PurchaseDate
You want to build a machine learning model to predict whether a customer will make a purchase
above ₹10,000. But training on all 1 million rows is computationally expensive.
Step-by-Step Sampling Example
Objective:
Use sampling to create a smaller, representative dataset (e.g., 10,000 records) for fast model
development and testing.
Step 1: Understand the Target Variable
Let’s define:
● HighValuePurchase = 1 if PurchaseAmount > 10,000
● HighValuePurchase = 0 otherwise
Assume class distribution in full dataset:
● 5% are high-value purchases
● 95% are not
If we randomly sample, we might miss the minority class (only 500 out of 10,000 will be high-
value).
Step 2: Choose Sampling Technique
We will use Stratified Sampling to maintain class balance in the sample.
Step 3: Implement Sampling
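One way to implement this step is with pandas (an assumption; the synthetic stand-in below uses 100,000 rows with the stated 5% / 95% class split instead of the full 1 million transactions):

# Stratified sampling: draw the same fraction from each class, preserving the ratio.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
transactions = pd.DataFrame({
    "HighValuePurchase": rng.choice([0, 1], size=100_000, p=[0.95, 0.05]),
})

frac = 10_000 / len(transactions)
sample = transactions.groupby("HighValuePurchase").sample(frac=frac, random_state=42)

print(len(sample))                                                # about 10,000 rows
print(sample["HighValuePurchase"].value_counts(normalize=True))   # class ratio preserved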
Step 4: Use Sampled Data for Further Processing
Data Discretization:
Data Discretization is the process of transforming continuous attributes (numeric data) into a
finite set of intervals or categorical labels.
It is a crucial data preprocessing step in data mining and machine learning, particularly for
algorithms that require categorical inputs.
It reduces computational complexity, improves model interpretability, handles noisy data, and
enhances the performance of classifiers.
Example:
You have a dataset with customer ages, and you want to group ages into categories (Young,
Middle-aged, Senior) for use in a classification model.
Original Data (Continuous):
CustomerID Age
1 23
2 37
3 45
4 59
5 63
6 72
We discretize Age into 3 categories:
● Young: 0–35
● Middle-aged: 36–60
● Senior: 61 and above
Step-by-Step Discretization:
CustomerID Age Age Category
1 23 Young
2 37 Middle-aged
3 45 Middle-aged
4 59 Middle-aged
5 63 Senior
6 72 Senior
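The same step-by-step mapping can be done with pandas.cut (an assumption, not part of the original notes), using the boundaries defined above:

# Discretizing the Age column into the three labelled intervals.
import pandas as pd

ages = pd.DataFrame({"CustomerID": [1, 2, 3, 4, 5, 6],
                     "Age": [23, 37, 45, 59, 63, 72]})
ages["AgeCategory"] = pd.cut(ages["Age"],
                             bins=[0, 35, 60, 120],
                             labels=["Young", "Middle-aged", "Senior"])
print(ages)   # matches the table above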
What is Binning in Data Mining?
Binning is a data discretization technique used to convert continuous numerical data into discrete
bins or intervals. It’s commonly used in data preprocessing to smooth noisy data, reduce the
effect of outliers, and prepare the data for categorical models like decision trees or Naive Bayes.
Purpose of Binning
● Reduce complexity of the data
● Handle noise and outliers
● Convert numerical features into categorical features
● Enhance model performance for certain algorithms
Example of Binning
Suppose we have a column of Age values:
[23, 27, 34, 38, 42, 45, 49, 53, 58, 60, 66, 72]
Let’s say we want to bin this into 3 age groups:
● Bin 1: 20–40 → Young
● Bin 2: 41–60 → Middle-aged
● Bin 3: 61–80 → Senior
The binned result:
Age Age Group
23 Young
27 Young
34 Young
38 Young
42 Middle-aged
45 Middle-aged
49 Middle-aged
53 Middle-aged
58 Middle-aged
60 Middle-aged
66 Senior
72 Senior
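A short pandas sketch of the binning above, plus an equal-frequency alternative (illustrative only; pd.qcut is an addition not mentioned in the original text):

# Binning the Age column into the three labelled groups, and into equal-frequency bins.
import pandas as pd

ages = pd.Series([23, 27, 34, 38, 42, 45, 49, 53, 58, 60, 66, 72])

labelled = pd.cut(ages, bins=[20, 40, 60, 80],
                  labels=["Young", "Middle-aged", "Senior"])   # fixed-boundary bins
equal_freq = pd.qcut(ages, q=3)                                # equal-frequency bins
print(labelled.value_counts())
print(equal_freq.value_counts())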
Histogram Analysis
A histogram is a type of bar chart that represents the frequency distribution of a continuous
variable. The data is divided into bins (intervals), and the height of each bar shows how many data
points fall within that bin.
Histogram Analysis in Data Mining:
Histogram analysis is a visual and statistical method used in data preprocessing and exploratory
data analysis (EDA) to understand the distribution of continuous numerical attributes. It helps
identify patterns such as skewness, modality, outliers, and spread of data.
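A minimal matplotlib sketch (an assumption, not part of the original notes) that draws a histogram of the Age values from the binning example:

# Frequency distribution of Age using 3 equal-width bins.
import matplotlib.pyplot as plt

ages = [23, 27, 34, 38, 42, 45, 49, 53, 58, 60, 66, 72]
plt.hist(ages, bins=3, edgecolor="black")
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.title("Age distribution")
plt.show()

The bar heights show how many customers fall into each age interval, which makes skewness, spread, and possible outliers easy to spot at a glance.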