Unit 4: Data Mining & Knowledge Discovery
Definitions & Motivations

Why Data Mining?
Explosive growth of data: the Web, etc., from terabytes to petabytes
Data collection and data availability: crawlers, database systems

Sources
Business: Web, e-commerce, transactions, etc.
Science: remote sensing, bioinformatics, etc.
Society and everyone: news, YouTube, etc.

Problem: We are drowning in data, but starving for knowledge!

Solution: Use data mining tools for automated analysis of massive data sets.
Why Data Mining

➝Credit ratings/targeted marketing:


⇾Given a database of 100,000 names, which persons are the least
likely to default on their credit cards?
⇾Identify likely responders to sales promotions

➝Fraud detection
⇾Which types of transactions are likely to be fraudulent, given the
demographics and transactional history of a particular customer?

➝Customer relationship management:
⇾Which of my customers are likely to be the most loyal, and which are most likely to leave for a competitor?

Data Mining helps extract such information


What is Data Mining?
Data mining (knowledge discovery from data)
Extraction of interesting (non-trivial, implicit, previously unknown
and potentially useful) patterns or knowledge from huge amount of
data
Data mining: a misnomer?
Mining gold from rock is called gold mining, not "stone mining"; by analogy, extracting knowledge from data should arguably be called knowledge mining rather than "data mining".
Alternative names
Knowledge discovery (mining) in databases (KDD), knowledge
extraction, data/pattern analysis, data archeology, data dredging,
information harvesting, business intelligence, etc.
Applications
➝Banking: loan/credit card approval
⇾predict good customers based on old customers

➝Customer relationship management:


⇾identify those who are likely to leave for a competitor.

➝Targeted marketing:
⇾identify likely responders to promotions

➝Fraud detection: telecommunications, financial


transactions
⇾from an online stream of event identify fraudulent events

➝Manufacturing and production:


⇾automatically adjust knobs when process parameter changes
Applications (continued)

➝Medicine: disease outcome, effectiveness of treatments


⇾analyze patient disease history: find relationship between diseases

➝Molecular/Pharmaceutical: identify new drugs

➝Scientific data analysis:


⇾identify new galaxies by searching for sub clusters

➝Web site/store design and promotion:


⇾find affinity of visitor to pages and modify layout
Examples: What is (not) Data Mining?

What is not Data Mining?
– Look up a phone number in a phone directory
– Query a Web search engine for information about "Amazon"

What is Data Mining?
– Certain names are more prevalent in certain US locations (O'Brien, O'Rourke, O'Reilly… in the Boston area)
– Group together similar documents returned by a search engine according to their context (e.g., the Amazon rainforest vs. Amazon.com)
Query Examples

➝Database
– Find all credit applicants with last name of Smith.

– Identify customers who have purchased more than $10,000 in the last month.

– Find all customers who have purchased milk

➝Data Mining
– Find all credit applicants who are poor credit risks. (classification)

– Identify customers with similar buying habits. (clustering)

– Find all items which are frequently purchased with milk. (association rules)
Knowledge Discovery from DATA

Why do we need KDD?

Knowledge Discovery in Databases

Data Mining as a step in the knowledge discovery process
Data mining plays an essential role in the knowledge discovery process.
Knowledge Discovery Process

The steps are: identifying the goals; data selection, acquisition & integration; data cleaning; data reduction and projection; matching the goals; exploratory data analysis; data mining; interpretation and testing; consolidation & use.

STEP 1: IDENTIFYING THE GOAL
• The first step is developing an understanding of the application domain and the relevant prior knowledge, and identifying the goal of the KDD process from the customer's viewpoint.
STEP 2: CREATING A TARGET DATA SET
• Selecting a data set, or focusing on a subset of variables or data samples, on which discovery is to be performed.
STEP 3: DATA CLEANING AND PREPROCESSING
• Basic operations include removing noise if appropriate, collecting the necessary information to model or account for noise, deciding on strategies for handling missing data fields, and accounting for time-sequence information and known changes.
STEP 4: DATA REDUCTION AND PROJECTION
• Finding useful features to represent the data depending on the goal of the task.
• With dimensionality reduction or transformation methods, the effective number of variables under consideration can be reduced, or invariant representations for the data can be found.
STEP 5: MATCHING THE GOALS
• Matching the goals of the KDD process to a particular data-mining method such as summarization, classification, regression, clustering, association, etc.
STEP 6: EXPLORATORY ANALYSIS AND MODEL & HYPOTHESIS SELECTION
• Choosing the data mining algorithm(s) and selecting the method(s) to be used for searching for data patterns.
• This process includes deciding which models and parameters might be appropriate and matching a particular data-mining method with the overall criteria of the KDD process.
Data Mining Techniques

Descriptive:
– Clustering
– Association
– Sequential Analysis

Predictive:
– Classification
– Decision Tree
– Rule Induction
– Neural Networks
– Nearest Neighbor Classification
– Regression
STEP 7: DATA MINING
▪ Searching for patterns of interest in a particular representational form or a set of such representations, including classification rules or trees, regression, and clustering.
▪ The user can significantly aid the data-mining method by correctly performing the preceding steps.
STEP 8: INTERPRETATION & TESTING
• Interpreting mined patterns, possibly returning to any of steps 1 through 7 for further iteration.
• This step can also involve visualization of the extracted patterns and models, or visualization of the data given the extracted models.
STEP 9: KNOWLEDGE PRESENTATION (CONSOLIDATION & USE)
• Using the knowledge directly, incorporating the knowledge into another system for further action, or simply documenting it and reporting it to interested parties.
• This process also includes checking for and resolving potential conflicts with previously believed (or extracted) knowledge.
OLAP - On-line Analytical Processing
◦ Provides you with a very good view of what is happening, but cannot predict what will happen in the future or why it is happening.
 Data mining algorithms need large amounts of data,
more so at the detailed level. Most data warehouses
contain data at the lowest level of granularity.

 Data mining flourishes on integrated and cleansed data. If the ETL functions were carried out properly, the data warehouse will contain such data, which is very suitable for data mining.

 The infrastructure for data warehouses is already


robust, with parallel processing technology and
powerful relational database systems. Because such
scalable hardware is already in place, no new
investment is needed to support data mining.

 A Predictive model makes a prediction about
values of data using known results found
from different data
 Predictive modeling may be based on the use
of other historical data

 A Descriptive model identifies patterns or


relationships in data. It serves as a way to
explore the properties of the data examined,
not to predict new properties.

 Classification maps data into predefined
groups or classes
◦ Supervised learning
◦ Pattern recognition

 Clustering groups similar data together


into clusters.
◦ Unsupervised learning
◦ Segmentation or Partitioning of data into sets

 Link Analysis uncovers relationships among
data.
◦ Affinity Analysis
◦ Association Rules
◦ Sequential Analysis determines sequential patterns.

 Also called link analysis or affinity analysis.
 It is one of the data mining techniques used to
uncover the relationships among data.
 An association rule is a model that identifies
specific types of data associations.
 Association rule can be used to assist retail
store management in effective advertising,
marketing & inventory control.

Supervised learning
• Classification and regression

Unsupervised learning
• Clustering
• Associations

 Maps the data into predefined groups or classes.
 Also called supervised learning, as classes are
determined before examining the data.
 Classification algorithms require that classes be defined
based on data attribute values.
 Classes are described by looking at characteristics of
data that is already known.
 Pattern recognition is a type of classification where an
input pattern is classified into one of several classes
based on its similarity to predefined classes.

 An emergency room in a hospital measures 17
variables (e.g., blood pressure, age, etc) of
newly admitted patients.
 A decision is needed: whether to put a new
patient in an intensive-care unit (ICU) .
 Due to the high cost of ICU, those patients
who may survive less than a month are given
higher priority.
 Problem: to predict high-risk patients and
discriminate them from low-risk patients.

 A credit card company receives thousands of
applications for new cards. Each application
contains information about an applicant,
◦ age
◦ Marital status
◦ annual salary
◦ outstanding debts
◦ credit rating
◦ etc.
 Problem: to decide whether an application
should be approved, or to classify applications
into two categories, approved and not
approved.

 Like human learning from past experiences.
 A computer does not have “experiences”.
 A computer system learns from data, which
represent some “past experiences” of an
application domain.
 Our focus: learn a target function that can be
used to predict the values of a discrete class
attribute, e.g., approve or not-approved, and
high-risk or low risk.
 The task is commonly called: Supervised
learning, classification, or inductive learning.
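
To make the idea concrete, here is a minimal sketch of supervised learning for the credit-approval example, assuming scikit-learn is available; the feature names, training records, and outcomes are invented for illustration.

```python
# Minimal supervised-learning sketch: hypothetical applications, assumes scikit-learn.
from sklearn.tree import DecisionTreeClassifier

# Past "experiences": [age, annual_salary, outstanding_debt] with known outcomes.
X_train = [[25,  30000, 15000],
           [40,  90000,  5000],
           [35,  60000, 20000],
           [50, 120000,  1000],
           [23,  20000, 18000]]
y_train = ["not approved", "approved", "not approved", "approved", "not approved"]

# Learn a target function mapping attribute values to the discrete class attribute.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Predict the class of a new, previously unseen application.
print(model.predict([[30, 70000, 2000]]))  # prints the predicted class label
```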

 Similar to classification except that the groups
are not predefined.
 Also referred to as unsupervised learning or
segmentation.
 Clustering is accomplished by determining the
similarity among the data on predefined
attributes.
 Most similar data are grouped into clusters
 As clusters are not predefined, a domain
expert is required to interpret the meaning of
created clusters.

 Supervised learning: classification is seen as
supervised learning from examples.
◦ Supervision: The data (observations, measurements,
etc.) are labeled with pre-defined classes. It is like
that a “teacher” gives the classes (supervision).
◦ Test data are classified into these classes too.
 Unsupervised learning (clustering)
◦ Class labels of the data are unknown
◦ Given a set of data, the task is to establish the
existence of classes or clusters in the data

DATA PREPROCESSING
Confluence of Multiple Disciplines

Data mining draws on multiple disciplines: database technology, statistics, machine learning, visualization, pattern recognition, algorithms, and other disciplines.
Multi-Dimensional Measure of Data Quality

▪ A well-accepted multidimensional view:


– Accuracy

– Completeness

– Consistency

– Timeliness

– Believability

– Interpretability
Data Preprocessing Steps

▪ Data Cleaning
– Fill in missing values, smooth noisy data, identify or remove
outliers, and resolve inconsistencies
▪ Data Integration
– Combines data from multiple sources (multiple databases,
data cubes, or flat files) and provides a unified view of the data.
▪ Data Transformation & discretization
– Transform data in appropriate forms suitable for mining
process. Ex. Normalization, aggregation, discretization
▪ Data Reduction
– Obtains reduced representation in volume but produces the
same or similar analytical results. Ex. Data Cube Aggregation,
Attribute Subset Selection
Forms of Data Preprocessing
Data Preprocessing
Data Cleaning

▪ Importance
– “Data cleaning is one of the three biggest problems in data
warehousing”—Ralph Kimball
– “Data cleaning is the number one problem in data
warehousing”—DCI survey

▪ Data cleaning tasks


– Fill in missing values
– Identify outliers and smooth out noisy data
– Correct inconsistent data
– Resolve redundancy caused by data integration
Data Cleaning

▪ Data in the real world is dirty


– incomplete: lacking attribute values, lacking certain
attributes of interest, or containing only aggregate data.
e.g., occupation=“”
– noisy: containing errors or outliers. e.g., Salary=“-10”,
Age=“222”
– inconsistent: containing discrepancies in codes or names.

• e.g., Age=“42” Birthday=“03/07/1997”

• e.g., Was rating “1,2,3”, now rating “A, B, C”

• e.g., discrepancy between duplicate records


Missing Data

▪ Data is not always available. Many tuples have no recorded


value for several attributes, such as customer income in sales
data
▪ Missing data may be due to
– Equipment malfunction
– Inconsistent with other recorded data and thus deleted
– Data not entered due to misunderstanding
– “Not applicable” data value or when collected
– Certain data not considered important at time of entry
– Different considerations between the time when the data was
collected and when it is analyzed.
– Human/hardware/software problems
Noisy Data

▪ Noise: random error or variance in a measured variable

▪ Noisy data (incorrect values) may come from


– Faulty data collection instruments
– data entry problems

– data transmission problems

– technology limitation

– inconsistency in naming convention

▪ Inconsistent data may come from


– Different data sources
– Functional dependency violation (e.g., modify some linked
data)
▪ Duplicate records also need data cleaning
How to Handle Missing Data?

▪ Ignore the tuple: not effective when the percentage of missing


values per attribute varies considerably.
▪ Fill in the missing value manually: tedious + infeasible?
▪ Fill in it automatically with
– a global constant : e.g., “unknown”, a new class?
– the attribute mean/ median/ mode

– the attribute mean for all samples belonging to the same class

– the most probable value: inference-based such as Bayesian


formula or decision tree
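
As a brief illustration, the pandas sketch below applies two of the automatic fill-in strategies listed above (a global constant and the attribute mean); the column names and values are hypothetical.

```python
# Illustrative missing-value handling with pandas (hypothetical data).
import pandas as pd

df = pd.DataFrame({"income": [30000, None, 52000, 61000, None],
                   "occupation": ["clerk", None, "engineer", "clerk", "teacher"]})

# Fill a categorical attribute with a global constant such as "unknown".
df["occupation"] = df["occupation"].fillna("unknown")

# Fill a numeric attribute with its mean (median or mode work the same way).
df["income"] = df["income"].fillna(df["income"].mean())

print(df)
```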
How to Handle Noisy Data?

▪ Binning
– first sort data and partition into (equal-frequency) bins

– then one can smooth by bin means, smooth by bin median,


smooth by bin boundaries, etc.
▪ Regression
– smooth by fitting the data into regression functions

▪ Clustering
– detect and remove outliers

▪ Combined computer and human inspection


– detect suspicious values and check by human (e.g., deal
with possible outliers)
Binning Methods

▪ Sorted data for price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26,
28, 29, 34
▪ Partition into equal-frequency (equi-depth) bins:
- Bin 1: 4, 8, 9, 15
- Bin 2: 21, 21, 24, 25
- Bin 3: 26, 28, 29, 34
▪ Smoothing by bin means:
- Bin 1: 9, 9, 9, 9
- Bin 2: 23, 23, 23, 23
- Bin 3: 29, 29, 29, 29
▪ Smoothing by bin boundaries (each value is replaced by the closer of its bin's minimum and maximum; e.g., in Bin 1, 8 is closer to 4 since 8-4=4 < 15-8=7, and 9 is closer to 4 since 9-4=5 < 15-9=6):
- Bin 1: 4, 4, 4, 15
- Bin 2: 21, 21, 25, 25
- Bin 3: 26, 26, 26, 34
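
The same price example can be reproduced in a few lines of plain Python, partitioning into equal-frequency bins and then smoothing by bin means and by bin boundaries:

```python
# Equal-frequency binning and smoothing for the sorted price data above.
prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
depth = 4  # equal-frequency (equi-depth) bins of 4 values each

bins = [prices[i:i + depth] for i in range(0, len(prices), depth)]

# Smoothing by bin means: every value becomes the mean of its bin.
by_means = [[round(sum(b) / len(b))] * len(b) for b in bins]

# Smoothing by bin boundaries: every value becomes the closer of its bin's min and max.
by_bounds = [[min(b) if v - min(b) <= max(b) - v else max(b) for v in b] for b in bins]

print(by_means)   # [[9, 9, 9, 9], [23, 23, 23, 23], [29, 29, 29, 29]]
print(by_bounds)  # [[4, 4, 4, 15], [21, 21, 25, 25], [26, 26, 26, 34]]
```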
Data Integration

▪ Combines data from multiple sources into a coherent store


▪ Detecting and resolving data value conflicts
▪ For the same real world entity, attribute values from different
sources are different
▪ Possible reasons: different representations, different scales,
e.g., metric vs. British units

▪ Entity identification problem
– Schema integration: match equivalent attributes across sources (e.g., the same customer-id attribute named differently in each source)
– Object matching: identify the same real-world entities across multiple data sources, e.g., Bill Clinton = William Clinton
Handling Redundancy in Data Integration

▪ Redundant data occur often when integration of multiple


databases
– Derivable data: One attribute may be a “derived” attribute
in another table, e.g., annual revenue
– Inconsistencies in attribute naming can also cause
redundancies in resulting data set

▪ Redundant attributes may be able to be detected by


correlation analysis
Correlation Analysis (Numerical Data)

▪ Correlation coefficient (Pearson’s product moment


coefficient)
$$r_{A,B} = \frac{\sum (A - \bar{A})(B - \bar{B})}{(n-1)\,\sigma_A \sigma_B} = \frac{\sum (AB) - n\,\bar{A}\,\bar{B}}{(n-1)\,\sigma_A \sigma_B}$$

where n is the number of tuples, $\bar{A}$ and $\bar{B}$ are the respective means of A and B, $\sigma_A$ and $\sigma_B$ are the respective standard deviations of A and B, and $\sum(AB)$ is the sum of the AB cross-product.

▪ If rA,B > 0, A and B are positively correlated (A’s values


increase as B’s). The higher, the stronger correlation.
▪ rA,B = 0: independent;
▪ rA,B < 0: negatively correlated
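
A short plain-Python sketch of the formula, computed on two small invented attribute value lists:

```python
# Pearson correlation coefficient r_{A,B} for two attributes (hypothetical values).
import math

A = [2, 4, 6, 8, 10]
B = [1, 3, 7, 9, 12]
n = len(A)

mean_A, mean_B = sum(A) / n, sum(B) / n
# Sample standard deviations, matching the (n - 1) in the formula's denominator.
std_A = math.sqrt(sum((a - mean_A) ** 2 for a in A) / (n - 1))
std_B = math.sqrt(sum((b - mean_B) ** 2 for b in B) / (n - 1))

r = sum((a - mean_A) * (b - mean_B) for a, b in zip(A, B)) / ((n - 1) * std_A * std_B)
print(round(r, 3))  # close to +1: A and B are strongly positively correlated
```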
Correlation Analysis (Categorical Data)

▪ Χ² (chi-square) test:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

▪ The larger the Χ2 value, the more likely the variables are
related
▪ Correlation does not imply causality
– # of hospitals and # of car-theft in a city are correlated
– Both are causally linked to the third variable: population
Chi-Square Calculation: An Example

                           Play chess   Not play chess   Sum (row)
Like science fiction       250 (90)     200 (360)        450
Not like science fiction   50 (210)     1000 (840)       1050
Sum (col.)                 300          1200             1500

▪ Χ2 chi-square calculation (numbers in parenthesis are


expected counts calculated based on the data distribution in
the two categories)
$$\chi^2 = \frac{(250-90)^2}{90} + \frac{(50-210)^2}{210} + \frac{(200-360)^2}{360} + \frac{(1000-840)^2}{840} = 507.93$$
▪ It shows that like_science_fiction and play_chess are correlated in the group.
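
The same calculation in a short plain-Python sketch (scipy.stats.chi2_contingency could also be used, though by default it applies Yates' continuity correction to 2x2 tables):

```python
# Chi-square statistic for the 2x2 science-fiction / chess contingency table above.
observed = [[250, 200],    # like science fiction
            [50, 1000]]    # not like science fiction

row_sums = [sum(row) for row in observed]         # 450, 1050
col_sums = [sum(col) for col in zip(*observed)]   # 300, 1200
total = sum(row_sums)                             # 1500

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_sums[i] * col_sums[j] / total   # e.g. 450 * 300 / 1500 = 90
        chi2 += (obs - expected) ** 2 / expected

print(round(chi2, 2))  # 507.94 (shown truncated to 507.93 above)
```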
Data Transformation

▪ Ex. birth date → age, rupees → dollars

▪ Smoothing
– Binning, regression, clustering

▪ Normalization
– scale the data values in a specified range (Ex. -1.0 to 1.0 or
0.0 to 1.0)

▪ Aggregation
– In Aggregation, summary or aggregation operations are
applied to the data. For example, daily sales data may be
aggregated so as to compute monthly and annual total
amounts.
▪ Discretization and concept hierarchy generation
Cluster Analysis
(figure: data points grouped into clusters; outliers fall outside the clusters)

Regression
(figure: data smoothed by fitting it to the line y = x + 1)
Data Transformation: Normalization

▪ Min-max normalization: to [new_minA, new_maxA]


$$v' = \frac{v - min_A}{max_A - min_A}\,(new\_max_A - new\_min_A) + new\_min_A$$

▪ Ex. Let income range $12,000 to $98,000 be normalized to [0.0, 1.0]. Then $73,600 is mapped to

$$\frac{73{,}600 - 12{,}000}{98{,}000 - 12{,}000}\,(1.0 - 0) + 0 = 0.716$$
Data Transformation: Normalization

▪ Z-score normalization (μ: mean, σ: standard deviation):

$$v' = \frac{v - \mu_A}{\sigma_A}$$

▪ Ex. Let μ = 54,000, σ = 16,000. Then

$$\frac{73{,}600 - 54{,}000}{16{,}000} = 1.225$$

▪ Normalization by decimal scaling:

$$v' = \frac{v}{10^{\,j}}$$

where j is the smallest integer such that max(|v'|) < 1
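
A small plain-Python sketch applying the three normalizations above to the $73,600 income example:

```python
# Min-max, z-score, and decimal-scaling normalization of the income value $73,600.
v = 73_600

# Min-max normalization to [0.0, 1.0] given min = 12,000 and max = 98,000.
min_a, max_a = 12_000, 98_000
v_minmax = (v - min_a) / (max_a - min_a) * (1.0 - 0.0) + 0.0
print(round(v_minmax, 3))   # 0.716

# Z-score normalization given mean = 54,000 and standard deviation = 16,000.
mu, sigma = 54_000, 16_000
v_zscore = (v - mu) / sigma
print(round(v_zscore, 3))   # 1.225

# Decimal scaling: divide by 10^j, with j the smallest integer making |v'| < 1.
j = len(str(abs(v)))        # 5 digits -> j = 5 for this integer value
v_decimal = v / 10 ** j
print(v_decimal)            # 0.736
```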
Discretization

▪ Discretization
– Used to divide the range of continuous attribute into
intervals. Numerous continuous attribute values are
replaced by small interval labels.
– Ex. Create discrete categories such as below 18 yrs, 18-30 yrs, …

– Discretization can be done by binning, histogram analysis,


cluster, decision tree, correlation analyses
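
As a brief illustration, a pandas sketch that discretizes a continuous age attribute, both with fixed cut points and with equal-frequency binning; the ages are invented:

```python
# Discretize a continuous age attribute into interval labels (hypothetical ages, pandas).
import pandas as pd

ages = pd.Series([12, 17, 19, 25, 31, 42, 58, 70])

# Fixed cut points: below 18, 18-30, 31-60, above 60.
categories = pd.cut(ages, bins=[0, 17, 30, 60, 120],
                    labels=["<18", "18-30", "31-60", ">60"])
print(categories.tolist())

# Equal-frequency (equi-depth) discretization into 4 intervals instead.
print(pd.qcut(ages, q=4).tolist())
```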
Concept Hierarchy Generation

▪ Concept Hierarchy Generation


– Used to reduce the data by collecting and replacing low-
level concepts with higher-level concepts.
– Here attributes are converted from a lower level to a higher level in the hierarchy. For example, the attribute "city" can be generalized to "country", and age to young, middle-aged, or senior.
country (15 distinct values)
  province_or_state (365 distinct values)
    city (3,567 distinct values)
      street (674,339 distinct values)
Data Reduction

▪ Data reduction
– Obtain a reduced representation of the data set that is much
smaller in volume but yet produce the same (or almost the
same) analytical results
– Reduce some variables in table (Ex. Attribute Selection )
– Reduce some values in table (Ex. Discretization)
– Reduce some cases(rows) from the table (Ex. Sampling)

▪ Why data reduction?


– A database/data warehouse may store terabytes of data
– Complex data analysis/mining may take a very long time to
run on the complete data set
Data Reduction Strategies

▪ Data cube aggregation


– Aggregation operation is applied to data for the construction
of the data cube.
▪ Dimensionality reduction
– variable composition
– Variable Selection

▪ Data Compression
– Encoding mechanisms are used to reduce data set size

▪ Numerosity reduction
– Store a model of the data instead of the whole data, i.e., fit the data
to a model. Ex. Histogram, clustering, sampling, regression
▪ Discretization and concept hierarchy generation
Data Cube Aggregation

▪ Multiple levels of aggregation in data cubes further reduces


the size of data to deal with
Data Compression
(figure: with lossless compression the original data can be reconstructed exactly from the compressed data; with lossy compression only an approximation of the original data is recovered)
Major Techniques of Dimensionality
Reduction

▪ Feature reduction (Feature/ variable composition)


– Discarding the less important features and making new
synthetic features from linear combination of the original
ones.
– All original features used but in form of their linear
combinations
– Ex. Average miles driven per year = mileage/(current year
– year +1)

▪ Feature Selection (Attribute Subset /Variable Selection)


– Find a subset of original set of variables, or features, to get
a smaller subset which can be used to model the problem.
Greedy Method for Attribute Subset Selection
(figure: keep adding the best attributes to the tree; all attributes that do not appear in the tree are irrelevant)
Numerosity Reduction

▪ Reduce data volume by choosing alternative, smaller forms


of data representation

▪ Parametric methods
– Assume the data fits some model, estimate model
parameters, store only the parameters, and discard the
data (except possible outliers)
– Example: Log-linear models

▪ Non-parametric methods
– Do not assume models
– Major families: histograms, clustering, sampling
Regression and Log-Linear Models

▪ Linear regression
– Data are modeled to fit a straight line
– Often uses the least-square method to fit the line

▪ Multiple regression
– allows a response variable Y to be modeled as a linear
function of multidimensional feature vector

▪ Log-linear model
– Approximates discrete multidimensional probability
distributions
Linear Regressions
Histograms

▪ Divide data into buckets and store the average (or sum) for each bucket (binning).
Histograms

▪ Partitioning rules:
– Equal-width: equal bucket range
– Equal-frequency (or equal-depth): roughly equal number of values per bucket
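
A short numpy sketch contrasting the two partitioning rules on the price values used in the earlier binning example:

```python
# Equal-width vs. equal-frequency bucket boundaries for the earlier price data (numpy).
import numpy as np

prices = np.array([4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34])

# Equal-width: 3 buckets spanning equal ranges between the minimum and maximum.
counts, edges = np.histogram(prices, bins=3)
print(edges)   # [ 4. 14. 24. 34.]  - evenly spaced boundaries
print(counts)  # [3 3 6]            - values falling in each bucket

# Equal-frequency: boundaries chosen so each bucket holds about the same number of values.
print(np.quantile(prices, [0.0, 1/3, 2/3, 1.0]))
```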
Clustering

▪ Partition data set into clusters based on similarity, and store


cluster representation (e.g., centroid and diameter) only
▪ Can be very effective if data is clustered but not if data is
“dirty”
▪ Elements within the cluster are similar and elements
across the cluster are dissimilar
Sampling

▪ A large data set is represented by a much smaller random sample or subset.
▪ Stratified sampling ensures each type (stratum) is included.
▪ Sampling identifies a representative subset of the data.
Types of Sampling

▪ Simple random sampling


– There is an equal probability of selecting any particular item
▪ Simple random sampling without replacement (SRSWOR)
– Once an object is selected, it is removed from the
population
▪ Simple random sampling with replacement (SRSWR)
– A selected object is not removed from the population
▪ Stratified sampling
– Partition the data set, and draw samples from each partition
(proportionally, i.e., approximately the same percentage of
the data)
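
A small sketch of the three sampling schemes using only Python's standard library; the population and strata are invented:

```python
# SRSWOR, SRSWR, and stratified sampling sketches (hypothetical population).
import random

random.seed(0)
population = list(range(1, 21))          # 20 items

srswor = random.sample(population, 5)    # without replacement: an item can appear once
srswr = random.choices(population, k=5)  # with replacement: an item can repeat
print(srswor, srswr)

# Stratified sampling: draw proportionally from each partition (stratum).
strata = {"low": list(range(1, 13)), "high": list(range(13, 21))}   # 60% / 40% of the data
fraction = 0.25
stratified = {name: random.sample(items, max(1, round(len(items) * fraction)))
              for name, items in strata.items()}
print(stratified)
```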
Sampling: with or without Replacement
(figure)

Types of Sampling
(figure: raw data partitioned and sampled)
Discretization

▪ To transform a continuous attribute into a categorical attribute


– Divide the range of a continuous attribute into intervals
– Some classification algorithms only accept categorical
attributes. E.g. Apriori for ARM
– Reduce data size by discretization
– Prepare for further analysis
– Better results may be obtained with discretized attributes
▪ Three types of attributes:
– Nominal —unordered set, e.g., color, profession
– Ordinal — ordered set, e.g., military or academic rank
– Continuous — real numbers, e.g., integer or real numbers
Discretization
University Questions

▪ Issues in data integration
▪ KDD process
