
2-Tasks and Techniques


Data Mining

Tasks and Techniques


Data Mining Tasks

Descriptive

Classification and Prediction

Descriptive Function
• Deals with the general properties of data in the database.

➡ Class/Concept Description

➡ Mining of Frequent Patterns

➡ Mining of Associations

➡ Mining of Correlations

➡ Mining of Clusters
Descriptive Function
Class/Concept Description

• refers to describing the data associated with classes or concepts

• these descriptions can be derived in the following ways

➡ Data Characterization - refers to summarizing the data of the class under study. The
class under study is called the Target Class

➡ Data Discrimination - refers to the mapping or classification of a class with some
predefined group or class
Descriptive Function
Mining of Frequent Patterns

• Frequent patterns occur in transactional data

➡ Frequent Item Set - refers to a set of items that frequently appear together

➡ Frequent Subsequence - a sequence of patterns that occurs frequently, such as
purchasing a camera followed by a memory card

➡ Frequent Sub Structure - sub structure refers to different structural forms, such as
graphs, trees, or lattices, which may be combined with item-sets or subsequences
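
As a minimal sketch of frequent item set mining, the code below counts how often each pair of items appears together and keeps the pairs that meet a minimum support. The transactions, the function name `frequent_pairs`, and the 0.5 threshold are illustrative assumptions, not part of the original slides.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (share of transactions) is >= min_support."""
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical transactional data for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "biscuits"],
    ["bread", "biscuits"],
    ["milk", "camera"],
]
result = frequent_pairs(transactions, min_support=0.5)
print(result)
```

Only pairs are counted here; larger item sets would need an extension such as the level-wise search used by Apriori-style algorithms.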
Descriptive Function
Mining of Associations

• Associations are used in retail to identify items that are frequently
purchased together.

• Association mining refers to the process of uncovering relationships among
data and determining the association rules

• example: a retailer generates an association rule that shows that 70% of the
time milk is sold with bread and only 30% of the time biscuits are sold with
bread
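
The percentages in a rule like this correspond to what is usually called the confidence of the rule. The sketch below computes it on a made-up set of baskets chosen to reproduce the 70% and 30% figures; the data and the function names `support` and `confidence` are assumptions for illustration.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given that `antecedent` was bought."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical baskets: 10 contain bread; 7 of those also contain milk, 3 biscuits.
baskets = [["bread", "milk"]] * 7 + [["bread", "biscuits"]] * 3 + [["milk"]] * 2

bread_milk = confidence(baskets, ["bread"], ["milk"])
bread_biscuits = confidence(baskets, ["bread"], ["biscuits"])
print(round(bread_milk, 2), round(bread_biscuits, 2))
```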
Descriptive Function
Mining of Correlation

• Correlation mining is an additional analysis performed to uncover interesting
statistical correlations between associated attribute-value pairs or between item
sets, to determine whether they have a positive, negative, or no effect on each other
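
One common way to check the direction of such a correlation between item sets is the lift measure: a value above 1 suggests a positive effect, below 1 a negative effect, and close to 1 no effect. The slides do not name a specific measure, so lift is an assumed choice here, and the baskets are invented for illustration.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def lift(transactions, a, b):
    """Ratio of observed co-occurrence to what independence would predict.

    Assumes both item sets occur at least once (non-zero support).
    """
    joint = support(transactions, set(a) | set(b))
    return joint / (support(transactions, a) * support(transactions, b))


# Hypothetical baskets for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "milk"],
    ["bread"],
    ["milk"],
    ["biscuits"],
]
lift_bread_milk = lift(baskets, ["bread"], ["milk"])          # above 1: positive
lift_bread_biscuits = lift(baskets, ["bread"], ["biscuits"])  # below 1: negative
print(lift_bread_milk, lift_bread_biscuits)
```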
Descriptive Function
Mining of Clusters

• Clustering refers to forming groups of objects that are very similar to each
other but are highly different from the objects in other clusters
Classification and Prediction
• Classification is the process of finding a model that describes and
distinguishes data classes or concepts

• The main purpose is to use this model to predict the class of objects
whose class label is unknown.

• The derived model is based on the analysis of a set of training data and can
be presented in the following forms:

➡ Classification (If-Then) Rules

➡ Decision Trees

➡ Mathematical Formulae

➡ Neural Networks
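
To make the idea of deriving a model from training data concrete, here is a minimal sketch that learns a single if-then rule (a one-level decision tree, sometimes called a decision stump) on one numeric attribute. The attribute (age), the class labels, and the function names are hypothetical.

```python
def train_stump(samples):
    """Learn one rule: 'if value <= t then left_label else right_label'.

    `samples` is a list of (value, label) pairs with at least two distinct values.
    The threshold and labels with the fewest training errors are kept.
    """
    values = sorted({x for x, _ in samples})
    labels = sorted({y for _, y in samples})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        for left in labels:
            for right in labels:
                errors = sum((left if x <= t else right) != y for x, y in samples)
                if best is None or errors < best[0]:
                    best = (errors, t, left, right)
    _, t, left, right = best
    return t, left, right


def predict(model, x):
    """Apply the learned if-then rule to an object with unknown class label."""
    t, left, right = model
    return left if x <= t else right


# Hypothetical training data: (age, spending class).
training = [(18, "low"), (22, "low"), (25, "low"),
            (40, "high"), (52, "high"), (60, "high")]
model = train_stump(training)
print(model, predict(model, 30), predict(model, 45))
```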
Data Mining Techniques
Data mining includes the utilization of refined data analysis
tools to find previously unknown, valid patterns and
relationships in huge data sets. These tools can
incorporate statistical models, machine learning
techniques, and mathematical algorithms such as neural
networks and decision trees. Thus, data mining
incorporates both analysis and prediction
Data Mining Techniques
• Classification
➡ This technique is used to obtain
important and relevant information about
data and metadata

➡ This data mining technique helps
classify data into different classes
Data Mining Techniques
• Clustering
➡ It is a division of information into groups
of connected objects

➡ Describing the data by a few clusters
loses certain fine details, but
achieves simplification. It models
data by its clusters

➡ Clustering analysis is a data mining
technique to identify similar data
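
The slides do not name a specific clustering algorithm. As one possible illustration, the sketch below applies k-means to one-dimensional data: each point is assigned to its nearest center, then each center moves to the mean of its group. The data, the starting centers, and the function name `kmeans_1d` are assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around `centers` by repeated assign-and-update steps."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to the index of its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[1.0, 12.0])
print(centers, clusters)
```

Replacing six points by two cluster centers is the simplification mentioned above: the individual values are lost, but the overall grouping is kept.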
Data Mining Techniques
• Regression
➡ Regression analysis is a data mining
process used to identify and analyze the
relationship between variables in the
presence of other factors.

➡ Regression is primarily a form of
planning and modeling

➡ It estimates the relationship between two
or more variables in the given data set
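
A minimal sketch of regression with two variables is a least-squares straight line fit. The data points and the function name `fit_line` are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real, noisy data the fitted line is an estimate of the relationship rather than an exact description of it.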
Data Mining Techniques
• Association Rules
➡ This is a technique that helps discover
the link between two or more items.

➡ It finds hidden patterns in the data set

➡ Association rules are if-then statements
that help show the probability of interactions
between data items within large data sets
in different types of databases

➡ Association rule mining has several
applications and is commonly used to
find sales correlations in transactional data
or patterns in medical data sets
Data Mining Techniques
• Outlier
➡ This technique relates to the observation
of data items in a data set which do not
match an expected pattern or behavior

➡ This technique may be used in various
domains like intrusion detection, fraud
detection, etc.

➡ An outlier is a data point that diverges too
much from the rest of the dataset

➡ It is valuable in fields like network
interruption identification, credit or debit
card fraud detection, etc.
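
One simple way to flag points that diverge from the rest is a z-score test: measure how many standard deviations each value lies from the mean. The slides do not prescribe a method, so this is an assumed choice; the data and the threshold of 2.0 are illustrative.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values are identical, so nothing diverges
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical transaction amounts; 95 does not match the usual pattern.
amounts = [10, 11, 9, 10, 12, 11, 10, 95]
outliers = find_outliers(amounts)
print(outliers)
```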
Data Mining Techniques
• Sequential Patterns
➡ It is a data mining technique specialized
in evaluating sequential data to discover
sequential patterns

➡ It consists of finding interesting
subsequences in a set of sequences,
where the value of a subsequence can be
measured in terms of different criteria
like length, occurrence frequency, etc.

➡ This technique helps to discover or
recognize similar patterns in transaction
data over some time
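
As a minimal sketch of measuring occurrence frequency for a sequential pattern, the code below counts the share of purchase sequences in which one item is later followed by another. The sequences and the function name `followed_by_support` are hypothetical.

```python
def followed_by_support(sequences, first, second):
    """Share of sequences in which `first` occurs and `second` occurs later."""
    hits = 0
    for seq in sequences:
        # Look for `second` only in the part after the earliest `first`.
        if first in seq and second in seq[seq.index(first) + 1:]:
            hits += 1
    return hits / len(sequences)


# Hypothetical purchase histories, each ordered by time.
sequences = [
    ["camera", "memory card"],
    ["camera", "tripod", "memory card"],
    ["memory card", "camera"],
    ["tripod"],
]
pattern_support = followed_by_support(sequences, "camera", "memory card")
print(pattern_support)
```

Order matters here: the third history contains both items but in the reverse order, so it does not count toward the pattern.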
Data Mining Techniques
• Prediction
➡ This technique uses a combination of
other data mining techniques such as
trend analysis, clustering, classification, etc.

➡ It analyzes past events or instances in
the right sequence to predict future
events
