Data Mining and Business Analytics

The document discusses predictive analytics, emphasizing its role in predicting future trends and behaviors using measurable variables known as predictors. It outlines the knowledge discovery process in data mining and highlights various tools, including open-source options like R, Weka, Orange, and KNIME, for performing predictive data analysis. The paper aims to enhance understanding of predictive analytics to facilitate quick decision-making across different sectors.

Uploaded by

rnjagi12


Finding predictive information in large databases

11-19-2022
Abstract
Predictive analytics is the branch of data mining concerned with the prediction of future
probabilities and trends. The central element of predictive analytics is the predictor, a variable
that can be measured for an individual or other entity to predict future behavior. For example, an
insurance company is likely to take into account potential driving safety predictors such as age,
gender, and driving record when issuing car insurance policies.

Performing predictive data analytics on huge data sets supports quick decision making and
forecasting based on the results obtained from live or sample data. Data mining, the extraction of
hidden predictive information from large databases, is a powerful new technology with great
potential to help companies or individuals focus on the most important information in their data
warehouses. Predictive data analytics can be applied in various areas, such as medicine,
agriculture, predicting the behavior of children, and predicting the behavior of customers in a
particular business. In this respect, the paper elaborates on the tools available to perform
predictive data analytics and also introduces the importance of data mining. Predictive data
analysis is done using variables, or attributes, known as predictors.

Scope of the project

This paper describes how to perform predictive data analytics in data mining using various tools.
Several open source tools support predictive analytics, such as R (RStudio), Weka, and KNIME.
The paper also lists various predictive analytics tools and specifies their features and usage, so
that the reader can compare them and choose a specific tool based on the requirement.

The main scope is to enhance the study of predictive data analysis and to provide the necessary
help for quick decision making in any important area.

Data mining derives its name from the similarities between searching for valuable business
information in a large database — for example, finding linked products in gigabytes of store
scanner data — and mining a mountain for a vein of valuable ore. Both processes require either
sifting through an immense amount of material, or intelligently probing it to find exactly where
the value resides. Given databases of sufficient size and quality, data mining technology can
generate new business opportunities by providing these capabilities:

Data mining for business analytics
A. Automated prediction of trends and behaviors. Data mining automates the process of
finding predictive information in large databases. Questions that traditionally required
extensive hands-on analysis can now be answered directly from the data — quickly. A
typical example of a predictive problem is targeted marketing. Data mining uses data on
past promotional mailings to identify the targets most likely to maximize return on
investment in future mailings. Other predictive problems include forecasting bankruptcy
and other forms of default, and identifying segments of a population likely to respond
similarly to given events.
B. Automated discovery of previously unknown patterns. Data mining tools sweep through
databases and identify previously hidden patterns in one step. An example of pattern
discovery is the analysis of retail sales data to identify seemingly unrelated products that
are often purchased together. Other pattern discovery problems include detecting
fraudulent credit card transactions and identifying anomalous data that could represent
data entry keying errors.
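
The pattern discovery example in B (products that are often purchased together) can be illustrated
with a minimal sketch. The basket data, the function name frequent_pairs, and the support
threshold are all hypothetical and chosen only for illustration; real tools use more scalable
algorithms than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs whose support (share of baskets holding both) >= min_support."""
    pair_counts = Counter()
    for basket in baskets:
        # Count each unordered pair at most once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(baskets)
    return {pair: count / n for pair, count in pair_counts.items() if count / n >= min_support}


# Hypothetical scanner data: one list of product names per transaction.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

pairs = frequent_pairs(baskets, min_support=0.5)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

Here the pair ("beer", "diapers") appears in three of the five baskets, so it passes the threshold
even though the two products look unrelated.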

KNOWLEDGE DISCOVERY PROCESS

Knowledge Discovery in Databases (KDD) is an automatic, exploratory analysis and modeling
of large data repositories. KDD is the organized process of identifying valid, novel, useful, and
understandable patterns from large and complex data sets.

The knowledge discovery process consists of six stages:

• Data Selection
• Cleaning
• Enrichment
• Coding
• Data mining
• Reporting
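
The six stages can be sketched as a chain of small functions. The paper does not define what each
stage does, so the interpretation in the comments (for example, enrichment as a lookup against an
outside table, coding as binning age into bands) is an assumption, and the records and field names
are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "age": 23, "region": "north", "spend": 120.0, "responded": True},
    {"id": 2, "age": None, "region": "south", "spend": 80.0, "responded": False},
    {"id": 3, "age": 47, "region": "north", "spend": 310.0, "responded": True},
    {"id": 4, "age": 35, "region": "south", "spend": 95.0, "responded": False},
    {"id": 5, "age": 52, "region": "north", "spend": 40.0, "responded": True},
]
REGION_INCOME = {"north": "high", "south": "low"}  # assumed external lookup table


def select(rows):
    # 1. Data selection: keep only the fields relevant to the question.
    return [{k: r[k] for k in ("age", "region", "spend", "responded")} for r in rows]


def clean(rows):
    # 2. Cleaning: drop records with missing values.
    return [r for r in rows if all(v is not None for v in r.values())]


def enrich(rows):
    # 3. Enrichment: add information from an outside source.
    return [{**r, "income": REGION_INCOME[r["region"]]} for r in rows]


def code(rows):
    # 4. Coding: turn raw values into discrete categories.
    return [{**r, "age_band": "under40" if r["age"] < 40 else "40plus"} for r in rows]


def mine(rows):
    # 5. Data mining: response rate per age band (a deliberately simple pattern).
    totals, hits = defaultdict(int), defaultdict(int)
    for r in rows:
        totals[r["age_band"]] += 1
        hits[r["age_band"]] += r["responded"]
    return {band: hits[band] / totals[band] for band in totals}


def report(rates):
    # 6. Reporting: present the result in readable form.
    return [f"{band}: {rate:.0%} response rate" for band, rate in sorted(rates.items())]


rates = mine(code(enrich(clean(select(raw)))))
lines = report(rates)
print("\n".join(lines))
```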

TOOLS FOR PREDICTIVE DATA ANALYSIS

Data Mining Tools
Various software tools for data mining help to find relationships, clusters, and patterns in huge
data sets, and to categorize and summarize the data. Such tools can help a business make more
accurate, and therefore more profitable, decisions.

Categories of Data Mining Tools

Many tools are used for data mining. They are broadly classified into three categories:
traditional data mining tools, dashboards, and text-mining tools.

Traditional Data Mining Tools

Traditional data mining programs help companies establish data patterns and trends by using
various complex algorithms and techniques. Some of these tools are installed on desktop
computers to monitor the data and highlight trends, while others capture information residing
outside a database. The majority of these programs are available in Windows and UNIX versions.
However, some software specializes in one operating system only, and some may work with only
one database type. Most of the software, though, can handle any data using online analytical
processing or a similar technology.

Dashboards

Dashboards are normally installed on computers to monitor information in a database. They
reflect data changes and updates on screen, usually in the form of a chart or table, which enables
the user to see how the business is performing. Historical data can be referenced and checked
against the current status in order to see the changes in the business. In this way, dashboards
are very easy to use and appeal strongly to managers, who get an overview of the company's
performance.

Text-Mining Tools
The third type of data mining tool is called a text-mining tool because of its ability to mine
data from different kinds of text, from Microsoft Word and Acrobat PDF documents to simple
text files. These tools scan the content and convert the selected data into a format that is
compatible with the tool's database, without the need to open different applications.
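
As a rough illustration of converting text into a structured format, the sketch below turns a
plain-text string into a word-count table. It covers only simple text; reading Word or PDF files
would need a separate extraction step that is not shown. The function name, the stop-word list,
and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def term_frequencies(text, stop_words=frozenset({"the", "a", "of", "and", "to", "is"})):
    """Convert free text into a word -> count table, ignoring common stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


# Hypothetical document content.
document = "The claim is late. The customer reports the claim and a late refund."
table = term_frequencies(document)
for word, count in sorted(table.items()):
    print(word, count)
```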

OPEN SOURCE TOOLS FOR DATA MINING

R

R is an open source programming language and environment for statistical computing and
graphics. R provides a wide variety of graphical and statistical techniques, such as linear and
non-linear modeling, classical statistical tests, time-series analysis, classification, and
clustering, and is highly extensible. Researchers in various fields of applied statistics have
adopted R for statistical software development and data analysis. Extensibility and superb data
visualization are the two main reasons for the success of R.

Weka

Weka is a collection of machine learning algorithms for data mining tasks and is well suited for
developing new machine learning schemes. Weka is Java-based software capable of running
on various operating systems, and it contains tools for data pre-processing, classification,
regression, clustering, association rules, and visualization. The algorithms can either be applied
directly to a dataset or called from a user's Java code. Weka is probably the most successful open
source data mining software, and it has inspired the development of other programs with more
sophisticated graphical user interfaces and better visualization methods.

Orange

Orange is open source data mining and visualization software with an active community, and it
supports both novices and experts in their analysis. It runs on various platforms, including
Windows, Mac OS X, and GNU/Linux, and it is packed with data analytics features. It enables the
design of a data analysis process through user-friendly visual programming or Python scripting.
It has specialized add-ons, such as Bioorange for bioinformatics. Python is growing in popularity
because it is simple and easy to learn, yet powerful. Hence, if you are a Python developer looking
for a tool for your work, look no further than Orange, a Python-based, powerful, open source tool
for both novices and experts.

KNIME

Data preprocessing has three main components: extraction, transformation, and loading. KNIME
does all three. It provides a graphical user interface for assembling nodes for data processing. It
is an open source data analytics, reporting, and integration platform. KNIME also integrates
various components for machine learning and data mining through its modular data pipelining
concept, and it has attracted attention in business intelligence and financial data analysis.
Written in Java and based on Eclipse, KNIME is easy to extend with plug-ins. Additional
functionality can be added on the go. Plenty of data integration modules are already included in
the core version.
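
The three preprocessing components named above (extraction, transformation, loading) can be
sketched in a few lines. This is not KNIME code, since KNIME assembles these steps as graphical
nodes; it is a plain Python illustration using only the standard library, and the CSV content,
table name, and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export standing in for a source system.
SOURCE = """customer,amount,currency
alice, 120.50 ,usd
bob,80,USD
carol,,usd
"""


def extract(text):
    # Extraction: read rows from the source.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transformation: trim whitespace, normalize case, drop rows with no amount.
    out = []
    for r in rows:
        amount = r["amount"].strip()
        if not amount:
            continue
        out.append((r["customer"].strip(), float(amount), r["currency"].strip().upper()))
    return out


def load(rows, conn):
    # Loading: write the cleaned rows to the target database.
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE)), conn)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```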

