Data Science Applications Overview

Unit 5 covers classification techniques in data science, including performance measures, K-Nearest Neighbors (KNN), and K-Means clustering, along with their implementations in R. It also discusses case studies such as weather forecasting, stock market prediction, object recognition, and real-time sentiment analysis, highlighting the importance of machine learning in these applications. Additionally, it provides step-by-step implementations of logistic regression, KNN, and K-Means clustering in R.

Uploaded by

prakulrokz


Unit 5

Classification: Performance measures, Logistic Regression implementation in R, K-Nearest Neighbors (KNN), K-Nearest Neighbors implementation in R. Clustering: K-Means Algorithm, K-Means implementation in R.

Case studies of Data Science Applications: Weather Forecasting, Stock Market Prediction, Object Recognition, Real Time Sentiment Analysis.

_____________________________________________________________________________________

Classification
Classification is a type of supervised learning that predicts a discrete value of the target attribute for a given input. Examples of classification algorithms include K-Nearest Neighbors, Logistic Regression, Support Vector Machine, and Decision Tree.

Performance measures
Classification models predict categorical outcomes for the records of a given test set. Several performance measures (metrics) are used to evaluate a classification model: accuracy, precision, sensitivity (recall), F-score, specificity, and the confusion matrix.
The following terms are used when measuring the performance of the model.
True Positive (TP): The model predicts the output as positive and the actual output is also positive.
True Negative (TN): The model predicts the output as negative and the actual output is also negative.
False Positive (FP): The model predicts the output as positive but the actual output is negative.
False Negative (FN): The model predicts the output as negative but the actual output is positive.

1. Accuracy
Accuracy is the ratio of correct predictions to the total predictions.
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

Example: If a model correctly classifies 90 out of 100 instances, the accuracy is 90%.

2. Precision
Precision measures how many of the predicted positive instances are actually positive.
It is the ratio of correct positive predictions to the total positive predictions.
\[ \text{Precision} = \frac{TP}{TP + FP} \]
Example: In a spam detection system, if 80 emails are predicted as spam and 70 of them are actually spam, precision is 70/80 = 0.875.

3. Recall (Sensitivity)
Recall measures the proportion of actual positive cases that are correctly identified.
It is the ratio of correct positive predictions to the actual positives.

\[ \text{Recall} = \frac{TP}{TP + FN} \]
Example: If there are 100 spam emails and the model correctly identifies 70 of them, recall is 70/100 = 0.7.

4. F1 Score
F1 Score is the harmonic mean of precision and recall, balancing both metrics.

\[ \text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
Example: If precision is 0.875 and recall is 0.7, the F1 Score is 2 × (0.875 × 0.7) / (0.875 + 0.7) ≈ 0.778.
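
The F1 Score can also be written directly in terms of the confusion-matrix counts, which makes it easy to check the example by hand. The short derivation below substitutes the precision and recall definitions given above; the counts TP = 70, FP = 10 and FN = 30 are taken from the spam examples.

```latex
\[
F_1 = 2 \times \frac{\frac{TP}{TP+FP} \times \frac{TP}{TP+FN}}
                    {\frac{TP}{TP+FP} + \frac{TP}{TP+FN}}
    = \frac{2\,TP}{2\,TP + FP + FN}
\]
\[
F_1 = \frac{2 \times 70}{2 \times 70 + 10 + 30} = \frac{140}{180} \approx 0.778
\]
```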

5. Specificity
Specificity measures the proportion of actual negative cases that are correctly identified.
It is the ratio of correct negative predictions to the actual negatives.
\[ \text{Specificity} = \frac{TN}{TN + FP} \]
Example: If 900 non-spam emails are correctly identified out of 950, specificity is 900/950 ≈ 0.947.

6. Confusion Matrix
Definition: A confusion matrix summarizes the model's performance by showing the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
Example:
                     Predicted Positive    Predicted Negative
Actual Positive      TP=70                 FN=30
Actual Negative      FP=10                 TN=9
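
The measures above all follow from the four counts, so they can be computed together. The following is a minimal R sketch, not part of the original notes: the function name `classification_metrics` is hypothetical, TP, FN and FP follow the spam examples, and TN = 900 is an assumed value chosen only for illustration.

```r
# Compute the five ratio measures from the four confusion-matrix counts.
classification_metrics <- function(tp, fn, fp, tn) {
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  list(
    accuracy = (tp + tn) / (tp + tn + fp + fn),
    precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall),
    specificity = tn / (tn + fp)
  )
}

# TP, FN and FP follow the spam examples; TN = 900 is a hypothetical value.
m <- classification_metrics(tp = 70, fn = 30, fp = 10, tn = 900)
print(round(unlist(m), 3))
```

With these counts, precision and recall come out at 0.875 and 0.7, matching the worked examples in the text.
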
K-Nearest Neighbor (KNN) Algorithm
KNN is a supervised learning algorithm used for classification and regression. It stores the available training set data and classifies a new input (new data point) based on its similarity to the stored records.

KNN is also called a lazy learning algorithm because it does not learn from the training set immediately. Instead, it stores the dataset in memory and performs the computation on the dataset only at the time of classification.

Working of KNN Algorithm or steps in KNN Algorithm

1. Choose the number of neighbors K (usually an odd number to avoid ties).

2. Calculate the distance between the new data point (new input) and every point (record) in the training set using Euclidean distance.

3. Identify the K nearest neighbors (smallest distances) based on the calculated distances.

4. Among these K neighbors, count the number of data points (records) in each category.

5. Assign the new data point to the category that has the maximum number of neighbors.
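
The steps above can be sketched in a few lines of base R. This is an illustrative sketch only: the function name `knn_predict` and the toy data are hypothetical, and ties between categories are resolved by taking the first category in alphabetical order.

```r
# Classify one new point with K-Nearest Neighbors (base R only).
knn_predict <- function(train_x, train_y, new_point, k = 5) {
  # Step 2: Euclidean distance from the new point to every training record
  diffs <- sweep(as.matrix(train_x), 2, new_point)
  distances <- sqrt(rowSums(diffs^2))
  # Step 3: indices of the k smallest distances
  nearest <- order(distances)[1:k]
  # Steps 4-5: count categories among the neighbors and take the majority
  votes <- table(train_y[nearest])
  names(votes)[which.max(votes)]
}

# Hypothetical toy data: two features, two categories
train_x <- data.frame(x1 = c(1, 2, 2, 3, 7, 8, 8, 9),
                      x2 = c(1, 1, 2, 2, 8, 7, 9, 8))
train_y <- c("A", "A", "A", "A", "B", "B", "B", "B")
print(knn_predict(train_x, train_y, new_point = c(3, 3), k = 5))
```

For the point (3, 3), four of the five nearest neighbors are in category A, so the sketch returns "A".
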

Advantages

Simple and easy to implement.

No training phase, useful for small datasets.

Works well for classification.

Disadvantages

Computationally expensive for large datasets.

Requires careful selection of K.

Each record in the training set is plotted as a point in 2-dimensional space. A new record is placed in the same space and its K nearest neighbors are found. The category of each neighbor is noted, and the new data point is assigned to the category that has the largest number of neighbors.
Suppose we have a new data point and we need to put it in the correct category.

[Figure: a new data point plotted among the points of two categories]

Assume that K=5, i.e. we have to find the 5 nearest neighbors and the category of each neighbor.

Among the 5 neighbors, 3 belong to category A and the remaining 2 belong to the other category.

So the new input belongs to category A.

Clustering
The process of forming clusters (groups) from a dataset is known as clustering.

The data objects within a cluster are similar to one another, and data objects in different clusters are dissimilar. Each cluster has a cluster centroid (center), obtained by calculating the mean of all data objects within that cluster.

The Euclidean distance is calculated from a data object to every cluster centroid. The data object is placed in the cluster whose centroid is at the minimum distance from it.
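
The two quantities used above can be written out explicitly. The notation is an assumption introduced here for illustration: x is a data object with m attributes, c is a centroid, and S is the set of data objects in one cluster. The numeric line uses two illustrative points, (2, 10) and (5, 8).

```latex
\[
d(x, c) = \sqrt{\sum_{i=1}^{m} (x_i - c_i)^2},
\qquad
c = \frac{1}{|S|} \sum_{x \in S} x
\]
\[
d\big((2, 10), (5, 8)\big) = \sqrt{(2-5)^2 + (10-8)^2} = \sqrt{13} \approx 3.61
\]
```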

K-Means Clustering

K-Means is an unsupervised machine learning algorithm used for clustering. It partitions a dataset into
K clusters, where each data point belongs to the cluster with the nearest mean.

K-Means Algorithm

Input

K: The number of clusters to be formed

D : Dataset that contains n number of data objects or records

Output

K Clusters from data set

Method:

1. Randomly select K data objects as the initial cluster centroids (cluster centers) for the K clusters.
2. Repeat
3.     Assign each data object to the cluster whose centroid is at the minimum distance from that data object.
4.     Update the cluster centroids, i.e. recalculate the mean of each cluster from its updated members.
5. Until there is no change in the cluster members or centroids.
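
The method above can be sketched from scratch in base R. This is a minimal illustration under stated assumptions, not a replacement for R's built-in `kmeans`: the function name `simple_kmeans`, the `max_iter` safeguard, and the toy points are all hypothetical. Because the initial centroids are random, different runs can end in different clusterings.

```r
simple_kmeans <- function(data, k, max_iter = 100) {
  data <- as.matrix(data)
  # Step 1: pick k distinct data objects at random as the initial centroids
  centroids <- data[sample(nrow(data), k), , drop = FALSE]
  cluster <- rep(0, nrow(data))
  for (iter in seq_len(max_iter)) {
    # Step 3: squared Euclidean distance from every object to every centroid
    dists <- sapply(seq_len(k), function(j) {
      rowSums(sweep(data, 2, centroids[j, ])^2)
    })
    new_cluster <- max.col(-dists, ties.method = "first")
    # Step 5: stop when no object changes cluster
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # Step 4: recompute each centroid as the mean of its members
    for (j in seq_len(k)) {
      members <- data[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centroids[j, ] <- colMeans(members)
    }
  }
  list(cluster = cluster, centers = centroids)
}

# Hypothetical toy data: eight points in two dimensions
set.seed(1)
pts <- rbind(c(1, 1), c(1, 2), c(2, 1), c(2, 2),
             c(8, 8), c(8, 9), c(9, 8), c(9, 9))
res <- simple_kmeans(pts, k = 2)
print(res$cluster)
print(res$centers)
```
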

Applications of K-Means clustering include customer segmentation in marketing and anomaly detection in cybersecurity.
Example of K-Means clustering

[Handwritten worked example: only fragments survive in the scan, including the points (2, 10), (7, 5) and (5, 8) and Euclidean distance calculations between data points and cluster centroids.]

Case Studies of Data Science Applications

1. Weather Forecasting
Weather forecasting is the application of science and technology to predict atmospheric conditions for a
given location and time. With advancements in data science and machine learning, forecasting accuracy
has significantly improved.

1. Importance of Weather Forecasting

• Agriculture: Helps farmers plan irrigation and crop protection.

• Disaster Management: Predicts extreme weather events like hurricanes, floods, and droughts.

• Aviation: Ensures safe flight operations.

• Energy Sector: Optimizes power generation and distribution.

2. Data Sources for Weather Forecasting

• Meteorological Stations: Collect temperature, humidity, wind speed, and pressure data.

• Satellites: Provide cloud cover, precipitation, and storm movement.

• IoT Sensors: Gather real-time weather data from different locations.

3. Data Science and Machine Learning Approaches

• Regression Models: Predict temperature and precipitation.

• Time Series Analysis: Uses past weather data to make future predictions.

• Deep Learning Models: Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks analyze time-dependent weather data.

4. Workflow of Weather Forecasting Using Machine Learning

1. Data Collection and Preprocessing
Obtain data from sources such as Kaggle or OpenWeatherMap.
Handle missing values and perform feature scaling.
Convert categorical variables (e.g., sunny, rainy) into numerical format.

2. Model Building
Split the collected dataset into a training set and a test set.
Build the model by supplying the training set to a machine learning or deep learning algorithm.

3. Model Evaluation
Evaluate the performance of the model using the test set data and performance metrics.

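
The workflow above can be sketched end to end in base R. Everything in this sketch is an assumption made for illustration: the data is synthetic (not real observations from Kaggle or OpenWeatherMap), the column names are hypothetical, linear regression stands in for whichever model is chosen, and RMSE is used as the performance metric. The simple integer encoding of `condition` imposes an arbitrary order on the categories, which a real project might avoid.

```r
set.seed(1)
# Hypothetical synthetic weather records (not real observations)
n <- 200
weather <- data.frame(
  humidity = runif(n, 20, 100),
  pressure = rnorm(n, 1013, 8),
  condition = sample(c("sunny", "rainy", "cloudy"), n, replace = TRUE)
)
weather$temperature <- 35 - 0.15 * weather$humidity + rnorm(n, 0, 2)
weather$humidity[sample(n, 10)] <- NA  # simulate missing readings

# 1. Preprocessing: impute missing values, encode categories, scale features
weather$humidity[is.na(weather$humidity)] <- mean(weather$humidity, na.rm = TRUE)
weather$condition <- as.numeric(factor(weather$condition))
weather[c("humidity", "pressure")] <- scale(weather[c("humidity", "pressure")])

# 2. Model building: 75% training set, 25% test set
train_idx <- sample(n, size = 0.75 * n)
training_set <- weather[train_idx, ]
test_set <- weather[-train_idx, ]
model <- lm(temperature ~ humidity + pressure + condition, data = training_set)

# 3. Model evaluation: root mean squared error on the test set
pred <- predict(model, newdata = test_set)
rmse <- sqrt(mean((test_set$temperature - pred)^2))
print(rmse)
```
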
2. Stock Market Prediction

The stock market is a public market where people can buy and sell shares of listed companies. Stock market prediction uses historical data, statistical methods, and machine learning algorithms to forecast stock prices and trends.

Data Sources for Stock Market Prediction

Historical Stock Prices: Open, close, high, and low prices of a company's shares in the past.

Fundamental Data: Company earnings, revenue, and profits.

Machine Learning-Based Approaches

Regression Models: Linear Regression for stock price prediction.

Classification Models: Predict stock movement (increase/decrease) using Support Vector Machines (SVM) and Random Forest.

Deep Learning Models: Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks use the historical prices of a stock to predict its future share price.

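
As a small illustration of the regression approach, the sketch below predicts a closing price from the three previous closing prices with linear regression. It is a toy under stated assumptions: the price series is a synthetic random walk rather than real market data, and the names `lagged`, `split_at` and `direction_hit` are hypothetical. The last line also scores the increase/decrease direction, echoing the classification view.

```r
set.seed(7)
# Hypothetical synthetic closing prices (a random walk), not real market data
close_price <- 100 + cumsum(rnorm(300, mean = 0.05, sd = 1))
n <- length(close_price)

# Predict each close from the previous three closes (lag features)
lagged <- data.frame(
  y    = close_price[4:n],
  lag1 = close_price[3:(n - 1)],
  lag2 = close_price[2:(n - 2)],
  lag3 = close_price[1:(n - 3)]
)

# Chronological split: earlier rows train, later rows test (no shuffling)
split_at <- floor(0.8 * nrow(lagged))
train <- lagged[1:split_at, ]
test <- lagged[(split_at + 1):nrow(lagged), ]

fit <- lm(y ~ lag1 + lag2 + lag3, data = train)
pred <- predict(fit, newdata = test)

# Mean absolute error of the predicted prices
mae <- mean(abs(test$y - pred))
# Share of test days where the predicted up/down direction was right
direction_hit <- mean(sign(pred - test$lag1) == sign(test$y - test$lag1))
print(c(mae = mae, direction_hit = direction_hit))
```

The split is chronological rather than random so that the model is never trained on prices that come after the ones it is tested on.
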
3. Object Recognition
Object recognition is the technique of identifying the objects present in images and videos. It is one of the most important applications of machine learning and deep learning. The goal of this field is to teach machines to understand (recognize) the content of an image just as humans do.

It takes an image as input and outputs the classification label of that image. Object detection algorithms also locate an object in the image and represent it with a bounding box, i.e. a location given as position, height and width.

Applications of Object Recognition

• Medical Imaging: Identifying tumors in X-rays or MRIs.
• Security and Surveillance: Recognizing faces and detecting suspicious activities.
• Retail and Inventory Management: Automated checkout and product detection.
• Agriculture: Monitoring plant health and detecting pests.

Deep learning techniques, particularly Convolutional Neural Networks (CNNs), are used for object detection.

Steps in Object Recognition Process

1. Data Collection and Preprocessing

• Collect images from sources such as Open Images or custom datasets.

• Label objects using tools like LabelImg.

• Normalize and resize images for uniform processing.

2. Model Training

• Split data into training, validation, and test sets.

• Build a model with the training set images.

3. Model Evaluation

• Evaluate model performance on the test set images using accuracy metrics.

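
The preprocessing, splitting and evaluation steps can be sketched in base R without any deep learning library. This is only an illustration under stated assumptions: the "images" are random 8 x 8 pixel arrays, the labels "cat" and "dog" are hypothetical, the 70/15/15 split is one common choice, and the majority-label predictor is a placeholder standing in for a trained CNN.

```r
set.seed(3)
# Hypothetical stand-in for a labelled image set: 100 grayscale 8 x 8 images
n_images <- 100
images <- array(sample(0:255, n_images * 8 * 8, replace = TRUE),
                dim = c(n_images, 8, 8))
labels <- sample(c("cat", "dog"), n_images, replace = TRUE)

# Preprocessing: normalize pixel intensities from 0..255 to 0..1
images <- images / 255

# Shuffle the indices once, then split 70% / 15% / 15%
idx <- sample(n_images)
train_idx <- idx[1:70]
val_idx   <- idx[71:85]
test_idx  <- idx[86:100]

# Evaluation helper: share of images whose predicted label is correct
accuracy <- function(predicted, actual) mean(predicted == actual)

# Placeholder "model": always predicts the most frequent training label
majority <- names(which.max(table(labels[train_idx])))
test_accuracy <- accuracy(rep(majority, length(test_idx)), labels[test_idx])
print(test_accuracy)
```
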
4. Real Time Sentiment Analysis

Real Time Sentiment Analysis analyzes and classifies the sentiment (positive, negative, neutral) of text data from social media, reviews, and customer feedback.

Real-time sentiment analysis is a machine learning (ML) technique that automatically recognizes and extracts the sentiment in a text as soon as the text appears. It is most commonly used to analyze brand and product mentions in live social comments and posts. Real-time sentiment analysis can therefore be done only on social media platforms that share live feeds, as Twitter does.

Working of Real Time Sentiment Analysis

1. Data Collection
Collect sentiment data about a product or event from social media such as Instagram, Twitter, Facebook, etc.

2. Data Processing
All text comments (sentiments) are cleaned and processed for the next stage. Non-text data from live video or audio feeds is added as text comments.

3. Data Analysis
All text comments are analyzed using Natural Language Processing and clustering. This gives the overall percentage of opinions or feedback given by people about a product or event.

4. Data Visualization
The resulting opinions or sentiments are shown in the form of charts or graphs.

The following techniques can be used to build the model for Real Time Sentiment Analysis:

NLP Techniques: Bag of Words

Machine Learning: Logistic Regression, Naïve Bayes, Random Forest.

Deep Learning: Recurrent Neural Networks (RNN)
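
To make the Bag of Words technique concrete, the sketch below turns a few comments into a document-term matrix in base R: each row is a comment, each column a word, and each cell the number of times the word occurs. The comments are hypothetical examples, and splitting on whitespace is a deliberately simple stand-in for real text cleaning.

```r
# Hypothetical example comments
comments <- c("great product love it",
              "bad service bad product",
              "love the service")

# Lower-case each comment and split it into words
tokens <- strsplit(tolower(comments), "\\s+")

# The vocabulary is the sorted set of distinct words
vocab <- sort(unique(unlist(tokens)))

# Count how often each vocabulary word occurs in each comment
bow <- t(vapply(tokens,
                function(words) as.integer(table(factor(words, levels = vocab))),
                integer(length(vocab))))
rownames(bow) <- paste0("doc", seq_along(comments))
colnames(bow) <- vocab
print(bow)
```

A matrix like this can then be passed as the feature table to a classifier such as Logistic Regression or Naïve Bayes.
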

Logistic Regression implementation in R
Steps in implementation

Step 1: Import the dataset.
Step 2: Split the dataset into a training set and a test set.
Step 3: Build the logistic regression model by supplying the training set to the logistic regression algorithm.
Step 4: Evaluate the model performance with the test set data and a confusion matrix.

R Code for logistic regression

# Importing the dataset from the Social_Network_Ads.csv file
# Only columns 3 to 5 are kept; the third kept column, Purchased, is the target
dataset = read.csv('Social_Network_Ads.csv')
dataset = dataset[3:5]

# Splitting the dataset into the Training set and Test set
# install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$Purchased, SplitRatio = 0.75)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)

# Feature Scaling (all columns except the target in column 3)
training_set[-3] = scale(training_set[-3])
test_set[-3] = scale(test_set[-3])

# Fitting Logistic Regression to the Training set
classifier = glm(formula = Purchased ~ .,
                 family = binomial,
                 data = training_set)

# Predicting the Test set results
# prob_pred holds probabilities; y_pred converts them to class labels 0/1
prob_pred = predict(classifier, type = 'response', newdata = test_set[-3])
y_pred = ifelse(prob_pred > 0.5, 1, 0)

# Making the Confusion Matrix (rows: actual class, columns: predicted class)
cm = table(test_set[, 3], y_pred)
print(cm)

Output
Confusion Matrix

     0   1
0   57   7
1   10  26

Accuracy = (57 + 26) / 100 = 83%

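
The reported accuracy can be checked from the confusion matrix itself: the correct predictions lie on the diagonal. The sketch below re-enters the four reported counts by hand (it does not rerun the model), so the object `cm_reported` is an assumption introduced for illustration.

```r
# Reported confusion matrix: rows are actual classes, columns are predictions
cm_reported <- matrix(c(57, 10, 7, 26), nrow = 2,
                      dimnames = list(actual = c("0", "1"),
                                      predicted = c("0", "1")))

# Accuracy = correct predictions (diagonal) / all predictions
accuracy <- sum(diag(cm_reported)) / sum(cm_reported)
print(accuracy)  # 0.83
```
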
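K-Nearest Neighbors (K-NN) implementation in R

# Importing the dataset
dataset = read.csv('Social_Network_Ads.csv')
dataset = dataset[3:5]

# Splitting the dataset into the Training set and Test set
# (same split as in the logistic regression example)
library(caTools)
set.seed(123)
split = sample.split(dataset$Purchased, SplitRatio = 0.75)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)

# Feature Scaling (all columns except the target in column 3)
training_set[-3] = scale(training_set[-3])
test_set[-3] = scale(test_set[-3])

# Fitting K-NN to the Training set and Predicting the Test set results
library(class)
y_pred = knn(train = training_set[, -3],
             test = test_set[, -3],
             cl = training_set[, 3],
             k = 5,
             prob = TRUE)

# Making the Confusion Matrix (rows: actual class, columns: predicted class)
cm = table(test_set[, 3], y_pred)
print(cm)

Output
Confusion Matrix

     0   1
0   59   5
1    6  30

Accuracy = (59 + 30) / 100 = 89%
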
K-Means Clustering implementation in R

# K-Means Clustering

# Importing the dataset; columns 4 and 5 are used as the clustering features
dataset = read.csv('mall.csv')
X = dataset[4:5]

# Using the elbow method to find the optimal number of clusters
# WCSS = within-cluster sum of squares, computed for K = 1 to 10
set.seed(123)  # K-Means starts from random centroids; fix the seed for repeatable results
wcss = vector()
for (i in 1:10) wcss[i] = sum(kmeans(X, i)$withinss)
plot(x = 1:10, y = wcss, type = 'b', main = 'The Elbow Method',
     xlab = 'Number of clusters', ylab = 'WCSS')

# Fitting K-Means to the dataset
# The result is stored in km so that it does not mask the kmeans() function
km = kmeans(x = X, centers = 5, iter.max = 300, nstart = 10)

# Visualising the clusters
library(cluster)
clusplot(x = X, clus = km$cluster, lines = 0, shade = TRUE, color = TRUE, labels = 2,
         plotchar = FALSE, span = TRUE, main = 'Clusters of customers',
         xlab = 'Annual Income', ylab = 'Spending Score')
