
Comparative Study

The document provides a comparative study of various machine learning algorithms, including Naïve Bayes, Decision Trees, Random Forest, Support Vector Machines, and K Nearest Neighbors, highlighting their advantages, disadvantages, and applications. Each algorithm has unique strengths, such as Naïve Bayes' simplicity and Decision Trees' interpretability, while also facing challenges like overfitting and computational expense. Applications span multiple fields including healthcare, finance, and marketing, showcasing the versatility of these algorithms.


Comparative Study

Machine Learning Algorithms


• Supervised Learning techniques
– Naïve Bayes Classifier
– Decision tree
– Random Forest
– Support Vector Machine
– K Nearest Neighbors
Naïve Bayes
• Advantages
• It is simple and easy to implement
• It requires relatively little training data
• It handles both continuous and discrete data
• It is highly scalable with the number of predictors and data points
• It is fast and can be used to make real-time predictions
• It is not sensitive to irrelevant features
Naïve Bayes
• Disadvantages
• Naïve Bayes assumes that all predictors (features) are independent, which rarely holds in real life. This limits the algorithm's applicability in real-world use cases.
• It suffers from the 'zero-frequency problem': it assigns zero probability to a categorical value that appears in the test set but never appeared in the training set.
• Its probability estimates can be poorly calibrated, so the predicted probabilities should not be taken at face value.
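The zero-frequency problem above has a standard fix, Laplace smoothing. A minimal sketch in plain Python, using a made-up weather/label toy dataset (all names here are illustrative, not from the slides):

```python
from collections import Counter, defaultdict

def train(samples, labels, alpha=1.0):
    """Return class priors and smoothed per-feature likelihoods."""
    priors = Counter(labels)
    counts = defaultdict(Counter)   # (class, feature_idx) -> value counts
    values = defaultdict(set)       # feature_idx -> values seen in training
    for x, y in zip(samples, labels):
        for i, v in enumerate(x):
            counts[(y, i)][v] += 1
            values[i].add(v)

    def likelihood(y, i, v):
        # Laplace smoothing: the +alpha gives unseen values a small,
        # nonzero probability instead of zeroing out the whole product
        return (counts[(y, i)][v] + alpha) / (priors[y] + alpha * len(values[i]))

    return priors, likelihood

def predict(x, priors, likelihood):
    total = sum(priors.values())
    scores = {}
    for y, n in priors.items():
        p = n / total                       # class prior P(y)
        for i, v in enumerate(x):
            p *= likelihood(y, i, v)        # naive independence assumption
        scores[y] = p
    return max(scores, key=scores.get)

X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y = ["no", "no", "yes", "yes"]
priors, lik = train(X, y)
# ("rainy", "hot"): "hot" was never seen with class "yes", yet smoothing
# still lets the strong "rainy" evidence win
print(predict(("rainy", "hot"), priors, lik))
```

With alpha=0 the same query would get probability zero for class "yes", which is exactly the zero-frequency failure described above.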
Applications
Decision Tree
• Advantages
• Simplicity and Interpretability: Decision trees are straightforward and easy to understand. They can be visualized as a flowchart, which makes it simple to see how decisions are made.
• Versatility: They can be applied to different types of tasks, working well for both classification and regression.
• No Need for Feature Scaling: They don't require you to normalize or scale your data.
• Handles Non-linear Relationships: They are capable of capturing non-linear relationships between features and target variables.
Decision Tree
• Disadvantages
• Overfitting: Overfitting occurs when a decision tree captures noise and details in the training data, causing it to perform poorly on new data.
• Instability: The model can be unreliable: slight variations in the input data can lead to significant differences in predictions.
• Bias towards Features with More Levels: Decision trees can become biased towards features with many categories, focusing too much on them during decision-making. This can cause the model to miss other important features, leading to less accurate predictions.
Applications
• Predictive Analytics in healthcare
• Credit card risk assessment in finance
• Customer segmentation in marketing
• Churn prediction in telecom
• Fraud detection in banking
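The interpretability and overfitting points above can be illustrated with scikit-learn (assumed available) on a tiny made-up credit-risk table; the feature names and numbers are invented for illustration only:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: columns are [income, debt]; label 1 = default, 0 = repaid
X = [[20, 9], [25, 8], [45, 2], [60, 1], [55, 3], [30, 7]]
y = [1, 1, 0, 0, 0, 1]

# max_depth caps tree growth: a common guard against fitting noise
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The fitted tree prints as a flowchart-like set of if/else rules,
# which is the interpretability advantage in action
print(export_text(clf, feature_names=["income", "debt"]))
print(clf.predict([[50, 2]]))   # high income, low debt
```

Deepening the tree (or removing max_depth) on noisy data is what produces the overfitting and instability listed among the disadvantages.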
Random Forest
• Advantages
• High accuracy due to ensemble learning
• Handles large datasets with high dimensionality
• Robust to overfitting, since predictions are averaged over multiple decision trees
• Works well for both classification and regression
• Handles missing data effectively
Random Forest
• Disadvantages
• Computationally slower, especially when dealing with large datasets
• Less interpretable compared to a single decision tree or simpler algorithms
• Requires parameter tuning: the number of trees, the maximum depth of each tree, and the number of features considered at each split
• May not perform well on datasets with high noise levels
Applications
• Medical Diagnosis
• Image classification
• Fraud detection
• Customer segmentation
• Energy demand forecasting
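A short scikit-learn sketch (library assumed available, data made up) showing the ensemble idea: many bootstrapped trees vote, and the constructor exposes exactly the tuning parameters the slide mentions:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy two-class data where both features correlate with the label
X = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.1],
     [0.9, 1.0], [1.0, 0.9], [0.8, 0.8]]
y = [0, 0, 0, 1, 1, 1]

forest = RandomForestClassifier(
    n_estimators=50,       # number of trees in the ensemble
    max_depth=3,           # maximum depth of each tree
    max_features="sqrt",   # features considered at each split
    random_state=0,
).fit(X, y)

# Each of the 50 trees votes; the majority class wins
print(forest.predict([[0.05, 0.05], [0.95, 0.95]]))
```

Averaging over many randomized trees is what buys the robustness to overfitting listed above, at the cost of the slower, less interpretable model listed among the disadvantages.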
Support Vector Machine
• Advantages
• SVM performs well with data that has many attributes
• Gives good results even when there is limited prior information about the data; also works well with semi-structured and unstructured data
• SVM can use kernels to transform the data and learn non-linear patterns
• Robust to noise
Support Vector Machine
• Disadvantages
• Computationally expensive, especially on large datasets
• The basic formulation is limited to two-class problems; multi-class classification requires strategies such as one-vs-one or one-vs-rest
• Not suitable for datasets with missing values
• Provides no direct probabilistic interpretation of its outputs
Applications
• Face detection
• Text categorization to find important information
• Bioinformatics
• Handwriting recognition
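The kernel advantage above can be sketched with scikit-learn (assumed available) on an XOR-style toy pattern that no straight line can separate; the RBF kernel maps the points into a space where they become separable:

```python
from sklearn.svm import SVC

# XOR-like toy data: opposite corners share a class,
# so no linear boundary can split it
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 0, 1, 1]

# The RBF kernel is what lets the SVM learn this non-linear pattern
clf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)

# Points near (0,0) and near (0,1) fall on opposite sides
# of the learned non-linear boundary
print(clf.predict([[0.05, 0.05], [0.05, 0.95]]))
```

Note that scikit-learn's SVC already wraps the binary SVM in a one-vs-one scheme for multi-class data, which is how libraries work around the two-class limitation listed among the disadvantages.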
K Nearest Neighbor
• Advantages
• No training period: the algorithm does not build a model during training, it simply stores the data
• Since it requires no training before making predictions, new data can be added seamlessly without retraining the algorithm
• Easy to implement, as only two parameters are required: the value of K and the distance function
K Nearest Neighbor
• Disadvantages
• Slow with large datasets, since every prediction scans all stored examples
• Does not work well in high dimensions
• Prone to overfitting when K is too small
• Feature scaling must be performed before applying KNN
• Sensitive to noisy data, missing values and outliers
Applications
• Recommendation Systems
• Spam Detection
• Customer segmentation
• Speech Recognition
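A minimal pure-Python sketch (made-up data) showing the two parameters the slide names, the value of K and the distance function, and why there is no training step: prediction is just a search over the stored examples.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3, dist=math.dist):
    # Sort all stored examples by distance to the query point...
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))
    # ...then let the k nearest ones vote on the label
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# Toy data: feature 1 is on a much larger scale than feature 0, which is
# why the slide insists on feature scaling; unscaled, feature 1 dominates
# the Euclidean distance entirely
X = [(1.0, 100.0), (1.2, 110.0), (5.0, 500.0), (5.2, 520.0)]
y = ["small", "small", "large", "large"]
print(knn_predict(X, y, (1.1, 105.0), k=3))
```

The full scan in `sorted` is also the slowness disadvantage in miniature: every prediction costs time proportional to the size of the training set.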
