1. Memory and Storage Constraints:
Example: The K-Nearest Neighbours (KNN) algorithm stores all training data
points in memory and scans them to make predictions for new points. With
large datasets, this leads to high memory consumption and slow queries.
Alternative: Approximate nearest-neighbour techniques such as Locality-
Sensitive Hashing (LSH), or tree-based indexes such as KD-Trees, are more
efficient alternatives to brute-force KNN. They reduce the search cost (and,
in the case of LSH, the memory overhead) by sacrificing a small amount of
accuracy in the nearest-neighbour search; a minimal sketch follows.
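As a rough illustration of that trade-off, the sketch below builds a random-hyperplane LSH index and answers queries from a single hash bucket. The dataset size, number of hyperplanes and helper names are illustrative assumptions, not a production implementation.

```python
# Minimal sketch of random-hyperplane LSH for approximate nearest neighbours.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 64))       # hypothetical training points
n_planes = 16                            # hash length: more planes -> finer buckets
planes = rng.normal(size=(n_planes, X.shape[1]))

def hash_point(x):
    # Each bit records which side of a random hyperplane the point falls on.
    return tuple((planes @ x > 0).astype(int))

# Index: only bucket membership is stored, not pairwise distances.
buckets = defaultdict(list)
for i, x in enumerate(X):
    buckets[hash_point(x)].append(i)

def approx_neighbours(q, k=5):
    # Search only the query's bucket instead of the full dataset.
    candidates = buckets.get(hash_point(q), [])
    if not candidates:
        return []
    dists = np.linalg.norm(X[candidates] - q, axis=1)
    order = np.argsort(dists)[:k]
    return [candidates[j] for j in order]

print(approx_neighbours(X[0]))
```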
2. Computational Time:
Example: Training a kernel Support Vector Machine (SVM) has high
computational complexity, roughly quadratic to cubic in the number of
training samples, which becomes prohibitive for large datasets with many
features.
Alternative: Stochastic Gradient Descent (SGD) is a popular optimization
method for training large-scale linear SVMs (for example by minimizing the
hinge loss). It updates the model parameters from a single example or a
small mini-batch at each iteration, making training faster and more scalable
for big data.
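A minimal sketch of this idea with scikit-learn's SGDClassifier, where the hinge loss gives a linear SVM objective; the synthetic dataset and batch size are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for a large dataset.
X, y = make_classification(n_samples=200_000, n_features=50, random_state=0)
classes = np.unique(y)

# loss="hinge" corresponds to a linear SVM; partial_fit streams mini-batches
# through the model instead of solving one large quadratic program.
clf = SGDClassifier(loss="hinge", random_state=0)
batch = 10_000
for start in range(0, len(X), batch):
    clf.partial_fit(X[start:start + batch], y[start:start + batch], classes=classes)

print("training accuracy:", clf.score(X, y))
```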
3. Scalability:
Example: Decision Trees are simple and interpretable models, but their
scalability is limited when handling big datasets with millions of data
points.
Alternative: Random Forest is an ensemble method that combines many
decision trees, each trained independently on a bootstrap sample of the data.
Because the trees are independent, training can be parallelized across cores
or distributed across machines, offering better scalability than growing a
single large tree.
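For instance, with scikit-learn a forest can be grown on all available cores by setting n_jobs=-1; the dataset below is synthetic and the hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100_000, n_features=30, random_state=0)

# Each tree is fit independently on a bootstrap sample, so the work
# parallelizes across cores (and across machines in distributed frameworks).
forest = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
forest.fit(X, y)
print("training accuracy:", forest.score(X, y))
```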
4. Data Distribution:
Example: The K-Means clustering algorithm typically requires all data points
to be available at once to compute the cluster centroids, which is a
challenge when the data is distributed across multiple machines.
Alternative: Mini-batch K-Means is a variant of K-Means that updates the
centroids from small subsets (mini-batches) of the data at each iteration,
making the algorithm more scalable and better suited to streaming or
distributed settings.
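A minimal sketch using scikit-learn's MiniBatchKMeans, where chunks stand in for data that arrives in pieces or lives on different machines; the shapes, chunk size and cluster count are assumptions.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
stream = rng.normal(size=(500_000, 10))      # stand-in for a large, partitioned dataset

kmeans = MiniBatchKMeans(n_clusters=8, batch_size=1_000, random_state=0)
for start in range(0, len(stream), 1_000):
    # Centroids are updated from each chunk; no pass over the full dataset is needed.
    kmeans.partial_fit(stream[start:start + 1_000])

print(kmeans.cluster_centers_.shape)          # (8, 10)
```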
5. Real-time and Streaming Data:
Example: Standard batch training of a Naive Bayes classifier assumes the
full training set is available up front, which is impractical in real-time or
streaming scenarios.
Alternative: Online learning algorithms, such as Online Naive Bayes, can
continuously update the model as new data arrives, making them suitable
for real-time and streaming applications.
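As a sketch, scikit-learn's GaussianNB supports incremental updates through partial_fit, which keeps running per-class statistics; the stream of batches below is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
classes = np.array([0, 1])
model = GaussianNB()

# Each incoming batch updates the per-class means and variances;
# earlier batches never need to be stored.
for _ in range(100):
    X_batch = rng.normal(size=(1_000, 20))
    y_batch = rng.integers(0, 2, size=1_000)
    model.partial_fit(X_batch, y_batch, classes=classes)

print(model.predict(rng.normal(size=(5, 20))))
```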
6. Feature Engineering:
Example: Principal Component Analysis (PCA) is widely used for
dimensionality reduction, but it is a linear method and can miss complex,
non-linear feature interactions in high-dimensional big data.
Alternative: Non-linear dimensionality reduction techniques such as t-
distributed Stochastic Neighbor Embedding (t-SNE) and UMAP (Uniform
Manifold Approximation and Projection) can capture structure that PCA
misses; t-SNE emphasises local neighbourhood structure, while UMAP also
tends to preserve more of the global structure.
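A small sketch with scikit-learn: because t-SNE is itself expensive on very large datasets, it is commonly run on a sample or on top of a PCA projection, as below; UMAP would be used similarly via the third-party umap-learn package.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)                 # 1,797 samples, 64 features

# Linear pre-reduction keeps t-SNE tractable; t-SNE then captures
# non-linear neighbourhood structure in two dimensions.
X_pca = PCA(n_components=30, random_state=0).fit_transform(X)
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X_pca)
print(X_2d.shape)                                    # (1797, 2)
```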
7. Data Privacy and Security:
Example: Logistic Regression is a popular algorithm, but it might not be
suitable for analysing sensitive data without proper privacy measures.
Alternative: Differential Privacy is a framework that adds carefully
calibrated noise to computations or their outputs (rather than releasing the
raw data) so that the contribution of any single individual cannot be
inferred, while still providing useful aggregate information. It can be
combined with many machine learning algorithms, including logistic
regression, to give formal privacy guarantees.
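A minimal sketch of the Laplace mechanism, the basic building block of differential privacy, applied to a mean; the clipping bounds, epsilon and data are illustrative assumptions, and libraries such as diffprivlib wrap the same idea around full models like logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)
incomes = rng.uniform(0, 1, size=10_000)     # hypothetical normalised sensitive values

def dp_mean(values, epsilon, lower=0.0, upper=1.0):
    clipped = np.clip(values, lower, upper)
    true_mean = clipped.mean()
    # Sensitivity of the mean of n values bounded in [lower, upper] is (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_mean + noise

print("non-private mean:", incomes.mean())
print("private mean (eps=0.5):", dp_mean(incomes, epsilon=0.5))
```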