Decision Stump Algorithm: Time and Space Complexity Analysis
Pardeep Singh
MSc Data science, University of Europe for applied science
AEBS_DC_MAVe_P2ML01: Machine Learning
Prof. Dr. Eman Abukhousa
1. Introduction
A Decision Stump is a simple, one-level decision tree used as a weak learner in machine
learning, particularly for binary classification. It predicts by comparing a single feature
against a single threshold.
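To make the decision rule concrete, here is a minimal sketch of how a fitted stump predicts. The function name `stump_predict` and the toy data are illustrative, not part of any particular library:

```python
import numpy as np

def stump_predict(X, feature, threshold, left_label, right_label):
    """Predict with a one-level decision tree: a single feature/threshold test."""
    # Each sample goes to the left or right leaf based on one comparison.
    return np.where(X[:, feature] <= threshold, left_label, right_label)

# Toy binary-classification data: 4 samples, 2 features.
X = np.array([[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]])
print(stump_predict(X, feature=0, threshold=2.5, left_label=0, right_label=1))
# → [0 0 1 1]
```

Prediction is a single comparison per sample, which is why all of the training cost discussed below lies in choosing the feature and threshold.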
2. Time Complexity Analysis
To understand the Decision Stump's efficiency, we look at its main steps: sorting feature
values, evaluating split thresholds, and calculating classification error. Each step depends on
the number of data points (n) and the number of features (d).
2.1 Sorting Feature Values
For each of the d features, the algorithm sorts its n values, which takes O(n log n) time.
Total time for sorting all features:
O(d · n log n)
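This step can be sketched in a few lines; `sort_features` is a hypothetical helper name used only for illustration:

```python
import numpy as np

def sort_features(X):
    """Sort each of the d feature columns independently.

    Each column sort costs O(n log n), so sorting all d columns
    costs O(d * n log n) in total.
    """
    # np.sort along axis 0 sorts every column of the (n, d) matrix.
    return np.sort(X, axis=0)

X = np.array([[3.0, 9.0], [1.0, 7.0], [2.0, 8.0]])
print(sort_features(X))
```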
2.2 Threshold Evaluation
After sorting, the algorithm checks possible thresholds between sorted values. This requires
O(n) time for each feature.
Total time for evaluating thresholds:
O(d · n)
2.3 Classification Error Calculation
The algorithm calculates how well each threshold classifies the data, which again
takes O(n) per feature.
Total time for error calculation:
O(d · n)
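The key to keeping threshold evaluation and error calculation at O(n) per feature is updating the error count incrementally as the sweep moves past each sorted point, rather than recounting from scratch. A sketch, assuming a stump that predicts 0 to the left of the threshold and 1 to the right (the function name `best_threshold` is illustrative):

```python
import numpy as np

def best_threshold(x, y):
    """Find the best split for one feature with an O(n) sweep after sorting.

    x: feature values (n,); y: binary labels in {0, 1}.
    The stump predicts 0 left of the threshold and 1 right of it.
    """
    order = np.argsort(x)                 # O(n log n), done once per feature
    x_sorted, y_sorted = x[order], y[order]
    n = len(y)
    # Threshold left of all points: everything is predicted 1,
    # so the errors are exactly the samples labelled 0.
    errors = np.sum(y_sorted == 0)
    best_err, best_t = errors, x_sorted[0] - 1.0
    for i in range(n - 1):
        # Move point i to the left side (now predicted 0): O(1) update.
        errors += 1 if y_sorted[i] == 1 else -1
        if x_sorted[i] != x_sorted[i + 1] and errors < best_err:
            best_err = errors
            best_t = (x_sorted[i] + x_sorted[i + 1]) / 2.0
    return best_t, best_err

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0, 0, 1, 1])
print(best_threshold(x, y))
# → (2.5, 0)
```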
2.4 Overall Time Complexity
Combining all steps, sorting dominates, so the total time complexity is:
O(d · n log n)
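The three steps above compose into a single training routine: for each of the d features, sort once and then sweep thresholds in one pass. The sketch below fixes one polarity (predict 0 left, 1 right) for brevity; `fit_stump` is an illustrative name, not a library function:

```python
import numpy as np

def fit_stump(X, y):
    """Fit a decision stump over all d features.

    Per feature: O(n log n) sort plus an O(n) threshold sweep,
    so total training cost is O(d * n log n).
    """
    n, d = X.shape
    best = (np.inf, 0, 0.0)              # (error, feature, threshold)
    for j in range(d):
        order = np.argsort(X[:, j])      # O(n log n) sort of one column
        xs, ys = X[order, j], y[order]
        errors = np.sum(ys == 0)         # threshold left of all points
        if errors < best[0]:
            best = (errors, j, xs[0] - 1.0)
        for i in range(n - 1):           # O(n) sweep with O(1) updates
            errors += 1 if ys[i] == 1 else -1
            if xs[i] != xs[i + 1] and errors < best[0]:
                best = (errors, j, (xs[i] + xs[i + 1]) / 2.0)
    return best

X = np.array([[1.0, 9.0], [2.0, 8.0], [3.0, 7.0], [4.0, 6.0]])
y = np.array([0, 0, 1, 1])
err, feature, threshold = fit_stump(X, y)
print(err, feature, threshold)
# → 0 0 2.5
```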
3. Space Complexity Analysis
The space needed for the algorithm can be summarized as follows:
3.1 Data Storage
The algorithm needs to store n data points with d features, requiring:
O(n · d)
3.2 Predictions and Thresholds
Space for storing predictions: O(n)
Space for storing thresholds: O(d)
3.3 Overall Space Complexity
Combining these, the overall space complexity is:
O(n · d)
4. Complexity Comparison with Other Algorithms
4.1 Decision Trees
Decision Trees with multiple levels have higher complexity. If a tree has depth k, the
complexity can be:
O(n · d · k)
This makes them much more computationally intensive than Decision Stumps.
4.2 Logistic Regression
Logistic Regression requires iterative optimization, generally needing O(n · d) time per
iteration, with multiple iterations for convergence. This can result in a higher overall
complexity compared to Decision Stumps.
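To see where the per-iteration O(n · d) cost comes from, here is a sketch of one gradient-descent step for logistic regression: each matrix-vector product touches every entry of the (n, d) data matrix once. The function name and the synthetic data are illustrative assumptions:

```python
import numpy as np

def gradient_step(w, X, y, lr=0.1):
    """One gradient-descent iteration for logistic regression.

    Both products below touch every entry of X once,
    so each iteration costs O(n * d).
    """
    p = 1.0 / (1.0 + np.exp(-X @ w))    # predictions: O(n * d)
    grad = X.T @ (p - y) / len(y)       # gradient:    O(n * d)
    return w - lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = (X[:, 0] > 0).astype(float)         # label depends on feature 0
w = np.zeros(3)
for _ in range(100):                    # many iterations until convergence
    w = gradient_step(w, X, y)
print(w)
```

Training runs many such iterations, whereas the stump makes a single pass of O(d · n log n) work.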
5. Example Complexity Analysis
Consider a dataset with 10 data points and 3 features:
1. Sorting each feature takes about 10 log2 10 ≈ 33 operations, totalling about 99 across all
three features.
2. Threshold Evaluation takes about 30 operations (10 per feature).
3. Classification Error Calculation also takes about 30 operations.
Thus, the total work for this small dataset is approximately:
159 operations
This shows that Decision Stumps handle small datasets efficiently.
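The arithmetic above can be checked in a few lines (rounding the per-feature sort cost to a whole number of operations, as in the example):

```python
import math

n, d = 10, 3
per_feature_sort = round(n * math.log2(n))   # 10 * log2(10) ≈ 33
sorting = d * per_feature_sort               # 99 across all features
thresholds = d * n                           # 30
errors = d * n                               # 30
total = sorting + thresholds + errors
print(sorting, thresholds, errors, total)
# → 99 30 30 159
```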