If you are preparing for a Data Scientist / ML Engineer role, study this list of the most
frequently asked questions on classical machine learning.
✅ Regression & Classification
1. What’s the difference between linear and logistic regression?
2. How do you interpret coefficients in linear regression?
3. What assumptions does linear regression make?
4. What is regularization, and what is the difference between L1 and L2?
5. How do you handle multicollinearity?
6. What metrics would you use to evaluate a classification model?
7. Explain ROC curve and AUC.
✅ Trees & Ensembles
8. How does a decision tree decide where to split?
9. What’s the difference between Gini impurity and entropy?
10. Why do decision trees overfit?
11. How does Random Forest reduce overfitting?
12. How does boosting work (e.g. AdaBoost, XGBoost)?
13. What are the differences between bagging and boosting?
✅ Model Evaluation & Validation
14. What is the bias-variance trade-off?
15. Explain k-fold cross-validation.
16. How do you handle imbalanced classes?
17. What is the difference between precision and recall?
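For the precision and recall question, it can help to have a tiny worked example ready. The sketch below is only an illustration: the function name `precision_recall` and the toy labels are assumptions made up for this example, not part of any library.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for one positive class (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    # Precision: of everything predicted positive, how much was really positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything really positive, how much did the model find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (made up): 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

Here TP = 2, FP = 1, FN = 2, so precision is 2/3 while recall is 1/2: the model is fairly trustworthy when it says "positive" but misses half of the real positives.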
✅ Clustering & Unsupervised Learning
18. How does k-means clustering work?
19. How do you choose the value of k in k-means?
20. Explain PCA and how it helps in ML.
21. What’s the difference between PCA and LDA?
22. When would you use hierarchical clustering?
✅ Feature Engineering & Data Preparation
23. How do you handle missing data?
24. What’s feature scaling and why is it important?
25. Explain one-hot vs label encoding.
26. How do you detect outliers?
✅ General ML Knowledge
27. What is overfitting? How do you prevent it?
28. Explain the curse of dimensionality.
29. What’s the difference between parametric and non-parametric models?
30. How do you select the right model for your data?
Don’t just memorize definitions. Practice answering these questions with mathematical
intuition. That’s what impresses interviewers the most.
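As one possible sketch of what that could look like for the Gini vs entropy question, the snippet below computes both impurity measures from class probabilities using their standard definitions. The function names `gini` and `entropy` are chosen for this illustration only.

```python
import math


def gini(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)


def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping zero-probability classes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Compare a perfectly mixed node with a nearly pure one (two classes).
for probs in ([0.5, 0.5], [0.9, 0.1]):
    print(probs, round(gini(probs), 3), round(entropy(probs), 3))
```

Both measures peak when the classes are evenly mixed (0.5 for Gini, 1 bit for entropy with two classes) and shrink as a node gets purer, which is why a tree split that lowers either one is considered a good split.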
Feel free to check out my ML articles, specially curated for interview preparation. [Link in
comment]