Machine Learning Techniques
Assignment-3
Aim – Perform exploratory data analysis using statistical and visualization techniques.
Theory –
Q.1 What is Exploratory Data Analysis (EDA), and why is it an essential step in the
CRISP-ML process?
Q.2 Differentiate between univariate, bivariate, and multivariate EDA. Give an example
technique for each.
Q.3 How can visualization techniques (histograms, scatter plots, box plots, heatmaps) help in
understanding dataset structure and feature relationships? Give one example.
Q.4 Explain how correlation analysis helps in EDA. What are the limitations of correlation
when exploring feature relationships?
Q.5 You generate a correlation heatmap and find that two features (`Height` and `Weight`)
have a correlation of 0.95. What does this tell you, and how should it influence your next
steps?
Q.6 Why is it important to use both statistical measures (mean, variance, skewness) and
visualizations (histograms, scatter plots, pair plots) in EDA?
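As a starting point for Q.4 to Q.6, the sketch below computes the summary statistics named in Q.6 and a Pearson correlation like the one described in Q.5, using only the Python standard library. The data values and the helper names `summarize` and `pearson` are made up for illustration and are not part of the assignment; in practice a library such as pandas would normally be used. The last part illustrates one limitation asked about in Q.4: Pearson correlation measures only linear association, so a perfectly deterministic but non-linear relationship can still score zero.

```python
import math


def summarize(values):
    """Return mean, sample variance, and (population-moment) skewness."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance uses the n - 1 denominator.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    # Skewness from the second and third central moments (n denominator).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "variance": variance, "skewness": skewness}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data, invented purely for illustration.
height_cm = [150, 155, 160, 165, 170, 175, 180, 185]
weight_kg = [50, 53, 59, 62, 68, 70, 77, 80]

height_stats = summarize(height_cm)
r_height_weight = pearson(height_cm, weight_kg)
print("Height summary:", height_stats)
print("Height/Weight correlation:", round(r_height_weight, 3))

# Limitation: y depends entirely on x, yet the relationship is not linear,
# so the Pearson coefficient does not detect it.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_quadratic = pearson(xs, ys)
print("Correlation of x and x^2:", r_quadratic)
```

A strong Height/Weight correlation such as the one in Q.5 suggests the two features carry largely redundant information. The quadratic case shows why a scatter plot should accompany the coefficient.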
Reference Study Material –
Web References:
1. Exploratory Data Analysis in Pandas | Python Pandas Tutorials
https://www.youtube.com/watch?v=Liv6eeb1VfE
2. Exploratory Data Analysis with Pandas Python
https://www.youtube.com/watch?v=xi0vhXFPegw
3. Complete Exploratory Data Analysis And Feature Engineering In 3 Hours| Krish Naik
https://www.youtube.com/watch?v=fHFOANOHwh8