Module 2

The document outlines a comprehensive curriculum for a Data Science module focusing on Exploratory Data Analysis, Data Visualization, Linear Algebra, Probability, and Statistics. It includes detailed topics such as various plotting techniques, probability distributions, hypothesis testing, dimensionality reduction methods like PCA and t-SNE, and interview questions for each section. Each topic is accompanied by specific time allocations for instructional content, indicating a structured approach to learning data science concepts.

Uploaded by

97arshukla

Module-2: Data Science: Exploratory Data Analysis and Data Visualization

Plotting for exploratory data analysis (EDA)


• Introduction to IRIS dataset and 2D scatter plot 26 mins
• 3D scatter plot 6 mins
• Pair plots 14 mins
• Limitations of Pair Plots 2 mins
• Histogram and Introduction to PDF (Probability Density Function) 17 mins
• Univariate Analysis using PDF 6 mins
• CDF (Cumulative Distribution Function) 15 mins
• Mean, Variance and Standard Deviation 17 mins
• Median 10 mins
• Percentiles and Quantiles 9 mins
• IQR (Interquartile Range) and MAD (Median Absolute Deviation) 6 mins
• Box-plot with Whiskers 9 mins
• Violin Plots 4 mins
• Summarizing Plots, Univariate, Bivariate and Multivariate analysis 6 mins
• Multivariate Probability Density, Contour Plot 9 mins
• Assignment-1: Data Visualization with Haberman Dataset 4 mins
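As a small illustration of the summary statistics listed above (mean, median, quartiles, IQR, and MAD), the sketch below uses only Python's standard `statistics` module. The helper names `iqr` and `mad` and the sample values are hypothetical and are not taken from the course material; the quartiles use the module's default "exclusive" method, which is one of several conventions.

```python
import statistics

def iqr(values):
    """Interquartile range: Q3 - Q1, using the default 'exclusive' quantile method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def mad(values):
    """Median absolute deviation: median of |x - median(values)|."""
    med = statistics.median(values)
    return statistics.median(abs(x - med) for x in values)

# Hypothetical sample with one extreme value, to contrast mean and median.
sample = [1, 2, 3, 4, 100]
print("mean  :", statistics.mean(sample))
print("median:", statistics.median(sample))
print("IQR   :", iqr(sample))
print("MAD   :", mad(sample))
```

The single large value pulls the mean far from the bulk of the data, while the median and MAD barely move, which is the usual motivation for the robust statistics in this list.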
Linear Algebra
• Why learn it? 4 mins
• Introduction to Vectors (2-D, 3-D, n-D), Row Vector and Column Vector 14 mins
• Dot Product and Angle between 2 Vectors 14 mins
• Projection and Unit Vector 5 mins
• Equation of a line, Plane and Hyperplane 23 mins
• Distance of a point from a Plane/Hyperplane, Half-Spaces 10 mins
• Equation of a Circle (2-D), Sphere (3-D) and Hypersphere (n-D) 7 mins
• Equation of an Ellipse (2-D), Ellipsoid (3-D) and Hyperellipsoid (n-D) 6 mins
• Square, Rectangle 6 mins
• Hypercube, Hypercuboid 3 mins
• Revision Questions 30 mins
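The vector topics above (dot product, angle between two vectors, projection, and the distance of a point from a hyperplane) can be sketched in a few lines of plain Python. The function names below are hypothetical, and the hyperplane is assumed to be written as w·x + b = 0, which is one common convention and may differ from the notation used in the lectures.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle in radians, from cos(theta) = u.v / (|u| |v|)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def projection_length(u, v):
    """Signed length of the projection of u onto the direction of v."""
    return dot(u, v) / norm(v)

def distance_to_hyperplane(w, b, x):
    """Distance from point x to the hyperplane w.x + b = 0 (assumed form)."""
    return abs(dot(w, x) + b) / norm(w)

print(angle_between([1, 0], [0, 1]))               # right angle, in radians
print(projection_length([3, 4], [1, 0]))           # component of (3, 4) along the x-axis
print(distance_to_hyperplane([3, 4], -5, [0, 0]))  # origin to the line 3x + 4y - 5 = 0
```

The same functions work unchanged for n-dimensional inputs, since nothing in them depends on the vector length.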
Probability and Statistics
• Introduction to Probability and Statistics 17 mins
• Population and Sample 7 mins
• Gaussian/Normal Distribution and its PDF (Probability Density Function) 27 mins
• CDF (Cumulative Distribution Function) of Gaussian/Normal distribution 11 mins
• Symmetric distribution, Skewness and Kurtosis 25 mins
• Standard normal variate (Z) and standardization 6 mins
• Kernel density estimation 7 mins
• Sampling distribution & Central Limit Theorem 19 mins
• Q-Q plot: How to test if a random variable is normally distributed or not? 23 mins
• How distributions are used? 17 mins
• Chebyshev’s inequality 20 mins
• Discrete and Continuous Uniform distributions 13 mins
• How to randomly sample data points (Uniform Distribution) 10 mins
• Bernoulli and Binomial Distribution 11 mins
• Log-normal Distribution 12 mins
• Power law distribution 12 mins
• Box-Cox transform 12 mins
• Applications of non-Gaussian distributions 26 mins
• Covariance 14 mins
• Pearson Correlation Coefficient 13 mins
• Spearman Rank Correlation Coefficient 7 mins
• Correlation vs Causation 5 mins
• How to use correlations? 13 mins
• Confidence interval (C.I) Introduction 8 mins
• Computing confidence interval given the underlying distribution 11 mins
• C.I for mean of a random variable 14 mins
• Confidence interval using bootstrapping 18 mins
• Hypothesis testing methodology, Null-hypothesis, p-value 16 mins
• Hypothesis Testing Intuition with coin toss example 27 mins
• Resampling and permutation test 15 mins
• K-S Test for similarity of two distributions 15 mins
• Code Snippet K-S Test 6 mins
• Hypothesis testing: another example 18 mins
• Resampling and Permutation test: another example 19 mins
• How to use hypothesis testing? 23 mins
• Proportional Sampling 18 mins
• Revision Questions 30 mins
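To make the "K-S Test for similarity of two distributions" topic concrete, here is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic: the largest gap between the two empirical CDFs. It is not the course's own code snippet; the names `ecdf` and `ks_statistic` are hypothetical, and the sketch computes only the statistic, not a p-value.

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    """Empirical CDF at x: the fraction of sample values <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def ks_statistic(a, b):
    """Two-sample K-S statistic: max |ECDF_a(x) - ECDF_b(x)|.

    The gap between two step functions can only change at observed
    values, so it is enough to evaluate it at every point of a and b.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    return max(
        abs(ecdf(a_sorted, x) - ecdf(b_sorted, x))
        for x in a_sorted + b_sorted
    )

print(ks_statistic([1, 2, 3], [1, 2, 3]))        # identical samples
print(ks_statistic([1, 2, 3], [4, 5, 6]))        # fully separated samples
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # partially overlapping samples
```

A value near 0 means the empirical distributions nearly coincide, and a value near 1 means they barely overlap. Turning the statistic into a p-value requires the K-S distribution, which a statistics library would normally supply.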
Interview Questions on Probability and statistics
• Questions & Answer 30 mins
Dimensionality reduction and Visualization
• What is Dimensionality reduction? 3 mins
• Row Vector and Column Vector 5 mins
• How to represent a data set? 4 mins
• How to represent a dataset as a Matrix. 7 mins
• Data Preprocessing: Feature Normalisation 20 mins
• Mean of a data matrix 6 mins
• Data Preprocessing: Column Standardization 16 mins
• Covariance of a Data Matrix 24 mins
• MNIST dataset (784 dimensional) 20 mins
• Code to Load MNIST Data Set 12 mins
PCA (Principal Component Analysis)
• Why learn PCA? 4 mins
• Geometric intuition of PCA 14 mins
• Mathematical objective function of PCA 13 mins
• Alternative formulation of PCA: Distance minimization 10 mins
• Eigenvalues and Eigenvectors (PCA): Dimensionality reduction 23 mins
• PCA for Dimensionality Reduction and Visualization 10 mins
• Visualize MNIST dataset 5 mins
• Limitations of PCA 5 mins
• PCA Code example 19 mins
• PCA for dimensionality reduction (not-visualization) 15 mins
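The PCA topics above (column centering, the covariance of a data matrix, and its eigenvectors) can be tied together in a dependency-free sketch. This is not the course's PCA code example: the function names are hypothetical, the covariance uses a 1/n normalization as an assumption, and the top eigenvector is found by power iteration rather than a full eigendecomposition, which a numerical library would normally perform.

```python
import math

def column_center(X):
    """Subtract each column's mean so every feature has mean zero."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def covariance_matrix(Xc):
    """Covariance of centered data, S = (1/n) * Xc^T Xc (1/n is an assumption)."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / n for j in range(d)] for i in range(d)]

def top_principal_component(X, iterations=100):
    """Approximate the direction of maximum variance by power iteration.

    Starts from a uniform unit vector; this fails only if that start
    is exactly orthogonal to the top eigenvector.
    """
    S = covariance_matrix(column_center(X))
    d = len(S)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(S[i][j] * v[j] for j in range(d)) for i in range(d)]
        length = math.sqrt(sum(c * c for c in w))
        if length == 0.0:  # all-zero covariance: no direction to find
            break
        v = [c / length for c in w]
    return v

# Hypothetical data lying exactly on the line y = 2x.
X = [[0, 0], [1, 2], [2, 4], [3, 6]]
print(top_principal_component(X))  # unit vector along (1, 2), up to sign
```

Projecting each centered row onto the returned vector gives a one-dimensional representation of the data, which is the reduction step the PCA topics describe.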
t-SNE (t-distributed Stochastic Neighbor Embedding)
• What is t-SNE? 7 mins
• Neighborhood of a point, Embedding 7 mins
• Geometric intuition of t-SNE 9 mins
• Crowding Problem 8 mins
• How to apply t-SNE and interpret its output 38 mins
• t-SNE on MNIST 7 mins
• Code example of t-SNE 9 mins
• Revision Questions 30 mins
Interview Questions on Dimensionality Reduction
• Questions & Answers 30 mins
