Unit 1: Data Modelling through Machine Learning Techniques
Topic 1: Introduction to Data Modelling
Data modelling is the process of creating a data structure that supports efficient storage, retrieval, and use of
information in a machine learning environment. It includes defining input-output relationships, determining
features, and understanding patterns within data...
Topic 2: Types of Machine Learning
Machine learning can be categorized into Supervised Learning, Unsupervised Learning, and Reinforcement
Learning. Supervised learning uses labeled data, unsupervised learning discovers hidden patterns, and
reinforcement learning focuses on reward-based learning...
Topic 3: Exploratory Data Analysis (EDA)
EDA involves summarizing the main characteristics of data using visual and statistical techniques. It helps
detect anomalies, understand data distributions, and prepare for modelling. Techniques include histograms,
boxplots, and correlation matrices...
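As a rough illustration of what a boxplot and a histogram compute, the sketch below flags points outside the usual 1.5 x IQR fences and counts values per fixed-width bin. The `readings` sample, the function names, and the bin width are assumptions made for this example, and the quartiles follow the default method of Python's `statistics.quantiles`, which can differ slightly from other tools.

```python
import statistics

def iqr_outliers(values):
    """Return values outside the 1.5 * IQR fences that a standard boxplot draws."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

def text_histogram(values, bin_width):
    """Count values per fixed-width bin, keyed by each bin's lower edge."""
    counts = {}
    for v in values:
        edge = (v // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical sample with one suspicious reading (95).
readings = [12, 15, 14, 10, 13, 14, 16, 15, 13, 95]

print("Outliers:", iqr_outliers(readings))
for edge, count in text_histogram(readings, 5).items():
    # Labels assume integer data, so a bin of width 5 covers edge..edge+4.
    print(f"{edge:>3}-{edge + 4:<3} {'#' * count}")
```

Both views point at the same anomaly: the histogram shows an isolated bin far from the rest, and the IQR rule names the value that causes it.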
Topic 4: Descriptive Statistics
Descriptive statistics summarize data using measures of central tendency (mean, median, mode) and
dispersion (range, variance, standard deviation). They also include measures of shape such as skewness and
kurtosis...
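The measures listed above can be computed in a few lines. This is a minimal sketch that assumes population (not sample) formulas and moment-based skewness and excess kurtosis; the `describe` function and the `scores` sample are made up for illustration.

```python
import statistics

def describe(values):
    """Summarize a numeric sample with population-based descriptive statistics."""
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values)  # population variance (divides by n)
    std_dev = variance ** 0.5
    # Third and fourth central moments drive the shape measures.
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": std_dev,
        "skewness": m3 / std_dev ** 3,             # > 0 means a longer right tail
        "excess_kurtosis": m4 / std_dev ** 4 - 3,  # 0 for a normal distribution
    }

# Hypothetical sample.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in describe(scores).items():
    print(f"{name:>16}: {value}")
```

Statistical packages often report sample variance (dividing by n - 1) and bias-corrected shape measures, so their numbers may differ slightly from this sketch.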
Topic 5: Understanding Relationships Between Variables
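This topic covers techniques for studying how variables are associated: correlation (Pearson for linear
relationships, Spearman for monotonic, rank-based ones), the Chi-square test for categorical variables, and
t-tests and F-tests for comparing group means and variances. These techniques support feature selection and
hypothesis testing...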
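To make the difference between the two correlation coefficients concrete, the sketch below implements both from their definitions: Pearson on the raw values, and Spearman as Pearson applied to average ranks. The helper names and the sample data are assumptions for this example, not part of any particular library.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print("Pearson: ", round(pearson(x, y), 4))
print("Spearman:", round(spearman(x, y), 4))
```

Because `y` always increases with `x`, the rank-based coefficient reaches 1, while the Pearson coefficient stays slightly below 1 since the relationship is curved rather than linear.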
Topic 6: Non-Parametric Tests
Non-parametric tests are used when the data cannot be assumed to follow a normal distribution. Examples
include the Wilcoxon, Mann-Whitney, sign, Kruskal-Wallis, and Friedman tests...
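Of the tests listed, the sign test is the simplest to work through by hand. The sketch below is an illustrative two-sided exact version for paired data: it drops zero differences, counts positive and negative signs, and computes the p-value from a Binomial(n, 0.5) distribution. The function name and the before/after measurements are hypothetical.

```python
from math import comb

def sign_test(before, after):
    """Two-sided exact sign test for paired samples.

    Returns (positives, negatives, p_value). Pairs with no change are dropped.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    negatives = n - positives
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)

# Hypothetical paired measurements (for example, a score before and after a change).
before = [85, 90, 78, 92, 88, 76, 95, 89]
after = [80, 86, 79, 85, 84, 70, 90, 89]

pos, neg, p_value = sign_test(before, after)
print(f"positives={pos}, negatives={neg}, p-value={p_value}")
```

The test uses only the direction of each change, not its size, which is why it makes no assumption about the shape of the distribution and also why it has less power than tests that use magnitudes.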
Topic 7: Data Transformation
Transforming data can improve model performance and stability. Techniques include standardization,
normalization, log transformation, encoding, and binning...
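A minimal sketch of four of these transformations follows, assuming min-max scaling for normalization, population z-scores for standardization, `log1p` for the log transformation, and one-hot encoding for categorical labels. The function names and sample values are illustrative only.

```python
import math
import statistics

def min_max(values):
    """Normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardization: shift to mean 0 and scale to standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

def log_transform(values):
    """Compress large values; log1p(v) = log(1 + v) is defined at v = 0."""
    return [math.log1p(v) for v in values]

def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

# Hypothetical numeric and categorical columns.
incomes = [10, 20, 30, 40, 50]
colours = ["red", "blue", "red"]

print(min_max(incomes))
print([round(z, 3) for z in z_score(incomes)])
print([round(v, 3) for v in log_transform(incomes)])
print(one_hot(colours))
```

Both scaling functions divide by a spread measure, so a constant column would raise a division-by-zero error here; real pipelines usually drop or special-case such columns.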
Topic 8: Univariate and Multivariate Analysis
Univariate analysis explores a single variable. Multivariate analysis looks at multiple variables and their
interrelationships using techniques such as regression, correlation, PCA, and MANOVA...
Topic 9: Feature Engineering and Selection
Feature engineering involves creating new features from raw data. Feature selection removes irrelevant or
redundant features using filter, wrapper, or embedded methods...
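As one possible illustration of a filter method, the sketch below scores each feature by the absolute value of its Pearson correlation with the target and keeps the top k. The scoring rule, the function names, and the toy dataset are assumptions chosen for this example; wrapper and embedded methods would instead involve training a model.

```python
import math

def abs_correlation(feature, target):
    """Absolute Pearson correlation; 0.0 if either column has no variation."""
    mf = sum(feature) / len(feature)
    mt = sum(target) / len(target)
    cov = sum((f - mf) * (t - mt) for f, t in zip(feature, target))
    spread = math.sqrt(
        sum((f - mf) ** 2 for f in feature) * sum((t - mt) ** 2 for t in target)
    )
    return abs(cov / spread) if spread else 0.0

def select_top_k(features, target, k):
    """Filter method: rank features by score, independent of any model."""
    scores = {name: abs_correlation(col, target) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical dataset: one informative, one weakly related, one constant feature.
features = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "shoe_size": [5, 3, 6, 2, 4, 3],
    "constant": [1, 1, 1, 1, 1, 1],
}
exam_score = [2, 4, 6, 8, 10, 12]

print(select_top_k(features, exam_score, k=1))
```

A correlation filter only sees linear, one-feature-at-a-time relationships, so it can miss features that matter only in combination and does not detect redundancy between the features it keeps.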