Data Science Using Python - Course Curriculum

DATA SCIENCE USING PYTHON

Duration: 30 Hours

Course Overview:
This course provides students with a foundational understanding of data science concepts,
techniques, and tools. It covers Python programming, data manipulation, exploratory data analysis,
basic statistics, predictive modeling, and machine learning algorithms. Using Python and its
libraries, students will learn how to analyze data, build predictive models, and apply machine
learning techniques to solve real-world business problems.

Course Objectives
Upon completion of this course, students will be able to:
1. Understand the fundamentals of Data Science, Machine Learning, and Artificial Intelligence.
2. Explore the role of Python in Data Science and learn essential Python programming concepts.
3. Master data manipulation, cleansing, and munging using Python modules such as Pandas and
NumPy.
4. Perform exploratory data analysis, including data visualization and statistical analysis.
5. Gain an understanding of core statistical concepts and their applications in data analysis.
6. Learn and implement predictive modeling techniques such as regression, classification, clustering,
and neural networks.
7. Work with various machine learning algorithms, including linear regression, logistic regression,
KNN, decision trees, random forests, and SVM.
8. Apply deep learning techniques using artificial neural networks and TensorFlow.
9. Develop proficiency in handling real-world business problems through case studies and projects.
10. Build, validate, and optimize machine learning models for different types of data-driven
applications.

Course Agenda:

DATA SCIENCE - INTRODUCTION

• What is Data Science
• What is Machine Learning
• Machine Learning vs. Data Science vs. AI
• How leading companies are harnessing the power of Data Science with Python
• Different phases of a typical Analytics/Data Science project and the role of Python
• Anaconda vs. Python
• Machine Learning flow to code
• Regression vs. Classification
• Features, Labels, Class
• Supervised, Semi-Supervised, and Unsupervised Learning
• Cost Function and Optimizers
• Discussion on Various ML Learnings

PYTHON ESSENTIALS (CORE)

• Introduction to installation of Anaconda
Decision Tree Solutions
#37/1997, Infra Futura Building, Opp Bharata Mata College, Seaport Airport Road, Kakkanad, Kochi
Kerala - 682021. INDIA
• Introduction to Python Editors & IDEs (Anaconda, Spyder, Jupyter, etc.)
• Understand Jupyter notebook & Customize Settings
• Overview of Python - Starting with Python
• Python data types: Primitive
• Core built-in data structures – Lists, Tuples, Dictionaries
• Strings and built-in string methods
• Data manipulation tools (operators, functions, packages, control structures, loops, arrays, etc.)
• Python user-defined functions (UDFs) – the def keyword
• Reading and writing data
• Control flow & conditional statements
• Debugging & Code profiling
• Creating classes and modules, and how to call them
• Installing & loading Packages & Namespaces
• Scientific libraries used in Python for Data Science - NumPy, SciPy, pandas, scikit-learn,
statsmodels, etc.

ACCESSING/IMPORTING AND EXPORTING DATA USING PYTHON MODULES

• Concept of Packages/Libraries - Important packages (NumPy, SciPy, Pandas)
• Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
• Exporting Data to various formats

DATA MANIPULATION – CLEANSING – MUNGING USING PYTHON MODULES

• Cleansing Data with Pandas
• Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting,
derived variables, sampling, Data type conversions, renaming, formatting, etc.)
• Stripping out extraneous information
• Scaling and Normalizing data
• Pre-processing and Formatting data
• Feature selection – RFE, Correlation, etc.

DATA ANALYSIS – VISUALIZATION USING PYTHON

• Introduction to exploratory data analysis
• Descriptive statistics, Frequency Tables and summarization
• Univariate Analysis (Distribution of data & Graphical Analysis)
• Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
• Creating graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)

BASIC STATISTICS & IMPLEMENTATION OF STATS METHODS IN PYTHON

• Basic Statistics - Measures of Central Tendency and Variance
• Building blocks - Probability Distributions - Normal distribution
• Central Tendency
• Standard Deviation
• Quartiles, IQR, Boxplot, Outliers
• Skewness, Kurtosis
• ANOVA

PYTHON: DATA SCIENCE - PREDICTIVE MODELING – BASICS

• Introduction to Machine Learning & Predictive Modeling
• Mapping of Techniques - Regression vs. Classification vs. Segmentation vs. Forecasting
• Major Classes of Learning Algorithms - Supervised vs. Unsupervised Learning
• Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building,
Validation)
• Overfitting (Bias-Variance Trade-off) & Performance Metrics
• Feature engineering & dimension reduction
• Concept of optimization & cost function
• Concept of Cross-validation (Bootstrapping, K-Fold validation, etc.)

ML ALGORITHMS & IMPLEMENTATION FOR BUSINESS ANALYTICS
REGRESSION
LINEAR REGRESSION

• Regression Problem Analysis
• Mathematical modelling of Regression Model
• Gradient Descent Algorithm
• Model Specification
• L1 & L2 Regularization
• Optimizers
• Polynomial Linear Regression
• Parameters & Hyperparameters
• Cost Function & Cost Optimizer: Gradient Descent Algorithm
• R-Squared & Adjusted R-Squared
• Building simple Univariate Linear Regression Model
• Multivariate Regression Model
• Validate the Model Performance
• Model Predictions, Model Accuracy, Graphical Plotting

CLASSIFICATION
LOGISTIC REGRESSION

• Assumptions
• Reason for the Logit Transform
• Logit Transformation
• Hypothesis
• Odds Ratio and Interpretation
• Model Specification
• Case for Prediction Probe
• Model Parameter Significance Evaluation
• Model Optimization of threshold value
• ROC Curve, Classification report
KNN

• K-Nearest Neighbors – Introduction
• How does the KNN algorithm work?
• Methods of calculating distance between points
• How to choose the K?
• KNN Practical Approach
• Lab: Classification of medical data analysis
ENSEMBLE LEARNING
RANDOM FOREST & DECISION TREE ALGORITHM:

• Concept and Working Principle
• Mathematical Modelling
• Optimization Function Formation
• Analysis of Classification Problem case
• Math: Role of Entropy, Gini Index and Information Gain in Decision Trees
• Analysis of Regression Problem case
• Use Cases & Programming using Python
• Decision Trees – ID3
• Overfitting and Pruning
• Ensemble Learning
• Bagging and Boosting
• Classification and Regression with Random Forest
• Project: Credit card fraud analysis

BOOSTING

• Gradient Boosting
• XGBOOST Algorithm
• Classification & regression
• Use cases & examples

UNSUPERVISED LEARNING
CLUSTERING – K-MEANS AND HIERARCHICAL

• Unsupervised Learning
• Clustering Introduction
• K-Means Clustering
• Handling K-Means Clustering
• Maths behind K-Means Clustering – Centroids
• Mean Shift – Introduction
• Elbow Method – Picking K in K-Means
• Hierarchical Clustering
• Types – Agglomerative and Divisive
• Dendrogram
NEURAL NETWORKS
Artificial Neural Networks

• Perceptron
• Logic gates
• ANN & Working
• Single Layer Perceptron Model
• Multilayer Neural Network
• Feed Forward Neural Network
• Cost Function Formation
• Activation Function
• Cost Function Optimization
• Applying Gradient Descent Algorithm
• Stochastic Gradient Descent
• Backpropagation Algorithm & Mathematical Modelling
• Programming Flow for backpropagation algorithm
• Use Cases of ANN
• Programming Single Layer Neural Networks using Python
• Programming MLNN using Python
• XOR Logic using MLNN & Backpropagation
• Score Predictor
TENSORFLOW

• TensorFlow library for AI
• Keras – High-Level TensorFlow API
• Getting started with TensorFlow
• Installation & Setting up TensorFlow
• TensorFlow Data Structures
• TensorBoard - Graph Visualization

Support Vector Machine (SVM)

• Concept and Working Principle
• Mathematical Modelling
• Linear Support Vector Machine
• Hyperplanes
• Optimal separating hyperplane
• Drawing Margins
• Optimization Function Formation
• The Kernel Method and Nonlinear Hyperplanes
• Use Cases & Programming SVM using Python
• Anomaly Detection with SVM
• Use Cases & Programming using Python
• Case study of KNN vs. SVM
• Applying KNN & SVM to supervised classification problems & unsupervised problems
like clustering

ANN - Regression & Classification concepts

• Introduction to Multilayer Neural Network
• Concept of Deep neural networks
• Multi-layer perceptron
• Neural network hyperparameters
• Backpropagation, Forward propagation, overfitting, hyperparameters
• Training of neural networks
• The various techniques used in training of artificial neural networks
• Gradient descent rule
• Perceptron learning rule
• Tuning learning rate
• Stochastic process
• Optimization techniques
• Regularization techniques – L1 & L2
Regression - Linear Regression with Neuron

• Learning Algorithm
• Linear Regression – Theory
• Feature selection - Correlation
• XOR – XNOR Gates
• Input Matrix & Output Labels
• Activation Function
• Training a single perceptron
• Model Optimizers - Parameters and Hyperparameters
• Multiple Linear Regression
• Overfitting - Regularization – L1 & L2
Classification – Logistic Regression with Neuron

• Logistic Regression – Theory
• Classification Problems
• Training the model
• Hypothesis, Parameters & Hyperparameters, Cost Function, Model Optimization –
Optimizers

PROJECT - CONSOLIDATE LEARNINGS

• Applying different algorithms to solve business problems and benchmark the results
• Analytics on Classification, Regression, Clustering, Artificial Intelligence, etc.
