Data Science Program
No. | Hours | Form | Topic | Scope
1 | 3 | stationary | Data Scientist - introduction
- Introduction to Data Science: basic concepts, working environment, general knowledge about Data Science
2 | 7 | stationary | System tools and GIT version control system
- video GIT - video
- working with the system console
- vim editor
- practical exercises
- git version control system
3 | 28 | stationary | Python in Data Science
- Python Basics
- Python object-oriented programming
- Python technology including Google Colab
- Intermediate Python
4 | 14 | stationary | Statistics and probability
I. Elements of linear algebra - using NumPy
- scalars, vectors, matrices, tensors
- matrix and vector operations (multiplication, addition)
II. Probability and information theory
- probability
- random variables
- probability distributions
- conditional probability and Bayes' theorem
III. Descriptive statistics
IV. Statistical tests
5 | 14 | stationary | Processing of data sets
- data frames - pandas library
- working with data, basic operations
- feature engineering
- pandas profiling
- working with files - serialization
- data download from API
- SQL language and databases
- database programming
6 | 14 | stationary | Data visualisation
- data visualization
- matplotlib and seaborn
- ggplot and plotly
- descriptive statistics
7 | 14 | stationary | Machine Learning in practice: regression problems in supervised learning
- what is machine learning
- the ability of machines to learn
- supervised learning
- unsupervised learning
- incremental learning
- the main problems of machine learning
- supervised learning - regression problems
  a. linear regression
  b. least squares method
  c. gradient descent
  d. polynomial regression
  e. decision tree regression
  f. model evaluation methods (ROC curve, cross-validation)
8 | 7 | | Practical project (regression)
- data loading and processing
- EDA
- problem analysis
- presentation of models solving the problem
- creating a basic model
- comparison of several approaches and drawing conclusions
9 | 21 | stationary/remote | Machine Learning in practice: classification problems in supervised learning
- logistic regression
- binary classification
- multi-class classification
- k-nearest neighbors
- support vector machines
- decision trees
- Naive Bayes
- model evaluation methods (ROC curve, cross-validation)
- xgboost
10 | 7 | | Practical project (classification)
- data loading and processing
- EDA
- problem analysis
- presentation of models solving the problem
- creating a basic model
- compare several approaches and draw conclusions
11 | 14 | stationary | Machine Learning in practice: unsupervised learning
- curse of dimensionality, PCA decomposition
- data clustering - k-means algorithm
- image segmentation as a clustering problem
- data clustering - DBSCAN algorithm
- Gaussian mixture model
12 | 7 | stationary | TensorFlow Library
- computational graphs
- creating simple models
- saving and loading models
- visualization of a computational graph
- practical project
- tf.data API
13 | 7 | stationary | Introduction to artificial neural networks
- biological neuron
- artificial neuron - perceptron
- training neural networks
- introduction to deep learning
- deep neural networks
- neural network hyperparameters
14 | 7 | stationary | Training deep neural networks
- problems of training deep neural networks
- vanishing/exploding gradient problem
- weight initialization
- optimizers
- regularization methods
15 | 14 | stationary | Image processing - Computer Vision
- an overview of the architecture of the visual cortex
- construction of a digital image
- the concept of convolution operation
- CNN architecture
- convolution operation in the network
- CNN filters
- MaxPooling
- image classifier
- transfer learning
- OpenCV library
16 | 7 | stationary | Practical project (image processing)
- data loading and processing
- EDA
- problem analysis
- presentation of models solving the problem
- creating a basic model
- compare several approaches and draw conclusions
17 | 14 | stationary | Natural language processing
- an introduction to NLP
- working with files and operations on the text
- working with the spaCy library
- Tokenization
- Stemming
- Lemmatization
- text classification
18 | 14 | stationary | Working with sequences - recurrent neural networks
- the concept of recurrence
- sequence data examples
- recurrent neuron
- input and output sequences
- data preparation for the network
- the network's difficulty in remembering long-range dependencies
- structure and application of LSTM and GRU cells
- embeddings, attention layer, natural language processing
19 | 21 | stationary | Final project
- Design thinking - creating and inventing a solution to a problem based on DS.
- loading and processing of data from various sources
- EDA, understanding your data
- analysis of data problems
- presentation of models solving the problem
- creating a basic model
- improvement of the solution
Total hours: 234