Amity School of Engineering and Technology
Introduction to machine learning/AI
Programme: B. Tech-CSE
Course: SKE-309
Faculty: Dr. S. K. Dubey
Introduction
• Machine learning is making great strides
• Large, good data sets
• Compute power
• Progress in algorithms
• Many interesting applications
• commercial
• scientific
• Links with artificial intelligence
• However, AI ≠ machine learning
Machine learning tasks
• Supervised learning
• regression: predict numerical values
• classification: predict categorical values, i.e., labels
• Unsupervised learning
• clustering: group data according to "distance"
• association: find frequent co-occurrences
• link prediction: discover relationships in data
• data reduction: project features to fewer features
• Reinforcement learning
Regression
Colorize B&W images automatically
https://tinyclouds.org/colorize/
Classification
Object recognition
https://ai.googleblog.com/2014/09/building-deeper-understanding-of-images.html
Reinforcement learning
Learning to play Break Out
https://www.youtube.com/watch?v=V1eYniJ0Rnk
Clustering
Crime prediction using k-means clustering
http://www.grdjournals.com/uploads/article/GRDJE/V02/I05/0176/GRDJEV02I050176.pdf
Applications in science
Machine learning algorithms
• Regression:
Ridge regression, Support Vector Machines, Random Forest,
Multilayer Neural Networks, Deep Neural Networks, ...
• Classification:
Naive Bayes, Support Vector Machines,
Random Forest, Multilayer Neural Networks,
Deep Neural Networks, ...
• Clustering:
k-Means, Hierarchical Clustering, ...
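A hedged sketch of two of the algorithms listed above, applied to small synthetic data sets (all parameters are illustrative):

    from sklearn.datasets import make_classification, make_regression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import Ridge

    # classification: predict categorical labels with a random forest
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(clf.predict(X[:3]))

    # regression: predict numerical values with ridge regression
    X, y = make_regression(n_samples=200, n_features=10, random_state=0)
    reg = Ridge(alpha=1.0).fit(X, y)
    print(reg.predict(X[:3]))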
Issues
• Many machine learning/AI projects fail
(Gartner claims 85 %)
• Ethics, e.g., Amazon reportedly had sub-par employees fired automatically by an AI
Reasons for failure
• Asking the wrong question
• Trying to solve the wrong problem
• Not having enough data
• Not having the right data
• Having too much data
• Hiring the wrong people
• Using the wrong tools
• Not having the right model
• Not having the right yardstick
Frameworks
• Programming languages
• Python
Fast-evolving ecosystem!
• R
• C++
• ...
• Many libraries
• classic machine learning
• scikit-learn
• deep learning frameworks
• PyTorch
• TensorFlow
• Keras
• …
scikit-learn
• Nice end-to-end framework
• data exploration (+ pandas + holoviews)
• data preprocessing (+ pandas)
• cleaning/missing values
• normalization
• training
• testing
• application
• "Classic" machine learning only
• https://scikit-learn.org/stable/
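A hedged end-to-end sketch of this workflow (made-up numeric data; column values and parameters are illustrative):

    import numpy as np
    from sklearn.impute import SimpleImputer
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X = np.array([[5.1, np.nan], [4.9, 3.0], [6.2, 3.4], [5.9, 3.0]] * 25)
    y = np.array([0, 0, 1, 1] * 25)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    model = make_pipeline(
        SimpleImputer(strategy="mean"),  # cleaning: impute missing values
        StandardScaler(),                # preprocessing: normalization
        SVC(),                           # support vector classifier
    )
    model.fit(X_train, y_train)          # training
    print(model.score(X_test, y_test))   # testing: final score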
Keras
• High-level framework for deep learning
• TensorFlow backend
• Layer types
• dense
• convolutional
• pooling
• embedding
• recurrent
• activation
• …
• https://keras.io/
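A minimal Keras sketch using a few of the layer types listed above (layer sizes are illustrative):

    from tensorflow import keras
    from tensorflow.keras.layers import Activation, Dense

    model = keras.Sequential([
        Dense(64, activation="relu", input_shape=(20,)),  # dense layer
        Dense(10),                                        # dense layer
        Activation("softmax"),                            # activation layer
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()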
Data pipelines
• Data ingestion
• CSV/JSON/XML/H5 files, RDBMS, NoSQL, HTTP,...
• Data cleaning (must be done systematically)
• outliers/invalid values? → filter
• missing values? → impute
• Data transformation
• scaling/normalization
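A hedged sketch of these ingest → clean → transform steps with pandas (made-up CSV data):

    import io
    import pandas as pd

    # ingestion: read CSV data (here from an in-memory string)
    raw = io.StringIO("age,income\n25,30000\n-1,45000\n38,\n52,80000\n")
    df = pd.read_csv(raw)

    # cleaning: filter invalid values, impute missing ones
    df = df[df["age"] > 0]                            # drop invalid ages
    df["income"] = df["income"].fillna(df["income"].mean())

    # transformation: scale to zero mean, unit variance
    df = (df - df.mean()) / df.std()
    print(df)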
Supervised learning: methodology
• Select model, e.g., random forest, (deep) neural network, ...
• Train model, i.e., determine parameters
• Data: input + output
• training data → determine model parameters
• validation data → yardstick to avoid overfitting
• Test model
• Data: input + output
• testing data → final scoring of the model
• Production
• Data: input → predict output
Experiment with underfitting and overfitting: 010_underfitting_overfitting.ipynb
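A hedged sketch of this methodology with scikit-learn (synthetic data; the 60/20/20 split is illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)

    # hold out 20 % as test data, then 20 % of the rest for validation
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.25, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print("validation:", model.score(X_val, y_val))  # yardstick while tuning
    print("test:", model.score(X_test, y_test))      # final scoring, used once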
From neurons to ANNs
(biological inspiration)
[Figure: inputs 𝑥1 … 𝑥𝑁 with weights 𝑤1 … 𝑤𝑁 and bias 𝑏 feeding a single neuron with activation function 𝜎]
𝑦 = 𝜎(∑ᵢ₌₁ᴺ 𝑤𝑖 𝑥𝑖 + 𝑏)
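The formula above, written out as a single artificial neuron in NumPy (the sigmoid is one example choice of activation function 𝜎):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def neuron(x, w, b):
        # y = sigma(sum_i w_i * x_i + b)
        return sigmoid(np.dot(w, x) + b)

    x = np.array([0.5, -1.2, 3.0])  # inputs x_1 .. x_N
    w = np.array([0.1, 0.4, -0.2])  # weights w_1 .. w_N
    b = 0.3                         # bias
    print(neuron(x, w, b))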
Multilayer network
How to determine weights?
Training: backpropagation
• Initialize weights "randomly"
• For all training epochs
• for all input-output in training set
• using input, compute output (forward)
• compare computed output with training output
• adapt weights (backward) to improve output
• if accuracy is good enough, stop
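A hedged sketch of this loop: gradient descent for a single sigmoid neuron on a toy data set (learning rate and epoch count are illustrative; real networks leave this to a framework):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0.0, 0.0, 0.0, 1.0])   # logical AND

    w = rng.normal(size=2)               # initialize weights "randomly"
    b = 0.0
    lr = 0.5

    for epoch in range(1000):            # for all training epochs
        for xi, yi in zip(X, y):         # for all input-output pairs
            out = 1.0 / (1.0 + np.exp(-(np.dot(w, xi) + b)))  # forward
            error = out - yi             # compare with training output
            grad = error * out * (1.0 - out)  # backward: chain rule
            w -= lr * grad * xi          # adapt weights
            b -= lr * grad

    print(w, b)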
Task: handwritten digit recognition
• Input data
• grayscale image
• Output data
• digit 0, 1, ..., 9
• Training examples
• Test examples
Explore the data: 020_mnist_data_exploration.ipynb
First approach
• Data preprocessing
• input data as 1D array, e.g.:
array([ 0.0, 0.0, ..., 0.951, 0.533, ..., 0.0, 0.0], dtype=float32)
• output data as one-hot encoded array, e.g., for the digit 5:
array([ 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=uint8)
• Model: multilayer perceptron
• 784 inputs
• dense hidden layer with 512 units
• ReLU activation function
• dense hidden layer with 512 units
• ReLU activation function
• dense layer with 10 units
• SoftMax activation function
Activation functions: 030_activation_functions.ipynb
Multilayer perceptron: 040_mnist_mlp.ipynb
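A condensed, hedged sketch of this model in Keras (the course notebook 040_mnist_mlp.ipynb is the reference; epoch count and optimizer are illustrative):

    from tensorflow import keras
    from tensorflow.keras.layers import Dense

    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0  # 1D input
    x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
    y_train = keras.utils.to_categorical(y_train, 10)             # one-hot
    y_test = keras.utils.to_categorical(y_test, 10)

    model = keras.Sequential([
        Dense(512, activation="relu", input_shape=(784,)),
        Dense(512, activation="relu"),
        Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5, validation_split=0.1)
    print(model.evaluate(x_test, y_test))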
Deep neural networks
• Many layers
• Features are learned, not given
• Low-level features combined into
high-level features
• Special types of layers
• convolutional
• drop-out
• recurrent
• ...
Convolutional neural networks
[Figure: a convolution kernel applied to an image]
Convolution examples
[Figure: example convolution kernels (diagonal and anti-diagonal patterns) and their effect on images]
Convolution: 050_convolution.ipynb
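A hedged sketch of the operation itself: convolving a toy image with a small kernel, as a CNN layer does with learned kernels (kernel values are illustrative):

    import numpy as np
    from scipy.signal import convolve2d

    image = np.zeros((8, 8))
    image[:, 4:] = 1.0                       # toy image: dark left, bright right

    kernel = np.array([[1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0]])    # vertical edge detector

    edges = convolve2d(image, kernel, mode="valid")
    print(edges)                             # strong response along the edge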
Second approach
• Data preprocessing
• input data as 2D array, e.g.:
array([[ 0.0, 0.0, ..., 0.951, 0.533, ..., 0.0, 0.0]], dtype=float32)
• output data as one-hot encoded array, e.g., for the digit 5:
array([ 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=uint8)
• Model: convolutional neural network (CNN)
• 28 × 28 inputs
• CNN layer with 32 filters of 3 × 3
• ReLU activation function
• flatten layer
• dense layer with 10 units
• SoftMax activation function
Convolutional neural network: 060_mnist_cnn.ipynb
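A condensed, hedged sketch of this model in Keras (the course notebook 060_mnist_cnn.ipynb is the reference; epoch count and optimizer are illustrative):

    from tensorflow import keras
    from tensorflow.keras.layers import Conv2D, Dense, Flatten

    (x_train, y_train), _ = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0  # 2D input
    y_train = keras.utils.to_categorical(y_train, 10)                   # one-hot

    model = keras.Sequential([
        Conv2D(32, kernel_size=(3, 3), activation="relu",
               input_shape=(28, 28, 1)),     # 32 filters of 3 x 3
        Flatten(),                           # flatten layer
        Dense(10, activation="softmax"),     # output layer
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, validation_split=0.1)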
Task: sentiment classification
• Input data
• movie review (English), e.g.:
"<start> this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there Robert redford's is an amazing actor and now the same being director norman's father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it …"
• Output data
• positive/negative sentiment
• Training examples
• Test examples
Explore the data: 070_imdb_data_exploration.ipynb
Word embedding
• Represent words as one-hot vectors
length = vocabulary size
Issues:
• unwieldy
• no semantics
• Word embeddings
• dense vector
• vector distance ≈ semantic distance
• Training
• use context
• discover relations with surrounding words
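A minimal sketch of an embedding layer in Keras (vocabulary size and vector length match the model a few slides below; the word indices are made up):

    import numpy as np
    from tensorflow.keras.layers import Embedding

    # 5,000-word vocabulary, each word mapped to a dense 64-element vector
    embedding = Embedding(input_dim=5000, output_dim=64)

    word_indices = np.array([[12, 403, 7]])  # one short "sentence"
    vectors = embedding(word_indices)
    print(vectors.shape)                     # (1, 3, 64)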
How to remember?
Manage history, network learns
• what to remember
• what to forget
Long-term correlations!
Use, e.g.,
• LSTM (Long Short-Term Memory)
• GRU (Gated Recurrent Unit)
Deal with variable-length input and/or output
Gated Recurrent Unit (GRU)
• Update gate
𝑧𝑡 = 𝜎(𝑊𝑧 𝑥𝑡 + 𝑈𝑧 ℎ𝑡−1)
• Reset gate
𝑟𝑡 = 𝜎(𝑊𝑟 𝑥𝑡 + 𝑈𝑟 ℎ𝑡−1)
• Current memory content
ℎ′𝑡 = tanh(𝑊 𝑥𝑡 + 𝑟𝑡 ⊙ 𝑈 ℎ𝑡−1)
• Final memory/output
ℎ𝑡 = 𝑧𝑡 ⊙ ℎ𝑡−1 + (1 − 𝑧𝑡) ⊙ ℎ′𝑡
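The equations above written out in NumPy for a single time step (sizes and the random initialization are illustrative):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n_in, n_hidden = 4, 3
    W_z, W_r, W = (rng.normal(size=(n_hidden, n_in)) for _ in range(3))
    U_z, U_r, U = (rng.normal(size=(n_hidden, n_hidden)) for _ in range(3))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x_t, h_prev):
        z_t = sigmoid(W_z @ x_t + U_z @ h_prev)         # update gate
        r_t = sigmoid(W_r @ x_t + U_r @ h_prev)         # reset gate
        h_cand = np.tanh(W @ x_t + r_t * (U @ h_prev))  # current memory content
        return z_t * h_prev + (1.0 - z_t) * h_cand      # final memory/output

    h = gru_step(rng.normal(size=n_in), np.zeros(n_hidden))
    print(h)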
Approach
• Data preprocessing
• input data as padded array
• output data as 0 or 1
• Model: recurrent neural network (GRU)
• 100 inputs
• embedding layer, 5,000 words, 64-element representation length
• GRU layer, 64 units
• dropout layer, rate = 0.5
• dense layer, 1 output
• sigmoid activation function
Recurrent neural network: 080_imdb_rnn.ipynb
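A condensed, hedged sketch of this model in Keras (the course notebook 080_imdb_rnn.ipynb is the reference; epoch count and optimizer are illustrative):

    from tensorflow import keras
    from tensorflow.keras.layers import Dense, Dropout, Embedding, GRU
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    (x_train, y_train), _ = keras.datasets.imdb.load_data(num_words=5000)
    x_train = pad_sequences(x_train, maxlen=100)  # pad/truncate to 100 words

    model = keras.Sequential([
        Embedding(input_dim=5000, output_dim=64),  # 5,000 words -> 64 elements
        GRU(64),                                   # GRU layer, 64 units
        Dropout(0.5),                              # dropout layer, rate = 0.5
        Dense(1, activation="sigmoid"),            # 0 = negative, 1 = positive
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, validation_split=0.2)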
Caveat
• InspiroBot (http://inspirobot.me/)
• "I am an artificial intelligence dedicated to generating unlimited amounts of unique inspirational quotes for endless
enrichment of pointless human existence".
Thank You