AI/ML Basics for Tech Enthusiasts
Mark Crowley
Assistant Professor
Electrical and Computer Engineering
University of Waterloo
[email protected]
Outline
Introduction
What is AI?
Neural Networks
My Background
[Diagram: landscape of research areas spanning Artificial Intelligence, Machine Learning, Deep Learning (CNN, RNN, LSTM), Reinforcement Learning (DQN, A3C), Multi-agent Systems, Game Theory, Probabilistic Programming, Computer Vision, Natural Language Processing, Robotics, Big Data Tools, Constraint Programming (SAT, SMP), and ILP]
Major Types/Areas of AI
Deep Learning
Deep Learning: methods which perform machine learning through the use
of multilayer neural networks of some kind. Deep Learning can be applied
in any of the three main types of ML:
Supervised Learning: very common, enormous improvement in recent years
Unsupervised Learning: just beginning, lots of potential
Reinforcement Learning: recent; in the past 3 years this has exploded, especially for video games
Clustering (Unsupervised): uses unlabeled data; organizes patterns w.r.t. an optimization criterion; requires a definition of similarity; hard to evaluate. Examples: K-means, Fuzzy C-means, Hierarchical Clustering, DBScan.
Classification (Supervised): uses labeled data; requires a training phase; domain sensitive; easy to evaluate (you know the correct answer). Examples: Naive Bayes, KNN, SVM, Decision Trees, Random Forests.
So choose carefully...
See http://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
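As a concrete illustration of the difference, here is a minimal Python sketch using scikit-learn; the toy blob data and the specific model choices (K-means, an SVM) are placeholder assumptions, not anything prescribed by the slides:

    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    X, y = make_blobs(n_samples=300, centers=3, random_state=0)

    # Unsupervised: K-means never sees the labels y, and evaluation is harder.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print("cluster assignments:", kmeans.labels_[:10])

    # Supervised: the SVM trains on labeled data and is easy to evaluate on held-out data.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    svm = SVC().fit(X_tr, y_tr)
    print("test accuracy:", accuracy_score(y_te, svm.predict(X_te)))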
Outline
Introduction
What is AI?
Neural Networks
Building Upon Classic Machine Learning
History Of Neural Networks
Improving Performance
o(x_i) = σ(w^T x_i) = σ(w_0 + Σ_j w_j x_ij) = 1 / (1 + exp(−(w_0 + Σ_j w_j x_ij)))
NLL(w) = Σ_{i=1}^{N} log(1 + exp(−(w_0 + Σ_j w_j x_ij)))
g = ∂NLL/∂w = Σ_i (σ(w^T x_i) − y_i) x_i
θ_{k+1} = θ_k − η_k g_k
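To make these updates concrete, here is a minimal NumPy sketch (not from the slides) of batch gradient descent for the logistic regression model above; the toy data, the fixed learning rate, and the iteration count are assumptions:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy placeholder data: N examples, d features, labels in {0, 1}.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

    w = np.zeros(3)   # weights w
    w0 = 0.0          # bias w_0
    eta = 0.1         # learning rate eta_k (held constant here)

    for k in range(200):
        p = sigmoid(w0 + X @ w)              # sigma(w^T x_i) for every example i
        g_w = X.T @ (p - y) / len(y)         # averaged gradient w.r.t. w
        g_w0 = np.sum(p - y) / len(y)        # averaged gradient w.r.t. the bias w_0
        w -= eta * g_w                       # theta_{k+1} = theta_k - eta_k g_k
        w0 -= eta * g_w0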
[Diagram: a feedforward network with input layer X, hidden layer H, and output layer Y; the hidden units are nonlinear (sigmoid/ReLU/ELU).]
Input Layer
vector data, each input collects one feature/dimension of the data
and passes it on to the (first) hidden layer.
Hidden Layer
Each hidden unit computes a weighted sum of all the units from the
input layer (or any previous layer) and passes it through a nonlinear
activation function.
Output Layer
Each output unit computes a weighted sum of all the hidden units
and passes it through a (possibly nonlinear) threshold function.
y_j = f(net_j)
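The layer descriptions above map directly onto a few lines of code. Below is a minimal NumPy sketch of one forward pass through a single-hidden-layer network; the layer sizes, random weights, and choice of sigmoid output are illustrative assumptions, not anything prescribed by the slides:

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)

    def forward(x, W1, b1, W2, b2):
        """One forward pass: input layer -> hidden layer -> output layer."""
        h = relu(W1 @ x + b1)            # hidden units: weighted sum + nonlinear activation
        net = W2 @ h + b2                # output units: weighted sum of the hidden units
        y = 1.0 / (1.0 + np.exp(-net))   # y_j = f(net_j), here f = sigmoid
        return y

    # Tiny example: 4 inputs, 5 hidden units, 2 outputs, random weights.
    rng = np.random.default_rng(0)
    x = rng.normal(size=4)
    W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)
    W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)
    print(forward(x, W1, b1, W2, b2))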
Activation Functions
g(z) = max{0, z}
[Figure 6.3 (Goodfellow 2016): the rectified linear activation function. It is the default activation recommended for most feedforward neural networks. Applying it to the output of a linear transformation yields a nonlinear transformation, yet the function remains very close to linear: it is piecewise linear with two pieces. Because rectified linear units are nearly linear, they preserve many of the properties that make linear models easy to optimize with gradient-based methods and many of the properties that make linear models generalize well. Much as a Turing machine's memory needs only to be able to store 0 or 1 states, we can build a universal function approximator from rectified linear functions.]
Rectified Linear Units (ReLU) have become standard: max(0, net_j)
strong signals are always easy to distinguish
many values are zero, and the derivative is mostly zero
they do not saturate as easily as sigmoid
Newer Exponential Linear Units (ELU): there is evidence that they perform better than ReLU in some situations.
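For reference, a minimal NumPy sketch of the two activations discussed here; the ELU scale alpha = 1.0 is an assumed default:

    import numpy as np

    def relu(z):
        # g(z) = max{0, z}: zero for negative inputs, identity for positive inputs.
        return np.maximum(0.0, z)

    def elu(z, alpha=1.0):
        # Exponential Linear Unit: smooth and negative below zero, identity above zero.
        return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

    z = np.linspace(-3, 3, 7)
    print(relu(z))
    print(elu(z))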
Gradient Descent
Error Function E: Mean Squared Error, cross-entropy loss, etc.
Gradient: ∇E[w] = [∂E/∂w_0, ∂E/∂w_1, ..., ∂E/∂w_d]
Training Update Rule: Δw_i = −η ∂E/∂w_i
where η is the training rate.
Note: For linear regression and similar models this error surface is convex; for neural networks E[w] is no longer convex, so we must solve iteratively.
(Slides from Tom Mitchell ML Course, CMU, 2010)
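As a small, hedged illustration of the update rule, the sketch below runs gradient descent on a mean squared error for a linear model; the toy data and fixed training rate are placeholders:

    import numpy as np

    # Toy data for a linear model y ~ X w (placeholders, not from the slides).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 2))
    y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=50)

    w = np.zeros(2)
    eta = 0.05  # training rate

    for step in range(500):
        err = X @ w - y                    # residuals
        grad = 2.0 * X.T @ err / len(y)    # dE/dw for E = mean squared error
        w = w - eta * grad                 # Delta w_i = -eta * dE/dw_i
    print(w)  # approaches [2, -1]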
Backpropagation Algorithm
We need an iterative algorithm for getting the gradient efficiently.
For each training example:
1 Forward propagation: input the training example to the network and compute the outputs
2 Compute the output unit errors by comparing the outputs to the target values
3 Backpropagate the errors to the hidden units
4 Update every weight by gradient descent using these errors
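A minimal NumPy sketch of these four steps for a one-hidden-layer network with sigmoid units and squared error loss (the layer sizes, single training example, and learning rate are made-up placeholders):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)          # one training example (3 features)
    t = np.array([1.0])             # its target output
    W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
    eta = 0.5

    # 1. Forward propagation
    h = sigmoid(W1 @ x)             # hidden unit outputs
    o = sigmoid(W2 @ h)             # network outputs

    # 2. Output unit errors (squared error with sigmoid outputs)
    delta_o = (o - t) * o * (1 - o)

    # 3. Backpropagate the errors to the hidden units
    delta_h = (W2.T @ delta_o) * h * (1 - h)

    # 4. Gradient descent weight updates
    W2 -= eta * np.outer(delta_o, h)
    W1 -= eta * np.outer(delta_h, x)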
A Short History
1940s Early work on neural networks goes back to the 1940s, with a simple model of the neuron by McCulloch and Pitts as a summing and thresholding device.
1958 Rosenblatt introduced the Perceptron, a two-layer network (one input layer and one output node, with a bias in addition to the input features).
1969 Marvin Minsky: Perceptrons are 'just' linear; AI goes logical, beginning of the "AI Winter".
1980s Neural network resurgence: backpropagation (updating weights by gradient descent).
1990s SVMs! Kernels can do anything! (no, they can't)
A Short History
1993 LeNet-1 for digit recognition
2003 Deep Learning (convolutional nets, Dropout/RBMs, Deep Belief Networks)
1986, 2006 Restricted Boltzmann Machines
2006 Neural networks outperform the RBF SVM on the MNIST handwriting dataset (Hinton et al.)
2012 AlexNet wins the ImageNet challenge, beating the competition with an error rate of 16% vs. 26% for the next best. ImageNet contains 15 million annotated images in over 22,000 categories. The ZFNet paper (2013) extends this work and has a good description of the network structure.
2012-present Google's YouTube cat experiment, speech recognition, self-driving cars, a computer defeats a regional Go champion, ...
2014 GoogLeNet added many layers and introduced inception modules (allowing computation in parallel rather than serially)
Outline
Introduction
What is AI?
Neural Networks
Building Upon Classic Machine Learning
History Of Neural Networks
Improving Performance
Overfitting
Very inefficient for images, time series, and large numbers of inputs and outputs
Slow to train
Hard to interpret the resulting model
Overfitting
There are a number of heuristics for training neural networks that are useful in practice (maybe we'll learn more today):
Fewer hidden nodes: just enough complexity to work, not so much that it overfits.
Train multiple networks with different sizes and search for the best design.
Validation set: train on the training set until the error on the validation set starts to rise, then evaluate on the test set (a minimal early-stopping sketch follows after this list).
Try different activation functions: tanh, ReLU, ELU, ...?
Dropout (Hinton 2014): randomly ignore certain units during training, don't update them via gradient descent; this leads to hidden units that specialize.
Modify the learning rate over time (cooling schedule).
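As a hedged illustration of the validation-set heuristic, the sketch below uses scikit-learn's built-in early stopping; the synthetic dataset, network size, and patience settings are arbitrary assumptions:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # early_stopping=True holds out a validation fraction and stops training
    # once the validation score stops improving (the heuristic described above).
    clf = MLPClassifier(hidden_layer_sizes=(32,), early_stopping=True,
                        validation_fraction=0.2, n_iter_no_change=10,
                        max_iter=500, random_state=0)
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))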
Dropout
Dropout (Hinton 2014) - randomly ignore certain units during
training, don’t update them via gradient descent, leads to hidden
units that specialize.
With probability p don’t include a weight in the gradient updates.
Reduces overfitting by encouraging robustness of weights in the
network.
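A minimal NumPy sketch of the idea using an inverted-dropout mask (an illustrative variant, not necessarily the exact formulation in the paper); the layer size and drop probability are assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    h = rng.normal(size=10)       # activations of one hidden layer (placeholder)
    p = 0.5                       # probability of dropping a unit

    # During training: zero out each hidden unit independently with probability p,
    # so the dropped units (and their weights) receive no gradient this step.
    mask = rng.random(h.shape) >= p
    h_train = h * mask / (1.0 - p)   # inverted dropout: rescale the kept units

    # At test time: use all units (no mask needed with inverted dropout).
    h_test = h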
[Figure 6.7 (Goodfellow 2016): test accuracy (percent) vs. number of parameters (x10^8) for networks of depth 3 (convolutional and fully connected) and depth 11 (convolutional). Deeper models tend to perform better, and not merely because the model is larger. This experiment from Goodfellow et al. (2014d) shows that increasing the number of parameters in layers of convolutional networks without increasing their depth is not nearly as effective at increasing test set performance. Shallow models in this context overfit at around 20 million parameters while deep ones can benefit from having over 60 million, suggesting that using a deep model expresses a useful preference over the space of functions the model can learn.]
Convolutional Neural Networks
Outline
Introduction
What is AI?
Neural Networks
Parameter sharing
Convolution shares the same parameters across all spatial locations.
Traditional matrix multiplication does not share any parameters.
[Figure 9.5 (Goodfellow 2016): parameter sharing; black arrows indicate the connections that use a particular parameter, shown for a convolutional model and a fully connected model over inputs x1...x5 and outputs s1...s5.]
2D Convolution
Input (3x4):
a b c d
e f g h
i j k l
Kernel (2x2):
w x
y z
Output (2x3):
aw+bx+ey+fz   bw+cx+fy+gz   cw+dx+gy+hz
ew+fx+iy+jz   fw+gx+jy+kz   gw+hx+ky+lz
[Figure 9.1 (Goodfellow 2016): an example of 2-D convolution without kernel flipping. The output is restricted to positions where the kernel lies entirely within the image, called "valid" convolution in some contexts.]
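A minimal NumPy sketch of this "valid" 2-D convolution without kernel flipping (technically a cross-correlation, as in the figure); the numeric input and kernel below merely stand in for a..l and w, x, y, z:

    import numpy as np

    def conv2d_valid(image, kernel):
        """'Valid' 2-D convolution without kernel flipping (cross-correlation)."""
        H, W = image.shape
        kh, kw = kernel.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # Weighted sum of the kernel-sized patch at (i, j), e.g. aw+bx+ey+fz.
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.arange(12, dtype=float).reshape(3, 4)   # stands in for a..l
    kernel = np.array([[1.0, 2.0], [3.0, 4.0]])        # stands in for w, x, y, z
    print(conv2d_valid(image, kernel))                 # 2 x 3 output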
A simple example: edge detection
[Figure 9.6 (Goodfellow 2016): efficiency of edge detection. The image on the right was formed by taking each pixel in the original image and subtracting the value of its neighboring pixel on the left. This shows the strength of all the vertically oriented edges in the input image, which can be a useful operation for object detection. Both images are 280 pixels tall; the input image is 320 pixels wide while the output image is 319 pixels wide. This transformation can be described by a convolution kernel containing two elements, and requires 319 x 280 x 3 = 267,960 floating point operations (two multiplications and one addition per output pixel) to compute using convolution.]
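As a usage sketch of the same operation, the vertical-edge map can be computed by differencing neighboring pixels; the random "image" below is a placeholder for the photo in the figure:

    import numpy as np

    rng = np.random.default_rng(0)
    img = rng.random((280, 320))        # placeholder grayscale image, 280 x 320

    # Subtract each pixel's left neighbor: equivalent to convolving each row with a
    # two-element kernel, yielding a 280 x 319 map of vertical edge strengths.
    edges = img[:, 1:] - img[:, :-1]
    print(edges.shape)                  # (280, 319)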
Outline
Introduction
What is AI?
Neural Networks
Language Choices
Any language can be used for implementing/using AI/ML algorithms, but
some make it much easier
C++: you can do it, may need to implement many things yourself
Java: many libraries for ML (Weka is a good open source one, Deeplearning4j)
Scala: leaner, functional language that compiles to JVM bytecode; good for prototyping, and can reuse Java libraries (Deeplearning4j)
R: focused on statistical methods; more and more machine learning libraries are being implemented for it
Matlab: good for all the calculations, if you have the right libraries
it’s great (not cheap or very portable beyond school)
Python: most commonly used right now for deep learning, we’re
gonna need another slide ...
Python
Cloud Services
There are several powerful, free services you can access via a student
account which you can request directly.
AWS: Amazon Web Services - very large, has accessible APIs to
connect to, many options for hardware to run on (but the
best ones will cost extra)
Azure: Microsoft - lots of visual tools for composing AI/ML
components.
Google Cloud ML Engine: uses all the latest tools and TensorFlow models
None of these provide GPU servers for free; that will cost extra. (It will still work, just be slower for deep learning.)
Summary
Introduction
What is AI?
Landscape of Big Data/AI/ML
Classification
Neural Networks
Building Upon Classic Machine Learning
History Of Neural Networks
Improving Performance
Useful Books