CSCI218: Foundations
of Artificial Intelligence
Classical stats/ML: Minimize loss function
§ Which hypothesis space H to choose?
§ E.g., linear combinations of features: h_w(x) = w^T x
§ How to measure degree of fit?
§ Loss function, e.g., squared error Σ_j (y_j − w^T x_j)^2
§ How to trade off degree of fit vs. complexity?
§ Regularization: complexity penalty, e.g., ||w||^2
§ How do we find a good h?
§ Optimization (closed-form, numerical); discrete search
§ How do we know if a good h will predict well?
§ Try it and see (cross-validation, bootstrap, etc.)
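The pieces above fit together in, for example, ridge regression: a linear hypothesis, squared-error loss, an ||w||^2 penalty, and a closed-form optimizer. Below is a minimal sketch assuming made-up data X, y and a placeholder regularization strength lam; none of these values come from the slides.

```python
import numpy as np

# Minimal sketch: linear hypothesis h_w(x) = w^T x, squared-error loss,
# and an L2 (||w||^2) complexity penalty, fit in closed form (ridge regression).
# X, y, and lam below are made-up placeholders, not data from the slides.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 examples, 3 features
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
lam = 0.1                              # regularization strength

# Minimize sum_j (y_j - w^T x_j)^2 + lam * ||w||^2 via the closed-form solution
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# "Try it and see": evaluate the fit (here, on the training data itself)
print("learned w:", w)
print("mean squared error:", np.mean((y - X @ w) ** 2))
```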
Deep Learning/Neural Network
Image Classification
Very loose inspiration: Human neurons
[Figure: a biological neuron, labeling the cell body (soma), nucleus, dendrites, axon, axonal arborization, synapses, and the axon from another cell]
Simple model of a neuron (McCulloch & Pitts, 1943)
[Figure: a unit j with input links carrying activations a_i and weights w_{i,j}, a fixed bias input a_0 = 1 with bias weight w_{0,j}, a summation producing in_j, an activation function g, and output a_j = g(in_j) on the output links]
§ Inputs ai come from the output of node i to this node j (or from “outside”)
§ Each input link has a weight wi,j
§ There is an additional fixed input a0 with bias weight w0,j
§ The total input is in_j = Σ_i w_{i,j} a_i
§ The output is a_j = g(in_j) = g(Σ_i w_{i,j} a_i) = g(w · a)
Activation functions g
[Figure: two common choices for g(in_i): (a) a hard threshold (step) function and (b) the sigmoid 1/(1 + e^{-x}); both saturate at +1 for large inputs]
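A minimal sketch of the McCulloch & Pitts unit with the two activation functions above; the weights and inputs are made-up illustrative values, not from the slides.

```python
import numpy as np

# Minimal sketch of the McCulloch & Pitts unit: output a_j = g(in_j)
# with in_j = sum_i w_{i,j} a_i (the bias is folded in as a_0 = 1 with weight w_{0,j}).
# The weights and inputs below are made-up illustrative values.

def threshold(x):
    return np.where(x >= 0, 1.0, 0.0)    # hard threshold activation (a)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # sigmoid activation (b)

def unit_output(weights, inputs, g):
    a = np.concatenate(([1.0], inputs))  # prepend the fixed bias input a_0 = 1
    in_j = weights @ a                   # total input in_j = w . a
    return g(in_j)

w = np.array([-1.5, 1.0, 1.0])           # bias weight w_{0,j}, then w_{1,j}, w_{2,j}
x = np.array([1.0, 1.0])
print(unit_output(w, x, threshold))      # acts like a logical AND gate here
print(unit_output(w, x, sigmoid))        # soft version of the same decision
```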
Reminder: Linear Classifiers
▪ Inputs are feature values
▪ Each feature has a weight
▪ Sum is the activation
▪ If the activation is:
▪ Positive, output +1
▪ Negative, output -1
[Figure: feature values f1, f2, f3 multiplied by weights w1, w2, w3 and summed (Σ); output +1 if the sum is > 0]
How to get probabilistic decisions?
If the activation z = w · f(x) is very positive, want probability going to 1
If it is very negative, want probability going to 0
Sigmoid function: φ(z) = 1 / (1 + e^{-z})
Best w?
Maximum likelihood estimation:
    max_w ll(w) = max_w Σ_i log P(y^(i) | x^(i); w)
with:
    P(y^(i) = +1 | x^(i); w) = 1 / (1 + e^{-w · f(x^(i))})
    P(y^(i) = -1 | x^(i); w) = 1 − 1 / (1 + e^{-w · f(x^(i))})
= Logistic Regression
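A minimal sketch of these probabilities and the log likelihood being maximized; the weight vector, feature vectors, and labels below are made-up placeholders.

```python
import numpy as np

# Minimal sketch of binary logistic regression probabilities; w and the
# feature vectors below are made-up placeholders.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def prob_positive(w, f_x):
    """P(y = +1 | x; w) = 1 / (1 + exp(-w . f(x)))."""
    return sigmoid(w @ f_x)

w = np.array([2.0, -1.0, 0.5])
f_x = np.array([1.0, 0.0, 3.0])
p = prob_positive(w, f_x)
print("P(y=+1|x):", p, " P(y=-1|x):", 1 - p)

# Log likelihood of a small labeled set (y in {+1, -1}); maximizing this
# over w is the "best w?" objective above.
data = [(np.array([1.0, 0.0, 3.0]), +1), (np.array([0.5, 2.0, -1.0]), -1)]
ll = sum(np.log(p if y == +1 else 1 - p)
         for f, y in data
         for p in [prob_positive(w, f)])
print("log likelihood:", ll)
```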
Multiclass Logistic Regression
Multi-class linear classification
A weight vector for each class: w_y
Score (activation) of a class y: w_y · f(x)
Prediction w/ highest score wins: y = arg max_y w_y · f(x)
How to make the scores into probabilities?
z_1, z_2, z_3  →  e^{z_1}/(e^{z_1}+e^{z_2}+e^{z_3}),  e^{z_2}/(e^{z_1}+e^{z_2}+e^{z_3}),  e^{z_3}/(e^{z_1}+e^{z_2}+e^{z_3})
original activations → softmax activations
Best w?
Maximum likelihood estimation:
    max_w ll(w) = max_w Σ_i log P(y^(i) | x^(i); w)
with:
    P(y^(i) | x^(i); w) = e^{w_{y^(i)} · f(x^(i))} / Σ_y e^{w_y · f(x^(i))}
= Multi-Class Logistic Regression
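A minimal sketch of the multi-class case: one weight vector per class, scores turned into probabilities by the softmax. The weight matrix W and feature vector are made-up placeholders.

```python
import numpy as np

# Minimal sketch of multi-class logistic regression: one weight vector per
# class, scores z_y = w_y . f(x), and softmax to turn scores into probabilities.
# The weight matrix W and feature vector below are made-up placeholders.

def softmax(z):
    z = z - np.max(z)               # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

W = np.array([[ 1.0, -0.5,  0.2],   # w_class0
              [ 0.3,  0.8, -1.0],   # w_class1
              [-0.7,  0.1,  0.9]])  # w_class2
f_x = np.array([1.0, 2.0, 0.5])

scores = W @ f_x                    # original activations z_y = w_y . f(x)
probs = softmax(scores)             # softmax activations P(y | x; w)
print("scores:", scores)
print("probabilities:", probs)
print("prediction:", np.argmax(scores))   # highest score wins
```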
Optimization
Optimization
i.e., how do we solve: max_w ll(w) = max_w Σ_i log P(y^(i) | x^(i); w)
Hill Climbing
A simple, general idea
Start wherever
Repeat: move to the best neighboring state
If no neighbors better than current, quit
What’s particularly tricky when hill-climbing for multiclass
logistic regression?
• Optimization over a continuous space
• Infinitely many neighbors!
• How to do this efficiently?
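A minimal sketch of the hill-climbing loop on a 1-D objective with a fixed step size; the objective g and the step size are made-up assumptions. The bullets above are exactly the problem: in a continuous weight space there is no finite neighbor list to enumerate.

```python
# Minimal sketch of hill climbing on a toy 1-D objective with a fixed step size.
# The objective g and step size are made-up assumptions; gradient-based methods
# (next slides) avoid having to enumerate neighbors in a continuous space.

def g(w):
    return -(w - 3.0) ** 2          # toy objective with its maximum at w = 3

w, step = 0.0, 0.1                  # start wherever
while True:
    neighbors = [w + step, w - step]
    best = max(neighbors, key=g)    # move to the best neighboring state
    if g(best) <= g(w):             # no neighbor better than current: quit
        break
    w = best
print("hill climbing found w ≈", round(w, 2))
```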
1-D Optimization
Could evaluate g(w_0 + h) and g(w_0 − h)
Then step in the best direction
Or, evaluate the derivative: ∂g(w_0)/∂w = lim_{h→0} (g(w_0 + h) − g(w_0 − h)) / (2h)
which tells which direction to step in
2-D Optimization
[Figure: contour plot of a 2-D objective. Source: offconvex.org]
Gradient Ascent
Perform update in uphill direction for each coordinate
The steeper the slope (i.e. the higher the derivative) the bigger the step
for that coordinate
E.g., consider: g(w_1, w_2)
Updates:
    w_1 ← w_1 + α ∂g/∂w_1 (w_1, w_2)
    w_2 ← w_2 + α ∂g/∂w_2 (w_1, w_2)
▪ Updates in vector notation:
    w ← w + α ∇_w g(w)
with: ∇_w g(w) = [∂g/∂w_1 (w), ∂g/∂w_2 (w)] = gradient
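A minimal sketch of this update rule on a made-up 2-D objective whose maximum is known, so the result can be checked; the objective, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of gradient ascent on a made-up 2-D objective
# g(w1, w2) = -(w1 - 1)^2 - (w2 + 2)^2, whose maximum is at (1, -2).
# The objective, learning rate, and iteration count are illustrative assumptions.

def grad_g(w):
    return np.array([-2 * (w[0] - 1.0), -2 * (w[1] + 2.0)])  # gradient of g

w = np.zeros(2)          # start somewhere
alpha = 0.1              # learning rate
for _ in range(100):
    w = w + alpha * grad_g(w)    # step uphill: w <- w + alpha * grad g(w)
print("gradient ascent found w ≈", np.round(w, 3))   # ≈ [1, -2]
```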
Steepest Descent
o Idea:
o Start somewhere
o Repeat: Take a step in the steepest descent direction
Figure source: Mathworks
Steepest Direction
o Steepest Direction = direction of the gradient
∇g = [∂g/∂w_1, ∂g/∂w_2, …, ∂g/∂w_n]^T
Optimization Procedure: Gradient Ascent
init w
for iter = 1, 2, …
    w ← w + α ∇_w g(w)
▪ α: learning rate, a hyperparameter that needs to be chosen carefully
Batch Gradient Ascent on the Log Likelihood Objective
init w
for iter = 1, 2, …
    w ← w + α Σ_i ∇_w log P(y^(i) | x^(i); w)
Stochastic Gradient Ascent on the Log Likelihood Objective
Observation: once the gradient on one training example has been computed, we might as well incorporate it before computing the next one
init w
for iter = 1, 2, …
    pick random j
    w ← w + α ∇_w log P(y^(j) | x^(j); w)
Mini-Batch Gradient Ascent on the Log Likelihood Objective
Observation: the gradient over a small set of training examples (= a mini-batch) can be computed in parallel, so we might as well do that instead of using a single example
init w
for iter = 1, 2, …
    pick a random subset of training examples J
    w ← w + α Σ_{j∈J} ∇_w log P(y^(j) | x^(j); w)
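A minimal sketch of mini-batch gradient ascent on the binary logistic regression log likelihood; the synthetic data, learning rate, batch size, and iteration count are all made-up assumptions for illustration.

```python
import numpy as np

# Minimal sketch of mini-batch gradient ascent on the binary logistic
# regression log likelihood. The synthetic data, learning rate, batch size,
# and iteration count are all made-up assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # features f(x) for 200 examples
true_w = np.array([1.5, -2.0, 0.5])
y = np.where(X @ true_w + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(3)                                 # init
alpha, batch_size = 0.1, 16                     # learning rate, mini-batch size
for it in range(500):
    J = rng.choice(len(X), size=batch_size, replace=False)  # random subset J
    # gradient of sum_{j in J} log P(y_j | x_j; w) for y in {+1, -1}:
    # grad = sum_j (1 - sigmoid(y_j * w.x_j)) * y_j * x_j
    margins = y[J] * (X[J] @ w)
    grad = ((1.0 - sigmoid(margins)) * y[J]) @ X[J]
    w = w + alpha * grad                        # ascend the log likelihood
print("learned w ≈", np.round(w, 2), " (true direction:", true_w, ")")
```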
Neural Networks
Multi-class Logistic Regression
= special case of neural network (single layer, no hidden layer)
[Figure: features f_1(x), …, f_K(x) are linearly combined into activations z_1, z_2, z_3 (one per class), which pass through a softmax to give class probabilities]
Multi-layer Perceptron
[Figure: inputs x_1, …, x_L feed into several hidden layers of units, each applying a nonlinear activation g to a weighted sum of the previous layer, ending in a softmax output layer]
g = nonlinear activation function
Multi-layer Perceptron
Common Activation Functions
[source: MIT 6.S191 introtodeeplearning.com]
Multi-layer Perceptron
Training the MLP neural network is just like logistic regression: w just tends to be a much larger vector
Just run gradient ascent → this is the back-propagation algorithm
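A minimal sketch of a two-layer MLP forward pass matching the figure above: a hidden layer with a nonlinear activation g followed by a softmax output. The layer sizes and random weights are made-up assumptions; in practice the weights would be trained by gradient ascent on the log likelihood, with back-propagation computing the needed gradients.

```python
import numpy as np

# Minimal sketch of a two-layer MLP forward pass: hidden layer with a
# nonlinear activation g, then a softmax output layer.
# Layer sizes and random weights below are made-up assumptions.

rng = np.random.default_rng(0)

def g(z):
    return np.maximum(z, 0.0)        # ReLU, one common nonlinear activation

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

L, H, C = 4, 8, 3                    # input size, hidden units, classes
W1, b1 = rng.normal(size=(H, L)), np.zeros(H)
W2, b2 = rng.normal(size=(C, H)), np.zeros(C)

x = rng.normal(size=L)               # one input example
hidden = g(W1 @ x + b1)              # hidden layer: nonlinearity of a weighted sum
probs = softmax(W2 @ hidden + b2)    # output layer: softmax class probabilities
print("class probabilities:", np.round(probs, 3))
```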
Neural Networks Properties
Theorem (Universal Function Approximators). A two-layer
neural network with a sufficient number of neurons can
approximate any continuous function to any desired accuracy.
Practical considerations
Can deal with more complex, nonlinear classification & regression
Large number of neurons and weights
Danger of overfitting
Deep Learning Model
Neural network as a general computation graph
Krizhevsky, Sutskever, Hinton, 2012
Deep Learning Model
Deep Learning Model
§ We need good features!
[Figure: the classic pipeline: image → hand-designed feature extraction (built from prior knowledge and experience) → classification → “Panda”?; challenges include pose, occlusion, multiple objects, and inter-class similarity. Image courtesy of M. Ranzato]
Deep Learning Model
§ Directly learn feature representations from data.
§ Jointly learn the feature representation and the classifier.
[Figure: image → low-level features → mid-level features → high-level features → classifier → “Panda”?; the representation becomes more abstract at higher levels]
Deep Learning: train layers of features so that the classifier works well.
Image courtesy of M. Ranzato
Deep Learning Model
Have we been here before?
➢ Yes.
• Basic ideas common to past neural networks research
• Standard machine learning strategies still relevant.
➢ No.
Today’s Deep Learning = Large-scale Data + Computational Power + New Algorithms
Deep Learning Model
Convolutional Neural Networks (CNNs)
§ A special multi-stage architecture inspired by the visual system
§ Higher stages compute more global, more invariant features
Deep Learning Model
https://www.datasciencecentral.com/lenet-5-a-classic-cnn-architecture/
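A minimal sketch of the two building blocks such a multi-stage architecture repeats: a small convolution (local, shared weights) followed by pooling, which is what makes higher stages more global and more invariant. The toy image and filter below are made-up values, not the LeNet-5 weights.

```python
import numpy as np

# Minimal sketch of one CNN stage: a 3x3 convolution (local, shared weights),
# a ReLU, and 2x2 max pooling. The image and filter are made-up toy values;
# real networks like LeNet-5 stack several such stages before a classifier.

def conv2d(image, kernel):
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

def max_pool(x, size=2):
    H, W = x.shape
    return x[:H - H % size, :W - W % size] \
        .reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.random.default_rng(0).normal(size=(8, 8))
kernel = np.array([[ 1.0,  0.0, -1.0],      # toy edge-detecting filter
                   [ 1.0,  0.0, -1.0],
                   [ 1.0,  0.0, -1.0]])

feature_map = np.maximum(conv2d(image, kernel), 0.0)   # convolution + ReLU
pooled = max_pool(feature_map)                          # more global, more invariant
print("feature map:", feature_map.shape, "-> pooled:", pooled.shape)
```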
Different Neural Network Architectures
§ Exploration of different neural network architectures
§ ResNet: residual networks
§ Networks with attention
§ Transformer networks
§ Neural network architecture search
§ Really large models
§ GPT2, GPT3
§ CLIP
Acknowledgement
The lecture slides are based on materials from ai.berkeley.edu
Thank you. Questions?