MA3K1 Mathematics of Machine Learning April 10, 2021
Solution (23) Given a vector $x \in \mathbb{R}^d$ and a sequence $g = (g_{-d+1}, \dots, g_{d-1})^T$, the convolution is defined as the vector $y = (y_1, \dots, y_d)^T$, where
\[
y_k = \sum_{i=1}^{d} x_i \cdot g_{k-i}, \qquad k \in \{1, \dots, d\}.
\]
In terms of matrices,
\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_d \end{pmatrix}
=
\begin{pmatrix}
g_0 & g_{-1} & \cdots & g_{-d+1} \\
g_1 & g_0 & \cdots & g_{-d+2} \\
\vdots & \vdots & \ddots & \vdots \\
g_{d-1} & g_{d-2} & \cdots & g_0
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix}.
\]
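Note that the matrix above is a Toeplitz matrix: it is constant along each diagonal. As a minimal sketch (my own illustration, assuming NumPy; not part of the notes), the matrix form can be checked against the defining sum:

```python
import numpy as np

# Sketch: build the d x d convolution (Toeplitz) matrix for a filter
# g = (g_{-d+1}, ..., g_{d-1}) and check G @ x against the defining sum
# y_k = sum_i x_i * g_{k-i}.
d = 4
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
g = rng.standard_normal(2 * d - 1)  # g[j + d - 1] stores g_j, j = -(d-1), ..., d-1

def g_at(j):
    return g[j + d - 1]

# Entry (k, i) of the matrix is g_{k-i} (0-based indices here).
G = np.array([[g_at(k - i) for i in range(d)] for k in range(d)])

y_matrix = G @ x
y_direct = np.array([sum(x[i] * g_at(k - i) for i in range(d)) for k in range(d)])
assert np.allclose(y_matrix, y_direct)
```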
The sequence $g$ is called a filter. Convolutional Neural Networks (CNNs) operate on image data, and in the context of colour image data, a vector $x$ that enters a layer of the CNN will not be seen as a single vector in some $\mathbb{R}^d$, but as a tensor with format $N \times M \times 3$. Here, the image is considered to be an $N \times M$ matrix, with each entry corresponding to a pixel, and each pixel is represented by a 3-tuple consisting of the red, green and blue components. A convolution filter will therefore operate on a subset of this tensor, typically an $n \times m \times 3$ window. Typical parameters are $N = M = 32$ and $n = m = 5$.
[Figure: a $5 \times 5$ window sliding across the image; each position of the window is mapped to a single entry of the output matrix.]
Each $5 \times 5 \times 3$ block is mapped linearly to an entry of a $28 \times 28$ matrix in the same way (that is, applying the same filter); note that $28 = 32 - 5 + 1$ is the number of positions the window can occupy in each direction. The filter applied is called a kernel, and the resulting map is interpreted as representing a feature. Specifically, instead of writing it as a vector $g$, we can write it as a matrix $K$, and then have, if we denote the input image by $X$,
\[
Y = X \ast K, \qquad Y_{ij} = \sum_{k,\ell} X_{k\ell} \cdot K_{i-k,\, j-\ell}.
\]
In a convolutional layer, several different kernels, or convolutions, are applied to the image. This can be seen as extracting several features from an image. CNNs have the advantage of being scalable: they work on images of arbitrary size.
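As a minimal sketch (my own illustration, assuming NumPy and SciPy; not part of the notes), the two-dimensional convolution above can be evaluated directly from the defining sum, keeping only the outputs where the kernel fully overlaps the image:

```python
import numpy as np
from scipy.signal import convolve2d

# Sketch: evaluate Y_ij = sum_{k,l} X_{kl} * K_{i-k, j-l} directly, keeping
# only "valid" outputs (kernel fully inside the image), so a 32 x 32 input
# and a 5 x 5 kernel yield a 28 x 28 output.
def conv2d_valid(X, K):
    N, M = X.shape
    n, m = K.shape
    Y = np.zeros((N - n + 1, M - m + 1))
    for oi in range(N - n + 1):
        for oj in range(M - m + 1):
            i, j = oi + n - 1, oj + m - 1          # output index in the formula
            Y[oi, oj] = sum(X[k, l] * K[i - k, j - l]
                            for k in range(oi, oi + n)
                            for l in range(oj, oj + m))
    return Y

X = np.random.default_rng(1).standard_normal((32, 32))
K = np.random.default_rng(2).standard_normal((5, 5))
Y = conv2d_valid(X, K)
assert Y.shape == (28, 28)
assert np.allclose(Y, convolve2d(X, K, mode="valid"))  # cross-check
```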
Solution (24) Let $h \colon \mathbb{R}^d \to \mathcal{Y}$ be any classifier. Define the smallest perturbation that moves a data point into a different class as
\[
\Delta_h(x) = \inf_r \{\|r\| : h(x + r) \neq h(x)\}. \tag{3.5}
\]
Given a probability distribution on the input space, define the robustness as
\[
\rho(h) = \mathbb{E}\left[\frac{\Delta_h(X)}{\|X\|}\right].
\]
For a linear classifier with only two classes, where $h(x) = \operatorname{sign}(w^T x + b)$, we get the vector that moves the point $x$ to the boundary by solving the optimization problem (as for SVMs)
\[
r^* = \operatorname*{arg\,min}_r \ \frac{1}{2}\|r\|^2 \quad \text{subject to} \quad w^T(x + r) + b = 0.
\]
The solution is $r^* = -\frac{w^T x + b}{\|w\|^2}\, w$, with norm
\[
\|r^*\| = \frac{|w^T x + b|}{\|w\|}.
\]
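As a minimal sketch (my own illustration, assuming NumPy; not part of the notes), the closed form can be verified numerically, and the robustness $\rho(h)$ estimated by Monte Carlo sampling:

```python
import numpy as np

# Sketch: minimal perturbation r* = -(w^T x + b) / ||w||^2 * w for a linear
# classifier, plus a Monte Carlo estimate of the robustness rho(h).
rng = np.random.default_rng(0)
d = 5
w, b = rng.standard_normal(d), 0.3

x = rng.standard_normal(d)
r_star = -(w @ x + b) / (w @ w) * w
assert np.isclose(w @ (x + r_star) + b, 0.0)            # x + r* is on the boundary
assert np.isclose(np.linalg.norm(r_star),
                  abs(w @ x + b) / np.linalg.norm(w))   # norm matches the formula

# rho(h) = E[ Delta_h(X) / ||X|| ] for X ~ N(0, I), using
# Delta_h(x) = |w^T x + b| / ||w||.
X = rng.standard_normal((100_000, d))
delta = np.abs(X @ w + b) / np.linalg.norm(w)
rho = np.mean(delta / np.linalg.norm(X, axis=1))
print(f"estimated robustness: {rho:.4f}")
```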
Assume that we now have $k$ linear functions $f_1, \dots, f_k$, with $f_i(x) = w_i^T x + b_i$, and a classifier $h \colon \mathbb{R}^d \to \{1, \dots, k\}$ that assigns to each $x$ the index $j$ of the largest value $f_j(x)$ (this corresponds to the one-vs-rest setting for multiclass classification). Let $x$ be such that $\max_j f_j(x) = f_k(x)$ and define the linear functions
\[
g_i(x) := f_i(x) - f_k(x) = (w_i - w_k)^T x + (b_i - b_k), \qquad i \in \{1, \dots, k-1\}.
\]
Then
\[
x \in \bigcap_{1 \le i \le k-1} \{y : g_i(y) < 0\}.
\]
The intersection of half-spaces is a polyhedron, and $x$ is in the interior of the polyhedron $P$ delineated by the hyperplanes $H_i = \{y : g_i(y) = 0\}$ (see Figure 6).
[Figure: a point $x$ in the interior of a polyhedron bounded by hyperplanes $H_1, \dots, H_5$.]
Figure 6: The distance to misclassification is the radius of the largest ball centered at $x$ that is contained in the polyhedron $P$.
A perturbation $x + r$ ceases to be in class $k$ as soon as $g_j(x + r) > 0$ for some $j$, and the smallest length of such an $r$ equals the radius of the largest ball centered at $x$ that is contained in the polyhedron. Formally, noting that $w_j - w_k = \nabla g_j(x)$, we get
\[
\hat{\jmath} := \operatorname*{arg\,min}_j \frac{|g_j(x)|}{\|\nabla g_j(x)\|}, \qquad r^* = \frac{|g_{\hat{\jmath}}(x)|}{\|\nabla g_{\hat{\jmath}}(x)\|^2} \cdot \nabla g_{\hat{\jmath}}(x), \tag{3.6}
\]
where the absolute value in the numerator of $r^*$ ensures that the perturbation points in the direction of the gradient, since $g_{\hat{\jmath}}(x) < 0$ inside $P$.
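As a minimal sketch (my own illustration, assuming NumPy; not part of the notes), formula (3.6) can be checked on a random multiclass linear classifier:

```python
import numpy as np

# Sketch: for k linear scores f_i(x) = w_i^T x + b_i, find the closest class
# boundary via (3.6) and verify that x + r* reaches it.
rng = np.random.default_rng(3)
d, k = 4, 5
W = rng.standard_normal((k, d))
b = rng.standard_normal(k)

x = rng.standard_normal(d)
c = int(np.argmax(W @ x + b))                         # current class (k in the notes)

others = [i for i in range(k) if i != c]
g = (W[others] @ x + b[others]) - (W[c] @ x + b[c])   # g_i(x), all negative
grads = W[others] - W[c]                              # gradients w_i - w_c
norms = np.linalg.norm(grads, axis=1)

j_hat = int(np.argmin(np.abs(g) / norms))
r_star = np.abs(g[j_hat]) / norms[j_hat] ** 2 * grads[j_hat]

assert np.isclose(g[j_hat] + grads[j_hat] @ r_star, 0.0)  # on the boundary
assert np.argmax(W @ (x + 1.001 * r_star) + b) != c       # class has changed
```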
Solution (25) The discriminator $D$ would like to achieve, on average, a large value on $G_0(Z_0)$, and a small value on $G_1(Z_1)$. Using the logarithm, this can be expressed as the problem of maximizing
\[
\mathbb{E}_{Z_0}[\log(D(G_0(Z_0)))] + \mathbb{E}_{Z_1}[\log(1 - D(G_1(Z_1)))].
\]
As an integral, using the push-forward measures, we get the expression
\[
\int_{\mathcal{X}} \left[\log(D(x))\,\rho_{X_0}(x) + \log(1 - D(x))\,\rho_{X_1}(x)\right] \mathrm{d}x.
\]
We choose $D(x)$ so that the integrand becomes maximal. So considering the function
\[
f(y) = \log(y)\,\rho_{X_0}(x) + \log(1 - y)\,\rho_{X_1}(x)
\]
and computing the first and second derivatives,
\[
f'(y) = \frac{\rho_{X_0}(x)}{y} - \frac{\rho_{X_1}(x)}{1 - y}, \qquad f''(y) = -\frac{\rho_{X_0}(x)}{y^2} - \frac{\rho_{X_1}(x)}{(1 - y)^2} < 0,
\]
we see that we get a maximum at
\[
y = \frac{\rho_{X_0}(x)}{\rho_{X_0}(x) + \rho_{X_1}(x)},
\]
which is the value of $D^*(x)$ that we want.
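As a minimal sketch (my own illustration, assuming NumPy; not part of the notes), the closed-form maximizer can be checked against a grid search for fixed density values:

```python
import numpy as np

# Sketch: for fixed density values rho0 = rho_{X_0}(x) and rho1 = rho_{X_1}(x),
# maximize f(y) = log(y) * rho0 + log(1 - y) * rho1 over a grid of y in (0, 1)
# and compare with the closed form y* = rho0 / (rho0 + rho1).
rho0, rho1 = 0.7, 0.2
y = np.linspace(1e-6, 1 - 1e-6, 1_000_000)
f = np.log(y) * rho0 + np.log(1 - y) * rho1

y_grid = y[np.argmax(f)]
y_star = rho0 / (rho0 + rho1)
assert abs(y_grid - y_star) < 1e-4
print(y_grid, y_star)   # both approximately 0.7778
```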