MATHEMATICS IN MACHINE LEARNING - I
ROHAN PILLAI
ASSISTANT PROFESSOR
DEPARTMENT OF ELECTRICAL ENGINEERING, DTU
Table of Contents
❑ Mathematics in Data Science vs Machine Learning
❑ LINEAR ALGEBRA
  • Tensors
  • Vectors
  • Matrices
  • Special Matrices
  • Transpose of a Matrix
  • Inverse of a Matrix
  • Matrix Differentiation
❑ CALCULUS REVISITED
  • Convex Sets
  • Convex Functions
  • Extrema
  • Derivatives
  • Rules of Derivatives
  • Gradient
  • Hessian
  • Jacobian
  • Stationary Points
  • Stationary Points in Higher Dimensions
  • Example
❑ References
Mathematics in Data Science vs Machine Learning
How can we learn mathematics without getting bogged down in the theory?
Mathematics in data science and machine learning is not about crunching numbers, but about what is happening, why it is happening, and how we can play around with different things to obtain the results we want.
1. LINEAR ALGEBRA
Tensors
A tensor is an array of numbers that may be
❖ a scalar (0th Order Tensor)
❖ a vector (1st Order Tensor)
❖ a matrix (2nd Order Tensor)
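A minimal sketch, assuming NumPy is available, of the three tensor orders listed above:

```python
import numpy as np

scalar = np.array(3.0)              # 0th-order tensor: a single number
vector = np.array([1, 5, 3, 4, 2])  # 1st-order tensor: an ordered list of numbers
matrix = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])         # 2nd-order tensor: a 2-D array of numbers

# ndim reports the order of each tensor
print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
```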
Vectors
A vector in ℝⁿ is an ordered set of n real numbers.
o e.g. v = (1, 5, 3, 4, 2) is in ℝ⁵
o A column vector:
    1
    5
    3
    4
    2
o A row vector: (1 5 3 4 2)
Vector operations: scalar multiplication and vector addition
Any point on the plane containing the vectors u and v is some linear combination au + bv
Vectors…
Vector norms: A norm ||x|| of a vector is informally a measure of the "length" of the vector.
Common norms:
o L1 (Manhattan): ||x||₁ = Σᵢ |xᵢ|
o L2 (Euclidean): ||x||₂ = √(Σᵢ xᵢ²)
For matrices, the Frobenius norm: ||A||F = √(Σᵢ,ⱼ Aᵢⱼ²)
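A small sketch, assuming NumPy, computing these norms for the running example vector and an example matrix:

```python
import numpy as np

x = np.array([1, 5, 3, 4, 2])
A = np.array([[1, 2], [3, 4], [5, 6]])

l1 = np.sum(np.abs(x))            # L1 (Manhattan) norm: sum of absolute values
l2 = np.sqrt(np.sum(x ** 2))      # L2 (Euclidean) norm: sqrt of sum of squares
frob = np.sqrt(np.sum(A ** 2))    # Frobenius norm: L2-style norm over all matrix entries

# Same values via NumPy's built-in norm routine
assert np.isclose(l2, np.linalg.norm(x))
assert np.isclose(frob, np.linalg.norm(A, 'fro'))
print(l1, l2, frob)
```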
Vectors…
Vector dot (inner) product: u•v = Σᵢ uᵢvᵢ
Geometric interpretation of the dot product: the projection of one vector onto the other
❑ If u•v = 0, ||u||₂ ≠ 0, ||v||₂ ≠ 0 → u and v are orthogonal
❑ If u•v = 0, ||u||₂ = 1, ||v||₂ = 1 → u and v are orthonormal
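A short numerical check of the orthogonality and orthonormality conditions, again assuming NumPy:

```python
import numpy as np

u = np.array([1.0, 0.0])
v = np.array([0.0, 2.0])

print(u @ v)   # 0.0 -> u and v are orthogonal

# Orthonormal additionally requires unit length
u_hat, v_hat = u / np.linalg.norm(u), v / np.linalg.norm(v)
print(u_hat @ v_hat, np.linalg.norm(u_hat), np.linalg.norm(v_hat))  # 0.0 1.0 1.0
```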
Matrices
An N×M matrix has N rows and M columns, i.e. it is a 2-D array of numbers. Here is an example matrix A with 3 rows and 2 columns:
    A = [ 1  2
          3  4
          5  6 ]
Linear transformations can be represented as matrices.
Multiplying a matrix with a vector has the geometric meaning of applying a linear
transformation to the vector
Multiplying two matrices has the geometric meaning of applying one linear
transformation after another
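A brief sketch, assuming NumPy, of a matrix acting as a linear transformation (here a 90° rotation) and of composing two transformations by multiplying their matrices:

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])   # rotation by 90 degrees counter-clockwise

x = np.array([1.0, 0.0])
print(R @ x)                  # [0. 1.] -> the vector has been rotated

# Applying R twice equals applying the single matrix R @ R (a 180 degree rotation)
print(R @ (R @ x), (R @ R) @ x)   # both give [-1.  0.]
```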
Matrices…
Matrix Multiplication: C = AB
▪ Cᵢⱼ = Σₖ AᵢₖBₖⱼ
▪ Sizes must match (columns of A = rows of B)
Hadamard Product: C = A ⊙ B
▪ Element-wise multiplication
▪ A, B, C must be of the same size
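The difference between the two products, sketched with NumPy:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C_mat = A @ B   # matrix product: C_ij = sum_k A_ik * B_kj
C_had = A * B   # Hadamard product: element-wise, shapes must match exactly

print(C_mat)    # [[19 22] [43 50]]
print(C_had)    # [[ 5 12] [21 32]]
```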
Special Matrices
Diagonal matrix, example:
    [ a  0  0
      0  b  0
      0  0  c ]
Upper triangular matrix, example:
    [ a  b  c
      0  d  e
      0  0  f ]
Lower triangular matrix, example:
    [ a  0  0
      b  c  0
      d  e  f ]
Symmetric matrix: A = Aᵀ
Transpose of a Matrix
Transpose of a matrix: we can think of it as
o "flipping" the rows and columns
OR
o "reflecting" the vector/matrix about its main diagonal
Examples:
    [ a ]ᵀ              [ a  b ]ᵀ   [ a  c ]
    [ b ]   = (a  b)    [ c  d ]  = [ b  d ]
Inverse of a matrix
The inverse of a square matrix A, denoted by A⁻¹, is the unique matrix s.t.
o AA⁻¹ = A⁻¹A = I (identity matrix)
If A⁻¹ and B⁻¹ exist, then
o (AB)⁻¹ = B⁻¹A⁻¹
o (Aᵀ)⁻¹ = (A⁻¹)ᵀ
For orthogonal matrices: A⁻¹ = Aᵀ
For diagonal matrices: D⁻¹ = diag{d₁⁻¹, …, dₙ⁻¹}
The pseudo-inverse of a matrix A ∈ ℝᵐˣⁿ is defined as A⁺ ∈ ℝⁿˣᵐ, where A⁺ = (AᵀA)⁻¹Aᵀ
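A sketch, assuming NumPy, checking the inverse and pseudo-inverse definitions above (the formula A⁺ = (AᵀA)⁻¹Aᵀ requires AᵀA to be invertible, i.e. A must have full column rank):

```python
import numpy as np

# Square, invertible matrix
A = np.array([[2.0, 1.0], [1.0, 3.0]])
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))       # True: A A^-1 = I

# Tall matrix: use the pseudo-inverse A+ = (A^T A)^-1 A^T
B = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
B_plus = np.linalg.inv(B.T @ B) @ B.T
print(np.allclose(B_plus, np.linalg.pinv(B)))  # True: matches NumPy's pinv
print(np.allclose(B_plus @ B, np.eye(2)))      # True: A+ A = I (a left inverse)
```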
Matrix Differentiation
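As a minimal illustration, two standard matrix-calculus identities, ∇ₓ(aᵀx) = a and ∇ₓ(xᵀAx) = (A + Aᵀ)x, can be checked numerically with finite differences; a sketch assuming NumPy is available:

```python
import numpy as np

def num_grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

rng = np.random.default_rng(0)
a = rng.standard_normal(3)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

# d(a^T x)/dx = a
print(np.allclose(num_grad(lambda x: a @ x, x), a, atol=1e-4))
# d(x^T A x)/dx = (A + A^T) x
print(np.allclose(num_grad(lambda x: x @ A @ x, x), (A + A.T) @ x, atol=1e-4))
```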
2. CALCULUS REVISITED
Convex Sets
A set C ⊆ ℝᵈ is convex if
∀ x, y ∈ C and ∀ λ ∈ [0, 1],
the point z = λ·x + (1 − λ)·y also lies in C
Convex Functions
A function f : ℝᵈ → ℝ is convex if
∀ x, y ∈ ℝᵈ and ∀ λ ∈ [0, 1],
with z = λ·x + (1 − λ)·y,
f(z) ≤ λ·f(x) + (1 − λ)·f(y)
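A small sketch, assuming NumPy, that checks the defining inequality for a convex function (f(x) = x²) and a non-convex one at many values of λ:

```python
import numpy as np

def violates_convexity(f, x, y, lambdas):
    """Return True if f(lam*x + (1-lam)*y) > lam*f(x) + (1-lam)*f(y) for some lam."""
    z = lambdas * x + (1 - lambdas) * y
    return np.any(f(z) > lambdas * f(x) + (1 - lambdas) * f(y) + 1e-12)

lam = np.linspace(0, 1, 101)
x, y = -2.0, 3.0
print(violates_convexity(lambda t: t ** 2, x, y, lam))         # False: x^2 is convex
print(violates_convexity(lambda t: np.sin(3 * t), x, y, lam))  # True: sin(3t) is not convex
```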
Extrema
Since we always seek the "best" values of a function, we are usually looking for its maxima or minima
Global extremum: a point which achieves the best value of the function (max/min) among all possible points
Local extremum: a point which achieves the best value of the function only in a small region surrounding that point
Most machine learning algorithms aim to find a global extremum
o E.g. we saw that SVM wanted to find the model with max margin
Sometimes this is difficult, so we settle for local extrema (e.g. deep nets)
Derivatives
The derivative measures how much one variable (y = f(x)) changes when there is a small change in the other variable (x)
o f′(x) = dy/dx
It is zero when the function is flat (horizontal), such as at a minimum or maximum of f(x)
It is positive when f(x) is strictly increasing and negative when f(x) is strictly decreasing
Double derivatives:
o The double derivative is f″(x), or y″ = d²y/dx²
o It is positive for convex functions (having a single minimum) and negative for concave functions (having a single maximum)
In higher dimensions (functions of many variables / vectors), we use the idea of partial derivatives
Rules of Derivatives
Sum Rule: (f(x) + g(x))′ = f′(x) + g′(x)
Scaling Rule: (a·f(x))′ = a·f′(x), if a is not a function of x
Product Rule: (f(x)·g(x))′ = f′(x)·g(x) + g′(x)·f(x)
Quotient Rule: (f(x)/g(x))′ = (f′(x)·g(x) − g′(x)·f(x)) / g(x)²
Chain Rule: (f(g(x)))′ ≝ (f∘g)′(x) = f′(g(x))·g′(x)
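These rules can also be verified symbolically; a minimal sketch, assuming SymPy is available:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x)
g = x ** 2 + 1

# Product rule: (f g)' = f' g + g' f
assert sp.simplify(sp.diff(f * g, x) - (sp.diff(f, x) * g + sp.diff(g, x) * f)) == 0
# Quotient rule: (f / g)' = (f' g - g' f) / g^2
assert sp.simplify(sp.diff(f / g, x) - (sp.diff(f, x) * g - sp.diff(g, x) * f) / g ** 2) == 0
# Chain rule: (f(g(x)))' = f'(g(x)) g'(x)
assert sp.simplify(sp.diff(f.subs(x, g), x) - sp.diff(f, x).subs(x, g) * sp.diff(g, x)) == 0
print("all rules verified")
```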
Gradient
Gradient: the multivariate generalisation of the derivative
For multivariate functions with d-dimensional inputs, the gradient simply records how much the function would change if we move a little bit along each one of the d axes:
    ∇f(x) = (∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂x_d)ᵀ
The gradient also has the distinction of offering the direction of steepest ascent, i.e. if we want the maximum increase in function value, we must move a little bit along the gradient. Similarly, we must move a little bit in the direction opposite to the gradient to get the maximum decrease in function value, i.e. the negative gradient offers the direction of steepest descent
At a minimum or maximum, the gradient is the zero vector
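A sketch, assuming NumPy, that computes a gradient numerically for f(x, y) = x² + y² and illustrates steepest ascent and descent:

```python
import numpy as np

def f(p):
    return p[0] ** 2 + p[1] ** 2

def num_grad(f, p, h=1e-6):
    """Central-difference gradient along each axis."""
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)
    return g

p = np.array([1.0, 2.0])
g = num_grad(f, p)
print(g)                        # approximately [2. 4.] = (2x, 2y)

step = 0.01
print(f(p + step * g) > f(p))   # True: a small step along the gradient increases f
print(f(p - step * g) < f(p))   # True: a small step against it decreases f (steepest descent)
```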
Hessian
Hessian: the gradient of the gradient
The 2nd derivative of f : ℝⁿ → ℝ is an n × n matrix called the Hessian:
    ∇²f(x) = [ ∂²f/∂x₁²      ∂²f/∂x₁∂x₂   …   ∂²f/∂x₁∂xₙ
               ∂²f/∂x₂∂x₁    ∂²f/∂x₂²      …   ∂²f/∂x₂∂xₙ
               ⋮              ⋮              ⋱   ⋮
               ∂²f/∂xₙ∂x₁    ∂²f/∂xₙ∂x₂   …   ∂²f/∂xₙ²   ]
Note that the Hessian matrix is symmetric
It is the equivalent of the 2nd derivative in scalar calculus and has similar uses
All rules of derivatives (chain, product etc.) apply here as well
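A sketch, assuming NumPy, using finite differences on the quadratic f(x) = ½ xᵀAx with symmetric A, whose Hessian is exactly A:

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])       # symmetric matrix

def f(x):
    return 0.5 * x @ A @ x       # quadratic form: the Hessian should equal A

def num_hessian(f, x, h=1e-4):
    """Finite-difference Hessian; symmetric by construction of the formula."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.zeros(n), np.zeros(n)
            ei[i], ej[j] = h, h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h ** 2)
    return H

print(np.round(num_hessian(f, np.array([1.0, -2.0])), 3))   # approximately A, and symmetric
```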
Jacobian
The Jacobian is the equivalent of the gradient for vector-valued functions
The Hessian can be seen as the gradient (Jacobian) of the gradient (which is a vector)
For f : ℝⁿ → ℝᵐ, the Jacobian is an m × n matrix with Jᵢ,ⱼ = ∂fᵢ/∂xⱼ:
    J = [ ∂f₁/∂x₁   ∂f₁/∂x₂   …   ∂f₁/∂xₙ
          ∂f₂/∂x₁   ∂f₂/∂x₂   …   ∂f₂/∂xₙ
          ⋮                     ⋱   ⋮
          ∂fₘ/∂x₁   ∂fₘ/∂x₂   …   ∂fₘ/∂xₙ ]
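A sketch, assuming NumPy, of a finite-difference Jacobian for a vector-valued function f : ℝ² → ℝ³:

```python
import numpy as np

def f(x):
    # f: R^2 -> R^3
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def num_jacobian(f, x, h=1e-6):
    """J[i, j] = d f_i / d x_j via central differences; shape (m, n)."""
    m, n = f(x).size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([1.0, 2.0])
print(np.round(num_jacobian(f, x), 3))
# Analytic Jacobian: [[x1, x0], [cos(x0), 0], [0, 2*x1]] = [[2, 1], [0.540, 0], [0, 4]]
```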
Stationary Points
These are places where the derivative vanishes, i.e. is 0
They can be local/global extrema or saddle points
The derivative being zero is its way of telling us that at that point, the function looks flat
Saddle points can be tedious in ML
We can find out whether a stationary point is a saddle or an extremum using the 2nd derivative
Just as the sign of the derivative tells us if the function is increasing or decreasing at a given point, the 2nd derivative tells us if the derivative is increasing or decreasing
o If f″(x) > 0 and f′(x) = 0, the derivative moves from −ve to +ve around this point – local/global min!
o If f″(x) < 0 and f′(x) = 0, the derivative moves from +ve to −ve around this point – local/global max!
o If f″(x) = 0 and f′(x) = 0, this may be an extremum or a saddle – higher derivatives, e.g. f‴(x), are needed
Stationary Points in higher dimensions
These are places where the gradient vanishes, i.e. is a zero vector!
We can still find out whether a stationary point is a saddle or an extremum using the 2nd derivative test, just as in 1D
A bit more complicated to visualize, but the Hessian tells us how the surface of the function is curved at a point
Recall that if a square d × d symmetric matrix A satisfies xᵀAx > 0 for all non-zero x ∈ ℝᵈ, then it is called positive definite (PD); if it satisfies xᵀAx < 0 for all non-zero x ∈ ℝᵈ, then it is negative definite (ND)
o If ∇f(x) = 0 and ∇²f(x) is a PD matrix, then x is a local/global min
o If ∇f(x) = 0 and ∇²f(x) is a ND matrix, then x is a local/global max
o If neither of these is true, then either x is a saddle point or the test fails; higher order derivatives are needed to verify
Whether the point is a saddle or the test has failed depends on the eigenvalues of ∇²f(x), as in the sketch below
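A sketch, assuming NumPy, that classifies a stationary point from the eigenvalues of its Hessian (all positive → PD → min, all negative → ND → max, mixed signs → saddle, any zero → the test is inconclusive):

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from its (symmetric) Hessian H."""
    eigvals = np.linalg.eigvalsh(H)
    if np.all(eigvals > tol):
        return "local/global min (PD Hessian)"
    if np.all(eigvals < -tol):
        return "local/global max (ND Hessian)"
    if np.any(eigvals > tol) and np.any(eigvals < -tol):
        return "saddle point (indefinite Hessian)"
    return "test fails: need higher-order derivatives"

# The three matrices match the Example - Hessians slide later in the deck
print(classify_stationary_point(np.array([[2.0, 0.125], [0.125, 2.0]])))     # min
print(classify_stationary_point(np.array([[-2.0, -0.25], [-0.25, -2.0]])))   # max
print(classify_stationary_point(np.array([[-2.0, -0.125], [-0.125, 2.0]])))  # saddle
```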
Example – Function Values
In this discrete example, we can calculate the gradient at a point (x₀, y₀) as
o ∇f(x₀, y₀) = (Δf/Δx, Δf/Δy), where
o Δf/Δx = [f(x₀+1, y₀) − f(x₀−1, y₀)] / 2
o Δf/Δy = [f(x₀, y₀+1) − f(x₀, y₀−1)] / 2
Example – Gradients
(Figure annotations: local max, local min, saddle point marked on the grid.)
In this discrete example, we can calculate the gradient at a point (x₀, y₀) as
o ∇f(x₀, y₀) = (Δf/Δx, Δf/Δy), where
o Δf/Δx = [f(x₀+1, y₀) − f(x₀−1, y₀)] / 2
o Δf/Δy = [f(x₀, y₀+1) − f(x₀, y₀−1)] / 2
Example – Gradients
(Figure annotations: gradients converge toward the local max; gradients diverge away from the local min; at saddle points, both can happen along different axes.)
In this discrete example, we can calculate the gradient at a point (x₀, y₀) as
o ∇f(x₀, y₀) = (Δf/Δx, Δf/Δy), where
o Δf/Δx = [f(x₀+1, y₀) − f(x₀−1, y₀)] / 2
o Δf/Δy = [f(x₀, y₀+1) − f(x₀, y₀−1)] / 2
o We can visualize these gradients using simple arrows as well
Example – Hessians
(Figure annotations:
  ∇²f = [ −2  −0.25 ; −0.25  −2 ], which is ND, i.e. a local max
  ∇²f = [ 2  0.125 ; 0.125  2 ], which is PD, i.e. a local min
  ∇²f = [ −2  −0.125 ; −0.125  2 ], which is neither PD nor ND, i.e. a saddle point)
In this discrete example, we can calculate the Hessian at (x₀, y₀) as
o ∇²f(x₀, y₀) = [ Δ²f/Δx²  Δ²f/ΔxΔy ; Δ²f/ΔxΔy  Δ²f/Δy² ], where
o Δ²f/Δx² = f(x₀+1, y₀) + f(x₀−1, y₀) − 2f(x₀, y₀)
o Δ²f/Δy² = f(x₀, y₀+1) + f(x₀, y₀−1) − 2f(x₀, y₀)
o Δ²f/ΔxΔy = (f_xy + f_yx) / 2, where
  ❑ f_xy = [Δf/Δx(x₀, y₀+1) − Δf/Δx(x₀, y₀−1)] / 2
  ❑ f_yx = [Δf/Δy(x₀+1, y₀) − Δf/Δy(x₀−1, y₀)] / 2
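The discrete formulas above translate directly into code; a sketch, assuming NumPy and a 2-D array F of function values sampled on an integer grid (so that F[x, y] = f(x, y)):

```python
import numpy as np

def discrete_gradient(F, x0, y0):
    """Central-difference gradient of the grid F at interior point (x0, y0)."""
    dfdx = (F[x0 + 1, y0] - F[x0 - 1, y0]) / 2
    dfdy = (F[x0, y0 + 1] - F[x0, y0 - 1]) / 2
    return np.array([dfdx, dfdy])

def discrete_hessian(F, x0, y0):
    """Second differences of F at (x0, y0), averaging the two mixed terms."""
    dxx = F[x0 + 1, y0] + F[x0 - 1, y0] - 2 * F[x0, y0]
    dyy = F[x0, y0 + 1] + F[x0, y0 - 1] - 2 * F[x0, y0]
    fxy = (discrete_gradient(F, x0, y0 + 1)[0] - discrete_gradient(F, x0, y0 - 1)[0]) / 2
    fyx = (discrete_gradient(F, x0 + 1, y0)[1] - discrete_gradient(F, x0 - 1, y0)[1]) / 2
    dxy = (fxy + fyx) / 2
    return np.array([[dxx, dxy], [dxy, dyy]])

# Example: sample f(x, y) = -(x - 5)^2 - (y - 5)^2, which has a local max at (5, 5)
xs, ys = np.meshgrid(np.arange(11), np.arange(11), indexing='ij')
F = -(xs - 5.0) ** 2 - (ys - 5.0) ** 2
print(discrete_gradient(F, 5, 5))   # [0. 0.] -> a stationary point
print(discrete_hessian(F, 5, 5))    # [[-2. 0.] [0. -2.]] -> ND, so a local max
```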
References
https://github.com/purushottamkar/ml19-20w/tree/master/lecture_slides
https://www.analyticsvidhya.com/blog/2019/10/mathematics-behind-machine-learning/
https://tminka.github.io/papers/matrix/minka-matrix.pdf