MODULE 2
ANALYTIC GEOMETRY
Athira S Nair, AP,CS, MEC
What is a Vector Norm?
• The length of the vector is referred to as the vector norm or the vector’s magnitude.
• The length of a vector is a nonnegative number that describes the extent of the vector in space.
• Specific notations are used to represent vector norms in broader calculations.
A single bar is used to denote a vector norm, absolute value, or complex modulus, while a double bar is reserved for denoting a matrix norm.
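Following the standard definition, a norm on a vector space V is a function ‖·‖ : V → R that assigns each vector x its length ‖x‖, such that for all λ ∈ R and x, y ∈ V:
• Absolutely homogeneous: ‖λx‖ = |λ| ‖x‖
• Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖
• Positive definite: ‖x‖ ≥ 0 and ‖x‖ = 0 ⟺ x = 0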
Manhattan Norm or L1 norm
This norm is the sum of the absolute values of a vector's components.
The simple mathematical formulation is as below:
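For x = [x1, …, xn]⊤ ∈ Rn, the standard formulation is
‖x‖1 = |x1| + |x2| + … + |xn| = Σi |xi|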
L2 Norm (Euclidean Norm)
This norm is commonly used in Machine Learning due to being differentiable, which
is crucial for optimization purposes.
Like the L1 norm, the L2 norm is often used when fitting machine learning algorithms
as a regularization method, e.g. a method to keep the coefficients of the model small
and, in turn, the model less complex.
By far, the L2 norm is more commonly used than other vector norms in machine
learning.
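For x = [x1, …, xn]⊤ ∈ Rn, the standard formulation is
‖x‖2 = √(x1² + x2² + … + xn²) = √(x⊤x)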
Max Norm
• The length of a vector can be calculated using the maximum
norm, also called max norm.
• The max norm of a vector is referred to as the L∞ norm.
• The max norm is calculated as the maximum absolute value of the vector's components, hence the name.
• Max norm is also used as a regularization in machine
learning, such as on neural network weights, called max
norm regularization.
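For x = [x1, …, xn]⊤ ∈ Rn, the standard formulation is
‖x‖∞ = max(|x1|, |x2|, …, |xn|)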
Inner products
• Inner products allow for the introduction of intuitive geometrical concepts, such as the length of a
vector and the angle or distance between two vectors.
• A major purpose of inner products is to determine whether vectors are orthogonal to each other.
Dot Product
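For x, y ∈ Rn, the dot product takes the standard form
x⊤y = Σi xiyi = x1y1 + x2y2 + … + xnyn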
General Inner Products
A bilinear mapping Ω is a mapping with two arguments, and it is linear in
each argument, i.e., when we look at a vector space V then it holds that
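In the standard formulation, for all x, y, z ∈ V and λ, ψ ∈ R:
Ω(λx + ψy, z) = λΩ(x, z) + ψΩ(y, z)
Ω(x, λy + ψz) = λΩ(x, y) + ψΩ(x, z)
If Ω is additionally symmetric (Ω(x, y) = Ω(y, x)) and positive definite, it is called an inner product, written ⟨x, y⟩.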
Symmetric, Positive Definite Matrices
• Symmetric, positive definite matrices play an important role in machine learning, and they are defined via
the inner product
• A symmetric matrix A is called symmetric, positive definite (or just positive definite) if
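∀x ∈ V \ {0} : x⊤Ax > 0
If only ≥ holds in this condition, A is called symmetric, positive semidefinite.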
Lengths and Distances
• In geometry, we are often interested in lengths of vectors.
• We can now use an inner product to compute lengths, as shown in the example below.
• Let us take x = [1, 1]⊤ ∈ R2. If we use the dot product as the inner product
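In general, the norm induced by the inner product is ‖x‖ = √⟨x, x⟩. For this example we obtain
‖x‖ = √(x⊤x) = √(1² + 1²) = √2
as the length of x.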
Distance and metric
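Following the standard definitions, for an inner product space (V, ⟨·,·⟩) the distance between x and y is
d(x, y) := ‖x − y‖ = √⟨x − y, x − y⟩
If the dot product is used as the inner product, this is the Euclidean distance. The mapping d is called a metric; a metric satisfies:
• Positive definite: d(x, y) ≥ 0 and d(x, y) = 0 ⟺ x = y
• Symmetric: d(x, y) = d(y, x)
• Triangle inequality: d(x, z) ≤ d(x, y) + d(y, z)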
Angles
• We use the Cauchy-Schwarz inequality to define angles ω in inner product spaces between two vectors x, y
• Assume that x ≠ 0, y ≠ 0. Then
• The number ω is the angle between the vectors x and y.
• Intuitively, the angle between two vectors tells us how similar their orientations are.
• For example, using the dot product, the angle between x and y = 4x, i.e., y is a scaled version of x, is 0: their orientation is the same.
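Following the standard definitions, the Cauchy-Schwarz inequality guarantees that
−1 ≤ ⟨x, y⟩ / (‖x‖ ‖y‖) ≤ 1
so there is a unique ω ∈ [0, π] with
cos ω = ⟨x, y⟩ / (‖x‖ ‖y‖)
For the example y = 4x with the dot product: cos ω = ⟨x, 4x⟩ / (‖x‖ ‖4x‖) = 4‖x‖² / (4‖x‖²) = 1, so ω = 0.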
Orthogonality
• A key feature of the inner product is that it also allows us to characterize vectors that are orthogonal
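Following the standard definition, two vectors x and y are orthogonal if and only if ⟨x, y⟩ = 0, written x ⊥ y. If additionally ‖x‖ = 1 = ‖y‖, i.e., both are unit vectors, then x and y are orthonormal.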
Orthogonal Matrix
• A square matrix A ∈ Rn×n is an orthogonal matrix if and only if its columns are orthonormal, so that A⊤A = I = AA⊤, which implies A⁻¹ = A⊤.
• Transformations by orthogonal matrices are special because the length of a vector x is not changed when
transforming it using an orthogonal matrix A. For the dot product, we obtain
• The angle between any two vectors x, y, as measured by their inner product, is also unchanged when transforming both of them using an orthogonal matrix A.
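In the standard formulation:
Length preservation: ‖Ax‖² = (Ax)⊤(Ax) = x⊤A⊤Ax = x⊤Ix = x⊤x = ‖x‖²
Angle preservation: cos ω = (Ax)⊤(Ay) / (‖Ax‖ ‖Ay‖) = x⊤y / (‖x‖ ‖y‖)
i.e., transformations by orthogonal matrices preserve both lengths and angles; they describe rotations (possibly combined with flips).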
Orthonormal Basis
We can use Gaussian elimination to find a basis for a vector space spanned by a set of vectors.
Assume we are given a set {b̃1, …, b̃n} of non-orthogonal and unnormalized basis vectors.
We concatenate them into a matrix B̃ = [b̃1, …, b̃n] and apply Gaussian elimination to the augmented matrix [B̃B̃⊤ | B̃] to obtain an orthonormal basis.
This constructive way to iteratively build an orthonormal basis {b1, …, bn} is called the Gram-Schmidt process.
Orthogonal Complement
Generally, orthogonal complements can be used to describe hyperplanes
in n-dimensional vector and affine spaces
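Following the standard definition, if V is a D-dimensional vector space and U ⊆ V is an M-dimensional subspace, the orthogonal complement U⊥ is the (D − M)-dimensional subspace
U⊥ = {v ∈ V : ⟨v, u⟩ = 0 for all u ∈ U}
i.e., it contains all vectors of V that are orthogonal to every vector in U. In particular, a hyperplane in Rn can be described by a single normal vector that spans its orthogonal complement.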
Projections
• Projections are an important class of linear transformations
• Projections play an important role in graphics, coding theory, statistics and
machine learning.
• In machine learning, we often deal with data that is high-dimensional.
• High-dimensional data is often hard to analyze or visualize.
• However, high-dimensional data quite often possesses the property that only a
few dimensions contain most information, and most other dimensions are not
essential to describe key properties of the data.
• When we compress or visualize high-dimensional data, we will lose information.
• To minimize this compression loss, we ideally find the most informative
dimensions in the data.
• More specifically, we can project the original high-dimensional data onto a lower-
dimensional feature space and work in this lower-dimensional space to learn
more about the dataset and extract relevant patterns.
Orthogonal Projections
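Following the standard definition, a linear mapping π : V → U from a vector space V onto a subspace U is called a projection if π² = π ∘ π = π; correspondingly, a projection matrix Pπ satisfies Pπ² = Pπ.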
Projection onto One-Dimensional Subspaces (Lines)
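A sketch of the standard result, assuming U = span(b) with basis vector b ∈ Rn and the dot product as the inner product:
λ = b⊤x / (b⊤b)
πU(x) = λb = (b⊤x / b⊤b) b
Pπ = bb⊤ / (b⊤b)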
Projection onto General Subspaces
1. Let us find the coordinates λ1, …, λm of the projection with respect to U.
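A sketch of the standard derivation, assuming B = [b1, …, bm] is the matrix of basis vectors of U and the dot product is used as the inner product:
The orthogonality condition B⊤(x − Bλ) = 0 gives the normal equation B⊤Bλ = B⊤x, hence λ = (B⊤B)⁻¹B⊤x.
2. The projection is πU(x) = Bλ = B(B⊤B)⁻¹B⊤x.
3. The projection matrix is Pπ = B(B⊤B)⁻¹B⊤.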
• Given that the linear equation cannot be solved exactly, we can find an approximate solution.
• The idea is to find the vector in the subspace spanned by the columns of A that is closest to b,
• i.e., we compute the orthogonal projection of b onto the subspace spanned by the columns of A.
• This problem arises often in practice, and the solution is called the least-squares solution.
• The projection error is the norm of the difference vector between the original vector and its projection onto U.
This displacement vector is orthogonal to all basis vectors of U.
If the basis vectors {b1, …, bm} are orthonormal, then the projection equation simplifies, as shown below.
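Following the standard result (assuming the columns of B are orthonormal, i.e., B⊤B = I):
πU(x) = BB⊤x and λ = B⊤x
so the inverse (B⊤B)⁻¹ no longer needs to be computed. For the least-squares problem above, the same projection idea yields the normal equations A⊤Ax = A⊤b.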
Example (University Question)
The projection matrix is:
Gram-Schmidt Orthogonalization
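A sketch of the standard construction: given a basis (b̃1, …, b̃n) of an n-dimensional vector space V, an orthogonal basis (b1, …, bn) is built iteratively by
b1 := b̃1
bk := b̃k − πspan(b1, …, bk−1)(b̃k),  k = 2, …, n
i.e., the k-th vector b̃k is reduced by its projection onto the span of the previously constructed vectors b1, …, bk−1; normalizing each bk then yields an orthonormal basis.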
Use the Gram-Schmidt process to find an orthogonal basis for the column space of the following
matrix.
Example (University Question)
Use the Gram-Schmidt process to find an orthonormal basis from the ordered basis: