Linear Algebra and Machine Learning
CS158 - 1 (Artificial Intelligence)
Mr. Raymond Sedilla, MSIT
Linear algebra
Linear algebra is a field of mathematics that could be
called the mathematics of data. It is undeniably a pillar
of the field of machine learning, and many recommend
it as a prerequisite subject to study prior to getting
started in machine learning.
Learn linear algebra notation
Linear algebra is the mathematics of data
and the notation allows you to describe
operations on data precisely with specific
operators. You need to be able to read
and write this notation.
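As a small illustration of what that notation looks like (the symbols here are my own choice, not the slides'), a system of two linear equations written compactly as a matrix-vector product in LaTeX:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Two linear equations written compactly as A x = b.
\[
\mathbf{A}\mathbf{x} = \mathbf{b},
\qquad
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\]
\end{document}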
Learn linear algebra arithmetic
In partnership with the notation of linear algebra are the arithmetic operations performed. You need to know how to add, subtract, and multiply scalars, vectors, and matrices.
Learn Linear Algebra for Statistics
You must learn linear algebra in order to be able to learn statistics, especially multivariate statistics. Statistics and data analysis form another pillar field of mathematics that supports machine learning.
Learn matrix factorization
Building on notation and arithmetic is the idea of matrix factorization, also called matrix decomposition. You need to know how to factorize a matrix and what it means. Matrix factorization is a key tool in linear algebra and is used widely as an element of many more complex operations in both linear algebra (such as the matrix inverse) and machine learning (such as least squares).
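As a minimal sketch of the idea (using NumPy, which these notes do not prescribe), the QR decomposition factorizes a matrix into an orthogonal factor and a triangular factor whose product rebuilds the original:

import numpy as np

# A small 3x3 matrix to factorize.
A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 3.0],
              [1.0, 3.0, 6.0]])

# QR decomposition: A = Q @ R, with Q orthogonal and R upper triangular.
Q, R = np.linalg.qr(A)

# Multiplying the factors back together recovers the original matrix.
print(np.allclose(A, Q @ R))  # True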
Examples of linear algebra in machine learning
• Dataset and Data Files
• Images and Photographs
• One Hot Encoding
• Linear Regression
• Regularization
• Principal Component Analysis
• Singular-Value Decomposition
• Latent Semantic Analysis
• Recommender Systems
• Deep Learning
Dataset and Data Files
In machine learning, you fit a model on a dataset. This is the table-like set of numbers where each row represents an observation and each column represents a feature of the observation. For example, here is a snippet of the Iris flowers dataset:
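The snippet itself did not survive in these notes. As a stand-in, scikit-learn (an assumed dependency) ships the same Iris dataset as a ready-made matrix of observations and features:

from sklearn.datasets import load_iris

# Each row is one observation (a flower); each column is a feature.
iris = load_iris()
X = iris.data      # shape (150, 4): sepal and petal lengths and widths
print(X.shape)
print(X[:3])       # the first three observations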
Images and Photographs
Perhaps you are more used to working with images or photographs
in computer vision applications. Each image that you work with is
itself a table structure with a width and height and one pixel value in
each cell for black and white images or 3 pixel values in each cell for
a color image. A photo is yet another example of a matrix from linear
algebra. Operations on the image, such as cropping, scaling, shearing
and so on are all described using the notation and operations of
linear algebra.
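A minimal sketch of that view, using a synthetic NumPy array in place of a real photograph (the pixel values are randomly generated for illustration):

import numpy as np

# A synthetic 64x64 RGB "photo": height x width x 3 channel values.
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Cropping is just slicing rows and columns of the matrix.
crop = image[10:50, 10:50]

# A crude 2x downscale: keep every second row and column.
small = image[::2, ::2]

print(image.shape, crop.shape, small.shape)  # (64, 64, 3) (40, 40, 3) (32, 32, 3)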
One Hot Encoding
A one hot encoding is where a table is created to represent the variable
with one column for each category and a row for each example in the
dataset. A check or one-value is added in the column for the categorical
value for a given row, and a zero-value is added to all other columns.
For example, consider a color variable with three rows:
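The slide's example table was lost, so the three values below (red, green, blue) are assumed for illustration. A minimal sketch with pandas:

import pandas as pd

# Hypothetical color column; the original 3-row example did not survive,
# so these category values are assumed.
colors = pd.DataFrame({"color": ["red", "green", "blue"]})

# One column per category; a 1 marks each row's category, 0 elsewhere.
encoded = pd.get_dummies(colors["color"], dtype=int)
print(encoded)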
Linear Regression
Linear regression is an old method from
statistics for describing the relationships
between variables. It is often used in
machine learning for predicting
numerical values in simpler regression
problems. There are many ways to describe and solve the linear regression problem, that is, finding a set of coefficients that, when multiplied by each of the input variables and added together, results in the best prediction of the output variable.
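A minimal sketch of one such way, solving the least squares problem directly with NumPy on made-up toy data:

import numpy as np

# Toy data: y is roughly 2*x + 1 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# Stack a column of ones so the intercept becomes one of the coefficients.
X = np.column_stack([x, np.ones_like(x)])

# Solve the least squares problem min ||X b - y|| with linear algebra.
coeffs, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # approximately [2.0, 1.0]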
Regularization
A technique that is often used to encourage a model to minimize the
size of coefficients while it is being fit on data is called regularization.
Common implementations include the L2 and L1 forms of
regularization. Both of these forms of regularization are in fact a measure of the magnitude or length of the coefficients as a vector, computed using a method lifted directly from linear algebra called the vector norm.
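A minimal sketch of those two norms with NumPy (the coefficient values are made up):

import numpy as np

# A hypothetical vector of model coefficients.
coef = np.array([3.0, -4.0, 0.0, 1.0])

# L1 norm: sum of absolute values (the quantity the L1 penalty measures).
l1 = np.linalg.norm(coef, ord=1)  # 8.0

# L2 norm: Euclidean length (the quantity the L2 penalty measures).
l2 = np.linalg.norm(coef, ord=2)  # sqrt(26), about 5.10

print(l1, l2)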
Principal Component Analysis
Methods for automatically reducing the number of columns of a dataset are called dimensionality reduction, and perhaps the most popular method is called principal component analysis, or PCA for short.
This method is used in machine learning to create projections of high-dimensional data for both visualization and for training models. The core of the PCA method is a matrix factorization method from linear algebra. The eigendecomposition can be used, and more robust implementations may use the singular-value decomposition, or SVD.
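A minimal sketch of PCA built on the SVD with NumPy (random data and a choice of two retained components, both assumed for illustration):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # 100 observations, 5 features

# Center the data; PCA is defined on mean-centered columns.
Xc = X - X.mean(axis=0)

# The SVD of the centered data gives the principal directions in Vt.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project onto the first two principal components.
projected = Xc @ Vt[:2].T
print(projected.shape)  # (100, 2)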
Singular-Value Decomposition
Another popular dimensionality reduction method is the singular-value decomposition method, or SVD for short. As mentioned, and as the name of the method suggests, it is a matrix factorization method from the field of linear algebra. It has wide use in linear algebra and can be used directly in applications such as feature selection, visualization, noise reduction, and more.
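A minimal sketch of the factorization itself with NumPy:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Factorize A into U, the singular values S, and V transpose.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuilding A from the factors confirms the decomposition.
print(np.allclose(A, U @ np.diag(S) @ Vt))  # True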
Latent Semantic Analysis
Matrix factorization methods such as the singular-value decomposition can be applied to a sparse document-term matrix, which has the effect of distilling the representation down to its most relevant essence. Documents processed in this way are much easier to compare, query, and use as the basis for a supervised machine learning model.
This form of data preparation is called Latent Semantic Analysis, or LSA for short, and is also known by the name Latent Semantic Indexing, or LSI.
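A minimal sketch of LSA with scikit-learn (an assumed dependency; the three toy documents are made up):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Build the sparse document-term matrix the slide refers to.
tfidf = TfidfVectorizer().fit_transform(docs)

# A truncated SVD distills it down to a small number of latent topics.
lsa = TruncatedSVD(n_components=2, random_state=0)
topics = lsa.fit_transform(tfidf)
print(topics.shape)  # (3, 2): each document as a 2-component vector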
Recommender Systems
Predictive modeling problems that involve the recommendation of products are called recommender systems, a sub-field of machine learning.
Examples include the recommendation of books based on your previous purchases and the purchases of customers like you on Amazon, and the recommendation of movies and TV shows to watch based on your viewing history and the viewing history of subscribers like you on Netflix.
Deep Learning
Deep learning is the recently resurgent use of artificial neural networks with newer methods and faster hardware that allow for the development and training of larger and deeper (more layers) networks on very large datasets. Deep learning methods routinely achieve state-of-the-art results on a range of challenging problems such as machine translation, photo captioning, speech recognition, and much more.
What is a vector?
A vector is a tuple of one or more values called scalars. Vectors are built from components, which are ordinary numbers. You can think of a vector as a list of numbers, and of vector algebra as operations performed on the numbers in the list.
Vector addition
Two vectors of equal length can be added together to create a new third vector. c = a + b
Vector subtraction
One vector can be subtracted from another vector of equal length to create a new third vector. c = a - b
Vector multiplication
Two vectors of equal length can be multiplied together, element-wise. c = a * b
Vector division
Two vectors of equal length can be divided, element-wise. c = a / b
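A minimal sketch of all four element-wise operations with NumPy (the vector values are made up):

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(a + b)  # [5. 7. 9.]
print(a - b)  # [-3. -3. -3.]
print(a * b)  # [ 4. 10. 18.]  element-wise product
print(a / b)  # [0.25 0.4  0.5 ]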
Vector dot product
• We can calculate the sum of the multiplied elements of two vectors of the same length to give a scalar. This is called the dot product, named for the dot operator used when describing the operation.
• The dot product is the key tool for calculating vector projections, vector decompositions, and determining orthogonality.
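A minimal sketch with NumPy, confirming the sum-of-products definition:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Sum of element-wise products: 1*4 + 2*5 + 3*6 = 32.
print(np.dot(a, b))  # 32.0
print(a @ b)         # the @ operator computes the same thing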
Vector – Scalar Multiplication
A vector can be multiplied by a scalar, in effect scaling the magnitude of
the vector. To keep notation simple, we will use lowercase s to
represent the scalar value. c = s * v or c = sv
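A minimal sketch with NumPy:

import numpy as np

v = np.array([1.0, 2.0, 3.0])
s = 0.5

# Each component is scaled by s, halving the vector's magnitude.
c = s * v
print(c)  # [0.5 1.  1.5]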