ML Algorithms Explained

ML Algorithms: Code & Concepts


scikit-learn


Linear Regression: Theory


Concept: A foundational algorithm that models the linear relationship
between features and a continuous target. It fits a line (or hyperplane) that
minimizes the sum of squared errors (the vertical distance from each point
to the line).

Pros: Simple to understand, highly interpretable coefficients, fast to train.

Cons: Assumes the relationship is linear, can be sensitive to outliers.

When to Use: Excellent as a starting point or baseline for any regression
problem. Use it when you need a simple, explainable model.


Linear Regression: Visualization


This plot shows the raw data points and the best-fitting line found by the
model. The goal is to minimize the collective distance from all points to this
line.


Linear Regression: Code


from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Scaling features keeps coefficients comparable across features and is good
# practice, especially once regularization is added (plain least squares does
# not strictly require it).
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create and train the model
model = LinearRegression(
    fit_intercept=True  # Calculates the y-intercept. Set to False if data is pre-centered.
)
model.fit(X_train_scaled, y_train)
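
Once fitted, the coefficients can be inspected and the model evaluated on held-out data. A minimal, self-contained sketch, with make_regression data standing in for the slide's X_train / y_train:

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data standing in for the slide's X_train / y_train
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LinearRegression(fit_intercept=True)
model.fit(X_train_scaled, y_train)

# One weight per (scaled) feature, plus the intercept
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

# Evaluate on the held-out test set
y_pred = model.predict(X_test_scaled)
print("Test R^2:", r2_score(y_test, y_pred))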


Polynomial Regression: Theory


Concept: A powerful variation of linear regression that can model non-
linear, curved relationships. It works by creating new polynomial features
(e.g., x², x³) from the original features and then fitting a linear model to this
expanded feature set.

Pros: Can capture complex, non-linear patterns.

Cons: Prone to overfitting if the degree is too high. Choosing the right
degree can be tricky.

When to Use: When you visually inspect your data and see a clear curve or
non-linear trend.
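
To make the feature-creation step concrete, here is a minimal sketch (a single made-up sample, not from any real dataset) of what PolynomialFeatures generates; a plain LinearRegression is then fit on these expanded columns:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One made-up sample with two features, x1=2 and x2=3
X = np.array([[2.0, 3.0]])

poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

print(poly.get_feature_names_out())  # ['x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
print(X_poly)                        # [[2. 3. 4. 6. 9.]]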


Polynomial Regression: Visualization


This plot shows how a degree-2 polynomial model can fit a curved
relationship in the data much better than a straight line could.


Polynomial Regression: Code


from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# A pipeline is the best way to chain the feature creation and modeling steps.
model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(
        degree=2,           # The degree of the polynomial. Higher = more complex curve.
        include_bias=False  # Avoids a redundant bias term that LinearRegression handles.
    ),
    LinearRegression()
)
model.fit(X_train, y_train)
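
Because choosing the degree is the tricky part, one common approach is to compare a few degrees with cross-validation. A sketch under the assumption of synthetic, curved data standing in for the slide's X_train / y_train:

import numpy as np
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic curved data standing in for X_train / y_train
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(200, 1))
y_train = 0.5 * X_train[:, 0] ** 2 + X_train[:, 0] + rng.normal(scale=1.0, size=200)

for degree in [1, 2, 3, 5]:
    model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression()
    )
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
    print(f"degree={degree}: mean CV R^2 = {scores.mean():.3f}")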


Regularization: Theory
Concept: A technique to combat overfitting by adding a penalty to the loss
function based on the size of the model's coefficients. This discourages the
model from becoming too complex.

Ridge (L2): Shrinks all coefficients towards zero, but never to exactly
zero. Good for general-purpose shrinkage.
Lasso (L1): Can shrink coefficients all the way to zero, effectively acting
as a form of automatic feature selection.

When to Use: Whenever you have a model with many features or a
complex model (like polynomial regression) that might be overfitting.


Regularization: Visualization
These plots show how coefficients change as the regularization strength
(alpha) increases. Notice how Lasso (right) forces coefficients to become
exactly zero, while Ridge (left) only shrinks them.


Regularization: Code
from sklearn.linear_model import Ridge, Lasso

# Ridge (L2) Regression - good for reducing model complexity
ridge_model = Ridge(
    alpha=1.0  # Regularization strength. Higher alpha = simpler model.
)
ridge_model.fit(X_train_scaled, y_train)

# Lasso (L1) Regression - good for feature selection
lasso_model = Lasso(
    alpha=0.1  # Regularization strength. Higher alpha = more features set to zero.
)
lasso_model.fit(X_train_scaled, y_train)
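
To see the feature-selection effect in practice, and to avoid guessing alpha, the coefficients can be inspected directly and LassoCV can pick alpha by cross-validation. A minimal sketch with synthetic data in place of the slide's X_train_scaled / y_train:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso, LassoCV

# Synthetic data where only a few of the 20 features are actually informative
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

lasso = Lasso(alpha=1.0).fit(X_scaled, y)
print("Non-zero coefficients:", np.sum(lasso.coef_ != 0), "of", len(lasso.coef_))

# LassoCV searches a range of alphas with cross-validation
lasso_cv = LassoCV(cv=5, random_state=0).fit(X_scaled, y)
print("Best alpha found by CV:", lasso_cv.alpha_)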


Logistic Regression: Theory


Concept: The go-to algorithm for binary classification. It calculates the
probability of an instance belonging to a class by passing a linear equation
through the sigmoid function, which squashes the output to a value between
0 and 1.

Pros: Fast, highly interpretable, provides probabilities.

Cons: Assumes a linear decision boundary between classes.

When to Use: A first-choice algorithm for any binary classification task,
especially when interpretability is important.
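
The sigmoid step itself is a one-liner; the sketch below uses plain NumPy (not scikit-learn) just to show how a linear score z = w·x + b is squashed into a probability between 0 and 1:

import numpy as np

def sigmoid(z):
    # Maps any real-valued score to the (0, 1) interval
    return 1.0 / (1.0 + np.exp(-z))

# Example linear scores z for three instances
z = np.array([-3.0, 0.0, 2.5])
print(sigmoid(z))  # approx. [0.047 0.5 0.924]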


Logistic Regression: Visualization


The line represents the decision boundary learned by the model. Points on
one side are classified as class 0, and points on the other side are classified
as class 1.


Logistic Regression: Code


from sklearn.linear_model import LogisticRegression

# Create and train the model
model = LogisticRegression(
    penalty='l2',       # Specifies the regularization type ('l1', 'l2').
    C=1.0,              # Inverse of regularization strength. Smaller C = stronger penalty.
    solver='liblinear', # Optimization algorithm. Good choice for small datasets.
    multi_class='ovr'   # Strategy for multi-class problems: One-vs-Rest.
)
model.fit(X_train_scaled, y_train)
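
Because the model outputs probabilities, predict_proba is often as useful as predict. A self-contained usage sketch, with make_classification data standing in for the slide's dataset:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Synthetic binary data standing in for the slide's dataset
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(penalty='l2', C=1.0, solver='liblinear')
model.fit(X_train_scaled, y_train)

# Hard labels vs. class probabilities for the first few test instances
print(model.predict(X_test_scaled[:3]))        # 0/1 labels
print(model.predict_proba(X_test_scaled[:3]))  # one (P(class 0), P(class 1)) row per instance
print("Test accuracy:", model.score(X_test_scaled, y_test))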


Naive Bayes: Theory


Concept: A fast, probabilistic classifier based on Bayes' Theorem. Its core is
the "naive" assumption that all features are completely independent of one
another. While this is rarely true, the algorithm is surprisingly effective in
practice.

Pros: Extremely fast, performs very well with high-dimensional data
(many features).

Cons: The independence assumption is a strong one and often not true.

When to Use: A classic choice for text classification (e.g., spam filtering)
where the number of features (words) is very large.


Naive Bayes: Code


from sklearn.naive_bayes import GaussianNB

# This version (GaussianNB) is used when the features are continuous
# and assumed to follow a normal (Gaussian) distribution.
# Other versions include MultinomialNB (for word counts) and BernoulliNB (for binary features).
model = GaussianNB()
model.fit(X_train_scaled, y_train)
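
For the text-classification use case mentioned above, MultinomialNB paired with a bag-of-words vectorizer is the usual setup. A toy sketch with a made-up spam/ham corpus (the messages and labels are purely illustrative):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus: 1 = spam, 0 = not spam
texts = [
    "win a free prize now",
    "limited offer, claim your free reward",
    "meeting moved to 3pm tomorrow",
    "can you review my pull request",
]
labels = [1, 1, 0, 0]

# CountVectorizer turns each message into word counts; MultinomialNB models those counts
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(texts, labels)

print(spam_filter.predict(["free prize offer"]))        # likely [1]
print(spam_filter.predict(["tomorrow's meeting time"])) # likely [0]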


K-Nearest Neighbors (KNN): Theory


Concept: A simple, "lazy" algorithm that makes predictions by looking at
the 'K' closest data points in the training set. It classifies a new point based
on a majority vote of its neighbors. It doesn't "learn" a model; it just
memorizes the entire training dataset.

Pros: Very simple to understand, no training phase required.

Cons: Can be very slow at prediction time on large datasets, sensitive to
irrelevant features and the scale of the data.

When to Use: For simple problems or as a baseline. When the decision
boundary is highly irregular and you don't need lightning-fast predictions.


K-Nearest Neighbors (KNN): Visualization


These plots show how the decision boundary changes with K. A small K (left)
creates a complex, jagged boundary that can be prone to noise. A larger K
(right) creates a smoother, more generalized boundary.


K-Nearest Neighbors (KNN): Code


from sklearn.neighbors import KNeighborsClassifier

# Create and train the model
model = KNeighborsClassifier(
    n_neighbors=5,      # The number of neighbors to use (K). This is the key hyperparameter.
    weights='uniform',  # 'uniform' gives all neighbors equal weight. 'distance' weights closer neighbors more.
    metric='minkowski', # The distance metric. 'minkowski' with p=2 is the standard Euclidean distance.
    p=2
)
model.fit(X_train_scaled, y_train)
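
Since K is the key hyperparameter, a quick cross-validated sweep over a few values is a common way to pick it. A minimal sketch, with synthetic data standing in for the slide's training set:

from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data; scaling matters for KNN because it is distance-based
X, y = make_classification(n_samples=300, n_features=6, random_state=1)
X_scaled = StandardScaler().fit_transform(X)

for k in [1, 3, 5, 11, 21]:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, X_scaled, y, cv=5)
    print(f"K={k}: mean CV accuracy = {scores.mean():.3f}")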


Support Vector Machines (SVM): Theory


Concept: A powerful and versatile classifier that works by finding the
optimal hyperplane that best separates the classes. "Optimal" means the
one with the largest possible margin—the distance between the hyperplane
and the nearest points from each class (the "support vectors").

Pros: Very effective in high-dimensional spaces, memory efficient as it
only uses a subset of points (support vectors).

Cons: Can be slow to train on very large datasets, less interpretable
than other models.

When to Use: For complex classification problems where you need high
accuracy, even if the data is not linearly separable (thanks to the kernel
trick).

Support Vector Machines (SVM): Visualization


This plot shows the decision boundary (solid line), the margins (dashed
lines), and the circled support vectors that define the margin.


Support Vector Machines (SVM): Code


from sklearn.svm import SVC

# Create and train the model
model = SVC(
    kernel='rbf',  # Kernel type. 'rbf' is a powerful default for non-linear problems; 'linear' for linear data.
    C=1.0,         # Regularization parameter. Trades off a wide margin against classifying all points correctly.
    gamma='scale'  # Kernel coefficient for 'rbf'. 'scale' is a robust default setting.
)
model.fit(X_train_scaled, y_train)
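
After fitting, the support vectors that define the margin can be read straight off the model. A self-contained sketch with synthetic stand-in data:

from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary data standing in for the slide's dataset
X, y = make_classification(n_samples=300, n_features=4, random_state=2)
X_scaled = StandardScaler().fit_transform(X)

model = SVC(kernel='rbf', C=1.0, gamma='scale')
model.fit(X_scaled, y)

# Only these points determine the decision boundary
print("Support vectors per class:", model.n_support_)
print("Total support vectors:", model.support_vectors_.shape[0])

# Signed distance from the boundary for the first few points
print(model.decision_function(X_scaled[:3]))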


Decision Tree: Theory


Concept: A highly interpretable model that creates a flowchart of if-then-
else rules based on the data's features. It recursively splits the data into
subsets that are as "pure" (homogeneous) as possible.

Pros: Very easy to understand and visualize, requires no feature scaling.

Cons: Individual trees are prone to overfitting and can be unstable
(small changes in data can lead to a completely different tree).

When to Use: When model interpretability is a top priority. Also serves as
the building block for more powerful ensemble models like Random Forests.


Decision Tree: Visualization


This image shows the flowchart-like structure of a trained decision tree. You
can follow the path from the root node down to a leaf to get a prediction.


Decision Tree: Code


from sklearn.tree import DecisionTreeClassifier

# Create and train the model. Note: Trees do not require feature scaling.
model = DecisionTreeClassifier(
    criterion='gini',    # The function to measure the quality of a split ('gini' or 'entropy').
    max_depth=3,         # The maximum depth of the tree. Setting this is the primary way to prevent overfitting.
    min_samples_leaf=1   # The minimum number of samples required to be at a leaf node.
)
model.fit(X_train, y_train)
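
The learned if-then-else rules can also be printed as text with export_text (or drawn with plot_tree). A small sketch on the built-in Iris dataset, used here only so the example runs end-to-end:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris is used here only to make the example self-contained
iris = load_iris()
model = DecisionTreeClassifier(criterion='gini', max_depth=3, min_samples_leaf=1)
model.fit(iris.data, iris.target)

# Text rendering of the learned if-then-else rules
print(export_text(model, feature_names=list(iris.feature_names)))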


Hyperparameter Tuning: Theory


Concept: The process of finding the optimal settings for a model's
parameters that are not learned from the data (e.g., K in KNN, C in SVM).
This is done by systematically searching through a "grid" of possible
parameter values and evaluating each combination using cross-validation.

Why: Default parameters are rarely optimal. Tuning is crucial for
maximizing model performance.

How: GridSearchCV automates this search, making it a standard and
essential step in the modeling pipeline.


Hyperparameter Tuning: Visualization


This heatmap shows the cross-validated accuracy for different combinations
of an SVM's C and gamma parameters. This allows you to visually identify
the region of best performance.


Hyperparameter Tuning: Code


from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# 1. Define the grid of parameters you want to search
param_grid = {
    'C': [0.1, 1, 10],        # Test these regularization values
    'gamma': [1, 0.1, 0.01],  # Test these kernel coefficient values
    'kernel': ['rbf']
}

# 2. Create the GridSearchCV object
grid_search = GridSearchCV(
    estimator=SVC(),        # The model you want to tune
    param_grid=param_grid,  # The parameter grid to search
    cv=5,                   # Number of folds for cross-validation
    scoring='accuracy',     # The metric to optimize
    verbose=1               # Set to 1 or higher to see progress updates
)

# 3. Fit it to the data. This will start the search.
grid_search.fit(X_train_scaled, y_train)
print(f"Best Parameters Found: {grid_search.best_params_}")
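
What usually matters after the search is the refitted best model and how it generalizes. A self-contained sketch (synthetic data in place of the slide's dataset) that reports best_score_ and evaluates best_estimator_ on a held-out test set:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data so the example runs end-to-end
X, y = make_classification(n_samples=300, n_features=6, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

param_grid = {'C': [0.1, 1, 10], 'gamma': [1, 0.1, 0.01], 'kernel': ['rbf']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train_scaled, y_train)

# best_estimator_ is already refit on the full training set with the best parameters
print("Best CV accuracy:", grid_search.best_score_)
print("Test accuracy:", grid_search.best_estimator_.score(X_test_scaled, y_test))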