
ARTIFICIAL INTELLIGENCE (AI) TRAINING COURSE
ADVANCED LEVEL
Module 4
Machine Learning
Supervised Learning (Regression)

INSTITUT TADBIRAN AWAM NEGARA (INTAN)
Topics
• Regression
• Regression Application
• Regression Algorithms
• Hands-on Regression Activity
Learning Outcomes
• By the end of this session, participants will be able to:
• Explain the concept of regression correctly
• Explain the algorithms used for regression
• Apply regression algorithms using Python.
Regression
• Regression is a type of supervised learning algorithm used to predict a continuous outcome variable based on one or more predictor variables.
• The primary objective is to establish a mathematical relationship between the input features and the output variable, enabling the model to make accurate predictions on new, unseen data.

Source: https://www.javatpoint.com/regression-analysis-in-machine-learning

Types of Regression
Regression Application

• Healthcare: predictive disease diagnosis, healthcare resource utilization
• Finance: credit scoring, fraud detection, stock price forecasting
• Retail: demand forecasting, inventory optimization, pricing optimization
• Manufacturing: predictive maintenance, quality control, supply chain optimization
Regression Example
• Stock market prediction

Source: Zhu, T., Liao, Y., & Tao, Z. (2022). Predicting Google's Stock Price with LSTM Model. Proceedings of Business and Economic Studies, 5, 82–87. doi:10.26689/pbes.v5i5.4361
Methods/Algorithms for Regression
• Example algorithms for regression are:
• Linear regression
• Polynomial regression
• Support vector regression
• Decision tree
• Random forest
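
Each of the algorithms listed above is available in scikit-learn, the library used in the hands-on activity later in this module, behind the same fit/predict interface. The sketch below only illustrates how the names map to standard scikit-learn estimators; the hyperparameter values shown are assumptions, not recommendations from this module.

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Illustrative mapping from the algorithms above to scikit-learn estimators
models = {
    'Linear regression': LinearRegression(),
    'Polynomial regression': make_pipeline(PolynomialFeatures(degree=2),
                                           LinearRegression()),
    'Support vector regression': SVR(kernel='rbf'),  # kernel choice is assumed
    'Decision tree': DecisionTreeRegressor(random_state=42),
    'Random forest': RandomForestRegressor(n_estimators=100, random_state=42),
}

# Every estimator is used the same way:
#   model.fit(X_train, y_train)
#   y_pred = model.predict(X_test)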
Linear regression
• A fundamental method for modeling the relationship between a
dependent variable and one or more independent variables.
• The primary objective is to find the best-fitting line through the data points.
• Linear regression makes several key assumptions:
• Linearity: The relationship between the independent and dependent
variables is linear.
• Independence: The observations are independent of each other.
• Homoscedasticity: The variance of the error terms is constant across all
levels of the independent variables.
• Normality: The error terms are normally distributed (especially important
for hypothesis testing).
Example
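
The original slide presents the example as a figure. As a minimal runnable sketch (the data here is synthetic and assumed purely for illustration), fitting a line with scikit-learn and reading off its slope and intercept looks like this:

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption for illustration): y ≈ 3x + 5 plus noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 5 + rng.normal(0, 1, size=100)

model = LinearRegression()
model.fit(X, y)

print(f'Slope: {model.coef_[0]:.2f}')        # recovered slope, close to 3
print(f'Intercept: {model.intercept_:.2f}')  # recovered intercept, close to 5
print(f'Prediction at x = 7: {model.predict([[7]])[0]:.2f}')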
Polynomial Regression
• Regression analysis in which the relationship between the
independent variable and the dependent variable is modeled as
an nth degree polynomial.
• Polynomial regression fits a curve and can therefore represent non-linear relationships between the variables.
• By increasing the degree of the polynomial, the model can fit more complex data patterns.
• However, higher-degree polynomials can lead to overfitting.
Linear vs Polynomial Regression
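
The original slide makes this comparison visually. A minimal sketch of the same idea (synthetic quadratic data, assumed for illustration) shows a degree-2 polynomial capturing curvature that the straight line misses:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score

# Synthetic non-linear data (assumption): y is quadratic in x plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + 2 + rng.normal(0, 0.3, size=200)

linear = LinearRegression().fit(X, y)
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

print(f'Linear R^2:     {r2_score(y, linear.predict(X)):.2f}')   # underfits the curve
print(f'Polynomial R^2: {r2_score(y, poly.predict(X)):.2f}')     # fits much better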
Random Forest Regression
• Constructs multiple decision trees during training and outputs the average of the individual trees' predictions for regression.
• Random Forest Regression builds a "forest" of decision trees. Each tree in
the forest is built using a different bootstrap sample from the training
data.
• At each node, a subset of features is randomly selected, and the best split
is chosen from this subset.
• The final prediction is the average of the predictions from all individual
trees.
Sample
Steps for Random Forest Regression
• Bootstrap Sampling: Randomly select a subset of the data (with replacement) to create multiple bootstrap samples. Each bootstrap sample is used to train a different decision tree.
• Building Trees: For each tree, at each node, a random subset of features is selected. The best split is determined based on the chosen subset of features. The tree is grown to its maximum depth or until a stopping criterion is met (e.g., a minimum number of samples per leaf).
• Prediction: For regression, the final prediction for a new data point is obtained by averaging the predictions of all individual trees in the forest.
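
In scikit-learn these steps are carried out internally by RandomForestRegressor. The following is a minimal sketch on synthetic data; the dataset and hyperparameter values are assumptions for illustration only.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic data (assumption for illustration)
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(500, 3))
y = 2 * X[:, 0] + 5 * np.sin(X[:, 1]) + rng.normal(0, 0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# n_estimators: number of trees built on bootstrap samples
# max_features: size of the random feature subset considered at each split
# min_samples_leaf: stopping criterion for growing each tree
model = RandomForestRegressor(n_estimators=100, max_features='sqrt',
                              min_samples_leaf=2, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)   # average of the individual trees' predictions
print(f'Test MSE: {mean_squared_error(y_test, y_pred):.2f}')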
Hands-on Regression
California Housing – Linear Regression
STEP 1: Upload the dataset into Colab folder
STEP 2: Install and import necessary libraries
!pip install scikit-learn  # scikit-learn is pre-installed in Colab; run only if needed
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

STEP 3: Load and display data sample
data = pd.read_csv('housing.csv')
print(data.head())
STEP 4: Preprocess the dataset
print(data.isnull().sum()) # Check for missing values

data.dropna(inplace=True) # Drop rows with missing values

# Split the dataset into features (X) and target (y)
X = data.drop('median_house_value', axis=1)
y = data['median_house_value']

# Note: if the CSV contains a non-numeric column (e.g. ocean_proximity in the
# common Kaggle version of this dataset), drop or encode it before scaling.

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
STEP 5: Implement Linear regression
model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f'Mean Squared Error: {mse:.2f}')
print(f'R^2 Score: {r2:.2f}')
STEP 6: Display output
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, edgecolor='k', alpha=0.7)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], 'r--', lw=3)
plt.xlabel('True Values')
plt.ylabel('Predicted Values')
plt.title('True vs. Predicted Values')
plt.show()
STEP 7: Test regression
sample_input = pd.DataFrame({
'longitude': [-122.23],
'latitude': [37.88],
'housing_median_age': [41],
'total_rooms': [6.9841],
'total_bedrooms': [1.0238],
'population': [322],
'households': [2.5556],
'median_income': [8.3252],
})

sample_input_standardized = scaler.transform(sample_input)
sample_prediction = model.predict(sample_input_standardized)
print(f'The predicted house value for the sample input is: '
      f'${sample_prediction[0]:.2f}')
