UNIVERSITI TEKNOLOGI MARA
ASSIGNMENT
COURSE : NUMERICAL METHODS
COURSE CODE : MEK551
DEADLINE : WEEK 14
MODE : ASSIGNMENT
GROUP ASSIGNMENT GUIDELINES
1. Answer ALL questions in ENGLISH.
2. Use the provided format for the cover page and answer template.
3. It is COMPULSORY to use computational tools (MATLAB/OCTAVE) to complete this assignment.
4. Answers WITHOUT a computational approach or simulation tool will result in ZERO marks.
5. Each student needs to send in a softcopy of the assignment through MS TEAMS at your
respective class setting not later than 19 JANUARY 2025 (11:59 PM) with the file name:
a) ClassID_GroupNo.pdf (Report)
b) ClassID_GroupNo.m (m.file)
*only insert the information in blue.
6. This assignment is designed for Course Outcome 1 (CO1), 2 (CO2) and 3 (CO3) of the course:
CO1: Describe numerical techniques and their limitations in solving engineering problems
[PO1] {C2}
PO1: Able to apply knowledge of mathematics, science, engineering fundamentals and an
engineering specialisation to defined and applied engineering procedures, processes, systems or
methodologies.
CO2: Evaluate the numerical techniques in solving engineering problems [PO3] {C5}.
PO3: Able to design solutions for broadly-defined engineering technology problems and
contribute to the design of systems, components or processes to meet specified needs with
appropriate consideration for public health and safety, cultural, societal, and environmental
considerations.
CO3: Construct a computational approach to solve mathematical problems [PO5] {P4}.
PO5: Able to select and apply appropriate techniques, resources, and modern engineering and IT
tools, including prediction and modelling, to broadly-defined engineering activities, with an
understanding of the limitations.
GROUP NO : 4
DUE DATE : 19th JANUARY 2025
SUBMISSION DATE : 25/01/2025
REMARKS :
LECTURER NAME : DR SITI MARIAM BINTI ABD RAHMAN
No   Name                           Student ID
1    MUHAMMAD DANIAL BIN EXQUANDE   2023503169
2    MUHAMMAD AMMAR BIN AMRAN       2023126741
NO   CO / PO       Assessment Criteria (Report)           Allocated Marks   Marks
1.   CO1/PO1/C2    Defining problems                      2.0
2.   CO1/PO1/C2    Defining method and criteria           3.0
3.   CO3/PO5/P4    Curve Fitting Onramp                   2.0
4.   CO3/PO5/P4    Display computational approach         4.0
5.   CO3/PO5/P4    Illustrate and report the findings     2.0
6.   CO3/PO5/P4    Execution of computational approach    2.0
7.   CO2/PO3/C5    Discussion and conclusions             5.0
TOTAL                                                     20%
Comment:
Introduction
Regression is used to predict numerical values given a certain input. Linear regression attempts to model
the line of best fit that explains the relationship between two or more variables by fitting a linear equation
to observed data. Polynomial regression, on the other hand, is a form of regression analysis in which the
relationship between the independent variable and the dependent variable is modelled as an nth-degree
polynomial. In regression analysis, one variable is set as the independent (or explanatory) variable, and
the other is the dependent (or response) variable.
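For reference, the two models described above can be written as follows. The symbols $a_0, \dots, a_n$ (coefficients), $e$ (residual), $N$ (number of data points) and $S_r$ (sum of squared residuals) are notation assumed here for illustration; they do not come from the assignment brief.

```latex
\begin{align}
\text{Linear model:} \quad & y = a_0 + a_1 x + e \\
\text{Polynomial model:} \quad & y = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + e \\
\text{Least-squares criterion:} \quad & S_r = \sum_{i=1}^{N} e_i^2
  = \sum_{i=1}^{N} \Bigl( y_i - \sum_{k=0}^{n} a_k x_i^k \Bigr)^2
\end{align}
```

In both cases the coefficients are chosen to minimise $S_r$; the linear model is the special case $n = 1$.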
Before fitting a linear or polynomial model to observed data, first, one should determine whether there is
a relationship between the variables of interest. This does not necessarily imply that one variable causes
the other because correlation does not necessarily imply causation. In other words, the correlation
between two events or variables indicates a relationship, whereas causation is more specific and says that
one event causes the other.
For example, men who wear larger shoes are taller than those who wear smaller shoes. From the data, we
may see that as shoe size increases, so does height. Using the data, we want to do more than say that shoe
size and height are related: we want to have a way to predict (guess) a man’s height from his shoe size.
We use linear regression to make a straight line closest to the most data to achieve that.
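The shoe-size example can be sketched in MATLAB/Octave as shown here. The shoe sizes and heights are invented purely for illustration and are not part of the assigned dataset; `polyfit` with degree 1 returns the slope and intercept of the least-squares line, and `polyval` evaluates it.

```matlab
% Hypothetical data (invented for illustration only)
shoeSize = [7 8 9 10 11 12];            % shoe size
height   = [165 170 173 178 181 186];   % height in cm

% Fit a straight line: p(1) is the slope, p(2) is the intercept
p = polyfit(shoeSize, height, 1);

% Use the fitted line to estimate the height for a shoe size of 9.5
predicted = polyval(p, 9.5);

fprintf('Fitted line: height = %.2f*size + %.2f\n', p(1), p(2));
fprintf('Estimated height for size 9.5: %.1f cm\n', predicted);
```

The estimate is only a best guess from the trend in the sample; it says nothing about whether shoe size causes height.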
Assignment 2
Students are assigned to evaluate the relationship between independent and dependent measures from the
data library suggested by Telus Digital. The dataset will be assigned to students based on the assigned
group number.
In this assignment, you are expected to do the following:
1. Clearly state the objective assigned to you and provide relevant information or research related to
the topic.
1. To analyse the relationship between years of experience and salary.
2. To study the relationship between the age of workers and salary.
Underlying Issues
1. Complexity of Variables (Non-Linear Growth): In some industries, salary increments may
slow down after a certain level of experience, while others may show sharp jumps at
key milestones (e.g., promotion to management).
2. Accuracy of Data: Real-world datasets may include noise or outliers, which can affect the
accuracy of regression models and subsequent predictions.
3. Correlation vs. Causation: A statistical analysis might show that employees with more
years of experience generally earn higher salaries. This suggests a correlation, but it does
not necessarily mean that experience causes the salary increase.
Research
1. Dickens, W.T.; Lang, K. A test of dual labour market theory. Am. Econ. Rev. 1985, 67,
792–805.
Income levels in the informal sector are considerably lower compared to those in the formal
employment sectors, revealing that informal work plays a crucial role in ensuring the
subsistence of impoverished individuals.
2. Brockhaus, R.H. The Psychology of the Entrepreneur; UIUC Academy for
Entrepreneurial Leadership Historical Research Reference: Champaign, IL, USA,
1982.
Scholars have noted a strong path dependence among flexible workers, where individuals’
past employment experiences significantly shape their current and future choices,
irrespective of the advantages or disadvantages. As a result, those presently engaged in
flexible employment are more inclined to continue pursuing such opportunities due to the
lasting impact and inertia stemming from their prior experiences.
3. Jin, Y.H. The formation and development of informal labour market. Acad. Bimest.
2000, 4, 91–97.
The prevailing perspective suggests that the entire informal sector constitutes a low-end
and secondary labor market, which is distinct from the formal sector, where flexible workers
are often characterized as possessing low-level skills and low human capital.
[CO1: 2 MARKS]
States the problems and objectives clearly and identifies underlying issues.
Does not identify the problems and objectives.
*Provide a brief background of the situation you will look at.
*Cite at least three previous research on the same topic.
2. Identify METHODS that can be used to solve the problem. Briefly describe the theory involved.
The goal is to develop a regression model that predicts salary (dependent variable) from two
independent variables: years of experience and age. The following issues must be considered
when building the model:
1. Multicollinearity Between Age and Experience: Age and years of experience are highly
correlated (older individuals generally have more experience). This can cause
multicollinearity in statistical models, making it difficult to determine which factor truly
impacts salary.
2. Discrimination and Bias in Salary Growth: Age discrimination (older workers may face
stagnation in salary) and experience bias (some industries prioritize skills over tenure) can
distort the expected relationship.
3. Non-Linear Salary Growth & Career Plateaus: Salary increases may not follow a linear
pattern; growth can be rapid in early careers but stagnate after a certain point.
Methods and Theory Involved:
1. Linear Regression:
○ Theory: Linear regression fits a straight line to data that minimizes the sum of
squared residuals. It assumes a linear relationship between dependent and
independent variables.
○ Implementation: MATLAB's polyfit function will calculate the coefficients (slope and
intercept) of the line fitted to each independent variable.
○ Application: Evaluate if the relationships between Salary and each
independent variable are linear.
2. Polynomial Regression (if needed):
○ Theory: When the relationship is not linear, polynomial regression can capture
curvature in the data by introducing higher-order terms.
○ Application: Examine residuals from linear regression; if the pattern shows
non-linearity, use polynomial regression to refine the fit.
3. Correlation Analysis:
○ Theory: Correlation coefficients (r) measure the strength and direction of
relationships between variables. This preliminary step evaluates if there are
meaningful associations.
○ Application: Use MATLAB’s corrcoef function to calculate correlations and justify
variable inclusion.
4. Alternative Strategy - Multiple Linear Regression:
○ Theory: This method includes all independent variables in one model to evaluate
their combined effect on Salary. It adjusts for interdependence between
variables (e.g., Age and Years of Experience may be correlated).
○ Application: MATLAB’s fitlm function will generate a multivariable regression
model.
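The first three methods can be sketched together as shown here. The data vectors are invented for illustration and are not taken from PhD_Data.xlsx; the variable names are likewise hypothetical. The sketch computes the correlation coefficient, fits a straight line, and then compares its residuals with those of a quadratic fit to judge whether a polynomial model is worth considering.

```matlab
% Hypothetical data (invented for illustration; replace with the real columns)
experience = [1 3 5 7 9 11 13 15]';              % years of experience
salary     = [40 48 55 60 66 69 71 72]' * 1e3;   % salary in currency units

% Correlation analysis: strength and direction of the linear association
C = corrcoef(experience, salary);
r = C(1, 2);

% Linear regression: straight-line fit and its residuals
pLin   = polyfit(experience, salary, 1);
resLin = salary - polyval(pLin, experience);

% Polynomial regression (degree 2): fit and residuals for comparison
pQuad   = polyfit(experience, salary, 2);
resQuad = salary - polyval(pQuad, experience);

% Sum of squared residuals for each model
sseLin  = sum(resLin.^2);
sseQuad = sum(resQuad.^2);

fprintf('Correlation r = %.4f\n', r);
fprintf('SSE (linear)    = %.2f\n', sseLin);
fprintf('SSE (quadratic) = %.2f\n', sseQuad);
```

A quadratic fit can never have a larger sum of squared residuals than the linear fit on the same data, so a smaller value alone does not justify the extra term; a visible curved pattern in a plot of `resLin` against `experience` is the stronger signal.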
[CO1: 3 MARKS]
Develops a clear and concise plan to solve the problem, offers alternative strategies.
Does not develop a coherent plan to solve the problem.
*Indicate the general model that you are going to estimate. Discuss what you think the relationship is between the
dependent variable and the independent variables and what that leads you to conclude about the expected signs of the
coefficients in the model.
3. Complete the Curve Fitting Onramp exercise.
https://matlabacademy.mathworks.com/progress/share/certificate.html?id=c11c3cba-e779-4798-b72d-a1dfd9082ee9&&classId=7f1f943c-dbab-4cad-b3fb-ecee9c8d1a25&assignmentId=92f9d361-5889-4c1c-8380-36762eca6108&submissionId=57dac5ef-461d-b0b2-e37c-61848588cf49
https://matlabacademy.mathworks.com/progress/share/certificate.html?id=aef81b39-6f7b-4ce7-869c-805e7008a4e6&
[CO3: 2 MARKS]
Complete the exercise.
Did not complete the exercise.
*Include the link to the exercise completion certificate.
4. Show the flowchart of your Matlab code and its corresponding code. Make sure all main steps
are well-defined and explained.
clear       % Remove variables from the workspace
close all   % Close any open figures
clc         % Clear the command window

% Ensure the file exists in the working directory
fileName = "PhD_Data.xlsx";  % File name
sheetName = "Sheet1";        % Sheet name

% Check if the file exists
if ~isfile(fileName)
    error('The file "%s" does not exist in the current folder.', fileName);
end

% Read data from the Excel file
A = xlsread(fileName, sheetName);

% Ensure the dataset has enough columns for analysis
% (columns 1 to 3 are used below: Experience, Salary, Age)
if size(A, 2) < 3
    error('The sheet "%s" must contain at least 3 columns.', sheetName);
end
% Dataset 1: Years of Experience vs Salary
x1 = A(:, 1);                % Independent Variable: Years of Experience
y1 = A(:, 2);                % Dependent Variable: Salary
pI1 = polyfit(x1, y1, 1);    % Linear fit coefficients [slope, intercept]
xsort1 = sort(x1);           % Sort for smooth plotting
fI1 = polyval(pI1, xsort1);  % Evaluate the linear fit

% Create figure
figure('Position', [100 100 1200 600]);

% First subplot for Years of Experience vs Salary
subplot(1, 2, 1);
plot(xsort1, fI1, '-', 'DisplayName', ...
    sprintf('Linear Fit 1: y = %.2fx + %.2f', pI1(1), pI1(2)));
hold on
scatter(x1, y1, 'o', 'DisplayName', 'Data Points: Experience');
% Dataset 2: Age vs Salary
x2 = A(:, 3);                % Independent Variable: Age
y2 = A(:, 2);                % Dependent Variable: Salary (same as y1)
pI2 = polyfit(x2, y2, 1);    % Linear fit coefficients [slope, intercept]
xsort2 = sort(x2);           % Sort for smooth plotting
fI2 = polyval(pI2, xsort2);  % Evaluate the linear fit

% Second subplot for Age vs Salary
subplot(1, 2, 2);
plot(xsort2, fI2, '-', 'DisplayName', ...
    sprintf('Linear Fit 2: y = %.2fx + %.2f', pI2(1), pI2(2)));
hold on
scatter(x2, y2, 's', 'DisplayName', 'Data Points: Age');
% Calculate correlations
corr1 = corrcoef(x1, y1);  % Correlation matrix for Experience vs Salary
corr2 = corrcoef(x2, y2);  % Correlation matrix for Age vs Salary

% Extract correlation coefficients (off-diagonal element)
r1 = corr1(1, 2);
r2 = corr2(1, 2);

% Configure first subplot
subplot(1, 2, 1);
xlabel('Years of Experience');
ylabel('Salary');
title('Experience vs Salary (PhD)');
text(mean(x1), max(fI1), sprintf('Correlation: %.2f', r1), 'FontSize', 10);
legend('Location', 'best');
grid on;

% Configure second subplot
subplot(1, 2, 2);
xlabel('Age');
ylabel('Salary');
title('Age vs Salary (PhD)');
text(mean(x2), max(fI2), sprintf('Correlation: %.2f', r2), 'FontSize', 10);
legend('Location', 'best');
grid on;
% Calculate R-squared values
% St is the total sum of squares; Sr is the sum of squares explained by the fit

% For Experience vs Salary
y1_fit = polyval(pI1, x1);
y1_mean = mean(y1);
St1 = sum((y1 - y1_mean).^2);
Sr1 = sum((y1_fit - y1_mean).^2);
r2_1 = Sr1 / St1;

% For Age vs Salary
y2_fit = polyval(pI2, x2);
y2_mean = mean(y2);
St2 = sum((y2 - y2_mean).^2);
Sr2 = sum((y2_fit - y2_mean).^2);
r2_2 = Sr2 / St2;

% Display statistical results
fprintf('Analysis Results:\n');
fprintf('\nExperience vs Salary:\n');
fprintf('Correlation: %.4f\n', r1);
fprintf('R-squared: %.4f\n', r2_1);
fprintf('Linear equation: y = %.2fx + %.2f\n', pI1(1), pI1(2));
fprintf('\nAge vs Salary:\n');
fprintf('Correlation: %.4f\n', r2);
fprintf('R-squared: %.4f\n', r2_2);
fprintf('Linear equation: y = %.2fx + %.2f\n', pI2(1), pI2(2));
[CO3: 4 MARKS]
Report the results clearly and identify underlying issues.
*Make sure to label your plot properly.
*Calculate the correlation between each of the independent variables and the dependent variable.
6. Runtime of .m file (to be assessed by the lecturer)
[CO3: 2 MARKS]
Executes without errors, excellent user prompts, good use of symbols, and spacing in output.
Does not execute due to syntax errors/runtime errors (endless loop, crashes etc.). User prompts
are misleading or non-existent.
7. Discuss and summarise the results of your analysis.
1. Summary of Analysis
The analysis was conducted to understand the relationship between PhD salary (dependent
variable) and the following independent variables:
● Years of experience
● Age
Using linear regression and correlation analysis, the relationships were evaluated. Multiple linear
regression is the proposed extension for estimating the combined effect of these variables on salary.
Each independent variable showed the expected relationship:
● Years of experience: positively correlated with salary
● Age: positively correlated with salary
2. Estimated Model
1. Experience vs. Salary (Simple Linear Regression)
Using the regression equation from the code:
$\text{Salary} = pI1(1) \times \text{Experience} + pI1(2)$
Where:
• pI1(1) is the slope (change in salary per additional year of experience).
• pI1(2) is the y-intercept.
2. Age vs. Salary (Simple Linear Regression)
Similarly, the equation for Age vs. Salary is:
$\text{Salary} = pI2(1) \times \text{Age} + pI2(2)$
Where:
• pI2(1) is the slope (change in salary per additional year of age).
• pI2(2) is the y-intercept.
3. Multiple Regression (Experience & Age vs. Salary)
If the analysis is extended to multiple regression, the estimated model is:
$\text{Salary} = b_0 + b_1 \times \text{Experience} + b_2 \times \text{Age}$
Where:
• $b_0$ is the intercept.
• $b_1$ is the coefficient for experience.
• $b_2$ is the coefficient for age.
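A minimal sketch of how the multiple regression coefficients could be estimated is given here, assuming invented data in place of the real dataset. It uses the backslash operator, which solves the least-squares problem in base MATLAB and Octave; `fitlm` would give the same coefficients but belongs to the Statistics and Machine Learning Toolbox. All variable names and values are hypothetical.

```matlab
% Hypothetical data (invented for illustration only)
experience = [2 4 6 8 10 12]';            % years of experience
age        = [26 27 31 32 36 37]';        % age in years
salary     = [42 50 57 61 70 74]' * 1e3;  % salary in currency units

% Design matrix: a column of ones for the intercept, then one column per predictor
X = [ones(size(experience)) experience age];

% Least-squares solution: b(1) = b_0, b(2) = b_1, b(3) = b_2
b = X \ salary;

% Fitted salaries from the estimated model
salaryFit = X * b;

fprintf('Salary = %.2f + %.2f*Experience + %.2f*Age\n', b(1), b(2), b(3));
```

When the two predictors are strongly correlated, the individual values of `b(2)` and `b(3)` can change noticeably with small changes in the data, so they should be interpreted with care.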
3. Interpretation of the Coefficient of Determination ($R^2$)
$R^2$ (R-squared) measures how well the independent variable(s) explain the variability in
salary.
1. Experience vs. Salary ($R^2_1$)
• If $R^2_1$ is close to 1, it means years of experience strongly predict salary.
• If $R^2_1$ is close to 0, experience has little influence on salary.
2. Age vs. Salary ($R^2_2$)
• If $R^2_2$ is high, age is a strong predictor of salary.
• If it is low, age does not significantly determine salary.
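The calculation behind the coefficient of determination can be illustrated with a small sketch on invented data. The names `SStot`, `SSres` and `SSreg` are chosen here for illustration. For a least-squares straight line with an intercept, the three quantities computed below (one minus the residual ratio, the explained ratio, and the squared correlation coefficient) should agree.

```matlab
% Hypothetical data (invented for illustration only)
x = [1 2 3 4 5 6 7 8]';
y = [3.1 4.9 7.2 8.8 11.1 12.9 15.2 16.8]';

% Least-squares straight line and fitted values
p    = polyfit(x, y, 1);
yFit = polyval(p, x);

SStot = sum((y - mean(y)).^2);      % total variation in y
SSres = sum((y - yFit).^2);         % variation left unexplained by the line
SSreg = sum((yFit - mean(y)).^2);   % variation explained by the line

R2_fromResiduals = 1 - SSres / SStot;
R2_fromExplained = SSreg / SStot;

% Squared correlation coefficient for comparison
C = corrcoef(x, y);
r = C(1, 2);

fprintf('R^2 (1 - SSres/SStot) = %.6f\n', R2_fromResiduals);
fprintf('R^2 (SSreg/SStot)     = %.6f\n', R2_fromExplained);
fprintf('r^2                   = %.6f\n', r^2);
```

The agreement between the two ratios holds only for least-squares fits that include an intercept; for other models the form based on residuals is the safer definition.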
4. Prediction Using the Model
Predict Salary Using Simple Linear Models
For Experience vs. Salary:
$\hat{y} = pI1(1) \times \text{Experience} + pI1(2)$
For Age vs. Salary:
$\hat{y} = pI2(1) \times \text{Age} + pI2(2)$
Predict Salary Using Multiple Regression Model
If both experience and age are used together:
$\hat{y} = b_0 + b_1 \times \text{Experience} + b_2 \times \text{Age}$
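As a sketch of how these prediction formulas would be applied, the example here evaluates them for one combination of inputs. The coefficient values are placeholders chosen for illustration, not results obtained from PhD_Data.xlsx; in practice `pI1` would come from `polyfit` and `b` from the multiple regression fit.

```matlab
% Placeholder coefficients (hypothetical, NOT results from the dataset)
pI1 = [2500 40000];         % [slope intercept] for Experience vs Salary
b   = [30000; 2000; 300];   % [b_0; b_1; b_2] for the multiple regression model

% Combination of independent variables to predict for (hypothetical)
experienceNew = 10;   % years of experience
ageNew        = 35;   % age in years

% Simple linear model: pI1(1)*Experience + pI1(2)
salarySimple = polyval(pI1, experienceNew);

% Multiple regression model: b_0 + b_1*Experience + b_2*Age
salaryMultiple = [1 experienceNew ageNew] * b;

fprintf('Simple model prediction:   %.2f\n', salarySimple);
fprintf('Multiple model prediction: %.2f\n', salaryMultiple);
```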
5. Can a Prediction Be Made? Will the Prediction Be Valid?
Prediction validity depends on several factors:
1. Linearity Assumption
• If the relationship between salary and experience/age is not truly linear, the
prediction may not be valid.
• Checking scatter plots and residuals can help confirm linearity.
2. R-squared ($R^2$) Value
• If $R^2$ is high (close to 1), the model explains a large portion of salary
variation, making predictions more reliable.
• If $R^2$ is low, the model does not capture salary trends well, making
predictions less valid.
3. Extrapolation Risk
• Predictions should only be made within the range of the training data.
• Example: If your data has experience values between 5 and 20 years,
predicting for 50 years would be unreliable.
4. Multicollinearity in Multiple Regression
• If Experience and Age are highly correlated, it could affect the stability of
the model.
• Checking the variance inflation factor (VIF) can help detect multicollinearity.
5. External Factors Missing in the Model
• Salary is influenced by more than just experience and age (e.g., industry,
location, education, performance).
• If important variables are missing, the model will have limited predictive
power.
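Two of the checks above, extrapolation risk and multicollinearity, can be sketched in a few lines. The data are invented for illustration. With only two predictors, the variance inflation factor reduces to 1/(1 - r^2), where r is the correlation between the predictors; a common rule of thumb treats values above roughly 5 to 10 as a sign of concern, although the threshold is a convention rather than a strict limit.

```matlab
% Hypothetical data (invented for illustration only)
experience = [5 8 10 12 15 18 20]';    % years of experience
age        = [28 31 34 35 39 43 44]';  % age in years

% Extrapolation check: is the new input inside the range of the data?
experienceNew = 50;
inRange = experienceNew >= min(experience) && experienceNew <= max(experience);
if inRange
    fprintf('Experience = %g lies within the data range.\n', experienceNew);
else
    fprintf('Experience = %g is outside the data range [%g, %g]; prediction is unreliable.\n', ...
        experienceNew, min(experience), max(experience));
end

% Multicollinearity check: VIF for the two-predictor case
C   = corrcoef(experience, age);
r12 = C(1, 2);               % correlation between the two predictors
VIF = 1 / (1 - r12^2);

fprintf('Correlation between Experience and Age: %.4f\n', r12);
fprintf('VIF: %.2f\n', VIF);
```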
[CO2: 5 MARKS]
Report the results clearly and identify underlying issues.
Does not identify the solution to the problem.
*Summarize the results of your analysis.
*Present the estimated model. Give an interpretation of the coefficient of determination (r2).
*Use your model to predict any combination of values for your independent variables.
*Can a prediction be made? Will the prediction be valid?