This research tests whether the variables last_oil_price, mid_USD_rate, and inflation influence oil_export volume in Indonesia. The test uses regression calculations on sample data from 2021 to 2022, downloaded from the BPS, BI, and Investing websites.
1. Preparation of Data
Import all of the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import seaborn as sns
from math import log
from scipy import stats
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera
%matplotlib inline
#Read in the export_oil csv file as a DataFrame called oil_indonesia
oil_indonesia = pd.read_csv("/content/export_oil.csv")
#Check the head of oil_indonesia
oil_indonesia.head()
last_oil_price mid_USD_rate oil_export inflation
0 51.56 13662.0 815.3 0.0268
1 44.76 14234.0 805.2 0.0298
2 20.48 16367.0 617.4 0.0296
3 18.84 15157.0 562.1 0.0267
4 35.49 14733.0 560.9 0.0219
#Check the oil_indonesia table's shape
oil_indonesia.shape
(48, 5)
#Check the oil_indonesia info
oil_indonesia.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48 entries, 0 to 47
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 end_month 48 non-null object
1 last_oil_price 48 non-null float64
2 mid_USD_rate 48 non-null int64
3 oil_export 48 non-null float64
4 inflation 48 non-null float64
dtypes: float64(3), int64(1), object(1)
memory usage: 2.0+ KB
Change the mid_USD_rate data type to float64
oil_indonesia['mid_USD_rate'] = oil_indonesia['mid_USD_rate'].astype('float64')
oil_indonesia.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48 entries, 0 to 47
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 end_month 48 non-null object
1 last_oil_price 48 non-null float64
2 mid_USD_rate 48 non-null float64
3 oil_export 48 non-null float64
4 inflation 48 non-null float64
dtypes: float64(4), object(1)
memory usage: 2.0+ KB
#Check null value in oil_indonesia
print(oil_indonesia.isnull().sum())
end_month 0
last_oil_price 0
mid_USD_rate 0
oil_export 0
inflation 0
dtype: int64
Check for possible duplicated values in oil_indonesia
oil_indonesia.duplicated().sum()
#Drop "end_month" column that it is useless for regression calculation
oil_indonesia.drop(['end_month'],axis=1,inplace=True)
Check oil_indonesia with the describe method
oil_indonesia.describe()
last_oil_price mid_USD_rate oil_export inflation
count 48.000000 48.000000 48.000000 48.000000
mean 69.608542 14776.416667 1061.262500 0.028981
std 22.174758 554.931463 342.029537 0.014455
min 18.840000 13662.000000 0.000000 0.013200
25% 52.040000 14346.750000 849.275000 0.015975
50% 74.580000 14671.500000 1066.150000 0.026050
75% 82.242500 15101.500000 1311.150000 0.038500
max 114.670000 16367.000000 1662.900000 0.059500
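Note that the describe output shows a minimum oil_export of 0.0, which may indicate a missing or anomalous month. A quick sanity check (a minimal sketch reusing the oil_indonesia DataFrame above, with a hypothetical variable name zero_export) lists any affected rows:
#Inspect rows where oil_export is recorded as zero (possible data issue)
zero_export = oil_indonesia[oil_indonesia['oil_export'] == 0]
print(zero_export)
print('Rows with zero oil_export:', len(zero_export))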
2. Exploratory Data Analysis
Use a heatmap to visualize the correlation matrix of oil_indonesia
plt.figure(figsize=(10,5))
c = oil_indonesia.corr()
sns.heatmap(c,cmap="BrBG",annot=True)
c
last_oil_price mid_USD_rate oil_export inflation
last_oil_price 1.000000 0.073903 0.755607 0.458852
mid_USD_rate 0.073903 1.000000 0.156464 0.545627
oil_export 0.755607 0.156464 1.000000 0.541510
inflation 0.458852 0.545627 0.541510 1.000000
Use a pairplot to visualize the pairwise relationships in oil_indonesia
sns.pairplot(oil_indonesia)
<seaborn.axisgrid.PairGrid at 0x78b559296e90>
Create a linear model plot (using seaborn's lmplot) of inflation vs oil_export
sns.lmplot(x='inflation', y="oil_export", data=oil_indonesia)
<seaborn.axisgrid.FacetGrid at 0x78b557c3aa10>
3. Training and Testing Data
Set a variable X equal to the numerical features of oil_indonesia and a variable y equal to the oil_export column
X = oil_indonesia.drop(['oil_export'],axis=1)
y = oil_indonesia['oil_export']
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
Import train_test_split from sklearn.model_selection
from sklearn.model_selection import train_test_split
Split the data into training and testing sets. Set test_size=0.2 and random_state=0
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
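Note that the scaler above was fitted on the full dataset before splitting, so the test rows influence the scaling statistics. A common alternative (a minimal sketch with hypothetical variable names X_raw, scaler_alt, etc.) is to split first and fit the scaler on the training portion only:
#Alternative order: split first, then fit the scaler on the training rows only
X_raw = oil_indonesia.drop(['oil_export'], axis=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_raw, y, test_size=0.2, random_state=0)
scaler_alt = StandardScaler().fit(X_tr)      #statistics learned from training rows only
X_tr_scaled = scaler_alt.transform(X_tr)
X_te_scaled = scaler_alt.transform(X_te)     #same scaling applied to the held-out rows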
4. Training the Model
Import LinearRegression from sklearn.linear_model
from sklearn.linear_model import LinearRegression
Create an instance of a LinearRegression model named lm
lm = LinearRegression()
Train/fit lm on the training data
lm.fit(X_train, y_train)
LinearRegression()
Print out the coefficients of the model
lm.coef_
array([200.44050601, -10.80755483, 121.11087484])
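Because X was converted to a plain array by the scaler, the coefficient order follows the original column order (last_oil_price, mid_USD_rate, inflation). A small sketch pairs each coefficient (on the scaled features) with its name and the intercept:
#Pair each coefficient with its feature name; values apply to the scaled features
feature_names = ['last_oil_price', 'mid_USD_rate', 'inflation']
coef_table = pd.DataFrame({'feature': feature_names, 'coefficient': lm.coef_})
print(coef_table)
print('intercept:', lm.intercept_)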
Predict on X_test and print out the predictions
prediction = lm.predict(X_test)
prediction
array([1528.45452188, 705.9208487 , 1343.42451644, 1511.02483507,
1416.10915511, 1357.42519896, 1372.09041617, 1159.72902241,
700.81715592, 757.00143871])
5. Predicting the Test Data
Create a scatterplot of the real test values vs
the predicted values
plt.scatter(y_test, prediction)
plt.xlabel("Y_test")
plt.ylabel('predicted values')
Text(0, 0.5, 'predicted values')
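A residual check complements the scatter plot; a minimal sketch reusing the y_test and prediction arrays above:
#Plot the distribution of residuals (actual minus predicted)
residuals = y_test - prediction
plt.hist(residuals, bins=10)
plt.xlabel('residual (y_test - prediction)')
plt.ylabel('count')
plt.show()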
6. Evaluating the Model
Calculate the mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE) to evaluate the model
from sklearn import metrics
print('MAE',metrics.mean_absolute_error(y_test, prediction))
print('MSE',metrics.mean_squared_error(y_test,prediction))
print('RMSE',np.sqrt(metrics.mean_squared_error(y_test,prediction)))
MAE 139.51781104780872
MSE 25329.345231343865
RMSE 159.15195641695348
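As a sanity check, the same metrics can be computed directly from their definitions with numpy (a minimal sketch reusing y_test and prediction):
#Manual computation of MAE, MSE, and RMSE
errors = np.asarray(y_test) - prediction
mae = np.mean(np.abs(errors))      #mean absolute error
mse = np.mean(errors ** 2)         #mean squared error
rmse = np.sqrt(mse)                #root mean squared error
print(mae, mse, rmse)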
Print out the explained variance score to determine how accurately the model predicts
metrics.explained_variance_score(y_test,prediction)
0.8234756635987014
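Note that explained_variance_score is not exactly R-squared; sklearn's r2_score gives the coefficient of determination directly (a minimal sketch for comparison):
#Compare explained variance with the coefficient of determination (R-squared)
from sklearn.metrics import r2_score
print('R-squared on the test set:', r2_score(y_test, prediction))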
Print the head of the independent variables (X) from the oil_indonesia DataFrame
X = oil_indonesia.drop(["oil_export"],axis=1)
X.head()
last_oil_price mid_USD_rate inflation
0 51.56 13662.0 0.0268
1 44.76 14234.0 0.0298
2 20.48 16367.0 0.0296
3 18.84 15157.0 0.0267
4 35.49 14733.0 0.0219
Print the head of the dependent variable (y) from the oil_indonesia DataFrame
y = oil_indonesia['oil_export']
y.head()
0 815.3
1 805.2
2 617.4
3 562.1
4 560.9
Name: oil_export, dtype: float64
Add a constant to the X variable
X = sm.add_constant(X)
X.head()
const last_oil_price mid_USD_rate inflation
0 1.0 51.56 13662.0 0.0268
1 1.0 44.76 14234.0 0.0298
2 1.0 20.48 16367.0 0.0296
3 1.0 18.84 15157.0 0.0267
4 1.0 35.49 14733.0 0.0219
Print the OLS regression results to examine the performance of the regression model
OLS = sm.OLS(endog = y, exog = X).fit()
OLS.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:             oil_export   R-squared:                       0.620
Model:                            OLS   Adj. R-squared:                  0.594
Method:                 Least Squares   F-statistic:                     23.93
Date:                Wed, 17 Jan 2024   Prob (F-statistic):           2.44e-09
Time:                        10:50:40   Log-Likelihood:                -324.46
No. Observations:                  48   AIC:                             656.9
Df Residuals:                      44   BIC:                             664.4
Df Model:                           3
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
const            546.7744   1021.789      0.535      0.595   -1512.506    2606.054
last_oil_price     9.7749      1.661      5.887      0.000       6.428      13.122
mid_USD_rate      -0.0238      0.070     -0.339      0.736      -0.166       0.118
inflation       6432.0732   3031.540      2.122      0.040     322.405    1.25e+04
==============================================================================
Omnibus:                       48.862   Durbin-Watson:                   1.555
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              286.338
Skew:                          -2.453   Prob(JB):                     6.65e-63
Kurtosis:                      13.913   Cond. No.                     1.45e+06
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.45e+06. This might indicate that there are
strong multicollinearity or other numerical problems.
"""
7. Conclusion
Compare the predictions against the prepared testing data. Create a new DataFrame (new_df) holding the actual and predicted values
new_df = pd.DataFrame({"Actual":y_test,"Predicted":prediction})
new_df
Actual Predicted
29 1551.8 1528.454522
4 560.9 705.920849
26 1493.3 1343.424516
30 1287.6 1511.024835
32 1259.0 1416.109155
37 1186.5 1357.425199
34 1101.9 1372.090416
40 1308.6 1159.729022
7 599.6 700.817156
10 762.2 757.001439
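To make the comparison easier to read, an absolute error column can be appended (a minimal sketch extending new_df, with a hypothetical column name Abs_Error):
#Add the absolute prediction error for each test observation
new_df['Abs_Error'] = (new_df['Actual'] - new_df['Predicted']).abs()
new_df.sort_values('Abs_Error')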