14.4.2. Decision Tree for Regression Problems
The Decision Tree algorithm for regression problems can be applied using the DecisionTreeRegressor() function from the sklearn.tree library. For understanding the utility of decision trees, we will consider the Longley dataset, which can be downloaded from https://www.itl.nist.gov/div898/strd/lls/data/LINKS/DATA/Longley.dat
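Before working through the Longley example, the basic fit/predict interface of DecisionTreeRegressor() can be illustrated with a minimal sketch on synthetic data; the toy arrays below are illustrative only and are not part of the Longley walkthrough.

#Minimal sketch of the DecisionTreeRegressor interface on synthetic data (illustrative only).
import numpy
from sklearn.tree import DecisionTreeRegressor

X_toy = numpy.array([[1.0], [2.0], [3.0], [4.0]])   #single predictor
y_toy = numpy.array([1.1, 2.0, 2.9, 4.2])           #continuous target
toy_tree = DecisionTreeRegressor(random_state=0)
toy_tree.fit(X_toy, y_toy)
print(toy_tree.predict([[2.5]]))                    #returns the mean target of the matching leaf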
Dataset Information: The Longley dataset contains various U.S. macroeconomic variables that are known to be highly collinear. Variable name definitions are as follows (an alternative way of loading the data is sketched after this list):

Employed: Total Employment
GNPdeflator: GNP deflator
GNP: GNP
Unemployed: Number of unemployed
Armed.Forces: Size of armed forces
Population: Population
Year: Year (1947-1962)
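If the CSV file is not readily available, the same data also ships with the statsmodels package; the loading sketch below assumes statsmodels is installed and notes that it uses its own column names (e.g. TOTEMP for total employment), whereas the rest of this section assumes the CSV version.

#Alternative loading sketch (assumes the statsmodels package is installed).
import statsmodels.api as sm
longley_sm = sm.datasets.longley.load_pandas().data
print(longley_sm.columns)   #statsmodels column names differ from the CSV used below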
#Importing libraries.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.tree import export_graphviz
from sklearn.metrics import mean_squared_error
from math import sqrt
import numpy
import pandas
import matplotlib.pyplot as plt
#Reading data.
longley = pandas.read_csv("longley.csv")
#Displaying the characteristics of the data set.
print("The dimension of the data set is:", longley.shape)
print("The names of the variables in the data set are:\n", longley.columns)
#Determining null values.
print("Null values in dataset:\n", longley.isnull().sum())
print("Not available values in dataset:\n", longley.isna().sum())
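An optional inspection step, not part of the original listing, can be added here to summarize the variables before modelling.

#Optional inspection of the data set (not in the original listing).
print(longley.head())       #first few observations
print(longley.describe())   #summary statistics for each variable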
#Using a random seed function for generating the same data set.
numpy.random.seed(3000)
training, test = train_test_split(longley, test_size=0.3)
x_trg = training.drop('Employed', axis=1)
y_trg = training['Employed']
x_test = test.drop('Employed', axis=1)
y_test = test['Employed']
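As an alternative to seeding NumPy's global random number generator, train_test_split() also accepts a random_state argument that fixes the split directly; the seed value below is illustrative.

#Alternative (illustrative): fix the split via train_test_split's random_state argument.
training, test = train_test_split(longley, test_size=0.3, random_state=3000)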
#Creating a decision tree model.
print("DECISION TREE MODEL")
treelongley = DecisionTreeRegressor(random_state=0)
treelongley.fit(x_trg, y_trg)
#Determining the importance of the predictor variables.
print("The importance of predictor variables of decision tree model for longley data set:\n",
      treelongley.feature_importances_)
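Because feature_importances_ is returned as a bare array, pairing it with the predictor names makes the output easier to read; the following is a small sketch using pandas, not part of the original listing.

#Labelling each importance with its predictor name (sketch, not in the original listing).
importances = pandas.Series(treelongley.feature_importances_, index=x_trg.columns)
print(importances.sort_values(ascending=False))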
#Creating a visual model of the decision tree algorithm.
export_graphviz(treelongley, out_file="tree.dot", feature_names=x_trg.columns,
                impurity=False, filled=True)
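export_graphviz() only writes a .dot description of the tree; turning it into an image requires the Graphviz software. The rendering sketch below assumes the graphviz Python package and the Graphviz binaries are installed, and that the output file name matches the call above.

#Rendering sketch (assumes the graphviz Python package and Graphviz binaries are installed).
import graphviz
with open("tree.dot") as dot_file:
    graphviz.Source(dot_file.read()).render("tree", format="png", cleanup=True)   #writes tree.png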
#Prediction on test set.
tree_pred = treelongley.predict(x_test)
#Calculate RMSE for the model.
tree_rmse = sqrt(mean_squared_error(y_test, tree_pred))
print("RMSE value for Decision Tree model is:", tree_rmse)
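The Longley data contain only 16 observations, so an RMSE computed from a single 70/30 split is quite noisy. A hedged alternative, not part of the original listing, is to estimate the error with k-fold cross-validation.

#Cross-validated RMSE estimate (sketch; cv=5 is an illustrative choice).
from sklearn.model_selection import cross_val_score
x_all = longley.drop('Employed', axis=1)
y_all = longley['Employed']
cv_scores = cross_val_score(DecisionTreeRegressor(random_state=0), x_all, y_all,
                            scoring='neg_mean_squared_error', cv=5)
print("Cross-validated RMSE:", sqrt(-cv_scores.mean()))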
#Creating a new decision tree model.
treelongleynew = DecisionTreeRegressor(random_state=0)
treelongleynew.fit(x_trg, y_trg)
#Creating a visual model of the new decision tree algorithm.
export_graphviz(treelongleynew, out_file="treenew.dot", feature_names=x_trg.columns,
                impurity=False, filled=True)
#Displaying importance of each variable in decision tree.
plt.figure(figsize=(20, 10))
plt.yticks(range(0, len(x_trg.columns)), x_trg.columns)
plt.barh(range(0, len(x_trg.columns)), treelongleynew.feature_importances_, align='center')
plt.xlabel("Importance")
plt.ylabel("Predictor variable")
plt.show()
#Prediction on test set for new decision tree model.
tree_pred2 = treelongleynew.predict(x_test)
#Calculate RMSE for the new decision tree model.
tree_rmse2 = sqrt(mean_squared_error(y_test, tree_pred2))
print("RMSE value for new Decision Tree model is:", tree_rmse2)
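A fully grown decision tree can memorize such a small data set, so its depth is often restricted. The loop below is an exploratory sketch, not part of the original listing, showing how a max_depth constraint changes the test RMSE; the depth values are illustrative.

#Exploratory sketch: effect of restricting tree depth (depth values are illustrative).
for depth in (2, 3, 4):
    pruned_tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    pruned_tree.fit(x_trg, y_trg)
    pruned_rmse = sqrt(mean_squared_error(y_test, pruned_tree.predict(x_test)))
    print("max_depth =", depth, "RMSE =", pruned_rmse)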