Stepwise Regression Tutorial With NumXL

This is the second entry in our regression analysis and modeling series. In this tutorial, we continue the analysis discussion we started earlier and leverage an advanced technique, stepwise regression, to help us find an optimal set of explanatory variables for the model. For more information and/or to download the spreadsheet file, see http://bitly.com/133pCQn

Uploaded by

NumXL Pro
Copyright
© Attribution Non-Commercial (BY-NC)

Tutorial:

Regression 102
This is the second entry in our regression analysis and modeling series. In this tutorial, we continue the analysis discussion we started earlier and leverage an advanced technique, stepwise regression, to help us find an optimal set of explanatory variables for the model.

Again, [Link] attempts to explain and predict weekly sales for each salesperson (the dependent variable) using two explanatory variables: intelligence (IQ) and extroversion.

Data Preparation
Similar to what we did in an earlier tutorial, we organize our sample data by placing the values of each variable in a separate column and each observation in a separate row.

Next, [Link] (0, 1), which chooses which variable is included in (or excluded from) the analysis.

Initially, at the top of the table, let's insert the mask cells array, each with a value of 1 ([Link]). The array is shown highlighted below.

In this example, we have 20 observations and two independent (explanatory) variables. The dependent variable is the weekly sales.
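The layout described above (one column per variable, one row per observation, and a 0/1 mask cell above each explanatory column) can be sketched outside the spreadsheet as follows. This is only an illustration: the four sample rows are made-up values, not the tutorial's data set, and `apply_mask` is a hypothetical helper name, not a NumXL function.

```python
# Hypothetical sample rows: (IQ, extroversion, weekly sales). The values are
# made up for illustration and are NOT the tutorial's data set.
rows = [
    (110, 18, 2600.0),
    (98, 25, 3100.0),
    (105, 12, 2200.0),
    (120, 22, 3300.0),
]
names = ["IQ", "Extroversion"]
mask = [1, 1]  # 1 = include the variable, 0 = exclude it


def apply_mask(rows, names, mask):
    """Return the kept variable names and the X matrix restricted to them."""
    keep = [j for j, flag in enumerate(mask) if flag == 1]
    kept_names = [names[j] for j in keep]
    X = [[row[j] for j in keep] for row in rows]
    return kept_names, X


# The dependent variable is the last entry of each row.
y = [row[-1] for row in rows]

# With every mask cell set to 1, both explanatory variables are used.
print(apply_mask(rows, names, mask))

# Setting a mask cell to 0 drops that column without touching the data.
print(apply_mask(rows, names, [0, 1]))
```

The data table itself never changes; only the mask decides which columns reach the regression.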

Process
Now, [Link], select an empty cell in your worksheet where you wish the output to be generated, then locate and click on the regression icon in the NumXL tab (or toolbar).

Regression 102 Tutorial

Spider Financial Corp, 2013

The Regression wizard appears.

Select the cells range for the response/dependent variable values ([Link]). Select the cells range for the explanatory (independent) [Link] (X) Mask, select the cells at the top of the data table (Boolean array).

Notes:
1. The cells range includes (optionally) the heading (Label) cell, which would be used in the output tables where it references those variables.
2. The explanatory variables (i.e. X) are already grouped by columns (each column represents a variable), so we don't need to change that.
3. By default, the output cells range is set to the currently selected cell in your worksheet.

Please note that once we select the X and Y cells range, the Options, Forecast and Missing Values tabs become available (enabled).

Next, select the Options tab.

Initially, the tab is set to the following values:
- The regression intercept/[Link] [Link] ([Link](0)), enter it there.
- The significance level (aka. α) is set to 5%.
- In the Output section, the most common regression analyses are selected.
- [Link].

Now, click on the Missing Values tab.

In this tab, you can select an approach to handle missing values in the data set (X and Y). By default, any missing value found in X or in Y in any observation would exclude the observation from the analysis. This treatment is a good approach for our analysis, so let's leave it unchanged.

Now, click OK to generate the output tables:
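The default treatment described above, where an observation is excluded if any of its X or Y values is missing, can be sketched as follows. This is a minimal illustration and not NumXL's implementation; `is_missing` and `drop_incomplete` are hypothetical names, and treating `None` and NaN as "missing" is an assumption.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_incomplete(X, y):
    """Keep only observations where every X value and the Y value are present."""
    kept = [
        (x_row, y_val)
        for x_row, y_val in zip(X, y)
        if not is_missing(y_val) and not any(is_missing(v) for v in x_row)
    ]
    X_clean = [list(x_row) for x_row, _ in kept]
    y_clean = [y_val for _, y_val in kept]
    return X_clean, y_clean


# Hypothetical data: the second row lacks an X value, the third lacks Y.
X = [[110, 18], [None, 25], [105, 12]]
y = [2600.0, 3100.0, float("nan")]
print(drop_incomplete(X, y))
```

Only the first observation survives here, because each of the other two has at least one missing value.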

Analysis
Aside from the Variables (X) Mask settings, everything is exactly the same as what we did in the prior tutorial, so what's our next step?

The Mask variable determines which variable is included in the regression analysis, so let's take another look at the Coefficients table.

First, [Link] mask value for this cell to zero.

Now, if you have the Calculation option set to manual, [Link], the spreadsheet recalculates automatically.

Checking the output tables, we find the following:
- R-squared dropped by 6%.
- Adjusted R-squared dropped by 1.5%.
- Standard error increased by $3.
- AIC dropped by one (1).
- The ANOVA table shows the regression is significant.
- Residual diagnosis checks out for all tests.
- In the regression coefficients table, the intercept and the coefficient of the Extroversion variable are both statistically significant.

This model has fewer parameters ([Link]) and explains the variation in the values of the response variable just as well as when we had two (2) explanatory variables.

Now, let's plot the estimated values against the actual.
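The comparison above rests on a handful of goodness-of-fit measures. The sketch below shows one common way to compute them for a full and a reduced model using ordinary least squares. It is illustrative only: the eight observations are made up, `solve` and `fit_stats` are hypothetical helper names, and the AIC is the Gaussian form n·ln(RSS/n) + 2p, which may differ from NumXL's definition by a constant.

```python
import math


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_stats(X, y):
    """OLS with an intercept; returns R^2, adjusted R^2, std. error and AIC."""
    n = len(y)
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(Z[0])                         # number of estimated parameters
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yv for z, yv in zip(Z, y)) for i in range(p)]
    beta = solve(XtX, Xty)                # normal equations
    rss = sum((yv - sum(b * v for b, v in zip(beta, z))) ** 2
              for z, yv in zip(Z, y))
    y_bar = sum(y) / n
    tss = sum((yv - y_bar) ** 2 for yv in y)
    r2 = 1.0 - rss / tss
    return {
        "r2": r2,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - p),
        "std_error": math.sqrt(rss / (n - p)),
        "aic": n * math.log(rss / n) + 2 * p,  # assumed Gaussian AIC form
    }


# Hypothetical observations: (IQ, extroversion, weekly sales).
data = [(110, 18, 2600), (98, 25, 3100), (105, 12, 2200), (120, 22, 3300),
        (95, 15, 2300), (102, 28, 3500), (115, 10, 2100), (108, 20, 2900)]
y = [float(row[2]) for row in data]
X_full = [[row[0], row[1]] for row in data]  # mask = (1, 1)
X_reduced = [[row[1]] for row in data]       # mask = (0, 1)

full = fit_stats(X_full, y)
reduced = fit_stats(X_reduced, y)
for key in ("r2", "adj_r2", "std_error", "aic"):
    print(key, round(full[key], 4), round(reduced[key], 4))
```

Plain R-squared can only fall or stay the same when a variable is dropped, which is why adjusted R-squared and AIC, both of which penalize extra parameters, are the more useful figures when comparing models of different sizes.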


[Figure: estimated versus actual weekly sales ($ Sales/Week) for observations 1 to 20; vertical axis from $1,500 to $4,500.]

The shaded area represents the 95% confidence interval for the estimates of the regression model.

So far, we have demonstrated that dropping a variable from the analysis is as easy as flipping a switch; [Link], but you might be wondering: if I had more explanatory variables (say 10), what is the optimal set of variables? Should I try every single subset?

NumXL supports an interesting functionality, stepwise regression, to help you select this optimal set. Let's demonstrate how you would use it.

(1) In the Mask cells range, turn the variables on or off that you wish the stepwise regression to [Link], we will turn them all on.

(2) Locate and click on the regression icon in the NumXL tab.

(3) The Regression Wizard pops up.
(4) In the General tab, select the input cells range and the mask cells range.
(5) Under the Options tab, check the Stepwise Regression box.
(6) Leave the 3 different methods checked.
(7) Click OK.
(8) The output tables are generated.

The stepwise regression generates one additional table next to the coefficients table.

Let's take a closer look at this new table. The stepwise regression carries out a series of partial F-tests to include (or drop) variables from the regression model.
- Forward selection: we start with an intercept only, and examine adding one additional variable at a time.
- Backward elimination: we start from the full model with all variables in, and consider dropping one regressor at a time.
- Bidirectional elimination is a hybrid of the two methods.
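The forward and backward procedures described above can be sketched with a partial F statistic as follows. This is a simplified illustration rather than NumXL's algorithm: the fixed thresholds `f_enter` and `f_remove` (4.0) are assumptions standing in for a significance-level test, the helper names are hypothetical, the data are synthetic, and the bidirectional variant is not implemented.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def rss(X, y, cols):
    """Residual sum of squares for OLS with an intercept plus columns `cols`."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yv for z, yv in zip(Z, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yv - sum(b * v for b, v in zip(beta, z))) ** 2
               for z, yv in zip(Z, y))


def partial_f(X, y, small, big):
    """Partial F statistic for the single extra variable in `big` vs `small`."""
    df = len(y) - (len(big) + 1)  # residual degrees of freedom of `big`
    rss_big = rss(X, y, big)
    return (rss(X, y, small) - rss_big) / (rss_big / df)


def forward_selection(X, y, k, f_enter=4.0):
    """Start from the intercept; add the best variable while F >= f_enter."""
    chosen = []
    while len(chosen) < k:
        scores = {j: partial_f(X, y, chosen, chosen + [j])
                  for j in range(k) if j not in chosen}
        best = max(scores, key=scores.get)
        if scores[best] < f_enter:
            break
        chosen.append(best)
    return [1 if j in chosen else 0 for j in range(k)]


def backward_elimination(X, y, k, f_remove=4.0):
    """Start from the full model; drop the weakest variable while F < f_remove."""
    chosen = list(range(k))
    while chosen:
        scores = {j: partial_f(X, y, [c for c in chosen if c != j], chosen)
                  for j in chosen}
        worst = min(scores, key=scores.get)
        if scores[worst] >= f_remove:
            break
        chosen.remove(worst)
    return [1 if j in chosen else 0 for j in range(k)]


# Synthetic data: y depends on the first column only; the second is noise.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, 1, -1, -1, 1, -1, 1]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
X = [[a, b] for a, b in zip(x1, x2)]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]

print("forward: ", forward_selection(X, y, 2))
print("backward:", backward_elimination(X, y, 2))
```

Each function returns its result as a 0/1 mask, so models found by different methods can be compared entry by entry.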

[Link] (1) stands for inclusion and zero (0) for exclusion.

At the bottom of the table, [Link] In this case, the three methods came back with the same set of variables, so no comparison is needed.

Please note that, given the same set of input variables and responses, the mask is used to differentiate one model from the others simply by listing which variables are included and which are excluded.
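Because a mask fully identifies a model for a given data set, the earlier question about trying every single subset amounts to enumerating every possible mask. A brief sketch (with `all_masks` as a hypothetical helper name) shows how quickly that list grows:

```python
from itertools import product


def all_masks(k):
    """Every 0/1 inclusion mask over k candidate explanatory variables."""
    return [list(m) for m in product([0, 1], repeat=k)]


# Two variables give four candidate models, including the intercept-only one.
print(all_masks(2))        # [[0, 0], [0, 1], [1, 0], [1, 1]]

# Ten variables already give 2**10 candidate models.
print(len(all_masks(10)))  # 1024
```

Stepwise regression examines only a small path through these masks, which keeps it practical as the number of candidate variables grows, at the cost of not guaranteeing the best subset overall.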

Conclusion
So far, we have created a regression model, examined its significance, verified that it satisfies the underlying assumptions, and found the optimal subset of variables for the model.

For many, this is the end of the analysis, and they would probably start using the model for forecasting. Before we can use the model for forecasting, there are two more questions we ought to answer:
(1) Do we have any observation that exerts a significant influence ([Link]) on the regression model?
(2) Is the regression model stable over the sample data?

[Link], read on.
