…intractable in general, both to compute exactly and to approximate (e.g., by sampling statistics).
11. What is Markov Chain Sampling?
- Generate events by making a random change to the preceding event.
- The change is made using the Markov Blanket of the variable to be changed.
- Markov Blanket: parents, children, children's parents.
- Tally and normalize the results.
12. What is a causal network?
A causal network is an acyclic digraph arising from an evolution of a substitution system and representing its history. The illustration above shows a causal network obtained by applying a set of string rewrite rules to an initial condition.
UNIT III
SUPERVISED LEARNING
———
Introduction to machine learning, Linear Regression Models: Least squares, single & multiple variables, Bayesian linear regression, gradient descent, Linear Classification Models: Discriminant function - Perceptron algorithm, Probabilistic discriminative model - Logistic regression, Probabilistic generative model - Naive Bayes, Maximum margin classifier - Support vector machine, Decision Tree, Random Forests
3.1. LINEAR REGRESSION MODELING
Linear Regression has actually been around for a very long time (around 200 years). It is a linear model, i.e., it assumes a linear relationship between the input variables (x) and a single output variable (y). y can be calculated by a linear combination of the input variables. Linear regression is a type of machine learning algorithm that is used to model the relation between a scalar dependent variable and one or more independent variables. The case of having one independent variable is known as simple linear regression, while the case of having multiple independent variables is known as multiple linear regression.
Review Questions
Explain Bayes' theorem with an example.
Explain the working of the Naive Bayes' Classifier with an example.
Write in detail about exact inference in Bayesian networks.
Explain in detail about approximate inference in Bayesian networks.
Illustrate a causal network with a neat diagram.

[Fig.: simple linear regression fit]
In both of these linear regressions, the model is constructed using a linear predictor function. The unknown model parameters are estimated using the available data set. Linear regression has various applications in fields such as finance, economics, and epidemiology.
3.1.1. REGRESSION ALGORITHMS
Regression algorithms are used when you notice that the output is continuous, whereas classification algorithms are used when the output is divided into sections such as Pass/Fail or Good/Average/Bad, etc. We have various algorithms for performing the regression or classification actions, with the Linear Regression Algorithm being the basic algorithm in Regression.

Coming to Linear Regression, before getting into the algorithm, let me set the base first. Hope you remember the line equation concept. Let me give a hint: if you are given two points on the XY plane, say (x1, y1) and (x2, y2), where y1 is the output of x1 and y2 is the output of x2, then the equation of the line that passes through the points is (y − y1) = m(x − x1), where m is the slope of the line.
It about You
\ |
Fig
After finding the line equation, if you are given a point, say (x3, y3), you would easily be able to predict whether the point lies on the line, or the distance of the point from the line. This was the basic regression that we all did in schooling, without even realizing it.
Artificial Intelligence and Machine Learning
Linear regression can be divided into two types based on the number of independent variables:

1. Simple Linear Regression: a single independent variable is used to predict the value of the dependent variable. Example: marks scored by a student based on the number of hours studied (simple linear regression).

2. Multiple Linear Regression: more than one independent variable is used to predict the value of the dependent variable, e.g., marks scored (dependent variable) vs. the number of hours studied and the amount of sleep (independent variables).

The equation that describes how the predicted value of y is related to p independent variables is called the Multiple Linear Regression equation.

[Fig.: Multiple Linear Regression model depicted on the iris data]
3.2. LEAST SQUARES METHOD
The least squares method is the process of finding the best-fitting curve or line of best fit for a set of data points by reducing the sum of the squares of the offsets (residuals) of the points from the curve. During the process of finding the relation between two variables, the trend of outcomes is estimated quantitatively. This process is termed regression analysis. The method of curve fitting is an approach to regression analysis. This method of fitting equations which approximate the curves to given raw data is the least squares.

It is obvious that the fitting of curves for a particular data set is not always unique. Thus, it is required to find the curve having minimal deviation from all the measured data points. This is known as the best-fitting curve and is found by using the least-squares method.
3.2.1. LEAST SQUARES METHOD DEFINITION
The least squares method is a crucial statistical method that is practised to find a regression line or a best-fit line for the given pattern. This method is described by an equation with specific parameters. The method of least squares is generously used in evaluation and regression. In regression analysis, this method is said to be a standard approach for the approximation of sets of equations having more unknowns than the number of equations.

The method of least squares actually defines the solution for the minimization of the sum of squares of deviations or the errors in the result of each equation. The error measures the deviation of the prediction from the desired output. For example, if the prediction is 7 and the expected output is 10, then the error would be 10 − 7 = 3. Squaring is important so that the contribution of every error is always positive.

Finally, we add up all of these individual squared errors. Since the smallest possible value for the sum of squared terms indicates that the line fits the data as closely as possible, the line of best fit is determined by the choice of a and b that produce exactly the smallest value of L(a, b). For this reason, the model is also called least squares regression.
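The closed-form least-squares line described above can be sketched in a few lines of NumPy. This is an illustrative example of mine (the hours/marks numbers are made up, not from the text):

```python
import numpy as np

# Hypothetical data: hours studied (x) vs. marks scored (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

# Least-squares slope and intercept from the normal equations:
#   a = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
#   b = y_mean - a * x_mean
a = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - a * x.mean()

print(a, b)  # slope 4.1, intercept 47.7
```

The same coefficients minimize the sum of squared residuals discussed above, so any other (a, b) pair would give a larger total squared error on this data.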
Optimising the Model
At this point, we have fully defined both our model

ŷ(x; a, b) = a·x + b

and our loss function, into which we can insert our model:

L(a, b) = Σ_i (ŷ(x_i; a, b) − y_i)² = Σ_i (a·x_i + b − y_i)²
The red dot marked on the plot of L shows where the desired minimum is. We need an algorithm to find this minimum. From calculus, we know that at the minimum of L both partial derivatives must be equal to 0:

∂L/∂a = Σ_i 2(a·x_i + b − y_i)·x_i = 0
∂L/∂b = Σ_i 2(a·x_i + b − y_i) = 0
If you need to revise this aspect of calculus, I would recommend a refresher. Now, for this problem it is possible to solve for a and b using these equations, solving like we would any pair of simultaneous equations. But for more advanced machine learning this is impossible, so instead we will learn to use an algorithm called gradient descent to find the minimum.
There is an intuitive picture for this: if a ball is placed at any location on the surface of L, it would naturally roll downhill towards the lowest valley of L and thus find the minimum. We know the direction of downhill at any location once we know the derivatives of L: the derivatives give the direction of greatest upward slope (this is known as the gradient), so the opposite (negative) derivatives give the most downhill direction. Therefore, if the ball is currently at location (a, b), we can simulate where it would go by moving it to a location (a′, b′):
a′ = a − α · ∂L/∂a
b′ = b − α · ∂L/∂b

where α is the learning rate.
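The update rules above can be sketched directly in plain Python. This is an illustration of mine with made-up data on the exact line y = 2x + 1, not an example from the text:

```python
# Gradient descent for the line model y = a*x + b with squared-error loss
# L(a, b) = sum_i (a*x_i + b - y_i)^2
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # points lying exactly on y = 2x + 1

a, b = 0.0, 0.0
alpha = 0.01  # learning rate

for _ in range(20000):
    # Partial derivatives of L with respect to a and b at the current point
    dL_da = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys))
    dL_db = sum(2 * (a * x + b - y) for x, y in zip(xs, ys))
    # Move downhill: the negative gradient is the steepest descent direction
    a -= alpha * dL_da
    b -= alpha * dL_db

print(round(a, 3), round(b, 3))  # converges near a = 2, b = 1
```

With a learning rate that is too large the iterates would overshoot and diverge; with a much smaller one, many more steps would be needed, which matches the learning-rate discussion later in this unit.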
You can use multiple linear regression when you want to know:
1. How strong the relationship is between two or more independent variables and one dependent variable (e.g., how rainfall, temperature, and amount of fertilizer added affect crop growth).
2. The value of the dependent variable at certain values of the independent variables (e.g., the expected yield of a crop at certain levels of rainfall, temperature, and fertilizer addition).
Multiple linear regression example
You are a public health researcher interested in social factors that influence heart disease. You survey 500 towns and gather data on the percentage of people in each town who smoke, the percentage of people in each town who bike to work, and the percentage of people in each town who have heart disease. Because you have two independent variables and one dependent variable, and all your variables are quantitative, you can use multiple linear regression to analyze the relationship between them.
Assumptions of multiple linear regression
Multiple linear regression makes all of the same assumptions as simple linear regression:
1. Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn't change significantly across the values of the independent variable.
2. Independence of observations: the observations in the dataset were collected using statistically valid sampling methods, and there are no hidden relationships among variables.
In multiple linear regression, it is possible that some of the independent variables are actually correlated with one another, so it is important to check these before developing the regression model. If two independent variables are too highly correlated (r² > ~0.6), then only one of them should be used in the regression model.
3. Normality: The data follows a normal distribution.
4. Linearity: the line of best fit through the data points is a straight line, rather than a curve or some sort of grouping factor.
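The collinearity check mentioned in assumption 2 can be sketched with NumPy. This is my own illustration with synthetic data (the variable names are hypothetical); the 0.6 threshold follows the rule of thumb in the text:

```python
import numpy as np

# Hypothetical predictor columns: smoking %, biking %, plus a third variable
rng = np.random.default_rng(0)
smoking = rng.uniform(0, 30, 100)
biking = rng.uniform(0, 75, 100)
almost_smoking = smoking * 1.1 + rng.normal(0, 0.5, 100)  # nearly duplicates smoking

X = np.column_stack([smoking, biking, almost_smoking])
r = np.corrcoef(X, rowvar=False)  # pairwise correlation coefficients
r_squared = r ** 2

# Flag predictor pairs whose r^2 exceeds the ~0.6 rule of thumb
high = [(i, j) for i in range(3) for j in range(i + 1, 3) if r_squared[i, j] > 0.6]
print(high)  # [(0, 2)]: smoking and almost_smoking are collinear
```

Of each flagged pair, only one variable would be kept in the regression model, as the text recommends.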
‘her an a carve or ome set af upg facto,(sia) Arita! igen and Machine Learner Learning a
The formula for a multiple linear regression is:

Y = β0 + β1X1 + … + βnXn + ε

where:
- Y = the predicted value of the dependent variable
- β0 = the y-intercept (the value of Y when all other parameters are set to 0)
- β1X1 = the regression coefficient (β1) of the first independent variable (X1), i.e., the effect that increasing the value of that independent variable has on the predicted Y value
- … = the same for however many independent variables you are testing
- βnXn = the regression coefficient of the last independent variable
- ε = model error, i.e., how much variation there is in our estimate of Y

To find the best-fit line for each independent variable, multiple linear regression calculates the regression coefficients that lead to the smallest overall model error, the t statistic of the overall model, and the associated p value.
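As a minimal sketch (mine, not the text's), the coefficients β0, β1, …, βn can be estimated by least squares with NumPy's `lstsq` on synthetic data with known true coefficients:

```python
import numpy as np

# Hypothetical data: two predictors (X1, X2) and a response y generated
# from known coefficients beta = (4.0, 2.5, -1.5) plus small noise
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(200, 2))
y = 4.0 + 2.5 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.1, 200)

# Prepend a column of ones so the intercept beta_0 is estimated too
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

print(beta.round(2))  # approximately [4.0, 2.5, -1.5]
```

The recovered coefficients are close to the true generating values because the noise is small relative to the signal.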
The variance is the square of the standard deviation σ (multiplied by the identity matrix, because this is a multi-dimensional formulation of the model).
The aim of Bayesian Linear Regression is not to find the single "best" value of the model parameters, but rather to determine the posterior distribution for the model parameters. Not only is the response generated from a probability distribution, but the model parameters are assumed to come from a distribution as well. The posterior probability of the model parameters is conditional upon the training inputs and outputs:
P(β | y, X) = P(y | β, X) · P(β | X) / P(y | X)

Here, P(β | y, X) is the posterior probability distribution of the model parameters given the inputs and outputs. This is equal to the likelihood of the data, P(y | β, X), multiplied by the prior probability of the parameters, P(β | X), and divided by a normalization constant. This is a simple expression of Bayes' theorem: the posterior is proportional to the likelihood of the data multiplied by the prior probability of the parameters. Here we can observe the two primary benefits of Bayesian Linear Regression:
1. Priors: if we have domain knowledge, or a guess for what the model parameters should be, we can include them in our model, unlike in the frequentist approach, which assumes everything there is to know about the parameters comes from the data. If we don't have any estimates ahead of time, we can use non-informative priors for the parameters, such as a normal distribution.
2. Posterior: The result of performing Bayesian Linear Regression is a distribution of possible model parameters based on the data and the prior. This allows us to quantify our uncertainty about the model: if we have fewer data points, the posterior will be more spread out.
As the amount of data points increases, the likelihood washes out the prior, and in the case of infinite data, the outputs for the parameters converge to the values obtained from OLS.
3.5. BAYESIAN LINEAR REGRESSION
In the Bayesian viewpoint, we formulate linear regression using probability distributions rather than point estimates. The response, y, is not estimated as a single value, but is assumed to be drawn from a probability distribution. The model for Bayesian Linear Regression, with the response sampled from a normal distribution, is:

y ~ N(βᵀX, σ²I)

The output, y, is generated from a normal (Gaussian) distribution characterized by a mean and variance. The mean for linear regression is the transpose of the weight matrix multiplied by the predictor matrix. The variance is the square of the standard deviation σ.
The formulation of model parameters as distributions encapsulates the Bayesian worldview: we start out with an initial estimate, our prior, and as we gather more evidence, our model becomes less wrong. Bayesian reasoning is a natural extension of our intuition. Often, we have an initial hypothesis, and as we collect data that either supports or disproves our ideas, we change our model of the world (ideally this is how we would reason).
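As an illustrative sketch of this prior-to-posterior idea (my own, assuming a Gaussian prior on the weights and a known noise variance, neither of which is specified in the text), the posterior for Bayesian linear regression can be computed in closed form:

```python
import numpy as np

rng = np.random.default_rng(2)

def posterior(X, y, sigma2=1.0, tau2=10.0):
    # Conjugate Gaussian posterior for weights w with prior N(0, tau2 * I)
    # and likelihood y ~ N(X @ w, sigma2 * I):
    #   S = inv(I / tau2 + X.T @ X / sigma2)   (posterior covariance)
    #   m = S @ X.T @ y / sigma2               (posterior mean)
    d = X.shape[1]
    S = np.linalg.inv(np.eye(d) / tau2 + X.T @ X / sigma2)
    m = S @ X.T @ y / sigma2
    return m, S

# Hypothetical one-feature data generated from y = 3x + noise
X = rng.normal(0, 1, size=(500, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1.0, 500)

m_small, S_small = posterior(X[:10], y[:10])  # few observations
m_large, S_large = posterior(X, y)            # many observations

# The posterior spread shrinks as data accumulates,
# and the posterior mean approaches the OLS estimate
print(S_small[0, 0] > S_large[0, 0])  # True
```

With only 10 points the posterior variance is large (the prior still matters); with all 500 points the likelihood dominates, matching the "likelihood washes out the prior" observation above.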
3.5.1. IMPLEMENTING BAYESIAN LINEAR REGRESSION
In practice, evaluating the posterior distribution for the model parameters is intractable for continuous variables, so we use sampling methods to draw samples from the posterior in order to approximate it. The technique of drawing random samples from a distribution to approximate the distribution is one application of Monte Carlo methods. There are a number of algorithms for Monte Carlo sampling, with the most common being variants of Markov Chain Monte Carlo (see this post for an application in Python).
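A bare-bones Metropolis sampler, the simplest Markov Chain Monte Carlo variant, can be sketched as follows. This is my own illustration; it targets a standard normal density rather than a real regression posterior, since MCMC only needs the target density up to a constant:

```python
import math
import random

random.seed(42)

def target(x):
    # Unnormalized target density (here a standard normal).
    # An unnormalized posterior could be plugged in the same way.
    return math.exp(-0.5 * x * x)

def metropolis(n_steps, step_size=1.0):
    x = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = x + random.uniform(-step_size, step_size)  # random-walk proposal
        # Accept with probability min(1, target(proposal) / target(x))
        if random.random() < target(proposal) / target(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 1), round(var, 1))  # close to the true mean 0 and variance 1
```

Each step depends only on the current state, which is what makes the sampler a Markov chain; tallying the visited states approximates the target distribution, just as in the Markov Chain Sampling summary at the start of this chunk.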
3.5.2. BAYESIAN LINEAR MODELING APPLICATION
Running the model produces samples from the posterior distribution for the model parameters. We can inspect these distributions to get a sense of what is occurring. The first plots show the approximations of the posterior distributions of the model parameters. These are the result of 1000 steps of MCMC, meaning the algorithm drew 1000 steps from the posterior distribution.

If we compare the mean values for the slope and intercept to those obtained from OLS (the intercept from OLS was 21.59 and the slope was 7.17), we see that they are very similar. However, while we can use the mean as a single point estimate, we also have a range of possible values for the model parameters. As the number of data points increases, this range will shrink and converge to one single value, representing greater confidence in the model parameters. (In Bayesian inference a range of values is called a credible interval, which has a slightly different interpretation from a confidence interval in frequentist inference.)
When we want to show the linear fit from a Bayesian model, instead of showing only the estimate, we can draw a range of lines, with each one representing a different estimate of the model parameters. As the number of data points increases, the lines begin to overlap, because there is less uncertainty in the model parameters. In order to demonstrate the effect of the number of data points on the model, I used two models: the first, with the resulting fits shown on the left, used 500 data points, and the one on the right used 15000 data points. Each graph shows 100 possible models drawn from the model parameter posterior.
Fig.: Bayesian Linear Regression Model Results with 500 (left) and 15000 (right) observations
There is much more variation in the fits when using fewer data points, which represents greater uncertainty in the model. With all of the data points, the OLS and Bayesian fits are nearly identical, because the priors are washed out by the likelihoods from the data. When predicting the output for a single datapoint using a Bayesian Linear Model, we also do not get a single value but a distribution. Following is the probability density plot for the number of calories burned exercising for 15.5 minutes. The red vertical line indicates the point estimate from OLS.

Fig. 11: Posterior Probability Density of Calories Burned from the Bayesian Model

We can see that the probability of the number of calories burned peaks around 93, but the full estimate is a range of possible values.
3.6. GRADIENT DESCENT
Gradient Descent is known as one of the most commonly used optimization algorithms to train machine learning models by means of minimizing errors between actual and expected results. Further, gradient descent is also used to train Neural Networks. In mathematical terminology, optimization refers to the task of minimizing/maximizing an objective function f(x) parameterized by x. Similarly, in machine learning, optimization is the task of minimizing the cost function parameterized by the model's parameters. The main objective of gradient descent is to minimize the convex function using iteration of parameter updates. Once these machine learning models are optimized, these models can be used as powerful tools for Artificial Intelligence and various computer science applications.

The best way to define the local minimum or local maximum of a function using gradient descent is as follows:
- If we move towards a negative gradient, or away from the gradient of the function at the current point, it will give the local minimum of that function.
- Whenever we move towards a positive gradient, or towards the gradient of the function at the current point, we will get the local maximum of that function.

This entire procedure is known as Gradient Ascent, which is also known as steepest descent. The main objective of using a gradient descent algorithm is to minimize the cost function using iteration. To achieve this goal, it performs two steps iteratively: it computes the gradient at the current point, and then takes a step in the opposite direction, scaled by the learning rate.
The main objective of gradient descent is to minimize the cost function, or the error between expected and actual results. To minimize the cost function, two data points are required: Direction & Learning Rate.

These two factors are used to determine the partial derivative calculation of the next iteration and allow it to reach the point of convergence, a local minimum or the global minimum. Let's discuss the learning rate in brief.
Learning rate:
It is defined as the step size taken to reach the minimum or lowest point. This is typically a small value that is evaluated and updated based on the behaviour of the cost function. If the learning rate is high, it results in larger steps, but also leads to risks of overshooting the minimum. At the same time, a low learning rate means small step sizes, which compromises overall efficiency but gives the advantage of more precision.

[Fig.: effect of a large vs. a small learning rate]
3.6.1. TYPES OF GRADIENT DESCENT
Based on the error in various training models, the Gradient Descent learning algorithm can be divided into:
- Batch gradient descent,
- Stochastic gradient descent, and
- Mini-batch gradient descent.
Let's understand these different types of gradient descent:
- Calculate the first-order derivative of the function to compute the gradient or slope at the current point.
- Move in the direction opposite to the gradient; the size of each step is obtained by multiplying the derivative by the learning rate.
How does Gradient Descent work?
Before studying the working principle of gradient descent, we should know some basic concepts to find the slope of a line from linear regression. The equation for simple linear regression is given as:

Y = mX + c

where m represents the slope of the line, and c represents the intercept on the y-axis.
[Fig.: gradient descent from a starting point to the point of convergence]

The starting point (shown in the above fig.) is used to evaluate the performance, as it is considered just an arbitrary point. At this starting point, we will derive the first derivative or slope and then use a tangent line to calculate the steepness of this slope. Further, this slope will inform the updates to the parameters (weights and bias). The slope is steep at the starting point or arbitrary point, but whenever new parameters are generated, the steepness gradually decreases, until the lowest point is approached, which is called the point of convergence.
1. Batch Gradient Descent:
Batch gradient descent (BGD) is used to find the error for each point in the training set and update the model after evaluating all training examples. This procedure is known as a training epoch. In simple words, it is a greedy approach where we have to sum over all examples for each update.

Advantages of Batch Gradient Descent:
- It produces less noise in comparison to other gradient descent variants.
- It produces stable gradient descent convergence.
- It is computationally efficient, as all resources are used for all training samples.
2. Stochastic Gradient Descent:
Stochastic gradient descent (SGD) is a type of gradient descent that runs one training example per iteration. In other words, it processes a training epoch for each example within a dataset and updates the parameters one training example at a time. As it requires only one training example at a time, it is easier to store in allocated memory. However, it shows some computational efficiency losses in comparison to batch gradient descent, as its frequent updates require more detail and speed. Further, due to frequent updates, it is also treated as a noisy gradient. However, sometimes this can be helpful in finding the global minimum and also escaping a local minimum.

Advantages of Stochastic Gradient Descent:
In Stochastic gradient descent (SGD), learning happens on every example, and it has a few advantages over other variants of gradient descent:
- It is easier to allocate in desired memory.
- It is relatively fast to compute compared to batch gradient descent.
- It is more efficient for large datasets.
3. MiniBatch Gradient Descent:
Mini-batch gradient descent is the combination of both batch gradient descent and stochastic gradient descent. It divides the training dataset into small batches and performs the updates on those batches separately. Splitting training datasets into smaller batches makes a balance that maintains the computational efficiency of batch gradient descent and the speed of stochastic gradient descent. Hence, we can achieve a special type of gradient descent with higher computational efficiency and less noisy gradient descent.

Advantages of Mini-Batch Gradient Descent:
- It is easier to fit in allocated memory.
- It is computationally efficient.
- It produces stable gradient descent convergence.
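The three variants above differ only in how many examples feed each parameter update. A minimal mini-batch gradient descent sketch of mine (with made-up data lying exactly on the line y = 2x + 1, not an example from the text):

```python
import random

random.seed(0)
# Hypothetical points lying exactly on y = 2x + 1
data = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(100)]

m, c = 0.0, 0.0
alpha = 0.01     # learning rate
batch_size = 10  # mini-batch size

for epoch in range(2000):
    random.shuffle(data)  # reshuffle so each epoch sees different batches
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Gradient of the mean squared error over this mini-batch only
        grad_m = sum(2 * (m * x + c - y) * x for x, y in batch) / len(batch)
        grad_c = sum(2 * (m * x + c - y) for x, y in batch) / len(batch)
        m -= alpha * grad_m  # step opposite the gradient
        c -= alpha * grad_c

print(round(m, 2), round(c, 2))  # converges near m = 2, c = 1
```

Setting `batch_size = len(data)` turns this into batch gradient descent, and `batch_size = 1` turns it into stochastic gradient descent, which shows how the three variants relate.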
3.6.2. CHALLENGES WITH THE GRADIENT DESCENT
Although we know Gradient Descent is one of the most popular methods for optimization problems, it still also has some challenges. There are a few challenges, as follows:

1. Local Minima and Saddle Point:
For convex problems, gradient descent can find the global minimum easily, while for non-convex problems it is sometimes difficult to find the global minimum, where the machine learning models achieve the best results.
[Fig.: local minima and saddle point on the cost surface]
Whenever the slope of the cost function is at zero or just close to zero, the model stops learning further. Apart from the global minimum, there occur some scenarios that can show this slope, which are saddle points and local minima. Local minima generate a shape similar to the global minimum, where the slope of the cost function increases on both sides of the current point.

In contrast, with saddle points, the negative gradient only occurs on one side of the point, which reaches a local maximum on one side and a local minimum on the other side. The name "saddle point" is taken from that of a horse's saddle. It is called a local minimum because the value of the loss function is minimum at that point in a local region, whereas it is called the global minimum because the value of the loss function is minimum there globally, across the entire domain of the loss function.
2. Vanishing and Exploding Gradients:
In a deep neural network, if the model is trained with gradient descent and backpropagation, two more issues can occur besides local minima and saddle points.
Vanishing Gradients:
Vanishing gradient occurs when the gradient is smaller than expected. During backpropagation, this gradient becomes smaller, causing a decrease in the learning of the earlier layers relative to the later layers of the network. Once this happens, the weight parameter updates become insignificant.
Exploding Gradient:
Exploding gradient is just the opposite of the vanishing gradient, as it occurs when the gradient is too large and creates an unstable model. In this case, the model weights increase and may end up represented as NaN. This problem can be addressed using dimensionality reduction techniques, which help to minimize complexity within the model.
3.7. LINEAR CLASSIFICATION
A linear classifier makes a classification decision based on the value of a linear combination of the characteristics. Imagine that the linear classifier will merge into its weights all of the characteristics that define a particular class. (Like merging all the samples of the class as weights.)

This type of classifier works better when the problem is linearly separable.

Fig. 3.16

In the multi-class case, each possible class has its own set of weights (a row of W), so each class will be represented by one row in our weight matrix.
Weights and bias effect
The effect of changing the weights is to change the line's angle (its orientation), while changing the bias moves the line left or right (shifts it).
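The weight-matrix picture above can be sketched numerically. This is an illustration of mine: the weights, biases, and feature values below are made-up numbers, with one row of W and one bias per class:

```python
import numpy as np

# Linear classifier scores: s = W @ x + b  (3 classes, 4 features)
W = np.array([[0.2, -0.5, 0.1, 2.0],
              [1.5, 1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])  # one row of weights per class
b = np.array([1.1, 3.2, -1.2])          # one bias per class

x = np.array([56.0, 231.0, 24.0, 2.0])  # feature vector of one input

scores = W @ x + b                        # one score per class
predicted_class = int(np.argmax(scores))  # class with the highest score wins
print(predicted_class)  # class 1 has the highest score here
```

Scaling a row of W rotates that class's decision boundary, while changing the corresponding entry of b shifts the boundary without rotating it, which is exactly the weights-and-bias effect described above.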