ML Notes
Supervised Learning
We are given a data set and already know what the correct output should look like, having the idea that there is a relation between the input and the output.
Supervised problems are categorized into "regression" and "classification" problems.
In regression, we are trying to predict results with a continuous output -> we map input variables to some continuous function.
In classification, we try to predict results in a discrete output (we map input variables into discrete categories).
Unsupervised Learning
Allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.
Cost Function
Example: training set of housing prices.

Size in ft^2 (x) | Price in 1000's (y)
2104             | 460
1416             | 232
1534             | 315
852              | 178

The notation will be the following:
m -> number of training examples (here m = 4)
x -> "input" variable / features
y -> "output" variable / target variable
(x, y) -> one training example
$(x^{(i)}, y^{(i)})$ -> ith training example
e.g. $x^{(1)} = 2104$, $y^{(3)} = 315$
How do we represent h?
$h_\theta(x) = \theta_0 + \theta_1 x$
[Figure: training points in the (x, y) plane with a straight line fitted through them]
Hypothesis: a function where the $\theta$'s are the parameters. It maps from x's to y's.
What we want is to choose the correct values of $\theta_0$ and $\theta_1$ so that $h_\theta(x)$ is close to y for our training examples (x, y).
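In order to do so, the best option is to minimize over $\theta_0$ and $\theta_1$:
$\min_{\theta_0, \theta_1} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$   (1)
such that the distance between $h_\theta(x^{(i)})$ and $y^{(i)}$ is small, where $h_\theta(x^{(i)}) = \theta_0 + \theta_1 x^{(i)}$.
The expression in (1) has a name of its own: it is called the cost function or squared error function.
$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$   (2)
This is the most common equation for regressions.
# Reminder: our goal is to minimize J over $\theta_0$ and $\theta_1$.
Notice that $h_\theta(x)$ depends on x, and $J(\theta)$ is a function that depends on the $\theta$'s.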
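To help analyze what this means, take the simplified hypothesis
$h_\theta(x) = \theta_1 x$, i.e. $\theta_0 = 0$.
$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2 = \frac{1}{2m} \sum_{i=1}^{m} (\theta_1 x^{(i)} - y^{(i)})^2$
For each value of $\theta_1$ we get one value of the cost function. For this case ($\theta_0 = 0$), taking $\theta_1 = 0.5$, the cost function gives
$J(0.5) = \frac{1}{2 \cdot 3} \left[ (0.5 - 1)^2 + (1 - 2)^2 + (1.5 - 3)^2 \right] \approx 0.58$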
The cost function for one more example would look like this: if $\theta_1 = 0$ then $J(0) \approx 2.3$.
So if we keep evaluating this function for different values of $\theta_1$ we get the curve of $J(\theta_1)$.
[Figure: $J(\theta_1)$ plotted against $\theta_1$, a bowl-shaped curve]
This is the reason why we need to find the value of $\theta_1$ that minimizes the cost function. Doing so will give us the best model.
For this case $\theta_1 = 1$ is the global minimum.
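The evaluation of $J(\theta_1)$ for several values of $\theta_1$ can be sketched in Python. This is only an illustration: the three data points are an assumption (points on the line y = x), chosen because they give $J(0.5) \approx 0.58$ and $J(0) \approx 2.33$, and the function name `cost` is hypothetical.

```python
def cost(theta1, xs, ys):
    """Squared-error cost J(theta1) for the one-parameter hypothesis h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Assumed toy data (not legible in the notes): three points on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Evaluate the cost for a few candidate values of theta1.
for theta1 in (0.0, 0.5, 1.0, 1.5):
    print(theta1, round(cost(theta1, xs, ys), 2))
```

The smallest cost in this sweep is at theta1 = 1, where the line passes through every point.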
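The best way to visualize $J(\theta_0, \theta_1)$ is with 3D plots or contour plots.
[Figure: contour plot of $J(\theta_0, \theta_1)$. Points on the same contour line correspond to the same value of $J(\theta_0, \theta_1)$; the minimum is inside the smallest contour.]
[Figure: descent paths for different values of $\alpha$. If $\alpha$ is large we are taking large steps; if $\alpha$ is small we are taking baby steps.]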
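The deal with (3) is that we want to update the values of $\theta_0$ and $\theta_1$ simultaneously. On the computer this is how it is done:
  temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
  temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
  $\theta_0$ := temp0
  $\theta_1$ := temp1
The following order is wrong:
  temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
  $\theta_0$ := temp0
  temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
  $\theta_1$ := temp1
because here $\theta_0$ is overwritten before temp1 is computed. The values of $\theta_0$ and $\theta_1$ must be updated at the same time, otherwise the next update is altered.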
If $\theta_1$ is already at a minimum (even if it is a local minimum and not a global one), gradient descent leaves it unchanged.
Also, for a fixed $\alpha$, as we approach a local minimum gradient descent will automatically take smaller steps, so there is no need to decrease $\alpha$ over time. This smaller step is obvious when we check the definition of the derivative: at a minimum its value approaches zero.
Now we can apply this to linear regression. These are our computations:
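$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_j} \left[ \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2 \right] = \frac{\partial}{\partial \theta_j} \left[ \frac{1}{2m} \sum_{i=1}^{m} (\theta_0 + \theta_1 x^{(i)} - y^{(i)})^2 \right]$
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})$
$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \, x^{(i)}$
(update $\theta_0$ and $\theta_1$ simultaneously)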
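On the computer, the gradient descent function will look like this as it eventually approaches the minimum.
[Figure: contour plot of $J(\theta_0, \theta_1)$ with the gradient descent path moving toward the minimum]
The linear approximation will look like this, until over time it reaches the correct approximation.
[Figure: fitted line $h_\theta(x)$ improving over the iterations]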
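The update rules for $\theta_0$ and $\theta_1$, with the simultaneous update, can be sketched in Python as follows. The data, the learning rate `alpha = 0.1` and the number of iterations are assumptions picked for illustration; the function name `gradient_descent` is hypothetical.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x with simultaneous updates."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors h(x_i) - y_i computed with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Compute both new values from the old parameters first ...
        temp0 = theta0 - alpha * sum(errors) / m
        temp1 = theta1 - alpha * sum(e * x for e, x in zip(errors, xs)) / m
        # ... and only then overwrite them (simultaneous update).
        theta0, theta1 = temp0, temp1
    return theta0, theta1

# Assumed toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))
```

With this data the parameters should approach theta0 = 1 and theta1 = 2; a much larger `alpha` would make the iterations diverge instead.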
MATLAB Review
To write matrices:
  A = [1, 2, 3; 4, 5, 6; 7, 8, 9; 10, 11, 12]
A is a 4x3 matrix.
To get its dimensions:
  [m, n] = size(A)
  dim = size(A)
We can also use an index into a variable:
  A23 = A(2, 3)
In MATLAB we can add or multiply a scalar to a matrix, and we can also multiply matrices (the inner dimensions must agree):
  C = A * B
To make an identity matrix of size n:
  I = eye(n)
To take the transpose:
  A_trans = A'
To take the inverse of a square matrix A:
  A_inv = inv(A)
Multiple Features
In the original model we only had one feature x, but imagine that we have $x_1, x_2, x_3, \dots$
For the example of the house prices we will denote the following:
n -> number of features
$x^{(i)}$ -> the features of the ith training example (a vector; each element of the vector is a feature with a subindex)
$x_j^{(i)}$ -> value of feature j in the ith training example
[Table: housing data with several feature columns and the price; the values are not legible in the source]
If we also select $x_0 = 1$ we will have
$x = [x_0, x_1, \dots, x_n]^T \in \mathbb{R}^{n+1}$
and the vector $\theta$ also has n + 1 elements.
The hypothesis will change to:
$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$   (5)
Remember that $x_0 = 1$.
The vector x is (n+1)-dimensional, and so is $\theta$. To write the hypothesis function in vectorized shape we have:
$h_\theta(x) = \theta^T x$   (6)
Equation (6) is called multivariate linear regression.
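Gradient descent for multiple variables
Let's recall the cost function:
$J(\theta) = J(\theta_0, \dots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$
For multiple parameters we can rewrite it as:
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (\theta^T x^{(i)} - y^{(i)})^2$
Gradient descent: repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \dots, \theta_n)$
} (simultaneously update for every j = 0, ..., n)
and we have, for $n \ge 1$:
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$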
Having multiple features causes problems in the cost function if the parameters are not on a "similar" scale.
The fix is to rescale the quantities (this can be performed by taking the mean of the quantity or standardizing them). For example:
$x_1 = \frac{\text{size (ft}^2)}{2000}$
By doing this the contours of the cost function will be less skewed, and gradient descent will take less time and a more direct path to get to the minimum.
When doing feature scaling it is important not to be very strict: it is enough that every feature ends up in approximately the range $-1 \le x_i \le 1$. A range such as $-100 \le x_i \le 100$ is too wide. This also applies to very small values, such as $-0.001 \le x_i \le 0.001$.
There are other, more "precise" scaling methods, like mean normalization and standardization.
Mean normalization
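* Does not apply to $x_0$ (because $x_0$ is always 1).
All you need to do is replace $x_i$ with $\frac{x_i - \mu_i}{s_i}$, where $\mu_i$ is the mean of all the samples of feature i. Example:
$x_1 = \frac{\text{size} - 1000}{s_1}$
The quantity $s_i$ is given by the max value minus the minimum value.
$x_2 = \frac{\#\text{bedrooms} - 2}{s_2}$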
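Mean normalization of one feature column can be sketched like this in Python. The helper name `mean_normalize`, the sample values and the handling of a constant column are assumptions made for the example.

```python
def mean_normalize(values):
    """Return (x - mu) / s for each x, with mu the mean and s = max - min."""
    mu = sum(values) / len(values)
    s = max(values) - min(values)
    if s == 0:
        # A constant feature has zero range; map it to zeros (assumed convention).
        return [0.0 for _ in values]
    return [(v - mu) / s for v in values]

sizes = [2104.0, 1416.0, 1534.0, 852.0]  # illustrative house sizes
scaled = mean_normalize(sizes)
print([round(v, 3) for v in scaled])
```

After scaling, the values are centred on 0 and span a range of exactly 1, so they fall inside the recommended interval from -1 to 1.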
Also, we will make sure that gradient descent is working correctly.
To do so we can "debug" by plotting $J(\theta)$ against the number of iterations. What we should see by doing this is that $J(\theta)$ decreases after each iteration.
[Figure: $J(\theta)$ against the number of iterations, a decreasing curve]
If we see that $J(\theta)$ does not decrease over the iterations, it means that our chosen value of $\alpha$ might not be suitable (typically it is too large).
Polynomial Regression
Linear regression is a basic fit because it is a naive model: it is based on an ideal model where our distributions are linear.
Polynomial regression allows us to fit different distributions by raising the features x to some powers.
Using the example of the house pricing, suppose the features are the front and the depth, and we have the following distribution.
[Figure: price against size, a curved cloud of points]
A quadratic model could fit this data, such as
$\theta_0 + \theta_1 x + \theta_2 x^2$
But what happens if the chosen model eventually falls?
We might accept that, or think that prices won't fall. Well, we can then use another model. Perhaps a cubic one:
$\theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3$
It is important to add that we can also choose fractional values as powers. For example we could have chosen
$\theta_0 + \theta_1 x + \theta_2 \sqrt{x}$
and it would never fall.
Normal Equation
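X is an m x (n+1) matrix:
$X = \begin{bmatrix} x_0^{(1)} & x_1^{(1)} & \cdots & x_n^{(1)} \\ \vdots & \vdots & & \vdots \\ x_0^{(m)} & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix}, \qquad y = \begin{bmatrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{bmatrix}$
where y is m-dimensional. X is also called the design matrix.
Another way to solve this is by
$\theta = (X^T X)^{-1} X^T y$   (9)
Equation (9) is called the normal equation and is analogous to gradient descent. Its advantage is that there is no need to choose a value for $\alpha$ and no need to iterate. We can compare both methods.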
Gradient Descent                  | Normal Equation
* Need to choose $\alpha$         | * No need to choose $\alpha$
* Needs many iterations           | * Don't need to iterate
* Works well even when n is large | * Need to compute $(X^T X)^{-1}$, which is slow if n is very large (the inversion has complexity $O(n^3)$)
If n is very large (on the order of 10,000 or more), use gradient descent.
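The normal equation can be tried on a tiny example in plain Python. This sketch does not form the inverse explicitly: it solves the linear system $(X^T X)\theta = X^T y$ by Gauss-Jordan elimination, which gives the same $\theta$ when $X^T X$ is invertible. All helper names and the toy data are assumptions.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a * t = b for a square matrix a by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (non-invertible)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def normal_equation(X, y):
    """theta from (X^T X) theta = X^T y, equivalent to theta = (X^T X)^{-1} X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(v * t for v, t in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)

# Assumed toy data from y = 1 + 2x; the first column is x0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = normal_equation(X, y)
print([round(t, 6) for t in theta])
```

No learning rate and no iterations are needed, but the elimination step is what becomes expensive when the number of features grows.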
Non-invertibility of the normal equation
The possible reasons why $(X^T X)$ might not be invertible are:
* Redundant features. Example: $x_1$ = size in ft^2 and $x_2$ = size in m^2.
* Too many features (e.g. $m \le n$). Fix: delete some features, or use regularization.
Logistic Regression
It classifies outputs "y" into {0, 1}: 0 is the negative class and 1 is the positive class. This is the case for a classification problem.
Suppose we have this distribution, where we classify a tumor as malignant or not (y = 1 or y = 0). The question is: in a set of medical patients, who has a malignant tumor?
[Figure: malignant (1) or not (0) against tumor size, with a straight line showing how the hypothesis might look]
If we fit a linear predictor we can make a threshold classifier on the output $h_\theta(x)$ at 0.5 such that:
if $h_\theta(x) \ge 0.5$ predict y = 1
if $h_\theta(x) < 0.5$ predict y = 0
This might seem correct and a little obvious, but what if our distribution of malignant tumors changes such that it looks like this?
[Figure: the same data with an extra example far to the right, which tilts the fitted line]
If we continue with the same analysis, then at $h_\theta = 0.5$ we might make some wrong predictions when choosing the right or the left side.
Our conclusion is that applying linear regression to classification problems usually is not a good idea.
In the logistic regression model we want the hypothesis to lie between 0 and 1, such that:
$0 \le h_\theta(x) \le 1$
In order to do so we need to change the shape of the hypothesis we know:
$\theta^T x \longrightarrow h_\theta(x) = g(\theta^T x)$
where $g(z) = \frac{1}{1 + e^{-z}}$   (10)
Equation (10) is called the sigmoid function, and it is a logistic function. That is why it is called logistic regression.
[Figure: the sigmoid g(z) against z]
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$   (12)
Equation (12) is the logistic regression model.
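In this case $h_\theta(x)$ represents the estimated probability that y = 1 on some input x. For example, in the tumor case:
$x = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \begin{bmatrix} 1 \\ \text{tumor size} \end{bmatrix}$, $h_\theta(x) = 0.7$
This is the output of the hypothesis for some x. It tells our patient that there is a 70% chance of having a malignant tumor.
We can also understand the hypothesis as:
$h_\theta(x) = P(y = 1 \mid x; \theta)$
the probability of y = 1 given some x, parameterized by $\theta$.
If $P(y = 1 \mid x; \theta) = 0.7$, then $P(y = 0 \mid x; \theta) = 0.3$, i.e.
$P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)$   (13)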
How can we know when y = 1 or y = 0?
Whenever the logistic regression hypothesis is equal to or higher than 0.5, predict y = 1:
y = 1 if $h_\theta(x) \ge 0.5$
y = 0 if $h_\theta(x) < 0.5$
Looking at the sigmoid function we can see that $g(z) \ge 0.5$ when $z \ge 0$. Thus
$h_\theta(x) = g(\theta^T x) \ge 0.5$ whenever $\theta^T x \ge 0$.
The same analysis holds for y = 0:
y = 0 if $h_\theta(x) = g(\theta^T x) < 0.5 \Rightarrow \theta^T x < 0$
For example, we have y = 1 when $-3 + x_1 + x_2 \ge 0$, i.e. $x_1 + x_2 \ge 3$.
The boundary $x_1 + x_2 = 3$ is a straight line (purple in the figure). Therefore everything above the purple line corresponds to y = 1.
This straight line is called the decision boundary, because it separates the region where the hypothesis predicts y = 1 from the region where it predicts y = 0. The decision boundary is a property of the hypothesis (of $\theta$ and x), not a property of the dataset.
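The threshold rule and the straight-line boundary $x_1 + x_2 = 3$ can be checked with a few lines of Python. The function names are hypothetical and the two test points are picked only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(theta, x):
    """Return 1 when h(x) = g(theta^T x) >= 0.5, else 0; x must include x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if sigmoid(z) >= 0.5 else 0

theta = [-3.0, 1.0, 1.0]                 # decision boundary x1 + x2 = 3
print(predict(theta, [1.0, 1.0, 1.0]))   # point below the line
print(predict(theta, [1.0, 2.0, 2.0]))   # point above the line
```

Because g(z) >= 0.5 exactly when z >= 0, the same prediction could be made from the sign of theta^T x without evaluating the sigmoid at all.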
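Consider the following training set:
[Figure: examples of one class surrounded by examples of the other class in the $(x_1, x_2)$ plane]
We can use polynomial regression as well for logistic regression. So suppose the following hypothesis:
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$
and we found $\theta = (-1, 0, 0, 1, 1)^T$.
Then we predict y = 1 if
$-1 + x_1^2 + x_2^2 \ge 0 \Rightarrow x_1^2 + x_2^2 \ge 1$ -> circle equation.
Our decision boundary is a circle of radius 1.
As the polynomial order increases we might get decision boundaries with different (even funny) shapes:
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_1^2 x_2 + \theta_5 x_1^2 x_2^2 + \theta_6 x_1^3 x_2 + \dots)$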
So let's recap a little bit. We have our training set:
$\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})\}$   (m examples)
$x = [x_0, x_1, \dots, x_n]^T$ with $x_0 = 1$, and $y \in \{0, 1\}$
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
Given the training set, how do we choose $\theta$?
Before, we used the cost function J:
$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} (h_\theta(x^{(i)}) - y^{(i)})^2$
Let's change the notation such that
$\mathrm{cost}(h_\theta(x), y) = \frac{1}{2} (h_\theta(x) - y)^2$
We are removing the superindices. Note that the hypothesis is now a probability of obtaining the label y = 1.
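In logistic regression, if we minimize the original cost function we will obtain a non-convex function: there is no simple solution, because gradient descent can stop in a local minimum.
So for logistic regression our cost function will look like this:
$\mathrm{cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$
[Figure: cost against $h_\theta(x)$ for y = 1; the cost is 0 at $h_\theta(x) = 1$ and grows without bound as $h_\theta(x) \to 0$]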
So the cost is 0 if y = 1 and $h_\theta(x) = 1$, but as $h_\theta(x) \to 0$ the cost $\to \infty$.
This captures the intuition that if $h_\theta(x) = 0$ (we predict that $P(y = 1 \mid x; \theta) = 0$) but y = 1, we will penalize the learning algorithm by a very large cost.
In other words, in our tumor example (in labeling cases), by obtaining $h_\theta(x) = 0$ we are saying that the patient has a 0% chance of having a malignant tumor, which in real life is impossible. That is why we send the cost to infinity.
If $h_\theta(x) = y$, then cost = 0 (for y = 0, 1).
If y = 0, the cost $\to \infty$ as $h_\theta(x) \to 1$.
Regardless of whether y = 0 or y = 1, if $h_\theta(x) = 0.5$ then the cost is greater than 0.
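To simplify, we can write the cost function as
$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$
Thus the logistic regression cost function is:
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$
This function is indeed convex.
To fit the parameters $\theta$ we also need to minimize $J(\theta)$, and to make a prediction given a new x we need to use
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$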
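To minimize $J(\theta)$ we also need gradient descent. Repeat until convergence:
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$   (simultaneously update all $\theta_j$)
Notice that the only thing that has changed is the definition of $h_\theta(x)$.
Remember that we can see if the process is working properly by plotting the cost function against the iterations: the value of $J(\theta)$ must get lower as the steps increase.
A vectorized representation is:
$h = g(X\theta)$
$J(\theta) = \frac{1}{m} \left( -y^T \log(h) - (1 - y)^T \log(1 - h) \right)$
$\theta := \theta - \frac{\alpha}{m} X^T (g(X\theta) - y)$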
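As a rough check of the cost function and the update rule, the sketch below runs a few gradient descent steps in Python and records $J(\theta)$ after each one. The data set, `alpha = 0.1`, the 200 iterations and all function names are assumptions for the example.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis(theta, x):
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y)
    total = 0.0
    for row, label in zip(X, y):
        h = hypothesis(theta, row)
        total += -label * math.log(h) - (1 - label) * math.log(1 - h)
    return total / m

def gradient_step(theta, X, y, alpha):
    """One simultaneous update theta_j := theta_j - (alpha/m) * sum((h - y) * x_j)."""
    m = len(y)
    errors = [hypothesis(theta, row) - label for row, label in zip(X, y)]
    return [t - alpha / m * sum(e * row[j] for e, row in zip(errors, X))
            for j, t in enumerate(theta)]

# Assumed toy data: x0 = 1 plus one feature; small values are class 0, large are class 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]

theta = [0.0, 0.0]
history = [cost(theta, X, y)]
for _ in range(200):
    theta = gradient_step(theta, X, y, alpha=0.1)
    history.append(cost(theta, X, y))
print(round(history[0], 4), round(history[-1], 4))
```

The recorded `history` is the quantity one would plot against the number of iterations to confirm that the cost keeps going down.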
Optimization
For optimization there are several algorithms, such as:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
For the ones other than gradient descent there is no need to manually pick $\alpha$, and they are often faster than gradient descent, but they are more complex algorithms.
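In the programming language (MATLAB/Octave), to compute $\min_\theta J(\theta)$ with $J(\theta) = (\theta_1 - 5)^2 + (\theta_2 - 5)^2$:
  function [jVal, gradient] = costFunction(theta)
    % Cost value J(theta)
    jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;
    % Partial derivatives: dJ/dtheta_1 = 2(theta_1 - 5), dJ/dtheta_2 = 2(theta_2 - 5)
    gradient = zeros(2, 1);
    gradient(1) = 2 * (theta(1) - 5);
    gradient(2) = 2 * (theta(2) - 5);
  end

  % Option names and values go together in pairs; MaxIter is a number
  options = optimset('GradObj', 'on', 'MaxIter', 100);
  initialTheta = zeros(2, 1);
  [optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);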
Multiclass Classification
These are problems where the output values "y" are labeled 1, 2, 3, ... such as:
Email foldering/tagging: Work, Family, Hobbies, Friends.
Medical diagnosis: Not ill, Cold, Flu.
Weather: Sunny, Cloudy, Rain, Snow.
[Figure: binary classification (two groups of points) next to multi-class classification (three groups of points)]
For multiclass data sets we use the one-vs-all method. The option in this case is to separate the training set into several binary problems, such that each class in turn is fitted against all the others.
[Figure: one-vs-all; three copies of the data, each with one class as positive and the rest as negative, giving $h_\theta^{(1)}(x)$, $h_\theta^{(2)}(x)$ and $h_\theta^{(3)}(x)$]
So we will have different classifiers $h_\theta^{(i)}(x)$:
$h_\theta^{(i)}(x) = P(y = i \mid x; \theta)$   (i = 1, 2, 3)
So now we train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class i to predict the probability that y = i.
On a new input x, to make a prediction, pick the class i that maximizes
$\max_i h_\theta^{(i)}(x)$
This way we pick the most confident one (we will pick the one with the highest probability).
So, if we have a multi-class classification problem with k classes, so $y \in \{1, 2, \dots, k\}$, using the one-vs-all method we will have k different logistic regression classifiers trained.
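The prediction step of one-vs-all can be sketched in Python as follows. The three parameter vectors are hand-picked, hypothetical values (they are not trained), and the function names are assumptions.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_one_vs_all(thetas, x):
    """Return (class, scores): the 1-based class whose classifier gives the highest h(x)."""
    scores = [sigmoid(sum(t * xi for t, xi in zip(theta, x))) for theta in thetas]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1, scores

# Hypothetical parameters for k = 3 classifiers; x includes x0 = 1.
thetas = [
    [2.0, -1.0, -1.0],   # classifier for class 1
    [-2.0, 1.0, -1.0],   # classifier for class 2
    [-2.0, -1.0, 1.0],   # classifier for class 3
]
label, scores = predict_one_vs_all(thetas, [1.0, 3.0, 0.5])
print(label, [round(s, 3) for s in scores])
```

Each score is the estimated probability from one binary classifier; the scores do not have to add up to 1, only the largest one matters.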
There is one thing we must be careful about: overfitting.
If we have too many features (x), the learned hypothesis may fit the training set very well (the cost function $J(\theta) \approx 0$), but fail to generalize to new examples.
Generalize: how well a hypothesis applies to new examples (things that are not in the training set).
To address overfitting:
1. Reduce the number of features: manually select which features to keep.
What if the regularization parameter $\lambda$ is very large (e.g. $\lambda = 10^{10}$)?
Well, the algorithm results in underfitting (it fails to fit even the training set): $\theta_1, \dots, \theta_n$ will be approximately 0, thus $h_\theta(x) \approx \theta_0$.
Now we apply this to gradient descent and the other functions we defined before.
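For gradient descent without regularization:
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$   (j = 0, ..., n)
Notice that $\lambda$ does not affect $\theta_0$, thus we will have to split our equations.
So for regularized gradient descent we will have:
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_0^{(i)}$
$\theta_j := \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right]$   (j = 1, ..., n)
which can be simplified to
$\theta_j := \theta_j \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$   (20)
Notice that $(1 - \alpha \frac{\lambda}{m}) < 1$ because $\alpha, \lambda > 0$, and both are also small numbers compared with m. Thus $(1 - \alpha \frac{\lambda}{m})$ shrinks the value of $\theta_j$.
And the second part of gradient descent remains the same.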
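The split between $\theta_0$ and the other parameters can be sketched in Python for linear regression. The function name `regularized_step` and the toy data are assumptions; the data is chosen so that the prediction errors are zero, which isolates the shrinking factor.

```python
def regularized_step(theta, X, y, alpha, lam):
    """One regularized gradient descent step for h(x) = theta^T x (theta_0 is not regularized)."""
    m = len(y)
    errors = [sum(t * xi for t, xi in zip(theta, row)) - label
              for row, label in zip(X, y)]
    new_theta = []
    for j, t in enumerate(theta):
        grad = sum(e * row[j] for e, row in zip(errors, X)) / m
        if j == 0:
            # theta_0: plain update, lambda does not appear.
            new_theta.append(t - alpha * grad)
        else:
            # theta_j: shrink by (1 - alpha*lambda/m), then apply the usual gradient term.
            new_theta.append(t * (1 - alpha * lam / m) - alpha * grad)
    return new_theta

# Assumed toy data from y = 1 + 2x, evaluated at the exact parameters [1, 2].
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
stepped = regularized_step([1.0, 2.0], X, y, alpha=0.1, lam=2.0)
print(stepped)
```

Since the errors are zero here, the gradient term vanishes and only theta_1 changes, by the factor 1 - 0.1 * 2 / 4 = 0.95.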
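What happens to the normal equation? It will change to:
$\theta = \left( X^T X + \lambda L \right)^{-1} X^T y$, where $L = \mathrm{diag}(0, 1, 1, \dots, 1)$ is an (n+1) x (n+1) matrix.
Is the matrix in this equation invertible? If $\lambda > 0$, regularization ensures that this matrix is non-singular (invertible).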
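How does logistic regression look with regularization?
Recall the logistic regression cost function, now with one more term:
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$
The last sum is the regularization: we just need to add this term. Remember that we start at j = 1 instead of j = 0.
Gradient descent will look like this:
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_0^{(i)}$
$\theta_j := \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right]$   (j = 1, ..., n)
Remember that in logistic regression $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.
Just as before, the best way to monitor that gradient descent is working is to plot the regularized logistic regression cost function against the number of iterations.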
Neural Networks
We know that machine learning is also used to recognize shapes in pictures. When we do this we have a very large amount of data.
In a picture of 100x100 pixels we might take the number of pixels as the number of features, because we will have 10,000 pixels in it (this is considering a grayscale picture; in RGB we will have 30,000).
n = 10,000 in grayscale
$x = [\text{pixel 1 intensity}, \text{pixel 2 intensity}, \dots, \text{pixel 10,000 intensity}]^T$
If we want to fit the data into a regression model as we have seen so far, we need to include higher orders. If we consider quadratic features ($x_i \times x_j$) we would end up with approximately 50 million ($5 \times 10^7$) features. That is a lot.
Neural networks help to improve the process when we have a complex hypothesis.
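Neural networks are inspired by / based on real neurons.
[Figure: a neuron, with the dendrites, nucleus, axon and axon terminal labeled]
Neuron model: logistic unit
[Figure: inputs $x_0, x_1, x_2, x_3$ feeding one unit that outputs $h_\theta(x)$]
$x = [x_0, x_1, x_2, x_3]^T$, $\theta = [\theta_0, \theta_1, \theta_2, \theta_3]^T$, and the output is $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
$x_0$ also "connects", but it is considered as a "bias unit" and it is always equal to 1.
$\theta$ is now called "weights", but it is still often called parameters.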
The sigmoid (logistic) function is the "activation function".
In neural networks we represent the processes as layers (not sure this is the correct word).
[Figure: network with inputs $x_1, x_2, x_3$ (layer 1, input layer), units $a_1^{(2)}, a_2^{(2)}, a_3^{(2)}$ (layer 2, hidden layer) and one output unit giving $h_\Theta(x)$ (layer 3, output layer)]
* We can also add the bias unit in each layer.
$a_i^{(j)}$ = "activation" of unit i in layer j
$\Theta^{(j)}$ = matrix of weights controlling the function mapping from layer j to layer j + 1 (without considering the bias). You can see the size of the hidden layers in the formula.
Thus $\Theta^{(1)}$ (the theta of the first layer) will be a 3x4 matrix.
If a network has $s_j$ units in layer j and $s_{j+1}$ units in layer j + 1, then $\Theta^{(j)}$ will be of dimension $s_{j+1} \times (s_j + 1)$.
The "+ 1" is there because we add the bias term from the "current layer".
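The dimension rule can be sketched in Python together with one layer of forward propagation. The function names are hypothetical and the all-zero weight matrix is a placeholder, used only so that the output is easy to verify by hand.

```python
import math

def sigmoid(z):
    """Logistic activation function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def theta_shape(s_j, s_next):
    """Shape of the weight matrix from layer j (s_j units) to layer j+1 (s_next units)."""
    return (s_next, s_j + 1)   # the extra column multiplies the bias unit

def forward_layer(Theta, activations):
    """Activations of the next layer: g(Theta * [1, a_1, ..., a_s])."""
    a_with_bias = [1.0] + list(activations)
    return [sigmoid(sum(w * a for w, a in zip(row, a_with_bias))) for row in Theta]

rows, cols = theta_shape(3, 3)                   # 3 inputs -> 3 hidden units
Theta1 = [[0.0] * cols for _ in range(rows)]     # placeholder weights, all zero
hidden = forward_layer(Theta1, [1.0, 2.0, 3.0])
print((rows, cols), hidden)
```

With all weights at zero every unit receives z = 0, so each activation is g(0) = 0.5; real weights would come from training.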