Support Vector Machines
Optimization objective
Machine Learning
Alternative view of logistic regression

Hypothesis: h_θ(x) = 1 / (1 + e^{-θᵀx}).
If y = 1, we want h_θ(x) ≈ 1, i.e. θᵀx ≫ 0.
If y = 0, we want h_θ(x) ≈ 0, i.e. θᵀx ≪ 0.
Andrew'Ng'
Alternative view of logistic regression

Cost of example: -( y log h_θ(x) + (1 - y) log(1 - h_θ(x)) )
If y = 1 (want θᵀx ≫ 0): only the -log h_θ(x) term contributes.
If y = 0 (want θᵀx ≪ 0): only the -log(1 - h_θ(x)) term contributes.
We simply drop the 1/m terms. Because 1/m is just a constant, this rescaling does not move the minimum, so we end up with the same optimal value of θ.
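To see why dropping the constant 1/m cannot change the answer, here is a tiny numpy sketch (the toy cost and grid are my own illustration): scaling an objective by a positive constant rescales every value but leaves the argmin in place.

```python
import numpy as np

# A positive constant rescales every value of the objective
# but does not move the location of its minimum.
theta_grid = np.linspace(-3, 3, 601)
J = (theta_grid - 1.5) ** 2        # toy convex cost, minimized at theta = 1.5
m = 10
argmin_J = theta_grid[np.argmin(J)]
argmin_J_over_m = theta_grid[np.argmin(J / m)]
```

Both searches land on the same θ, which is the whole point of the simplification.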
Support vector machine

Logistic regression:
min_θ (1/m) Σ_{i=1}^{m} [ y⁽ⁱ⁾ (-log h_θ(x⁽ⁱ⁾)) + (1 - y⁽ⁱ⁾)(-log(1 - h_θ(x⁽ⁱ⁾))) ] + (λ/2m) Σ_{j=1}^{n} θ_j²

Support vector machine:
min_θ C Σ_{i=1}^{m} [ y⁽ⁱ⁾ cost₁(θᵀx⁽ⁱ⁾) + (1 - y⁽ⁱ⁾) cost₀(θᵀx⁽ⁱ⁾) ] + (1/2) Σ_{j=1}^{n} θ_j²
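As a concrete reading of the SVM objective, here is a minimal numpy sketch; cost₁/cost₀ are the hinge-shaped costs, and the function names are my own:

```python
import numpy as np

def cost1(z):
    # cost for y = 1: zero once z >= 1, growing linearly for z < 1
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    # cost for y = 0: zero once z <= -1, growing linearly for z > -1
    return np.maximum(0.0, 1.0 + z)

def svm_objective(theta, X, y, C):
    # C * sum_i [ y_i cost1(theta^T x_i) + (1 - y_i) cost0(theta^T x_i) ]
    #   + (1/2) * sum_{j>=1} theta_j^2   (theta_0, the intercept, unregularized)
    z = X @ theta
    misclass = np.sum(y * cost1(z) + (1 - y) * cost0(z))
    return C * misclass + 0.5 * np.sum(theta[1:] ** 2)
```

Note the two structural differences from logistic regression: the constant C multiplies the data term instead of λ dividing the regularizer, and the log costs are replaced by hinges.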
Support Vector Machines
Large Margin Intuition
Machine Learning
Support Vector Machine

(Plots of cost₁(z) and cost₀(z): cost₁(z) = 0 for z ≥ 1; cost₀(z) = 0 for z ≤ -1.)
If y = 1, we want θᵀx ≥ 1 (not just ≥ 0).
If y = 0, we want θᵀx ≤ -1 (not just < 0).
If C is very large, the optimizer is strongly incentivized to drive the misclassification term to zero.
SVM Decision Boundary

min_θ C Σ_{i=1}^{m} [ y⁽ⁱ⁾ cost₁(θᵀx⁽ⁱ⁾) + (1 - y⁽ⁱ⁾) cost₀(θᵀx⁽ⁱ⁾) ] + (1/2) Σ_{j=1}^{n} θ_j²
Whenever y⁽ⁱ⁾ = 1: we want θᵀx⁽ⁱ⁾ ≥ 1.
Whenever y⁽ⁱ⁾ = 0: we want θᵀx⁽ⁱ⁾ ≤ -1.
The SVM is a large margin classifier.
Large margin classifier in presence of outliers

(Figure: examples in the (x₁, x₂) plane; with C not too large, the SVM keeps the better decision boundary despite an outlier.)
Support Vector Machines
Kernels I
Machine Learning
Non-linear Decision Boundary

(Figure: positive and negative examples in the (x₁, x₂) plane that require a non-linear boundary.)
Predict y = 1 if θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁x₂ + θ₄x₁² + θ₅x₂² + … ≥ 0.
Is there a different / better choice of the features f₁, f₂, f₃, …?
Kernel

Given x, compute new features depending on proximity to landmarks l⁽¹⁾, l⁽²⁾, l⁽³⁾:
f_i = similarity(x, l⁽ⁱ⁾) = exp(-‖x - l⁽ⁱ⁾‖² / (2σ²))
(Figure: three landmarks in the (x₁, x₂) plane.)
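A direct transcription of the similarity function into Python (numpy), as a sketch:

```python
import numpy as np

def gaussian_similarity(x, l, sigma):
    # f = exp( -||x - l||^2 / (2 * sigma^2) )
    diff = np.asarray(x) - np.asarray(l)
    return float(np.exp(-np.sum(diff ** 2) / (2.0 * sigma ** 2)))
```

When x is near the landmark, f ≈ 1; when x is far from it, f ≈ 0.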
Support Vector Machines
Kernels II
Machine Learning
Choosing the landmarks

Given x:
f₁ = similarity(x, l⁽¹⁾), f₂ = similarity(x, l⁽²⁾), f₃ = similarity(x, l⁽³⁾)
(Figure: landmarks l⁽¹⁾, l⁽²⁾, l⁽³⁾ in the (x₁, x₂) plane.)
Predict y = 1 if θ₀ + θ₁f₁ + θ₂f₂ + θ₃f₃ ≥ 0.
Where to get the landmarks l⁽¹⁾, l⁽²⁾, l⁽³⁾, …?
SVM with Kernels

Given (x⁽¹⁾, y⁽¹⁾), (x⁽²⁾, y⁽²⁾), …, (x⁽ᵐ⁾, y⁽ᵐ⁾),
choose the landmarks at the training examples: l⁽¹⁾ = x⁽¹⁾, l⁽²⁾ = x⁽²⁾, …, l⁽ᵐ⁾ = x⁽ᵐ⁾.
Given example x:
f₁ = similarity(x, l⁽¹⁾), f₂ = similarity(x, l⁽²⁾), …, f_m = similarity(x, l⁽ᵐ⁾)
For training example (x⁽ⁱ⁾, y⁽ⁱ⁾): f⁽ⁱ⁾ = [f₀⁽ⁱ⁾; f₁⁽ⁱ⁾; …; f_m⁽ⁱ⁾], with f₀⁽ⁱ⁾ = 1 and f_j⁽ⁱ⁾ = similarity(x⁽ⁱ⁾, l⁽ʲ⁾).
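Putting the landmark choice together: a numpy sketch that maps every training example x⁽ⁱ⁾ to its feature vector f⁽ⁱ⁾ ∈ ℝ^{m+1} (the function name is my own):

```python
import numpy as np

def kernel_features(X, sigma):
    # Landmarks are the training examples themselves: l^(j) = x^(j).
    # Row i of the result is f^(i) = [1, f_1^(i), ..., f_m^(i)].
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    F = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), F])
```

Each example's similarity to its own landmark is exp(0) = 1, so the diagonal of the Gaussian part is all ones.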
SVM with Kernels

Hypothesis: Given x, compute features f ∈ ℝ^{m+1}.
Predict "y = 1" if θᵀf ≥ 0.
Training:
min_θ C Σ_{i=1}^{m} [ y⁽ⁱ⁾ cost₁(θᵀf⁽ⁱ⁾) + (1 - y⁽ⁱ⁾) cost₀(θᵀf⁽ⁱ⁾) ] + (1/2) Σ_{j=1}^{m} θ_j²
Solve this minimization problem to get the parameters θ of your SVM. In practice, use off-the-shelf software packages developed to minimize this cost function; those packages already embody the necessary numerical optimization tricks.
SVM parameters:

C (= 1/λ).  Large C: lower bias, high variance.
            Small C: higher bias, low variance.
σ².  Large σ²: features f_i vary more smoothly. Higher bias, lower variance.
     Small σ²: features f_i vary less smoothly. Lower bias, higher variance.
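A small numeric illustration of the σ² rows above (the numbers are my own): at a fixed distance from a landmark, a larger σ keeps the feature close to 1 (it varies smoothly), while a smaller σ drives it toward 0.

```python
import numpy as np

def f_at(dist_sq, sigma):
    # Gaussian kernel feature value at squared distance dist_sq from a landmark
    return float(np.exp(-dist_sq / (2.0 * sigma ** 2)))

d2 = 4.0                         # squared distance ||x - l||^2
f_large_sigma = f_at(d2, 3.0)    # ~0.80: feature still near 1, varies smoothly
f_small_sigma = f_at(d2, 0.5)    # ~0.0003: feature has already dropped to ~0
```

Smoother features make the hypothesis less sensitive to individual examples, hence the higher-bias, lower-variance behavior.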
Support Vector Machines
Using an SVM
Machine Learning
Use SVM software package (e.g. liblinear, libsvm, …) to solve for parameters θ.
Need to specify:
Choice of parameter C.
Choice of kernel (similarity function):
E.g. No kernel ("linear kernel"): predict "y = 1" if θᵀx ≥ 0.
Gaussian kernel: f_i = exp(-‖x - l⁽ⁱ⁾‖² / (2σ²)), where l⁽ⁱ⁾ = x⁽ⁱ⁾.
Need to choose σ².
Kernel (similarity) functions:

function f = kernel(x1, x2)
  % Gaussian similarity between feature vectors x1 and x2
  % (sigma is assumed to be defined in the enclosing scope)
  f = exp(-norm(x1 - x2)^2 / (2 * sigma^2));
return

Note: Do perform feature scaling before using the Gaussian kernel.
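One way to do the feature scaling the note asks for, sketched with a simple standardization (zero mean, unit variance per feature; the recipe and example numbers are my own):

```python
import numpy as np

def standardize(X):
    # Without scaling, a feature with a large range (e.g. size in sq. ft.)
    # dominates ||x - l||^2 inside the Gaussian kernel.
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0            # guard against constant features
    return (X - mu) / sd

X = np.array([[2000.0, 3.0],     # e.g. house size vs. number of bedrooms
              [1500.0, 2.0],
              [3000.0, 4.0]])
X_scaled = standardize(X)
```

After scaling, every feature contributes on a comparable scale to the distance term.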
Other choices of kernel

Note: Not all similarity functions similarity(x, l) make valid kernels.
(They need to satisfy a technical condition called "Mercer's Theorem" to make sure SVM packages' optimizations run correctly and do not diverge.)
Many off-the-shelf kernels available:
- Polynomial kernel: k(x, l) = (xᵀl + constant)^degree, e.g. (xᵀl)², (xᵀl + 1)³, …
- More esoteric: String kernel, chi-square kernel, histogram intersection kernel, …
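A sketch of the polynomial kernel in Python (the parameter names `constant` and `degree` are my own labels for its two hyperparameters):

```python
import numpy as np

def polynomial_kernel(x, l, constant=1.0, degree=2):
    # k(x, l) = (x^T l + constant)^degree
    return float((np.dot(x, l) + constant) ** degree)
```

Unlike the Gaussian kernel, similarity here grows with the inner product rather than decaying with distance.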
Multi-class classification

Many SVM packages already have built-in multi-class classification functionality.
Otherwise, use the one-vs.-all method: train K SVMs, one to distinguish y = i from the rest, for i = 1, 2, …, K, and get θ⁽¹⁾, θ⁽²⁾, …, θ⁽ᴷ⁾.
Pick the class i with the largest (θ⁽ⁱ⁾)ᵀx.
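The one-vs.-all prediction step, sketched in numpy (stacking the K parameter vectors as rows of a matrix is my own layout choice):

```python
import numpy as np

def predict_one_vs_all(Theta, x):
    # Theta: K x (n+1) matrix; row i holds theta^(i) from the i-th SVM.
    # Return the class i with the largest score (theta^(i))^T x.
    return int(np.argmax(Theta @ x))
```

Each row scores "how confidently does SVM i claim this example", and the highest score wins.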
Logistic regression vs. SVMs

n = number of features (x ∈ ℝ^{n+1}), m = number of training examples.
If n is large (relative to m):
Use logistic regression, or SVM without a kernel ("linear kernel").
If n is small, m is intermediate:
Use SVM with Gaussian kernel.
If n is small, m is large:
Create/add more features, then use logistic regression or SVM without a kernel.
A neural network is likely to work well for most of these settings, but may be slower to train.