CONTENTS.
Foreword
Preface
CHAPTER 1 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS
CODING OF CATEGORICAL DATA
THE COMPLETE INDICATOR MATRIX AND ITS PROPERTIES
QUANTIFICATION
THE INCOMPLETE INDICATOR MATRIX
THE REVERSED INDICATOR MATRIX
THE INDICATOR MATRIX FOR A CONTINGENCY TABLE
GROUPING OF VARIABLES
HOMOGENEITY ANALYSIS
HOMOGENEITY OF VARIABLES
WEIGHTING
MAXIMIZING HOMOGENEITY BY LINEAR WEIGHTING:
PRINCIPAL COMPONENTS ANALYSIS
LINEAR WEIGHTING FOR K SETS OF VARIABLES
THEORY OF MEET LOSS
THE PRINCALS PROGRAM
AERA conDAING MULTLE AND SLE
“sezaots wT DSCRETTZAMONOF
NONLINEAR GENERALIZED CANONICAL ANALYSIS
LOSS FUNCTION AND NORMALIZATION OF OVERALS
THE PROGRAM ANAPROF: ANALYSIS OF PROFILE FREQUENCIES
THE CANALS PROGRAM
6.4 EXAMPLE 1: ECONOMIC INEQUALITY AND POLITICAL
6.5 EXAMPLE 2: PREDICTION OF A SCHOOL ACHIEVEMENT TEST
CHAPTER 9 MODELS AS GAUGES FOR THE ANALYSIS OF
SOME GENERAL FORMULAS
MULTIPLE REGRESSION AND MORALS
DISCRIMINANT ANALYSIS AND CRIMINALS
MULTIVARIATE ANALYSIS OF VARIANCE AND MANOVALS
9.3 NONMONOTONE LATENT TRAIT MODELS
PARTIAL CANONICAL CORRELATION ANALYSIS AND PARTALS
DICHOTOMIZED MULTINORMAL DISTRIBUTIONS
SOME EXAMPLES
9.6 EPILOGUE
EPILOGUE
CHAPTER 10 REFLECTIONS ON RESTRICTIONS
10.1 THE CLASS OF RESTRICTIONS CLOSED UNDER
10.2 EQUALITY CONSTRAINTS
10.3 OTHER LINEAR CONSTRAINTS
10.4 ZEROS AT SPECIFIC PLACES
10.6 EPILOGUE
CHAPTER 1
CONVENTIONS AND CONTROVERSIES
IN MULTIVARIATE ANALYSIS
In this introductory chapter we shall try to give a definition of multivariate
1.1 CONTENT ANALYSIS OF MVA BOOKS
1.1.1 Roy (1957)
testing of hypotheses
He also mentions a number of important problems in the further development
of MVA techniques:
1 many
| sequence, spies
The observations are not necessarily independent and identically
is remarkable that Ke
computation in the book
1.1.5 Morrison (1967, 1976)
1.1.6 Van de Geer (1967, 1971)
motivate
plumbing job? In the first place
MVA techniques generalize the
doomed to fail. We have already seen that data analysts are not primarily
interested in coefficients, but in pictures. A single coefficient does not
give very interesting pictures.
ficiently pete
and the conditions fer apple
arly concenzat hs stenion
Bretton ofthe results he es obi? (bagels,
relatively few prototypical problems (p. 1). "The heart of any multivariate
is of the data matrix, of
lysis is very popular, mainly because of the work and the
data analysis and classical statistics in considerable detail. We shall come back
to that discussion later in the chapter.
on in the book, but very
little material on principal components analysis, factor analysis, and canonical
analysis
ical repesentations, bt ofa light
er. The book ues proba
Reduction of dimensionality
Study of dependence
He also mentions the most important problems, in a list that resembles the one
given by Kendall:
(@)_isdimeut 1 fn oot wath cent wans to know exay.
Aira the prefered technique
Hass, one-dimensional multivariate analysis, uses the union-intersection
1.2 CORRESPONDENCE ANALYSIS OF TABLES OF CONTENT
FACT: Factor analysis and principal components analysis
Canonical correlation analysis
MANO: MANOVA, and the general multivariate linear model
Bees!
nd 2 book close to a
ec ebook pys satel mah aetion oh
post pure
ROY, and
these re
1G1+
‘The projection ofthe boa ad
ee eee i analyte ks se nea te ne between CORR asd MATE,
this cootext a8
dimension in Table
‘much moe atenon 19 MANO than we expect. etna
fvel. Of souree, COL] is a mano for comping MVA solutions and ts
cal. We have als epeted the alysis
‘thor the ‘extremist’ GREC and ROY, The projections onthe
| analysis shows us
‘Bore compact and comprehensive form, Ofcourse thie ede pry to oor
| feoedon ofthe books,
1.3 A SHORT SUMMARY AND SOME PROBLEMS
the hypotheses are tested. The approach does not start with a
model, but looks for transformations and combinations of the variables with
Figure 1.2 Correspondence analysis of MVA books
the explicit purpose of representing the data
wen graphical, way. es
ute diferent fom that of Nitsa
1.4 DATA ANALYSIS AND STATISTICS
1.4.1 Tukey's definition of data analysis
(A) Pr an woe wh ees gh (3) oF nl ag ar
(8) Sie amino ks a
Se enon f wrk am ah ge et, wih cal
‘aps onc ages241 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS. A DATAANALYSIS AND STATISTICS
ofthe spct of modem
ut analy wih
‘which comerponds|
lied exp
pects of gene
iat has sly nb infsened the patie af26 1 convaNTONS AND CONTROVERSIES I MULTIVARIATE ANALYSIS 14, DATA ANALYSIS AND STATISTICS a
1.4.3 Robust statistics
‘Acinlot hich a nore igen an which sion cong n
te date anu em at ates eae
Again we see the shifting emphasis from optimization to stability (robustness).
1.4.4 Exploration and confirmation
These two terms are becoming very popular these days, but again they are
ed by efferent sh various
Tukey disagrees, for example, with Parzen, who proposed in a recent paper to
identify exploratory data analysis with confirmatory nonparametric statistical
ory techniques use tests of hyposbeses, confidence
igs,
tations thery or mode! eds to new eonecres
ofthe model. Thos he proces ices and not
ie adnan fog oct ce by
eran ste, bt by tenon Seah boy
eat a te Esa
It seems to us that the words ‘exploratory’ and ‘confirmatory’ should be
used in this sense of conjectures and refutations. Tukey puts this
erp
by Tukey (1980), and also by us
1.4.5 Inference
1.5 DATA ANALYTIC PRINCIPLES OF THIS BOOK
Weave wsed the word = of times, and it i consequeny
© word we ase tht 15. Model and
‘The procedure wally adopted
aqutson orprolem sad then
evance ar pres
and wl ania to have cond
is. Samet
‘The model is replaced by 2
of simple models,
1.5.2 Gauging
What do we mean by gauging of a
save pope ly the ec
'echniqu recovers or epresens the Known props Phe
‘of gauges, and we meason some of the iporta oes,
tigate what aspects ofthe
‘model ae epeseted and ow well In usec analysis the Hiller
rmacix sa falar algerie
8 Benzser
probable ones, espera in expleratry MVA. This
psychometric scaling theory.
1.5.3 Stability
to tlk aboot Bayesian ve
but we ae quite Rapp to ea
bat We agree with he spr ofthe381 CONVENTIONS aND CONTROVERSIES IY MULTIVARIATE ANALYSIS
(@) Sibi nde mod selecion Asal change nthe motel hat we
"si
Social model’ should pay more attention to this form of stability than
they usually do,
an inequl
‘tien the algebra ofthe probiem Bh
1.6 SPECIFIC PROBLEMS OF MVA
1.6.1 The multinormal distribution
Of course we must remember that Pearson's practical purposes were
descriptive and not inferential, and
ve equl dispersions, then the points whee the fst
ret thin the second one ae spare
tiv hore properties
Statistician considerably, We mention tho
urposes
a dua analytical point
bot very imporant snd
ny Une ans
se MVA oft ses
peadeoce ae properties.
ofthe variables, which ean be defines a various
Aisibation there isnot mach choices
of the covariance mates and deve
ons ae neade.
ios in tems of conditional
for combining independent
Dlicorive appro
imainly by Lancaster and his schoo
liferent systems is easy to describe: mi
dive analysis on the ogarits Tn ge
seal probablty-ased definitions of dependence inerdencadrer og
con can more easly be inv ‘aa rose
tive analysis has ober advantages, The two techniques are congered
Daroch (1974 and Laneaste 971, 1975
Tr gener
Romie |
1.6.4 Causal analysis
Causal analysis has a different history
‘an tabular abalyss
ial Wid What we now
‘oience wat the descriptive and
scientific theory is merely a short and simple summary of a large number of
empirical observations (for example, correlations). This interpretation of
science is not very popular these days, except perhaps in some
mee cenees. There re tart ee reson fot
netic
‘ereity o ei
“The mmber of corelatons that as bees comp
3s, Bu no tear has as yt come oa
the second place Yule clearly showed
Itwe cor
mpl
(each 8 he income of Preshyterian nitrite pot of
‘hom Jemsica). Ts isso
who never eared in he Est plas for Paso’ emp
civ poi of ie,
thee em tnes a ch thay avec! ess red
comeition cose than elds
the major problem of eaual analy.
n we now have the problem of model
when the nomber of
rpodel mare oF less48 1 convayrions ano conTRovERS
utoraialy implies cs
‘sceepunce of he vesul bythe nite,
1.7 DEFINITION OF MVA
1.7.1 Asymmetric role of rows and columns
On the basis of the discussion in the preceding sections we can now try to give a
definition of MVA. The oldest, and the most
ed symmetrically, vrabes te not mentioned,
0 use he erm mlidimensional anal ors ele
ce nonce lvl Simpl oe te space
whch te vas are dtd as w elements, tse eat
dette by counting he uber of elements inthe se tnd
Them vats cn be deacd by naling ast of al lane eke 6
cameigodingm vases of he focons fo each seat Pe os
ep
osaie to anlys nvm
of mf Space
todace asymm tel hs made pose ohn of eneratzaons
sacl teminlogy we stay ase, end poplion
{sft and compel sewed We cn ostty sic oper a
lssibton overt eee, bt sow
‘sn longer wv asd we hve information het
cd inthe nx me . a
‘so pose tht
decd ha an
ony of tee
stability. .
he same popatation model. The i
te prev
on which we Reve ebieratons aad which creme
‘say asthe matrof the sample ha consequence or what we observe
On the basis of this analysis we can give the following definition: MVA
studies systems of correlated random variables or random samples from such
do
ber of random variable mil leo only
measure theory. We have also incorporated the stochastic element in
our definition, but we have seen that it can be made trivial in the case of a
finite population with counting measure. Thus it causes no real loss of
generality. Statistics only becomes important, of course, in the special case
that we actually have random samples.
1.7.2 Linear, monotone and nonlinear MVA
We now define some specific forms of MVA which will be important in this
book. MVA is linear if the results are invariant under one-to-one linear
transformations of the random variables
invariant under one-to-one monotone transformations
"The ess canbe formulas,
the rerult of derivations they ean
fouls; and they eat be
infleneed by rounding errs, choice of inital cotgue-
par ofthe output of he actu
2 ‘lower convergence
desired solution, or evea to convergence to an undesirable local
{Upto now we have
have als sean that ablEN AND CONTROVERSS 2 HULTARATE ANAL 15 SoMEn@oRTAN WOREDIENTS 3
‘we
‘we use sats)336 1 CONVENTIONS AND. |
{CONTROVERSIES INMULTIVARIATE ANALYSIS | 18 SOMBIMPORTANTINOREDIENTS 7
Snterpretatons of espelally HOMALS are pssbe, and willbe discssed ix
ond 8. HOMALS ipLTIVARIATE ANALYSIS
with meet problems in which K = m. Again this will be explained in more
detail in the later chapters.
CRIMINALS har K=2 and
analysis, PATHALS hat & =? an ge
‘as K'=2and generalizes
ascuealK, which generalizes Ket cuenta analy
Sd ll vasibles ae tot ori
gener
have ooly been
ofthe problem leads ta more ecient
zed oat,
or the
Individuals or object) ane inthe Se
flxmations of he vazabes (efor
loss fnetins zea
Teas squees type, by which ne mea
mated ta the fist sbacp econ
ofthe wasermations inthe scandy
formations for the given bake
ing these subsips obviously produces» deat
‘wales, whic
tum isis foe piven ales
8 of oss
converges because loss is bounded below by
ty conditions we can also prove thatthe ase
Be fo values coresponding with a saonaty vane
Particular way of computing te
sealing, because the nan
minimize the los function, In
choten ona prior grounds, a
Of course, optimality must not be interpreted in any wider sense. We do not
choose. Whether they are better in any wider sense must be
decided by the gauging process.
near spaces cold
ston in he comp-
they desuoy mos af the
and prove vs
ANACOR snd ANAPROF programs, isussee
because the transformations of the variables are often restricted in various
ways. We have already seen that single variables in the multiple approach are
restricted by proportionality constraints
these restrictions is that the resulting problems are no longer equivalent to
We ave been w
(tere inthe ft
1.9 EPILOGUE
A number of new books on various aspects of MVA have appeared since the
Mardia, Kent, and Bibby (1979),
two are probably mos .
spear books embedded in the classic raion,
982). As well as
De Lets, 1989),
There can be no doubt
Some comments
CHAPTER 2
CODING OF CATEGORICAL DATA.
ving the
category of variable h_j for object i. These elements are not necessarily
Table 24 Eats ofa mere
Table 2.2 shows the complete profile frequency matrix
data matrix of Table 2.1. Obviously, many profiles
have zero frequency
ron
correspondin
frequency
[
o
2 coding dat wil be of e forthe ype of ani
For each variable h_j an n × k_j binary matrix G_j is defined, with elements
$g_{(j)ir} = 1$ if object i is mapped in the rth category of h_j,
$g_{(j)ir} = 0$ if object i is not mapped in the rth category of h_j.
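The definition of the indicator matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `indicator_matrix` and the five-object example variable are hypothetical and do not come from the text.

```python
def indicator_matrix(values, categories=None):
    """Build the binary indicator matrix G for one categorical variable.

    G[i][r] is 1 if object i is mapped in category r, and 0 otherwise.
    """
    if categories is None:
        categories = sorted(set(values))
    column = {c: r for r, c in enumerate(categories)}
    G = [[0] * len(categories) for _ in values]
    for i, v in enumerate(values):
        G[i][column[v]] = 1
    return G, categories


# Hypothetical variable observed on five objects.
h = ["a", "b", "a", "c", "b"]
G, cats = indicator_matrix(h)
row_sums = [sum(row) for row in G]        # all ones for a complete indicator matrix
col_sums = [sum(col) for col in zip(*G)]  # marginal frequencies of the categories
```

Every row of a complete indicator matrix contains exactly one 1, and the column sums are the marginal frequencies of the categories.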
Forte numa example,
braces along the digo of
thos long the dagonal oC, _
Boa
2.2 QUANTIFICATION
fvalabl, sack quanication could be
|gnoted at replaced by anol eategericaton
the average of the scores of those objects that are mapped into that category. In
formula:
$y_j = D_j^{-1} G_j' x$
The latter assumes that D_j has an inverse, which implies that there are no
categories with zero frequency. If some category has zero frequency, we may
as well drop its column from the indicator matrix.
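A small sketch may help to see that $D_j^{-1} G_j' x$ is simply a vector of within-category averages. The function name `category_quantification`, the example matrix and the scores are assumptions made for illustration; columns with zero frequency are dropped before dividing, as suggested above.

```python
def category_quantification(G, x):
    """Return (y, kept): y[r] is the mean score of the objects in category kept[r].

    This is y = D^{-1} G'x, with columns of zero frequency dropped first
    so that D is invertible.
    """
    k = len(G[0])
    d = [sum(row[r] for row in G) for r in range(k)]  # diagonal of D
    kept = [r for r in range(k) if d[r] > 0]
    y = [sum(row[r] * xi for row, xi in zip(G, x)) / d[r] for r in kept]
    return y, kept


# Hypothetical indicator matrix whose third category is empty.
G = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
x = [1.0, -1.0, 3.0, -3.0]
y, kept = category_quantification(G, x)
```

Here the first category contains the objects with scores 1 and 3, so its quantification is their average, 2.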
The two procedures can be connected as follows. Let y be a direct
quantification of the categories of the jth variable. Let y be a vector that
roquitement nt ony makes slugs fr xan
mization ferarive loa funtion
ter sony one ston for
eacal, we might be interested i p diferent solutions, This
thatthe eategory quant
dimension f
sypiea etae i
This could be stated mare form
Let G have n rows, one for each of n individual parliamentarians.
Columns of G correspond
if the corresponding parliamentarian voted ‘in favour’ of the
proposal, and 0 otherwise. The matrix could be completed by
adding, for each proposal, a second column
registered if the individual voted not in favour, and 0
otherwise. This creates for each proposal a complete indicator
matrix G_j with two columns, and
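The completion step described above can be sketched as follows. The function name `complete_indicators` and the small voting matrix are hypothetical; the sketch assumes that every entry of the incomplete matrix is either 1 (in favour) or 0 (not in favour).

```python
def complete_indicators(votes):
    """Split an n x m binary 'in favour' matrix into m complete indicator matrices.

    Each returned matrix has two columns: [in favour, not in favour].
    """
    m = len(votes[0])
    return [[[row[j], 1 - row[j]] for row in votes] for j in range(m)]


# Hypothetical votes of three parliamentarians on three proposals.
votes = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
Gs = complete_indicators(votes)
```

After completion each row of every two-column matrix sums to one, which is the defining property of a complete indicator matrix.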
| 23 THE INCOMPLETE INDICATOR MATRIX n
er example is hao sing dat. Ie wil be sessed more
$l tn Section 24. For
quantified scconting ote sume prin
iples as oained in Section 22 forthe complete ast. Ags
of average etegory quantification for
nd vice versa category quandfiatons
23)
oe)
rissing dita canbe intepe
favour" ill
‘be considered as Deing inthe ‘same catepory inthis respect.
However, paramen 2 CODING OF CATEGORICAL DATA,
complete mais the oldest ond the newer grave
‘an asing nce dd ico epee te he aoe
‘quontted at opposite ende
TaQle2e leanpleeinditor Table 28h Compe neat marc
A special and ever recurring problem in MVA is the presence of missing data.
They can occur for a variety of
2.4 MISSING DATA
wll wn co make the best goess6 2 CODING OF CATEGORICAL DATA
5 and presence’ as
Govumas a there a
complete G, by edging as many
indesior mati, i. 0 derive
nspsed dats matin A
te bess of he columns 2b and 4,
hat some individ apply some eategoy tol and Hb ot
iy Variables Analysis ofthe reversed
indicator matrix quantifies variables and categories per individual, but does not
‘want ails
2.6 THE INDICATOR MATRIX FOR A CONTINGENCY TABLE.
Anindiator matin
ss follows, The
Table 2124 Coningeney ule Table 212b Inara
eis aieae8 2 CODING OF CATEGORICAL DATA
c+ ne) cols
rans fore
the row category that applies to the individual and one for the column category.
Table 2.12 gives an example. Obviously such an indicator matrix is not
an efficient way of coding,
convenient to imagine data coded in
rt ne columns for row categories and
egos. Each sow
have evo ens '' one for
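The coding of a contingency table as an indicator matrix can be sketched as below. This is an illustration only: the function names and the 2 x 3 example table are assumptions. Each individual becomes one row with two entries equal to 1, one among the first columns (row categories) and one among the remaining columns (column categories).

```python
def indicator_from_table(F):
    """Expand a contingency table F into one indicator row per individual."""
    n_rows, n_cols = len(F), len(F[0])
    G = []
    for r in range(n_rows):
        for c in range(n_cols):
            row = [0] * (n_rows + n_cols)
            row[r] = 1            # row category of the individual
            row[n_rows + c] = 1   # column category of the individual
            G.extend(list(row) for _ in range(F[r][c]))
    return G


def table_from_indicator(G, n_rows):
    """Recover the contingency table by cross-tabulating the two blocks of G."""
    n_cols = len(G[0]) - n_rows
    F = [[0] * n_cols for _ in range(n_rows)]
    for row in G:
        F[row[:n_rows].index(1)][row[n_rows:].index(1)] += 1
    return F


# Hypothetical 2 x 3 contingency table with seven individuals.
F = [[2, 0, 1], [1, 3, 0]]
G = indicator_from_table(F)
```

The round trip `table_from_indicator(G, 2)` returns the original table, which shows that nothing is lost by this (admittedly wasteful) coding.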
2.7 GROUPING OF CATEGORIES
ties it makes sease to group categories
cxample is an indicator mart G, for on Es
‘examination, with four response caries, ane of which
8G with tour columns (one for each response cacoo
righ ake only two eotumas (one
oer answer)
ofthe avo types of in
For example, oppose
individu syteatically ave
Tnsesd
the correct answer, and one forthe
‘Wong Fesponee i chosen
responses onthe ether te
of the content ofthe deviant
jon 131, When
ther quanineation will noe necessarily be to
‘uantieaton they would have otined before grog
emake
8. When one groups
prior decison that ey shave ea! weigh
8 the object sore mus be the same Ungro
gore, o the oe hand, obtin ferential weg
2.8 GROUPING OF VARIABLES
esis expanded by creating «new varia.
ber of posible cmminations of exteore
with as many eatgores,
ec oe vrsble have etegries and
‘The combined variable wil ave
ofthe ial variables Fr ena
"and another one estgores
categories ‘ap, nq) ars tp, a
“ofa combination af eegries
‘ddtve. When variables we groepd, their oot
needs to be adtve (whichis preci th
e
of combi
snarinal pr
Sent
(a) A common procedure for collecting preference dats for m
‘inal SG = won) is the method of pated comparisons I
resents ll pstible pairs of smal (SS) #0, dnd foreach
and which vo are leasalke, Obviculy, there ares
assole reponse to each iad Sp
Moss
par
teat oy = ora ap
Sil; Be pee OY ID
o om%0 2 CODING OF CATEGORICAL DATA
foreach
sixcolumns, ne foreach
2.9 EPILOGUE
developed mainly in France, has been reviewed in
CHAPTER 3
HOMOGENEITY ANALYSIS
‘Thee is
bi
nts pop
of this chapter, from Section 3.8 onwards. Homogeneity analysis in the
broad sense refers to a class of criteria for analysing multivariate data in
mb of goin in epee nd te nx chap now of
(0976) or Van de Geer (1971) as accompanying tx,
3.1 HOMOGENEITY OF VARIABLES
ioral, the ex of bomogencity is closely
et
et teen hat iffeent
data ti ny have element vary somewhat
measurement error, for instance). A pap of prfies hen
3.2 HISTORICAL PRELIMINARIES OF DIFFERENTIAL WEIGHTING
fon we give some rs
expected value BU) = O and
p= D0 tary3 HOMOGENEITY ANALYSIS 112 STORICAL PRELIMINARIES OF DIFFERENTIAL WEIGHTING 85
the averape coma sarge,
the numberof ies sree,
be the minimum of 000,
The aon vae
brine by taking xh te ean ofa.
funedon ten becomes
6x
This
of) = 1-880,
where r is the average correlation between all h_j (including
result corresponds with Galton's observation
eum or Raa oe These concepts, swell as sme atonal nes, ore ilutrated by
tre for Spearman's ‘one factor ode!
seuss such models in toe del in Chapter 9
of meatal test theory (how to define the overall
weights canbe derived. The same is
ind for Ga
Can we replace the columns of H by a single vector x without
rescaling the original variables? The best solution for x is the vector
The total sum of squares T for H is the trace of
‘ering outeves, following in But
‘mewvbat obsurequoation, ays hat
be
tepreaon ofthis
Tomcmewe oan kaye tenngccoraim semeome | “( @ 15) . .
8 ‘of vanables. A ocmal pros fellows wae | 33 2 18), wir = (soez0+i0) = 00.
suum tat ally are sndudied Lex be i ania for repacg al 3 0 10)
implies loss of information, © be evaluated by ti og
j Apennines
fe) 29 Fy $8040 —| 1 becomes B= 3x = 5667. lis called B from Between, becase
(8) WZ) SSB on 4 Atdepend ony on diferencesbereen rows elements thin
row are denen) A direct exprestion ie B = WHT: a direct
‘expression forthe otal sum of squares T is = wDu, where D is
the dagonal ants of 1H.
he notation $8Q(W) is used throagho
aes ofthe elements ofthe veto
ich implies that al hy a dena. Ls
2) 50 }
»-(*»,]
(0) = min (90x)| HOMOGENEITY ANALYSIS 318 MADAZING HOMOGENEITY BY LINEAR WEIGHTING 7
B. The symbol W comes from Within
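The split of the total sum of squares into a Between and a Within part can be checked numerically. The sketch below assumes that x is the vector of row means of H, that T is the sum of squares of all elements of H, that B is m times the sum of squares of the row means, and that W collects the squared deviations within rows; the function name and the small centred matrix are hypothetical, and the scaling used in the text may differ by a constant factor.

```python
def between_within(H):
    """Return (T, B, W) for a data matrix H, using the row means as x."""
    m = len(H[0])
    x = [sum(row) / m for row in H]                         # row means
    T = sum(v * v for row in H for v in row)                # total sum of squares
    B = m * sum(xi * xi for xi in x)                        # between rows
    W = sum((v - xi) ** 2 for row, xi in zip(H, x) for v in row)  # within rows
    return T, B, W


# Hypothetical 3 x 3 matrix with centred columns.
H = [[1.0, 2.0, 0.0], [-1.0, 0.0, 1.0], [0.0, -2.0, -1.0]]
T, B, W = between_within(H)
```

For this example T = 12, B = 6 and W = 6, so that T = B + W and the relative loss W/T equals 0.5.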
OF squares of
928 032-032 re explicit,
M488 2668) | eon hap, oe os
ee
prams [12008 94
0:87 0 0)
(scorn mae yan
( 948)
and we find $B = u'D^{-1/2}H'HD^{-1/2}u/m = u'Ru/m = 2.34$, whereas
$T = m = 3$ and $W = T - B$
wir = 02,7
element of |
Wore tr, Core
ten X andthe columns of Hare
Neen t
TAGE F (eh)= O88 w (0.7812, which iustates the88 3 HOMOGENEITY ANALYSIS
ilsmioed with pect oa fr fixe We sa i desi to vats of
is algo
onespordi tion mentioned
Tone xi
cemsbaed
In order to keep the notation
the columns of the data matrix H
to being centred, are normalized to unity. This normalization
H'H = R (the correlation matrix).
3.4.1 Normalized scores
nthe normalized score
fy x¥x=1, The algorithm requires
AGO) and thea proseds withthe
song
are not su
ized (acording i some pe
selected enteron of cease),
a5 the valves of|
4 foc ied. Now tat Hil isa vec
saci with colamns By. The up
ze the
the fase re
satisfy the constraint). Step 3 corresponds to the uncon
loss function (3.4) for fixed x. Because x and
of H are centred and unit normalized, a* is a vector of correlations
© Seps land 2 gether, and step
faction, which s bounded below
Appendix discusses extensively how x̃ = Ha can be
interpreted as
an image of a (and a* = H'x* is an image of x*). The appendix also
demonstrates that a stationary point is reached when the image of a*
is proportional to x*. The normalized scores algorithm requires that x̃
becomes a pseudo-radius of a hyperellipsoid.
The algorithm converges to a so-called invariant direction, a principal
axis of that hyperellipsoid. In the appendix the algorithm is related
to the singular value decomposition of H. Let $H = K\Lambda L'$ be this singular value
decomposition (SVD); thus K is the matrix of left singular vectors (satisfying
$K'K = I$), L is the matrix of right singular vectors (satisfying $L'L = I$), and $\Lambda$
is the diagonal matrix of singular values. The algorithm converges to x* as the
where $\lambda$ is the first diagonal element of $\Lambda$
radius of bypesphee, so
the eats fr taonary equi)
HX! = LAK LARK hah =a", 6s
oA ae 66
on
38)
with $D = \mathrm{diag}(H'H) = \mathrm{diag}(R) = I$. The sum of squares of the optimal weights
is equal to the dominant eigenvalue of $H'H = R$. The relative loss is
WiP=1 p= 1-X 69)
The matrix $L\Lambda$ is called the matrix of loadings of the principal components
analysis, with the first vector of loadings corresponding to x*,
the vector of component scores on the first principal component. It has
already been remarked that a* is a vector of correlations between x and
the vectors h_j. The solution a* maximizes the sum of the squared correlations
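The alternating scheme can be sketched as a simple power iteration. This is a minimal sketch in the spirit of the normalized scores algorithm, not a transcription of it: the function name, the starting vector, the fixed number of iterations and the four-object example are all assumptions. It alternates the weight update a = H'x with the score update x = Ha and rescales x to unit length after every sweep.

```python
import math


def normalized_scores_als(H, iterations=200):
    """Alternate a = H'x and x = Ha, rescaling x to unit length each time."""
    n, m = len(H), len(H[0])
    x = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        a = [sum(H[i][j] * x[i] for i in range(n)) for j in range(m)]  # weights
        z = [sum(H[i][j] * a[j] for j in range(m)) for i in range(n)]  # image of a
        norm = math.sqrt(sum(v * v for v in z))
        x = [v / norm for v in z]
    a = [sum(H[i][j] * x[i] for i in range(n)) for j in range(m)]
    return x, a


# Hypothetical data: two centred, unit-normalized columns with correlation 1/sqrt(2).
s = 1.0 / math.sqrt(2.0)
H = [[0.5, s], [0.5, 0.0], [-0.5, -s], [-0.5, 0.0]]
x, a = normalized_scores_als(H)
ssq_a = sum(v * v for v in a)  # dominant eigenvalue of H'H, here 1 + 1/sqrt(2)
```

With two variables the correlation matrix has dominant eigenvalue 1 + r, so the sum of squares of the converged weights should approach 1 + 1/sqrt(2), and both variables receive weights of equal size.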
For aminiaae exp,
0707-0507
0.000 0.707 \
where R isthe correlation mavri. Table 3.1 ives the resus for
‘he frst ono erations andthe fra solution Fires 32 and 23HOMOGENEITY ANALYSIS
ive the geometry ofthe soluion (alto compare Appendic Bp
Figure 3.2 shows the plan ofthe oo column vectors hy they are
"ad ofthe uit cirele. Figure 33 gives the image of Figure 3.2
7 ° r
igure 3.2 Fs algo, Vectors cae on te wit
ce. nears ao wih apes il ae apo ey
anverge to Tas te fer is De unt
Jeng io ofan comet2 3 MOMOGENEITY ANALYSIS
‘Table 31 Aviad Nr of
‘omnes Sg
Hat = KALI =)
1 berween ed
them (whatever the ia
ud be) depending on sdltional enter
im in Ri equal 0 BOT = 0.5
The average correlation berseen andy equals (B/TN2 =
ones.
3.4.2 Normalized weights algorithm
(c) Convergence test:
Figure3s ison
and conve,
3.4.3 Adaptation for heterogeneous
point wit x= =p
ed weighs algoriti we ase nea| HOMOGENEITY ANALYS
now sealed so hat aa = 1 (te ol sum of squares eget
rnomlized weighs algrthn converges toa =D), and
corsa dae for une vancso
The history of the two adjusted algorithms is shown in Tables 3.3
and 3.4. The optimally rescaled data matrix becomes Q, with
columns q_j
Sin wl ig ge
xine of atl ra
‘Weteveto mat encanta
solum by the wnt nomaied anpoectn sex props te td ela of
ont te sues ergo tose st a seed calm, ad So on Ts
tors canbe smart by tng ta Xi eoorponed os = UT wih
OU stand an pyr ingle ai. Tats we can ele Sep?
02) Cramer dcompation — UM ER,
(8) Update weigh:
(@) Convergence es:100 2 HOMOGENEITY ANALYSIS
3.5 LINEAR WEIGHTING FOR K SETS OF VARIABLES
Let the data matrix H be partitioned into K sets
$H_k$ has $m_k$ columns, with $\sum_k m_k = m$. The partition
specific data a
variables One ox
‘new vaabe tht is Hoearly
her posible objectives in
Ta terms of slave los, me
ifr te wears yin sch aay
Da wEZe/K = Geta CeThsy,
T=wDu=Se\ithin,
ay
e169
ly shows he relationship witha generalized
‘atin a sigle vectra (of dimension).
Define E as the partitioned diagonal matrix of H'H (i.e.
E has the same diagonal submatrices $H_k'H_k$ as H'H, but the off-diagonal submatrices $H_k'H_l$
with $k \neq l$ replaced by zero submatrices). Then
$B = u'Z'Zu/K = a'H'Ha/K$,
$T = u'Du = a'Ea$,
so that a must be found in such a way that $a'H'Ha / a'Ea$ is maximized.
We give a numerical example with K = 2, m_1 = 3, m_2 = 4.
the result becomas
2068 oan )
358 $235,
oa Ea
| Bis 938 |
(B88 88)
with Ha =a. Since T = 2, we have B/T = 1.982/2 = 0.991,
so that relative loss is equal to 1 - B/T = 0.009. Appendix B
0102 3 HOMOGENEITY ANALYSS
sf eanonical
equal to 234, Pr the sme exam
ase tote eipenva
3.6 MORE HISTORICAL COMMENTS ON PCA
ofthe squared datances it
te distances are measured inthe diction ortogonal fo
weighting is often attributed to Hotelling
shown in the quotation above
3.7 MAXIMIZING HOMOGENEITY BY NONLINEAR TRANSFORMATION: NONLINEAR PCA
which it is reasonable to construct an indicator
(cf. Chapter
mentioned in the previo
es the comesponding quant
otxa:¢)= ml 3,
over the objet vores
confounded i
subsume gin ih) and writ (3.20) spy
$\sigma(x; y) = m^{-1} \sum_j \mathrm{SSQ}(x - G_j y_j)$
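The loss function can be evaluated directly from its definition. The sketch below assumes the form $\sigma(x; y) = m^{-1} \sum_j \mathrm{SSQ}(x - G_j y_j)$, with one indicator matrix and one vector of category quantifications per variable; the function names and the two-variable example are hypothetical.

```python
def ssq(v):
    """Sum of squares of the elements of a vector."""
    return sum(t * t for t in v)


def homogeneity_loss(x, Gs, ys):
    """Average over variables of SSQ(x - G_j y_j)."""
    total = 0.0
    for G, y in zip(Gs, ys):
        fitted = [sum(g * q for g, q in zip(row, y)) for row in G]  # G_j y_j
        total += ssq([xi - fi for xi, fi in zip(x, fitted)])
    return total / len(Gs)


# Hypothetical example: variable 1 reproduces x exactly, variable 2 not at all.
x = [1.0, -1.0, 1.0, -1.0]
G1 = [[1, 0], [0, 1], [1, 0], [0, 1]]
G2 = [[1, 0], [1, 0], [0, 1], [0, 1]]
y1 = [1.0, -1.0]   # category means of x for variable 1
y2 = [0.0, 0.0]    # category means of x for variable 2
loss = homogeneity_loss(x, [G1, G2], [y1, y2])
```

The first variable contributes no loss, the second contributes SSQ(x) = 4, so the average loss is 2. Using the category means of x as quantifications gives the smallest loss attainable for this fixed x.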
S Orerow of ios nc inte sec fo bomogerty
Solutions. In these cass, a5 well as when a a prio! oder
lab, we fave to dias he dimension
in other words
3.8 CATEGORICAL PCA: HOMALS
basic os fenetion inthe rest 026
the HOMALS
2n
and 34 of the ALS schemes108 5 HOMOGENEITY ANALYSIS
oxy) = xx +e} By}
29 By)
1-334 3)Dgy=1—mehyDy,
ton y= yD-I?w, with w any column of W and the
lar val, yielsthe combined properonaies (3.26) and
governed eigenvalue problem”
‘ined andthe easton condtans chosen
"he propriate @.29 and 27 lo ply
x= (GDAGin)x
n=} ¥) G)D7'G) for more
Chapa.
eralizes to
ORY) = WXX—ne! Fy Y/N,
spam! ZeyDye
stationary pir of male qu
feores and Y= D-IAW for
Subscript denotes the selection
‘inimam lot besos
tt") =p— Ep yal. 635)3 HOMOGENEITY ANALYSIS
$M CATEGORICAL PCA: HHOMALS m
‘Tables Manic Gor
tune earn
Ye DIGS yells wDy* =wGxt =u, 638)
calculators112 3 HOMOGENEITY ANALYSIS
3.8.3 Normalization
There are two basic normalization options in categorical PCA (cf. the
algorithms of Section 3.4):
(©) yisnonmlind so tary issome const. The indaced obec scores
ter ae obtained by x™= Gy, shih aes the
the average of i category gu
dimensions an eb pine wl
object
stant Te induced category ga
7G, where each estogry is qua
the centre of gravity ofthe pins for
The standard HOMALS program takes the latter normalization, with x'x = n, so
x becomes a ‘standard score’. There are two practical reasons
elements of x now can be interpreted
iar properies of standard sare, The second
applications it happens very often tht ism
‘normalization @) then eves te ies
eins equal spread ll eons
of sebgroups
on above ips. ese te SVD ston D2
= VIEW, tat french ofp salons xy andy, = Ie) eons
TUE a rec tp Ye66 long) the flowing
ean)
nee hy,DVm, an
"Dyse my an
Inthe Quput of te HOMALS progam te egcnvah
° LS program the egéavales are report in he
fam vn. The sale af he cap guaifetont to sed
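but we may derive upper and lower bounds for getting an impression
of its range. The category quantifications always satisfy
$-\left(\frac{n - d_{jr}}{d_{jr}}\right)^{1/2} \le y_{(j)r} \le \left(\frac{n - d_{jr}}{d_{jr}}\right)^{1/2}$, (3.43)
where $d_{jr}$ denotes the marginal frequency of category r of variable j.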
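A bound of this kind can be illustrated numerically. The sketch assumes that the object scores are centred with sum of squares n, and that a category quantification is the average score of the d objects in the category; under these assumptions the average cannot exceed the square root of (n - d)/d in absolute value. The function names are hypothetical, and the extremal score vector is constructed only to show that the bound can be attained.

```python
import math


def quantification_bound(n, d):
    """Largest possible absolute quantification of a category with frequency d."""
    return math.sqrt((n - d) / d)


def extremal_scores(n, d):
    """Centred scores with sum of squares n whose first d entries attain the bound."""
    inside = math.sqrt((n - d) / d)    # common score of the d objects in the category
    outside = -math.sqrt(d / (n - d))  # common score of the remaining n - d objects
    return [inside] * d + [outside] * (n - d)


n, d = 10, 1
x = extremal_scores(n, d)
bound = quantification_bound(n, d)
mean_in_category = sum(x[:d]) / d
```

With n = 10 the bound is 3 for a category that occurs once and 1 for a category that contains half of the objects, which illustrates why very infrequent categories can be plotted far from the origin.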
Proof
‘We fi wre the cargory quanto i the form
ere Mae sue er BCG gd eas)
yess hat
(aay)
049)
(dg dey oan
ting @.44 and GAT into .46) now gives he desired esl
Subsiteog G.44) nd 47) into 6.46) now Bs —
1 category quanificatonsdeped onthe ni
The bounds
gina
rales tha the maximal range of very infrequent
y= 18 This fre reetetion of range isone ofthe easons to
becuse wea the variables have widely distinct morber of Gtegves, oF
‘vai apne wih very tin als
3.8.4 Contribution of variables: discrimination measures
Ls craton neces cfr
F sipDaayde oe
fe 1). The dis
i ‘variable does not contribute to the sth dimension
‘inwension, Whe3 NomOGENEITY ANALYSES
coincide with te ei
«Saline
Sn Giy
Prt
Goi met wot en Te coin even an
Ee
Faa)= 697 Gi6}679 ay ew
igs aeaty B=, 8.8m ereicenn
Pisa = ODay? OD,yir
= Darya, G.49)
‘ice ioonion me. ox
In Section 3.9 it will be shown that the discrimination measures have an
interpretation as squared component loadings. Before proceeding
some geometrical properties of the HOMALS
numerical example is given.
For the data mati of Table 2,
‘mati in Table 2.5 (the exam
We give the HOMALS esa
(9). th nomatison yDym
(©) Wh the sandard HOMALS moon
y=Di'Gx,
(@ Nem
he genera
Blenin Tie
ionyDy
!gersecor equation Cy = y2Dy (C and D are
'axd2.7) has the tee args eigen
Vi= L885, yim = 0.69,
Wr12T, yma o42s
A=1167, Yn =0399,
rela ts tym «0371,
‘eave ls 1m » 0.57
‘ele ls 1~ ym = 0.61
|
|
Ye38, Using te
category euros pare gens Tobe 38. ing
fi quent optimal dana end Oy
‘Antibes ath corns gts Gy The eter boa
nas Gy argv ore tee Stawys Tebes 10 he
a ‘y'Dy = 1 implies char the sum of squares of
i310 SoluionorX, Ses dane dimensions,
canaaon Opt
3p an as
tan Get
Ce ee
2 Be aot
us2 HowoceNeY aNazysis
£21901) pots. The iste columns
ses of Table 3.10 sive those of
each obec point i the centre
eon for
the gue atthe sum of
‘Teble33 Opinaty scaled daa main,
‘hd on ets HOMALS teton
ia 021
tances beeen a category point and the obiect
ects belonging othe category3 HOMOGENETTY ANALYSIS 5.8 CATEGORICAL RCA: HOMALS °
on. Some ofthe techniques of nonlinear MVA to be dscused ater (eg
PR ape 4) donot have
ai
he cone of praviy ofthe objec
cctegois of that variable arin the
‘sobelouds, andthe eaegory points
ee
4 plot forthe frst no dimensions. Cotegory
of obec poi
her category’ coined j
ay Between 2 and a i
9 of .
ration: obec 2 and category. The same is
‘rue when a category applies uniquely oa group of objects
aideneat response pater (category for objects, 4.7)20 43 HOMOGENEITY ANALYSIS
> cueuory pois wit low marginal fequency wil be plowed fanter’
yes az With gh magia eqncy
cent of te ploe
Ptr sitar ote
wad the eens whe
f
3.9 RELATIONS BETWEEN HOMALS AND LINEAR PCA
To illustrate, we took the unit normalized version of the optimally
quantified data matrix from Table 3.13. The corresponding
correlation matrix R_1 becomes
ination measure inthe
{rst dimension are the squares ofthese
Figure 3.8 depicts the component loading
maspace. The plot is based om hth horizontal
dimension an the verte ets he sce
bee the fk
|
=wADeDy
ow,
we have used the Burt table frm
Secood PCA diension
rove that the fist HOMALS dimension’s the fist
fhe optimally sealed data matix Qy ise flowing
index forthe dimension. Let Qy be a Ej m mux
th column, where each
ago
tte dsciniaon ences fof Dgna 3 HOMOGENEITY ANALYSIS
Thus the HOMALS solution can also be described in the following way.
The indicator matrix, as it were, expands, or blows up, the data matrix
tha cola of ede a teoen
Gy (where Gy and'x
ating te HOMALS.
math then HOMALS
Gy (oronw,
PCA on the optimally scaled data matrix $Q$
that in $Q$ the indicator matrix is again compressed
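The link between the first HOMALS dimension and linear PCA on the optimally scaled data matrix can be checked numerically. The following is a minimal sketch in Python with NumPy, not the HOMALS program itself: the data are random and hypothetical, and the function names (`indicator`, `homals_first_dimension`, `optimally_scaled_matrix`) are introduced here for illustration only. It computes the first HOMALS dimension as the dominant eigenvector of the centred average category projector, quantifies each variable by its category centroids, normalizes the columns to unit length, and then compares with the first principal component of the resulting matrix.

```python
import numpy as np

def indicator(codes):
    """Complete indicator matrix G_j for one categorical variable."""
    cats = np.unique(codes)
    return (codes[:, None] == cats[None, :]).astype(float)

def homals_first_dimension(data):
    """First HOMALS dimension from the average category projector.

    data: (n, m) integer array of category codes.
    Returns (eigenvalue, object scores x, list of indicator matrices).
    """
    n, m = data.shape
    G = [indicator(data[:, j]) for j in range(m)]
    # P_j = G_j D_j^{-1} G_j' projects on the category space of variable j.
    P = [g @ np.diag(1.0 / g.sum(axis=0)) @ g.T for g in G]
    J = np.eye(n) - np.ones((n, n)) / n      # centring removes the trivial solution
    evals, evecs = np.linalg.eigh(J @ (sum(P) / m) @ J)
    return evals[-1], evecs[:, -1], G

def optimally_scaled_matrix(x, G):
    """Columns q_j = G_j y_j, with y_j the category centroids of x, scaled to unit length."""
    cols = []
    for g in G:
        y = (g.T @ x) / g.sum(axis=0)        # centres of gravity of the object scores
        q = g @ y
        cols.append(q / np.linalg.norm(q))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=(40, 4))      # hypothetical data: 40 objects, 4 variables
lam, x, G = homals_first_dimension(data)
Q = optimally_scaled_matrix(x, G)
R = Q.T @ Q                                  # correlation matrix of the quantified variables
mu, V = np.linalg.eigh(R)
first_pc = Q @ V[:, -1]
first_pc /= np.linalg.norm(first_pc)
```

Under these assumptions the largest eigenvalue of `R` equals the number of variables times the HOMALS eigenvalue `lam`, and `first_pc` coincides with the HOMALS object scores `x` up to sign.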
eof wero a
le HOMALS gives t-te
is te elgemector
Loa, where Ad ic
‘he example, these
0.45, Note
equal othe
second HOMALS
Second elu of Sa
0.277 1.000 0.279
The eigenvectors are obtained as
_{ -a089 016 -0.577
ta=| “0998 370 “0.578 |+
0.685 0405. O.378 )
the matrix of component loadings is given by
0049 0,992 -0.385 )
onto a) Oe oie)
Oe Ose 085
and the matrix of component scores is obtained as
“0374 0243 -0.186
O19 -0.380 0.412
0.338 “0.070 “O.150
0378 02k8 -0.186
KeeQalaaz=} “O.047 “O88 “0.492
Gols 0579 -0.247,
D518 0243 ie |
0.048 “0.029 “0.412
Bors 0579 247
Bes 0.029 “0.412,126 3 HOMOGENEITY ANALYSIS 219 RELATIONS BETWEEN HOMALS AND LINEAR PCA wm
Bertone the sum of
cexplind in Seaton
be er elena sma), Tis fer
3 aplied to the data as a whole or w the we data matrices for
3.10 THE RELATIONSHIP BETWEEN HOMALS AND TOTAL
CHI-SQUARE
gw
Jon
3
go the univariate marginals,
i Seay genie as ceed
Bow Some ogee oops
D-MC—DuwD/gD-V2= weEW,{1 ANILLUSTRATION: HARTIGANS HARDWARE 29
‘Table3.4a Marga hava vanes mn categses
‘The flowing ast
Ahat kes from Hare
reaa
ni
These results are confirmed in Figure 3.11, which depicts the discrimination
measures. This plot, too, shows that the first dimension is related to variables
1 CTHREAD) an (3
(LENGTH. Vari
F paras fom tacks. The second dimession
‘epates SCREW! and NAILE, bot being very lang om ths ese
A
such variables stil dsemint, i enst be Deets inthe
sample. Figure 3.12 shows the category quantifications. The category points in Figure
3.12 are the centres of gravity of the object points associated with each
category.
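The centre-of-gravity property can be written as $Y_j = D_j^{-1} G_j' X$, with $G_j$ the indicator matrix of variable $j$, $D_j = G_j'G_j$ its diagonal matrix of marginals, and $X$ the object scores. A minimal sketch in Python with NumPy follows; the category codes and object scores are made up for illustration and are not taken from the hardware example.

```python
import numpy as np

codes = np.array([0, 0, 1, 1, 1, 2])         # hypothetical category codes for six objects
X = np.array([[ 1.0,  0.5],
              [ 0.8, -0.5],
              [-0.2,  1.0],
              [-0.4,  0.6],
              [-0.6,  0.2],
              [-0.6, -1.8]])                 # hypothetical centred object scores, two dimensions

G = (codes[:, None] == np.arange(3)[None, :]).astype(float)   # indicator matrix G_j
D = G.T @ G                                                    # diagonal matrix of marginals D_j
Y = np.linalg.solve(D, G.T @ X)                                # Y_j = D_j^{-1} G_j' X
```

Each row of `Y` is the average of the object scores of the objects in that category, so a category used by a single object is plotted on top of that object.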
A more precise and detailed analysis is possible by studying the six plots in
Figure 3.13. Here the object scores are plotted again, but now labelled for
each variable separately, using the labels of Table 3.14. From these plots we
see that variables 5 and 6 have categories that cannot be separated very well (at
least in the first two dimensions). For the other variables the objects with the
Table 3.15 (column headings: Dimension 1, Dimension 2; entries not legible)
same label form fairly homogeneous
‘hat for vara
nother
scores. We consider this
result satisfactory, although the HOMALS loss might not be small in
0 2k
jas harévar: itanton eases,
“The second dimension pearly capita
esl sow in he
{OMALS soliton changes
ens, The send
ve length 1 and the U head ae the only ob
‘hs ars ttt the
nove eva om thr cents positon owas De ane, Oba,Figure 314 targa’ hardware: tera story (it 10 ome,
3.12 HOMALS WITH INCOMPLETE INDICATOR MATRIX
xemniGy, 650)
Soch aquaniicaon is consent ith solution sed
wep? -ww, os ow may
1x mn and yDy = my
nolongerhave uD; 0,
Tablas
Oita
&
Table 3.17 Results for numerical example with option (ii).
(iii) Missing values multiple category. Here, too, the indicator matrix is
complete. Results are given in Table 3.18 and Figure 3.17.
ee a
Objet see
Epwokere? 066s osm
eres 06s ose
Table 3.18 Results for numerical example with option (iii): missing values
multiple category.
Figure 3.16 HOMALS solution with option (ii): missing values single category.
Epwalae? 077 06s)
ing data are randomly
divided over objects and categories, differences between the three options will
be small, and the interpretation of discrimination measures will be almost the
same as for the complete case,
Figure 3.17 HOMALS solution with option (iii): missing values multiple category.
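The three treatments of missing data differ only in how the indicator matrix of a variable is coded. The sketch below, in Python with NumPy, builds the three codings for one variable. It assumes the usual reading of the options (missing rows left zero, one shared extra category, or one extra category per missing object); the function name `indicator_with_missing`, the option labels and the use of `-1` as missing code are choices made here for illustration.

```python
import numpy as np

def indicator_with_missing(codes, n_categories, option):
    """Indicator matrix for one variable; codes use -1 for a missing value.

    option 'passive'  : rows of missing objects stay zero (incomplete indicator matrix)
    option 'single'   : all missing objects share one extra category
    option 'multiple' : every missing object gets an extra category of its own
    """
    codes = np.asarray(codes)
    missing = codes < 0
    rows = np.flatnonzero(~missing)
    G = np.zeros((len(codes), n_categories))
    G[rows, codes[rows]] = 1.0
    if option == "passive":
        return G
    if option == "single":
        extra = missing.astype(float)[:, None]
    elif option == "multiple":
        extra = np.eye(len(codes))[:, missing]
    else:
        raise ValueError(f"unknown option: {option}")
    return np.hstack([G, extra])

codes = [0, 1, -1, 1, -1]                    # hypothetical variable, two observed categories
G_passive = indicator_with_missing(codes, 2, "passive")
G_single = indicator_with_missing(codes, 2, "single")
G_multiple = indicator_with_missing(codes, 2, "multiple")
```

Only the passive coding has rows that sum to zero; the other two give a complete indicator matrix, which is why the ordinary HOMALS computations apply to them unchanged.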
0 produce a quantitation ofthe missing dat, ta
cverage object sere for abjects with
tion eqal 0, since
tent with the idea hat etegory qu
within ta euegory
‘Suppose an ott
‘yrabes. Option (i) gives
imensons. Open iw quantify it
incomplete ver the completed
indicator matrix. Figures 3.18 and 3.19 give results for the example used in
Section 2.3 (ie setinon problem), both forthe Fist two dimensions,
Figere318 Hos
‘neopetsie 3 HOMOGENEITY ANALYSIS
Note that in Figure 3.19 category points for + and - of each category are the
opposites of each other with respect to the origin of the plot.
Figure 3.19 HOMALS solution for the example, based on the
completed indicator matrix.
3.13 REVERSED INDICATOR MATRIX
(©), perhaps related to whether the items were English speaking or not. Judge
2 sorts (W L) (C G), perhaps implying 'before 1900' versus
'after 1900'. Judge 3 sorts (W L) (C) (G), perhaps by country of birth. In the
reversed indicator matrix approach the objects of analysis are the items and the
variables of analysis are the judges. The data matrix and the reversed indicator
Figure 3.20 HOMALS solution for mini sorting task,
for reversed indicator matrix.
Table 19 Example of onga 3 HOMOGENEITY AN
When this data matrix is analysed
te subjects and thee e
pleted ia Figures 323 and
‘he reese indetes hatMe 8 HOMOGENE:TY ANALYSIS
We are dealing with a very dominant first dimension on which the items
‘a pees ordered ascoting tote erder inthe data mati Tassos
‘we aequlte category poets fr the subjects: we notice fat a¢ Seer,
24 HOMALS sluton for transposed data
sane carpe cangay pus Each det
fom 10s bjt each eed mabe
‘aegis 1 and2 for sabet | have obned exactly he sine quafesions
8 tegres an’ fr sbje 6 Maco, al eat poter eo
‘ransfomedin he fist dimension ¥
9
3.14 EPILOGUE
Whether or not we have to analyse the reversed indicator matrix is in general
of the use of HOMALS can be found.
CHAPTER 4
NONLINEAR PRINCIPAL
COMPONENTS ANALYSIS
4.1 METRIC PRINCIPAL COMPONENTS ANALYSIS
In Sections 3.2 and 3.6 we reviewed some of the history of homogeneity
tor analysis model, and
better interpreted as a special case of a
principal components analysis
4.2 NONMETRIC PRINCIPAL COMPONENTS ANALYSIS
lyse R
ofthe singulr valve dacompy
The Eckart-Yoong teorem 2 pra
an $n \times m$ data matrix $H$ can be formulated in
P
er ety yeas we at
ional breakthrough Ia aonmeire sealing
skal Shepard showed tha onda re136 “| NONLINEAR FRINCIPAL-COMPONENTS ANALYSIS 42 NONMETRUCPRINCIPAL COMPONENTS ANALYSIS 177
of random variables (cf. Appendix A). Because of the form of
the length of the projection of the ob
In Coombs (1964) the obvious nonmetric ex
is the preference strength of person $j$ for object $i$. If $h_{ij}$ is the observed
preference strength, then this means in addition
Iiy> ny ay 4.
4.3 THEORY OF JOIN LOSS
In this section we study some properties of $\sigma_J(Q, X, A)$, using explicit
normalization. The easiest way to do this is to determine the
minimum of $\sigma_J(Q, X, A)$ over $X$ and $A$ for fixed $Q$
(theorem F, Appendix B):
= ERO), Pare ay
where $R(Q)$ stands for the correlation matrix of the transformed variables $q_j$
and $\lambda_s$ for its eigenvalues in decreasing order. So the smallest $m - p$ eigen-
values are summed in (4.11).
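For fixed $Q$ the minimization over $X$ and $A$ is a best rank-$p$ approximation problem, so the statement about the smallest $m - p$ eigenvalues can be checked with a singular value decomposition. The sketch below, in Python with NumPy, uses random hypothetical data and takes the loss as the plain sum of squares $\mathrm{SSQ}(Q - XA')$ with centred, unit-length columns of $Q$; any additional scaling constant used in the definition of join loss in the text is an assumption left out here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 50, 5, 2
Q = rng.normal(size=(n, m))                  # hypothetical transformed variables
Q -= Q.mean(axis=0)
Q /= np.linalg.norm(Q, axis=0)               # centred, unit length, so Q'Q = R(Q)

R = Q.T @ Q
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # decreasing order

# Best rank-p approximation X A' from the singular value decomposition of Q.
U, s, Vt = np.linalg.svd(Q, full_matrices=False)
X = U[:, :p]                                 # orthonormal object scores
A = Vt[:p].T * s[:p]                         # component loadings
loss = np.sum((Q - X @ A.T) ** 2)
```

With this normalization `loss` equals the sum of the `m - p` smallest eigenvalues of `R`, and the loadings in `A` are the correlations between the variables and the components.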
If we want to compare components analysis with the homogeneity
analysis discussed in Chapter 3, we must suppose in the first place that all
variables are discrete. Thus the cones $C_j$ are subspaces defined
by
$q_j = G_j y_j$, (4.13)
with the $G_j$ complete indicator matrices. If $p = 1$ then the theory of the
We can illustrate some of these points with the example of Section
3.9. Not all of the
ssormaneoncsanensra, |
mel corepond with ther smallest nonirvlelgevaes, In
Table 4.1 we give he solution for Qwithp™=\ and re
corresponding correlation
InTabie'2 tnt re
the
mars andthe eigeraies of re ROD
iSforp™=2,r=\are gen
Table ta Maric Q frp =
ote on Tae 6. ant
The lentes of te composing homepency nays ar:
Ths eget one the argon in Table. the mast onthe sale
nen fated4 NONLINEAR FRNCIPAL COMPONENTS ANALYSIS
18)
he minimam of converges
cease all mesutenents
he
‘wo diferent tps, Suppose we sar an
sing the contains We
isthe fi Second sep we compote new 9)
forgiven X snd A. Thiscanbe dane foreach sepa Dotan
i=
Thispari taal exer guntiicaons is bane by
ining (19) ve yume fa pn X aA Tes eos 2
‘ails be pert
$\mathrm{SSQ}(G_j y_j - X a_j) = \mathrm{SSQ}(G_j \tilde{y}_j - X a_j) + (y_j - \tilde{y}_j)' D_j (y_j - \tilde{y}_j)$,
and consequently we have to minimize the second term on the right over all
“Alieratvely we can loose he Daygavet lg mentioned Section
a thee
yo a
we can change the order of the three steps, and we can perform a
number of iterations of the two Daugavet steps before computing new $y_j$. We
‘v0 ofthe more import choies of Cj eater
ste as single nominal if canbe
apply weighted monotone regression to the vector $\tilde{y}_j$
ce Appendix O,168 4 NONLINEAR PRINCIPAL COMPONENTS AN
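Weighted monotone regression can be computed with the pool-adjacent-violators scheme. The following is a minimal sketch in Python with NumPy and is not the routine used in the programs discussed here; the function name `monotone_regression`, the input vector and the weights (standing in for the marginal frequencies on the diagonal of $D_j$) are hypothetical.

```python
import numpy as np

def monotone_regression(y, w):
    """Weighted least squares fit of a nondecreasing sequence to y (pool adjacent violators)."""
    blocks = []                              # each block: [weighted mean, total weight, length]
    for value, weight in zip(y, w):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(w1 * v1 + w2 * v2) / total, total, n1 + n2])
    return np.concatenate([np.full(length, value) for value, _, length in blocks])

y_tilde = np.array([-1.0, 0.5, 0.2, 0.1, 1.0])   # hypothetical unrestricted quantifications
d = np.array([4.0, 1.0, 2.0, 1.0, 3.0])          # hypothetical marginal frequencies (weights)
y_hat = monotone_regression(y_tilde, d)          # categories 2 to 4 are tied at 0.25
```

The fit preserves the weighted mean of the input, so quantifications that are centred before the regression remain centred afterwards. A nonincreasing fit follows from applying the same function to `-y_tilde` and changing the sign of the result.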
Table 4.3 Three iterations of the two-step algorithm (entries not legible)
Table 4.4 Three iterations of the three-step algorithm to minimize $\sigma_J$ (entries not legible)
We do not restrict the missing part $y_j^*$, where * denotes missing. Suppose
that the vector which contains the a priori numerical category values (and has the
number of elements equal to the number of nonmissing categories) is
normalized in such a way that
‘The sitoatonbesomes more compl
lteady seenin Sesion 2
nu to we with
the cones C change, bese we do not
ieanty forthe missg values, For single nom
ae for sige ordinal variables monotone regression
ly performed over the nonmissing categories; fr the salsting etegrice
of course, D? the pan of Dy corresponding wi
ries, We coupe
copy te comesponding elements off and afterwards we normalize, This Beau Di5j/uDja, om
‘alo oo rel eons, For shige numa vasles he tui 7.
Shy more on the nnn par of denoted by 99 wre oy PH 6
* sands or teed, we egue oI eat wh a ;
moons «an
say 426166 “4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS {1A THEORY OF MEET LOSS 1
4.4 THEORY OF MEET LOSS
sd we he compte 9 by nora
tvs contoqence ht chang
‘neil component poe otha
solution. Ths option may be te
10 very unpleasant compat
depron ny more
‘Mp, but becomes re
much tore expensive, Using y= Gy we
and Yost be compute ow-wioe Rw
gM:
‘Those wil tke agprozately
RS QROCK): Updating Yoel “eo
Bs
aan
yPn=1 (aia)
wed ag)
Sone sinsle comput168 4 NONLINEAR PRICIPAL COMPONENTS ANALYSIS:
OMAYA)= oy,
+o- 44)
where for all variables single meet loss
MENA) «Wl, 8800-6.
This relationship between meet loss for single variables and join loss holds
643)
MOA) =p mr Sy ajy—2mrt 3506,
where we have used
WORK) = pand yy, 1,8
XX =1, from whi
aly We obtain for in oss
ORY) = 1 + wy 2015, aan
lows easily. The fact that we use $Y_j$ for the mul
the single quae
ple, ot bth and
(4.44) tha ining
lent robes, with he sme solution
int evans ing Os the can pote conto
YEU some vases and ot ares I we pore tem favo ee
then we wee doing components anal we
hen we are doing homogeneiyexalyais
ing to mit the two opsons and w give son
‘ons ao others maple qoatfcatons. Espo
swantficationofen snot very natural, bresse
seem #0 sugges ‘utegories can be
Consequently we might compas
numerical vrais, and mls quanltons
Another advantage becomes clea it we
pion for missing dats, Now
‘anaes single
for nominal vari
omc,
ith nomaaton
=m By aX - GY MK-Gyp, (49)
UMEX=0, 44)
XMOX= mf, 450)
where
MEM, asp
The simple relation between $\sigma_M$ and $\sigma_J$ (equation 44) is no longer true with
this treatment of missing data. The alternating least squares algorithm for
minimizing the generalized $\sigma_M$ (equation 4.8) is
Start with an $X$ satisfying the constraints. Define
yappen., 452)
As we know, these are the unrestricted conditional minimizers of $\sigma_M$ over the
category quantifications of variable $j$. The relevant part
of the loss can be partitioned as
nok —-GYyMYR-GyYD
2 GRPMK- G/N + wh) -pDKY;-¥, sy
Thus minimizing over $Y_j$ can be done by minimizing the second term on the
right. If the variable is multiple nominal we simply set $Y_j$ equal to this unrestricted minimizer, and
fom,
“We can also introduce a his point new
naturally into the general framework, Fora
umns of $Y_j$ are either increasing or decreasing. As explained
by Guttman (1950), we cannot require them all to be increasing. Computing
$Y_j$ amounts to solving two monotone regression problems for each column
and keeping the best one. Multiple
ine ove yan:
‘of Dyn pees
Squares nner erations to folve for y€ G), piven
318 Tar 3. given yeaa be done by deny
3} and foray, ven 80
Heyyy, 455)10 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS
Solving for yye 6)
one dough ust oe
5rd
‘the conditionally optimal, ingle eatepory quantifiations
restrictions, and uss the alternative paioning of
(forte 458) given by
Gia} Ty DGai- Tp +aja,
Ths ang 9 ea
‘arsion eben
‘ese cs, and aquesion oa
Several comments are in order here.
assume hee Uat yyy for
eects to noe X aA
ys not eee onal
idle tion wine
foal. Fom te mean presaving propery ote epee wea
Aine atersegesson SPM 8 He wasn grins we
8p) =wDJ;=uDjay/aa,
=uDDFEXy/ ap,
ass
My=1forallj(o missing du, or option Hor it), then wD) «0
fom WX =0, and our metho ean be interpreted inte ot RG,
m
X forgiven Ys,
amounts tothe
L=SMGN eo)
then
= (Me (ou Me M2, ast)
x n!M RAM,
{wil be found oat that
X sven the ¥y, We find telat squares
rompete homogeniy aalyis 4
‘Uwe define Mi asthe dingo tix
1 hese of thse vectors q with he
if objects ant are inte extepory conespon
Myq is proportional gy for noma
require Mq= gy. However under these
403)
which rants he met
‘the uta lve scionm 4 NONLINEAR FRINCPAL COMPONENTS ANALYSIS
4.5 GEOMETRY OF MEET LOSS
tions for minimum meet loss for each variable have interesting
geometrical interpretations. As in homogeneity analysis we represent the object
scores as points
lossconetbason
imensiooal mace, Ifa variable is multe nooo
MX) = GM bc ifand only i al object inte
BMY Rave the same cbject seer, whichis then ot
variable is multiple ordinal we partition the loss contribution
453)
aK- GIy MK-G,Np + acK)-TyoK)-B, 65)
4.66)
omer
Sse eters
tite omabemsonagen anos
Figure 4.1 Meet loss for a multiple nominal variable.
Qu fs te corpo cates tein. sn homoge
‘ys we ge obecs ina caepry bean ogee Tae
Figure 42 see oe
Tot becane the quant
rough orig,
Observe tat for mulipl tina a there generally is
Of rotation ofthe axes. For single variables there arem “NONLREARTRNCPAL COMPONENTS ANALYSIS 45 GEOMETRY OF MEET LOSS Ms
we use meet loss for a single numerical variable the quantifications should be
in the right order and equally spaced.
Asi ive pe
sib lor Bed we ua ution wither Sanne a6 ak
ce vena equation 456), tart a sen Oot
ower ewan al ae crs concpontng wih haere
sot sane ad we want clegory danifeaon noaieg see,
set) e on ie though te ogn For singe nani en
icin gre two lott connect ais ices Se
Figur ¢2 Fe sing di! an mod oath
osntitcacons ona ine thoeh toga ne an
‘nthe are way (seed by some cot), heaton or
gore 4.30 Most oe for single eel variable nitonat
Tecan fn quansienonafse Se eal epics ose
atl: aston
erp oer on he
aed (Dot generally not 20) if the
zero, which means that $q_j = X a_j$.
Geometrically this condition means that the object points corresponding with a
category must be on parallel hyperplanes perpendicular to the direction defined
com eason ht
programs accept multiple nominal is that they are based on join loss, whereas
PRINCALS is based on meet loss.
4.7 AN EXAMPLE COMPARING MULTIPLE AND SINGLE
NOMINAL TREATMENT
Table 4 Gutman-Bel dite
espn
Vater =
Yates =
Yanbes =4.9 ANEXAMELE COMPARING MULTIPLE AND SINGLE TREATMENT 181
2
Figure. HOMALS suonof te Gutman Bet da Length
oftetines incu los or vanale s+ oboe emcpanei + NONLINEAR PRINCIPAL COMPONENTS ANALYSIS |g AN EXAMPLE WITH PREFERENCE RANK ORDERS 183
We merely want to show some
of the plots that can be made and some of the statistics that can be computed.
4.8 AN EXAMPLE WITH PREFERENCE RANK ORDERS
Table 4.7 gives preference rank orders of 39 psychologists for ten psychological
journals (Roskam, 1968, p. 152). By convention low numbers
in the table indicate high preference.
‘ible Gutman Bell deacon qunifiaiotortnes
Table 4.7 Roskam's journal preference rank orders
JEXP JAPP JPSP MVBR JCLP JEDP PMET HURE BULL HUDE
‘Table Gutimn-Bed data objet scores
CA Cute Betdaobetscons
THOMAS, p=
1. JEXP: Journal of Experimental Psychology
2. JAPP: Journal of Applied Psychology
3. JPSP: Journal of Personality and Social Psychology
4. MVBR: Multivariate Behavioral Research
5. JCLP: Journal of Consulting Psychology
6. JEDP: Journal of Educational Psychology
7. PMET: Psychometrika
8. HURE: Human Relations
9. BULL: Psychological Bulletin
10. HUDE: Human Development
Two PRINCALS analyses were performed,
the first
with all variables single numerical
and the second with all variables single ordinal.
Figure 47 Loading forthe 39 peeorn
abel toon ide coca Be a
nea solution the oar
'hard' group (JEXP, PMET, MVBR, JAPP, and
'development' group (JEDP, HUDE), and a 'soft' group (JPSP, JCLP,
HURE). It is also possible to see the journals arranged on a
nthe mi
‘oup and J moved away fom HUDE to the mile
However, oer ile difference Seren the So solitons snot i is
4.9 AN EXAMPLE WITH DISCRETIZATION OF CONTINUOUS
VARIABLES
Figure 48 Loatings forthe 39 fs inthe single
‘nna sonia song Stree
the 39 psychologists are given in Figures 4.7 (numerical
Observe tat the arrows
econ ofthe psychologss we have used
1968, p. 152), which makes it possible fad out a
syhologs wares. The codes ate18
‘Table Cylinder pote: data
‘Table 4.0 Conelaion mat nd cients of PRINCALS oninal
01 01 03 0S 07 tt
Figure 4.9 Te fico column of Tele 47 isk by row
tess connected with be comependag PRINGAES cet190 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS
fore re dimenional sie
stn st Tharsene' eye pr
The raw data are given in Table 4.8; we have discretized them in Table
4.9, and this discretized matrix was used
CHAPTER 5
NONLINEAR GENERALIZED
CANONICAL ANALYSIS
5.1 PREVIOUS WORK
Hicher= sin Mis Masaan=0, 3)
cand for al previous solutions ag sin
Suppose the Ht are K
called canonical var
fe K mati Z, with
ityererion and by
hich may involve
nay uaton nny
ion andthe edidonalsoltons can be ued to define
comespondence nals
the centroid principle
‘in be tong or weak. Strong sultans orthogonality consis require
If ihe weak version merely requires that BRU =.
weak erhogonalty conti
Bereta = Ee Meals,
focal previous solusors afa [Re Ree Be
Figure 5 Pond cos poder mars REE.
~#eChe eth)
ith 6s
and ace both rnin indies for
theives of th comelaton mats Ss
od denoes an element of
onfor matching, by Davxals ard Pouse (1
Geer (1980), who ses the acron
(simply change al sige i he
smultncous ston’ by genrlizingto the cision
BaD Caley
Which mast be maximized over Ax under the reticto
subproems are now onkogonal rorusts problene
ot eyes of eign
th computationally and
computational convenience indicate that MAXVAR and MINVAR are pretty
good candidates.
‘Table S..a Consaton mais for 1 vale ree at
“Tle S.A Tae 5. wcaiqus moked eotumanise
MOOI MDOED HAS SORSUAL MENDET HARES
woe sts
Fr a a
moe 38
oo’ 8 GGT
wor $30 4 30 GS
Mowe 3 3S
veer Meem2 '5 NONLINEAR GENERALIZED CANONICAL ANALYSIS
The solutions for the canonical weights are given in Table 5.1. For MAXMAX
and MINMIN we have applied the weak orthogonality constraints;
for the other four techniques we have applied the strong orthogonality
constraints. We do not give any substantial interpretations.
that we restrict each of the $Y_j$ individually according to measurement level.
"We have seen a Section 5.1 tha ter techniques would normalize tke Vj,
Teve: we merely rar tha he solutions forte canonical weighs
are exremely sn becomes obvious:
1 Kj this condidon Is aso
‘be triton a8 VT =
satis the constrains hen YT will gency oly sts the constr ot
“Ul lagonal mates Ith coin tus, then we ean aso minimize
oyCEX-T = 1 Y, S80 EG AIT) ey
5b wth he sad ele
‘before. We shall use «sigh diferent formula, by in-
ping the eondition thatthe Tare the sme within se and by wetng thera
tsTp Alsolet
= GN, oun
eh
snd define oy
Sd XX oI. We
1 patoned indict
1 nina of oy ove X suing WK 0
ined matrix Yor each of the set, we cn expe the "
‘neon a5
cuf.¥) =F 880K GY, in04 omen Geena canon 7” Deen crsscemaen] 20s
x 5 cA 153 OVERALS AS A SPECIAL Ci
ces Gy, which area x hy re replaced by a single
“The m indiestor mauioes Gy,
Bete tieyrt,
where $(\cdot)^+$ denotes the generalized inverse. The proof follows from the
Eckart-Young theorem.
onset S expicty Sseimpy the max that asi
Se SS Se opis gle x Oy
Formula that
of generality hat Ryy = KL wich ove
BCE EWC GN) =x,
2h ie
BeGtviyatyt =x,
haps, bat ts expecially imporantat his point, because sing Gk
special cate can be recover "Show wt» soup of warble ean be reduced oa ingle viable nd that
en Ze = GY the ¥, :
Table 5.2a Partitioned indicator matrix
which gives principal components analysis.
5.3 OVERALS AS A SPECIAL CASE OF HOMALS: THE USE OF
INTERACTIVE VARIABLES
We have seen in Chapter 2 that it is sometimes useful to combine $m$ variables
with $k_j$ categories into a single variable with $\prod_j k_j$ categories.
In this way K-sets problems can be reduced to homogeneity analysis
problems (while, conversely, we have seen that all homogeneity analysis problems are
K-sets problems with only one variable in each set).
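Combining variables into one interactive variable amounts to forming one indicator column for every combination of categories. A minimal sketch in Python with NumPy for two variables follows; the helper names `indicator` and `interactive_indicator` and the small data vectors are hypothetical, and more than two variables can be handled by applying the same step repeatedly.

```python
import numpy as np

def indicator(codes, k):
    """Complete indicator matrix for codes 0..k-1."""
    return (np.asarray(codes)[:, None] == np.arange(k)[None, :]).astype(float)

def interactive_indicator(G1, G2):
    """Indicator matrix of the interactive variable: one column per pair of categories."""
    n = G1.shape[0]
    # Row-wise Kronecker product: column (a, b) is the product of
    # column a of G1 and column b of G2.
    return (G1[:, :, None] * G2[:, None, :]).reshape(n, -1)

a = [0, 0, 1, 1, 2]                          # hypothetical variable with 3 categories
b = [0, 1, 0, 1, 1]                          # hypothetical variable with 2 categories
G1, G2 = indicator(a, 3), indicator(b, 2)
G12 = interactive_indicator(G1, G2)          # 5 objects by 3 * 2 = 6 categories
```

The result is again a proper indicator matrix (each row sums to one), and summing its columns over the categories of one variable gives back the indicator matrix of the other. Empty combinations produce zero columns, which is one reason interactive coding becomes unattractive when the number of categories is large.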
Table 5.2b Interactive indicator matrix
72901 om aq au av tps tov ty im by Ge GY um yw
fective variables must be done
"any calegoris (the argunent
phere the Sy ace the submatices ofS, which are (IR) 4
= ESwi¥)= 3Sianas
Yer igeat~ pa
528). These ae proper indicator mates, which sist Gh
5.4 MISSING DATA
5.5 ALGORITHM CONSTRUCTION
Fe Omigyopx,
‘as of he formas. The iting of X for
Droblems, and we refer to Section 44 Tae
component
Ok GAY Myx Gt
oss present new problems, because Gtisapationed indica
nota simple indicator marx. This ehrateiste pie
(GOCE snot diagonal, which complete the consacton of
= «Gh mycty
Xa Xj GY}
5.6 AN EXAMPLE: EFFECTS OF RADIOACTIVITY ON FISH
rnp to ise an algsitim on computing
i tend to become
NX (p66,
here both and are varies nse, in which the coneutions of the the
‘arable ae removed ftom X,
Now the minimization of X,Y) with X ad Vs ned can be Cone by
raining
BK) GXYMAEK—Gyxp (528)
‘over, The condition waconstaied minima i ained for
a0 the comer of tn egal
‘angle and the share closet heir aquarium, Again the same oo groups of
variables can be distinguished.
‘Table s3 Fi dita tom anc
‘Bile itd tro a cxprinent by Ami (ta exeguined ty Calte nd 3
scare forte lah at nthe mame
conse wih ie Conespondig pls in he
avables and the eanonsalvarubleobect sores, and Table 5S gives them22
‘ables ng fora
ie Cera losing for sng merical say'scomton of 16
S65 wo cancel vats meee
‘Hel ant re ft cna aches)
Cri ii
component loadings of the numerical solution in Figure 5.3 and the loadings
of the ordinal solution in Figure 5.4. We can see from the figures that there are
four groops of vacales. The fst one consists of the ie variables, of which
1
Table Canonical outings fr sn ort
Osis, agpt ong sino ani creiton of
tom conical vate sco)
Figure $3 singe vaesesl OVERALS solution fr he Fish
(GE! cana lenny of he ovatna $5 NONLINEAR GENERALIZED CANONICAL ANALYSIS
Bowroche and porta eliminated it Clay ou easton of
refines the one given by Cailliez and Pagès, and our technique is
intermediate between discriminant analysis and principal components analysis.
CHAPTER 6
NONLINEAR CANONICAL
CORRELATION ANALYSIS
6.1 PREVIOUS WORK
There is considerably more historical material on the problem of sets.
The problem was probably formulated for the
convert people to using canonical correlation analysis have been written in
san lomo fe20 ‘6 NONLINEAR CANONICAL CORRELATION ANALYSIS
6.2 THEORY
Betutitay-Z, 6:
ih 2¢= GOV. We have hv a id
weak erogosliy conseae Sr Rye Beene Bee
‘maximizing the sum of the p largest elgem cn
oes
ShEX is bn iinet, ann which we rege hat 22 + 232,
Because we can choose Ty pendent to
sl
6.3 THE CANALS PROGRAM
The CANALS program does not fit naturally into the series HOMALS -
PRINCALS - OVERALS, because it does not incorporate multiple variables
and does not use indicator matrices. The latter is explained by the fact that
CANALS dates back to a period in which we still wanted the possibility to
approach must be incorporated in the definitions of the cones $C_j$. Option (ii)
can be simulated if one codes missing data in one extra category per variable. It
would be sensible to select the nominal option in this case when one does not
know where to place this additional category among the nonmissing ones.
6.4 EXAMPLE 1: ECONOMIC INEQUALITY AND POLITICAL
STABILITY
feare taken fom paper by Russet (196, which as
and Graham (1969). The basic
$Q_1'Z_1$: the correlations between the transformed variables of the first set and
the canonical variables of the first set.
$Q_1'Z_2$: the correlations between the transformed variables of the first set and
the canonical variables of the second set.
$Q_2'Z_1$: the correlations between the transformed variables of the second set
and the canonical variables of the first set.
$Q_2'Z_2$: the correlations between the transformed variables of the second set
and the canonical variables of the second set.
The correlations between the canonical variables are called the canonical
correlations, and are given in the diagonal matrix $Z_1'Z_2 = \Phi$. We can see that
$Q_1'Z_2 = Q_1'Z_1\Phi$ and $Q_2'Z_1 = Q_2'Z_2\Phi$, while $Z_1'Z_1 = Z_2'Z_2 = I$.
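The relations between the four matrices of canonical loadings can be verified on any pair of sets. The sketch below, in Python with NumPy, does this for the linear case with random hypothetical data: the canonical variables are obtained from orthonormal bases of the two sets and a singular value decomposition. The names `Q1`, `Q2`, `Z1`, `Z2` and `Phi` (the diagonal matrix of canonical correlations) are notation assumed here, and no optimal scaling of the variables is performed.

```python
import numpy as np

def normalize(H):
    """Centre the columns and scale them to unit length."""
    H = H - H.mean(axis=0)
    return H / np.linalg.norm(H, axis=0)

rng = np.random.default_rng(3)
common = rng.normal(size=(80, 2))            # hypothetical structure shared by both sets
Q1 = normalize(common @ rng.normal(size=(2, 3)) + rng.normal(size=(80, 3)))
Q2 = normalize(common @ rng.normal(size=(2, 4)) + rng.normal(size=(80, 4)))

U1, R1 = np.linalg.qr(Q1)                    # orthonormal bases of the two sets
U2, R2 = np.linalg.qr(Q2)
A, phi, Bt = np.linalg.svd(U1.T @ U2, full_matrices=False)
Z1, Z2 = U1 @ A, U2 @ Bt.T                   # canonical variables of set 1 and set 2
Phi = np.diag(phi)                           # canonical correlations on the diagonal
```

The loadings of a set on the canonical variables of the other set are its own loadings shrunk by the canonical correlations, so in practice only two of the four matrices carry separate information.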
Variables are meses sng
1
Figure 6.2 Diceczed varies conlins in he
‘Smeal pce of ecneac ane,
64 and 62 give the canonicel loadings in the space of
orl variables, Lede comelaons between te nine quanied vacates
rankings a
Flgure 61 Russ orga a comtions
‘atonal pce of erencasne 6 NONLINEAR CANONICAL CORRELATION ANALYSIS
Jaeg preeaag of apical labour (vsabes DEMO, LABO, and GNPR)
wit lized democracies, The ther dimension is more diffi
Solutions are quite different. They are also different from the solution reported
in Gifi (1980, pp. 227-236), which seems to indicate that the stability is far
from satisfactory for this example. The two new analyses are, as far as the
: . ae u 3 T T T T T 1
er Sr ee Oe
Tigwre 62. Rests ogi dt: cnolcl sores in he 264 Dizled vals ania scores inte space of
‘conic vanabe | ee at tml Me spice of, Fiore 64 Ditcretet varie: spe
Soom emiscl
Figure 6.5 Russett's original data: transformations.
6.5 EXAMPLE 2: PREDICTION OF A SCHOOL ACHIEVEMENT TEST
Table 6.3 Test scores as a function of sex and father's profession
7 =
ae ra
Bo Se 8
Pa Be
Bie 2
oe ie
a0 @
mB la of
Be an
Boke a
Be =
mS 136
(of freedom (4). The imerpretation of the
Pin = e+ Bat te 64)
with $p_{ijk}$ the proportion of individuals with father's profession $i$, sex $j$, and test score $k$.
Table 6.4 Loglinear models that have been fitted to the CBS data
‘hese untested end often
‘Toble 6S Maine an interaon by sbtrction
co 217) Ness
sis si she ase a ape
*p sw 826
Sspisie” Neston”
analysis pot of view eatinat
lbal significance tests We
chisqoace, sig te ortogantl function
sur with He sted mal
Paar & Somnuyend. 3)
ere qy and are the arginalpoponon and where
Zara 66)
Moreover 249 =20= Yo
swbivary. The stra mod
oman EE Erase 6a)
though the choice of orthogonal functions is irrelevant from the point of
@OnS=P+SP GOnS+P COmP
psi eating
ofp
prryes
(spss?
6.6 EPILOGUE
dcx Burg tod De Leeuw
be computed by coresponince
he 7 Sand 2x5 abies computed from Table 63 by homing,
respectively, ov canonical aalyres
Table 67. The imes the soared can
{ear max, Anco
‘Yous,
aay inneoca be ame
sane vas cfebuest alsely 0 tah
st independent from the other quantifi
CHAPTER 7
ASYMMETRIC TREATMENT OF SETS:
SOME SPECIAL CASES,
SOME FUTURE PROGRAMS
special cases make some important simplifications possible. It is also not our
intention in this chapter to review the history of each of these techniques in
considerable detail; we merely discuss algorithms and planned programs. In
the epilogue some more recent developments will be mentioned.
7.1 MULTIPLE REGRESSION AND MORALS
Sapot ter = 2 ad mrp he end toi ny oe
ontylg) “SSQG'y! Gay),
we Gia couple intenr mati and 6!“18 DISCRIMINANT ANALYSIS AND CRBINALS 243
7.2 DISCRIMINANT ANALYSIS AND CRIMINALS
In discriminant analysis we also have a single variable in the second set, but
now this variable is multiple nominal. The loss function is
1¥))=SSQUG!Y! -G3¥) a2
= GAY! an 2, = G3 the normalization is UZ = Oand ZZ = 1.
tm,
ya i fitted by defining the poate
0
tnd by observing that he sino of cy4(X!,¥2) over nonresticted Yo is
ual
1p YEGIPAGIY! a
‘The mai taal ispersion of he c
for alin se st set. FT is the
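dispersion matrix of $Q$ and $B$ is the between-group dispersion matrix,
then
$\sigma(Y, A) = \mathrm{tr}\, A'TA - \mathrm{tr}\, A'BA.$
If we minimize this over $A$ with the restriction $A'TA = I$, then clearly the result is
$\sigma(Y, *) = p - \sum_{s=1}^{p} \lambda_s(T^{-1}B),$
so that discriminant analysis amounts to choosing the quantifications such that the sum of the $p$
largest eigenvalues of $T^{-1}B$ is maximized.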
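For fixed quantifications the minimization over $A$ is an ordinary generalized eigenvalue problem for the pair $(B, T)$. The following minimal sketch in Python with NumPy illustrates this step only; the data and grouping variable are random and hypothetical, the variables are left numerical (no optimal scaling), and the symmetric form based on a Cholesky factor of $T$ is a computational choice made here.

```python
import numpy as np

rng = np.random.default_rng(2)
groups = np.repeat(np.arange(3), 20)         # hypothetical grouping variable: 3 groups of 20
Q = rng.normal(size=(60, 4)) + groups[:, None] * np.array([1.0, 0.5, 0.0, 0.0])
Q -= Q.mean(axis=0)

G = (groups[:, None] == np.arange(3)[None, :]).astype(float)
T = Q.T @ Q                                  # total dispersion
means = np.linalg.solve(G.T @ G, G.T @ Q)    # group means
B = (G @ means).T @ (G @ means)              # between-group dispersion

# Eigenvalues of T^{-1} B through the symmetric matrix L^{-1} B L^{-T}, where T = L L'.
L = np.linalg.cholesky(T)
M = np.linalg.solve(L, np.linalg.solve(L, B).T)
lam, V = np.linalg.eigh(M)
lam, V = lam[::-1], V[:, ::-1]               # decreasing order
p = 2
A = np.linalg.solve(L.T, V[:, :p])           # weights satisfying A'TA = I
loss = np.trace(A.T @ T @ A) - np.trace(A.T @ B @ A)
```

With three groups at most two eigenvalues are nonzero, and the loss at the minimizing `A` equals `p` minus the sum of the `p` largest eigenvalues.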
clearly alo possible in this case to hs
ofthe eigenvalues of T-'which mun be opin
\wehave no expeiece with any othe: cole
7.3 MULTIVARIATE ANALYSIS OF VARIANCE AND MANOVALS
Suppose te parton indica max G!
5 fered
ated, ina
7.4 PATH ANALYSIS AND PATHALS
to some of our readers, and we give a very
tion can be solved by Cholesky decom-
ted Wy seeing dag B) =
lve A from (12) a8,
Be Hy-B)-HyA,
Weave merely shown inthis sections ar
‘A path model becomes re
the above dlgonal elements of Bar
‘Asreatr. wend oss function agen
the som of squares of the residuals as
MAB) =SsqaKA,
se ecnn ay oes nat ocho
OMY ADY) = SSQQAs ~ Oa,248
‘ equed, Tis seen poses
{Spite by Roy (957.42) aad ieatywaenealan f oc
‘and mutple nonnemeicl variables 8 ae
7.6 SOME EXAMPLES
Figure A wind ath model,
Table Ress of completly notnear and abitve region assis on
Sana
eee = PETE) Es A aT
esi
Sees
ocr
a tao?
on 2M
‘oe tm Latino7 ASYMMETRIC TREATSENT OF 75
Table 7.3 Regression ress‘mulpe nominal end p=
3
°
12 3° 4. sg
Pleure73 (Protest combination ater ddveeeson aa
253
“Table7.§ Pais andontnary
sonar epi.
PIOeT Poe
208 190
=a3 23
ta $3
7.7 EPILOGUE
CHAPTER 8
MULTIDIMENSIONAL SCALING AND
CORRESPONDENCE ANALYSIS
Multivariate analysis (MVA)
developed along rather separate
Let us consider a binary data matrix (not yet coded as an indicator matrix)
might have the following interpretation
8.1 HOMOGENEITY AND SEPARATION
Figure 8.2 Unfolding solution for data matrix used for Figure 8.1.
sn, The igri
pnt: chose tems ee always258 MRULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS
rangement of groups of objects frequently implies the possibility of good
separation, so that this goal is not completely lost.
m3
4 tee
Figure 8.3 Degenerate solution for data matrix used for Figures 8.1 and 8.2.
8.2 MINIMUM DISTANCE ANALYSIS OF HOMOGENEOUS GROUPS
OF OBJECTS
robe
‘became Fis a necessary (prone) lait ma. th next sen
we shall allow $F$ to contain any nonnegative values whatsoever, in which case
it is called a correspondence table in the French literature (Benzécri, 1979).
Table. Ministre exanpe pe 2 design
The goal of the analysis is to find a configuration of points in p-dimensional
space, in which homogeneous groups of objects are closely
De dove by interpreting asm
number, a property that is not essential for the present derivation.
8.2.1 Scaling the row objects associated with F
0 if oj o4
woe-{
1 otherwise,
In addition, we define for each column a matrix of weights, with elements
‘as ahomogencous prop get a0 arbitary) dissimilar of
ee and weight 220
representation af throw objecs. The
‘ingondl mates ofthe coluia tale of P. We may wate sing f
cslana veour of F)
7
‘Table ae Sum matix P of rank-one Mempotent
Tle 8.2¢ Sm a uric was mentioned),
lows tat the simplified
es
BoREs
5
108, = Fic Pie HKD ~ Bice a Gia Mg)
and ¥*, What resins fovea ie
rave soto (and therefore also i low-dimensional pap
of them tat we examine npacce) isthe appro
xeve = 9 DzrD tw,
@2y
With respect to graphical displays this approximation implies the following. If
we draw a line through a row point and project the column points on it,
‘be proportional aw of
ha column point and pro
approximate the element of te
he values (ep) ey, wher ey
ander the hypothesis oi
cxpeced val of the ool requensy
Above it was found that χ² distances between rows of F are the same as the
Euclidean distances between rows of $N^{1/2} D_r^{-1} F D_c^{-1/2}$. An element of the
inner matrix can be written as
PGI = GIT) Kost
@25)
8.3 CORRESPONDENCE ANALYSIS
28) gives the algebraic
i
The numerator of the expression on the right is the difference between a row proportion and the column proportion $c_j/N$. The denominator is the square root of the marginal proportion. Provided that row proportions can be interpreted as an estimate of the marginal proportions, we may write
3 lby-Gs nF zh
29)
72 dlsances, Als, the
is explains why the disances By were
074; —D} wep!” 29 «!KAL, @s0)
and sie of (3
LAK! = KA2K
This matrix has trace equal to $X^2/N$, so that we have the identity
$X^2 = N \,\mathrm{tr}\, \Lambda^2$, (8.33)
which, on the assumption of independence between rows and columns of F,
converges to χ² with (n − 1)(m − 1) degrees of freedom.
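The identity between Pearson's $X^2$ and the singular values can be checked numerically. What follows is a minimal NumPy sketch, not the ANACOR implementation. The function name `chi_square_decomposition` is ours, and the cell values of `F` are an assumption: they were chosen to have the margins r = (10 20 30) and c = (6 12 18 24) of the miniature example that follows. The sketch forms $A = D_r^{-1/2}(F - E)D_c^{-1/2}$ and compares $X^2$ with $N$ times the sum of the squared singular values of $A$.

```python
import numpy as np

def chi_square_decomposition(F):
    """Return Pearson's X^2, the total N, and the singular values of
    A = Dr^(-1/2) (F - E) Dc^(-1/2), where E holds the expected values."""
    F = np.asarray(F, dtype=float)
    N = F.sum()
    r = F.sum(axis=1)                      # row totals
    c = F.sum(axis=0)                      # column totals
    E = np.outer(r, c) / N                 # expected values under independence
    X2 = ((F - E) ** 2 / E).sum()          # Pearson chi-square statistic
    A = (F - E) / np.sqrt(np.outer(r, c))  # scaled residual matrix
    lam = np.linalg.svd(A, compute_uv=False)
    return X2, N, lam

# Assumed (illustrative) cell values with margins
# r = (10, 20, 30) and c = (6, 12, 18, 24), so N = 60.
F = [[4, 4, 2, 0],
     [1, 2, 8, 9],
     [1, 6, 8, 15]]

X2, N, lam = chi_square_decomposition(F)
print(round(X2, 2))                        # Pearson X^2
print(np.round(lam ** 2, 3))               # squared singular values
print(np.isclose(X2, N * (lam ** 2).sum()))
```

For this table the sketch gives $X^2 \approx 19.82$ with squared singular values of about 0.307 and 0.023, which agrees with the figures quoted in the miniature example.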
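For a miniature example, let
$$F = \begin{pmatrix} 4 & 4 & 2 & 0 \\ 1 & 2 & 8 & 9 \\ 1 & 6 & 8 & 15 \end{pmatrix},$$
with $r' = (10\ 20\ 30)$, $c' = (6\ 12\ 18\ 24)$, $N = 60$. The expected values are
$$E = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 6 & 8 \\ 3 & 6 & 9 & 12 \end{pmatrix}.$$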
The matrix with elements $(f_{ij} - e_{ij})/e_{ij}^{1/2}$ will be called $N^{1/2}A$:
$$N^{1/2}A = \begin{pmatrix} 3.00 & 1.41 & -0.58 & -2.00 \\ -0.71 & -1.00 & 0.82 & 0.35 \\ -1.15 & 0.00 & -0.33 & 0.87 \end{pmatrix}.$$
The sum of the squared elements of $N^{1/2}A$ is $X^2 = 19.82$. $A$ has
the SVD solution $K \Lambda L'$ with
‘the SVD solution KAL with
0.921 0,029 )
0.282 0.766 |.
{0236 0.682
and eigenvalues $\lambda_1^2 = 0.307$, $\lambda_2^2 = 0.023$. This confirms $X^2 = N \sum_s \lambda_s^2$.
The solution for X becomes
vooeliea 1388 go
xawity nel 4256 24)
(228 3385}
X gives the representation of the rows of F. The Euclidean distances
between rows of X are equal to the χ² distances between
the rows of F. The squared distances are
Figure 8.6 gives the joint plot. The distances between the row points
in the plot are the χ² distances. Row points are the centroids
of the column points.
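Both properties of the joint plot, the χ² distances between row points and the centroid principle, can be verified with a short NumPy sketch. It rests on assumptions of ours: the normalization $X = N^{1/2} D_r^{-1/2} K \Lambda$ and $Y = N^{1/2} D_c^{-1/2} L$ (so that $X'D_rX = N\Lambda^2$ and $Y'D_cY = NI$), the illustrative function names `ca_coordinates` and `chi2_distances`, and cell values of `F` chosen to match the margins of the miniature example.

```python
import numpy as np

def ca_coordinates(F, p=2):
    """Row points X and column points Y from the SVD of
    A = Dr^(-1/2) (F - E) Dc^(-1/2), keeping the first p dimensions."""
    F = np.asarray(F, dtype=float)
    N = F.sum()
    r, c = F.sum(axis=1), F.sum(axis=0)
    E = np.outer(r, c) / N
    A = (F - E) / np.sqrt(np.outer(r, c))
    K, lam, Lt = np.linalg.svd(A, full_matrices=False)
    K, lam, L = K[:, :p], lam[:p], Lt[:p].T
    X = np.sqrt(N) * (K * lam) / np.sqrt(r)[:, None]  # row points
    Y = np.sqrt(N) * L / np.sqrt(c)[:, None]          # column points
    return X, Y

def chi2_distances(F):
    """Chi-square distances between the row profiles of F."""
    F = np.asarray(F, dtype=float)
    profiles = F / F.sum(axis=1, keepdims=True)
    col_prop = F.sum(axis=0) / F.sum()
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / col_prop).sum(axis=2))

# Assumed (illustrative) cell values with margins (10, 20, 30) and (6, 12, 18, 24).
F = np.array([[4, 4, 2, 0],
              [1, 2, 8, 9],
              [1, 6, 8, 15]], dtype=float)

X, Y = ca_coordinates(F, p=2)

# Euclidean distances between the row points.
euclid = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))

# Centroid principle: each row point is the profile-weighted mean of the column points.
centroids = (F / F.sum(axis=1, keepdims=True)) @ Y

print(np.allclose(euclid, chi2_distances(F)))
print(np.allclose(centroids, X))
print(np.round(X @ Y.T, 3))
```

Two dimensions suffice here because a 3 × 4 table has at most two nontrivial singular values. With fewer dimensions the distances in the plot only approximate the χ² distances.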
of XY and thar pole
fre proportional 0
the x poles onthe ne rough,
laf RY lone
are y~ eae
2
)
$$XY' = \begin{pmatrix} 3.000 & 1.000 & -0.333 & -1.000 \\ -0.500 & -0.500 & 0.333 & 0.125 \\ -0.667 & 0.000 & -0.111 & 0.250 \end{pmatrix}$$
8.3.3 The centroid principle and reciprocal averaging
NPY.“ RD/IAEDAL = KKALL =A,
so that the SVD used in this chapter can be substituted to obtain
8.4 CONTINGENCY AND CORRELATION
TIN = 24044 964. =
59, 1960, 19608) has generalized some ofthese ress for
dependence beeen mare thn two vases
‘0 contingency tables, Hirsh
ontingecy abies, Insofar asa
cretion
teen £0) nd i)
‘ohn, 1950; Wilans
Bock, 1960.) Maung described the representation
(in our notation) with reference to the comparable case of continuous
variates. Earlier, Mehler (1866) had formulated for this case the representation
B53 =ft8 9)
if and yare independent.
Wea x= 660 or = yin, then dy =
18 end a one-one mappings hen By
normal wit conelation patete
40)
where $V_s$ are the Hermite-Chebyshev polynomials. The Maung
cqulenes|
Sts») «ited
where i
canonical
And where &y and ye te the
fomntions, (For farther developments see Lancitr 195%,1S THE PROGRAM ANACOR a
y XX =n D;K,
defined in Setin
‘SVD routine The
XD,X «Na?
YD =NI
X-;ley
ers veut
ies'These
option sales ro points3
1386. The analyseTHE PROGRAM ANACOR 281
there {so HOMALS
or matic along the ins of
an identical quantification, Note that in
‘Rosiebcbon
‘Groringsa
{ i. 2
Fath
Figure 8 Tate 83 pope eget on he asf he ie
‘ANCOR slain. Poet 2 sone
8.5.2 An example with a similarity table
=
ae sel me
Figure 89¢ ANACOR solution for dtnces berwoes 23 |
Dacca: detonmized tne,
8.5.3 An example with a multidimensional contingency table
tis have
‘Asto poll parents he ive sugar fan leh ong
ae
CDA: any denominational party (KVP, ARP, CHU, GPV, SGP)
VVD: conservative-liberal
PvdA: labour party
PACO: any of the smaller left-wing parties (PSP, CPN)
D'66: pragmatic-liberal
—
YNACOR soliton for poles! preerences ofvie (iagoal a bivarae (of lagoeal marginal.
3
5 Tike
4 ° t 2
‘bets dimeasons a
810» ANACOR soluion fr pata preferences of
0c ANACOR sation for pita pretereoes af288 § MULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS {6 THE PROGRAM ANARROF 7
same quintifcaton of stporie 8a two van
Scola
8.6 THE PROGRAM ANAPROF: ANALYSIS OF PROFILE FREQUENCIES
2
ae
q Nh
preferences of aden,
upd a 6 ceo of a age
s unique. Subsiuig the row
presulilying by mil we
3 a shonm in Figure 8.11,
Inthe gure, CDA tle
PAGO a the lower right, WI the
re. Some 0
6;6,SD-12 = ml2G,KAL 44)
der a comespondence analysis on the able F = GiGpS. is row
sma marginale as Dz =, Substituting
iby D> gies us290 MULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS.
BK = KGYG;69-1G;K.
ase 1 @43). 50K
=1
_oups of object sores to be eal
sarily ident
on 102),
20 Se
8.6.1 An example with binary survey data
out eigioas
86. Theoretical, thre ate aol of 26
Figure 8.12 ANAPROF siuion for Suiyma daa on assmption of
(otplee scar ma.
Figure £13 7CA sion
Gtrersion.
uve dit ve284 & MULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS
8.6.2 Asymmetric treatment of response categories
In the binary case symmetric treatment of the response categories implies that
the optimal
os
Tigure8.1¢ ANAPROF ston for Supyana data on te seuepon
fimipeamiet data a
matrix G of dimension 4243 6 (one columa only foreach tim, re
‘yes as", but omitting the no" fated more genera
[Note that propery lik
vel ofthe grap a proile
icly below, Allrowsof Table 89 togeber
“The graph is given in Figure 8.16. All
subiets of tis graph
2 7 a T 2
Figure 8.17a HOMALS solution for generated circumplex (patterns of Table 8.9).
Figure 8.17b HOMALS solution for generated circumplex (patterns of Table 8.9).
8.7 SOME GAUGING RESULTS FOR BINARY DATA
solution forthe Sugiyama dia
5.18 shows this for the 22
row CABEDF (outed 90
re B12 of he sx elementary300 § MULTIDIMENSIONAL SCALING AND CORHESPONDENCE ANY
to his end fois the elementary pos from 000001 to 0OF0D0 ite a curve
the peipery ofthe plot Noe tat 000000 dos oe epeaein isp
gory (a practice called dédoublement in the French literature) is reflected in the appearance of mirror images on
the northern and southern hemispheres that constitute the solution.
8.8 EPILOGUE
In this chapter homogeneity analysis has been discussed basically as an MDS
technique that makes low-dimensional displays of data and focuses on the
distances between points in these displays.
developments «step farder Se define
eed nei
‘Toemininm squared dsane fomulason of hogs
extensively in Heiser "eaeiy ani
19870),
models has been proposed by
De Leeuw
CHAPTER 9
MODELS AS GAUGES FOR THE
ANALYSIS OF BINARY DATA
We have seen many times that binary variables are
Ty. Besse 05 =20s
“yn $ COVUY. b) < minty 2)
will become clear when we introduce some of the more common gauges
Figure 9.1 Item characteristic curves in a general latent trait model.
9.2 MONOTONE LATENT TRAIT MODELS
Suppose aan unobservable latent rit ach ht p(t) = AVEC 1g) is 92.41 Holomorph items
x called the celine, tem character carve: OF
where the expectation is taken with respect to the
distribution of the latent trait. We also assume conditional or local independence:
for all $j \neq l$ we have
=) wk,
which ipses
y= AVED{0 pa).
Because
it is clear that $\mathrm{COV}(y_j, y_l) \geq 0$, and
models coneltions str
corresponding nonnegative eigenvector. So the assumption of a
latent trait model imposes some structure on the observed correlation matrix,
but not much. Figure 9.1 shows the item characteristic curves in a general
Iaent ait model lowe. kaso tn arc ave306
single and multiple, this means that applying HOMALS comes to the same as
applying linear PCA: there is only one correlation matrix.
9.2.2 The Guttman scale
Guttman (1944, 1950a, 1950b) introduced a model with items that constitute a
breakpoint on the continuum: when an individual is located on the left-hand
side of the breakpoint, the item will be answered incorrectly; when located on
the right-hand side, the item will be answered correctly. The additional
assumptions on $p_j(t)$ mentioned above are written as
‘ : y(t e»
cel should te the same and oppoite
ls. The pots of es relatos xe eo
Figure 9.4 Item characteristic curve of a Guttman item.
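The breakpoint description of a Guttman item can be made concrete with a small simulation. This is a sketch under assumptions of ours: the breakpoint values, the normally distributed latent trait, the sample size, and the function name `guttman_responses` are all illustrative. Each item is answered correctly (1) when the individual lies at or to the right of its breakpoint and incorrectly (0) otherwise.

```python
import numpy as np

def guttman_responses(t, breakpoints):
    """Binary responses under a step-function item characteristic curve:
    1 if the individual's position t is at or right of the breakpoint, else 0."""
    t = np.asarray(t, dtype=float)[:, None]
    b = np.asarray(breakpoints, dtype=float)[None, :]
    return (t >= b).astype(int)

rng = np.random.default_rng(0)
t = rng.normal(size=200)                              # hypothetical latent positions
breakpoints = np.array([-1.0, -0.3, 0.2, 0.9, 1.5])   # hypothetical, in increasing order

Y = guttman_responses(t, breakpoints)

# With items ordered by breakpoint, every response pattern is a run of ones
# followed by a run of zeros, so at most m + 1 distinct patterns can occur.
patterns = np.unique(Y, axis=0)
print(patterns)

# Covariances between Guttman items are never negative.
C = np.cov(Y, rowvar=False)
print(C.min() >= -1e-12)
```

Because the response sets are nested, passing a harder item implies passing every easier one. That nesting is what limits the number of patterns and keeps all covariances nonnegative.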
lomorphy
‘Thre ae pereeyholomarphi tems with a
an be constvcted by,
Rasch
[ euocesmenanats
‘Guraman sal, ko cles sinplee. The eral properies ofthe sep re
depicted in Figure 95,
igure 9.5 Oetzal properies of =
Because bisancigemectr,
Beye oa)310
funcons, oe foreach eigenvec
corresponding with
ements of y are monotone
the poo, working inthe
) = mt, SSG ay BI ~ eae
“They and are anda aisles ened on a comen probability sce,
elements that are supposed to be ordered. Ns we ee
oire Oife > 0.
contnoously dfferenabe a, however, kes i
StH ou
somewhat
‘motel The p() mast be berwecn
‘8c unbounded on the rel tne: ts the
sn= Ci + fe, 14)316
qe
ona of pum ef montre ce
Spearman hierarchy are quite different. Suppose
‘uooidered arf =» 83, The eigenvalues of R= pp +A? then saisly
esoun,
aaspaige..2idest,
ag h+ Hn
svi gn 0, whe
o8)
‘Table 95 Speman Neruchy sss
» 010 020 mo 050 om a7 080 080
ast Gas om
an 039 04 026q 319
vary uerations. Table 95 has
‘row, Rin is nex nine rows, and 2, the eigenvelues
last row, Table 9.5 has the elpensectors of R, which are
Flgqwe 910
is fiat ination i aly posible, Ie eres wiv to rea tha rel data ae
Figure 9.10 Eigenvectors of Spearman correlation matrix.
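The eigenstructure of a Spearman correlation matrix $R = \rho\rho' + \Delta^2$ can be explored numerically. The sketch below makes two assumptions of ours: nine items with loadings 0.10, 0.20, ..., 0.90, and $\Delta^2 = \mathrm{diag}(1 - \rho_j^2)$ so that $R$ has a unit diagonal. It checks two standard facts about such a rank-one-plus-diagonal matrix: the dominant eigenvector has elements of one sign, and the eigenvalues interlace with the uniquenesses $1 - \rho_j^2$.

```python
import numpy as np

# Assumed loadings (illustrative): 0.1, 0.2, ..., 0.9.
rho = np.arange(1, 10) / 10.0

# R = rho rho' + Delta^2, with Delta^2 = diag(1 - rho^2) giving a unit diagonal.
R = np.outer(rho, rho)
np.fill_diagonal(R, 1.0)

# eigh returns eigenvalues in ascending order; reverse to descending.
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Fix the arbitrary sign of the dominant eigenvector.
first = eigvecs[:, 0] * np.sign(eigvecs[:, 0].sum())

# Uniquenesses 1 - rho_j^2 in descending order.
uniq = np.sort(1.0 - rho ** 2)[::-1]

print(np.round(eigvals, 3))
print(np.round(first, 3))
print(np.all(first > 0))
```

Only the largest eigenvalue rises above the largest uniqueness. Each remaining eigenvalue is trapped between two successive uniquenesses, so a single dominant dimension is all that the Spearman structure produces.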
“The pattern i the eigenvector p
lifer fom the pt
pens when perfect
stiminaton is posite: Spearman shows ws what happess when dscineoa 9 ones AS xuceEs {92 MONOTONE LATENT TRATT MODELS
9.2.5 The Rasch model
Gutman
19)
with ofcourse, <5) sep faction, beri does
‘ot ep fom Oto Lay ae, Also deine
020)
4, IF we integrate
the desired result The me‘is oma ts or rear in Seon 921 hat petty ol
‘tems, in this case Rasch items, can have small a
‘tisposbi sete me
penne senna ay pene mao
have the sme eigenvector proper
arn ai cab ph of
9.3 NONMONOTONIC LATENT TRAIT MODELS
any suo is
‘mote realise to sippse that hep) are unimodal. Plats, for example,
np, with Ry tally postive, with A2 dagonal, and with
seeés coin aunt min If eso nis, encrain eof
inmmany snlaiy and refcteae cote. To we Cals (1972) example:
people who like cold beverages usually do not want them to be completely
frozen, and people who like their coffee hot do not want it to be actually
boiling.
‘The del comrpontig wit te pet sl anna aon is
= a