
Gifi - Multivariate Nonlinear Analysis

Multivariate Analysis

Uploaded by

lguilleng
Copyright
© © All Rights Reserved
CONTENTS. Foreword Preface CHAPTER 1 CONVENTIONS AND CONTROVERSTESIN| ‘CODING OF CATAGORICAL DATA, TRE COMP INDICATOR MATRIX AND ITS PROPERTIES QuaxTcaTION ‘THE INCOMPLETE INDICATOR MATROX ‘THE REVERSED INDICATOR MATRIX TRE INDICATOR MATRIX FOR A CONTINGENCY (GROUPING OF VARIABLES HOMOGENEITY ANALYSIS HOMOGENEITY OF VARIABLES ‘WEIGHTING gaan HoMOcENETY BY LNEAR WEIGETNG: PRINCIPAL COMPONENTS ANAL SIS LINEAR WEIGHTING FOR KSETS OF VARIABLES “THEORY OF MEET LOSS ‘THE RINCATS PROGRAM. AERA conDAING MULTLE AND SLE “sezaots wT DSCRETTZAMONOF NONLINEAR GENERALIZED CANONICAL ANALYSIS {SS FUNCTION AND NORMALIZATION OF OVERALS {PUTS RADIOACTY or m = a m as Fy oer 1.6. THEPROGRAM ANAPROF: ANALYSIS OF PROFLE THE CANALS PROGRAM ‘ECUENCIES a Bet Ancrnp wi iy ey ds Ea 64 AMALE 1: BOONCRIC INEQUALITY AND POLITICAL, TED Syme meen oepon ces Be 65 Exauce 2 pReDICTION OF 4 SCHOOL j EPLOGUE sa. [ACHIEVEMENT TEST m BB Bose cares) 8 CCHAPTIR 9 MODELS AS GAUGHSFOR THE ANALYSISOF Cour 7 asm nmap on ss soursrrcis 9. SOM GENERAL FORMULAS x 11 MULTIPLE REGRESSION AND MORALS a $22" Senet me Ss 112 DISCRIMONANT ANALYSIS AND CRIDONALS ey pas pee zg 18 523 The spent a $1 MULTIVARIATE ANALYSISOP VARIANCE AND b 524 Meet decode 8 NANOVAS Py S25 Menachem mt pce ices eee a 93, NONMONOTONCLATENT TRAIT MODELS m 11S) PARTIAL CANONICAL CORRELATION ANALYSS Pep caDan AN ce Boeier enone a ANDPAREASS oo 9 DICHOTOMIZED MULTINORMALDISTRIEUTIONS 26, SOME EXAMPLES 8 96 spLoGuE > ne LOGUE as CHAPTER 19 REFLECTIONS ON RESTRICTIONS ~ urn & 101. 
THECLASS OF RESTRICTONS CLOSED UNDER 4102 EQUALITY CONSTRANNTS 105 OTHERLINEAR CONSTRAINTS 28 14 ZEROS ATSPECIFC PLACES El 106 ROGUE ne EMoGUE e482 992 88 S8eeges & & agea88 a88888 eeeees eg S88 8 88 § #8888 5 388 Sane ES FSseES Ge BAR CHAPTER 1 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS a his ncotactory her we hal yt te dfn of mae 21 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS 1.3 CONTENT ANALYSIS OF MVA BOOKS 1L1 CONTENT ANALYSIS OF MVA BOOKS LLL Roy (1957) esting of hypotheses Fe also mention a mumber of important problems ithe futer development ot MIVA tecigoe: 1 many 41 cowvanions aN CONTROVERSIES IN MULTIVARIATE ANALYSIS | 1.1 CONTENT ANALYSIS OF MVA BOOKS | sequence, spies ‘he observations are not necessary independent and denaly is remarkable that Ke ‘computation in the bok 1LLS Morrison (1967, 1976) {6 1 conrvnrions AND CONTROVERSIES IN MULTIVARIATE ANALYSS: 1.1 CONTENT ANALYSIS OF MVA BOOKS 7 1.16 Van de Geer (1967, 1971) motivate 1 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS) plumbing job? Ta the Sit place MIVA techniques generalize the 101 CoVENTIONS AND CONTROVERSIES IV MULTIVARIATE ANALYSIS oomed to fil. We hive slay seen that dat analyse are wo peimaily {terested in coeficens, but in pices. A single seo coetiient docs noe sve very intening pieres. ficiently pete and the conditions fer apple arly concenzat hs stenion Bretton ofthe results he es obi? (bagels, 1LE CONTENT ANALYSIS OF MVA BOOKS n relatively few protoypia! problems (p 1), "The heart of any multivariate is of the dats matt, of ‘lysis is very popular, manly because of the wor andthe DB 12 1 convmnions AND CONTROVERSUES IN MULTIVARIATE ANALYSIS. am anys and classe! statistics in considerable detail. We shal eoe back to that discussion erin the chap. on in the book, but very lite materi on principal components analysis factorial, ndeanenieal | analysis ical repesentations, bt ofa light er. 
The book ues proba Reduction of dieasionaty Stay of dependence He also mations th most imporan rolems, ia alistthat resembles the one sive by Kenda: ‘CONTENT ANALYSIS OF MVA BOOKS B (@)_isdimeut 1 fn oot wath cent wans to know exay. Aira the prefered technique Hass, onedmensional multivariate analyst, uss the uslon-atersecuon 12 CORRESPONDENCE ANALYSIS OF TABLES OF CONTENT 1s 112 CORRESPONDENCE ANALYSIS OP TABLES OF CONTENT PACT: Facto aabsis an principal components cnals (Canoneatcorelaton ancy MANO: MANOVA, and be general malaria linear model Bees! Jews cont "ANALYSIS OF TABLES OF CONTENT ” 36 1 convaions AND CONTROVERSIES IY MULTIVARIATE ANALYSIS 12 CORRESPONDENCE 181 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSS 13 ASHORE SURAMARY AND Sone PROBLEMS 9 nd 2 book close to a ec ebook pys satel mah aetion oh post pure ROY, and these re 1G1+ ‘The projection ofthe boa ad ee eee i analyte ks se nea te ne between CORR asd MATE, this cootext a8 dimension in Table ‘much moe atenon 19 MANO than we expect. etna fvel. Of souree, COL] is a mano for comping MVA solutions and ts cal. We have als epeted the alysis ‘thor the ‘extremist’ GREC and ROY, The projections onthe | analysis shows us ‘Bore compact and comprehensive form, Ofcourse thie ede pry to oor | feoedon ofthe books, 1.3 ASHORTSUMMARY AND SOME PROBLEMS the hypotheses ee tested, Te ‘approach does not sar witha ‘model, ba oaks fo tasformaions and comisnatons ofthe variables sith Figure 12 Conerondnse mls : 1 MIVA Beak ede 20 1 conventions an coNTROYERSHS IN MULTIVARIATE ANALYSIS 14 DATA ANALYSIS AND STATISTICS a the explicit purpose of representing the data wen graphical, way. es ute diferent fom that of Nitsa 1.4 DATA ANALYSIS AND STATISTICS LAL Tukey's definition of data analysis LAL DATA ANALYSIS AND STATISTICS (A) Pr an woe wh ees gh (3) oF nl ag ar (8) Sie amino ks a Se enon f wrk am ah ge et, wih cal ‘aps onc ages 241 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS. 
A DATAANALYSIS AND STATISTICS ofthe spct of modem ut analy wih ‘which comerponds| lied exp pects of gene iat has sly nb infsened the patie af 26 1 convaNTONS AND CONTROVERSIES I MULTIVARIATE ANALYSIS 14, DATA ANALYSIS AND STATISTICS a 1.6.3 Robust tatistles 28 1 CONVENTIONS AND CONTROVERSIES IW MULTIVARIATE ANALYSIS 1 DATA ANALYSIS AND STATISTICS » ‘Acinlot hich a nore igen an which sion cong n te date anu em at ates eae Again we see the shifting emphasis fom optimization to stability obus- 9). 1.44 Exploration and confirmation ‘These two tems ace becoming very popular these day, but again they are ed by efferent sh various ‘Toke sages, or example, with Parzen, ho proposed in recent paper to ‘demi exploratory dts analysis with confirmatory nonparametric steal a 301 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS 14 DATA ANALYSIS AND STATISTICS ory techniques use tests of hyposbeses, confidence igs, tations thery or mode! eds to new eonecres ofthe model. Thos he proces ices and not ie adnan fog oct ce by eran ste, bt by tenon Seah boy eat a te Esa 1 seems io ws tht the wots ‘exploratory ad ‘oafirmatory’ should be ‘ed inthis Senseo eonjetres and refuations Tukey pts his 4 Moria erp |} by Tokey 1980), and als by us 32 cowvannons Ax CONTROVERSIES IN MULTIVARIATE ANALYSIS 1 DATA ANALYTIC FRNCILES OF 7H8 BOOK 33 145 Inference i 15 DATA ANALYTICPRINCIPLES OF THIS BOOK Weave wsed the word = of times, and it i consequeny © word we ase tht 15. Model and ‘The procedure wally adopted aqutson orprolem sad then evance ar pres and wl ania to have cond is. 
Samet ‘The model is replaced by 2 of simple models, M1 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS 1.82 Gauging ‘What do we mean by enusing of ¢ save pope ly the ec 'echniqu recovers or epresens the Known props Phe ‘of gauges, and we meason some of the iporta oes, 5 DATA ANALYTIC RINCIFLES OFTHIS BOOK 3s tigate what aspects ofthe ‘model ae epeseted and ow well In usec analysis the Hiller rmacix sa falar algerie 8 Benzser probable ones, espera in expleratry MVA. This (¢ poychomeste sealing they. 36 4 ConvENTIONS aND CONTROVERSIES IN MULTIVARIATE ANALYSIS [1s para avaLyicrRavemtss or ms 00K st 153 Stabitity to tlk aboot Bayesian ve but we ae quite Rapp to ea bat We agree with he spr ofthe 381 CONVENTIONS aND CONTROVERSIES IY MULTIVARIATE ANALYSIS (@) Sibi nde mod selecion Asal change nthe motel hat we "si Social model’ should pay more atenon to thi form of stb han they aul do, an inequl ‘tien the algebra ofthe probiem Bh 16 SPECIFICPROBLEMS OF MVA {L641 The matinormal sti rn 1 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSIS Of course we must remember tht Person's praca! purposes were a Aeseipve and ot nferenal and ve equl dispersions, then the points whee the fst ret thin the second one ae spare tiv hore properties Statistician considerably, We mention tho urposes a dua analytical point bot very imporant snd ny Une ans se MVA oft ses 42 CONVENTIONS AND CONTROVERSIES IN MULTIVARIATE ANALYSS 441 convexrions aND CONTROVERSIES IN MULTIVARIATE ANALYSIS | 16 SPECIFIC ROBLES OF VA 45 46 1 CONVENTIONS aND CONTROVERSIES IV NARTIVARIATE ANALYSIS peadeoce ae properties. ofthe variables, which ean be defines a various Aisibation there isnot mach choices of the covariance mates and deve ons ae neade. 
ios in tems of conditional for combining independent Dlicorive appro imainly by Lancaster and his schoo liferent systems is easy to describe: mi dive analysis on the ogarits Tn ge seal probablty-ased definitions of dependence inerdencadrer og con can more easly be inv ‘aa rose tive analysis has ober advantages, The two techniques are congered Daroch (1974 and Laneaste 971, 1975 Tr gener Romie | 146 SPECI PROSLENS OF HVA 1.64 Causal analysis CCoosl analysis hae a diferent histones oi ‘an tabular abalyss ial Wid What we now ‘oience wat the descriptive and scien theory is merely a shor ard spe summary of large none’ of xpi! oteeatons (lor example coneaton). Tis interpre of science isnt very popular these das, exeepe perhaps in some aed mee cenees. There re tart ee reson fot netic ‘ereity o ei “The mmber of corelatons that as bees comp 3s, Bu no tear has as yt come oa the second place Yule clay showed Itwe cor mpl (each 8 he income of Preshyterian nitrite pot of ‘hom Jemsica). Ts isso who never eared in he Est plas for Paso’ emp civ poi of ie, thee em tnes a ch thay avec! ess red comeition cose than elds the major problem of eaual analy. n we now have the problem of model when the nomber of rpodel mare oF less 48 1 convayrions ano conTRovERS utoraialy implies cs 1:7 DERINTION OF AVA ° ‘sceepunce of he vesul bythe nite, 17 DEFINITION OF MVA LIA. Asymmetric rote of roms and eo (On de bass of te scene receding ston we ean ow ny ge definition of MVA. 
Tae oldest, and the ros oe 50 | oxy aN conTROVEREIES IY MULTV ARATE ANALY ed symmetrically, vrabes te not mentioned, 0 use he erm mlidimensional anal ors ele ce nonce lvl Simpl oe te space whch te vas are dtd as w elements, tse eat dette by counting he uber of elements inthe se tnd Them vats cn be deacd by naling ast of al lane eke 6 cameigodingm vases of he focons fo each seat Pe os ep osaie to anlys nvm of mf Space todace asymm tel hs made pose ohn of eneratzaons sacl teminlogy we stay ase, end poplion {sft and compel sewed We cn ostty sic oper a lssibton overt eee, bt sow ‘sn longer wv asd we hve information het cd inthe nx me . a ‘so pose tht decd ha an ony of tee stability. . he same popatation model. The i te prev on which we Reve ebieratons aad which creme 17 DERNIIONOF AVA si ‘say asthe matrof the sample ha consequence or what we observe ‘nthe base of his analysis we a give the following definidon: MVA nudes syste of eorelated random variables or random samples rom sich do ber of random variable mil leo only ‘esurin theory. We have also incorporate the sachasi eleent ‘ur definition, but we ave sen thi i ean be made vial nthe case of ¢ inte population with counting measure. Thus it causes no real loss of ener. Satis only besoes lnpomant, of coarse, in the special ease ‘hat we actually have random sarples, 1.72 Linear, monotone and nonlinear MVA. ‘We now deine some specific forms of MIVA which wl be impor books MVA it linear if te results ave invariant snder ene to-one transformations of the random ¥ Invariant unde one-to-one monotove tat "The ess canbe formulas, the rerult of derivations they ean fouls; and they eat be infleneed by rounding errs, choice of inital cotgue- par ofthe output of he actu 2 ‘lower convergence desired solution, or evea to convergence to an undesirable local {Upto now we have have als sean that abl EN AND CONTROVERSS 2 HULTARATE ANAL 15 SoMEn@oRTAN WOREDIENTS 3 ‘we ‘we use sats) 3 36 1 CONVENTIONS AND. 
| {CONTROVERSIES INMULTIVARIATE ANALYSIS | 18 SOMBIMPORTANTINOREDIENTS 7 Snterpretatons of espelally HOMALS are pssbe, and willbe discssed ix ond 8. HOMALS ip LTIVARIATE ANALYSIS with meot problems in which X =m. Again this wil be explained in more dein the later chips, CRIMINALS har K=2 and analysis, PATHALS hat & =? an ge ‘as K'=2and generalizes ascuealK, which generalizes Ket cuenta analy Sd ll vasibles ae tot ori gener have ooly been ofthe problem leads ta more ecient zed oat, or the Individuals or object) ane inthe Se flxmations of he vazabes (efor loss fnetins zea Teas squees type, by which ne mea mated ta the fist sbacp econ ofthe wasermations inthe scandy formations for the given bake ing these subsips obviously produces» deat ‘wales, whic tum isis foe piven ales 8 of oss ‘converges becaise los ie ounced Sclow oy ty conditions we can also prove thatthe ase Be fo values coresponding with a saonaty vane Particular way of computing te sealing, because the nan minimize the los function, In choten ona prior grounds, a !MPORTANT INGREDIENTS ED (OF couse, optimality mis not be interpreted in any wider sense. We donot Choose. Wheter they ae beter in aby wider sense mu ecied by he gaging proces. near spaces cold ston in he comp- they desuoy mos af the and prove vs ANACOR snd ANAPROF programs, isussee because the transformations ofthe variables are often resisted in vsious ‘ways, We have aady seen hat single vsibls in the nltileappoech se rested by proportionally cons these restos is that the resuling problems ae no longer equivalent 10 © 1 convEnOns Asp CONTROVERSIES IN MULTIVARU We ave been w (tere inthe ft 19 EPILOGUE [Arumber of new books on various aspects of MVA have appeared since the 62 1 cowven TIONS AND CONTROVERSIES I MULTIVARIATE ANALYSIS 1 BRLOGUE 6 Mardi, Kent, and Bibby (1979), two are probably mos . spear books embedded in the classic raion, 982). As well as De Lets, 1989), Trere can be no db ‘Some comments CHAPTER 2 CODING OF CATEGORICAL DATA. 
ving the nregory of variable fy for object 1. These elements are noe necesserly ‘Table 24 Eats ofa mere ‘Table 2.2 shows the complete prof frequency matrix lata mari of Table 21. Obviously, f many prof ‘have sero fret ron correspondin frequency [ o 2 coding dat wil be of e forthe ype of ani For each vavable yan ni Binary matrix Gis ‘mapped inthe rh category of hy not mapped inthe rth category of hy 2 CODING OF CATEGORICAL DATA Forte numa example, braces along the digo of thos long the dagonal oC, _ Boa 22 QUANTINCATION e 22 QUANTIFICATION fvalabl, sack quanication could be |gnoted at replaced by anol eategericaton ‘he average of the scores of thowe objet that are mapped into tha exes. fe formula: ur Dj'Gix en n 2 CODNGOF CATEGORICAL DATA ter assames that Dj hasan fvere, which implies ht there aren nes wit zr equncy. I some calgary he 2 eqency, we my ss well dip is oluma fom he indetr mety The two procedures can be connected as follows. Let y be a erect. ‘quantification ofthe euegories ofthe fh variable. Lat y be a vector that roquitement nt ony makes slugs fr xan mization ferarive loa funtion ter sony one ston for eacal, we might be interested i p diferent solutions, This thatthe eategory quant dimension f sypiea etae i This could be stated mare form Let haven rvs, one foreach of individual pariementarians, Columns of G carrespond ithe corre rentrian voted ‘four’ ofthe Proposal, and "otherwise, The matrix could be completed by ‘ding, for eaek proposal, a second column repttered if the individual voted not in favour, and“! otherwise. THs creates for each proposal complet indictor maiz G; with 0 columns, and 8 = | 23 THE INCOMPLETE INDICATOR MATRIX n er example is hao sing dat. Ie wil be sessed more $l tn Section 24. For quantified scconting ote sume prin iples as oained in Section 22 forthe complete ast. 
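The coding and quantification rules of Sections 2.1 and 2.2 can be sketched in a few lines. The sketch assumes the usual reading of the text: each variable is coded as a binary indicator matrix G with one column per category, D = G'G is the diagonal matrix of marginal frequencies, and a category is quantified as the average score of the objects mapped into it, y = D⁻¹G'x. The data, the object scores, and the function names are hypothetical and serve only as illustration.

```python
import numpy as np

def indicator_matrix(codes, n_categories):
    """Return the n x k binary indicator matrix G for one categorical variable.

    codes[i] is the 0-based category of object i; row i of G has a single 1
    in column codes[i] and zeros elsewhere.
    """
    codes = np.asarray(codes)
    G = np.zeros((codes.size, n_categories))
    G[np.arange(codes.size), codes] = 1.0
    return G

def category_quantifications(G, x):
    """Quantify each category as the mean object score of its members.

    Computes y = D^{-1} G'x with D = G'G. This requires every category to
    occur at least once, otherwise D has no inverse.
    """
    d = G.sum(axis=0)                      # marginal frequencies (diagonal of D)
    if np.any(d == 0):
        raise ValueError("empty category: drop its column from G first")
    return (G.T @ x) / d

# Hypothetical example: 7 objects, one variable with 3 categories.
codes = [0, 1, 1, 2, 0, 2, 2]
x = np.array([1.0, 2.0, 4.0, -1.0, 3.0, -2.0, -3.0])   # hypothetical object scores

G = indicator_matrix(codes, 3)
D = G.T @ G                                # diagonal, holds the category counts
y = category_quantifications(G, x)
```

Dropping a column with zero frequency, as the text suggests, is what keeps D invertible; the sketch raises an error instead of silently dividing by zero.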
Ags of average etegory quantification for nd vice versa category quandfiatons 23) oe) rissing dita canbe intepe favour" ill ‘be considered as Deing inthe ‘same catepory inthis respect. However, parame n 2 CODING OF CATEGORICAL DATA, complete mais the oldest ond the newer grave ‘an asing nce dd ico epee te he aoe ‘quontted at opposite ende TaQle2e leanpleeinditor Table 28h Compe neat marc A special and ever recuting probiem in MYA isthe presence of ising et, “Toey ean ooo fora variety of | 2.4 MISSING DATA | | 1 2 CODING OF CATEGORICAL DATA wll wn co make the best goess 6 2 CODING OF CATEGORICAL DATA 5 and presence’ as Govumas a there a complete G, by edging as many indesior mati, i. 0 derive nspsed dats matin A te bess of he columns 2b and 4, hat some individ apply some eategoy tol and Hb ot 2.6 THEINDICATOR MATRURFOR A CONTINGENCY TABLE n iy Variables Analysis ofthe reversed ‘nditor mati qunsis variables abd eateries er individual, but does ot ‘want ails 2.6 THE INDICATOR MATRIX FOR A CONTINGENCY TABLE. Anindiator matin ss follows, The Table 2124 Coningeney ule Table 212b Inara eis aieae 8 2 CODING OF CATEGORICAL DATA c+ ne) cols rans fore i ow category tht applies tothe individual an one forte slum oe ‘hoy. Table 2.12 gives an example Obvionty such an ncaa stn nn an ficient way of coding ents, ‘convenient to imspne daa coed in rt ne columns for row categories and egos. Each sow have evo ens '' one for 27. GROUPING OF CATEGORIES ties it makes sease to group categories cxample is an indicator mart G, for on Es ‘examination, with four response caries, ane of which 8G with tour columns (one for each response cacoo righ ake only two eotumas (one oer answer) ofthe avo types of in For example, oppose individu syteatically ave Tnsesd the correct answer, and one forthe ‘Wong Fesponee i chosen responses onthe ether te of the content ofthe deviant jon 131, When ther quanineation will noe necessarily be to ‘uantieaton they would have otined before grog emake 8. 
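The indicator coding of a contingency table described in Section 2.6 can be illustrated as follows. This is a minimal sketch, assuming an r x c table of counts is expanded into one row per individual with r columns for the row categories followed by c columns for the column categories, so that every row contains exactly two ones. The table values and the function name are hypothetical.

```python
import numpy as np

def table_to_indicator(F):
    """Expand an r x c contingency table into an n x (r + c) indicator matrix.

    Each individual counted in cell (i, j) becomes one row with a 1 in
    column i (row category) and a 1 in column r + j (column category).
    """
    F = np.asarray(F, dtype=int)
    r, c = F.shape
    rows = []
    for i in range(r):
        for j in range(c):
            for _ in range(int(F[i, j])):
                row = np.zeros(r + c, dtype=int)
                row[i] = 1
                row[r + j] = 1
                rows.append(row)
    return np.array(rows)

# Hypothetical 2 x 3 table with n = 7 individuals.
F = np.array([[2, 0, 1],
              [1, 3, 0]])
G = table_to_indicator(F)

# Cross-multiplying the two blocks of G recovers the original table.
G_row, G_col = G[:, :2], G[:, 2:]
F_back = G_row.T @ G_col
```

The expansion is wasteful for large tables, which matches the remark in the text that this coding is useful mainly as a way of thinking about the data, not as a storage format.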
When one groups prior decison that ey shave ea! weigh 8 the object sore mus be the same Ungro gore, o the oe hand, obtin ferential weg 28 GROUPING OF VARIABLES esis expanded by creating «new varia. ber of posible cmminations of exteore with as many eatgores, 28 GROUPING OF VARIABLES » ec oe vrsble have etegries and ‘The combined variable wil ave ofthe ial variables Fr ena "and another one estgores categories ‘ap, nq) ars tp, a “ofa combination af eegries ‘ddtve. When variables we groepd, their oot needs to be adtve (whichis preci th e of combi snarinal pr Sent (a) A common procedure for collecting preference dats for m ‘inal SG = won) is the method of pated comparisons I resents ll pstible pairs of smal (SS) #0, dnd foreach and which vo are leasalke, Obviculy, there ares assole reponse to each iad Sp Moss par teat oy = ora ap Sil; Be pee OY ID o om %0 2 CODING OF CATEGORICAL DATA foreach sixcolumns, ne foreach 29 EPILOGUE developed manly in France, hasbeen zeviewed in CHAPTER 3 HOMOGENEITY ANALYSIS ‘Thee is bi nts pop of thi chap, from Sesion 3.8 onwards. Homogeneity analysis in the ‘broad vense refer to a case of enti for analysing multivariate dnt in mb of goin in epee nd te nx chap now of (0976) or Van de Geer (1971) as accompanying tx, ‘34. HOMOGENEITY OF VARIABLES ioral, the ex of bomogencity is closely et et teen hat iffeent data ti ny have element vary somewhat i reasuementenor forests). A pap of prfies hen 2 $5 MOMOGENEITY ANALYSIS 4.2 HISTORICAL PRELIMINARIES OF DIFFERENTIAL WEIGHTING 3 fon we give some rs expected value BU) = O and p= D0 tary 3 HOMOGENEITY ANALYSIS 112 STORICAL PRELIMINARIES OF DIFFERENTIAL WEIGHTING 85 the averape coma sarge, the numberof ies sree, be the minimum of 000, The aon vae brine by taking xh te ean ofa. 
funedon ten becomes 6x This of) = 1-880, where rs the average coneltion between al hy Gncoding fy = ‘sul eoawsponds with Galton observation eum or Raa oe These concepts, swell as sme atonal nes, ore ilutrated by tre for Spearman's ‘one factor ode! seuss such models in toe del in Chapter 9 of meatal test theory (how to define the overall weights canbe derived. The same is ind for Ga Com we replace the columns of H by a single vector x without re ‘sealing the orignal oles? The best soon for Xs the vector | The total sum of eares Tor His the race of ‘ering outeves, following in But ‘mewvbat obsurequoation, ays hat be tepreaon ofthis Tomcmewe oan kaye tenngccoraim semeome | “( @ 15) . . 8 ‘of vanables. A ocmal pros fellows wae | 33 2 18), wir = (soez0+i0) = 00. suum tat ally are sndudied Lex be i ania for repacg al 3 0 10) implies loss of information, © be evaluated by ti og j Apennines fe) 29 Fy $8040 —| 1 becomes B= 3x = 5667. lis called B from Between, becase (8) WZ) SSB on 4 Atdepend ony on diferencesbereen rows elements thin row are denen) A direct exprestion ie B = WHT: a direct ‘expression forthe otal sum of squares T is = wDu, where D is the dagonal ants of 1H. he notation $8Q(W) is used throagho aes ofthe elements ofthe veto ich implies that al hy a dena. Ls 2) 50 } »-(*»,] (0) = min (90x) | HOMOGENEITY ANALYSIS 318 MADAZING HOMOGENEITY BY LINEAR WEIGHTING 7 B.The symbol W comes from Wi OF squares of 928 032-032 re explicit, M488 2668) | eon hap, oe os ee prams [12008 94 0:87 0 0) (scorn mae yan ( 948) nd we find B = wD-VPREED-V0uim = w Rem = 234, whereas Tem=3and Wet B wir = 02,7 element of | Wore tr, Core ten X andthe columns of Hare Neen t TAGE F (eh)= O88 w (0.7812, which iustates the 88 3 HOMOGENEITY ANALYSIS ilsmioed with pect oa fr fixe We sa i desi to vats of is algo onespordi tion mentioned Tone xi cemsbaed In order 1 Keep the nottion the clumas ofthe daa natin H to being centered, are nommalied to unity This norman WHR (eceneiaton mani. 
341 Nortatized scores nthe normalized score fy x¥x=1, The algorithm requires AGO) and thea proseds withthe song are not su ized (acording i some pe selected enteron of cease), a5 the valves of| 4 foc ied. Now tat Hil isa vec saci with colamns By. The up ze the the fase re Satisfy the conssnt), Step 3 eoresponds tothe uncon 8 function (3.4) for fited x. Beoaute x end of H are centred and unit normalized a” isa vector of eorelaaone © Seps land 2 gether, and step faction, which s bounded below Appendix discusses extnsvely how & = Hii ctn be z $e inerpeted as n image of f (and a* = Hx? is an image of x"). The apponuie ane demonstrates that a stony point ie reached when he image av HI 's proportional x*. The normalized scores algorithm resuies that & 5 ALS ALGORITIOS FOR LINEAR WEGHTING 9 Wx becomes 2 prewdo-adis of s hype: lipo. The algorithm converges a so-called invari direction, & prin pal axis of telat hypeeipaoi. Inthe append he alge ated ule value decomposion of H. Lat H = KAL: be this singular value ecomposiion (SVD); tus K sth maz of let singular vectors (sting KK =D, Lis thomatrixofright singular vectors (satisfying LL =, and A IO Singular value. The algortm converges tox" as he 1 where 2 is the fst dlaposl element of A radius of bypesphee, so the eats fr taonary equi) HX! = LAK LARK hah =a", 6s oA ae 66 on 38) with D= dagtV 1 = ag(R) = 1. Te sum of squares of te optinal weighs ‘sequlo the donna egeavlue of HH ~R, The lave lows WiP=1 p= 1-X 69) ‘he matrix LA i ale the mara of loadings of he priseptl components nals, sh he fst vector af lenings, coresponing 10x” 1 the vetor of component scares on the fit przepal component bs tlready been remarked that a* ir 2vostr of correlations between x and the vectors. The soaton2* maximizes the sum ofthe squared corlasons For aminiaae exp, 0707-0507 0.000 0.707 \ where R isthe correlation mavri. 
Table 3.1 ives the resus for ‘he frst ono erations andthe fra solution Fires 32 and 23 HOMOGENEITY ANALYSIS ive the geometry ofthe soluion (alto compare Appendic Bp Figure 3.2 shows the plan ofthe oo column vectors hy they are "ad ofthe uit cirele. Figure 33 gives the image of Figure 3.2 7 ° r igure 3.2 Fs algo, Vectors cae on te wit ce. nears ao wih apes il ae apo ey anverge to Tas te fer is De unt Jeng io ofan comet 2 3 MOMOGENEITY ANALYSIS ‘Table 31 Aviad Nr of ‘omnes Sg Hat = KALI =) 1 berween ed them (whatever the ia ud be) depending on sdltional enter im in Ri equal 0 BOT = 0.5 The average correlation berseen andy equals (B/TN2 = ones. 142. Normalized weights algorithm (©) Comergence ves: 94 8 HOMOGENEITY ANALSS S44 ALS ALGORITINGS FoR LINEAR WEIGHTING 95 Figure3s ison and conve, 343 Adaptation for heterogencous point wit x= =p ed weighs algoriti we ase nea | HOMOGENEITY ANALYS now sealed so hat aa = 1 (te ol sum of squares eget rnomlized weighs algrthn converges toa =D), and corsa dae for une vancso The story ofthe so adjusted algorithms is shown in Tables 3.3 and 34. The optimally rescaled data matrix becomes Q, with columns 9; 37 5 ALS ALGORITHMS FOR LINEAR WEIGHTING Sin wl ig ge xine of atl ra ‘Weteveto mat encanta solum by the wnt nomaied anpoectn sex props te td ela of ont te sues ergo tose st a seed calm, ad So on Ts tors canbe smart by tng ta Xi eoorponed os = UT wih OU stand an pyr ingle ai. Tats we can ele Sep? 02) Cramer dcompation — UM ER, (8) Update weigh: (@) Convergence es: 100 2 HOMOGENEITY ANALYSIS |35 LINEAR WEIGHTING FOR K SETS OF VARIABLES “Lette data mii H be parioned into K ses = He has me colons, with 5 m= me. Te pain specific data a variables One ox ‘new vaabe tht is Hoearly her posible objectives in Ta terms of slave los, me ifr te wears yin sch aay Da wEZe/K = Geta CeThsy, T=wDu=Se\ithin, ay e169 ly shows he relationship witha generalized ‘atin a sigle vectra (of dimension). 
Define E atthe paniioned diagonal main of HEH Ge [TH ints diagoralsubmutices Hy, bathe off dagonal submatrices BiB ‘with faereplaced by ero submatices), Ten Beu22uk = eHHale en TouDu= aa, Gin sothat a must be found sucha way tha aH Haya'Ea is maxiiae, We give areerial example with = 2, my = 3, m= 4 | 4.5 LINEAR WEIGITTNG FORK SETS OF VARIABLES the result becomas 2068 oan ) 358 $235, oa Ea | Bis 938 | (B88 88) ith Ha =a, Since 7B = i, we have BIT = 1.98272 = 0.91, So that relative los ie equal to 1 ~BIT = 0.009, Appendix B 01 02 3 HOMOGENEITY ANALYSS sf eanonical equal to 234, Pr the sme exam ase tote eipenva 36 MORE HISTORICAL. COMMENTS ON FCA 13 ofthe squared datances it te distances are measured inthe diction ortogonal fo weighting i often stutbuted o Hoting Showa in the quotation above 5 MOMOGENEITY ANALYSIS 237 MAKING HOMOGENEITY BY NONLINEAR TRANSFORMATION 105 3.7, MAXIMIZING HOMOGENEITY BY NONLINEAR ch i is easonable to const an indicator ‘TRANSFORMATION: NONLINEAR PCA (f. Chapter: mentioned in the previo es the comesponding quant otxa:¢)= ml 3, over the objet vores confounded i subsume gin ih) and writ (3.20) spy tug) =e! F, SOUR ~ OY) rt F/S5K-Gi¥p, Gay S Orerow of ios nc inte sec fo bomogerty Solutions. In these cass, a5 well as when a a prio! oder lab, we fave to dias he dimension in other wor 106 5 HOMOGENEITY ANALYSIS 3.8 CATEGORICAL FCA: HOMALS 107 basic os fenetion inthe rest 026 the HOMALS 2n and 34 of the ALS schemes 108 5 HOMOGENEITY ANALYSIS oxy) = xx +e} By} 29 By) 1-334 3)Dgy=1—mehyDy, ton y= yD-I?w, with w any column of W and the lar val, yielsthe combined properonaies (3.26) and governed eigenvalue problem” ‘ined andthe easton condtans chosen "he propriate @.29 and 27 lo ply x= (GDAGin)x 18 CATEGORICAL FEA: HOMALS 109 n=} ¥) G)D7'G) for more Chapa. eralizes to ORY) = WXX—ne! Fy Y/N, spam! ZeyDye stationary pir of male qu feores and Y= D-IAW for Subscript denotes the selection ‘inimam lot besos tt") =p— Ep yal. 
635) 3 HOMOGENEITY ANALYSIS $M CATEGORICAL PCA: HHOMALS m ‘Tables Manic Gor tune earn Ye DIGS yells wDy* =wGxt =u, 638) calculators 112 3 HOMOGENEITY ANALYSIS 383 Normalization ‘Thee a wo bse samaintoncpion in cago PCA fe lg ‘rithms of Section 3.4) ie * aan (©) yisnonmlind so tary issome const. The indaced obec scores ter ae obtained by x™= Gy, shih aes the the average of i category gu dimensions an eb pine wl object stant Te induced category ga 7G, where each estogry is qua the centre of gravity ofthe pins for “The sandal HOMALS progrts kes normalization (), with xX =m 50 x becomes a stnderd sore’ There ar wo paca! oon elements of x now cat be interpreted iar properies of standard sare, The second applications it happens very often tht ism ‘normalization @) then eves te ies eins equal spread ll eons of sebgroups on above ips. ese te SVD ston D2 = VIEW, tat french ofp salons xy andy, = Ie) eons TUE a rec tp Ye66 long) the flowing ean) nee hy,DVm, an "Dyse my an Inthe Quput of te HOMALS progam te egcnvah ° LS program the egéavales are report in he fam vn. The sale af he cap guaifetont to sed sears, bt we may deve uper and lower bounds fe ating aa impontion ‘of its range. The category quantifications always satisfy i ~ linda gs lead) iy, ‘where dy denotes te magi equeney of category rf vaiable 343) 5.8 CATEGORICAL RCA: HOMALS 43 Proof ‘We fi wre the cargory quanto i the form ere Mae sue er BCG gd eas) yess hat (aay) 049) (dg dey oan ting @.44 and GAT into .46) now gives he desired esl Subsiteog G.44) nd 47) into 6.46) now Bs — 1 category quanificatonsdeped onthe ni The bounds gina rales tha the maximal range of very infrequent y= 18 This fre reetetion of range isone ofthe easons to becuse wea the variables have widely distinct morber of Gtegves, oF ‘vai apne wih very tin als |” 384 Contibution o variables: serimination measures Ls craton neces cfr F sipDaayde oe fe 1). 
The dis i ‘variable does not contribute to the sth dimension ‘inwension, Whe 3 NomOGENEITY ANALYSES coincide with te ei «Saline Sn Giy Prt Goi met wot en Te coin even an Ee Faa)= 697 Gi6}679 ay ew igs aeaty B=, 8.8m ereicenn Pisa = ODay? OD,yir = Darya, G.49) ‘ice ioonion me. ox fn eso 39 lb sown at Se dctinatn seas ga 4 terpretain ed component loading. Before prcending nahh * SmaT auth eomerical popes ofthe HOMALS cece et meri examples given, For the data mati of Table 2, ‘mati in Table 2.5 (the exam We give the HOMALS esa (9). th nomatison yDym (©) Wh the sandard HOMALS moon y=Di'Gx, (@ Nem he genera Blenin Tie ionyDy !gersecor equation Cy = y2Dy (C and D are 'axd2.7) has the tee args eigen Vi= L885, yim = 0.69, Wr12T, yma o42s A=1167, Yn =0399, rela ts tym «0371, ‘eave ls 1m » 0.57 ‘ele ls 1~ ym = 0.61 | | 5.8 CATEGORICAL PCA: HOMALS Ye38, Using te category euros pare gens Tobe 38. ing fi quent optimal dana end Oy ‘Antibes ath corns gts Gy The eter boa nas Gy argv ore tee Stawys Tebes 10 he a ‘y'Dy = 1 implies char the sum of squares of i310 SoluionorX, Ses dane dimensions, canaaon Opt 3p an as tan Get Ce ee 2 Be aot us 2 HowoceNeY aNazysis £21901) pots. The iste columns ses of Table 3.10 sive those of each obec point i the centre eon for the gue atthe sum of ‘Teble33 Opinaty scaled daa main, ‘hd on ets HOMALS teton ia 021 tances beeen a category point and the obiect ects belonging othe category 3 HOMOGENETTY ANALYSIS 5.8 CATEGORICAL RCA: HOMALS ° on. Some ofthe techniques of nonlinear MVA to be dscused ater (eg PR ape 4) donot have ai he cone of praviy ofthe objec cctegois of that variable arin the ‘sobelouds, andthe eaegory points ee 4 plot forthe frst no dimensions. Cotegory of obec poi her category’ coined j ay Between 2 and a i 9 of . ration: obec 2 and category. 
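The HOMALS computations discussed in Section 3.8 can be sketched with alternating least squares. The sketch makes several assumptions that should be read as such: the loss is the average over variables of SSQ(X − G_j Y_j), object scores are centered and normalized so that X'X = nI, category quantifications are centroids of object scores, and the discrimination measure of variable j on dimension s is y'Dy / n. The data and all function names are hypothetical; this is not the HOMALS program itself.

```python
import numpy as np

def indicator(codes):
    """Complete indicator matrix with one column per observed category."""
    codes = np.asarray(codes)
    cats = np.unique(codes)
    return (codes[:, None] == cats[None, :]).astype(float)

def centroids(G, X):
    """Category quantifications Y = D^{-1} G'X (centroids of object scores)."""
    return (G.T @ X) / G.sum(axis=0)[:, None]

def homals_als(data, p=2, n_iter=500, seed=0):
    data = np.asarray(data)
    n, m = data.shape
    Gs = [indicator(data[:, j]) for j in range(m)]

    def normalize(X):
        X = X - X.mean(axis=0)           # center the object scores
        Q, _ = np.linalg.qr(X)           # orthonormalize the columns
        return np.sqrt(n) * Q            # so that X'X = n I

    rng = np.random.default_rng(seed)
    X = normalize(rng.standard_normal((n, p)))
    for _ in range(n_iter):
        Ys = [centroids(G, X) for G in Gs]                      # update Y for fixed X
        X = normalize(sum(G @ Y for G, Y in zip(Gs, Ys)) / m)   # update X for fixed Y
    Ys = [centroids(G, X) for G in Gs]
    # Discrimination measures: one row per variable, one column per dimension.
    disc = np.array([(G.sum(axis=0) @ Y**2) / n for G, Y in zip(Gs, Ys)])
    loss = sum(np.sum((X - G @ Y) ** 2) for G, Y in zip(Gs, Ys)) / m
    return X, Ys, disc, loss

# Hypothetical data: 8 objects, 3 categorical variables.
data = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1],
        [1, 1, 2], [1, 2, 2], [2, 2, 2], [2, 2, 1]]
n, p = len(data), 2
X, Ys, disc, loss = homals_als(data, p=p)
eigenvalues = disc.mean(axis=0)          # average discrimination per dimension
```

Under these assumptions the loss divided by n equals p minus the sum of the averaged discrimination measures, so a variable with a small discrimination measure contributes little to that dimension.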
The same is ‘rue when a category applies uniquely oa group of objects aideneat response pater (category for objects, 4.7) 20 43 HOMOGENEITY ANALYSIS > cueuory pois wit low marginal fequency wil be plowed fanter’ yes az With gh magia eqncy cent of te ploe Ptr sitar ote wad the eens whe f {19 RELATIONS BETWEEN HOMALS AND LEVEAR CA o iltsrae, we tok te uni narmallzed version of he optimally quanti data matrix rom Table 3.13. The coresponding ‘nrelaion mate Ry Beco. ination measure inthe {rst dimension are the squares ofthese Figure 3.8 depicts the component loading ma space. The plot is based om hth horizontal dimension an the verte ets he sce bee the fk | =wADeDy ow, we have used the Burt table frm Secood PCA diension rove that the fist HOMALS dimension’s the fist fhe optimally sealed data matix Qy ise flowing index forthe dimension. Let Qy be a Ej m mux th column, where each ago tte dsciniaon ences fof Dg na 3 HOMOGENEITY ANALYSIS ‘Tus int HOMALS sla cn sso be esctbed i the ftlowing wy ‘The indicator matrix as it were expands, or blows up, the dat vine tha cola of ede a teoen Gy (where Gy and'x ating te HOMALS. math then HOMALS Gy (oronw, PCA onthe optimally sealed date mati Q that in Qh indicator matin again compressed eof wero a le HOMALS gives t-te is te elgemector Loa, where Ad ic ‘he example, these 0.45, Note equal othe second HOMALS Second elu of Sa 419 RELATIONS BETWEEN HONAIS AND LINEAR PCA ns 0.277 1.000 0.279 The elgerectors ore obit as _{ -a089 016 -0.577 ta=| “0998 370 “0.578 |+ 0.685 0405. 
O.378 ) the marix of component loadings gven by 0049 0,992 -0.385 ) onto a) Oe oie) Oe Ose 085 and the matrix of component scores is obtained as “0374 0243 -0.186 O19 -0.380 0.412 0.338 “0.070 “O.150 0378 02k8 -0.186 KeeQalaaz=} “O.047 “O88 “0.492 Gols 0579 -0.247, D518 0243 ie | 0.048 “0.029 “0.412 Bors 0579 247 Bes 0.029 “0.412, 126 3 HOMOGENEITY ANALYSIS 219 RELATIONS BETWEEN HOMALS AND LINEAR PCA wm Bertone the sum of cexplind in Seaton be er elena sma), Tis fer 3 aplied to the data as a whole or w the we data matrices for Second HOMALS 43.10 THE RELATIONSHIP BETWEEN HOMALS AND TOTAL va spat SneSQuAKE gw Jon 3 go the univariate marginals, i Seay genie as ceed Bow Some ogee oops D-MC—DuwD/gD-V2= weEW, {1 ANILLUSTRATION: HARTIGANS HARDWARE 29 ‘Table3.4a Marga hava vanes mn categses ‘The flowing ast Ahat kes from Hare reaa ni These result are confined inFgue 3.1 which depicts the discrimination ‘measures. This plot, too, shows tha he fast sension slated to variables 1 CTHREAD) an (3 (LENGTH. Vari F paras fom tacks. The second dimession ‘epates SCREW! and NAILE, bot being very lang om ths ese A ‘ 130 3 HOMOGENEITY ANALYSIS such variables stil dsemint, i enst be Deets inthe sample. Figure 3.12 shows te category qunttcaiony Figure 3:2 ae the centres of gravity of the object points assed with eacheat= enor. SLL ANLLUSTRATION:HARTIOANS HARDWARE a A more precise and dealied analysis is posite hy staying the x plots in Figure 3.13. Here the object ssores we ploted agin, but now labeled for ‘exch variable separately ong the label of Table 4, Fro these plos we se that vaabes Sand G have categories tat eannotbe separate very well at Jeastin he fist two dimensions). For the otervalbles the objects wih the Tme315 tot THESIS Cy ne : Geke Dinei Y Dien? 
Gees) Dineser Dine iy vag sous, “HEC ; a] RE te pour SCREWS 18 NALS i : # 8 as te ws ie 2 2 3 2 3 3 3 is B as T eM T ‘se Sm Ses eo = ea = eg = & 8 = eR = Ea a a =e sume abel form fury homogeneous ‘hat for vara nother aores We consider this result stistitory, although the HOMALS lore might not be small in 8 HOMOGENEITY ANALYSIS 0 2k jas harévar: itanton eases, “The second dimension pearly capita esl sow in he {OMALS soliton changes ens, The send SAL AN ILLUSTRATION: HARTIGANS HARDWARE im ve length 1 and the U head ae the only ob ‘hs ars ttt the nove eva om thr cents positon owas De ane, Oba, Figure 314 targa’ hardware: tera story (it 10 ome, ‘32 HOMALS WITH INCOMPLETE INDICATOR MATRIX 136 3 HOMOGENEITY ANALYSIS 2112 HOMALS WITH INCOMELETE INDICATOR MATRIX ry xemniGy, 650) Soch aquaniicaon is consent ith solution sed wep? -ww, os ow may 1x mn and yDy = my nolongerhave uD; 0, Tablas Oita & bts 8 HOMOGENEITY ANALYSIS 5.82 HOMALS WIE INCOMPLETE INDICATOR MATRIX 1 ‘Tyble 347 Resuls for mama exatle wil option (‘ising (i) Missing values matiple category. Her, to, the indictor mats is ‘eshte ep” ‘complete. Rests ae 318 and Figure 3.17. ee a Objet see Epwokere? 066s osm eres 06s ose Tyble 348 Roauts for nunercl example with option it ising aioe. oie eae " i # « i ag 2 4 ° 1 2 % ‘i Figure3.16 HOMALS solution with option 00 “Tnissing values single category. Epwalae? 077 06s) ing dia se randomly ‘vided over ober and categories, dliferenes beeen the thee opin wl Ma 3 HOMOGENEITY ANALY land the ftepestion of iscimination measures wil be amos the 5 forth complet ci), Figure 3.17 HOMALS sation wid ops ‘lie eacgon ng values 0 produce a quantitation ofthe missing dat, ta cverage object sere for abjects with tion eqal 0, since tent with the idea hat etegory qu within ta euegory ‘Suppose an ott ‘yrabes. Option (i) gives imensons. 
Open iw quantify it incomplete ver the completed Indiotor mati, Figures 318 and 3.19 pve reat forte example used it Section 2.3 (ie setinon problem), both forthe Fist two dimensions, Figere318 Hos ‘neopets ie 3 HOMOGENEITY ANALYSIS ‘io that in Figure 3.19 category poins for + and ~of each este gory ae the ‘opposites ofeach oer wih terpectto the engin eft pion gue 319 HOMALS solaton fa sist expe bse co completed indicstor matrix. " 343 REVERSED INDICATOR MATRIX {M3 REVERSED INDICATOR MATANK oy (©), perhaps elated to whether the tens were English speking or ot. Suds 2 sorts (W L) (CG), pechaps impliatng that he aor ‘before 1900 verses “ater 1900. judge 3 sos (W L) (C) (©), peeps ty county of bith. Tn the ‘eves indicator mati aproach he objets of eal ae them 2 he "aiables of analysis are the jules. The data matrix andthe revered nde 2 Fare 320 Hon sltn for min soning tk, Shs fornvesediicanr saa “ 3 BOMOGENEITY ANALY Table 19 Example of ong a 3 HOMOGENEITY AN [REVERSED INDICATOR MATRIX ro Wen tis dara mati i analysed te subjects and thee e pleted ia Figures 323 and ‘he reese indetes hat Me 8 HOMOGENE:TY ANALYSIS (Me ae dealing with a very dominant fet dimension om which the items ‘a pees ordered ascoting tote erder inthe data mati Tassos ‘we aequlte category poets fr the subjects: we notice fat a¢ Seer, 24 HOMALS sluton for transposed data sane carpe cangay pus Each det fom 10s bjt each eed mabe ‘aegis 1 and2 for sabet | have obned exactly he sine quafesions 8 tegres an’ fr sbje 6 Maco, al eat poter eo ‘ransfomedin he fist dimension ¥ 9 2 srnogue ‘Wheter or not we hve to analyse reversed indicate max isin pera of he use of HOMALS ean found, CHAPTER 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS. 441. 
METRIC PRINCIPAL COMPONENTS ANALYSIS

In Sections 3.2 and 3.6 we reviewed some of the history of homogeneity [...] factor analysis model, and better interpreted as a special case of principal components analysis.

4.2 NONMETRIC PRINCIPAL COMPONENTS ANALYSIS

[...] of the singular value decomposition. The Eckart-Young theorem [...] an n × m data matrix H can be formulated in [...]

[...] breakthrough in nonmetric scaling [...] Shepard showed that ordinal [...]

[...] of random variables (cf. Appendix A). Because of the [...] of the length of the projection of the ob[...] In Coombs (1964) the obvious nonmetric ex[...]

[...] is the preference strength of person j for object i. If $h_{ij}$ is the observed preference strength, then [...]

4.3 THEORY OF JOIN LOSS

In this section we study some properties of $\sigma_J(Q, X, A)$, using explicit normalization. The easiest way to do this is to determine the minimum of $\sigma_J(Q, X, A)$ over X and A for fixed Q ([...] theorem [...], Appendix B):

$\min_{X,A} \sigma_J(Q, X, A) = \sum_{s=p+1}^{m} \lambda_s(R(Q))$, (4.11)

where R(Q) stands for the correlation matrix of the transformed variables $q_j$ and $\lambda_s$ for its eigenvalues in decreasing order. So the smallest m - p eigenvalues are summed in (4.11). Clearly [...]

If we want to compare [...] components analysis with the homogeneity analysis discussed in Chapter 3, we must suppose in the first place that all variables are discrete. Thus the cones $C_j$ are subspaces defined by [...] (4.13), with the $G_j$ complete indicator matrices. If p = 1 then the theory of the [...]

We can illustrate some of these points with the example of Section 3.9.
not al of hey 43 THEORY OF 0NNLOSS ssormaneoncsanensra, | mel corepond with ther smallest nonirvlelgevaes, In Table 4.1 we give he solution for Qwithp™=\ and re corresponding correlation InTabie'2 tnt re the mars andthe eigeraies of re ROD iSforp™=2,r=\are gen Table ta Maric Q frp = ote on Tae 6. ant The lentes of te composing homepency nays ar: Ths eget one the argon in Table. the mast onthe sale nen fated 4 NONLINEAR FRNCIPAL COMPONENTS ANALYSIS 18) he minimam of converges cease all mesutenents he ‘wo diferent tps, Suppose we sar an sing the contains We isthe fi Second sep we compote new 9) forgiven X snd A. Thiscanbe dane foreach sepa Dotan i= 43 Tapory oF somos 16 Thispari taal exer guntiicaons is bane by ining (19) ve yume fa pn X aA Tes eos 2 ‘ails be pert SSQUGp/-Kap = SSAGH Xa) +0y-FD}Hj-F, eazy and consequently we veto minimize the second term onthe ight overall “Alieratvely we can loose he Daygavet lg mentioned Section a thee yo a © we canchange the order of the tree steps, and we ean pevorm a ‘numberof iterations of the two Daugavet sep befor computing new yy We ‘v0 ofthe more import choies of Cj eater ste as single nominal if canbe poly weighted mocotone regression othe vector Fj ce Appendix O, 168 4 NONLINEAR PRINCIPAL COMPONENTS AN L-COMFONENTS ANALYSIS 43 THEORY OF ONNLOSS a ‘Table 43 Tare teraton ofthe tos sgt — mle ‘Table 44 Thrce iterations of the tuee-step algorithm to minimize oy aie is 28 Uae ose ¥ “aah 08 0488 “2H 01” a5 “92m O17 os i 38 2 0S 3 Sg GS HE ae oe 3m 3 EE Se Se aS 48 sae - x ge 208 4439 are aust 238 too a tam ee oe Ge oe 38 Eee Bio ees gt gy om 8m om au BB Se ee Fe 33 = 23 rs «tee age ast an 2 Se @ we 3 We do no resic the missing par yP, where denotes missing. 
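The single ordinal step of the algorithm described in this section relies on weighted monotone regression. As a rough illustration, here is a minimal pool-adjacent-violators sketch. It assumes the unrestricted category quantifications are listed in the a priori category order and that the weights are the marginal frequencies. The function name and the example values are hypothetical.

```python
def monotone_regression(values, weights):
    """Weighted least squares fit of a nondecreasing sequence to `values`.

    Pool-adjacent-violators: neighbouring blocks that violate the order
    are merged and replaced by their weighted mean.
    """
    blocks = []  # each block holds [weighted mean, total weight, length]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, length in blocks:
        fitted.extend([mean] * length)
    return fitted


# Hypothetical example: the middle two categories are out of order and get tied.
fit = monotone_regression([1.0, 3.0, 2.0, 4.0], [1, 1, 1, 1])
```

The fit preserves the weighted mean of the input, so a centred vector of quantifications stays centred after the regression step.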
Seppose ‘hac (which contin the a pror Amerteleategry ves and hae the sumber of elements equal tothe number of nonmcsing ceegores) is ommalized in sch way tat ‘The sitoatonbesomes more compl lteady seenin Sesion 2 nu to we with the cones C change, bese we do not ieanty forthe missg values, For single nom ae for sige ordinal variables monotone regression ly performed over the nonmissing categories; fr the salsting etegrice of course, D? the pan of Dy corresponding wi ries, We coupe copy te comesponding elements off and afterwards we normalize, This Beau Di5j/uDja, om ‘alo oo rel eons, For shige numa vasles he tui 7. Shy more on the nnn par of denoted by 99 wre oy PH 6 * sands or teed, we egue oI eat wh a ; moons «an say 426 166 “4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS {1A THEORY OF MEET LOSS 1 44 THEORY OF MEET LOSS sd we he compte 9 by nora tvs contoqence ht chang ‘neil component poe otha solution. Ths option may be te 10 very unpleasant compat depron ny more ‘Mp, but becomes re much tore expensive, Using y= Gy we and Yost be compute ow-wioe Rw gM: ‘Those wil tke agprozately RS QROCK): Updating Yoel “eo Bs aan yPn=1 (aia) wed ag) Sone sinsle comput 168 4 NONLINEAR PRICIPAL COMPONENTS ANALYSIS: OMAYA)= oy, +o- 44) where for all variables single meet os MENA) «Wl, 8800-6. ‘ai lationship between mee os for singe variables snd jin oss holds 643) MOA) =p mr Sy ajy—2mrt 3506, here we have used WORK) = pand yy, 1,8 XX =1, from whi aly We obtain for in oss ORY) = 1 + wy 2015, aan lows easily. The fs that we ose Yj fr the ml the single quae ple, ot bth and (4.44) tha ining lent robes, with he sme solution int evans ing Os the can pote conto YEU some vases and ot ares I we pore tem favo ee then we wee doing components anal we hen we are doing homogeneiyexalyais ing to mit the two opsons and w give son ‘ons ao others maple qoatfcatons. 
Espo swantficationofen snot very natural, bresse seem #0 sugges ‘utegories can be Consequently we might compas numerical vrais, and mls quanltons Another advantage becomes clea it we pion for missing dats, Now ‘anaes single for nominal vari omc, ith nomaaton =m By aX - GY MK-Gyp, (49) UMEX=0, 44) XMOX= mf, 450) where 44 THEORY OF MET Loss 1 MEM, asp ‘The site relation besmeen yg an (equation 44 is no longer ue with this rearmea of mitsng ate "The alternating laa minimizing the geceaized og (euation 4.8) is stu with an X sttying the constants, Deine yappen., 452) ‘As we Sow thse are the unrestricted cononal minimizer of oy ofthe ‘category quantifcations of variable. The relevant part aruoned as nok —-GYyMYR-GyYD 2 GRPMK- G/N + wh) -pDKY;-¥, sy ‘hus misimlng ver Yj canbe done by minimining the second erm onthe ‘ght. Ifthe variable ismlipe nominal we saply set ¥j= 9, and eae fom, “We can also introduce a his point new naturally into the general framework, Fora jumns of ¥ a ether increasing o decreasing, As explained by Geteman 1950), we cantor require them alto be increasing. Conpatng Y, amouns solving two monotooe regression problems fr each column tnd to keep the best one Multiple ine ove yan: ‘of Dyn pees Squares nner erations to folve for y€ G), piven 318 Tar 3. given yeaa be done by deny 3} and foray, ven 80 Heyyy, 455) 10 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS Solving for yye 6) one dough ust oe 5rd ‘the conditionally optimal, ingle eatepory quantifiations restrictions, and uss the alternative paioning of (forte 458) given by Gia} Ty DGai- Tp +aja, Ths ang 9 ea ‘arsion eben ‘ese cs, and aquesion oa ‘Several comments are in order here assume hee Uat yyy for eects to noe X aA ys not eee onal idle tion wine foal. 
Fom te mean presaving propery ote epee wea Aine atersegesson SPM 8 He wasn grins we 8p) =wDJ;=uDjay/aa, =uDDFEXy/ ap, ass My=1forallj(o missing du, or option Hor it), then wD) «0 fom WX =0, and our metho ean be interpreted inte ot RG, m 44 THEORY OF MEET OSS X forgiven Ys, amounts tothe L=SMGN eo) then = (Me (ou Me M2, ast) x n!M RAM, {wil be found oat that X sven the ¥y, We find telat squares rompete homogeniy aalyis 4 ‘Uwe define Mi asthe dingo tix 1 hese of thse vectors q with he if objects ant are inte extepory conespon Myq is proportional gy for noma require Mq= gy. However under these 403) which rants he met ‘the uta lve scion m 4 NONLINEAR FRINCPAL COMPONENTS ANALYSIS 45 GROMETRY OF MEET LOSS sans for msm nt oss for ech vate have interning SCout nmerreations. Asin homogencityanlyis we sepesntehing, Sores as point lossconetbason imensiooal mace, Ifa variable is multe nooo MX) = GM bc ifand only i al object inte BMY Rave the same cbject seer, whichis then ot 145 GsoMETRY oF MET Loss az ‘vail s mute ordinal we partion the loss contribution 453) aK- GIy MK-G,Np + acK)-TyoK)-B, 65) 4.66) omer Sse eters tite omabemsonagen anos Figure 41 Moet os fr amalipe nmin varie Qu fs te corpo cates tein. sn homoge ‘ys we ge obecs ina caepry bean ogee Tae Figure 42 see oe Tot becane the quant rough orig, Observe tat for mulipl tina a there generally is Of rotation ofthe axes. For single variables there are m “NONLREARTRNCPAL COMPONENTS ANALYSIS 45 GEOMETRY OF MEET LOSS Ms ve se meet os fra singe namical variable the quanifistions should be Inherit rd ondeqal spaces. Asi ive pe sib lor Bed we ua ution wither Sanne a6 ak ce vena equation 456), tart a sen Oot ower ewan al ae crs concpontng wih haere sot sane ad we want clegory danifeaon noaieg see, set) e on ie though te ogn For singe nani en icin gre two lott connect ais ices Se Figur ¢2 Fe sing di! 
an mod oath osntitcacons ona ine thoeh toga ne an ‘nthe are way (seed by some cot), heaton or gore 4.30 Most oe for single eel variable nitonat Tecan fn quansienonafse Se eal epics ose atl: aston erp oer on he aed (Dot generally not 20) if the zero, which means that Gi = Xa, ‘Geometally this condion meas tht he objet points caesponing witha ¢ategoy mus be on paralie hypeplanesperpendicur tothe dteevon defined « SONLINEAR PRENGPAL COMPONENTS ANALYSIS com eason ht ‘roams scepismaliple nominal is that they are based os join os whereas PRINCALS is based on met oe, 4 “¢ NONLINEAR PRINCIPAL COMPONENTS ANALYSIS | 47 ANEXANDLE COMPARING MULTINLE AND SINGLE TREATMENT 4.7. ANEXAMPLE COMPARING MULTIPLE AND SINGLE. ‘NOMINAL TREATMENT Table 4 Gutman-Bel dite espn Vater = Yates = Yanbes = 4.9 ANEXAMELE COMPARING MULTIPLE AND SINGLE TREATMENT 181 2 Figure. HOMALS suonof te Gutman Bet da Length oftetines incu los or vanale s+ oboe emcpane i + NONLINEAR PRINCIPAL COMPONENTS ANALYSIS |g AN EXAMPLE WITH PREFERENCE RANK ORDERS 183 artic nic umeciry Werery wet eae how tne AN EXAMPLE WITH FREFERENCE RANK ORDERS ‘of the plots that can be made and some ofthe statistics that canbe computed. “ “ibe. Gn 2 dee Guilt Tobie 47 ees pretence rank oder of 9 pycoigis frien polo: nae — == el jou eter Renan, 196-152, By contention tow Semen fnthetbleinder ahh pene ol Tete eal ‘ible Gutman Bell deacon qunifiaiotortnes ‘Table 47 Rostan's oul pefeeoe sk odes JERE IAPR IFSP MVBR ICLP JEDP PET HORE BU WBE ‘Table Gutimn-Bed data objet scores CA Cute Betdaobetscons THOMAS, p= SgsseeesazagssasssesEZEzaccagNNII9E9999 ‘+ NONLINEAR FRINCPAL COMPONENTS ANALYSIS 468 AN EXAMPLE WITICPREFERENCE RANK ORDERS 8S JEXP: — Journel of Esper mental Poychology Upp: Tournal ap Psychology 2 3 i and Seca Paycholgy £-MVBR: Muliserate Behoira seen $ ICLP Journal of Consuting Poycotagy $ EDP: Sounal of Educator Pryce 1 Purr: Prychomarita 8. HURB: ihoman Beats 3. BULL: Pochoogial Butera 10. 
HUDE: Hunan Bescopoaen ‘Two PRINCALS analyes were performed, both in with ll variables single ns theft ‘aod he second wih al varbles sage Figure 47 Loading forthe 39 peeorn abel toon ide coca Be a nea solution the oar ‘Thad! group EXP, PMET, MVBR, JAPP, and 186 «4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS ‘development’ group JEDP, HUDE), anda sot! group (PSP, JCLP, ‘HURE), Is also possible to se joumaisaranged on « cuter svete nthe mi ‘oup and J moved away fom HUDE to the mile However, oer ile difference Seren the So solitons snot i is 4.9. ANEXAMPLE WITH DISCRETIZATION OF CONTINUOUS ‘VARIABLES, Figure 48 Loatings forthe 39 fs inthe single ‘nna sonia song Stree ‘the 39 psychologists ae given in Fignes 47 (gumericl Observe tat the arrows econ ofthe psychologss we have used 1968, p. 152), which makes it possible fad out a syhologs wares. The codes ate 18 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS ‘Table Cylinder pote: data AN EXAMOLE WITH DISCRETIZATION OF CONTINUGUS VARIABLES 189 ‘Table 4.0 Conelaion mat nd cients of PRINCALS oninal 01 01 03 0S 07 tt Figure 4.9 Te fico column of Tele 47 isk by row tess connected with be comependag PRINGAES cet 190 4 NONLINEAR PRINCIPAL COMPONENTS ANALYSIS fore re dimenional sie stn st Tharsene' eye pr The aw das are given in Tble 4. we have sete them ia Table 49, and this discretized matrix was aed |. CHAPTER 5 NONLINEAR GENERALIZED CANONICAL ANALYSIS 94 ‘5 NONLINEAR GENERALIZED CANONICAL ANALYSIS ':1-PREVIOUS WORK 185, Hicher= sin Mis Masaan=0, 3) cand for al previous solutions ag sin Suppose the Ht are K called canonical var fe K mati Z, with ityererion and by hich may involve nay uaton nny ion andthe edidonalsoltons can be ued to define comespondence nals the centroid principle ‘in be tong or weak. Strong sultans orthogonality consis require If ihe weak version merely requires that BRU =. weak erhogonalty conti Bereta = Ee Meals, focal previous solusors a fa [Re Ree Be Figure 5 Pond cos poder mars REE. 
~#eChe eth) PREVIOUS Wome ith 6s and ace both rnin indies for theives of th comelaton mats Ss od denoes an element of on for matching, by Davxals ard Pouse (1 Geer (1980), who ses the acron (simply change al sige i he smultncous ston’ by genrlizingto the cision BaD Caley Which mast be maximized over Ax under the reticto subproems are now onkogonal rorusts problene ot eyes of eign th computationally and ‘computational convenience indicate that MAXVAR and MINVAR are prety good endiates, $ NONLINEAR GENERALIZED CANONICAL ANALYSIS ‘Table S..a Consaton mais for 1 vale ree at “Tle S.A Tae 5. wcaiqus moked eotumanise MOOI MDOED HAS SORSUAL MENDET HARES woe sts Fr a a moe 38 oo’ 8 GGT wor $30 4 30 GS Mowe 3 3S veer Mee m2 '5 NONLINEAR GENERALIZED CANONICAL ANALYSIS The solutions for te ag are given in Table 5.1. For MAXMAX ‘and MINMIN we have applied the weat orthogonal const; forthe oser four techniques we have applied the trong ortho- {sonal contains. We donot ie any subtotal intrpresaons ‘hatwerestict each of the ¥ indvidally according to measurement level. "We have seen a Section 5.1 tha ter techniques would normalize tke Vj, Teve: we merely rar tha he solutions forte canonical weighs are exremely sn becomes obvious: 1 Kj this condidon Is aso ‘be triton a8 VT = satis the constrains hen YT will gency oly sts the constr ot “Ul lagonal mates Ith coin tus, then we ean aso minimize oyCEX-T = 1 Y, S80 EG AIT) ey 5b wth he sad ele ‘before. We shall use «sigh diferent formula, by in- ping the eondition thatthe Tare the sme within se and by wetng thera tsTp Alsolet = GN, oun eh snd define oy Sd XX oI. 
We 1 patoned indict 1 nina of oy ove X suing WK 0 ined matrix Yor each of the set, we cn expe the " ‘neon a5 cuf.¥) =F 880K GY, in 04 omen Geena canon 7” Deen crsscemaen] 20s x 5 cA 153 OVERALS AS A SPECIAL Ci ces Gy, which area x hy re replaced by a single “The m indiestor mauioes Gy, Bete tieyrt, where (224) denotes the geecalled inverse The proof follows from the Eckar-Young theorem. Thee onset S expicty Sseimpy the max that asi Se SS Se opis gle x Oy Formula that of generality hat Ryy = KL wich ove BCE EWC GN) =x, 2h ie BeGtviyatyt =x, haps, bat ts expecially imporantat his point, because sing Gk special cate can be recover "Show wt» soup of warble ean be reduced oa ingle viable nd that en Ze = GY the ¥, : * ‘Tab S.2 Parton intone ai ‘which gives principe component nays 53 OVERALS AS & SPECIAL CASE OF HOMALS: THEUSE OF INTERACTIVE VARIABLES ‘Weave seen in Chap 2 that its sometines useful to combine m variables With rook aegis no sngle vale with >=" he casas 206 § NONLINEAR GENERALIZED CANONICAL ANALYSIS ‘soigetncly at Ks: problems can be edaced to homogeneity analysis rolems (while, converey, we have een that ll emogeneg o ‘eset problems with ony oe varieleicach oa) Table $26 Inerave indir mi 72901 om aq au av tps tov ty im by Ge GY um yw fective variables must be done "any calegoris (the argunent phere the Sy ace the submatices ofS, which are (IR) 4 ‘54 sassnva DATA, = ESwi¥)= 3Sianas Yer igeat~ pa 528). These ae proper indicator mates, which sist Gh 5.4 MISSING DATA 208 4 NONLINEAR GENERALIZED CANONICAL ANALYSIS ‘Sf AN EXAMPLE: EFFECTS OF RADIOACTIVITY ON FSH 55 ALGORITHM CONSTRUCTION Fe Omigyopx, ‘as of he formas. The iting of X for Droblems, and we refer to Section 44 Tae component Ok GAY Myx Gt oss present new problems, because Gtisapationed indica nota simple indicator marx. 
This ehrateiste pie (GOCE snot diagonal, which complete the consacton of = «Gh mycty Xa Xj GY} [56 ANEXAMPLE: EFFECTS OF RADIOACTIVITY ON FISH rnp to ise an algsitim on computing i tend to become NX (p66, here both and are varies nse, in which the coneutions of the the ‘arable ae removed ftom X, Now the minimization of X,Y) with X ad Vs ned can be Cone by raining BK) GXYMAEK—Gyxp (528) ‘over, The condition waconstaied minima i ained for a0 the comer of tn egal ‘angle and the share closet heir aquarium, Again the same oo groups of variables canbe dsingushed 210 5 ee [NONLINEAR OENERALIZED CANONICAL ANALYSIS FS ANEXAMPLE: EFFECTS OF RADIOACTIVITY ON FISH au ‘Table s3 Fi dita tom anc ‘Bile itd tro a cxprinent by Ami (ta exeguined ty Calte nd 3 scare forte lah at nthe mame conse wih ie Conespondig pls in he avables and the eanonsalvarubleobect sores, and Table 5S gives them 22 | NONLINEAR GENERALIZED CANONICAL ANALYSIS ‘ables ng fora ie Cera losing for sng merical say'scomton of 16 S65 wo cancel vats meee ‘Hel ant re ft cna aches) Cri ii ‘componeat loadings ofthe mamaria volson in Figure 53 at he foadings ‘fhe ordinal solution n Figure 54, We can se foe the gues hatter ate four groops of vacales. The fst one consists of the ie variables, of which 1 Table Canonical outings fr sn ort Osis, agpt ong sino ani creiton of tom conical vate sco) Figure $3 singe vaesesl OVERALS solution fr he Fish (GE! 
[...] Bouroche and [...] eliminated it. Clearly our [...] refines the one given by Cailliez and Pagès, and our technique [...] intermediate between discriminant analysis and principal [...]

CHAPTER 6 NONLINEAR CANONICAL CORRELATION ANALYSIS

6.1 PREVIOUS WORK

There is considerably more historical material on the problem of [...] sets. The problem was probably formulated for the [...] convert people to using canonical correlation analysis have been written in [...]

6.2 THEORY

[...] with $Z_t = G_t Y_t$. We have [...] weak orthogonality constraint [...] maximizing the sum of the p largest eigen[...] in which we require that [...]. Because we can choose [...]

6.3 THE CANALS PROGRAM

The CANALS program does not fit naturally into the series HOMALS, PRINCALS, OVERALS, because it does not incorporate multiple variables and does not use indicator matrices. The latter is explained by the fact that CANALS dates back to a period in which we still wanted the possibility to [...] approach must be incorporated in the definitions of the cones $C_j$. Option [...] can be simulated if one codes missing data in one extra category per variable. It would be sensible to select the nominal option in this case when one does not know where to place this additional category among the nonmissing ones.

6.4 EXAMPLE 1: ECONOMIC INEQUALITY AND POLITICAL STABILITY

[...] are taken from a paper by Russett (196[...]), which [...] and Graham (1969).
The basic [...]:

$Q_1'Z_1$: the correlations between the transformed variables of the first set and the canonical variables of the first set;
$Q_1'Z_2$: the correlations between the transformed variables of the first set and the canonical variables of the second set;
$Q_2'Z_1$: the correlations between the transformed variables of the second set and the canonical variables of the first set;
$Q_2'Z_2$: the correlations between the transformed variables of the second set and the canonical variables of the second set.

The correlations between the canonical variables are called the canonical correlations; they are given in the diagonal matrix $Z_1'Z_2$. [...] $Q_1'Z_2 = Q_1'Z_1(Z_1'Z_2)$ and $Q_2'Z_1 = Q_2'Z_2(Z_1'Z_2)$, while $Z_1'Z_1 = Z_2'Z_2 = I$.

[...]

[Figure 6.1: Russett's original data: correlations in the [...] space of economic inequality.]
[Figure 6.2: Discretized variables: correlations in the [...] space of economic inequality.]

Figures 6.1 and 6.2 give the canonical loadings in the space of [...] variables, [...] correlations between the nine quantified variables [...] rankings [...]

[...] large percentage of agricultural labour (variables DEMO, LABO, and GNPR) with [...] democracies. The other dimension is more difficult [...] Solutions are quite different. They are also different from the solution reported in Gifi (1980, pp. 227-236), which seems to [...] that the stability is far from satisfactory for this example. The two new analyses are, as far as the [...]

[Figures: canonical scores in the space of the canonical variables, for Russett's original data and for the discretized variables.]

Figure 6.5 Russett's original
dats tantomat sbesity of amocty (65 EXAMPLE 2: PREDICTION OF A SCHOOL ACHIEVENENTTEST = 253, “Table 63 Test cores fusion of ex and fe’ pression 7 = ae ra Bo Se 8 Pa Be Bie 2 oe ie a0 @ mB la of Be an Boke a Be = mS 136 (of freedom (4). The imerpretation of the Pin = e+ Bat te 64) ‘wth pyje the proportion of individuals with father J ex end test sare & ‘Table 64 Loglnese madi tat have hen ted tthe CBS da ae 6 NONLINEAR CANONICAL. CORRELATION ANALYS'S ‘hese untested end often ‘Toble 6S Maine an interaon by sbtrction co 217) Ness sis si she ase a ape *p sw 826 Sspisie” Neston” analysis pot of view eatinat lbal significance tests We chisqoace, sig te ortogantl function sur with He sted mal Paar & Somnuyend. 3) ere qy and are the arginalpoponon and where Zara 66) 65 BXAMOLE2 PREDICTIONGR-A SCHOOL ACHIEVEMENT TEST 235, Moreover 249 =20= Yo swbivary. The stra mod oman EE Erase 6a) though the choie of orthogoalfontoasisielevat fom the point of alana @OnS=P+SP GOnS+P COmP psi eating ofp prryes (spss? se? 238 {6 NONLINEAR CANONICAL CORRE ATION ANALYSIS 66 EPLOGUE dcx Burg tod De Leeuw be computed by coresponince he 7 Sand 2x5 abies computed from Table 63 by homing, respectively, ov canonical aalyres Table 67. The imes the soared can {ear max, Anco ‘Yous, aay inneoca be ame sane vas cfebuest alsely 0 tah st ndependent from the other quantifi CHAPTER 7 ASYMMETRIC TREATMENT OF SETS: SOME SPECIAL CASES, SOME FUTURE PROGRAMS special casts make some inportntsimpliieatins possible Its also net out Ittenoe inthe chapier wo cevew the history of each ofthese techniques in considerable detail, ve mecely digas algoritims and planed pros. In fi epllogue some more reer developmen wil be meson. 74 MULTIPLE REGRESSION AND MORALS. Sapot ter = 2 ad mrp he end toi ny oe ontylg) “SSQG'y! Gay), we Gia couple intenr mati and 6! 
“18 DISCRIMINANT ANALYSIS AND CRBINALS 243 112 DISCRIMINANT ANALYSIS AND CRIMINALS In iseriminan analysis we also bave& single viable inthe second set, but now tis variables malple nominal Telos faredon is 1¥))=SSQUG!Y! -G3¥) a2 = GAY! an 2, = G3 the normalization is UZ = Oand ZZ = 1. tm, ya i fitted by defining the poate 0 tnd by observing that he sino of cy4(X!,¥2) over nonresticted Yo is ual 1p YEGIPAGIY! a ‘The mai taal ispersion of he c for alin se st set. FT is the ispersion mai of @ ~ Gy and Bi he betwezsroup dispersion max, ‘ten OMY A) =r ATA=tABA, SYMMETRIC TREATMENT OF SETS we minimize this over A withthe resiction A'TA sT, the ley the result OMY.) =p 2,93 crm, sviminant analysis amounts t choosing hey lergest eigenvalues of TI i maninizod, clearly alo possible in this case to hs ofthe eigenvalues of T-'which mun be opin \wehave no expeiece with any othe: cole ‘73 MULTIVARIATE ANALYSIS OF VARIANCE AND MANOVALS Suppose te parton indica max G! 5 fered ated, ina ‘14 PATH ANALYSIS AND PATHALS 1o some of our readers, and we give 8 very 7 ASTROMETRIC TREATMENT OF SeTs tion can be solved by Cholesky decom- ted Wy seeing dag B) = lve A from (12) a8, Be Hy-B)-HyA, Weave merely shown inthis sections ar ‘A path model becomes re the above dlgonal elements of Bar ‘Asreatr. wend oss function agen the som of squares of the residuals as MAB) =SsqaKA, se ecnn ay oes nat ocho OMY ADY) = SSQQAs ~ Oa, 248 7 ASYMMETRIC TREATSENT OF Ss ‘ equed, Tis seen poses {Spite by Roy (957.42) aad ieatywaenealan f oc ‘and mutple nonnemeicl variables 8 ae 7.6 SOME EXAMPLES Figure A wind ath model, Table Ress of completly notnear and abitve region assis on Sana eee = PETE) Es A aT esi Sees ocr a tao? on 2M ‘oe tm Latino 7 ASYMMETRIC TREATSENT OF 75 Table 7.3 Regression ress ‘mulpe nominal end p= 3 ° 12 3° 4. sg Pleure73 (Protest combination ater ddveeeson aa 253 “Table7.§ Pais andontnary sonar epi. PIOeT Poe 208 190 =a3 23 ta $3 177 EPILOGUE CHAPTER 8 MULTIDIMENSIONAL SCALING AND. 
CORRESPONDENCE ANALYSIS Mutivaciate analysis @4VA) developed along rather seperate Letos consider binary dats matrix (ot yet coded a a indicator mati) night have the following interpretation {81 HOMOGENEITY AND SEPARATION 257 Figure 82 Unfecing sion fordata nis ved for gu 83 sn, The igri pnt: chose tems ee always 258 MRULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS rangement of groups of objets frequent imply te posit of good ‘Seprtion,s0 that this goa isnot completly os. m3 4 tee Figure 83 Degseae ston for at mai aed for Fes 1s 62. ‘82 MINIMUM DISTANCE ANALYSIS OF HOMOGENEOUS GROUPS ‘OF OMECTS robe ‘became Fis a necessary (prone) lait ma. th next sen ‘we sal llow Fo conan any noangaive values whatsoeve, ia which cae itis called correspondence table inthe French tee Benzé, 1979), Table. Ministre exanpe pe 2 design ‘The goal ofthe analysis isto find configuration of poins ia pimen- sonal space, in which Bomoge losly De dove by interpreting asm {2 MONDEUMEDISTANCE ANALYSIS umber, property that is ot essential or the present derivation. [821 Scaling the row objects associated with F 0 if oj o4 woe-{ 1 otherwise, Tn atc, we defie foreach column a mas of weight, wit elements 260) NULTIDIENSIONAL SCALING AND CORRESPONDENCE ANALYSIS ‘as ahomogencous prop get a0 arbitary) dissimilar of ee and weight 220 representation af throw objecs. The ‘ingondl mates ofthe coluia tale of P. We may wate sing f cslana veour of F) 7 ‘Table ae Sum matix P of rank-one Mempotent Tle 8.2¢ Sm a uric was mentioned), lows tat the simplified es BoREs 5 % 262 __§ MULTIDIMENSIONAL SCALING AND CORRESPONDENCE ANALYSIS 182 MONDMUMDISTANCE ANALYSIS 108, = Fic Pie HKD ~ Bice a Gia Mg) and ¥*, What resins fovea ie rave soto (and therefore also i low-dimensional pap of them tat we examine npacce) isthe appro xeve = 9 DzrD tw, @2y ‘With respect graphical splays this approximation imps the following. 
We draw a line through [...] point and project the column points on it [...] be proportional [...] of that column point and [...] approximate the elements of the [...] the values (f_ij − e_ij)/e_ij, where e_ij [...] under the hypothesis of [...] expected value of the cell frequency.

Above it was found that χ² distances between rows of F are the same as the Euclidean distances between rows of [...]. An element of the inner matrix can be written as [...] (8.25)

[...] (8.28) gives the algebraic [...]. The numerator of the expression on the right [...] between row [...] the column proportion c_j/N. The denominator is the square root of the marginal proportion. Provided that row proportions can be interpreted as an estimate of the marginal proportions, we may write [...] (8.29)

[...] χ² distances. Also, this explains why the distances [...] were [...].

[...] = KΛL', (8.30)

and since [...] this matrix has trace equal to [...], so that we have the identity

X² = N Σ_s λ_s², (8.33)

which, on the assumption of independence between rows and columns of F, converges to χ² with (n − 1)(m − 1) degrees of freedom.

As a miniature example, let [...] with r' = (10 20 30), c' = (6 12 18 24), [...] N = 60. The expected [...]. The matrix with elements (f_ij − e_ij)/e_ij^(1/2) will be called N^(1/2)A:

N^(1/2)A =
   3.00   1.41  −0.58  −2.00
  −0.71  −1.00   0.82   0.35
  −1.15   0.00  −0.33   0.87

The sum of the squared elements of N^(1/2)A is X² = 19.82. A has the SVD solution KΛL' with K = [...] and eigenvalues λ1² = 0.307, λ2² = 0.023. This confirms X² = N Σ_s λ_s². The solution for X becomes [...]. X gives the representation of the rows of F: the Euclidean distances between rows of X are equal to the χ² distances between the rows of F. The squared distances are [...]. Figure 8.6 gives the joint plot. The distances between the rows in the plot are the χ² distances.
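The decomposition of X² in the miniature example can be retraced numerically. The sketch below is an illustration only: the table F is not legible in this copy, so the one used here is a reconstruction (an assumption) chosen to reproduce the stated margins r' = (10 20 30) and c' = (6 12 18 24), the value X² = 19.82 and the eigenvalues 0.307 and 0.023. The scaling X = N^(1/2) D_r^(−1/2) K Λ and the factor N in the χ² distance are likewise assumptions about the normalization.

```python
import numpy as np

# Reconstructed table (assumption): chosen to match the margins,
# chi-square and eigenvalues quoted in the text.
F = np.array([[4, 4, 2, 0],
              [1, 2, 8, 9],
              [1, 6, 8, 15]], dtype=float)

N = F.sum()                      # grand total
r = F.sum(axis=1)                # row margins
c = F.sum(axis=0)                # column margins
E = np.outer(r, c) / N           # expected frequencies under independence

# Matrix with elements (f_ij - e_ij) / sqrt(e_ij), i.e. N^(1/2) A.
chi_residuals = (F - E) / np.sqrt(E)
chi2 = (chi_residuals ** 2).sum()

# SVD of A; the table has rank 2 after centering, so keep two dimensions.
A = chi_residuals / np.sqrt(N)
K, lam, Lt = np.linalg.svd(A, full_matrices=False)
K, lam, L = K[:, :2], lam[:2], Lt[:2].T

# Row representation (assumed scaling): X = N^(1/2) D_r^(-1/2) K Lambda.
X = np.sqrt(N) * (K * lam) / np.sqrt(r)[:, None]


def chi2_distance_sq(i, k):
    """Squared chi-square distance between row profiles i and k, scaled by N."""
    d = F[i] / r[i] - F[k] / r[k]
    return N * np.sum(d ** 2 / c)


print("X^2            :", round(chi2, 2))
print("N * sum(lam^2) :", round(N * np.sum(lam ** 2), 2))
print("lam^2          :", np.round(lam ** 2, 3))
```

Under these assumptions the squared Euclidean distance between any two rows of X coincides with chi2_distance_sq for the same pair, which is the property used for the joint plot.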
Row points are the centroids of column points [...] of XY' and their [...] are proportional to the [...] on the line through [...]. The elements of XY' are

XY' =
   3.000   1.000  −0.333  −1.000
  −0.500  −0.500   0.333   0.125
  −0.667   0.000  −0.111   0.250

8.3.3 The centroid principle and reciprocal averaging

[...] = KΛL'L = KΛ, so that the SVD used in this chapter can be substituted to obtain [...]

8.4 CONTINGENCY AND CORRELATION

[...] (1960a, 1960b) has generalized some of these results for dependence between more than two variables [...] to contingency tables, Hirschfeld [...] contingency tables. Insofar as a correlation between [...] and [...] ([...], 1950; Williams [...]; Bock, 1960). Maung described the representation [...] (in our notation) with reference to the comparable case of continuous variates. Earlier, Mehler (1866) had formulated for this case the representation [...] if x and y are independent. [...] one-one mappings [...] then [...] binormal with correlation parameter ρ [...] where ψ_s are the Hermite-Chebyshev polynomials. The Maung equivalent [...] where [...] canonical [...] functions. (For further developments see Lancaster, 195[...].)

8.5 THE PROGRAM ANACOR

[...] defined in Section [...] SVD routine. The [...] X'D_rX = NΛ², Y'D_cY = NI [...] These options scale row points [...]
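The centroid principle and the reciprocal averaging relation of Section 8.3.3 can be checked on the miniature example. This is a sketch under assumptions: the table F is a reconstruction consistent with the margins and the XY' values quoted in the text, and the scalings X = N^(1/2) D_r^(−1/2) K Λ and Y = N^(1/2) D_c^(−1/2) L are assumed from the normalization X'D_rX = NΛ², Y'D_cY = NI.

```python
import numpy as np

# Reconstructed table (assumption), consistent with the quoted margins.
F = np.array([[4, 4, 2, 0],
              [1, 2, 8, 9],
              [1, 6, 8, 15]], dtype=float)

N, r, c = F.sum(), F.sum(axis=1), F.sum(axis=0)
E = np.outer(r, c) / N

# A = D_r^(-1/2) (F - E) D_c^(-1/2) and its SVD, truncated to rank 2.
A = (F - E) / np.sqrt(np.outer(r, c))
K, lam, Lt = np.linalg.svd(A, full_matrices=False)
K, lam, L = K[:, :2], lam[:2], Lt[:2].T

# Assumed normalization: X'D_r X = N Lambda^2 and Y'D_c Y = N I.
X = np.sqrt(N) * K * lam / np.sqrt(r)[:, None]
Y = np.sqrt(N) * L / np.sqrt(c)[:, None]

# Centroid principle: each row point is the weighted mean of the column
# points, with the row frequencies as weights.
row_centroids = (F @ Y) / r[:, None]

# Reciprocal averaging: averaging the row points per column gives the
# column points shrunk by the squared singular values.
col_centroids = (F.T @ X) / c[:, None]

print(np.allclose(row_centroids, X))
print(np.allclose(col_centroids, Y * lam ** 2))
print(np.round(X @ Y.T, 3))      # compare with the XY' matrix in the text
```

With this scaling X @ Y.T reproduces the relative deviations (f_ij − e_ij)/e_ij, which is why projections of column points on the line through a row point approximate that row of the table.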
8.5.2 An example with a similarity table

Figure 8.9[...] ANACOR solution for distances between [...]: dichotomized [...]

8.5.3 An example with a multidimensional contingency table

[...] As to political preference [...]:
CDA: any denominational party (KVP, ARP, CHU, GPV, SGP)
VVD: conservative-liberal
PvdA: labour party
PACO: any of the smaller left-wing parties (PSP, CPN)
D'66: pragmatic-liberal [...]

Figure 8.10a ANACOR solution for political preferences of [...] (diagonal [...] bivariate [...] of diagonal marginals).
Figure 8.10b ANACOR solution for political preferences of [...]
Figure 8.10c ANACOR solution for political preferences of [...]

[...] same quantification of categories [...] two variables [...]

8.6 THE PROGRAM ANAPROF: ANALYSIS OF PROFILE FREQUENCIES

[...] is unique. Substituting the row [...] premultiplying by [...] we [...] as shown in Figure 8.11. In the figure, CDA [...] PACO at the lower right, [...]

[...] (8.44) [...] a correspondence analysis on the table F = [...]. Its row sum marginals [...]. Substituting [...] gives us [...] (8.43) [...] groups of object scores to be [...] necessarily identical [...]

8.6.1 An example with binary survey data

[...] Theoretically, there are a total of 2⁶ [...]

Figure 8.12 ANAPROF solution for Sugiyama data on assumption of complete [...] data.
Figure 8.13 PCA solution [...]
8.6.2 Asymmetric treatment of response categories

In the binary case symmetric treatment of the response categories implies that the optimal [...]

Figure 8.14 ANAPROF solution for Sugiyama data on the assumption of [...] data.

[...] a matrix G of dimension [...] × 6 (one column only for each item, [...] 'yes' [...], but omitting the 'no' [...]) [...] more general [...]

Note that [...] the graph [...] profile [...] below. All rows of Table 8.9 together [...] The graph is given in Figure 8.16. All subsets of this graph [...]

Figure 8.17a HOMALS solution for [...] circumplex (patterns of Table 8.9) [...]
Figure 8.17b HOMALS solution for [...] (patterns of Table 8.9); [...]

8.7 SOME GAUGING RESULTS FOR BINARY DATA

[...] solution for the Sugiyama data. [...] 8.18 shows this for the [...] of the six elementary [...] To this end [...] joins the elementary points from 000001 to 100000 into a curve [...] the periphery of the plot. Note that 000000 does not [...] (a practice called dédoublement in the French literature) is [...] the appearance of mirror images on the northern and southern hemispheres that constitute the solution.

8.8 EPILOGUE

In this chapter homogeneity analysis has been discussed basically as an MDS technique that makes low-dimensional displays of data and focuses on the distances between points in these displays. [...] developments a step further [...] The minimum squared distance formulation of homogeneity [...] extensively in Heiser ([...]) [...] models has been proposed by De Leeuw [...]

CHAPTER 9 MODELS AS GAUGES FOR THE ANALYSIS OF BINARY DATA

We have seen many times that binary variables are [...]
[...] will become clear when we introduce some of the more common gauges.

Figure 9.1 Item characteristic curves in a general latent trait model.

9.2 MONOTONE LATENT TRAIT MODELS

Suppose [...] an unobservable latent trait [...] such that p_j(t) = AVE(y_j | t) is [...] called the [...] item characteristic curve [...] where the expectation is taken with respect to the distribution of the latent trait. We also assume conditional or local independence: for all j ≠ l we have [...], which implies [...] = AVE(p_j p_l). Because [...] it is clear that COV(y_j, y_l) ≥ 0, and [...] correlations [...] corresponding nonnegative eigenvector. So the assumption of a latent trait model imposes some [...] on the observed correlation matrix, but not much. Figure 9.1 shows the item characteristic curves in a general latent trait model [...]

9.2.1 Holomorph items

[...] single and multiple; this means that applying HOMALS comes to the same as applying linear PCA: there is only one correlation matrix.

9.2.2 The Guttman scale

Guttman (1944, 1950a, 1950b) introduced a model with items that constitute a breakpoint on the continuum: when an individual is located on the left-hand side of the breakpoint, the item will be answered incorrectly; when located on the right-hand side, the item will be answered correctly. The additional assumptions [...] mentioned above are written as [...]

Figure 9.4 Item characteristic curve of a Guttman item.

[...] holomorphy [...] There are perfectly holomorph items with [...] can be constructed by [...] Rasch [...] Guttman scale, also called simplex. The [...] properties of the simplex are depicted in Figure 9.5.

Figure 9.5 [...] properties of [...]

[...] functions, one for each eigenvector [...] corresponding with elements of y are monotone [...] They [...] are random variables defined on a common probability space, [...] elements that are supposed to be ordered.
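The eigenvector structure of a perfect Guttman scale (the simplex) can be explored with a small numerical sketch. Everything specific here is an assumption made for illustration: the latent trait is taken to be uniform on (0, 1), the six breakpoints b are hypothetical, and the correlation between items j and l (with b_j below b_l) is derived under those assumptions as the square root of b_j(1 − b_l) / ((1 − b_j) b_l).

```python
import numpy as np

# Hypothetical breakpoints of six Guttman items on a uniform(0, 1) trait.
b = np.array([0.10, 0.25, 0.40, 0.60, 0.70, 0.90])
m = len(b)

# Item j is answered correctly when the trait exceeds b[j]. For b[j] < b[l]:
# cov = b[j] * (1 - b[l]) and var_j = b[j] * (1 - b[j]), which gives the
# correlation below (derived under the uniform-trait assumption).
R = np.eye(m)
for j in range(m):
    for l in range(j + 1, m):
        R[j, l] = R[l, j] = np.sqrt(b[j] * (1 - b[l]) / ((1 - b[j]) * b[l]))

# Eigen-decomposition, ordered from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]


def sign_changes(v, tol=1e-10):
    """Number of sign changes along v, ignoring numerically zero entries."""
    s = np.sign(v[np.abs(v) > tol])
    return int(np.sum(s[1:] != s[:-1]))


changes = [sign_changes(eigvecs[:, k]) for k in range(m)]
print(changes)
```

In this sketch the first eigenvector has no sign change, the second has one, the third two, and so on, so that plotting successive eigenvectors against the item order gives a monotone curve, then a curve with one bend, and so forth.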
[...] continuously differentiable [...], however, [...] somewhat [...]. The p_j(t) must be between 0 and 1 [...] unbounded on the real line [...]

[...] Spearman hierarchy are quite different. Suppose [...] ordered [...]. The eigenvalues of R = pp' + Δ² then satisfy [...]

Table 9.5 Spearman hierarchy [...]

[...] Table 9.5 has [...] row, R in its next nine rows, and [...] the eigenvalues [...] last row. Table 9.5 has the eigenvectors of R, which are [...]

Figure 9.10 Eigenvectors of Spearman correlation matrix.

The pattern in the eigenvectors [...] differs from the [...] when perfect discrimination is possible: Spearman shows us what happens when discrimination [...]

9.2.5 The Rasch model

[...] Guttman [...] step function, but it does not step from 0 to 1 [...]. Also define [...]. If we integrate [...] the desired result. [...] in Section 9.2.1 that perfectly holomorph items, in this case Rasch items, can have small [...] have the same eigenvector properties [...]

9.3 NONMONOTONIC LATENT TRAIT MODELS

[...] more realistic to suppose that the p_j(t) are unimodal. [...] for example, [...] with R [...] totally positive, with Δ² diagonal, and with [...]. To use [...]'s (1972) example: people who like cold beverages usually do not want them to be completely frozen, and people who like their coffee hot do not want it to be actually boiling.

The model corresponding with the [...] is [...]
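The Spearman structure R = pp' + Δ² can be inspected numerically. The sketch below assumes nine loadings p = 0.1, ..., 0.9 and unit diagonal, so that Δ² = I − diag(p²); these values are illustrative choices and not a transcription of Table 9.5. It checks two properties: the eigenvalues of R interlace with the diagonal elements of Δ², and the dominant eigenvector is positive and, for these loadings, ordered like p.

```python
import numpy as np

# Illustrative loadings (assumption): nine items with p = 0.1, ..., 0.9.
p = np.arange(1, 10) / 10.0
delta2 = 1.0 - p ** 2                     # unique variances, unit diagonal
R = np.outer(p, p) + np.diag(delta2)      # Spearman correlation matrix

# Eigen-decomposition, ordered from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Fix the arbitrary sign of the dominant eigenvector.
first = eigvecs[:, 0]
first = first * np.sign(first.sum())

# Rank-one update of a diagonal matrix: eigenvalues interlace with the
# sorted diagonal elements of delta2.
d_sorted = np.sort(delta2)[::-1]
upper_ok = np.all(eigvals >= d_sorted - 1e-12)
lower_ok = np.all(eigvals[1:] <= d_sorted[:-1] + 1e-12)

print("eigenvalues       :", np.round(eigvals, 3))
print("interlacing holds :", bool(upper_ok and lower_ok))
print("first eigenvector :", np.round(first, 3))
```

Only the largest eigenvalue can escape above the diagonal of Δ²; all others are trapped between successive unique variances, which is one way to see why a single dominant dimension emerges in this model.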
