
Descriptive statistics of the louisevie dataset

This document presents a descriptive statistical analysis of a dataset named louisevie. It describes the loading, cleaning and preparation of the data, including the creation of new variables. Descriptive statistics and tests are then used to describe the variables.

Uploaded by septimus.pierre

1. Import the dataset named louisevie

> louisevie<-read.table(file=file.choose(), header=FALSE, sep="")


> dim(louisevie)
[1] 200 7
> names(louisevie)
[1] "V1" "V2" "V3" "V4" "V5" "V6" "V7"

2. Name the variables

> colnames(louisevie)<-c("sexe", "age", "abo", "sitfam", "soc", "zau", "sal")


> attach(louisevie)

1. Descriptive statistics for the dataset:

a. Mean, standard deviation, minimum and maximum for all continuous variables in the dataset:

> summary(louisevie)
    sexe          age             abo            sitfam          soc
 Femme:124   Min.   :18.00   Min.   :0.000   Min.   :1.00   Cadre  :32
 Homme: 76   1st Qu.:35.00   1st Qu.:0.000   1st Qu.:1.00   Employe:88
             Median :44.00   Median :1.000   Median :1.00   Ouvrier:80
             Mean   :42.52   Mean   :0.545   Mean   :1.63
             3rd Qu.:49.00   3rd Qu.:1.000   3rd Qu.:2.00
             Max.   :59.00   Max.   :1.000   Max.   :3.00
      zau            sal
 Min.   :1.00   Min.   : 3678
 1st Qu.:2.00   1st Qu.:13789
 Median :2.00   Median :17038
 Mean   :3.03   Mean   :18615
 3rd Qu.:4.00   3rd Qu.:22523
 Max.   :7.00   Max.   :35972

b. Mean, standard deviation, minimum and maximum of the continuous variables by subscription type (abo)

> sd(zau)
[1] 1.90453
> sd(sal)
[1] 7082.492
> summary(louisevie [louisevie [,"abo"]=="0","age"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
31.00 34.00 39.00 40.27 46.00 51.00
> sd(louisevie [louisevie [, "abo"]=="0","age"])
[1] 6.752886
> summary(louisevie [louisevie [,"abo"]=="1","age"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
18.00 38.00 46.00 44.39 52.00 59.00
> sd(louisevie [louisevie [,"abo"]=="1","age"])
[1] 9.628294
> summary(louisevie [louisevie [,"abo"]=="0","zau"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 2.000 2.000 3.055 4.500 7.000
> sd(louisevie [louisevie [,"abo"]=="0","zau"])
[1] 1.928515
> summary(louisevie [louisevie [,"abo"]=="1","zau"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 2.000 2.000 3.009 4.000 7.000
> sd(louisevie [louisevie [,"abo"]=="1","zau"])
[1] 1.892947
> summary(louisevie [louisevie [,"abo"]=="0","sal"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
3678 12936 16227 17654 21320 35374
> sd(louisevie [louisevie [,"abo"]=="0","sal"])
[1] 7150.625
> summary(louisevie [louisevie [,"abo"]=="1","sal"])
Min. 1st Qu. Median Mean 3rd Qu. Max.
6296 15528 18190 19418 23030 35972
> sd(louisevie [louisevie [,"abo"]=="1","sal"])
[1] 6956.164
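The per-group calls above can be condensed into a single grouped summary. The sketch below is only an illustration: the louisevie file is not available here, so `demo` is a made-up data frame that borrows the column names `abo`, `age` and `sal`, and `stats` is a hypothetical helper.

```r
# Made-up stand-in for louisevie (assumption: same column names, invented rows)
demo <- data.frame(
  abo = rep(c(0, 1), times = c(4, 6)),
  age = c(31, 34, 46, 51, 18, 38, 46, 52, 59, 44),
  sal = c(3678, 12936, 21320, 35374, 6296, 15528, 18190, 23030, 35972, 19000)
)

# Hypothetical helper returning the four statistics asked for in the exercise
stats <- function(x) c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))

# One row per abo group; age and sal each become a matrix of four statistics
by_abo <- aggregate(cbind(age, sal) ~ abo, data = demo, FUN = stats)
print(by_abo)
```

On the real data the same call would be `aggregate(cbind(age, zau, sal) ~ abo, data = louisevie, FUN = stats)`, which should reproduce the figures printed above in one table.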

c. Frequency tables (one-way tabulations) for all qualitative variables

> table(sexe)
sexe
Femme Homme
124 76
> length(sexe)
[1] 200
> prop.table(table(sexe))*100
sexe
Femme Homme
62 38
> table(abo)
abo
0 1
91 109
> length(abo)
[1] 200
> prop.table(table(abo))*100
abo
0 1
45.5 54.5
> table(sitfam)
sitfam
1 2 3
101 72 27
> length(sitfam)
[1] 200
> prop.table(table(sitfam))*100
sitfam
1 2 3
50.5 36.0 13.5
> table(soc)
soc
Cadre Employe Ouvrier
32 88 80
> length(soc)
[1] 200
> prop.table(table(soc))*100
soc
Cadre Employe Ouvrier
16 44 40
> table(zau)
zau
1 2 3 4 5 7
31 97 3 21 24 24
> length(zau)
[1] 200
> prop.table(table(zau))*100
zau
1 2 3 4 5 7
15.5 48.5 1.5 10.5 12.0 12.0

d. Cross-tabulate subscription (abo) against sex (sexe), then test the correlation between the two variables.

> addmargins(table(sexe,abo))
       abo
sexe      0   1 Sum
  Femme  67  57 124
  Homme  24  52  76
  Sum    91 109 200
> Tab<-table(sexe,abo)
> addmargins(prop.table(addmargins(Tab,1),1),2)*100
       abo
sexe            0         1       Sum
  Femme  54.03226  45.96774 100.00000
  Homme  31.57895  68.42105 100.00000
  Sum    45.50000  54.50000 100.00000
> addmargins(prop.table(addmargins(Tab,2),2),1)*100
       abo
sexe            0         1       Sum
  Femme  73.62637  52.29358  62.00000
  Homme  26.37363  47.70642  38.00000
  Sum   100.00000 100.00000 100.00000
> genre<-ifelse(sexe=="Homme",1,0)
> cor.test(genre,abo)

Pearson's product-moment correlation

data: genre and abo


t = 3.1561, df = 198, p-value = 0.001848
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.08262675 0.34706143
sample estimates:
cor
0.2188588
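
Because both variables are 0/1 indicators, this Pearson correlation is the phi coefficient of the 2x2 table, so it can be checked from the four cell counts alone. The sketch below rebuilds one row per respondent from the counts in the cross-tabulation; the object names are illustrative.

```r
# Cell counts copied from the sexe x abo cross-tabulation above
counts <- c(femme_0 = 67, femme_1 = 57, homme_0 = 24, homme_1 = 52)

# Rebuild one observation per respondent: genre = 1 for Homme, abo = 0/1
genre <- rep(c(0, 0, 1, 1), times = counts)
abo   <- rep(c(0, 1, 0, 1), times = counts)

# Pearson correlation of the two indicators
r <- cor(genre, abo)

# Phi coefficient: (n11*n00 - n10*n01) / sqrt(product of the four margins)
phi <- (52 * 67 - 24 * 57) / sqrt(76 * 124 * 91 * 109)

print(c(r = r, phi = phi))
```

Both values should agree with the `cor` estimate reported by `cor.test`.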

2. Create new variables

> abon<-ifelse(abo=="0",1,0)
> fem<-ifelse(sexe=="Femme",1,0)
> hom<-ifelse(sexe=="Homme",1,0)
> marie<-ifelse(sitfam=="1",1,0)
> celib<-ifelse(sitfam=="2",1,0)
> divor<-ifelse(sitfam=="3",1,0)
> cadre<-ifelse(soc=="Cadre",1,0)
> empl<-ifelse(soc=="Employe",1,0)
> ouv<-ifelse(soc=="Ouvrier",1,0)
> urbain<-ifelse(zau<=3,1,0)
> Femu<-ifelse(sexe=="Femme" & urbain=="1",1,0)
> Femr<-ifelse(sexe=="Femme" & urbain=="0",1,0)
> Homu<-ifelse(sexe=="Homme" & urbain=="1",1,0)
> Homr<-ifelse(sexe=="Homme" & urbain=="0",1,0)
> saldiv<-sal/1000
> saldiv2<-(sal^2)/100000
> logsal<-log(sal)
> age2<-age^2
> logage<-log(age)

3. Check that these variables were created correctly

> table(abo + abon)

1
200
> table(fem + hom)

1
200
> table(marie + celib + divor)

1
200
> table(cadre + ouv + empl)

1
200
> cbind(zau, urbain)
zau urbain
[1,] 3 1
[2,] 3 1
[3,] 3 1
[4,] 4 0
[5,] 4 0
[6,] 4 0
[7,] 4 0
[8,] 4 0
[9,] 4 0
[10,] 4 0
[11,] 4 0
[12,] 4 0
[13,] 4 0
[14,] 2 1
[15,] 2 1
[16,] 2 1
[17,] 1 1
[18,] 1 1
[19,] 1 1
[20,] 1 1
[21,] 1 1
[22,] 1 1
[23,] 1 1
[24,] 1 1
[25,] 2 1
[26,] 2 1
[27,] 2 1
[28,] 2 1
[29,] 2 1
[30,] 2 1
[31,] 2 1
[32,] 2 1
[33,] 2 1
[34,] 2 1
[35,] 2 1
[36,] 2 1
[37,] 2 1
[38,] 2 1
[39,] 2 1
[40,] 2 1
[41,] 2 1
[42,] 2 1
[43,] 2 1
[44,] 2 1
[45,] 2 1
[46,] 2 1
[47,] 2 1
[48,] 5 0
[49,] 5 0
[50,] 5 0
[51,] 5 0
[52,] 2 1
[53,] 2 1
[54,] 2 1
[55,] 2 1
[56,] 2 1
[57,] 2 1
[58,] 2 1
[59,] 2 1
[60,] 2 1
[61,] 2 1
[62,] 2 1
[63,] 2 1
[64,] 2 1
[65,] 2 1
[66,] 2 1
[67,] 2 1
[68,] 2 1
[69,] 2 1
[70,] 2 1
[71,] 2 1
[72,] 2 1
[73,] 2 1
[74,] 2 1
[75,] 2 1
[76,] 2 1
[77,] 2 1
[78,] 2 1
[79,] 5 0
[80,] 5 0
[81,] 5 0
[82,] 5 0
[83,] 5 0
[84,] 5 0
[85,] 5 0
[86,] 5 0
[87,] 7 0
[88,] 1 1
[89,] 1 1
[90,] 1 1
[91,] 7 0
[92,] 7 0
[93,] 7 0
[94,] 7 0
[95,] 7 0
[96,] 2 1
[97,] 1 1
[98,] 1 1
[99,] 2 1
[100,] 2 1
[101,] 1 1
[102,] 1 1
[103,] 2 1
[104,] 1 1
[105,] 2 1
[106,] 2 1
[107,] 5 0
[108,] 1 1
[109,] 5 0
[110,] 5 0
[111,] 5 0
[112,] 1 1
[113,] 5 0
[114,] 5 0
[115,] 7 0
[116,] 7 0
[117,] 7 0
[118,] 7 0
[119,] 7 0
[120,] 7 0
[121,] 7 0
[122,] 7 0
[123,] 7 0
[124,] 7 0
[125,] 7 0
[126,] 7 0
[127,] 7 0
[128,] 1 1
[129,] 1 1
[130,] 1 1
[131,] 5 0
[132,] 5 0
[133,] 1 1
[134,] 1 1
[135,] 1 1
[136,] 1 1
[137,] 5 0
[138,] 5 0
[139,] 5 0
[140,] 4 0
[141,] 4 0
[142,] 1 1
[143,] 4 0
[144,] 4 0
[145,] 4 0
[146,] 4 0
[147,] 4 0
[148,] 7 0
[149,] 5 0
[150,] 2 1
[151,] 2 1
[152,] 2 1
[153,] 2 1
[154,] 2 1
[155,] 2 1
[156,] 2 1
[157,] 2 1
[158,] 7 0
[159,] 1 1
[160,] 7 0
[161,] 1 1
[162,] 7 0
[163,] 7 0
[164,] 4 0
[165,] 1 1
[166,] 1 1
[167,] 4 0
[168,] 4 0
[169,] 4 0
[170,] 2 1
[171,] 2 1
[172,] 2 1
[173,] 2 1
[174,] 2 1
[175,] 2 1
[176,] 2 1
[177,] 2 1
[178,] 1 1
[179,] 2 1
[180,] 2 1
[181,] 2 1
[182,] 2 1
[183,] 2 1
[184,] 2 1
[185,] 2 1
[186,] 2 1
[187,] 2 1
[188,] 2 1
[189,] 2 1
[190,] 2 1
[191,] 2 1
[192,] 2 1
[193,] 2 1
[194,] 2 1
[195,] 2 1
[196,] 2 1
[197,] 2 1
[198,] 2 1
[199,] 2 1
[200,] 2 1
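
A shorter way to run the same check is to cross-tabulate `zau` against `urbain` instead of listing all 200 rows. The sketch below is an illustration only: it rebuilds `zau` from the `table(zau)` counts reported earlier rather than from the original file.

```r
# zau values rebuilt from the table(zau) counts reported earlier
zau <- rep(c(1, 2, 3, 4, 5, 7), times = c(31, 97, 3, 21, 24, 24))
urbain <- ifelse(zau <= 3, 1, 0)

# Each zau level should appear in exactly one urbain column
check <- table(zau, urbain)
print(check)
```

Levels 1 to 3 should show counts only under `urbain = 1`, and levels 4, 5 and 7 only under `urbain = 0`.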
> table((Femu + Femr) - fem)

0
200
> table((Homu + Homr) - hom)

0
200
> table(saldiv*1000 - sal)

-9.09494701772928e-13 0 1.81898940354586e-12
1 197 2
> table(sqrt(saldiv2*100000) - sal)

-3.63797880709171e-12 -1.81898940354586e-12 0
1 7 184
9.09494701772928e-13 1.81898940354586e-12 3.63797880709171e-12
1 4 3
> table(sqrt(age2) - age)

0
200
> table(exp(logage)- age)

-1.4210854715202e-14  -7.105427357601e-15 -3.5527136788005e-15
                   4                   83                    7
                   0  3.5527136788005e-15   7.105427357601e-15
                  57                    5                   32
 1.4210854715202e-14  2.1316282072803e-14
                   8                    4
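
The non-zero differences above are floating-point rounding, not errors in the new variables. A tolerance-based comparison makes that explicit. This is a minimal sketch on a few illustrative ages, not the real column.

```r
# Illustrative ages (assumption: not the real louisevie column)
age <- c(18, 35, 44, 49, 59)
logage <- log(age)

# Exact comparison can fail because exp(log(x)) is only approximately x
exact <- identical(exp(logage), age)

# all.equal() compares within a small tolerance (about 1.5e-8 by default)
approx <- isTRUE(all.equal(exp(logage), age))

# Largest absolute discrepancy, for reference
max_abs_diff <- max(abs(exp(logage) - age))
print(c(approx = approx, max_abs_diff = max_abs_diff))
```

On the real data, `all.equal(exp(logage), age)` and `all.equal(saldiv*1000, sal)` would be the equivalent checks.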

4. Estimate the logit model

> LogitAboCst<-glm(abo ~ 1, family = binomial(link="logit"))


> summary(LogitAboCst)

Call:
glm(formula = abo ~ 1, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-1.255 -1.255 1.102 1.102 1.102

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.1805 0.1420 1.271 0.204

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 275.64 on 199 degrees of freedom
AIC: 277.64

Number of Fisher Scoring iterations: 3

> BetaAboCst<-coef(LogitAboCst)
> P1<-exp(BetaAboCst)/(1+exp(BetaAboCst))
> LogitAbonCst<-glm(abon ~ 1, family = binomial(link="logit"))
> summary(LogitAbonCst)

Call:
glm(formula = abon ~ 1, family = binomial(link = "logit"))
Deviance Residuals:
Min 1Q Median 3Q Max
-1.102 -1.102 -1.102 1.255 1.255

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.1805 0.1420 -1.271 0.204

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 275.64 on 199 degrees of freedom
AIC: 277.64

Number of Fisher Scoring iterations: 3

> BetaAbonCst<-coefficients(LogitAbonCst)
> P0<-exp(BetaAbonCst)/(1+exp(BetaAbonCst))
> P0
(Intercept)
0.455
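
With only a constant, the logit intercept is the log-odds of the sample proportion, so both fits above can be checked by hand from the `table(abo)` counts (91 non-subscribers, 109 subscribers). The sketch below uses those counts; the variable names are illustrative.

```r
# Counts from table(abo): 91 zeros and 109 ones
n0 <- 91
n1 <- 109

# Intercept of abo ~ 1 is the log-odds of abo = 1
beta0 <- log(n1 / n0)

# Its standard error for a single binomial proportion
se0 <- sqrt(1 / n1 + 1 / n0)

# plogis() is the inverse logit, exp(x) / (1 + exp(x))
P1 <- plogis(beta0)    # share of subscribers
P0 <- plogis(-beta0)   # share of non-subscribers (the abon ~ 1 model)

print(round(c(beta0 = beta0, se0 = se0, P1 = P1, P0 = P0), 4))
```

The values should match the reported intercept (0.1805), its standard error (0.1420) and P0 = 0.455.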

6. Estimation with explanatory variables

A single explanatory variable: sex

> LogitSexe<-glm(abo ~ fem, family = binomial(link="logit"))


> summary(LogitSexe)

Call:
glm(formula = abo ~ fem, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-1.5183 -1.1096 0.8712 1.2468 1.2468

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.7732 0.2468 3.133 0.00173 **
fem -0.9348 0.3056 -3.059 0.00222 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 265.89 on 198 degrees of freedom
AIC: 269.89

Number of Fisher Scoring iterations: 4

> BetaSexe<-coef(LogitSexe)
> Pfem<-exp(BetaSexe[1] + BetaSexe[2])/(1+exp(BetaSexe[1] + BetaSexe[2]))
> Pfem
(Intercept)
0.4596774
> BetaSexe<-coef(LogitSexe)
> Phom<-exp(BetaSexe[1])/(1+exp(BetaSexe[1]))
> Phom
(Intercept)
0.6842105
> OddH<-Phom/(1-Phom)
> OddH
(Intercept)
2.166667
> OddF<-Pfem/(1-Pfem)
> OddF
(Intercept)
0.8507463
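
With a single binary regressor the logit model is saturated, so its coefficients follow directly from the cross-tabulation of sexe and abo: the intercept is the log-odds for men and the `fem` coefficient is the log of the odds ratio. The sketch below rebuilds the data from the four cell counts and compares the fitted coefficients with the closed-form values; object names other than `abo` and `fem` are illustrative.

```r
# Cell counts from the sexe x abo table: Femme (67, 57), Homme (24, 52)
counts <- c(67, 57, 24, 52)
fem <- rep(c(1, 1, 0, 0), times = counts)
abo <- rep(c(0, 1, 0, 1), times = counts)

fit <- glm(abo ~ fem, family = binomial(link = "logit"))
b <- coef(fit)

# Closed-form values from the odds in each group
odds_h <- 52 / 24          # odds of subscribing, men
odds_f <- 57 / 67          # odds of subscribing, women
by_hand <- c(intercept = log(odds_h), fem = log(odds_f / odds_h))

# Odds ratio women vs men
odds_ratio <- exp(b[["fem"]])
print(rbind(glm = unname(b), by_hand = by_hand))
print(odds_ratio)
```

The odds ratio is the ratio `OddF / OddH` of the two quantities computed above.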

5. Estimation with explanatory variables

> Logitmodel1<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + age + sal, family =
binomial(link="logit"))
> summary(Logitmodel1)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
age + sal, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.3577 -0.8211 0.3776 0.7868 1.6585

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.736e+00 1.145e+00 -2.389 0.01687 *
fem -1.141e+00 3.816e-01 -2.989 0.00280 **
marie 1.209e+00 4.929e-01 2.453 0.01417 *
celib -4.820e-01 5.283e-01 -0.912 0.36156
cadre 1.753e+00 6.443e-01 2.721 0.00652 **
empl -5.122e-01 3.950e-01 -1.297 0.19470
urbain -1.486e-01 3.648e-01 -0.408 0.68363
age 8.629e-02 2.211e-02 3.902 9.55e-05 ***
sal -1.855e-05 2.753e-05 -0.674 0.50044
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 209.36 on 191 degrees of freedom
AIC: 227.36

Number of Fisher Scoring iterations: 4

> Logitmodel2<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + age + saldiv, family =
binomial(link="logit"))
> summary(Logitmodel2)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
age + saldiv, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.3577 -0.8211 0.3776 0.7868 1.6585

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.73595 1.14502 -2.389 0.01687 *
fem -1.14055 0.38161 -2.989 0.00280 **
marie 1.20914 0.49293 2.453 0.01417 *
celib -0.48205 0.52833 -0.912 0.36156
cadre 1.75291 0.64432 2.721 0.00652 **
empl -0.51224 0.39500 -1.297 0.19470
urbain -0.14865 0.36477 -0.408 0.68363
age 0.08629 0.02211 3.902 9.55e-05 ***
saldiv -0.01855 0.02753 -0.674 0.50044
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 209.36 on 191 degrees of freedom
AIC: 227.36

Number of Fisher Scoring iterations: 4
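
Logitmodel1 and Logitmodel2 differ only in the unit of salary: the `saldiv` coefficient is 1000 times the `sal` coefficient, while the z value, deviance and AIC are unchanged. The sketch below illustrates this on simulated data, since the original file is not available. The data-generating values are arbitrary assumptions.

```r
set.seed(42)

# Simulated salaries and a binary outcome (assumption: arbitrary parameters)
n <- 200
sal <- round(runif(n, 4000, 36000))
y <- rbinom(n, 1, plogis(-1 + 5e-05 * sal))

# Same model with salary in units and in thousands
m_raw <- glm(y ~ sal, family = binomial(link = "logit"))
saldiv <- sal / 1000
m_div <- glm(y ~ saldiv, family = binomial(link = "logit"))

# The slope is rescaled by 1000; the fit itself is identical
ratio <- unname(coef(m_div)["saldiv"] / coef(m_raw)["sal"])
same_fit <- isTRUE(all.equal(deviance(m_raw), deviance(m_div)))
print(c(ratio = ratio, same_fit = same_fit))
```

Rescaling mainly helps readability: a coefficient of -0.01855 per thousand is easier to read than -1.855e-05 per unit.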

> Logitmodel3<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + age + age2 + saldiv + saldiv2,
family = binomial(link="logit"))
> summary(Logitmodel3)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
age + age2 + saldiv + saldiv2, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.3084 -0.8019 0.1074 0.8211 1.9342

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 16.4043850 6.4066398 2.561 0.01045 *
fem -1.0203878 0.4067757 -2.508 0.01213 *
marie 0.9201744 0.5174169 1.778 0.07534 .
celib -0.7646153 0.5815115 -1.315 0.18855
cadre 1.5893378 0.6846792 2.321 0.02027 *
empl -0.6905511 0.4203864 -1.643 0.10045
urbain -0.1508788 0.3894397 -0.387 0.69844
age -0.9119657 0.3134578 -2.909 0.00362 **
age2 0.0120903 0.0038443 3.145 0.00166 **
saldiv 0.0878204 0.1356691 0.647 0.51743
saldiv2 -0.0002430 0.0003136 -0.775 0.43827
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 194.72 on 189 degrees of freedom
AIC: 216.72

Number of Fisher Scoring iterations: 6
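
In Logitmodel3 the coefficient on `age` is negative and the one on `age2` is positive, so the estimated log-odds of subscribing are U-shaped in age. The turning point of a quadratic b1*age + b2*age^2 is at -b1 / (2*b2). A small sketch using the printed estimates (variable names are illustrative):

```r
# Estimates copied from summary(Logitmodel3)
b_age  <- -0.9119657
b_age2 <-  0.0120903

# Age at which the estimated log-odds reach their minimum
age_min <- -b_age / (2 * b_age2)
print(age_min)
```

Under this model, and holding the other regressors fixed, the estimated probability of subscribing falls up to this age and rises afterwards. The value lies inside the observed age range of 18 to 59.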

> Logitmodel4<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + logage + logsal, family =
binomial(link="logit"))
> summary(Logitmodel4)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
logage + logsal, family = binomial(link = "logit"))
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3864 -0.8067 0.3883 0.8450 1.6142

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.9307 5.3756 -1.661 0.096640 .
fem -1.1393 0.3831 -2.974 0.002940 **
marie 1.2387 0.4904 2.526 0.011551 *
celib -0.4824 0.5223 -0.924 0.355692
cadre 1.6911 0.6395 2.644 0.008182 **
empl -0.4704 0.3888 -1.210 0.226252
urbain -0.1317 0.3596 -0.366 0.714158
logage 3.0719 0.8607 3.569 0.000358 ***
logsal -0.2027 0.4904 -0.413 0.679370
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 212.87 on 191 degrees of freedom
AIC: 230.87

Number of Fisher Scoring iterations: 4

> quantile(age, probs=c(0, 0.01 , 0.05, 0.1, 0.25, 0.5 , 0.75 , 0.9, 0.95, 0.99, 1))
0% 1% 5% 10% 25% 50% 75% 90% 95% 99% 100%
18.00 26.95 30.00 31.00 35.00 44.00 49.00 52.20 58.00 59.00 59.00
> age35<-ifelse(age<=35,1,0)
> age45<-ifelse(age>35 & age<=45,1,0)
> age60<-ifelse(age>45,1,0)
> table(age35 + age45 + age60)

1
200
> quantile(sal, probs=c(0, 0.01 , 0.05, 0.1, 0.25, 0.5 , 0.75 , 0.9, 0.95, 0.99, 1))
      0%       1%       5%      10%      25%      50%      75%      90%      95%      99%     100%
 3678.00  6530.63  8359.00 10725.50 13789.00 17037.50 22522.50 28848.70 33448.90 35787.00 35972.00
> sal15<-ifelse(sal<=15000,1,0)
> sal225<-ifelse(sal>15000 & sal<=22500,1,0)
> sal30<-ifelse(sal>22500 & sal<=30000,1,0)
> salsup<-ifelse(sal>30000,1,0)
> table(sal15 + sal225 + sal30 + salsup)

1
200
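
The same bands can be built in one step with `cut()`, which avoids writing each boundary twice. The sketch below uses a few illustrative ages and the same cut points as above. The labels reuse the names `age35`, `age45` and `age60`; `age_band` and `dummies` are hypothetical names.

```r
# Illustrative ages chosen to sit on and around the cut points
age <- c(18, 35, 36, 45, 46, 59)

# Intervals are right-closed by default: (-Inf,35], (35,45], (45,Inf)
age_band <- cut(age, breaks = c(-Inf, 35, 45, Inf),
                labels = c("age35", "age45", "age60"))

# One 0/1 column per band, equivalent to the ifelse() dummies
dummies <- model.matrix(~ age_band - 1)

# Cross-check one band against the ifelse() definition used above
age45 <- ifelse(age > 35 & age <= 45, 1, 0)
print(table(age_band))
```

Passing the factor `age_band` directly to `glm()` would also work: R then drops the first level as the reference category, which matches omitting `age35` in Logitmodel5.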
> Logitmodel5<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + age45 + age60 + sal225 + sal30 +
salsup, family = binomial(link="logit"))
> summary(Logitmodel5)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
age45 + age60 + sal225 + sal30 + salsup, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.1555 -0.7690 0.3513 0.8202 1.7381

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.8928 0.7631 -1.170 0.242033
fem -0.9723 0.3896 -2.496 0.012577 *
marie 1.3257 0.5278 2.512 0.012017 *
celib -0.2714 0.5460 -0.497 0.619072
cadre 1.9748 0.6860 2.879 0.003993 **
empl -0.5426 0.3993 -1.359 0.174201
urbain -0.1790 0.3780 -0.474 0.635848
age45 1.0952 0.5124 2.137 0.032582 *
age60 1.7658 0.4564 3.869 0.000109 ***
sal225 0.5910 0.4169 1.418 0.156277
sal30 -0.4469 0.5844 -0.765 0.444442
salsup -0.0087 0.7472 -0.012 0.990711
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 207.24 on 188 degrees of freedom
AIC: 231.24

Number of Fisher Scoring iterations: 4

> Logitmodel6<-glm(abo ~ Femu + Femr + Homu + age + age2 + marie + celib + cadre + empl + saldiv +
saldiv2, family = binomial(link="logit"))
> summary(Logitmodel6)

Call:
glm(formula = abo ~ Femu + Femr + Homu + age + age2 + marie +
celib + cadre + empl + saldiv + saldiv2, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.2200 -0.8103 0.1021 0.8136 1.9393
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 16.2383641 6.4650965 2.512 0.01202 *
Femu -0.9976735 0.5797050 -1.721 0.08525 .
Femr -0.6171168 0.6473748 -0.953 0.34046
Homu 0.2538972 0.6416142 0.396 0.69231
age -0.9216167 0.3167366 -2.910 0.00362 **
age2 0.0122138 0.0038852 3.144 0.00167 **
marie 0.9340910 0.5194527 1.798 0.07214 .
celib -0.8145164 0.5867700 -1.388 0.16510
cadre 1.6176862 0.6859913 2.358 0.01837 *
empl -0.6305470 0.4273594 -1.475 0.14009
saldiv 0.0951634 0.1355617 0.702 0.48268
saldiv2 -0.0002576 0.0003126 -0.824 0.40991
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 194.09 on 188 degrees of freedom
AIC: 218.09

Number of Fisher Scoring iterations: 6
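
For a 0/1 response, the AIC printed by `summary()` equals the residual deviance plus twice the number of estimated coefficients. The sketch below recomputes it for the six logit models from the printed deviances, which gives a quick side-by-side comparison; the data frame `models` is an illustrative construct.

```r
# Residual deviances and coefficient counts read off the summaries above
models <- data.frame(
  model    = paste0("Logitmodel", 1:6),
  deviance = c(209.36, 209.36, 194.72, 212.87, 207.24, 194.09),
  k        = c(9, 9, 11, 9, 12, 12)   # intercept included
)

# AIC = residual deviance + 2 * number of coefficients
models$aic <- models$deviance + 2 * models$k

# Lowest AIC first
print(models[order(models$aic), ])
best <- models$model[which.min(models$aic)]
```

By this criterion the specification with quadratic terms in age and salary ranks first, with the sex-by-area specification close behind.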

> vcov(Logitmodel6)
             (Intercept)          Femu          Femr          Homu
(Intercept) 41.797472688 -3.662931e-01 -4.710581e-01 -0.0045996203
Femu        -0.366293072  3.360578e-01  2.594732e-01  0.2459708281
Femr        -0.471058104  2.594732e-01  4.190941e-01  0.2430897581
Homu        -0.004599620  2.459708e-01  2.430898e-01  0.4116688275
age         -1.984283628  4.974367e-03  3.191492e-03 -0.0112259396
age2         0.024092113 -9.218309e-05 -5.707143e-05  0.0001335229
marie       -0.604738801 -1.370098e-02 -8.193036e-03  0.0134681942
celib       -0.834055801 -1.656787e-02 -3.421170e-02 -0.0364548447
cadre        0.158578546 -2.904523e-02 -4.179343e-02  0.0500394168
empl        -0.343816369 -2.360264e-03  1.244309e-02  0.0446480091
saldiv      -0.122016351  4.964772e-03  1.836028e-02 -0.0056715076
saldiv2      0.000261332 -2.124185e-06 -3.547225e-05  0.0000176759
                      age          age2         marie         celib
(Intercept) -1.984284e+00  2.409211e-02 -6.047388e-01 -8.340558e-01
Femu         4.974367e-03 -9.218309e-05 -1.370098e-02 -1.656787e-02
Femr         3.191492e-03 -5.707143e-05 -8.193036e-03 -3.421170e-02
Homu        -1.122594e-02  1.335229e-04  1.346819e-02 -3.645484e-02
age          1.003221e-01 -1.226524e-03  1.824812e-02  3.355463e-02
age2        -1.226524e-03  1.509508e-05 -2.063893e-04 -3.931868e-04
marie        1.824812e-02 -2.063893e-04  2.698311e-01  2.032042e-01
celib        3.355463e-02 -3.931868e-04  2.032042e-01  3.442991e-01
cadre       -5.162942e-03  9.527985e-05 -3.418169e-03 -1.367901e-01
empl         1.625846e-02 -2.190399e-04  2.120221e-02 -4.855065e-02
saldiv      -2.901064e-03  3.684612e-05  6.145067e-05 -2.791954e-03
saldiv2      6.438946e-06 -8.515026e-08  1.506216e-06  8.595541e-06
                    cadre          empl        saldiv       saldiv2
(Intercept)  1.585785e-01 -3.438164e-01 -1.220164e-01  2.613320e-04
Femu        -2.904523e-02 -2.360264e-03  4.964772e-03 -2.124185e-06
Femr        -4.179343e-02  1.244309e-02  1.836028e-02 -3.547225e-05
Homu         5.003942e-02  4.464801e-02 -5.671508e-03  1.767590e-05
age         -5.162942e-03  1.625846e-02 -2.901064e-03  6.438946e-06
age2         9.527985e-05 -2.190399e-04  3.684612e-05 -8.515026e-08
marie       -3.418169e-03  2.120221e-02  6.145067e-05  1.506216e-06
celib       -1.367901e-01 -4.855065e-02 -2.791954e-03  8.595541e-06
cadre        4.705840e-01  1.217114e-01 -1.132606e-02  9.845642e-06
empl         1.217114e-01  1.826361e-01 -4.557175e-03  1.028776e-05
saldiv      -1.132606e-02 -4.557175e-03  1.837698e-02 -4.147838e-05
saldiv2      9.845642e-06  1.028776e-05 -4.147838e-05  9.773574e-08
> varFemu<-vcov(Logitmodel6)[2,2]
> varFemr<-vcov(Logitmodel6)[3,3]
> covFemuFemr<-vcov(Logitmodel6)[2,3]
> varFemuFemr<- varFemu + varFemr - 2*covFemuFemr
> seFemuFemr<-sqrt(varFemuFemr)
> BetaFemu<-coef(Logitmodel6)[2]
> BetaFemr<-coef(Logitmodel6)[3]
> Z <- (BetaFemu - BetaFemr)/seFemuFemr
> Z
Femu
-0.7830228
> pvalue1<-2*pt(Z, 200-12, lower.tail=TRUE)
> pvalue1
Femu
0.434599
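
The test above compares the Femu and Femr coefficients using Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b). The sketch below redoes the arithmetic from the printed estimates and covariance entries. It uses `-abs(z)` so the two-sided p-value stays valid whatever the sign of the statistic, and it also shows the normal-based p-value, a common alternative for logit models. Variable names are illustrative.

```r
# Coefficients from summary(Logitmodel6)
b_femu <- -0.9976735
b_femr <- -0.6171168

# Variances and covariance from vcov(Logitmodel6)
v_femu    <- 0.3360578
v_femr    <- 0.4190941
cov_fu_fr <- 0.2594732

# Standard error of the difference and the test statistic
se_diff <- sqrt(v_femu + v_femr - 2 * cov_fu_fr)
z <- (b_femu - b_femr) / se_diff

# Two-sided p-values: Student t with 200 - 12 df (as above) and standard normal
p_t    <- 2 * pt(-abs(z), df = 200 - 12)
p_norm <- 2 * pnorm(-abs(z))

print(round(c(z = z, p_t = p_t, p_norm = p_norm), 4))
```

Either way the difference between urban and rural women is far from significant at the usual levels.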
> Logitmodel7<-glm(abo ~ fem + marie + celib + cadre + empl + urbain + age + age2 + saldiv + saldiv2,
family = binomial(link="logit"))
> summary(Logitmodel7)

Call:
glm(formula = abo ~ fem + marie + celib + cadre + empl + urbain +
age + age2 + saldiv + saldiv2, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.3084 -0.8019 0.1074 0.8211 1.9342
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 16.4043850 6.4066398 2.561 0.01045 *
fem -1.0203878 0.4067757 -2.508 0.01213 *
marie 0.9201744 0.5174169 1.778 0.07534 .
celib -0.7646153 0.5815115 -1.315 0.18855
cadre 1.5893378 0.6846792 2.321 0.02027 *
empl -0.6905511 0.4203864 -1.643 0.10045
urbain -0.1508788 0.3894397 -0.387 0.69844
age -0.9119657 0.3134578 -2.909 0.00362 **
age2 0.0120903 0.0038443 3.145 0.00166 **
saldiv 0.0878204 0.1356691 0.647 0.51743
saldiv2 -0.0002430 0.0003136 -0.775 0.43827
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 194.72 on 189 degrees of freedom
AIC: 216.72

Number of Fisher Scoring iterations: 6

> Xmoyen<-cbind(1,0,0,0,0,0,1,42.5, 42.5^2 , 18590/1000 , (18590^2)/100000 )


> Beta7<-coef(Logitmodel7)
> Pmoyen<-exp(Xmoyen%*%Beta7)/(1+exp(Xmoyen%*%Beta7))
> Pmoyen
[,1]
[1,] 0.531402
> Xbis<-cbind(1,0,1,0,1,0,0,50, 50^2 , 30000/1000 , (30000^2)/100000)
> Pbis<-exp(Xbis%*%Beta7)/(1+exp(Xbis%*%Beta7))
> Pbis
[,1]
[1,] 0.9818123
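
The positional vectors `Xmoyen` and `Xbis` must follow the coefficient order of the model exactly, which is easy to get wrong. The sketch below redoes both predictions with named values and `plogis()`, using the coefficients printed in `summary(Logitmodel7)`. The helper `make_profile` is hypothetical and not part of the original session.

```r
# Coefficients copied from summary(Logitmodel7)
beta <- c("(Intercept)" = 16.4043850, fem = -1.0203878, marie = 0.9201744,
          celib = -0.7646153, cadre = 1.5893378, empl = -0.6905511,
          urbain = -0.1508788, age = -0.9119657, age2 = 0.0120903,
          saldiv = 0.0878204, saldiv2 = -0.0002430)

# Hypothetical helper: builds a named regressor vector, including the
# derived terms age2, saldiv and saldiv2 as defined earlier in the session
make_profile <- function(fem, marie, celib, cadre, empl, urbain, age, sal) {
  c("(Intercept)" = 1, fem = fem, marie = marie, celib = celib, cadre = cadre,
    empl = empl, urbain = urbain, age = age, age2 = age^2,
    saldiv = sal / 1000, saldiv2 = sal^2 / 100000)
}

x_moyen <- make_profile(0, 0, 0, 0, 0, 1, 42.5, 18590)
x_bis   <- make_profile(0, 1, 0, 1, 0, 0, 50, 30000)

# Match by name, then apply the inverse logit
p_moyen <- plogis(sum(x_moyen[names(beta)] * beta))
p_bis   <- plogis(sum(x_bis[names(beta)] * beta))
print(round(c(p_moyen = p_moyen, p_bis = p_bis), 4))
```

Small differences from the values above come from using coefficients rounded to seven decimals. With the fitted object at hand, `predict(Logitmodel7, newdata = ..., type = "response")` would be the more usual route.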

7. Estimate a probit model

> ProbitSexe<-glm(abo ~ hom, family = binomial(link="probit"))


> summary(ProbitSexe)

Call:
glm(formula = abo ~ hom, family = binomial(link = "probit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-1.5183 -1.1096 0.8712 1.2468 1.2468
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.1012 0.1128 -0.898 0.36925
hom 0.5808 0.1876 3.096 0.00196 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 265.89 on 198 degrees of freedom
AIC: 269.89

Number of Fisher Scoring iterations: 4

> Prob1<-pnorm(-0.1012 + 0.5808)


> Prob1
[1] 0.6842441
> Prob0<-pnorm(-0.1012)
> Prob0
[1] 0.4596958
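
As with the logit, a probit with a single binary regressor reproduces the group proportions, so its coefficients can be recovered from the cross-tabulation with `qnorm()`, the inverse of `pnorm()`. A sketch from the cell counts (57 of 124 women and 52 of 76 men subscribe); variable names are illustrative.

```r
# Observed subscription shares by sex, from the sexe x abo table
p_fem <- 57 / 124
p_hom <- 52 / 76

# Probit intercept: normal quantile of the share among women (hom = 0)
b0 <- qnorm(p_fem)

# Coefficient on hom: difference of the two normal quantiles
b1 <- qnorm(p_hom) - qnorm(p_fem)

print(round(c(intercept = b0, hom = b1), 4))
```

This also explains why `Prob1` and `Prob0` equal the `Phom` and `Pfem` values from the logit model up to rounding of the coefficients.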
> Probitmodel<-glm(abo ~ fem + age + age2 + marie + celib + cadre + empl + urbain + saldiv + saldiv2,
family = binomial(link="probit"))
> summary(Probitmodel)

Call:
glm(formula = abo ~ fem + age + age2 + marie + celib + cadre +
empl + urbain + saldiv + saldiv2, family = binomial(link = "probit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.25886 -0.80121 0.05407 0.83502 1.92147

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 9.4792485 3.7221752 2.547 0.01087 *
fem -0.6072863 0.2370584 -2.562 0.01041 *
age -0.5334554 0.1817832 -2.935 0.00334 **
age2 0.0071238 0.0022246 3.202 0.00136 **
marie 0.5736498 0.3099084 1.851 0.06417 .
celib -0.3978892 0.3419977 -1.163 0.24466
cadre 0.8440002 0.3923651 2.151 0.03147 *
empl -0.4432831 0.2491456 -1.779 0.07520 .
urbain -0.0798572 0.2292545 -0.348 0.72759
saldiv 0.0531477 0.0794605 0.669 0.50359
saldiv2 -0.0001479 0.0001833 -0.807 0.41964
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)

Null deviance: 275.64 on 199 degrees of freedom
Residual deviance: 194.24 on 189 degrees of freedom
AIC: 216.24

Number of Fisher Scoring iterations: 7
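
Logit and probit coefficients are on different scales, so they should not be compared directly. A common rule of thumb is that logit estimates are roughly 1.6 to 1.8 times the probit ones. The sketch below computes the ratios for the slope coefficients of Logitmodel7 and Probitmodel from the printed estimates, matching them by name because the two formulas list the regressors in a different order.

```r
# Slope estimates from summary(Logitmodel7)
logit <- c(fem = -1.0203878, marie = 0.9201744, celib = -0.7646153,
           cadre = 1.5893378, empl = -0.6905511, urbain = -0.1508788,
           age = -0.9119657, age2 = 0.0120903,
           saldiv = 0.0878204, saldiv2 = -0.0002430)

# Slope estimates from summary(Probitmodel)
probit <- c(fem = -0.6072863, age = -0.5334554, age2 = 0.0071238,
            marie = 0.5736498, celib = -0.3978892, cadre = 0.8440002,
            empl = -0.4432831, urbain = -0.0798572,
            saldiv = 0.0531477, saldiv2 = -0.0001479)

# Align by name before dividing
ratio <- logit / probit[names(logit)]
print(round(ratio, 2))
```

All signs agree, and the ratios stay between about 1.5 and 2, which suggests the two link functions describe these data in much the same way.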
