ANOVA of Financial and Industrial Statistics

This document contains the results of one-way analyses of variance (ANOVA) performed on the financial, manufacturing, hospital, and consumer food databases. The ANOVAs determine whether there are significant differences between groups, such as company types or industry groups, for selected indicators.

Uploaded by aqillah

Business Statistics by Ken Black, 7th edition

Answers to Analyzing the Databases Questions in Chapter 11

1. Consider the financial database:


The first one-way ANOVA is performed to determine if there is a significant difference in
Earnings Per Share according to Type of Company. This can be solved using either
Minitab or Excel. Shown here is the Minitab output for this problem:

One-way ANOVA: Earnings per Share versus Type

Source DF SS MS F P
Type 6 32.05 5.34 3.46 0.004
Error 93 143.72 1.55
Total 99 175.76

S = 1.243 R-Sq = 18.23% R-Sq(adj) = 12.96%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ---+---------+---------+---------+------
1 10 2.076 0.906 (---------*---------)
2 19 2.471 1.706 (------*------)
3 13 1.563 0.798 (--------*-------)
4 8 1.448 0.393 (----------*----------)
5 15 1.634 0.942 (-------*-------)
6 16 2.776 1.314 (-------*------)
7 19 2.954 1.432 (------*------)
---+---------+---------+---------+------
0.80 1.60 2.40 3.20

Note that the p-value is .004, thereby indicating statistical significance at α = .01.
Without conducting multiple comparisons (which were not asked for), we can only
visually see that company type 7 (petroleum) has the highest average earnings per
share (mean = 2.954) and company type 4 (grocery) has the lowest (mean = 1.448).
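
The remaining figures in the Minitab table follow from the sums of squares and degrees of freedom. The sketch below is a minimal illustration, not Minitab's own routine; the function name `anova_summary` is hypothetical, and the inputs are the rounded SS values printed above, so the results agree with the output only to rounding.

```python
import math

def anova_summary(ss_between, df_between, ss_error, df_error):
    """Derive MS, F, S, R-Sq and R-Sq(adj) from a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    ss_total = ss_between + ss_error
    df_total = df_between + df_error
    return {
        "MS_between": ms_between,
        "MS_error": ms_error,
        "F": ms_between / ms_error,
        "S": math.sqrt(ms_error),  # pooled standard deviation
        "R_sq": 100 * ss_between / ss_total,
        "R_sq_adj": 100 * (1 - ms_error / (ss_total / df_total)),
    }

# Earnings per Share versus Type: SS and DF copied from the output above.
eps = anova_summary(ss_between=32.05, df_between=6, ss_error=143.72, df_error=93)
for name, value in eps.items():
    print(f"{name}: {value:.3f}")
```

The printed F, S, and R-Sq values can be compared directly with the Minitab line "S = 1.243 R-Sq = 18.23% R-Sq(adj) = 12.96%".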

The second one-way ANOVA is performed to determine if there is a significant
difference in Dividends Per Share according to Type of Company. Shown here is the
Minitab output for this problem:

One-way ANOVA: Dividends per Share versus Type

Source DF SS MS F P
Type 6 8.724 1.454 3.94 0.001
Error 93 34.301 0.369
Total 99 43.026

S = 0.6073 R-Sq = 20.28% R-Sq(adj) = 15.13%


Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ------+---------+---------+---------+---
1 10 0.6720 0.6780 (------*-------)
2 19 1.0653 0.7953 (----*-----)
3 13 1.4823 0.5053 (------*-----)
4 8 0.6100 0.2685 (-------*--------)
5 15 0.7340 0.4916 (------*-----)
6 16 0.6931 0.4946 (-----*-----)
7 19 1.2168 0.6732 (----*-----)
------+---------+---------+---------+---
0.50 1.00 1.50 2.00

Note that the p-value is .001 (as rounded by Minitab), thereby indicating statistical significance at α = .01.
Without conducting multiple comparisons (which were not asked for), we can only
visually see that company type 3 (electric power) has the highest average dividends per
share (mean = 1.4823), and company type 4 (grocery) has the lowest (mean = .6100).
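
The ANOVA table can also be rebuilt from the group summary lines (N, Mean, StDev) alone. The following is a sketch under the usual one-way ANOVA identities; the helper name `anova_from_summary` is an assumption introduced here, and the data are the rounded values printed in the output above.

```python
# (N, Mean, StDev) for company types 1 through 7, copied from the output above.
dividend_groups = [
    (10, 0.6720, 0.6780),
    (19, 1.0653, 0.7953),
    (13, 1.4823, 0.5053),
    (8, 0.6100, 0.2685),
    (15, 0.7340, 0.4916),
    (16, 0.6931, 0.4946),
    (19, 1.2168, 0.6732),
]

def anova_from_summary(groups):
    """Return (SS between, SS error, F) from per-group n, mean and stdev."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Between-groups SS: weighted squared deviations of group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Error SS: each sample variance times its degrees of freedom.
    ss_error = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_error = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_error / df_error)
    return ss_between, ss_error, f_stat

ss_between, ss_error, f_stat = anova_from_summary(dividend_groups)
print(f"SS Type = {ss_between:.3f}, SS Error = {ss_error:.3f}, F = {f_stat:.2f}")
```

The results should agree with the Type and Error rows of the table (SS = 8.724 and 34.301, F = 3.94) up to rounding of the printed means and standard deviations.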

The third one-way ANOVA is performed to determine if there is a significant difference
in Average P/E Ratio according to Type of Company. Shown here is the Minitab output
for this problem:

One-way ANOVA: Average P/E Ratio versus Type

Source DF SS MS F P
Type 6 11486 1914 0.33 0.918
Error 93 533995 5742
Total 99 545481

S = 75.78 R-Sq = 2.11% R-Sq(adj) = 0.00%

Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev +---------+---------+---------+---------
1 10 17.85 5.85 (---------------*---------------)
2 19 25.27 20.70 (----------*-----------)
3 13 22.30 18.34 (------------*-------------)
4 8 21.86 4.66 (----------------*-----------------)
5 15 27.92 12.50 (------------*------------)
6 16 32.63 71.43 (------------*-----------)
7 19 50.67 156.89 (-----------*----------)
+---------+---------+---------+---------
-30 0 30 60

Note that the p-value is 0.918 indicating that there is no significant overall difference
between company types on average P/E ratio. Of course, a study of the confidence
interval graphs reinforces this conclusion because there is overlap among all of the
intervals.
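
The R-Sq(adj) = 0.00% entry deserves a short check. Using the standard adjusted R-squared formula with the SS and DF values printed above, the raw value comes out slightly negative, which is consistent with the output showing it floored at zero. The function name below is hypothetical and the flooring step is an assumption inferred from the printed result.

```python
def adjusted_r_squared(ss_error, df_error, ss_total, df_total):
    """Adjusted R-squared as a proportion: 1 - MS_error / MS_total."""
    return 1 - (ss_error / df_error) / (ss_total / df_total)

# Average P/E Ratio versus Type: values copied from the ANOVA table above.
raw_adj = adjusted_r_squared(ss_error=533995, df_error=93,
                             ss_total=545481, df_total=99)
displayed_adj = max(0.0, raw_adj)  # assumed display rule: negatives shown as 0

print(f"raw R-Sq(adj) = {100 * raw_adj:.2f}%")
print(f"displayed R-Sq(adj) = {100 * displayed_adj:.2f}%")
```

A negative raw value means the error mean square exceeds the total mean square, that is, grouping by company type explains less variation than would be expected by chance.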

2. Consider the Manufacturing database. Either Excel or Minitab can be used to solve this
problem. First, we use Excel to determine if there is any significant difference between
the four groups of Value of Industrial Shipments (groups 1 through 4, the independent
variable) on Number of Production Workers (the dependent variable).

One-way ANOVA: No. Production Workers versus Value of Industry Shipments


Anova: Single Factor

SUMMARY
Groups      Count   Sum    Average    Variance
Column 1    30      442    14.73333   133.0989
Column 2    37      1503   40.62162   416.9084
Column 3    33      2828   85.69697   3594.155
Column 4    40      7401   185.025    17852.69

ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Between Groups        620671.2   3     206890.4   33.89454   2.01E-16   2.671178
Within Groups         830136.5   136   6103.945

Total                 1450808    139

Note that the p-value is 2.01E-16 (0.000 to three decimal places), indicating that there is
a significant overall difference in number of production workers for α at least as small
as .001. Because the independent variable really represents the size of companies, note
that the mean number of production workers increases as the level (size of company)
goes from 1 to 4. This is determined by reviewing the ‘Summary’, where the average for
column 1 is 14.73333 while the average for column 4 is 185.025.
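
As a check on the Excel output, the between-groups and within-groups sums of squares can be recomputed from the SUMMARY block alone (Count, Sum, Variance). This is a minimal sketch using the computational formulas for one-way ANOVA; the variable names are illustrative, and the inputs are the rounded values read from the output above.

```python
# (Count, Sum, Variance) for Columns 1 through 4, read from the Excel SUMMARY.
summary = [
    (30, 442, 133.0989),
    (37, 1503, 416.9084),
    (33, 2828, 3594.155),
    (40, 7401, 17852.69),
]

n_total = sum(count for count, _, _ in summary)
grand_total = sum(total for _, total, _ in summary)

# Between groups: sum of T_i^2 / n_i minus the correction term G^2 / N.
ss_between = (sum(total ** 2 / count for count, total, _ in summary)
              - grand_total ** 2 / n_total)
# Within groups: each group variance times its degrees of freedom.
ss_within = sum((count - 1) * var for count, _, var in summary)

df_between = len(summary) - 1
df_within = n_total - len(summary)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}, F = {f_stat:.4f}")
```

The values should match the Between Groups and Within Groups rows (620671.2, 830136.5, F = 33.89454) up to rounding of the printed variances.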

Second, Excel is used to determine if there is any significant difference between the four
groups of Value of Industrial Shipments (groups 1 through 4 – independent variable) on
End-of-Year Inventories.

One-way ANOVA: End Yr. Inventory versus Value of Industry Shipments

Anova: Single Factor

SUMMARY
Groups      Count   Sum      Average    Variance
Column 1    30      10034    334.4667   49916.67
Column 2    37      36997    999.9189   142827.8
Column 3    33      86076    2608.364   1164309
Column 4    40      294004   7350.1     34239272

ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Between Groups        1.12E+09   3     3.74E+08   36.90618   1.63E-17   2.671178
Within Groups         1.38E+09   136   10141021

Total                 2.5E+09    139

Note that the p-value is 1.63E-17 (0.000 to three decimal places), indicating that there is
a significant overall difference in end-of-year inventory for α at least as small as .001.
Because the independent variable really represents the size of companies, note that the
mean end-of-year inventory increases as the level (size of company) goes from 1 to 4.
This is determined by reviewing the ‘Summary’, where the average for column 1 is
334.4667 while the average for column 4 is 7350.1.

Next, Minitab is used to determine if there is any significant difference between the
twenty levels of Industry Groups (independent variable) on Number of Production
Workers. Of course, we could easily have used Excel to perform the analysis.

One-way ANOVA: No. Prod. Wkrs. versus Indus. Grp.

Source DF SS MS F P
Indus. Grp. 19 243867 12835 1.28 0.212
Error 120 1206941 10058
Total 139 1450808

S = 100.3 R-Sq = 16.81% R-Sq(adj) = 3.64%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ---------+---------+---------+---------+
1 9 123.8 102.6 (-----*------)
2 4 5.8 6.2 (---------*---------)
3 9 54.4 40.3 (-----*------)
4 9 81.0 93.7 (------*------)
5 6 102.2 63.0 (-------*-------)
6 5 83.0 84.8 (--------*--------)
7 5 97.6 74.2 (--------*--------)
8 9 88.9 135.8 (------*------)
9 8 59.6 28.5 (------*------)
10 3 23.0 18.0 (----------*-----------)
11 5 160.0 246.8 (--------*--------)
12 7 9.1 10.2 (-------*------)
13 9 44.7 43.9 (-----*------)
14 7 77.4 60.9 (-------*------)
15 9 123.9 102.3 (-----*------)
16 9 143.0 74.5 (-----*------)
17 8 125.8 102.9 (------*------)
18 7 147.6 223.6 (-------*------)
19 6 70.5 64.7 (-------*-------)
20 6 47.3 43.6 (-------*-------)
---------+---------+---------+---------+
0 100 200 300

Note that the p-value is 0.212 indicating that there is no significant overall difference in
number of production workers.

Finally, Minitab is used to determine if there is any significant difference between the
twenty levels of Industry Groups (independent variable) on End-of-Year Inventory.
Again, Excel could have been used to perform the ANOVA.

One-way ANOVA: End Yr. Inven. versus Indus. Grp.

Source DF SS MS F P
Indus. Grp. 19 643979919 33893680 2.19 0.006
Error 120 1857998122 15483318
Total 139 2501978041

S = 3935 R-Sq = 25.74% R-Sq(adj) = 13.98%


Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev -------+---------+---------+---------+--
1 9 3997 2425 (-----*-----)
2 4 1564 2637 (---------*---------)
3 9 1053 581 (------*-----)
4 9 1113 1135 (------*-----)
5 6 1738 1560 (-------*-------)
6 5 1297 1265 (-------*--------)
7 5 3177 1980 (--------*--------)
8 9 1447 1490 (------*-----)
9 8 5095 3436 (------*------)
10 3 4015 5805 (----------*----------)
11 5 3117 4917 (--------*--------)
12 7 219 256 (-------*------)
13 9 964 608 (-----*------)
14 7 3483 4326 (-------*------)
15 9 3043 2374 (------*-----)
16 9 6212 2218 (------*-----)
17 8 4744 4275 (------*------)
18 7 8907 13426 (------*-------)
19 6 4086 3542 (-------*-------)
20 6 1410 1102 (-------*------)
-------+---------+---------+---------+--
0 4000 8000 12000

Note that the p-value is 0.006, indicating that there is a significant overall difference in
end-of-year inventory for α = .01. Without conducting multiple comparisons (which
were not asked for), we can only visually see that Industry Group 18 has the highest
average end-of-year inventory and Industry Group 12 has the lowest. This is determined
by reviewing the bottom part of the output and finding that the mean for level 18 is 8907
while the mean for level 12 is 219.

3. Using Minitab and the hospital database, a one-way ANOVA is performed to determine
if there is a significant difference in Admissions (dependent variable) according to
Geographic region (independent variable). There are seven geographic regions being
compared.

One-way ANOVA: Admissions versus Geog. Region

Source DF SS MS F P
Geog. Region 6 182761795 30460299 0.68 0.664
Error 193 8607296329 44597390
Total 199 8790058124

S = 6678 R-Sq = 2.08% R-Sq(adj) = 0.00%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev -+---------+---------+---------+--------
1 56 7103 6248 (----*---)
2 30 8013 7722 (-----*-----)
3 60 5447 6521 (----*---)
4 3 7182 7631 (------------------*------------------)
5 20 7699 8754 (------*-------)
6 19 7283 4315 (------*-------)
7 12 7288 5343 (--------*---------)
-+---------+---------+---------+--------
0 4000 8000 12000

The p-value is 0.664, indicating that there is no significant difference in hospital
admissions by geographic location.

Next, Minitab is used to determine if there is a significant difference in Births
(dependent variable) according to Geographic Region (independent variable):

One-way ANOVA: Births versus Geog. Region

Source DF SS MS F P
Geog. Region 6 10196027 1699338 1.53 0.172
Error 193 214949499 1113728
Total 199 225145527

S = 1055 R-Sq = 4.53% R-Sq(adj) = 1.56%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ------+---------+---------+---------+---
1 56 785 995 (---*---)
2 30 874 1064 (----*-----)
3 60 673 1009 (---*--)
4 3 1515 2021 (-----------------*----------------)
5 20 1210 1436 (-----*------)
6 19 1322 890 (------*------)
7 12 867 744 (-------*--------)
------+---------+---------+---------+---
700 1400 2100 2800

The p-value is 0.172 indicating that there is no significant difference in hospital births by
geographic location.

Third, Minitab is used to determine if there is a significant difference in Admissions
(dependent variable) according to the Type of Control of the hospital (independent
variable).

One-way ANOVA: Admissions versus Control

Source DF SS MS F P
Control 3 716107274 238702425 5.79 0.001
Error 196 8073950849 41193627
Total 199 8790058124

S = 6418 R-Sq = 8.15% R-Sq(adj) = 6.74%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev -------+---------+---------+---------+--
1 51 6020 7218 (------*------)
2 86 8902 6710 (-----*----)
3 45 5000 5840 (-------*-------)
4 18 3822 2881 (-----------*-----------)
-------+---------+---------+---------+--
2500 5000 7500 10000

Note that the p-value is 0.001, indicating that there is a significant overall difference in
admissions according to type of control for α = .001. Without conducting multiple
comparisons (which were not asked for), we can only visually see that the highest mean
number of admissions (mean = 8902) is in type of control group 2 (nongovernment, not-
for-profit) and that the lowest mean number of admissions (mean = 3822) is in type of
control group 4 (federal government).

Last, Minitab is used to determine if there is a significant difference in Births
(dependent variable) according to the Type of Control of the hospital (independent
variable).

One-way ANOVA: Births versus Control

Source DF SS MS F P
Control 3 12423343 4141114 3.82 0.011
Error 196 212722183 1085317
Total 199 225145527

S = 1042 R-Sq = 5.52% R-Sq(adj) = 4.07%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ---+---------+---------+---------+------
1 51 765 1143 (------*------)
2 86 1137 1033 (----*-----)
3 45 694 1038 (------*-------)
4 18 378 741 (-----------*------------)
---+---------+---------+---------+------
0 400 800 1200

Note that the p-value is 0.011, indicating that there is a significant overall difference in
births according to type of control for α = .05. Without conducting multiple
comparisons (which were not asked for), we can only visually see that the highest mean
number of births (mean = 1137) is in type of control group 2 (nongovernment, not-
for-profit) and that the lowest mean number of births (mean = 378) is in type of
control group 4 (federal government).

4. Using Minitab and the Consumer Food database, three one-way ANOVAs were
performed with Region as the independent variable. The first one-way ANOVA analyzes
Annual Food Spending as the dependent variable. The resulting output is shown below:

One-way ANOVA: Annual Food Spending ($) versus Region

Source DF SS MS F P
Region 3 85851672 28617224 3.02 0.031
Error 196 1857517636 9477131
Total 199 1943369308

S = 3078 R-Sq = 4.42% R-Sq(adj) = 2.95%

Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev -+---------+---------+---------+--------
1 60 9468 3733 (-------*-------)
2 45 8660 2334 (--------*--------)
3 40 7834 2722 (--------*---------)
4 55 9493 3062 (-------*-------)
-+---------+---------+---------+--------
7000 8000 9000 10000

Pooled StDev = 3078

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Region

Individual confidence level = 98.96%

Region = 1 subtracted from:

Region Lower Center Upper -------+---------+---------+---------+--


2 -2379 -808 763 (-------*-------)
3 -3260 -1634 -8 (-------*-------)
4 -1463 25 1512 (------*-------)
-------+---------+---------+---------+--
-2000 0 2000 4000

Region = 2 subtracted from:

Region Lower Center Upper -------+---------+---------+---------+--


3 -2557 -826 906 (--------*--------)
4 -769 833 2434 (-------*-------)
-------+---------+---------+---------+--
-2000 0 2000 4000

Region = 3 subtracted from:

Region Lower Center Upper -------+---------+---------+---------+--


4 3 1659 3314 (-------*--------)
-------+---------+---------+---------+--
-2000 0 2000 4000

Note that the p-value is 0.031, indicating that there is a significant overall difference in
annual food spending according to region for α = .05. Since the overall F value is
significant, Tukey multiple comparisons were run. With Minitab, if the confidence
interval associated with the multiple comparison has the same sign on each end, there is
a significant difference between the pair (because zero is not in the interval). Here
there is a significant difference in annual food spending between 1 and 3 and between 3
and 4. This can also be determined by studying the line plots at the right of the output.
Those intervals that do not include 0 are significantly different from the comparison
group. In studying the means, note that region 3 (South) has the smallest mean annual
food spending and that regions 1 and 4 have the highest means and are almost
identical. Region 1 is Northeast and region 4 is the West.
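
The "same sign on each end" rule can be applied mechanically. The sketch below copies the Lower and Upper bounds from the Tukey output above into a dictionary and lists the pairs whose interval excludes zero; the function name `significant_pairs` is introduced here only for illustration.

```python
# Tukey 95% simultaneous intervals for annual food spending, keyed by
# (region subtracted, region compared), copied from the output above.
food_spending_intervals = {
    (1, 2): (-2379, 763),
    (1, 3): (-3260, -8),
    (1, 4): (-1463, 1512),
    (2, 3): (-2557, 906),
    (2, 4): (-769, 2434),
    (3, 4): (3, 3314),
}

def significant_pairs(intervals):
    """Return the pairs whose confidence interval does not contain zero."""
    return [pair for pair, (lower, upper) in intervals.items()
            if lower > 0 or upper < 0]

print(significant_pairs(food_spending_intervals))
```

Both flagged intervals come close to zero (an upper bound of -8 and a lower bound of 3), so these two differences are significant only by a narrow margin.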

The second one-way ANOVA analyzes Annual Household Income as its dependent
variable.

One-way ANOVA: Annual Household Income versus Region

Source DF SS MS F P
Region 3 1636917374 545639125 2.60 0.053
Error 196 41139222908 209893994
Total 199 42776140282

S = 14488 R-Sq = 3.83% R-Sq(adj) = 2.35%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev --------+---------+---------+---------+-
1 60 57362 16973 (-------*------)
2 45 54458 13598 (--------*-------)
3 40 50508 13103 (--------*--------)
4 55 58142 13131 (------*-------)
--------+---------+---------+---------+-
50000 55000 60000 65000

Pooled StDev = 14488

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Region

Individual confidence level = 98.96%

Region = 1 subtracted from:

Region Lower Center Upper ---------+---------+---------+---------+


2 -10298 -2904 4490 (--------*---------)
3 -14508 -6854 799 (--------*---------)
4 -6220 780 7779 (--------*--------)
---------+---------+---------+---------+
-8000 0 8000 16000

Region = 2 subtracted from:

Region Lower Center Upper ---------+---------+---------+---------+


3 -12098 -3950 4198 (---------*---------)
4 -3853 3683 11220 (---------*--------)
---------+---------+---------+---------+
-8000 0 8000 16000

Region = 3 subtracted from:

Region Lower Center Upper ---------+---------+---------+---------+


4 -158 7634 15425 (---------*--------)
---------+---------+---------+---------+
-8000 0 8000 16000

Note that the p-value is 0.053, indicating that there is a significant overall difference in
annual household income according to region for α = .10 but not for α = .05. Since the
overall F value is significant at α = .10, Tukey multiple comparisons were run. With
Minitab, if the confidence interval associated with the multiple comparison has the same
sign on each end, there is a significant difference between the pair (because zero is not in
the interval). This can also be determined by studying the line plots at the right of the
output. Those intervals that do not include 0 are significantly different from the
comparison group. In this analysis, the Tukey multiple comparison tests do not yield any
significant pairwise differences. Note that the Tukey multiple comparison test was run
using a .05 level of significance and the overall F value in this analysis was not significant
at α = .05.
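
Because this p-value falls between two common significance levels, it can help to tabulate the decision at each level. This is a small sketch assuming the usual rule of rejecting the null hypothesis when the p-value is at most α; the function name and the list of levels are illustrative choices.

```python
def significant_at(p_value, alphas=(0.10, 0.05, 0.01, 0.001)):
    """Map each significance level to True if H0 is rejected at that level."""
    return {alpha: p_value <= alpha for alpha in alphas}

# Annual Household Income versus Region: p-value from the ANOVA table above.
income_decisions = significant_at(0.053)
for alpha, reject in income_decisions.items():
    print(f"alpha = {alpha}: {'significant' if reject else 'not significant'}")
```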
The last one-way ANOVA analyzes Non-mortgage Household Debt as its dependent
variable.

One-way ANOVA: Non mortgage household debt ($) versus Region

Source DF SS MS F P
Region 3 1179900846 393300282 5.72 0.001
Error 196 13481850799 68784953
Total 199 14661751645

S = 8294 R-Sq = 8.05% R-Sq(adj) = 6.64%

Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev -----+---------+---------+---------+----
1 60 13743 8002 (------*------)
2 45 12807 6891 (-------*-------)
3 40 18717 9964 (-------*--------)
4 55 17660 8325 (------*------)
-----+---------+---------+---------+----
12000 15000 18000 21000

Pooled StDev = 8294

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Region

Individual confidence level = 98.96%

Region = 1 subtracted from:

Region Lower Center Upper --------+---------+---------+---------+-


2 -5168 -935 3297 (------*------)
3 593 4974 9356 (------*-------)
4 -90 3917 7924 (------*-----)
--------+---------+---------+---------+-
-6000 0 6000 12000

Region = 2 subtracted from:

Region Lower Center Upper --------+---------+---------+---------+-


3 1246 5910 10574 (-------*-------)
4 538 4852 9167 (------*------)
--------+---------+---------+---------+-
-6000 0 6000 12000

Region = 3 subtracted from:

Region Lower Center Upper --------+---------+---------+---------+-


4 -5518 -1057 3403 (------*-------)
--------+---------+---------+---------+-
-6000 0 6000 12000

Note that the p-value is 0.001, indicating that there is a significant overall difference in
non-mortgage household debt according to region for α = .01. Since the overall F value
is significant, Tukey multiple comparisons were run. With Minitab, if the confidence
interval associated with the multiple comparison has the same sign on each end, there is
a significant difference between the pair (because zero is not in the interval). This can
also be determined by studying the line plots at the right of the output. Those intervals
that do not include 0 are significantly different from the comparison group. Here there is
a significant difference in non-mortgage household debt between 1 and 3, between 2
and 3, and between 2 and 4. In studying the means, note that region 3 (South) has the
highest mean non-mortgage household debt and that region 2 (Midwest) has the lowest.
Region 1 (Northeast) also has a low mean non-mortgage household debt.
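
The widths of the Tukey intervals differ because the regions have unequal sample sizes. Assuming the intervals follow the Tukey-Kramer form, center ± q·sqrt((MSE/2)(1/n_i + 1/n_j)), the critical value q implied by each interval can be backed out from the printed bounds. This is a sketch for checking consistency, not a reproduction of Minitab's computation; `implied_q` is a hypothetical helper.

```python
import math

mse = 68784953  # Error MS from the non-mortgage household debt ANOVA table
region_n = {1: 60, 2: 45, 3: 40, 4: 55}

# (Lower, Upper) bounds copied from the Tukey output above.
debt_intervals = {
    (1, 2): (-5168, 3297),
    (1, 3): (593, 9356),
    (1, 4): (-90, 7924),
    (2, 3): (1246, 10574),
    (2, 4): (538, 9167),
    (3, 4): (-5518, 3403),
}

def implied_q(pair, bounds):
    """Back out q from an interval's half-width, assuming Tukey-Kramer."""
    i, j = pair
    half_width = (bounds[1] - bounds[0]) / 2
    std_error = math.sqrt(mse / 2 * (1 / region_n[i] + 1 / region_n[j]))
    return half_width / std_error

implied = {pair: implied_q(pair, bounds) for pair, bounds in debt_intervals.items()}
for pair, q in implied.items():
    print(f"regions {pair}: implied q = {q:.3f}")
```

If the assumption holds, all six pairs should give nearly the same q, which is what one would expect from a single studentized range critical value for 4 groups and 196 error degrees of freedom.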
