
1. Think of three (3) different experiments where you could use three (3) different types of data transformation to demonstrate the importance of applying data transformation in the statistical analysis of the experiments. Your conclusion must show a distinction between before and after transformation.

If a measurement variable does not fit a normal distribution or has greatly different standard deviations in different groups, you should try a data transformation. Using a statistical test such as an ANOVA or a linear regression on such data may give a misleading result. In some cases, transforming the data will make it fit the assumptions better (Gomez and Gomez, 1984).
i. Log Transformation

This consists of taking the log of each observation. You can use either base-10 logs (LOG10 in SAS) or base-e logs, also known as natural logs (LOG in SAS). It makes no difference for a statistical test whether you use base-10 logs or natural logs, because they differ by a constant factor; the natural log of a number is just 2.303 times its base-10 log. You should specify which log you are using when you write up the results, as it will affect things like the slope and intercept in a regression. Base-10 logs are often preferred because it is possible to look at them and see the magnitude of the original number: log(1) = 0, log(10) = 1, log(100) = 2, etc.
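
As a quick check of that constant-factor relationship, a minimal SAS sketch (the value 100 and the variable names are only illustrative):

data _null_;
x = 100;
log10_x = log10(x); * base-10 log: 2 ;
ln_x = log(x); * natural log: about 4.60517 ;
ratio = ln_x / log10_x; * about 2.3026 for any positive x ;
put x= log10_x= ln_x= ratio=;
run;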
For example, a field study was conducted to evaluate the effect of different rates of poultry manure on the number of maize leaves 70 days after planting. The experimental design used was a randomized complete block design (RCBD). Six treatments were evaluated with three replicates each, as indicated below:
T1= 1200 g poultry manure + 50% fertilization
T2= 1000 g poultry manure + 50% fertilization
T3= 800 g poultry manure + 50% fertilization
T4= 600 g poultry manure + 50% fertilization
T5= 400 g poultry manure + 50% fertilization
T6= 200 g poultry manure + 50% fertilization

Table 1: Raw data from the field experiment showing the number of leaves per plant after 70 days
Treatment   Block 1     Block 2     Block 3     Standard    Mean         Means of X
            (Plant-1)   (Plant-1)   (Plant-1)   deviation
T1          16          17          16          0.5773503   16.3333333   17.3333333
T2          15          16          16          0.5773503   15.6666667   16.6666667
T3          14          15          15          0.5773503   14.6666667   15.6666667
T4          14          13          14          0.5773503   13.6666667   14.6666667
T5          13          13          12          0.5773503   12.6666667   13.6666667
T6          12          13          10          1.5275252   11.6666667   12.6666667

Table 2: SAS input before and after transformation.


Input before transformation

data log_before; * data set name cannot contain a blank ;
input trt $ blk leaf; * treatment, block, and number of leaves per plant ;
datalines;
T1 1 16
T1 2 17
T1 3 16
T2 1 15
T2 2 16
T2 3 16
T3 1 14
T3 2 15
T3 3 15
T4 1 14
T4 2 13
T4 3 14
T5 1 13
T5 2 13
T5 3 12
T6 1 12
T6 2 13
T6 3 10
;
proc anova data=log_before;
class trt blk;
model leaf = trt blk;
means trt blk / lsd;
run;
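
Before deciding to transform, the equal-variance assumption itself can be checked formally. A minimal sketch using Levene's test in PROC GLM (the data set name log_before follows the step above; the block term is dropped because the homogeneity-of-variance test applies to a one-way layout):

proc glm data=log_before;
class trt;
model leaf = trt; * one-way layout for the variance check only ;
means trt / hovtest=levene; * Levene's test of homogeneity of variances across treatments ;
run;
quit;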

Input for transformation

data log_after; * data set name cannot contain a blank ;
input trt $ blk leaf;
x = leaf + 1; * add 1 to each count before taking logs (Gomez and Gomez, 1984) ;
y = log(x); * LOG is the natural log in SAS ;
datalines;
T1 1 16
T1 2 17
T1 3 16
T2 1 15
T2 2 16
T2 3 16
T3 1 14
T3 2 15
T3 3 15
T4 1 14
T4 2 13
T4 3 14
T5 1 13
T5 2 13
T5 3 12
T6 1 12
T6 2 13
T6 3 10
;
proc print data=log_after;
run;
proc anova data=log_after;
class trt blk;
model y = trt blk;
means trt / duncan;
run;
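
If the base-10 form mentioned above were preferred, the same transformation could be derived from the data set just created; a minimal sketch (the names log10_after, y10 and check are only illustrative):

data log10_after;
set log_after; * reuse trt, blk, leaf, x and y from the step above ;
y10 = log10(x); * base-10 log of leaf + 1 ;
check = y / y10; * about 2.3026 for every observation, the constant factor noted earlier ;
run;
proc print data=log10_after;
run;

Either log base gives identical F tests and mean groupings; only the scale of the reported means differs by that constant factor.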

The ANOVA Procedure


Table 3: ANOVA table before transformation

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
trt                 5       47.77777778     9.55555556      14.58    0.0003
blk                 2        1.44444444     0.72222222       1.10    0.3695
Error              10        6.55555556     0.65555556
Corrected Total    17       55.77777778

R-Square    Coeff Var    Root MSE    leaf Mean
0.882470    5.737775     0.809664    14.11111

Means with the same letter are not significantly different.

t Grouping    Mean       trt
A             16.3333    T1
AB            15.6667    T2
BC            14.6667    T3
CD            13.6667    T4
DE            12.6667    T5
E             11.6667    T6

From the results of the ANOVA (Table 3), there is a significant difference between treatments at α = 0.05 (P = 0.0003), and there is no significant block effect (P = 0.3695 at α = 0.05) in this RCBD experiment. The R-square value is 88.25%.
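
As a check on where the summary statistics come from, a minimal SAS sketch that recomputes them from the sums of squares printed in Table 3:

data _null_;
ss_trt = 47.77777778;
ss_blk = 1.44444444;
ss_err = 6.55555556;
ss_tot = 55.77777778;
df_err = 10;
mean_leaf = 14.11111;
r_square = (ss_trt + ss_blk) / ss_tot; * = 0.882470 ;
root_mse = sqrt(ss_err / df_err); * = 0.809664 ;
coeff_var = 100 * root_mse / mean_leaf; * = 5.737775 percent ;
put r_square= root_mse= coeff_var=;
run;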

Table 4: Transformed data (SAS PROC PRINT output, where x = leaf + 1 and y = log(x))

Obs    trt    blk    leaf    x     y
1      T1     1      16      17    2.83321
2      T1     2      17      18    2.89037
3      T1     3      16      17    2.83321
4      T2     1      15      16    2.77259
5      T2     2      16      17    2.83321
6      T2     3      16      17    2.83321
7      T3     1      14      15    2.70805
8      T3     2      15      16    2.77259
9      T3     3      15      16    2.77259
10     T4     1      14      15    2.70805
11     T4     2      13      14    2.63906
12     T4     3      14      15    2.70805
13     T5     1      13      14    2.63906
14     T5     2      13      14    2.63906
15     T5     3      12      13    2.56495
16     T6     1      12      13    2.56495
17     T6     2      13      14    2.63906
18     T6     3      10      11    2.39790

The ANOVA Procedure


Table 5: ANOVA table after transformation
Dependent Variable: y
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
trt                 5        0.21983173     0.04396635      11.90    0.0006
blk                 2        0.00781452     0.00390726       1.06    0.3831
Error              10        0.03694465     0.00369446
Corrected Total    17        0.26459090

R-Square    Coeff Var    Root MSE    y Mean
0.860371    2.244301     0.060782    2.708287

Means with the same letter are not significantly different.

Duncan Grouping    Mean       N    trt
A                  2.85227    3    T1
A                  2.81301    3    T2
AB                 2.75108    3    T3
BC                 2.68505    3    T4
CD                 2.61435    3    T5
D                  2.53397    3    T6

This is the SAS output after transformation. The values under x are the leaf counts after adding 1 (because the counts are relatively small), and the values under y are the natural logs of x (Table 4). There is a significant difference between treatments at α = 0.05 (P = 0.0006), a higher P value than before transformation (P = 0.0003). There is no significant block effect (P = 0.3831 at α = 0.05), which is again higher than before transformation (P = 0.3695), so the experiment remains a valid RCBD analysis. The R-square value is 86.04%, which is also lower than before transformation (88.25%).

Table 6: Comparing the treatment means before and after transformation

Treatment    Mean before transformation    Mean after transformation
T1           16.3333 a                     17.3271 a
T2           15.6667 ab                    16.6599 a
T3           14.6667 bc                    15.6595 ab
T4           13.6667 cd                    14.6589 bc
T5           12.6667 de                    13.6029 cd
T6           11.6667 e                     12.6034 d

Note: Means with the same letter are not significantly different.
The means before transformation show that T1 is statistically higher than T3; after transformation, however, T1 is not statistically different from T3 (Table 6).
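
The means reported after transformation in Table 6 are the antilogs of the Duncan means of y from Table 5. A minimal sketch of that back-transformation (the data set name backtransform and variable names are only illustrative; one common convention also subtracts the 1 that was added before taking logs to return fully to the leaf scale):

data backtransform;
input trt $ y_mean; * Duncan means of y from Table 5 ;
x_back = exp(y_mean); * antilog: mean on the leaf + 1 scale, as reported in Table 6 ;
leaf_back = x_back - 1; * optionally remove the 1 added before taking the log ;
datalines;
T1 2.85227
T2 2.81301
T3 2.75108
T4 2.68505
T5 2.61435
T6 2.53397
;
proc print data=backtransform;
run;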

Conclusion
The P values before transformation are lower than those after transformation; however, if the data had not been transformed, we would have concluded that T1 is statistically different from T3 and committed a Type I error. The transformed data have a lower R-square value, but the analysis is more realistic and precise.
