0% found this document useful (0 votes)
514 views10 pages

T-Tests Using SAS

1. The document describes SAS commands for conducting various statistical tests, including importing data, setting up missing value codes, descriptive statistics, t-tests, and boxplots. 2. Descriptive statistics are presented for continuous variables by sex using both CLASS and BY statements, and boxplots are generated to compare variables by sex. 3. Independent samples t-tests are used to compare mean birth weight, BMI, and log fatfold between males and females in the sample, finding a statistically significant difference in birth weight.

Uploaded by

Robin
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
514 views10 pages

T-Tests Using SAS

1. The document describes SAS commands for conducting various statistical tests, including importing data, setting up missing value codes, descriptive statistics, t-tests, and boxplots. 2. Descriptive statistics are presented for continuous variables by sex using both CLASS and BY statements, and boxplots are generated to compare variables by sex. 3. Independent samples t-tests are used to compare mean birth weight, BMI, and log fatfold between males and females in the sample, finding a statistically significant difference in birth weight.

Uploaded by

Robin
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 10

SAS t-test Commands

/**********************************************
This example shows how to import an Excel
File, set up missing value codes and
create a permanent SAS data set. It also shows
boxplots, two-sample t-tests, paired t-tests
and one-sample t-tests.

Filename: ttest.sas
**********************************************/
OPTIONS FORMCHAR="|----|+|---+=|-/\<>*";

/*Read in the raw data*/


data owen;
infile "owen.dat" ;
input family child age sex race w_rank income_c height weight hemo
vit_c vit_a head_cir fatfold b_weight mot_age b_order m_height
f_height ;
run;

/*Create a new permanent SAS data set*/

libname b510 "c:\documents and settings\kwelch\desktop\b510";

data b510.owen;
set owen;
if vit_a = 99 then vit_a = .;
if head_cir = 99 then head_cir = .;
if fatfold = 99 then fatfold = .;
if b_weight = 999 then b_weight= .;
if mot_age = 99 then mot_age = .;
if b_order = 99 then b_order = .;
if m_height = 999 then m_height=.;
if f_height = 999 then f_height=.;

bwt_g = b_weight*10;

if bwt_g not=. and bwt_g < 2500 then lowbwt=1;


if bwt_g >=2500 then lowbwt=0;

log_fatfold = log(fatfold);
htdiff = f_height - m_height;
bmi = weight /(height/100)**2;
run;

/*Simple Descriptive Statistics on all Numeric Variables*/


proc means data=b510.owen;
run;

1
The MEANS Procedure

Variable Label N Mean Std Dev Minimum Maximum


-----------------------------------------------------------------------------------------------
FAM_NUM_ FAM_NUM 1006 4525.11 1634.03 2000.00 7569.00
CHILDNUM CHILDNUM 1006 1.3359841 0.5716672 1.0000000 3.0000000
AGE AGE 1006 44.0248509 16.6610452 12.0000000 73.0000000
SEX SEX 1006 1.4890656 0.5001291 1.0000000 2.0000000
RACE RACE 1006 1.2823062 0.4503454 1.0000000 2.0000000
W_RANK W_RANK 1006 2.2127237 0.9024440 1.0000000 4.0000000
INCOME_C INCOME_C 1006 1581.31 974.2279710 80.0000000 6250.00
HEIGHT HEIGHT 1001 99.0429570 11.4300111 70.0000000 130.0000000
WEIGHT WEIGHT 1000 15.6290800 3.6523446 8.2400000 41.0800000
HEMO HEMO 1006 12.4606362 1.1578850 6.2000000 24.1000000
VIT_C VIT_C 1006 1.1302187 0.6599121 0.1000000 3.5000000
VIT_A VIT_A 763 36.0380079 8.8951237 15.0000000 78.0000000
HEAD_CIR HEAD_CIR 999 49.3763764 2.0739057 39.0000000 56.0000000
FATFOLD FATFOLD 993 4.4562941 1.6683194 2.6000000 42.0000000
B_WEIGHT B_WEIGHT 986 325.0517241 59.5162936 91.0000000 544.0000000
MOT_AGE MOT_AGE 981 29.2660550 6.2603025 17.0000000 51.0000000
B_ORDER B_ORDER 980 2.9479592 2.1939526 1.0000000 16.0000000
M_HEIGHT M_HEIGHT 980 163.7632653 6.3663343 122.0000000 199.0000000
F_HEIGHT F_HEIGHT 975 178.2194872 7.3821354 152.0000000 210.0000000
bwt_g 986 3250.52 595.1629357 910.0000000 5440.00
lowbwt 986 0.1075051 0.3099115 0 1.0000000
log_fatfold 993 1.4599658 0.2396859 0.9555114 3.7376696
htdiff 972 14.4218107 8.7834139 -12.0000000 56.0000000
bmi 998 15.8124399 1.6634700 11.0247934 26.2912000
----------------------------------------------------------------------------------------------
/*Descriptive Statistics for each level of SEX
using a CLASS statement. No sorting is necessary.*/

proc means data=b510.owen;


class sex;
var bwt_g bmi fatfold log_fatfold;
run;

The MEANS Procedure

N
SEX Obs Variable Label N Mean Std Dev Minimum Maximum
------------------------------------------------------------------------------------------------------------
1 514 bwt_g 497 3340.56 565.3268435 1360.00 5170.00
bmi 510 15.8982386 1.6074313 11.3795135 26.2912000
FATFOLD FATFOLD 507 4.2518738 0.9720458 2.6000000 10.2000000
log_fatfold 507 1.4247028 0.2076417 0.9555114 2.3223877

2 492 bwt_g 489 3159.00 611.1350784 910.0000000 5440.00


bmi 488 15.7227732 1.7171565 11.0247934 24.4485835
FATFOLD FATFOLD 486 4.6695473 2.1489049 2.6000000 42.0000000
log_fatfold 486 1.4967524 0.2643232 0.9555114 3.7376696
-------------------------------------------------------------------------------------------------------------

/*Descriptive Statistics for each level of SEX using a BY statement


Data set must first be sorted BY SEX.*/

proc sort data=b510.owen;


by sex;
run;
proc means data=b510.owen;
by sex;
var bwt_g bmi fatfold log_fatfold;

2
run;

-------------------------------------------- SEX=1 --------------------------------------------

The MEANS Procedure

Variable Label N Mean Std Dev Minimum Maximum


----------------------------------------------------------------------------------------------
bwt_g 497 3340.56 565.3268435 1360.00 5170.00
bmi 510 15.8982386 1.6074313 11.3795135 26.2912000
FATFOLD FATFOLD 507 4.2518738 0.9720458 2.6000000 10.2000000
log_fatfold 507 1.4247028 0.2076417 0.9555114 2.3223877
----------------------------------------------------------------------------------------------

-------------------------------------------- SEX=2 --------------------------------------------

Variable Label N Mean Std Dev Minimum Maximum


----------------------------------------------------------------------------------------------
bwt_g 489 3159.00 611.1350784 910.0000000 5440.00
bmi 488 15.7227732 1.7171565 11.0247934 24.4485835
FATFOLD FATFOLD 486 4.6695473 2.1489049 2.6000000 42.0000000
log_fatfold 486 1.4967524 0.2643232 0.9555114 3.7376696
----------------------------------------------------------------------------------------------

/*Boxplots*/
proc sgplot data=b510.owen;
vbox bwt_g / category=sex;
run;
proc sgplot data=b510.owen;
vbox bmi / category=sex;
run;
proc sgplot data=b510.owen;
vbox fatfold / category=sex;
run;
proc sgplot data=b510.owen;
vbox log_fatfold / category=sex;
run;

3
/*Independent Samples t-test comparing means of continous
variables for each level of SEX. No sorting is necessary*/
proc ttest data=b510.owen;
class sex;
var bwt_g weight log_fatfold;
run;
The SAS System
The TTEST Procedure

Variable: bwt_g

SEX N Mean Std Dev Std Err Minimum Maximum

1 497 3340.6 565.3 25.3584 1360.0 5170.0


2 489 3159.0 611.1 27.6365 910.0 5440.0
Diff (1-2) 181.6 588.5 37.4840

SEX Method Mean 95% CL Mean Std Dev 95% CL Std Dev

1 3340.6 3290.7 3390.4 565.3 532.2 602.8


2 3159.0 3104.7 3213.3 611.1 575.1 652.0
Diff (1-2) Pooled 181.6 108.0 255.1 588.5 563.6 615.7
Diff (1-2) Satterthwaite 181.6 108.0 255.2

Method Variances DF t Value Pr > |t|


Pooled Equal 984 4.84 <.0001
Satterthwaite Unequal 975.39 4.84 <.0001

Equality of Variances

Method Num DF Den DF F Value Pr > F


Folded F 488 496 1.17 0.0842

Variable: bmi

SEX N Mean Std Dev Std Err Minimum Maximum

1 510 15.8982 1.6074 0.0712 11.3795 26.2912


2 488 15.7228 1.7172 0.0777 11.0248 24.4486
Diff (1-2) 0.1755 1.6620 0.1052

SEX Method Mean 95% CL Mean Std Dev 95% CL Std Dev

1 15.8982 15.7584 16.0381 1.6074 1.5145 1.7126


2 15.7228 15.5700 15.8755 1.7172 1.6158 1.8322

4
Diff (1-2) Pooled 0.1755 -0.0311 0.3820 1.6620 1.5921 1.7383
Diff (1-2) Satterthwaite 0.1755 -0.0314 0.3823

Method Variances DF t Value Pr > |t|


Pooled Equal 996 1.67 0.0958
Satterthwaite Unequal 984.1 1.66 0.0963

Equality of Variances
Method Num DF Den DF F Value Pr > F
Folded F 487 509 1.14 0.1407

Variable: log_fatfold

SEX N Mean Std Dev Std Err Minimum Maximum


1 507 1.4247 0.2076 0.00922 0.9555 2.3224
2 486 1.4968 0.2643 0.0120 0.9555 3.7377
Diff (1-2) -0.0720 0.2371 0.0151

SEX Method Mean 95% CL Mean Std Dev 95% CL Std Dev
1 1.4247 1.4066 1.4428 0.2076 0.1956 0.2213
2 1.4968 1.4732 1.5203 0.2643 0.2487 0.2821
Diff (1-2) Pooled -0.0720 -0.1016 -0.0425 0.2371 0.2271 0.2480
Diff (1-2) Satterthwaite -0.0720 -0.1017 -0.0424

Method Variances DF t Value Pr > |t|


Pooled Equal 991 -4.79 <.0001
Satterthwaite Unequal 919.96 -4.76 <.0001

Equality of Variances
Method Num DF Den DF F Value Pr > F
Folded F 485 506 1.62 <.0001
/*Paired samples t-test comparing mother's height and father's height*/
proc ttest data=b510.owen;
paired f_height*m_height;
run;

The TTEST Procedure

Difference: F_HEIGHT - M_HEIGHT

N Mean Std Dev Std Err Minimum Maximum


972 14.4218 8.7834 0.2817 -12.0000 56.0000

Mean 95% CL Mean Std Dev 95% CL Std Dev


14.4218 13.8689 14.9747 8.7834 8.4096 9.1923

DF t Value Pr > |t|


971 51.19 <.0001

/*Paired samples t-test comparing mother's height and father's height


for each level of SEX. Remember, data must be sorted BY SEX first.*/
proc sort data=b510.owen;
by sex;
run;
proc ttest data=b510.owen;
by sex;
paired f_height*m_height;
run;
-------------------------------------------- SEX=1 --------------------------------------------

The TTEST Procedure

5
Difference: F_HEIGHT - M_HEIGHT

N Mean Std Dev Std Err Minimum Maximum


494 14.4352 9.0257 0.4061 -12.0000 56.0000

Mean 95% CL Mean Std Dev 95% CL Std Dev


14.4352 13.6374 15.2331 9.0257 8.4958 9.6266

DF t Value Pr > |t|


493 35.55 <.0001

6
-------------------------------------------- SEX=2 --------------------------------------------

The TTEST Procedure

Difference: F_HEIGHT - M_HEIGHT

N Mean Std Dev Std Err Minimum Maximum


478 14.4079 8.5352 0.3904 -6.0000 52.0000

Mean 95% CL Mean Std Dev 95% CL Std D


14.4079 13.6408 15.1751 8.5352 8.0263 9.1136

DF t Value Pr > |t|


477 36.91 <.0001

/*One-sample t-test to test whether mean of htdiff=0, using Proc ttest*/


proc ttest data=b510.owen;
var htdiff;
run;
The TTEST Procedure

Variable: htdiff

N Mean Std Dev Std Err Minimum Maximum


972 14.4218 8.7834 0.2817 -12.0000 56.0000

Mean 95% CL Mean Std Dev 95% CL Std Dev


14.4218 13.8689 14.9747 8.7834 8.4096 9.1923

DF t Value Pr > |t|


971 51.19 <.0001

/*One-sample t-test to test whether mean of htdiff=15 cm, using Proc ttest*/
proc ttest data=b510.owen h0=15;
var htdiff;
run;
The TTEST Procedure
Variable: htdiff

N Mean Std Dev Std Err Minimum Maximum


972 14.4218 8.7834 0.2817 -12.0000 56.0000

Mean 95% CL Mean Std Dev 95% CL Std Dev


14.4218 13.8689 14.9747 8.7834 8.4096 9.1923

DF t Value Pr > |t|


971 -2.05 0.0404

/*One-sample t-test to test whether mean of htdiff=0, using Proc Univariate*/


proc univariate data=b510.owen plot normal;
var htdiff;
histogram;
run;
The UNIVARIATE Procedure
Variable: htdiff

Moments

N 972 Sum Weights 972


Mean 14.4218107 Sum Observations 14018
Std Deviation 8.78341392 Variance 77.1483601

7
Skewness 0.31703251 Kurtosis 0.56094005
Uncorrected SS 277076 Corrected SS 74911.0576
Coeff Variation 60.9036833 Std Error Mean 0.28172813

Basic Statistical Measures

Location Variability

Mean 14.42181 Std Deviation 8.78341


Median 15.00000 Variance 77.14836
Mode 15.00000 Range 68.00000
Interquartile Range 12.00000

Tests for Location: Mu0=0

Test -Statistic- -----p Value------

Student's t t 51.19052 Pr > |t| <.0001


Sign M 445 Pr >= |M| <.0001
Signed Rank S 219928 Pr >= |S| <.0001

Tests for Normality

Test --Statistic--- -----p Value------

Shapiro-Wilk W 0.989839 Pr < W <0.0001


Kolmogorov-Smirnov D 0.071494 Pr > D <0.0100
Cramer-von Mises W-Sq 0.364574 Pr > W-Sq <0.0050
Anderson-Darling A-Sq 2.035331 Pr > A-Sq <0.0050

Quantiles (Definition 5)

Quantile Estimate
100% Max 56
99% 37
95% 29
90% 25
75% Q3 20
50% Median 15
25% Q1 8
10% 3
5% 0
1% -5
0% Min -12

Extreme Observations

----Lowest---- ----Highest---
Value Obs Value Obs
-12 13 40 839
-7 112 41 305
-7 111 41 459
-6 701 52 879
-6 440 56 125

Missing Values
-----Percent Of-----
Missing Missing
Value Count All Obs Obs
. 34 3.38 100.00

Variable: htdiff

Histogram # Boxplot
57.5+* 1 0

8
.* 1 0
.
.* 4 0
.** 8 0
.******* 34 |
.**************** 76 |
22.5+****************************** 146 +-----+
.*********************************************** 231 *-----*
.******************************************* 213 | + |
.**************************** 140 +-----+
.******************* 92 |
.***** 21 |
.* 4 |
-12.5+* 1 0
----+----+----+----+----+----+----+----+----+--
* may represent up to 5 counts

Normal Probability Plot


57.5+ *
| *
|
| *
| ****
| ******+
| ******+
22.5+ ******+
| *******
| *******
| *******
| ********+
|*******++
|*++
-12.5+*
+----+----+----+----+----+----+----+----+----+----+
-2 -1 0 +1 +2
Fitted Normal Distribution for htdiff

Parameters for Normal Distribution

Parameter Symbol Estimate


Mean Mu 14.42181
Std Dev Sigma 8.783414

Goodness-of-Fit Tests for Normal Distribution

Test ----Statistic----- ------p Value------


Kolmogorov-Smirnov D 0.07149425 Pr > D <0.010
Cramer-von Mises W-Sq 0.36457387 Pr > W-Sq <0.005
Anderson-Darling A-Sq 2.03533100 Pr > A-Sq <0.005

Quantiles for Normal Distribution

------Quantile------
Percent Observed Estimated

1.0 -5.0000 -6.01147


5.0 0.0000 -0.02562
10.0 3.0000 3.16541
25.0 8.0000 8.49749
50.0 15.0000 14.42181
75.0 20.0000 20.34613
90.0 25.0000 25.67821
95.0 29.0000 28.86924
99.0 37.0000 34.85509

9
/*One-sample t-test to test whether mean of htdiff=15, using Proc Univariate*/
proc univariate data=b510.owen mu0=15;
var htdiff;
run;

Partial output from Proc Univariate is shown below:


The UNIVARIATE Procedure

Tests for Location: Mu0=15

Test -Statistic- -----p Value------

Student's t t -2.0523 Pr > |t| 0.0404


Sign M -40 Pr >= |M| 0.0071
Signed Rank S -18300 Pr >= |S| 0.0121

10

You might also like