Types of Variables
PubH 6450: Biostatistics I
Measurement Scales
• Categorical
- Nominal
  • Binary
- Ordinal
• Numerical
- Continuous
  • Interval
  • Ratio
- Discrete
Scales, Types and Methods
• The measurement scale determines the
variable type.
• The variable type determines the
appropriate analysis methods.
Categorical Scales
• Nominal
- Examples: gender, race, blood type
• Ordinal
- Examples: Apgar scores, tumor stage,
social class
• Binary
- Examples: disease status, diagnostic
test result
Categorical Data: Methods
• Summary Statistics
- Counts and proportions
• Plots
- Bar graphs or pie charts
• Statistical Analyses
- Confidence intervals for a proportion
- Confidence intervals for relative risks or
odds ratios
- Chi-square test or Fisher’s exact test
- Multiple logistic regression
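As a minimal sketch of the summary statistics and the first analysis listed above, the Python below tabulates counts and proportions for a categorical variable and computes a normal-approximation (Wald) 95% confidence interval for one proportion. The blood-type values are made up for illustration and the function name `wald_ci` is hypothetical. The Wald interval is one common choice, not necessarily the method taught in this course.

```python
from collections import Counter
import math

# Hypothetical sample of blood types (illustrative data only)
blood_types = ["O", "A", "A", "O", "B", "O", "AB", "A", "O", "O"]

# Summary statistics for categorical data: counts and proportions
counts = Counter(blood_types)
n = len(blood_types)
proportions = {level: count / n for level, count in counts.items()}


def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because the approximation can overshoot
    return max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)


low, high = wald_ci(counts["O"], n)
print(dict(counts))
print(proportions)
print(f"Type O: {proportions['O']:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With a sample this small the Wald interval is wide and only approximate, which is one reason exact methods such as Fisher's exact test appear in the list above.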
Numerical Scales
• Continuous
- Examples: blood pressure, temperature,
age, weight, height, etc.
• Discrete
- Examples: number of children in a family,
number of births in a year, number of
accidents in a month, etc.
Continuous Scales
• Interval scales
- Examples: temperature (F or C)
• Ratio scales
- Examples: height, weight, age,
temperature (K)
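The difference between interval and ratio scales can be illustrated with a short calculation. In this sketch the two temperature readings are hypothetical. Differences agree between Celsius and kelvin, but a ratio is only interpretable on the kelvin scale, which has a true zero.

```python
def celsius_to_kelvin(temp_c):
    """Convert a Celsius temperature to kelvin."""
    return temp_c + 273.15


temp_a_c, temp_b_c = 20.0, 10.0  # hypothetical readings in degrees Celsius

# Differences are meaningful on an interval scale and agree across C and K.
diff_c = temp_a_c - temp_b_c
diff_k = celsius_to_kelvin(temp_a_c) - celsius_to_kelvin(temp_b_c)

# Ratios are only meaningful on a ratio scale (true zero), such as kelvin.
ratio_c = temp_a_c / temp_b_c  # arithmetic gives 2.0, but 20 C is not "twice as hot"
ratio_k = celsius_to_kelvin(temp_a_c) / celsius_to_kelvin(temp_b_c)

print(f"Difference: {diff_c:.1f} C, {diff_k:.1f} K")
print(f"Ratio: {ratio_c:.3f} using C, {ratio_k:.3f} using K")
```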
Continuous Data: Methods
• Summary statistics
- Mean, SD, median, IQR, range, quantiles or
percentiles
• Plots
- Histograms, scatter plots, boxplots, dot plots
• Analyses
- Confidence intervals for a mean
- Confidence intervals and t-tests for two means
- Correlation and simple linear regression
- Multiple linear regression
- ANOVA
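The summary statistics listed above can be computed with Python's standard `statistics` module, as in the sketch below. The blood pressure readings are invented for illustration. Quartiles depend on the interpolation method, so other software may report slightly different values; this sketch uses the module's default method.

```python
import statistics

# Hypothetical systolic blood pressure readings in mmHg (illustrative only)
sbp = [118, 122, 125, 130, 131, 135, 140, 142, 150, 165]

mean = statistics.mean(sbp)
sd = statistics.stdev(sbp)                    # sample SD (n - 1 denominator)
median = statistics.median(sbp)
q1, q2, q3 = statistics.quantiles(sbp, n=4)   # quartiles, default method
iqr = q3 - q1
data_range = max(sbp) - min(sbp)

print(f"Mean = {mean:.1f}, SD = {sd:.1f}")
print(f"Median = {median:.1f}, IQR = {iqr:.2f} (Q1 = {q1:.2f}, Q3 = {q3:.2f})")
print(f"Range = {data_range}")
```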
Time-to-Event Data: Methods (PubH 6451)
• Summary statistics
- Median survival time
- Five-year survival
• Plots
- Kaplan-Meier curves
• Analyses
- Confidence intervals for survival
- Log-rank tests and confidence intervals to
compare survival curves
- Proportional hazards regression
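To make the Kaplan-Meier curve and median survival time concrete, here is a small pure-Python sketch of the product-limit estimator. The follow-up times and event indicators are hypothetical, the function name `kaplan_meier` is illustrative, and in practice a statistical package would be used and would also supply confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects still under observation just before time t
        at_risk = sum(1 for ti in times if ti >= t)
        # Events (not censorings) occurring exactly at time t
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months (illustrative data only)
times = [2, 3, 3, 5, 6, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} months: S(t) = {s:.3f}")

# Median survival: first time the estimated survival drops to 0.5 or below
median_survival = next((t for t, s in curve if s <= 0.5), None)
print("Median survival time:", median_survival)
```

Censored subjects contribute to the number at risk until they drop out but never count as events, which is what distinguishes this estimate from a simple proportion.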
Count Data: Methods (PubH 6451)
• Summary Statistics
- Counts, rates
• Plots
- Trend plots, histograms
• Statistical Analyses
- Confidence intervals for a rate
- Poisson regression
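A rate and an approximate confidence interval for it can be sketched as follows. The accident count and person-time are hypothetical, and the function name `rate_with_ci` is illustrative. The interval uses a normal approximation on the log scale under a Poisson assumption, which is one of several available methods.

```python
import math


def rate_with_ci(events, person_time, z=1.96):
    """Event rate and an approximate CI computed on the log scale."""
    rate = events / person_time
    # Approximate standard error of log(rate) for a Poisson count
    se_log = 1 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return rate, lower, upper


# Hypothetical: 25 accidents observed over 500 person-months
rate, lower, upper = rate_with_ci(events=25, person_time=500)
print(f"Rate = {rate:.3f} per person-month "
      f"(95% CI {lower:.3f} to {upper:.3f})")
```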
Relationships
• Numerical data can be
converted to categorical data.
• Categorical data can be coded
numerically.
Types of Variables: Example
From the LHS dataset:
• race: race of participant
- 1=White
- 2=Black/African American
- 3=Asian
- 4=American Indian/Native American
- 5=Other
- 6=Refuse to answer
• f31pipe: smoke a pipe at start of study (1 = yes, 2 = no)
• alphagroup: treatment group
- SI-A = Smoking Intervention plus Atrovent
- SI-P = Smoking Intervention plus Placebo
- UC = Usual Care (no active intervention)
• COT0: salivary cotinine levels at baseline, in ng/ml
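The race variable above is stored as numbers, but the codes are only labels for a nominal variable. The sketch below, which uses hypothetical code values rather than real LHS records, maps the codes back to their labels and summarizes them with counts. Arithmetic on the codes themselves, such as a mean, would run without error but has no interpretation.

```python
from collections import Counter

# Code book for the race variable, as listed above
RACE_LABELS = {
    1: "White",
    2: "Black/African American",
    3: "Asian",
    4: "American Indian/Native American",
    5: "Other",
    6: "Refuse to answer",
}

race_codes = [1, 1, 2, 1, 3, 2, 1, 6]  # hypothetical values, not real LHS records

# Treat the variable as categorical: decode, then count
labels = [RACE_LABELS[code] for code in race_codes]
counts = Counter(labels)

for label, count in counts.most_common():
    print(f"{label}: {count} ({count / len(labels):.0%})")

# The mean of the codes is computable but meaningless for a nominal variable
meaningless_mean = sum(race_codes) / len(race_codes)
```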
Types of Variables: Example
Smoking categories by salivary cotinine level:
• <1 ng/ml: nonsmoker
• 1-4 ng/ml: exposed to secondhand smoke
• >4 ng/ml: smoker
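Converting a numerical variable to a categorical one, as with the cotinine cutpoints above, can be sketched in a few lines. The function name `smoking_category` and the sample values are hypothetical, and the sketch assumes that values of exactly 1 and exactly 4 ng/ml fall in the middle category.

```python
def smoking_category(cotinine_ng_ml):
    """Classify a salivary cotinine level (ng/ml) using the cutpoints above.

    Assumption: the boundaries 1 and 4 ng/ml belong to the middle category.
    """
    if cotinine_ng_ml < 1:
        return "nonsmoker"
    if cotinine_ng_ml <= 4:
        return "secondhand smoke exposure"
    return "smoker"


# Hypothetical baseline cotinine values in ng/ml (illustrative only)
cot0_values = [0.4, 1.0, 3.2, 4.0, 250.0]
categories = [smoking_category(value) for value in cot0_values]

for value, category in zip(cot0_values, categories):
    print(f"{value:>6.1f} ng/ml -> {category}")
```

The resulting variable is ordinal, so it would be summarized with counts and proportions rather than a mean.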
Key concepts
Categorical: nominal, ordinal, binary
Numerical: discrete (count), continuous (ratio & interval)
Summary statistics
Plots
Analyses