Why This Assignment Matters: The Foundation of Scientific Knowledge
Research methods aren't just academic exercises; they're the tools scientists use to answer
questions about the world reliably. When you learn to describe the sample and measures, you're
learning how to communicate your work so others can understand, evaluate, and build upon it.
This transparency is what separates scientific knowledge from opinion or casual observation.
Understanding Your Sample: Who Are You Actually Studying?
A sample is simply the group of people (or things) you're actually collecting data from, rather
than everyone you'd ideally want to study. A sample is like a snapshot: just as
you wouldn't judge an entire movie from one frame, you can't make broad claims about all
college students based on studying only engineering majors, or only freshmen, or only students
at elite universities. Describing your sample helps readers understand who your findings actually
apply to.
Why Sample Description Matters
When you describe your sample, you're answering the critical question: "Who can we reasonably
apply these findings to?" If your study about study habits only included students who voluntarily
signed up for academic success workshops, your findings might not apply to struggling students
who avoid such programs. Being transparent about sample characteristics helps readers make
informed decisions about whether your findings are relevant to their situation or other
populations they care about.
Understanding Measures: What Are You Actually Measuring?
A measure is simply the tool or method you use to capture information about something you're
interested in. If you want to study "happiness," you need to define what that means operationally.
When you describe your measures clearly, you're helping readers understand exactly what you
found. If you say "students showed high levels of stress," readers need to know whether you
mean they scored high on a clinical anxiety inventory, or they reported feeling overwhelmed in
interviews, or their cortisol levels were elevated. Each of these measures captures something
different about the concept of "stress."
1. pwcorr (Pairwise Correlation)
What it does: Measures the strength and direction of the linear relationship between two continuous variables.
Key outputs to look at:
• Correlation coefficient (r): Ranges from -1 to +1
o Values close to +1 = strong positive relationship
o Values close to -1 = strong negative relationship
o Values close to 0 = weak/no linear relationship
• P-value: Tests whether the correlation is significantly different from zero
o p < 0.05 = statistically significant correlation
o p ≥ 0.05 = no significant correlation
Example interpretation: "There is a moderate positive and statistically significant correlation
between study hours and exam scores (r = 0.65, p < 0.001), indicating that students who study
more hours tend to score higher on exams."
Interpretation of the example below: "There is a weak negative correlation between age and
self-reported health (r = -0.100, p = 0.1803), indicating that older people in this sample tend
to report poorer health, although this relationship is not statistically significant."
Top value circled in yellow = correlation
Bottom value circled in green = p-value
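A minimal sketch of the command itself, assuming Stata's built-in auto dataset (loaded with sysuse) rather than the assignment data; the variables mpg and weight are stand-ins for whichever two continuous variables you are correlating:

```stata
* Load Stata's built-in example dataset (an assumption for illustration only)
sysuse auto, clear

* Pairwise correlation between two continuous variables;
* the sig option prints the p-value beneath each correlation coefficient
pwcorr mpg weight, sig
```

In the resulting table, the top number in each cell is the correlation coefficient (r) and the number beneath it is the p-value, matching the yellow and green circles described above.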
2. tab chi2 (Chi-square test)
What it does: Tests if two categorical variables are independent of each other.
Key outputs to look at:
• Chi-square statistic: Measures how much observed frequencies differ from expected
frequencies
• P-value: Tests the null hypothesis that the two variables are independent
o p < 0.05 = variables are significantly associated
o p ≥ 0.05 = no significant association
• Cross-tabulation table: Shows the actual counts in each category combination
Example interpretation: "There is a significant association between gender and preferred major
(χ² = 12.45, p = 0.002), suggesting that men and women differ in their choice of academic
major."
Interpretation of the example below: "There is no significant association between where
students prefer to study and whether they prefer to study on their own or in groups (χ² = 4.33,
p = 0.228)."
Yellow = chi-square statistic (you can include it in your interpretation, as in the example
above, but you don't have to)
Green = p-value
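As a hedged sketch of how this test is requested, again assuming Stata's built-in auto dataset instead of the assignment data: chi2 is an option added to the tab (tabulate) command, and foreign and rep78 are stand-ins for your two categorical variables. The expected option is optional and simply adds the expected frequencies to each cell.

```stata
* Load Stata's built-in example dataset (an assumption for illustration only)
sysuse auto, clear

* Cross-tabulate two categorical variables and request the Pearson chi-square test;
* expected also displays the expected frequency under independence in each cell
tabulate foreign rep78, chi2 expected
```

The chi-square statistic and its p-value appear on the line beginning "Pearson chi2" beneath the cross-tabulation table; the p-value is the number after "Pr =".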
3. ttest (t-test)
What it does: Compares means between groups or tests if a mean differs from a specific value.
Key outputs to look at:
• Mean difference: How much the groups differ on average
• T-statistic: Standardized measure of the difference
• P-value: Tests if the difference is statistically significant
o p < 0.05 = significant difference between groups
o p ≥ 0.05 = no significant difference
The key is always to report both the statistical significance (p-value) and the practical
significance (size of the effect) when interpreting results.
Example interpretation: "Men scored significantly higher on the math test than women (men:
M = 78.2, women: M = 74.1, p = 0.021), with a mean difference of 4.1 points."
Interpretation of the example below: "Although those who prefer to study alone report a health
score 0.328 points higher than those who prefer to study in groups, this difference is not
statistically significant (M1 = 2.744, M2 = 2.417, p = 0.129)." (Side note: I know the example
doesn't make much sense because it's silly to link health to study preference, but it's just an
example, so don't overthink it.)
Yellow = mean values
Purple = standard deviation
Green = p-value
Blue = t-statistic (you don’t need to include it in the table or in your interpretation. I just
wanted to point it out)
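A minimal sketch of both uses described under "What it does," assuming Stata's built-in auto dataset rather than the assignment data; mpg, foreign, and the comparison value 20 are illustrative stand-ins:

```stata
* Load Stata's built-in example dataset (an assumption for illustration only)
sysuse auto, clear

* Two-sample t-test: compare the mean of mpg between the two groups of foreign
ttest mpg, by(foreign)

* One-sample t-test: check whether the mean of mpg differs from a specific value
ttest mpg == 20
```

In the two-sample output, the group means and standard deviations are in the table at the top, the t-statistic is shown beneath it, and the two-sided p-value is the one listed under "Ha: diff != 0".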