Digital Sat Cognitive Lab Report
The Cognitively Complex Thinking Required by Select Digital SAT® Suite Questions
College Board
January 2024
Suggested Citation:
College Board. 2024. The Cognitively Complex Thinking Required by Select Digital
SAT Suite Questions. New York: College Board.
© 2024 College Board. College Board, SAT, and the acorn logo are registered trademarks of College Board. PSAT/NMSQT is
a registered trademark of College Board and National Merit Scholarship Corporation. PSAT is a trademark of College Board.
Desmos and related trademarks are property of Desmos Studio PBC.
Contents
Executive Summary
Section 1: Introduction
  Structure of This Report
Section 3: Methodology
  Test Question Selection
  Question-Level Construct Definition
  Protocol Development
  Sample Recruitment, Selection, and Characteristics
    Sample Recruitment and Selection
    Sample Characteristics
  Cognitive Interviews
  Coding and Analysis
    Coding
    Analysis
Section 4: Results
  Reading and Writing
    Craft and Structure
    Information and Ideas
    Expression of Ideas
  Math
    Algebra
    Advanced Math
    Problem-Solving and Data Analysis
    Geometry and Trigonometry
Tables
Table 1. Digital SAT Suite Reading and Writing and Math Questions Studied
Table 2. Cognitive Interview Participants by Cohort Year
Table 3. Cognitive Interview Participants by Gender
Table 4. Cognitive Interview Participants by Race/Ethnicity
Table 5. Cognitive Interview Participants by First Language(s) Learned
Table 6. Cognitive Interview Participants by Best Language
Table 7. Cognitive Interview Participants by Digital SAT Suite Test Previously Taken
Table 8. Cognitive Interview Participants: Reading and Writing - Prior Mean Section Scores
Table 9. Cognitive Interview Participants: Math - Prior Mean Section Scores
Table 10. Cognitive Interview Participants: Reading and Writing - Prior Achievement by Performance Score Band (PSB)
Table 11. Cognitive Interview Participants: Math - Prior Achievement by Performance Score Band (PSB)
Table 12. Student Performance on Reading and Writing: Craft and Structure - Words in Context Questions
Table 13. Student Performance on Reading and Writing: Craft and Structure - Text Structure and Purpose Questions
Table 14. Student Performance on Reading and Writing: Craft and Structure - Cross-Text Connections Questions
Table 15. Student Performance on Reading and Writing: Information and Ideas - Central Ideas and Details Questions
Table 16. Student Performance on Reading and Writing: Information and Ideas - Command of Evidence: Textual Questions
Table 17. Student Performance on Reading and Writing: Information and Ideas - Command of Evidence: Quantitative Questions
Table 18. Student Performance on Reading and Writing: Information and Ideas - Inferences Questions
Table 19. Student Performance on Reading and Writing: Expression of Ideas - Rhetorical Synthesis Questions
Table 20. Student Performance on Reading and Writing: Expression of Ideas - Transitions Questions
Table 21. Student Performance on Math: Algebra - Linear Functions: Interpret Question
Table 22. Student Performance on Math: Algebra - Linear Functions/Inequalities in One Variable: Create and Use Questions
Qualitatively, each student’s response to each test question was coded against
a set of required (Reading and Writing) or expected (Math) behaviors. These
behaviors, predefined by the College Board research team, described the aspects
of cognitively complex thinking various question types are intended to elicit.
For each question, the researchers judged whether each student participant had
demonstrated each of these behaviors and coded the response accordingly. Candidate
vignettes, in which students exhibited these behaviors and, in the process,
demonstrated exemplary (if not necessarily perfect) thinking through a given
question, were also identified during the coding stage.
All examined Reading and Writing questions and the vast majority (85 percent)
of examined Math questions performed as intended, with differentials from 0
to 5. Two Math questions had differentials greater than 5, but the qualitative
evidence suggests that students were still exhibiting aspects of cognitively
complex mathematical reasoning. A third Math question was answered correctly
by no student, so although it technically had a differential of 0, it was considered
an outlier. Vignettes of student performance associated with each of the forty
questions supply additional evidence that the questions elicited cognitively
complex thinking from student participants.
The key finding of this study is strong confirmation of the hypothesis that the
digital SAT Suite assessments are capable of eliciting cognitively complex thinking
from student test takers. This is important because, first, a large body of evidence
supports the conclusion that students need to be able to engage in such thinking
to be college and career ready (i.e., prepared to succeed in college or workforce
training programs without remediation) and, second, because the U.S. Department
of Education requires states using the digital-suite tests (or other off-the-shelf
large-scale standardized assessments) as part of their education accountability
systems to supply evidence that the tests are capable of eliciting such thinking.
Based on the findings reported here, policymakers should have high confidence
that the tests of the digital SAT Suite of Assessments satisfy these criteria. In
addition, the results and the methodology laid out in this report may be useful
to researchers interested in evaluating the cognitive demands of large-scale
standardized assessments.
This report presents the results of a 2023 study conducted by College Board,
with the assistance of vendor Vidlet, Inc., to ascertain whether select test
questions of the digital SAT Suite, which comprises the SAT, PSAT/NMSQT®,
PSAT™ 10, and PSAT™ 8/9 college and career readiness assessments, are capable
of eliciting cognitively complex thinking from student test takers. A positive
finding would be important because it would offer evidence that the digital
SAT Suite tests (1) measure important college and career readiness prerequisites,
(2) are appropriate for use as part of state educational accountability systems,
and (3) conform to College Board’s own claims for their tests, as laid out in
specifications documentation (College Board 2023a).
The principal mechanism of this study, which closely follows the approach used
in an earlier project involving the paper-based SAT Suite (College Board and
HumRRO 2020), is the use of cognitive interviews with a sample of high school
juniors and seniors. During these interviews, which were prepared jointly by
College Board and Vidlet and conducted by trained Vidlet staff, participants
were asked to think aloud—that is, verbalize any and all of their thoughts—as
they worked through either a set of twenty digital SAT Suite Reading and Writing
questions or a set of twenty digital SAT Suite Math questions. Twenty-six students
Two related forms of analysis of the coded data are presented in this report. On
the quantitative side, a derived statistic called the differential was calculated as
a measure of the extent to which each sampled digital SAT Suite test question
performed as expected. For a given test question, the differential was found using
the formula D = C - A, where D is the differential, C is the number of participants
answering a given question correctly, and A is the number of participants who both
(1) answered correctly and (2) demonstrated all required (Reading and Writing) or at
least one expected behavior (Math). The differential thus represents the arithmetic
difference between the total number of participants answering a given question
correctly and the number of those participants who also enacted the question
type’s construct by demonstrating particular behaviors.
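The differential calculation can be illustrated with a minimal Python sketch. The `CodedResponse` structure, the section labels, and the sample data are assumptions introduced here for illustration; the report doesn't describe how the coded data were stored or processed.

```python
from dataclasses import dataclass


@dataclass
class CodedResponse:
    """One participant's coded response to one question (hypothetical structure)."""
    correct: bool
    behaviors_shown: int    # behaviors the coders judged as demonstrated
    behaviors_defined: int  # behaviors predefined for the question type


def enacted_construct(resp: CodedResponse, section: str) -> bool:
    """Reading and Writing requires all behaviors; Math requires at least one."""
    if section == "reading_writing":
        return resp.behaviors_shown == resp.behaviors_defined
    if section == "math":
        return resp.behaviors_shown >= 1
    raise ValueError(f"unknown section: {section}")


def differential(responses: list[CodedResponse], section: str) -> int:
    """Return D = C - A for a single question."""
    c = sum(1 for r in responses if r.correct)
    a = sum(1 for r in responses if r.correct and enacted_construct(r, section))
    return c - a


# Illustrative data only, not taken from the study.
sample = [
    CodedResponse(correct=True, behaviors_shown=3, behaviors_defined=3),
    CodedResponse(correct=True, behaviors_shown=3, behaviors_defined=3),
    CodedResponse(correct=True, behaviors_shown=1, behaviors_defined=3),
    CodedResponse(correct=False, behaviors_shown=2, behaviors_defined=3),
]
print(differential(sample, "reading_writing"))  # C = 3, A = 2
print(differential(sample, "math"))             # C = 3, A = 3
```

Under these placeholder responses, the same question yields a differential of 1 when scored by the Reading and Writing rule and 0 when scored by the Math rule, because the third participant demonstrated only some of the defined behaviors.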
The results in both Reading and Writing and in Math offer strong evidence that the
sampled digital SAT Suite test questions, which are broadly representative of the
tests’ designs, are capable of eliciting cognitively complex thinking from students.
Zero to low differentials were associated with all questions in Reading and Writing
and with the vast majority (85 percent) of questions in Math, and vignettes were
found and are presented in this report to exemplify this thinking. The behavior of
the three nonconforming Math questions—two with differentials above 5 and one
with a 0 differential but no participants answering correctly—is also analyzed and
presented in Section 5: Discussion, the conclusion being that these questions still
elicited aspects of cognitively complex thinking.
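The sorting of questions by differential described above can be sketched as follows. The function name, the category labels, and the placeholder question data are assumptions made for illustration; only the cutoff of 5, the zero-correct outlier case, and the 17-of-20 Math result come from the report.

```python
def classify_question(num_correct: int, differential: int) -> str:
    """Label a question using the report's differential cutoff of 5."""
    if num_correct == 0:
        # No participant answered correctly, so D = 0 is uninformative.
        return "outlier"
    if differential <= 5:
        return "as intended"
    return "above threshold"


# Placeholder (num_correct, differential) pairs, NOT study data.
# They only reproduce the reported pattern for the twenty Math questions:
# seventeen conforming, two above the cutoff, one with no correct answers.
math_questions = [(15, 1)] * 17 + [(14, 7), (12, 6), (0, 0)]

labels = [classify_question(c, d) for c, d in math_questions]
share_as_intended = 100 * labels.count("as intended") / len(labels)
print(share_as_intended)  # 85.0
```

With three nonconforming questions out of twenty, seventeen remain, which matches the 85 percent figure reported for Math.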
Education has, in fact, been one of the more fertile areas for verbal protocol
studies in recent years. The appeal of the methodology to this field is intuitively
obvious. Researchers, teachers, curriculum specialists, and other stakeholders are
committed to developing and implementing instructional methods and materials
that promote student learning, but such learning takes place, often silently and
unobserved, in students’ heads. Without some sense of how students themselves
are engaging (or not engaging) with these methods and materials, we can’t fully or
fairly account for the success or failure of these interventions.
One foundational verbal protocol study in the education field was that of Pressley
and Afflerbach (1995), who used and refined the approach in an effort to create a
model of conscious mental processes enacted during reading. A particular area
of focus for many literacy-related verbal protocol studies has been distinguishing
the behaviors of more and less successful readers. For example, Kletzien (1991)
Verbal protocol analysis has also been used successfully to explore participants’
thought processes as they engage in math tasks. For instance, Goos and
Galbraith (1996) used the methodology to determine that two high school
seniors collaborating on a series of problems in an applied math course exhibited
“differing, but complementary, metacognitive strengths” (255), which typically
aided in their joint problem-solving. Montague and Applegate (1993) analyzed the
verbal protocols from eighty-one middle school students, roughly a third of whom
were selected randomly from pools of learning disabled, average-achieving, and
gifted students in a large southeastern metropolitan district. The researchers
found that when presented with a range of problems in math, students identified as
gifted were more strategic in their solving approaches than students in the other
two achievement groups; that perceived difficulty of math problems seemed to
affect students’ perseverance and cognition; and that “students with LD [learning
disabilities] approach[ed] problem solving in a qualitatively different manner than
their more proficient peers” (29). Özcan, Imamoğlu, and Katmer Bayraklı (2017) also
used verbal protocol analysis to examine students’ approaches to math problem-
solving, in this case involving sixty-nine sixth graders sampled across achievement
levels. Among their findings, the researchers determined that those students who
employed an incorrect process in solving a nonroutine math problem “mostly [did]
operations aimlessly” and approached the word problem superficially (139–140).
Though obviously not exhaustive, the above overview of verbal protocol studies
in literacy and math education establishes that the methodology has been used
to examine a broad range of cognitive activities in an array of fields. Moreover, in
educational research, this approach has been used successfully in both literacy
and math (as well as in other subject areas) with numerous categories of students,
including younger and older students, higher- and lower-achieving students,
English learners and native speakers, and students who are neurodivergent as well
as students who aren’t.
One of the earliest and most influential critiques of verbal protocols as data
came from Nisbett and Wilson (1977). Drawing from then-burgeoning critiques
of introspection-based research methods, the authors posited three major
conclusions (233):
Rather than outright rejecting these concerns, Ericsson and Simon (1993)
countered with a simple mental processing model that differentiates between
Some researchers, however, have made a case for a hybridized approach, one
that makes use of both concurrent and retrospective dimensions. Johnstone,
Bottsford-Miller, and Thompson (2006) advocated for such a blended approach,
contending that it counterbalanced both the propensity of think-aloud
verbalizations to be “incoherent” (2) and that of interviews to elicit potentially
inaccurate retrospective explanations of behaviors already encoded into long-
term memory.
While noting several concerns about the use of data requiring participants to
retrieve information from long-term memory, Taylor and Dionne (2000) advocate
for the value of retrospective debriefing (RD) in tandem with concurrent verbal
protocols (CVP), which they found obtained “a richer account of problem-solving
When problem solvers are requested to think aloud while solving a problem
(CVP), and then to describe how they solved the problem (RD), CVP data can
be used to provide data-based cues to guide the collection of RD data on a
specific problem-solving event. . . . In turn, convergent information about the
same event contained in the broader spectrum of RD data can be used by
researchers to elaborate CVP data, which tend to focus on the control of the
problem-solving process. . . . Equally important are instances in which CVP
and RD data diverge. These divergent reports offer opportunities for critical
examination and clarification of both the problem solver’s knowledge and the
CVP and RD methodologies. As a result of using the two methodologies as
complementary data sources, the richness of data available on a particular
event is enhanced. (417)
Twenty Reading and Writing questions and twenty Math questions were ultimately
selected for study. These questions were drawn from actual digital SAT Suite item
pools rather than developed specifically for this study and were intended to be
representative of questions students might encounter on test day. The individual
test questions had previously undergone rigorous internal quality control checks
to ensure their content soundness (accuracy) as well as their appropriateness
for use with secondary-level students in a large-scale, high-stakes standardized
assessment of their college and career readiness. Because this study was
conducted prior to the domestic launch of the digital SAT Suite in the 2023–2024
academic year, some Reading and Writing test questions hadn’t been previously
pretested; in those cases, College Board test developers provisionally assigned
them to difficulty levels (i.e., performance score bands, defined below) based on
expert judgment. (This limitation is further discussed in Section 6: Implications.)
Collectively, the Reading and Writing and Math question sets were intended to
represent a wide range of skill/knowledge testing points, subject areas, question
difficulty levels, text complexities, and question formats consistent with the
tests’ designs, although, as discussed below, especially low-difficulty questions
Table 1 breaks down the characteristics of the digital SAT Suite questions included
in the study. Each question has several characteristics:
Note the different organization of Reading and Writing and Math sections above: while the study's Reading and Writing questions were grouped
by content domain, its Math questions were grouped by difficulty level. These two approaches broadly reflect the presentation order in which
actual test takers would be administered the questions in the two sections.
As a group, the twenty sampled Reading and Writing questions represented three
of the section’s four content domains (with Standard English Conventions being
excluded), all major skill/knowledge categories within those three domains, all
four subject areas sampled in the section, and a range of difficulty from 3 (easy)
to 7 (hard). As a group, the Math questions represented all four of the section’s
content domains, many skill/knowledge categories within those domains, all three
subject areas sampled in the section as well as questions set outside of context,
a range of difficulty from 3 (easy) to 7 (hard), and both question formats used in
the section (multiple-choice and student-produced response). Comparatively, the
Math questions skewed harder, on average, than did their Reading and Writing
counterparts, a circumstance discussed in Section 6: Implications.
Protocol Development
The lead author of this study next developed closely parallel Reading and Writing
and Math protocols for conducting the cognitive interviews in which students
would participate (see exhibit 4 in the appendix for a sample). These protocols
were designed as guides for the interviewers conducting sessions with students.
The guides included instructions for conducting the sessions, scripts for
The main goal of the College Board–developed recruitment approach for this
study was to select student samples for both the Reading and Writing and Math
cognitive interview activities that closely mimicked the typical digital SAT Suite
test-taking population.
College Board staff developed the recruitment approach and materials used in the
study. The recruitment method was a direct email (see exhibit 1 in the appendix) to
eleventh- and twelfth-grade students who had previously taken the PSAT/NMSQT /
PSAT 10 or SAT tests and had elected to receive emails from College Board. It was
important that participating students had prior College Board test scores, as a key
sample selection criterion was ensuring, to the extent possible, that the study’s
samples reflected the widest possible span of achievement on the tests and
behaved like a typical digital SAT Suite test-taking population. Prior PSAT/NMSQT /
PSAT 10 scores could be used as good proxies for SAT achievement levels, given
that the tests are on the same vertical scale. (See College Board 2023a, section
2.2.8.2, for more details on the digital SAT Suite’s vertical scale.)
The recruitment email described the study and its eligibility requirements,
including a willingness and ability to participate in a roughly ninety-minute virtual
session with an interviewer. A link in the email navigated interested students to
an online form (see exhibit 3) that provided more information about the study,
including the purpose for conducting the study, an indication of the voluntary
nature of participation, and a description of the activity students would be asked
to participate in. They were also informed about the study’s incentive, which was a
$150 gift card to be delivered on successful completion of the activity. (As noted in
the headers for exhibit 1 and exhibit 2, College Board and Vidlet jointly determined
before the interviews were conducted that the $100 compensation referred to in
the recruitment materials should be increased to $150 to better reflect the time
and effort required to successfully complete the activity.)
Upon receipt of the list of these eligible participants, Vidlet staff contacted the
sampled students through targeted emails. After communication was established
and participation confirmed, all eligible students were randomly assigned to
participate in either the Reading and Writing or Math cognitive interviews. After
initial confirmation, Vidlet requested and collected documentation from students,
including a consent form (see exhibit 3) acknowledging voluntary participation
in the study, and confirmation of the interview time, date, and location. Vidlet
obtained parent or guardian consent for students under eighteen years of age.
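The report doesn't specify the procedure used for random assignment. One simple way such an assignment could be carried out is a seeded shuffle followed by a split, sketched below; the function name, the student identifiers, and the even-split rule are all assumptions.

```python
import random


def assign_sections(student_ids, seed=None):
    """Shuffle the students and split them into two interview groups.

    This is an illustrative procedure, not the one documented in the study.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return {
        "Reading and Writing": shuffled[:half],
        "Math": shuffled[half:],
    }


# Hypothetical identifiers for 53 accepted students.
ids = [f"S{i:02d}" for i in range(53)]
groups = assign_sections(ids, seed=2023)
print({section: len(members) for section, members in groups.items()})
```

An even split like this one would not by itself produce the final group sizes of twenty-six and twenty-three, which also reflect the four accepted students who did not ultimately participate.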
In total, fifty thousand students who had previously taken the SAT, PSAT/NMSQT,
or PSAT 10 and who had opted in to receive College Board communications
were emailed about the study opportunity. Of those contacted, 198 (0.4 percent)
completed the application process, 53 were accepted into the study, and 49
ultimately participated.
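The recruitment figures above can be checked with a short calculation. The stage names and the helper function are illustrative; the counts are those given in the paragraph.

```python
# Counts reported in the text for each recruitment stage.
funnel = [
    ("emailed", 50_000),
    ("applied", 198),
    ("accepted", 53),
    ("participated", 49),
]


def stage_rates(stages):
    """Percentage of each stage's count retained at the following stage."""
    rates = {}
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        rates[f"{prev_name} -> {name}"] = round(100 * n / prev_n, 1)
    return rates


rates = stage_rates(funnel)
print(rates["emailed -> applied"])        # 0.4, matching the reported figure
print(rates["applied -> accepted"])       # share of applicants accepted
print(rates["accepted -> participated"])  # share of accepted who took part
```

The first rate reproduces the 0.4 percent application rate stated in the text; the other two rates are derived here and aren't stated in the report.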
SAMPLE CHARACTERISTICS
This subsection presents demographic and score information for the twenty-
six Reading and Writing and twenty-three Math cognitive interview participants.
Demographic information is presented separately for the participants in each
sample group (Reading and Writing; Math) as well as for all participants.
The forty-nine participants in the study were nearly evenly split in terms of the
cohort year from which they came. Table 2 shows that twenty-four participants
(thirteen in Reading and Writing and eleven in Math) were from the class of 2023
(i.e., were high school seniors at the time of the study), while the remaining twenty-
five (thirteen in Reading and Writing and twelve in Math) were from the class of
2024 (i.e., were current high school juniors).
Table 3 shows that when it comes to gender, slightly more participants identifying
as female (54 percent) than as male (46 percent) took part in the Reading and Writing
cognitive interviews, while slightly more participants identifying as male (52 percent)
than as female (48 percent) took part in the Math interviews (as noted
* Source: College Board 2022, p. 4.
Table 4 shows that compared to all SAT test takers nationwide, the percentages of
Reading and Writing study participants identifying as Asian and White were higher,
while that of participants identifying as two or more races/ethnicities was the same
and those of participants identifying as Black/African American and Hispanic/
Latino were lower. As for the Math participants, the percentages identifying as
Asian, White, and from two or more races were lower than those in the national
population, while the percentage identifying as Hispanic/Latino was greater than
that of the national population. In both samples, the percentages identifying as
American Indian/Alaska Native and Native Hawaiian/Other Pacific Islander were
similar to the national percentages. In both groups, the percentage of participants
opting not to report their race/ethnicity was greater than the corresponding
national percentage, and proportionally more Math participants than Reading and
Writing participants chose not to report.
* Source: College Board 2022, p. 4.
Participants were also asked to report their current best language(s) (table 6). Over
three-quarters of Reading and Writing participants (77 percent) and over half of
Math participants (57 percent) reported English only. Twelve percent of Reading
and Writing participants and 30 percent of Math participants said English and
another language were their best languages. No participants said a language other
than English was their best. Twelve percent of the Reading and Writing participants
and 13 percent of the Math participants chose not to respond. (National
percentages aren’t available for comparison on this dimension.)
Sampling for the cognitive interview study was based in part on the previous SAT,
PSAT/NMSQT, or PSAT 10 scores of those selected, with the goal of having the
cognitive interview samples reflect a wide range of achievement as measured
by the SAT and these PSAT-related assessments. Table 7 shows that for both
Table 8 and table 9 show, respectively, the means, standard deviations, minima,
and maxima of the paper-and-pencil Evidence-Based Reading and Writing section
scores of the Reading and Writing cognitive interview participants and of the
paper-and-pencil Math section scores of the Math cognitive interview participants.
The tables also show national mean scores. All SAT section scores, whether
paper-based or digital, are on a scale from 200 to 800, while all PSAT/NMSQT and
PSAT 10 section scores, regardless of mode, are on a scale from 160 to 760. (For
details on how the performance score bands were determined, see College Board
2023b.)
* Sources: College Board 2022, p. 9 (PSAT/NMSQT / PSAT 10 mean); College Board 2022, p. 6 (SAT mean)
** As discussed below, one participant with a prior SAT Math score of 200 (the section minimum) was
mistakenly included in the sample. The score of the second-lowest-performing student is reported here
as the “true” minimum.
Participants as a group in the Reading and Writing cognitive interviews were higher
performing than was the class of 2022, which took the paper-based versions of
the SAT Suite assessments. The mean PSAT/NMSQT / PSAT 10 paper-and-pencil
Evidence-Based Reading and Writing section score of the fifteen students who
reported such scores was 523, compared to the national mean PSAT/NMSQT / PSAT 10
section score of 481, and the minimum PSAT/NMSQT / PSAT 10 section score
reported was 360. (Minimum national section scores for either the SAT or the
Note that a student with a past SAT Math section score of 200 (the minimum)
was included in the sample. This score most likely indicates that the student,
for some reason, did not attempt the Math section when taking the SAT.
This student’s inclusion in the sample was an oversight, and future College Board
studies in this vein will correct for that error. Table 9 thus reports the second-
lowest SAT Math section score (370) as the “true” minimum.
Also notable is that the Math participants’ prior PSAT/NMSQT / PSAT 10 scores
span a much smaller range than their prior SAT scores. Excluding
the participant whose prior SAT Math section score was 200, the nine participants
with previous paper-based SAT scores had Math section scores spanning a broad
expanse of the score range (370–730). The fifteen participants with prior paper-
based PSAT/NMSQT / PSAT 10 scores, despite having a mean nearly equal to the
national mean, spanned only thirteen scale score values: 410–530. This indicates
that the Math sample tended to overrepresent average section scores and
underrepresent higher- and lower-achieving students.
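The count of thirteen scale score values can be reproduced with a brief sketch. It assumes that section scores are reported in 10-point increments, which is consistent with the count given above; the helper name is illustrative.

```python
def scale_values(low: int, high: int, step: int = 10) -> list[int]:
    """Enumerate reportable scores from low to high, assuming 10-point steps."""
    return list(range(low, high + 1, step))


psat_sample = scale_values(410, 530)  # Math participants' prior PSAT-related range
sat_sample = scale_values(370, 730)   # Math participants' prior SAT range
psat_scale = scale_values(160, 760)   # full PSAT/NMSQT and PSAT 10 section scale

print(len(psat_sample))  # 13
print(len(sat_sample))   # 37
print(len(psat_scale))   # 61
```

Under that assumption, the participants' prior PSAT/NMSQT / PSAT 10 Math scores cover 13 of the 61 possible values on the 160 to 760 scale, while their prior SAT Math scores cover 37 values.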
Finally, table 10 and table 11 show the distributions of Reading and Writing and
Math cognitive interview participants, respectively, by the performance score
bands (PSBs) their SAT or PSAT/NMSQT / PSAT 10 scores fall into. Note that
the score ranges for the bands used below are the same as those used for the
questions themselves as proxies of question difficulty.
For Reading and Writing, the most populous performance score band was
680–800, at 30.8 percent of participants, followed by 420–480 at 26.9 percent,
610–670 at 15.4 percent, and 490–540 also at 15.4 percent. Approximately
90 percent of all Reading and Writing participants fell into one of these bands. For
Math, 26.1 percent of participants fell into score band 470–540, followed by
420–460 at 21.7 percent, 680–800 at 17.4 percent, and 550–600 at 13.0 percent.
About 80 percent of Math participants fell into one of these bands. (National
percentages aren’t available for comparison on these dimensions.)
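The band percentages above can be converted back into participant counts as a consistency check. The counts produced here are inferred from the reported percentages and the sample sizes of twenty-six and twenty-three; they aren't stated in this passage, and the helper name is illustrative.

```python
def counts_from_percentages(percentages, n):
    """Infer whole-number participant counts from rounded percentages."""
    return [round(p * n / 100) for p in percentages]


# Percentages for the four most populous bands, as reported in the text.
rw_pcts = [30.8, 26.9, 15.4, 15.4]    # 680-800, 420-480, 610-670, 490-540
math_pcts = [26.1, 21.7, 17.4, 13.0]  # 470-540, 420-460, 680-800, 550-600

rw_counts = counts_from_percentages(rw_pcts, 26)
math_counts = counts_from_percentages(math_pcts, 23)

rw_share = round(100 * sum(rw_counts) / 26, 1)
math_share = round(100 * sum(math_counts) / 23, 1)

print(rw_counts, rw_share)      # [8, 7, 4, 4] 88.5
print(math_counts, math_share)  # [6, 5, 4, 3] 78.3
```

The inferred totals of 23 of 26 Reading and Writing participants and 18 of 23 Math participants agree with the rounded figures of approximately 90 percent and about 80 percent given in the text.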
In short, the Reading and Writing and Math samples in this cognitive interview
study were largely like their respective national populations on several
demographic variables, including gender, race/ethnicity, and self-reported first
language(s), with the observed variance largely attributable to the small sample
sizes and the voluntary nature of participating in this activity. In terms of prior SAT
Suite test achievement, the Reading and Writing sample was somewhat higher achieving
than the national population, while the Math sample was comparable or slightly lower achieving.
Given the recruitment constraints of the study, the College Board researchers
deemed the samples sufficiently comparable to their national counterparts to
make them appropriate for analysis.
Cognitive Interviews
Once recruitment for the study was completed, College Board’s vendor, Vidlet,
began setting up interviews with each participant. After recruited students had
confirmed their participation in the study, the Vidlet team collected consent
forms and scheduled students for ninety-minute sessions. Sessions were
conducted one-on-one between a Vidlet interviewer and a student; no College
Board personnel were present except on one occasion, when the study lead sat in
(with his camera off and with the participant’s knowledge) for interviewer training
purposes.
During each session with participants, the Vidlet interviewer executed the protocol
that College Board had previously developed and that the vendor had been trained
on. In brief, each interview consisted of the following elements:
• Welcome
• A briefing on the study, participants’ role in it, and participants’ right to decline
  to answer questions and/or stop participating at any time
• Modeling by the interviewer, who, reading from a script, demonstrated the
  think-aloud process on a sample question (not analyzed in the study)
• One or (at the interviewer’s discretion) two chances on the participant’s part to
  practice thinking aloud on sample questions (not analyzed in the study)
• A thinking-aloud period of approximately seventy minutes, during which the
  participant worked through as many as twenty Reading and Writing or Math
  questions
• A set of debriefing questions
• Wrap-up
Interview sessions were conducted remotely via Zoom and were videorecorded
with the participants’ awareness and consent. Participants were asked to share
their device screen with the interviewer so that the latter could follow along as a
given participant worked through the Reading and Writing or Math questions.
Vidlet conducted the interviews from April 12 to April 30, 2023. Vidlet
subsequently produced verbatim transcripts of each interview and submitted
them, along with the associated video recordings and certain ancillary data, to
College Board. These ancillary data included reports of any irregularities during
interviews (e.g., late arrival, connectivity issues) and records of participants’ start
and stop times on each question. These ancillary materials aren’t analyzed in this
report, but no major irregularities were reported.
ANALYSIS
The College Board researchers analyzed the coded data both qualitatively and
quantitatively. Qualitative analysis consisted of assigning behavior codes to
participants’ transcribed responses and identifying and selecting illustrative
examples (vignettes) of participants’ cognitively complex thinking in accordance
with the behaviors defined for each question type.
The most important aspect of the quantitative analysis, which builds on the
preceding qualitative behavior coding, is a derived statistic developed for this
and similar studies (College Board and HumRRO 2020) called the differential. This
statistic is determined for each studied test question using the formula
D = C – A, where D is the differential, C is the number of participants answering
a given question correctly, and A is the number of participants who both
(1) answered correctly and (2) demonstrated all required behaviors (Reading and
Writing) or at least one expected behavior (Math).
This statistic, D, calculated for each test question in the study, is posited to
express the extent to which a given question functions as intended. C in the
above expression simply represents the count of the number of participants who
answered a given question correctly, while A represents the number of correctly
answering participants who, by exhibiting all required behaviors (for Reading and
Writing section questions) or at least one expected behavior (for Math section
questions), have enacted the question type’s construct.
Note that because of differences in the disciplines of literacy and math, the lists
of required Reading and Writing behaviors include providing the best answer,
while the lists of expected Math behaviors don’t. This makes no difference in how
the differential was calculated between test sections, as the Math calculation still
considers whether participants answered a given question correctly.
Two Math questions had differentials of greater than 5, while an additional Math
question with a technical differential of 0 had no participants answering correctly.
These three questions are analyzed in detail in Section 5: Discussion. No Reading
and Writing questions had a differential above 5 or were otherwise nonconforming.
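The differential calculation described above can be illustrated with a minimal sketch. The counts for questions 1 through 4 are the values of C and A reported in tables 12 and 13 of this report. The class name, field names, and the `REVIEW_THRESHOLD` constant are hypothetical and are not part of the study's actual tooling. The `nonconforming` check is only one reading of the text: it assumes a question warrants closer analysis when its differential exceeds 5 or when no participant answered it correctly.

```python
from dataclasses import dataclass

# Assumption: a differential above 5 marks a question for detailed analysis,
# following the report's description of the flagged Math questions.
REVIEW_THRESHOLD = 5


@dataclass(frozen=True)
class QuestionResult:
    """Per-question counts (hypothetical structure for illustration)."""
    question: int
    answered_correctly: int       # C: participants answering correctly
    correct_with_behaviors: int   # A: correct AND demonstrated the behaviors

    def __post_init__(self) -> None:
        # A counts a subset of the participants counted by C.
        if not 0 <= self.correct_with_behaviors <= self.answered_correctly:
            raise ValueError("A must satisfy 0 <= A <= C")

    @property
    def differential(self) -> int:
        # D = C - A
        return self.answered_correctly - self.correct_with_behaviors

    @property
    def nonconforming(self) -> bool:
        # D = 0 is uninformative when nobody answered correctly (C = 0),
        # so that case is flagged alongside large differentials.
        return self.differential > REVIEW_THRESHOLD or self.answered_correctly == 0


# C and A as reported for Reading and Writing questions 1-4.
results = [
    QuestionResult(question=1, answered_correctly=7, correct_with_behaviors=6),
    QuestionResult(question=2, answered_correctly=18, correct_with_behaviors=17),
    QuestionResult(question=3, answered_correctly=25, correct_with_behaviors=25),
    QuestionResult(question=4, answered_correctly=24, correct_with_behaviors=24),
]

differentials = {r.question: r.differential for r in results}
flagged = [r.question for r in results if r.nonconforming]

print(differentials)
print(flagged)
```

Because A can never exceed C, the differential is always nonnegative. A value near 0 means that nearly everyone who answered correctly also showed the intended thinking.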
Table 12. Student Performance on Reading and Writing: Craft and Structure—
Words in Context Questions.
Question | Subject Area | PSB | Demonstrated Required Behavior 1 | Demonstrated Required Behavior 2 | Demonstrated Both Required Behaviors | Answered Correctly | Demonstrated Both Behaviors and Answered Correctly | Differential
1 (n = 25) | Science | 5 | 16 (64%) | 7 (28%) | 6 (24%) | 7 (28%) | 6 (24%) | 1
2 (n = 25) | History/social studies | 7 | 19 (76%) | 18 (72%) | 17 (68%) | 18 (72%) | 17 (68%) | 1
For this and all tables in the Results section, color gradations indicate percentage quartiles, with darker
shades denoting higher percentages. Purple columns highlight data used to calculate differentials.
Table 12 indicates that both Words in Context questions included in the study
performed as expected, with differentials of 1. Although relatively few students
answered question 1 correctly, those who did almost always demonstrated
both behaviors, while a majority of students at least demonstrated some level
Question 1
Question 1, a medium-difficulty (PSB 5) question set in a science context, asks
students to best complete the text (i.e., fill in the blank) with the most logical and
precise word or phrase.
Some foraging models predict that the distance bees travel when
foraging will decline as floral density increases, but biologists Shalene
Jha and Claire Kremen showed that bees’ behavior is inconsistent with
this prediction if flowers in dense patches are ______: bees will forage
beyond patches of low species richness to acquire multiple resource
types.
Which choice completes the text with the most logical and precise word
or phrase?
A) immature
B) homogeneous
C) depleted
D) dispersed
To answer this question correctly, students must determine from the passage that
bees will extend their foraging range beyond nearby dense flower patches if these
patches have “low species richness” and therefore don’t offer access to “multiple
resource types.”
Choice B, homogeneous, is the best answer, as it clearly indicates that when the
flowers in nearby dense patches are highly similar, bees will range beyond them
to look for greater resource variety. Answering this question correctly requires
more than prior knowledge of the meaning of the words immature, homogeneous,
depleted, and dispersed because any of these options could be meaningfully “read
into” the passage, while only homogeneous supplies the word that logically and
precisely completes the thought expressed in the passage.
Methodology Notes: Vignettes
All vignettes in this report are as close to verbatim representations of
students’ transcribed responses as possible. Omissions (typically made to
reduce repetition of passage and/or question content) are noted by ellipses
(“. . .”). Text in [brackets] has been inserted for clarity, most commonly to
unambiguously identify answer choices. Material in “quotation marks” (and
possibly including bracketed text) represents verbatim quotations from a given
test passage or question. Note that answer choice letters (A–D) are provided
with the multiple-choice questions to increase clarity about which choice is
being referred to; because of technical limitations, the Qualtrics survey didn’t
include such letters, although many students supplied them on their own. Note
that the genderless third person pronoun “they” is consistently used in this
section to refer to individual students.
Student RW17 begins their successful approach to question 1 by paraphrasing
the topic of the passage, thereby indicating some conceptual understanding of
the task, and restating the researchers’ claim.
So this text talks about bees and the distance they travel when foraging.
And it’s asking me to fill in the blank. So the sentence that it wants me
to fill in, or the phrase, is the biologists “show[ed] that bees’ behavior is
inconsistent with this prediction if flowers in dense patches are,” blank.
And it says, “bees will forage beyond patches of low species richness.”
It’s worth noting that student RW17’s approach initially rules in depleted and
dispersed as possible correct answer choices. By design, Words in Context
questions pose incorrect answer choices (distractors) that are at least surface
plausible in terms of meaning and that can be “read into” the context without
awkwardness. To answer such questions correctly, then, students must use
context clues and may use other techniques, such as calls to prior knowledge
and division of words into meaningful, more recognizable parts (base words and
affixes), to determine the best answer.
With a clearer sense of the meaning of the passage, student RW17 uses the
context to select homogeneous, a basic definition of which the student has
recalled or perhaps inferred, and concurrently to rule out depleted and dispersed.
Which choice completes the text with the most logical and precise word
or phrase?
A) commonalities with
B) animosities toward
C) refinements of
D) precursors of
Student RW22 begins their analysis by providing a text-based rationale for the
best answer.
Animosities toward, that’s more of an emotion. I don’t think you can tell
through architecture so much.
Precursors of ? Does that mean the Tikal—or, no, that would mean this
Mexican architecture would have—oh, no, no, no. The Tikal architecture
would have been made before the Mexican architecture, which means—in
portions of Tikal. Wait, Tikal architecture would be before the new thing.
So I don’t think—that that would mean that the Mexican architecture
wouldn’t influence Tikal. I don’t think that makes sense.
Student RW22 ultimately chooses the best answer mainly on the basis of the
passage’s reference to the later Tikal architecture including an almost exact copy
of a Teotihuacan temple.
Although student RW22 never fully rules out refinements of as an option, their
rationale for affirmatively selecting commonalities with exhibits a strong sense of
the passage’s content and the meaning of the phrase itself.
Part-whole relationship
Main purpose
Both question types are designed to elicit cognitively complex thinking from
students. For questions about part-whole relationships, students are expected
to develop a clear sense of the overall message and structure of an appropriately
challenging passage in one of several subject areas in order to determine the main
rhetorical role that a particular, substantive part of the passage (e.g., a clause,
a sentence; designated by underlining) plays in the passage as a whole. For
questions about main purpose, students are expected to use an understanding
of the content of a given passage to ascertain its primary aim and to distinguish
Table 13 summarizes how students performed on the two Text Structure and
Purpose questions included in the study.
Table 13. Student Performance on Reading and Writing: Craft and Structure—
Text Structure and Purpose Questions.
Question | Question Type | Subject Area | PSB | Demonstrated Required Behavior 1 | Demonstrated Required Behavior 2 | Demonstrated Both Required Behaviors | Answered Correctly | Demonstrated Both Behaviors and Answered Correctly | Differential
3 (n = 26) | Main purpose | Literature | 3 | 26 (100%) | 25 (96%) | 25 (96%) | 25 (96%) | 25 (96%) | 0
4 (n = 26) | Part-whole relationship | Science | 4 | 25 (96%) | 24 (92%) | 24 (92%) | 24 (92%) | 24 (92%) | 0
Table 13 indicates that both Text Structure and Purpose questions included in the
study performed as expected, with differentials of 0. Each student who answered
one or both of the questions correctly demonstrated both required behaviors.
Question 3
Question 3, an easy (PSB 3) question set in a literature context, asks students to
read a passage and then determine the passage’s main purpose.
The following text is from Holly Goldberg Sloan’s 2017 novel Short.
More than two years ago my parents bought a piano from some
people who were moving to Utah. Mom and Dad gave it to my
brothers and me for Christmas. I had to act really happy because it
was such a big present, but I pretty much hated the thing from the
second it was carried into the hallway upstairs, which is right next to
my bedroom. The piano glared at me. It was like a songbird in a cage.
It wanted to be set free.
©2017 by Holly Goldberg Sloan
In working through the question, students are expected to realize that the
passage’s main purpose is to establish how the narrator feels about the piano
(choice A), as every element included in the passage furthers this writerly
aim. While a partial reason for why the parents bought the piano is included in
Student RW1 follows closely the analysis path described above, and their
response is representative of many others’. They demonstrate their solid grasp
of the passage by immediately identifying the best answer based on their own
interpretation of the text, and then they provide passage-based rationales for
ruling out each of the distractors.
So I think immediately, I already know that the first option is the best
answer because I feel like as I was reading [the passage], I saw that, um,
it doesn’t really explain—I don’t think the purpose of the test—text was
to explain why the narrator’s parents bought the piano because it’s just
stating that—how they got the piano, but not, like, like, the actual events
that led up to it. That was only in, like, the first sentence, and there’s—the
rest of the passage is there, and it does not talk about that. Um, she—and
then this third option, I’d say it’s not that because she clearly says how
she “hated the thing from the second it was carried into the hallway,”
which is right next to her bedroom. She did not want it to be close to her
bedroom because she literally hates the piano. And it’s not the last option
because nowhere does it say that her brothers are talented piano players.
It just says that it was gifted to her and her brother[s] for Christmas, but
it doesn’t suggest anything about them being piano players. So I would
say [choice] A because it just talks about how she was not happy about
getting the piano for Christmas and how she felt like it was “a songbird in
a cage” and “it wanted to be set free.”
Question 4
Question 4, a medium-difficulty (PSB 4) question set in a science context, asks
students to determine the main function of the underlined portion in relation to
the passage as a whole. To answer this question successfully, students need to
understand both the substance of the whole passage and the specific role that the
underlined portion plays in the passage.
Part of the Atacama Desert in Peru has surprisingly rich plant life despite receiving
almost no rainfall. Moisture from winter fog sustains plants once they’re growing, but
the soil’s tough crust makes it hard for seeds to germinate in the first place. Local
birds that dig nests in the ground seem to be of help: they churn the soil, exposing
buried seeds to moisture and nutrients. Indeed, in 2016 Cristina Rengifo Faiffer found
that mounds of soil dug up by birds were far more fertile and supported more
seedlings than soil in undisturbed areas.
Which choice best describes the function of the underlined portion in the text as a
whole?
A) It identifies the reason particular bird species dig nests in Atacama Desert soil.
B) It explains how certain birds promote seed germination in Atacama Desert soil.
C) It describes the process by which seeds are deposited into Atacama Desert soil.
D) It elaborates on the idea that the top layer of Atacama Desert soil forms a tough
crust.
Choice B is the best answer, as the underlined portion notes how nest-digging
birds “churn the soil, exposing buried seeds to moisture and nutrients.” The
passage doesn’t explain why some bird species dig nests in the soil, so choice A
is incorrect; similarly, the passage doesn’t describe how seeds are (initially)
deposited into the desert soil, so choice C can be ruled out. While the fact that the
Atacama Desert soil has a tough crust is mentioned in the passage, the underlined
portion doesn’t build on that idea specifically, which makes choice D attractive but
incorrect.
Student RW11’s analysis focuses on identifying the best answer (without direct
consideration of the alternatives). In the process, the student demonstrates
comprehension of the passage as a whole as well as a clearheaded sense of how
the underlined portion contributes to the passage’s message.
Well, the under[lined] portion reads, “they [churn] the soil, exposing
buried seeds to moisture and nutrients,” “they” referring to local birds in
the desert. And it says that the nests that they dig “in the ground seem
to be of help” because “the soil’s tough crust makes it hard for seeds to
germinate,” but the nests that they dig seem to be of help. And then it
explains how they dig the nests in the underlined portion, that they churn
the soil exposing buried seeds to moisture and nutrients.
Table 14. Student Performance on Reading and Writing: Craft and Structure—
Cross-Text Connections Questions.
Question 5
Question 5, a hard (PSB 6) question set in a humanities context, requires students
to read and understand two differing but overlapping perspectives on author
Virginia Woolf’s book Orlando. The first passage stresses how much of an outlier
Orlando is within Woolf’s oeuvre, while the second passage argues for the novel’s
importance despite its unusual characteristics.
Text 2
Like Woolf’s other great novels, Orlando portrays how people’s
memories inform their experience of the present. Like those works, it
examines how people navigate social interactions shaped by gender
and social class. Though it is lighter in tone—more entertaining, even—
this literary “joke” nonetheless engages seriously with the themes that
motivated the four or five other novels by Woolf that have achieved the
status of literary classics.
Based on the texts, how would the author of Text 2 most likely respond
to the assessment of Orlando presented in Text 1?
Besides having to read and understand the two passages, students must make
a two-part inference to select the best answer, choice D. The author of Text 2
shares with the author of Text 1 the view that Orlando is unusual among Woolf’s
works but, in contrast to the author of Text 1, contends that the novel is still
important because, despite having a lighter tone, it covers the same general
thematic content as Woolf’s other significant fiction. To reach this understanding,
students need a clear sense of both passages individually as well as where the two
texts agree and differ in perspective. Choice A is incorrect because Text 2 never
concurs that Orlando’s reputation has kept potential readers away. Choices B and
C are incorrect because the author of Text 2 regards Orlando as among Woolf’s
“great novels.”
Student RW13 begins working the question by summarizing in their own words
the gist of Texts 1 and 2, in the process noting the differences in perspective
represented by each passage.
After reading choice A and deciding to come back to it later, student RW13 then
rules out choices B and C, the question’s other distractors.
“By agreeing that Orlando is less impressive than certain other novels
[by Woolf].” [The authors of Texts 1 and 2] definitely do not agree. [The
author of Text 2 is] saying that Orlando is just as good even though
it’s “lighter in tone.” So it’s not [choice] B. “By conceding that Woolf’s
talents were best suited to serious novels but asserting that the humor
in Orlando is often effective.” I think [the author of Text 2] also would not
agree that [Woolf’s] talents were best suited to serious novels because
what they’re trying to say is that Orlando, even though it’s not—even
though it’s lighter, it’s just as good. So they’re not saying that she had a
worse sense of style or a worse writing style in this book.
Student RW13 next offers a text-based rationale for the best answer, observing
not only that the authors of Texts 1 and 2 agree that Orlando is an outlier among
Woolf’s fiction but also that they disagree about how seriously to treat the work.
The student concludes by returning to and ruling out choice A on the grounds that
the author of Text 2 wouldn’t consider Orlando a minor work.
Text 1
Films and television shows commonly include a long list of credits
naming the people involved in a production. Credit sequences may not
be exciting, but they generally ensure that everyone’s contributions are
duly acknowledged. Because they are highly standardized, film and
television credits are also valuable to anyone researching the careers of
pioneering cast and crew members who have worked in the mediums.
Text 2
Video game scholars face a major challenge in the industry’s failure to
consistently credit the artists, designers, and other contributors
involved in making video games. Without a reliable record of which
people worked on which games, questions about the medium’s
development can be difficult to answer, and the accomplishments of all
but its best-known innovators can be difficult to trace.
Based on the texts, how would the author of Text 1 most likely respond
to the discussion in Text 2?
The gist of Text 1 is that the film and television industries consistently and
thoroughly document contributors, while the gist of Text 2 is that the video game
industry doesn’t. The key point of interaction between the passages concerns
the value of credits as a research tool. Text 1 notes that the practice of having
transparent and complete credits allows researchers to more easily study “the
careers of pioneering cast and crew members” who’ve worked in films and
television, while Text 2 observes that the comparative lack of crediting in video
games can leave “questions about the medium’s development . . . difficult to
answer” and makes “the accomplishments of all but its best-known innovators . . .
difficult to trace.”
Choice C is the best answer. The “widespread practice” choice C refers to is the
prevalence of crediting in the film and television industries, and the “problem faced
After reading the two passages and before analyzing the answer choices, student
RW4 offers their own tentative conclusion about the relationship between Text 1
and Text 2.
I think that author of Text 1 would probably say something along the
lines of standardization could help or something along the lines—along
the lines of—I don’t know, because they agree that standardization and
having this credit sequence is a good thing. So they would probably agree
with that.
Student RW4 then uses text-based reasoning to block two of the distractors.
To rule out choice A, the student cites the basic agreement between the two
passages on the value of thorough credits to researchers of both the film and
television and the video game industries. They also recall the basic contrast: the
former have a longstanding practice of crediting all contributors, while the latter
doesn’t.
The student uses similar reasoning to rule out a second distractor, choice B,
observing that despite differences in the extent to which credits are available in
the film and television and the video game industries, the role and value of credits
are essentially the same.
Choice B, “By pointing out that credits have a different intended purpose
in film and television than in the medium addressed by the scholars
mentioned in Text 2.” Definitely not. Both say that it’s to find the people
who worked on it who were great. Just Text 1 says, “. . . ensure that
everyone’s contributions are duly acknowledged.” Text 2 says, “Without
a reliable record of which people worked on which games, questions
about the medium’s development can be difficult to answer, and [the
In the process, student RW4 draws a subtle but necessary distinction between this
answer choice and a more accurate statement about the relationship between the
texts: the issue isn’t that video game scholars should make more use of credits in
their research but rather that they can’t because these credits don’t exist to the
extent that they do in film and television.
To answer Central Ideas and Details questions as intended, students are expected
to demonstrate the following behaviors:
Table 15 summarizes how students performed on the two Central Ideas and
Details questions included in the study.
Table 15 indicates that both Central Ideas and Details questions included in the
study performed as expected, with differentials of 0 and 1.
Question 7
For question 7, a medium-difficulty (PSB 4) question set in a humanities context,
students need to ascertain that the passage’s main idea is that Richard Hunt
In many of his sculptures, artist Richard Hunt uses broad forms rather
than extreme accuracy to hint at specific people or ideas. In his first
major work, Arachne (1956), Hunt constructed the mythical character
Arachne, a weaver who was changed into a spider, by welding bits of
steel together into something that, although vaguely human, is strange
and machine-like. And his large bronze sculpture The Light of Truth
(2021) commemorates activist and journalist Ida B. Wells using mainly
flowing, curved pieces of metal that create stylized flame.
Which choice best states the text’s main idea about Hunt?
In ruling out the distractors, students should recognize that choice A represents,
at best, a subordinate, rather than main, point made by the passage; that choice B
is factually incorrect per the passage; and that choice D is unsupported by the
passage.
Student RW14 offers some interpretive commentary on Hunt and his works, as
depicted in the passage, while recounting the passage’s key elements.
So to me, I think they’re just calling him creative and good at his work, so
I’m looking for answer choices that kind of fit that.
As it turns out, this encapsulation doesn’t directly embody the question’s narrower
and more precise best answer. Nonetheless, the student has demonstrated strong
comprehension of the passage and called attention to the fact that Hunt created
art representing both historical and mythical figures, a point that will subsequently
rule out one of the question’s distractors.
Student RW14 then correctly determines that choice A, one of the distractors,
represents a subordinate rather than the main idea of the passage.
Student RW14’s prior summation of the passage content enables them to easily
block another distractor, choice B.
[Choice B,] “He tends to base his art on important historical figures
[rather than on fictional characters].” Okay, I’m just going to go ahead
and cut that there. Yeah. He made Ida B. Wells, but he also made Spider-
Arachne Man. So, no, I’m going to go ahead and cross that out.
[Choice C,] “He often depicts the subjects of his sculpture[s] using an
unrealistic style.” That kind of works because it kind of calls to—he
doesn’t use accuracy. He uses “broad forms,” so he tries to be more
creative. That kind of calls “creative” to me. So I’m going to put a little
dash next to that to the side, and I’m going to move on to the third—or
final [answer choice].
The passage doesn’t support choice D’s assertion that Hunt has “altered his
approach to sculpture over time” or that “his works have become increasingly
abstract,” as both 1956’s Arachne and 2021’s The Light of Truth are essentially
equally abstract in style. As student RW14 considers this distractor, they make
a misstep in reasoning: they correctly reject the answer choice but do so on the
errant basis that Hunt’s works have grown less abstract over time.
[Choice D,] “He has altered his approach to sculpture over time, and his
works have become increasingly abstract.” Okay. What I’m thinking here
The student later clarifies that their main reason for ruling out this choice was
that they “don’t think [it] says enough about that here in the passage” to make
it the best answer. While the student’s rationale remains incomplete and imperfect,
they nonetheless evince conceptual awareness that part of the task posed in this
question is to differentiate the main idea from subordinate ideas and details.
Ultimately, the student reaffirms the best answer choice, C, tying their selection
back to their initial assessment of the passage’s message.
“He often depicts the subjects [of his sculptures using] an unrealistic
style.” I like that one because it kind of talks about creativity, which is
what I was saying before.
Question 8
Question 8, a medium-difficulty (PSB 4) question set in a literature context,
focuses on the passage’s key analogy.
The following text is from Ezra Pound’s 1909 poem “Hymn III,” based on
the work of Marcantonio Flaminio.
Based on the text, in what way is the human mind like a flower?
According to the passage, just as flowers need moisture in the forms of dew and
rain to survive, the human mind needs its own form of “food” to thrive. Selecting
the best answer, choice B, requires students to parse this analogy and its
component parts—a common activity in English language arts classes. Choices
A and C aren’t supported by the passage. Choice D draws its appeal from echoing
the speaker’s sense of possible calamity, but it doesn’t accurately describe the
speaker’s analogy.
So, basically, what the question is asking [is] how the human mind is like
a flower.
The student then rereads the opening lines of the passage and offers their own
summation of the analogy.
So, in the poem, “As a fragile and lovely flower unfolds its gleaming
foliage on the breast of the fostering earth, if the dew and the rain draw
it forth; / So doth my tender mind flourish, if it be fed with the sweet dew
of the fostering spirit.” For right there, when it says, “if it be fed [with]
the sweet dew of the fostering spirit,” I feel like it’s talking about how
the mind can flourish with nutrients, which kind of relates to . . . how the
human mind, given proper nutrients, . . . could thrive like a flower. From
right here, it says, “[if] [the] dew and the rain draw it forth; / So doth my
tender mind flourish,” which I feel like it’s trying to relate right there.
Student RW9 then offers text-based rationales for ruling out each of the three
distractors in the question, finding little evidentiary support for these choices.
But for answer choice A, “It becomes increasingly vigorous with the
passage of time.” I don’t really see how time can be related in this poem
because they didn’t really talk about time at all. So I wouldn’t feel like
that’s a right answer choice.
But [choice] C, it says, “It draws strength from changes in the weather,”
which I could see how this could be a possible correct answer choice
because it does talk about how “the dew and the rain draw it forth” twice,
. . . which would make sense about the weather part. But I just didn’t
feel like how the changes in the weather would relate to the human mind
because that didn’t make sense.
The student then provides a similarly text-based rationale for the question’s best
answer, choice B.
But I did feel like answer choice B made the most sense because [the
passage] does talk about how the flower’s flourishing, how it’s gleaming
on the fostering earth in the rain and the dew or the rain because it’s
getting the right nutrients. So as the mind, if it gets the right nutrients, it
can flourish like the plant is, or the flower.
10 (n = 25) | Science | 3 | 23 (92%) | 25 (100%) | 25 (100%) | 23 (92%) | 25 (100%) | 23 (92%) | 2
Question 9
Question 9 is a hard (PSB 6) question set in a literature context. This difficult
question doesn’t have a traditional test passage preceding the question; rather,
it embeds what would typically be passage content into the four answer choices.
The question lays out the interpretive claim to be supported by one of the four
answer choices. Each of the quotations in questions such as this is an accurate
representation of the original text, and such quotations are carefully selected to
ensure that students, irrespective of background knowledge (including whether
they’ve previously read the sampled work of literature), are able to make sense of
them.
Which quotation from “On Virtue” most effectively illustrates the claim?
After reading the question, student RW15 evaluates the various answer choices,
in the process demonstrating a grasp of the meaning and implication of each.
The only one is [choice] B. B is talking about time. It’s talking about
“guide my steps to endless life and bliss.” And it’s talking directly to the
quality of virtue and imploring it to assist her in reaching a future goal.
And that’s the only one that really talks about the future.
Question 10
Question 10 is an easy (PSB 3) question set in a science context. Unlike question 9,
question 10 has a traditional reading passage preceding the question and answer
choices.
Scientists have long believed that giraffes are mostly silent and
communicate only visually with one another. But biologist Angela Stöger
and her team analyzed hundreds of hours of recordings of giraffes in three
European zoos and found that giraffes make a very low-pitched humming
sound. The researchers claim that the giraffes use these sounds to
communicate when it’s not possible for them to signal one another visually.
Which finding, if true, would most directly support Stöger and her team’s
claim?
The student then exhibits text-based reasoning to assess the impact that each
presented finding, if true, would have on the claim. Although the student’s
exclusion of choice A, one of the incorrect answer choices, is imprecise, their
reasoning on the best answer (choice B) and the other two distractors (choices C
and D) is clear and explicit.
Students whose interviews offer evidence of carrying out these behaviors have
demonstrated cognitively complex thinking. To successfully answer Command
of Evidence: Quantitative questions involving tables and graphs, students must
read and interpret the passage, which provides crucial context for understanding
the included table or figure. Students must also understand the substance of the
associated informational graphic, including what the graphic as a whole represents
as well as the nature of its components (i.e., tabular data, bars, lines). They must
additionally have a clear grasp of the criterion established by the question,
which indicates what argumentative claim or informational point is meant to be
supported by data from the table or figure. Finally, students must synthesize
elements of the passage, informational graphic, and question to arrive at the best
answer among the provided choices.
Question 11
Question 11 is a medium-difficulty (PSB 4) table-based question set in a history/
social studies context. Students are presented with a table of four locations in the
Navajo Nation and those locations’ average high and low temperatures, in degrees
Fahrenheit, in July. The accompanying passage indicates that the large expanse of
The Navajo Nation has the largest land area of any tribal nation in the
United States: over 27,000 square miles in the Southwest. Because this
area is so huge and its communities are located at various elevations,
the people of the Navajo Nation can experience different climate
conditions depending on where they live. For example, in July, ______
Which choice most effectively uses data from the table to support the
claim?
A) the lowest temperature for both Cameron and Teec Nos Pos
was 65°.
B) the lowest temperature for both Ramah and Tuba City was 50°.
C) Tuba City’s average highest temperature was 94°, while Teec Nos
Pos’s was 93°.
D) Ramah’s average highest temperature was 83°, while Cameron’s
was 99°.
To answer correctly, students must understand that the question is asking for the
example from among the provided choices that best supports the passage’s claim
that “the people of the Navajo Nation can experience different climate conditions
depending on where they live.” This claim is best supported by an answer choice
that establishes, using data from the table, a wide divergence in average highest
and lowest temperatures in July.
Choice D best accomplishes this goal, as it accurately uses data from the
table to establish such a divergence: Ramah’s and Cameron’s average highest
temperatures in July are sixteen degrees different (83 degrees versus 99 degrees).
Choices A and B are incorrect on two grounds. First, each refers to the “lowest
temperature” rather than the average lowest temperature at two locations and
thus misrepresents the table. Second, the two temperatures cited in each choice
are identical and therefore don’t establish a striking dissimilarity, as sought by the question’s
criterion. Choice C incorrectly represents data from the table: Tuba City’s average
highest temperature in July was 83 degrees, not 94 degrees, and Teec Nos Pos’s
average highest temperature in July was 94 degrees, not 93 degrees.
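The contrast criterion described above can be made concrete with a short sketch. It uses only the temperature pairs quoted in the four answer choices, not the full table (which isn't reproduced here), and the names `cited_pairs` and `spread` are illustrative. It measures contrast only; whether a choice reports the table accurately is a separate check.

```python
# Temperature pairs (degrees Fahrenheit) exactly as cited in each answer choice.
# These are the choices' own claims, not verified table values.
cited_pairs = {
    "A": (65, 65),  # Cameron and Teec Nos Pos, "lowest temperature"
    "B": (50, 50),  # Ramah and Tuba City, "lowest temperature"
    "C": (94, 93),  # Tuba City vs. Teec Nos Pos, average highest
    "D": (83, 99),  # Ramah vs. Cameron, average highest
}

def spread(pair):
    """Absolute difference between the two cited temperatures."""
    first, second = pair
    return abs(first - second)

spreads = {choice: spread(pair) for choice, pair in cited_pairs.items()}
widest = max(spreads, key=spreads.get)

for choice, value in spreads.items():
    print(f"Choice {choice}: spread of {value} degrees")
print(f"Widest cited contrast: choice {widest}")
```

Under these inputs, choice D cites the widest gap (16 degrees), which matches the explanation above.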
After reading the question itself, student RW20 describes the question’s criterion
and what the nature of the best answer would be.
Student RW20 then processes the various answer choices, checking each for
both accuracy relative to what’s reported in the table and appropriateness for
supporting the passage’s claim—in the latter case, for the answer choice that
establishes the clearest and widest contrast.
So option A, “The lowest temperature for both Cameron and Teec Nos
Pos was 65[°].” And then looking at both of them, and that is correct,
actually. But it doesn’t establish a contrast, which is what we’re trying to
get. So I’m ruling out option A. Option B, “the lowest temperature for both
Ramah and Tuba City was 50°.” And although this is correct [per the
table], [it’s wrong for] the same reason as option A, because we want to—I
want to establish a contrast, and that’s what, like, the question is trying to
say because that’s why they have “for example”: they want to show that
people in different parts of the Navajo Nation can experience different
climate conditions because the area is so big and because it’s so different
in different places. So that leaves me with option[s] C and D. Option C,
“Tuba City’s average highest temperature was 94°, while Teec Nos Pos’s
was 93°.” And this actually is factually incorrect, so I can rule it out right
now. And that leaves me with option D. Let me check and make sure
that it is correct. “Ramah’s average highest temperature was 83°, while
Cameron’s was 99[°].” Compared to all the other answers, it establishes
the most contrast. And I’m just scanning through the table once again. It
looks like it’s also factually correct. So I’m going to click on option D.
It’s worth noting that student RW20 doesn’t observe that choices A and B,
about the “lowest temperature” in two locations, are incorrect also because they
Question 12
Question 12 is a hard (PSB 6) question set in a humanities context. In the table,
students are given information about four individuals: their years active in the film
industry and their known professional contributions. The passage accompanying
the table contextualizes the tabular data by noting that “counts of those four
figures’ output should be taken as bare minimums rather than totals” because “so
many films and associated records for this era have been lost.”
Which choice most effectively uses data from the table to complete the
example?
A) Lillian St. Cyr acted in far more than 66 films and Edwin Carewe
directed more than 58.
B) James Young Deer actually directed 33 films and acted in only 10.
C) Dark Cloud acted in significantly fewer films than did Lillian St. Cyr,
who is credited with 66 performances.
D) Edwin Carewe’s 47 credited acting roles include only films made
after 1934.
The “example” referred to in the passage’s last sentence, which students are
expected to complete via their answer selection, is intended to illustrate the
Like student RW20 for question 11, student RW16 starts their approach to
question 12 by reading through and summarizing the table.
“Credited Film Output of James Young Deer, Dark Cloud, Edwin Carewe,
and Lillian St. Cyr.” Um, so this, this, this chart is specifically talking
about the, uh, film and credits of certain individuals.
After reading through the table, passage, and question, the student then offers
their own encapsulation of the passage’s gist.
So the passage is saying that, um, the, uh, known, uh, credits of these,
uh, filmmakers are minimums. They—they’re—the known is just the
bare minimum for which they actually did. So, even though we know
that these—they’re accredited for, uh, so, so many films, they probably
produced, acted [in], or directed so many more. Um, so it’s asking, you
know, uh, essentially who—well, let me look at the answer choices first.
Student RW16 proceeds to work through each of the answer choices (in reverse
order), settling on the question’s best answer, choice A.
Um, so [choice] D, it says, “after 1934,” even though he was active up, up,
up until 1934. So I don’t think D’s correct just because it doesn’t support
the data. Um, [choice] C is comparing two of the filmmakers. Um, and
it’s saying that “Dark Cloud acted in significantly fewer films than did
Lillian St. Cyr, who [is] credited with 66 [performances].” Um, that is
true, but, um, we are—the, the passage is not talking about comparing
these, uh, these filmmakers. We’re—it’s talking about, um, how they were
probably in way more films than, uh, than is previously known. So I don’t
think [this] is correct either. Um, [choice B,] “James Young Deer actually
directed 33 films and acted in only 10.” Um, that’s true. That is directly
using the data. Um, so that is true, but also I don’t think that supports the
passage. Um, it’s just saying—it’s just stating facts. Um, but the passage
is not talking about facts. The passage is actually talking about—uh,
it’s hypothesizing what could be possible. And then [choice] A, “Lillian
St. Cyr acted in far more than 66 films and Edwin Carewe directed more
than 58.” So, um, it’s saying that the known number is much lower than
Note that the student’s rationale for ruling out choice B is somewhat opaque but
ultimately correct. Strictly speaking, choice B can’t be “true” per the table, as
the student claims, because the table itself lists higher numbers of film credits.
However, the student rightly observes that choice B can’t effectively complete
the passage’s example because it posits lower, not higher, outputs for Young Deer
than he’s commonly credited with.
Question 13
Question 13, the first of the study’s two Command of Evidence: Quantitative
questions incorporating a graph, is a medium-difficulty (PSB 5) question set in
a science context. To answer this question correctly, students must read and
understand the graph, noting that its bars refer to cantaloupe yields in three years
under two different conditions (the experimental condition, in which nitrogen
fertilizer was used, and a control condition, in which a fertilizer without nitrogen
was used); determine from the passage that the claim to be supported is that
“nitrogen fertilizer increases cantaloupe yield”; and then determine which answer
choice provides data from the graph that best support this claim.
[Figure: bar graph titled “Cantaloupe Yield.” Vertical axis: yield (pounds per acre), 0 to 45. Horizontal axis: year (2017, 2018, 2019). Two bars per year: control and nitrogen fertilizer.]
Which choice best describes data in the graph that support the researchers’
conclusion?
A) The yield for plants treated with the nitrogen fertilizer increased from 2017
to 2018.
B) In every year of the experiment, plants treated with the nitrogen fertilizer
had a greater yield than did plants treated with the control fertilizer.
C) The 2018 yield for plants treated with the control fertilizer was greater than
was the 2019 yield for plants treated with the nitrogen fertilizer.
D) In every year of the experiment, plants treated with the nitrogen fertilizer
had a yield of at least 30 pounds per acre.
Student RW17, like many others quoted so far, starts their successful approach to
answering by paraphrasing for themselves the content being presented.
So the question is asking, which best supports the data? And [in] the
data, we see the plants that were grown with the control fertilizer and
then the other ones were grown with the nitrogen. And we see that the
nitrogen fertilizer is greater than the control fertilizer.
The student then evaluates the answer choices, demonstrating in the process
the understanding that the best answer, as in their summary of the data, must
accurately capture the intended comparison between the experimental and
control conditions.
[Choice] A, “The yield for plants treated with the nitrogen fertilizer
increased from 2017 to 2018.” So we do see that’s true, but I’m going
to hold off on that one because I feel like the researcher’s conclusion is
comparing how the nitrogen fertilizer was better than the control fertilizer.
So I wouldn’t really compare it to itself. [Choice] B says, “In every year
of the experiment, plants treated with the nitrogen fertilizer had a greater
yield than did plants treated with the control fertilizer.” So this I’m going
to say is a possible answer because it’s true; we do see that the nitrogen
fertilizer had a greater yield than the control one. And the researchers,
that’s what they’re trying to test—that the nitrogen fertilizer did help the
cantaloupe production. So I’m going to say [this] might be the answer.
[Choice] C, “The 2018 yield for plants treated with the control fertilizer
was greater than was the 2019 yield for plants treated with the nitrogen
fertilizer.” So, again, I’m not going to say it’s this one just because I
feel like comparing them or comparing it to itself isn’t really what the
researchers were trying to do. And then [choice] D says, “In every year of
the experiment, plants treated with the nitrogen fertilizer had a yield of
at least 30 pounds per acre.” So this one I’m not going to say either just
because thirty pounds per acre, we don’t really have anything to compare
this number to, so we don’t really know what it means.
Although the student doesn’t recognize or at least point out that this last choice
is factually incorrect per the graph, they do, critically, understand that even if this
option were true, it wouldn’t supply evidence that would support the researchers’
conclusion.
So for [choice] B, “In every year of the experiment, plants treated with
the nitrogen fertilizer had a greater yield than did plants treated with the
control [fertilizer].” So I’m going to go with B as my answer because that’s
what the researchers were trying to find. And they concluded that the
nitrogen fertilizer increases cantaloupe yield, so I’m going to go with B.
Question 14
The last of the studied Command of Evidence: Quantitative questions is a medium-
difficulty (PSB 4) question with a line graph set in a history/social studies context.
The two lines represented in the graph identify the monthly hours of sunshine from
April to September in two locations in Alaska. To answer this question correctly,
students must determine from both the graph and passage that the two cities
“show a similar pattern in the monthly hours of sunshine from April to September.”
[Figure: line graph of monthly hours of sunshine. Vertical axis: 0 to 250. Horizontal axis: month (April, May, June, July, August, September). Two lines: Anchorage and Fairbanks.]
Which choice best describes data from the graph that support the
student’s conclusion?
Choice A is the best answer here because it captures the basic similarity in the
hours of sunshine experienced by the two cities over time: they increase from
April to June and then decrease from June to September. Choice B is factually
erroneous given the information in the graph. While choice C tries to represent a
trend across the studied time span, it does so with inaccurate information about
monthly hours of sunshine in the two cities. Choice D is incorrect because it’s
factually inaccurate—the hours of sunshine don’t “hold steady in June and July” in
the two cities but rather begin decreasing from June to July—and because it fails
to accurately represent the overall trend in the two cities from April to September.
Having correctly discerned the intended sort of answer, the student rules in the
best response and rules out the distractors using largely the same sort of rationale
provided above.
Inferences Questions
On the digital SAT Suite tests, Inferences questions assess students’ ability to
reach reasonable, text-supported conclusions based on what passages say
explicitly and strongly imply. Inferences questions include a blank, which students
must “fill in” with the most logical option among the provided answer choices.
Table 18 indicates that both Inferences questions included in the study performed
as expected, with differentials of 3 and 0.
Question 15
Question 15 is a difficult (PSB 6) Inferences question set in a science context.
Like other questions of the type, this question requires students to read and
understand an appropriately challenging passage and then determine which
choice among the answer options most logically follows from what’s been
presented in the text—in this case, teasing out what suggestion is implied by the
passage’s information about mosses growing in the desert.
To answer this question correctly, students must trace the line of reasoning
presented in the passage and determine which of the provided conclusions is
logically entailed, or strongly indicated, by that reasoning. Schematically, students
might break this passage down as follows:
1. Mosses can dry out in harsh, bright desert conditions but still need enough
sunlight for photosynthesis.
2. Scientists found a type of desert moss growing under quartz crystals.
3. The scientists wondered whether growing under quartz crystals benefited the
moss.
4. Shoot tissue, a measure of plant growth, was sixty-two percent longer for moss
growing under the quartz than for moss growing on the soil surface.
5. This finding suggests that . . .
Despite their apt summary of the passage, student RW11 momentarily gravitates
toward choice B, one of the question’s distractors. The appeal of choice B to the
student appears to be that it suggests the same sort of balancing act as that
introduced in the passage’s first sentence: desert mosses need some sunlight to
survive, but not too much.
As they work through the answer choices more fully, however, student RW11
recognizes their earlier mistake and rules out choice B as well as the other two
distractors.
Student RW11 then provides a text-supported rationale for the best answer.
But it does say that they “struggle in harsh desert conditions” just
because they require enough sunlight. Choice C, they grow under—
“[S. caninervis] growing under quartz crystals experience lower light
intensity and are thus able to retain more moisture.” [The passage]
does reference needing enough sunlight but not so much that they risk
drying out. So because it’s under the crystal, it can retain more moisture
because—but it still has enough for photosynthesis. But it can retain
more moisture because it’s a lower intensity, and they will be less likely to
dry out.
Question 16
Question 16 is a medium-difficulty (PSB 4) Inferences question set in a history/
social studies context. In this question, students must determine the logical
consequence that follows from the results of a study on interruptions in the
workplace.
The underlying structure of the passage can be broken down in a manner similar to
that for question 15’s text.
Choice D most logically completes the text because it acknowledges that whether
interruptions are good or bad for employees depends on the circumstances: while
some interruptions could have negative effects, others, specifically those involving
colleagues interrupting coworkers, could have positive social effects. Choice
A is unsupported by the passage and is therefore incorrect. Choice B is wrong
because it ignores the contingent nature of interruptions, which can be either
good or bad depending on the circumstances. Choice C is incorrect because it
erroneously posits that all interruptions are problematic.
After reading the passage, student RW27 provides a succinct summary of the
gist, albeit one that lacks some of the passage’s nuance.
Using that frame, student RW27 then evaluates the answer choices. Although
the student neglects the qualified nature of the researchers’ endorsement of
workplace interruptions when summarizing the passage, they clearly exhibit
textual comprehension as they work through the choices, observing that some
such interruptions were found to be beneficial. In their selection of the best
answer, choice D, student RW27 uses vocabulary knowledge and context clues to
ascertain that “offset” essentially means “take away.”
EXPRESSION OF IDEAS
Rhetorical Synthesis Questions
The digital SAT Suite’s Rhetorical Synthesis questions assess students’ ability to
combine information and ideas in ways aligned to specified writerly goals. Each
Rhetorical Synthesis question includes three elements:
Table 19 indicates that both Rhetorical Synthesis questions included in the study
performed as expected, with differentials of 1 and 2.
Question 17
Question 17, a medium-difficulty (PSB 4) question set in a science context, asks
students to selectively use the provided notes to establish an advantage of a new
type of platinum catalyst.
A) While still highly effective, the new platinum catalyst requires far less of the
rare and expensive metal than do other platinum catalysts.
B) Platinum is a rare and expensive metal that is used as a catalyst for
chemical reactions; however, platinum catalysts typically require a large
amount of platinum to be effective.
C) Researcher Jianbo Tang and his colleagues created a platinum catalyst
that combines platinum, a rare and expensive metal, with liquid gallium.
D) Like other platinum catalysts, the new platinum catalyst requires a
particular amount of the metal to be effective.
After observing that the topic in the question is “very cool,” student RW22
evaluates the answer choices. They first note that the best answer, choice A,
accurately reflects the notes and expresses an advantage of the new platinum
catalyst.
“While still highly effective, the new platinum catalyst requires far less
of the rare and expensive metal than do other platinum catalysts.” That’s
true. It does compare platinum catalysts usually requiring a large amount
versus the new one [that] only needed, like, one ten-thousandth, or .0001
percent. So yes, that would make sense based on the notes.
Student RW22 then rules out the other answer choices on the intended basis:
none of them is as successful as choice A in identifying an advantage of the new
catalyst.
Question 18
Question 18 is another medium-difficulty (PSB 5) Rhetorical Synthesis question,
this time set in a humanities context and with the goal of describing a particular
work of art in the exhibition “Labor of Love.”
Choice A is the best answer, as it most effectively uses relevant information from
the notes to describe “The Choreography of Labor,” a work by Isabel Toledo that
was shown as part of the “Labor of Love” exhibition and that “blended dress
designs from its creator . . . with images of laborers from Diego Rivera’s murals.”
Choice B is incorrect because although it mentions a specific work that was part
of the “Labor of Love” exhibition, it doesn’t describe “The Choreography of Labor”
in any detail but instead simply names a feature shared by many of the works in
the exhibition. Choices C and D are incorrect because they only offer information
about the “Labor of Love” exhibition in general.
Okay. First of all, [choices] C and D don’t really explain anything about
the exhibition. They’re just saying that there was one. So they’re kind of
the same thing. Don’t really say anything about it. And what [the student
is] trying to accomplish is to describe [Isabel Toledo’s] work, which
[choices C and D] don’t do. So we can rule those out.
The student next observes that choice B is closer to being viable than are choices
C and D but that B, too, doesn’t successfully meet the goal set forth in the
question.
They then settle on choice A, the best answer, as the most descriptive of an
individual work in the exhibition.
Transitions
The final Reading and Writing question type examined in this study focuses
on assessing students’ ability to skillfully use transition words and phrases to
enhance the logic and cohesion of texts.
Table 20 indicates that both Transitions questions included in the study performed
as expected, with differentials of 1.
Which choice completes the text with the most logical transition?
A) On the contrary,
B) Consequently,
C) Regardless,
D) For example,
Choice B is the best answer, as the blank in the passage should be filled in with the
cause-effect transition “consequently”: two results of expecting mail recipients
to pay the cost of postage were that fees were collected in a “slow and arduous”
way and that “heaps” of abandoned mail “piled up in post offices.” Choices A, C,
and D are incorrect because each fails to complete the passage with a logical
transition indicating a cause-effect relationship. “On the contrary” means “just
the opposite”; “regardless” means “despite everything”; and “for example” signals
exemplification.
I’ll read the sentence before and after so I’ll know what to put there. So
it says, “The cost of . . . postage was usually paid by the recipient of [a]
letter rather than [the] sender, [and] recipients were not always able or
willing to pay promptly.” Blank. Maybe, [choice A,] “On the contrary,
collecting this fee could be slow and arduous, and heaps of unpaid-for,
undeliverable mail piled up in post offices.” “On the contrary.” I don’t
think it fits. It doesn’t sound like the right transition. It’s not the right kind
of transition to use. It does not feel right. It’s not really, “on this hand,
it’s like this.” It’s not describing something like that. So the next choice
would be [choice B,] “consequently.” Some “recipients were not always
[able or] willing to pay promptly. Consequently, collecting this fee could
be slow and arduous, and heaps of unpaid-for, undeliverable mail piled up
in [post] office[s].” That could be correct because since they’re not always
Question 20
Question 20 is an easy (PSB 3) question also set in a history/social studies context.
To answer this question correctly, students need to recognize that an adversative
transition (i.e., one signaling opposition or contradiction) is needed in the blank.
It has long been thought that humans first crossed a land bridge into the
Americas approximately 13,000 years ago. ______ based on
radiocarbon dating of samples uncovered in Mexico, a research team
recently suggested that humans may have arrived more than 30,000
years ago—much earlier than previously thought.
Which choice completes the text with the most logical transition?
A) Similarly,
B) In conclusion,
C) As a result,
D) However,
Choice D is the best answer because “however” logically signals the simple
contrast set up between the two sentences in the passage: what was once
thought to be true is no longer believed to be so. Choices A, B, and C are incorrect
because each fails to complete the passage with a logical transition indicating an
adversative relationship. “Similarly” indicates likeness; “in conclusion” indicates a
summation; and “as a result” indicates consequence.
Somewhat like student RW23 for question 19, student RW6 provides a clear and
explicit basis for selecting the best answer, noting that a word or phrase signaling
contradiction is required, though they’re less precise about the reasoning for ruling
out choices A and B.
In addition, two key differences between the Reading and Writing and Math
analyses are worth noting here. First, the Math behaviors listed for each question
type are expected rather than required, meaning that students needed to
demonstrate only one of the behaviors to be considered to have enacted the
question’s intended construct. This is because, by design, Math questions tend
to have multiple solving pathways, only one of which a given student is expected
to pursue. Second, answering correctly isn’t one of the listed behaviors, as it is
for Reading and Writing. However, the differential is calculated in a similar way and
represents the difference between the number of students answering a given
question correctly and the number of students who both answered correctly and
exhibited at least one of the expected behaviors.
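As a rough illustration of the differential described above, the following sketch computes it from two sets of student IDs. The function name and the sample data are hypothetical and are not taken from the study.

```python
def differential(answered_correctly, showed_expected_behavior):
    """Count students who answered correctly without showing any expected behavior.

    Both arguments are sets of student IDs.
    """
    correct_with_behavior = answered_correctly & showed_expected_behavior
    return len(answered_correctly) - len(correct_with_behavior)

# Hypothetical data for illustration only; not taken from the study.
correct = {"M1", "M2", "M3", "M4", "M5"}
behavior = {"M1", "M2", "M4", "M6"}
print(differential(correct, behavior))  # 2
```

A differential of 0 would mean that every student who answered correctly also exhibited at least one expected behavior.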
ALGEBRA
Questions in the Algebra content domain of the digital SAT Suite tests align most
closely with topics covered in a typical rigorous first-year secondary algebra
course, including assessing the skills and knowledge associated with working
with linear expressions, linear equations in one and two variables, linear functions,
systems of linear equations, and linear inequalities. Test questions cover such
skills and knowledge as creating and using a linear equation; identifying an
expression or equation that represents a situation; interpreting parts of a linear
equation in context; making connections between linear equations, graphs, tables,
and contexts; determining the number of solutions and the conditions that lead
to different numbers of solutions; and calculating and solving. The test questions
aligned to algebra skill/knowledge elements range in difficulty from relatively
easy to relatively complex and challenging. The test questions require students to
demonstrate skill in generalization, abstraction, and symbolization, with a strong
emphasis on equivalence and using structure. Many of the test questions are
constructed to allow for more than one solving strategy.
Five Algebra questions were included in this cognitive interview study: one
Linear Functions: Interpret question, two Linear Functions/Inequalities in One
Variable: Create and Use questions, one Linear Equations in Two Variables: Make
Connections question, and one Linear Systems: Determine Conditions question.
MC = multiple-choice
Question 1
Question 1 is a relatively easy (PSB 3) multiple-choice question set in a science
context. The question asks students to read and understand the context and to
identify the best interpretation of the number 16 in the equation in terms of that
context.
Choice C is the correct answer. To answer this question correctly, students are
expected to determine from the passage that the variable d on the left-hand
side of the equation represents distance, in inches, and that the variable t on the
right-hand side of the equation represents time, in seconds. Then students must
understand that to compute the distance in inches, the time in seconds must be
multiplied by a rate in inches per second. Therefore, 16 is the rate at which the
object is moving in inches per second.
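The interpretation of 16 as a rate can be checked numerically. This is a minimal sketch that assumes the equation has the form d = 16t, which the explanation above implies but which isn't reproduced here; the function and constant names are illustrative.

```python
RATE_INCHES_PER_SECOND = 16  # the coefficient being interpreted

def distance_inches(t_seconds):
    """Distance traveled after t_seconds, assuming d = 16t."""
    return RATE_INCHES_PER_SECOND * t_seconds

# Each additional second adds the same 16 inches, which is what "rate" means here.
distances = [distance_inches(t) for t in range(4)]
increments = [later - earlier for earlier, later in zip(distances, distances[1:])]
print(distances)   # [0, 16, 32, 48]
print(increments)  # [16, 16, 16]
```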
Student M1 seems to want an initial value for time, which isn’t relevant for this
context but doesn’t impede identifying the correct answer.
Student M23 also shows some comprehension of the context by identifying that
the variable t represents time, though they don’t provide a clear explanation for
the meaning of the variable d. However, that understanding may be implied by the
student’s first statement.
Um, since it’s distance, distance is always positive. And, um, it’s telling
us it’s moving [in] seconds, and t represent[s] a number of seconds. So, so
it’s—the object is moving at a rate of—no. Yeah. The object is moving at a
rate of 16 inches per second.
Student M23 does correctly identify the best interpretation of the number 16 in
the context but doesn’t provide clear verbal detail about how they came to that
interpretation.
Question 4
Question 4 is a medium-difficulty (PSB 4) multiple-choice question set in a science
context. This question asks students to determine an unknown value based on a
given context. This context represents a common linear pattern in which a starting
amount changes evenly over time.
A) 3
B) 6
C) 24
D) 44
Choice D is the correct answer. This context represents a common linear pattern
in which a starting amount (in this case, 17 ounces of wax) experiences a constant
rate of change (here, a decrease by 1 ounce every 4 hours), meaning the change
is the same amount for even increments of time. To answer this question correctly,
students can either write an equation in one variable that represents the situation
in the given context and solve for the number of hours the candle has been
burning or use algebraic reasoning to determine the number of hours the candle
has been burning.
In order to find this number of hours, students could write the equation
\(17 - \frac{1}{4}t = 6\), where 17 is the amount of wax, in ounces, the candle has before it
begins to burn; \(-\frac{1}{4}\) is the constant rate of change that represents that the amount
of wax in the candle decreases by 1 ounce every 4 hours; t is the amount of time
the candle has been burning, in hours; and 6 is the amount of wax, in ounces,
remaining in the candle. To solve this equation, students would likely subtract 17
from both sides of the equation, which gives \(-\frac{1}{4}t = -11\). Then students would
multiply both sides of this equation by -4, which results in t = 44. Alternatively,
students may choose to use algebraic reasoning that follows a pattern similar
to that involved in solving the equation, such as first observing that the candle
starts with 17 ounces of wax and decreases by 1 ounce every 4 hours and then
reasoning that 4 hours after the candle has been burning, 16 ounces of wax
remain; 8 hours after the candle has been burning, 15 ounces of wax remain; and
so on, until the number of ounces of wax remaining reaches 6 ounces.
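The two solution paths described above can be compared in a short sketch. This is only an illustration of the arithmetic, not part of the study; the function names are hypothetical, and exact fractions are used so that the rate of 1/4 ounce per hour is not rounded.

```python
from fractions import Fraction

START_OZ = Fraction(17)          # wax before the candle starts burning
RATE_OZ_PER_HR = Fraction(1, 4)  # 1 ounce lost every 4 hours
REMAINING_OZ = Fraction(6)       # wax left at the time asked about


def hours_by_equation():
    # Path 1: solve 17 - (1/4) t = 6 for t.
    return (START_OZ - REMAINING_OZ) / RATE_OZ_PER_HR


def hours_by_counting():
    # Path 2: step forward 4 hours at a time, removing 1 ounce per step.
    wax, hours = START_OZ, 0
    while wax > REMAINING_OZ:
        wax -= 1
        hours += 4
    return hours
```

Both functions return 44, which matches choice D.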
On my paper, I’m going to first note the total amount, 17 [ounces], from
the start. When the candle is burning, the amount of wax is subtracting
1 [ounce] every four 4 hours. So 1 every 4 hours, so that would be the
amount over the time. So \(-\frac{1}{4}t\) or x. I like using x. I believe that’s how we
begin, but I’ll check. \(17 - \frac{1}{4}t\) [x], and I want to say that equals 6.
Student M7 next solves the equation for the time variable, which they referred to
as both t and x. Student M7 first eliminates the fraction by multiplying both sides of
the equation by the denominator in the fraction (4) and then isolating the variable
to solve for its value.
And I’ll solve for x or what would be time in this example. So to simplify
and answer this, I’m going to multiply everything by 4. 17 × 4 is 68,
[68] - x = 24. Let me subtract 24 from both sides and then add x. So
68 - 24, that’s 44, and that equals x. So that says 44 hours.
However, let me make sure this is correct. So every hour, it’s losing
1 ounce. So it’s going to leave 11 ounces within the 4 hours. No, within
44 hours, it’ll lose 11 ounces. It starts with 17, ends with 6. If I lose 11
from the 17, that’s 6, so 44 hours would be the correct answer.
Student M21 follows the alternative path to solve the question. They begin by
writing down what was given in the question and trying to determine what the
context means algebraically.
So first I’ll write down all the values that it gives me. So 17 ounces of
wax initially. And when the candle’s burning, the amount of wax in the
candle decreases by 1 ounce every 4 hours. So I’ll also write that down.
Decreases 1 ounce every 4 hours. All right, so 6 ounces of wax remain
in this candle for how many hours? So I’ll write down the end amount,
6 ounces of wax at the end. So I’ll just set it equal to 6 for the ounces . . .
Student M21 seems unsure how to use the given rate or write an appropriate
equation from the information provided in the question, but they do understand
how to find the number of hours.
Question 5
Question 5 is a medium-difficulty (PSB 4) student-produced response question
set in a real-world context. This question asks students to find an unknown value
based on the context described. Question 5’s context exhibits a linear pattern
similar to that in the context in question 4: a starting amount changes at a constant
rate by an incremental amount over each unit of time. While question 4 represents
a linear equation in one variable, question 5 represents a linear inequality in one
variable.
The correct answer for this question is 16. To answer this question correctly,
students should either write an inequality in one variable that represents the
situation described, use algebraic reasoning, or use a guess-and-check method
to find the number of attendees. Then students need to determine whether
their decimal solution requires rounding in order to meet the requirements in the
question.
To find the greatest number of attendees possible without exceeding the budget,
students could write the inequality \(35 + 10.25a \le 200\), where 200 is the dollar
amount of the budget that the event planner can’t exceed; 35 is the cost of the
onetime fee, in dollars; 10.25 is the cost per attendee, in dollars; and a is the
number of attendees. Alternatively, students may choose to solve for the number
of attendees using algebraic reasoning that follows a pattern similar to writing
and solving an inequality but doesn’t use variables. Students may also choose to
use guess-and-check by trying different values for the number of attendees until
they narrow in on the largest number that keeps the total cost under $200. Once
students find an answer through one of these paths, they must decide whether
an answer that includes a decimal part should be rounded up or down so as not
to exceed the budget. Since the value of a represents the number of attendees,
it must be a whole number. Therefore, any decimal value has to be rounded in
the direction that keeps the cost under the budget—that is, rounded down to the
nearest whole number.
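As a sketch of the rounding decision described above, the inequality can be solved in whole cents so that no floating-point rounding is involved. The function names are hypothetical and the code only illustrates the reasoning; it is not drawn from the study.

```python
def max_attendees(budget_cents, fee_cents, per_attendee_cents):
    """Largest whole number a with fee + per_attendee * a <= budget."""
    # Floor division rounds down, which keeps the total within the budget.
    return (budget_cents - fee_cents) // per_attendee_cents


def total_cost_cents(attendees, fee_cents, per_attendee_cents):
    # One-time fee plus the per-attendee cost.
    return fee_cents + per_attendee_cents * attendees
```

With a $200 budget, a $35 fee, and $10.25 per attendee, `max_attendees(20000, 3500, 1025)` gives 16; 16 attendees cost $199.00, while 17 would cost $209.25.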
Student M5 then uses a guess-and-check method to get from the fixed cost (the
onetime fee) to the greatest number of attendees that the budget could support.
Student M5 doesn’t mention rounding their answer, but the description of their
thinking shows an understanding of the need for a whole number: 15 is not enough
attendees and 17 is too many, but 16 is the correct number.
So you have $200 for this party, and then you subtract 35 because of the
onetime fee. And so that would be $165 left over. And then per attendee,
it is $10.25. So you’d do $165 divided by 10.25, $10.25 per attendee, and
see. You get 16. And you can’t have 0.9 other person, so the max people
that could attend would be 16 people without exceeding the budget.
Note that the reasoning used by student M25 closely mirrors the process used by
student M20 to solve the inequality they wrote.
MC = multiple-choice
Table 23 indicates that the Linear Equations in Two Variables: Make Connections
question performed as intended, with a differential of 1.
Question 16
Question 16 is a hard (PSB 7) multiple-choice question without a context. The
question shows a linear graph and informs students that the graph represents
a translation of the graph of the function f. Students are asked to identify
which equation from the four answer options defines the function f prior to the
translation.
A) \(f(x) = -\frac{1}{4}x - 12\)
B) \(f(x) = -\frac{1}{4}x + 16\)
C) \(f(x) = -\frac{1}{4}x + 2\)
D) \(f(x) = -\frac{1}{4}x - 14\)
Let’s see. I’m going to find a good point. When it goes 1 down, it goes 1,
2, 3, 4 [over]. So \(\frac{1}{4}\) is a slope. And it’s negative. Okay.
Student M13 next rereads the question. They then count the distance in the graph
from the y-axis to the line to determine the y-coordinate of the y-intercept (2) of
the line graphed.
It’s not clear whether student M13 is using numbers from the options or has
silently reasoned that to get to the 2 in the graph after a shift of 14 to \(f(x)\), the
function \(f(x)\) had −12 as the y-coordinate of the y-intercept.
So 14. -12 + 14 is 2. Yep. -12 + 14. It’s +2. So I would go with that.
So I know that for this one, the—my intercept is 2. So, looking at this,
I just know that a first something plus 14 would have to equal 2 for the
+b-value. So, looking at this, I just know that it would be that because
the—all the slopes are constant, so I’m just looking at the b-value, and
since f (x) is just -12, if I added 14, it would just be +2. And that’s what
that is, so.
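The students' reasoning can be checked with a minimal sketch. It assumes, based on the students' statements that "-12 + 14 is 2," that the graphed line is the graph of f shifted up 14 units and that its y-intercept is 2; the question text itself is not reproduced in this excerpt, so both values are assumptions. The names below are hypothetical.

```python
# y-intercepts of the four answer options; every option has slope -1/4,
# so only the intercept distinguishes them.
OPTION_INTERCEPTS = {"A": -12, "B": 16, "C": 2, "D": -14}

SHIFT_UP = 14        # assumed translation: graph shown is y = f(x) + 14
GRAPH_INTERCEPT = 2  # y-intercept the students read from the graph


def matching_options():
    # Keep the options whose intercept lands on 2 after the assumed shift.
    return [
        label
        for label, b in OPTION_INTERCEPTS.items()
        if b + SHIFT_UP == GRAPH_INTERCEPT
    ]
```

Under these assumptions only the option with a y-intercept of -12 matches, which agrees with the value both students reached.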
1. Understand the conditions for the number of solutions for a linear system of
equations.
2. Find the value of a constant in a linear system of equations.
3. Use a graph to determine the solution(s) to a linear system of equations.
Question 20
Question 20 is a hard (PSB 7) student-produced response question without a
context. The question gives a system of two linear equations in two variables, x
and y, with an unknown constant, r. Students are asked to identify the value of
r if the system has no solution. This is a very challenging question because the
system of linear equations isn’t written in a standard form, it includes fractions and
negative values for elements of the given equations, and one equation includes an
unknown constant.
ry = 16 - 16x
The correct answer to this question is −34. To answer this question correctly,
students must articulate what it means for a system of two linear equations in two
variables to have no solution. This means that the two lines are nonintersecting,
or parallel (i.e., they have the same slope), and don’t coincide (i.e., they don’t have
“Means they do not equal each other” can reasonably be presumed to mean that
when solving the system of equations algebraically, the student has discovered
that the interim solution when solving for one variable leads to a false statement,
such as 0 = 5, which means there’s no value for that variable and thus no solution
to the system. Student M7 correctly states that “or if they do equal each other,
that’s all real solutions,” which can reasonably be presumed to mean that if solving
the system algebraically, the student has discovered that the interim result when
solving for one variable yields an equation that is always true no matter the
value of the variable, such as 5 = 5, and that the value of that variable is “all real
numbers,” and thus the system has infinitely many solutions.
Next, student M7 rewrites the equations in standard form. They wisely note that
careful work is required to avoid making errors when rewriting. They then rewrite
both equations in the order presented.
They have to be two different numbers when they first simplify, meaning
48x - 72y = 30y + 24. My original thought is this is going to be a lot of
manipulation of the equation, which can make annoying mistakes. So I
got to make sure to be careful with it. I add 72 to both sides. Or I’ll
subtract 30. I’ll keep it on the same side. -102y . [inaudible] Yeah.
-102y = 24. r × y equals—I’m going to add 16x. So ry + 16x = 16. Okay.
So now that I’ve [inaudible] just a little bit, let me fix this so it’s a little
better. 16x + ry . Okay, so it’s 48x , 16x + ry . What is the value r? Okay.
At this point, student M7 has rewritten the two equations in standard form, with
x- and y-terms on one side of the equation and the constant term on the other:
48x - 102y = 24
16x + ry = 16
Student M7 talks through what needs to happen next in order for the two
equations to represent parallel lines. They correctly observe that the x-term in the
So in order for this not to work, I know I need to have x and y cancel
each other out. However, I know I ha[ve] to multiply 16 by 3. So it has to
be 102 - r , but it has to be -3r . So I know it would have to be some—r
would have to be multiplied by 3 and so when added to 102. Okay. So
-102 - 3r = 0. -3r = 102. Subtracted [the] 2 [equations], 102 divided by
3, or -3 . . . -34. So it’s -34 × 3. -102. If I subtract the two, it cancel[s]
out. So I know it is -34.
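The condition student M7 is working toward can be sketched as follows. For a system written as a1·x + b1·y = c1 and a2·x + r·y = c2, the lines are parallel when a1·r - a2·b1 = 0, and the system has no solution as long as the two equations do not describe the same line. The helper name is hypothetical, and the coefficients are taken from the standard-form equations as printed above.

```python
from fractions import Fraction


def r_for_no_solution(a1, b1, c1, a2, c2):
    """Return r so that a1*x + b1*y = c1 and a2*x + r*y = c2 have no
    solution, or None if that r would make the two lines coincide."""
    r = Fraction(a2 * b1, a1)  # parallel lines: a1 * r - a2 * b1 = 0
    # The lines coincide only if the constants scale by the same ratio.
    same_line = Fraction(a2, a1) * c1 == c2
    return None if same_line else r
```

Calling `r_for_no_solution(48, -102, 24, 16, 16)` gives -34, the value student M7 reaches.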
Student M7 next shows some concern about whether they’ve gotten the sign
correct and checks their work to be sure. They seem to believe a negative answer
isn’t an acceptable answer but then realize the test section’s directions say they
can enter a negative response. (In the final paper-based version of the SAT Suite,
negative answers couldn’t be provided for student-produced response questions
in Math, nor would questions in that version call for such answers.)
Oh. All right. Is it just positive 34? [inaudible] Let me see. Make sure
all my signs are correct. 48x - 72y = 30y + 24. And say r is 34,
34y = 16 - 16x . Okay. Okay. [inaudible] Would it have to be positive or
negative? Well, negative is not an option. So I believe it’s 34. I’ll double-
check for the sake of making sure. [inaudible] 24 and 34. 16x + 34y =
[inaudible]. I don’t think that it can be the right answer because I thought
it would have to be negative. I don’t believe negative is an option. So I
think. Make sure. It says I could put a negative, but whenever I try to type
negative, it doesn’t work. Okay. So I can put a negative. Oh, [inaudible] at
the beginning. Okay. Yeah. So -34. That would make sense.
Student M29 also has a successful solution to question 20, this time using a
graphical approach. Student M29 starts with an algebraic solving method similar
to student M7’s but decides that there should be an easier way and eventually
graphs the system of equations.
Right, yes, I’m just looking at this and seeing maybe if I put in—for the
first equation, if I just put x to one side and then plug in that value to
the second equation to just get r, which is a constant, and only have one
value. But there should be an easier way to do this. I am still thinking
that—I’m going to try doing what I first did, and I’m assuming I just
probably made a algebra error, where I’m trying to have the equations
be equal to each other. So 48x [inaudible]. So -102y + 48x - 24. And
then ry + 16x - 16 . Let me just see. 18. Yeah. It will not work out that way,
Student M29 doesn’t describe how they graphed the equations, but one can
reasonably infer from what student M29 says next that both equations were
graphed.
So in order for it to have no solutions, that cannot cross. Right now, they
look to be perpendicular to each other.
“That cannot cross” is likely a reference to the fact that if there is no solution to the
system, the two lines cannot intersect.
Student M29’s assertion that they’re “just plugging in random values” can
reasonably be presumed to refer to entering random values for r. Note that
while narrating their work, student M29 first mistakenly says “to make sure it’s
perpendicular” but later corrects to “parallel to each other.”
Let me see. I’m just plugging in random values. I don’t really know what
[inaudible]. Let me try plugging negative values. Oh, interesting. So
plugging in negative values, which is what I thought earlier, does make
it move away from each other. I just have to make sure it’s perpendicular
to each other so—parallel to each other so it never crosses, because 33
seems to cross over here. So let me just see what value would work. 34.
Oh, I lost my graph. So 34 does not seem to cross. I’m just going to make
sure, plugging in 35. 35 definitely crosses. All right. So I’m just going to
put 34.
At the end, student M29 seems to be making an observation comparing use of the
Desmos Graphing Calculator, which is embedded in the test delivery software, and
a handheld graphing calculator. For many of the latter, the equation would need to
have been rewritten in a form of “y= ____” in order to have been entered.
I do think we can’t easily plug this into the calculator. So we would have
to set y equal to a certain function first. So with decimals, this question
is much easier than if we were doing the actual SAT with only the
calculator.
ADVANCED MATH
The Advanced Math topics assessed on the digital SAT Suite tests extend those
covered in the Algebra content domain into the realm of nonlinear equations
and functions and align most closely with topics mastered in a typical rigorous
second-year secondary algebra course and sometimes beyond. Since these
Advanced Math test questions build on skills and knowledge first mastered with
linear expressions and equations, it follows that these topics should also be
well represented on college and career readiness assessments such as those
of the digital SAT Suite. As a result, skill/knowledge elements in Advanced Math
are represented on the digital SAT Suite tests in relatively high proportions. The
Advanced Math content domain assesses skills and knowledge associated with
working with quadratic, exponential, polynomial, rational, radical, absolute value,
and conic section equations and functions. Similar to Algebra questions, questions
in the Advanced Math domain cover skill/knowledge applications similar to those
in the Algebra domain but with different types of equations, including creating and
Six Advanced Math questions were included in this cognitive interview study: two
Nonlinear Functions questions, one Make Connections question, one Determine
Conditions question, one Nonlinear Equations: Solve question, and one Rewrite
question.
Nonlinear Functions
To answer the Nonlinear Functions questions as intended, students are expected
to demonstrate at least one of the following behaviors:
MC = multiple-choice
Table 25 indicates that the Nonlinear Functions questions included in the study
performed as intended, with differentials of 1. In both cases, all but one student
demonstrating one or more of the expected behaviors answered each question
correctly.
Question 3
Question 3 is a medium-difficulty (PSB 4) multiple-choice question set in a real-
world context. The question asks students to read and understand a financial
context about an account balance and create a nonlinear equation that could
define this balance.
A) \(A(t) = 36{,}100.00(1.05)^t\)
B) \(A(t) = 31{,}971.93(1.05)^t\)
C) \(A(t) = 31{,}971.93(0.05)^t\)
D) \(A(t) = 36{,}100.00(0.05)^t\)
\(A_0\) is the initial account balance, r is the growth (or interest) rate, and t is
the time, in years. Since it’s given that the initial balance of the account is
$36,100.00, it follows that \(A_0 = 36{,}100.00\). Substituting 36,100.00 for \(A_0\), it
follows that \(A(t) = 36{,}100.00\left(1 + \frac{r}{100}\right)^t\). Students must then make a connection
can evaluate each of the given options by substituting 13 for t, where the correct
answer would then be the answer choice that would lead to A (13) = 68,071.93.
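The substitution check described above can be sketched in a few lines. The sketch reads the four options as exponential functions with bases 1.05 or 0.05 and initial values 36,100.00 or 31,971.93; the one-cent tolerance and the names used are assumptions made for illustration.

```python
TARGET = 68_071.93  # account balance after 13 years, from the question
YEARS = 13

# The four answer options, each as a function of t in years.
OPTIONS = {
    "A": lambda t: 36_100.00 * 1.05 ** t,
    "B": lambda t: 31_971.93 * 1.05 ** t,
    "C": lambda t: 31_971.93 * 0.05 ** t,
    "D": lambda t: 36_100.00 * 0.05 ** t,
}


def options_matching_target(tolerance=0.01):
    # Keep the options whose value at t = 13 is within a cent of the target.
    return [
        label
        for label, balance in OPTIONS.items()
        if abs(balance(YEARS) - TARGET) < tolerance
    ]
```

Only option A reproduces the stated balance, since 36,100.00 × 1.05^13 is approximately 68,071.93.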
Okay. A company opens up—okay, so this looks like it’s a, um, rate of
change question, an exponential growth question. So, um, a company
opens an account with an initial balance of 36,100, the account earns
interest, and, uh, no additional deposits or withdrawals are made. The
account balance is given by an exponential equation, um, on function A
where A (t), um, is the account balance in dollars t years after the account
is open. The account balance after 13 years is 68,071.93. Which equation
could define A?
Um, okay, so it looks like the rate of change is 5%, so it’s either [choice] A
or B. Um, yeah, it’s [choice] A because we know that the initial balance is
36,100.
Okay. So, first, I want to see that A is the exponential function. So I’ll
write that down so I can remember it. And then A × t is account balance.
Okay. So this right here would be probably the account balance for that
initial. And t is the years after the account is open. So I know we have
13 years. So that’s going to be our t. And a total of 68,000. And I’m trying
to find A. Okay. Okay, in order to find this. And we try to find something
that’s going to give me this number. So I would probably plug into my
calculator just to be sure. The first one, and then put t as 13. And it’s
going to give me 68,071.93. So I could have that as an option.
Question 6
Question 6 is a hard (PSB 6) multiple-choice question set in a science context. The
question asks students to read and understand a context with two sets of values
and then create an exponential equation in the given form to identify a constant
within this equation.
A) \(\frac{1}{12{,}000}\)
B) \(\frac{1}{4}\)
C) 4
D) 12,000
Student M15 interprets from the context that when t = 4, P = 24,000, and
substitutes these values in the given equation, \(P = 12{,}000(2)^{rt}\). They erroneously
divide the time by 100, thinking the rate is a percentage. Then they explore a
couple of ideas about how to solve for r.
Student M15 then plugs the values of t and C that they’d previously identified into
the answer options. They then check to see whether the equation then yields the
corresponding value of P, 24,000, that they’d previously identified and, thus, an
equation that represents the situation.
Let me just plug in the answers to see which one I could get that could
equal it. 12,000 × 2. I was never good at these exponential problems.
But the constant value of r—P is the number of bacteria after the
initial measurement. . . . Gives 24,000. Okay. So I’ve put in, plugged in
\(12{,}000(2)^{0.25(4)}\). Yeah, and the 0.25 is representative of 1 over 4. And I
did get 12,000—or 24,000 bacteria, so I’m just going to put that in, and
then I’d come back and try to plug more things in, but since I got that as
an answer, I’m probably not going to double-check everything else just
because I’m already fairly okay on time.
t would be—um, I, yeah. I think it would be—let me test this out by just
plugging it in, so. Um, 4, so then—yeah, it would be \(\frac{1}{4}\). Because after
setting up the equation 12,000 × 2, uh, and then \(\frac{1}{4} \times 4\), I was left with
24,000. And I think that’s the most reasonable choice.
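Rather than testing each option, the constant can also be solved for directly. The sketch below assumes the values the students describe (12,000 bacteria initially, 24,000 when t = 4) and the form P = 12,000(2)^(rt); the function name is hypothetical.

```python
import math


def growth_constant(initial, later, t):
    """Solve later = initial * 2 ** (r * t) for r."""
    # Dividing by the initial value leaves 2 ** (r * t); log base 2 gives r * t.
    return math.log2(later / initial) / t
```

Here `growth_constant(12_000, 24_000, 4)` returns 0.25, which agrees with the value of 1/4 that both students confirmed by substitution.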
Make Connections
To answer the Make Connections question as intended, students are expected to
demonstrate at least one of the following behaviors:
MC = multiple-choice
Table 26 indicates that the Make Connections question included in the study
performed as expected, with a low differential of 3. Six of the nine students who
answered the question correctly demonstrated one or more of the expected
behaviors.
Question 12
Question 12 is a hard (PSB 6) multiple-choice question without a context. The
question expects students to determine the x-intercepts of a function involving
a transformation and then calculate the sum of the x-coordinates of the
x-intercepts.
A) -15
B) -9
C) 11
D) 15
Choice B is the correct answer. To answer this question correctly, students could
use an understanding that function g can be obtained by substituting x - 1
for x in function f: \(f(x - 1) = (x - 1 + 6)(x - 1 + 5)(x - 1 + 1)\), which simplifies to
\(f(x - 1) = (x + 5)(x + 4)(x)\). Next, setting each factor equal to 0 and solving for x
leads to finding the x-intercepts of \(y = g(x)\), which correspond to the values of a,
b, and c. The values of a, b, and c, then, are -5, -4, and 0. Students would then
find the sum of those values, which is -9. Alternatively, students could answer
this question by graphing function f and observing that function g is obtained
by shifting function f one unit to the right. They could then observe that the
x-intercepts of function g are (-5, 0), (-4, 0), and (0, 0).
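The shift can be sketched numerically. From the substitution shown above, f(x) = (x + 6)(x + 5)(x + 1), so its zeros are -6, -5, and -1; the helper name below is hypothetical.

```python
F_ROOTS = [-6, -5, -1]  # zeros of f(x) = (x + 6)(x + 5)(x + 1)


def shifted_roots(roots, shift_right):
    # g(x) = f(x - shift_right) is zero where x - shift_right is a zero of f,
    # so each zero moves right by the shift.
    return [root + shift_right for root in roots]
```

`shifted_roots(F_ROOTS, 1)` gives [-5, -4, 0], and the sum of these values is -9.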
And for the x-intercept, so it’s—see where all that equals to 0 and that
would give me -5, -4, and 0. So if I wanted to add all those values, I
would have -9.
Student M20 follows a successful solution path similar to that used by student M8.
Determine Conditions
To answer the Determine Conditions question as intended, students are expected
to demonstrate at least one of the following behaviors:
MC = multiple-choice
Table 27 indicates that the Determine Conditions question included in the study
performed as expected, with a differential of 3. Six of the nine students who
answered the question correctly demonstrated one or more of the expected
behaviors.
Question 15
Question 15 is a hard (PSB 7) multiple-choice question without a context. The
question asks the students to determine the value of an unknown constant in a
quadratic equation given that the equation will have more than one real solution.
\(64x^2 + bx + 25 = 0\)
A) -91
B) -80
C) 5
D) 40
As student M9 substitutes the options for b from the answer choices into the
given equation, they aren’t sure what to do next. Student M9 starts with answer
choice C (5), squares 64, and then realizes that squaring 64 isn’t correct and
doesn’t mention it again. Next, they try answer choice B (−80) but quickly move on
to choice D (40).
So when I factor this, what I’m going to get is. So we know that A is
going to be 64. B is going to be 40, and then C is going to be 25. But
there’s really no numbers that we can factor this. Maybe if I change it to 5.
Or maybe that’s going to give it to me. 64, 5, and 25. But that’s not going
to give it to me. So I can try the negative numbers. And A, it’s going to be
64. B is going to be -80. Maybe that one’s going to give it to me. 1,264.
-80. That’s not going to give me either.
At this point, student M9 acknowledges that -80 gives one solution, 0.625, and
proceeds to substitute -91 for b in the quadratic equation. Unlike in their other
attempts to solve by factoring, student M9 now uses technology (either their own
calculator or the Desmos Graphing Calculator) to solve the equation by graphing
to determine the number of solutions. Student M9 doesn’t clearly articulate what
they’re doing with the technology but does say what the two values of x are when
So we’re left with 91. I think I’m going to err. Let’s see 4. Okay. So I got
-.625. And then 64. So assuming that it’s probably 80. But I’m just going
to double-check with the 91 solution. 64. -91 and 25. But that also gives
us—but they’re looking for different solutions. So I’m going to have to go
with -91 because it gave me -1.04 and -.37.
At this point, being unsure of the formula, student M8 changes their approach
and substitutes the values in the answer choices in for b in the given quadratic
equation. Next, student M8 graphs each equation and determines that of the
options, the graph has two x-intercepts only when b = -91.
I guess I would, in the worst case, just graph this out and then see where
there are, um, maybe solutions, as I’m not too sure about the formula.
Okay. Um, sorry, I’m just gonna graph that out. Um, so it seems that, um,
40 would not work out and—um, yeah, I’m just trying out and testing the
different values of b to see if there would be any solutions. Okay. And,
um, looking at the graph, it seems that -91 would be the only one to
cross the x-axis two times, so I think it would be -91.
Okay. So I know that in order to determine how many solutions there are,
it’s just -b or it’s not. Square root of \(b^2 - 4ac\) [\(b^2 - 4ac\)], that has to be
positive and not 0.
Student M7 then evaluates each answer choice. After trying 5 (answer choice C)
and 40 (choice D), student M7 makes an astute observation in evaluating the
“biggest number” since squaring a negative number results in a positive number.
Even though student M7 feels as though they’ve identified the correct answer,
they continue their process and evaluate the remaining option. Student M7 again
shows command of what the discriminant provides in stating that when the value
of the discriminant is 0, there’s going to be only one solution.
Let me just check -80. So that’s 80 × 80. 6,400 - 4 × 64 × 25. And that
gets me a 0. So I know that’s only going to be one solution. Therefore,
this is going to be -91.
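Student M7's discriminant check can be sketched for all four options at once. The function names are hypothetical; the coefficients 64 and 25 and the option values come from the question.

```python
def discriminant(a, b, c):
    # b^2 - 4ac for the quadratic a*x^2 + b*x + c = 0
    return b * b - 4 * a * c


def real_solution_count(a, b, c):
    # Positive discriminant: two real solutions; zero: one; negative: none.
    d = discriminant(a, b, c)
    return 2 if d > 0 else 1 if d == 0 else 0


OPTIONS = {"A": -91, "B": -80, "C": 5, "D": 40}
counts = {label: real_solution_count(64, b, 25) for label, b in OPTIONS.items()}
```

The result is two real solutions for -91, one for -80, and none for 5 or 40, so only b = -91 yields more than one real solution.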
Table 28 indicates that the Nonlinear Equations: Solve question included in the
study, which has a differential of 2, was exceptionally challenging for students. One
of the three students who answered the question correctly demonstrated one of
the expected behaviors.
A vignette from the student who answered correctly and demonstrated one of the
expected behaviors illustrates the kinds of complex thinking elicited by Nonlinear
Equations: Solve questions on the digital SAT Suite assessments.
\(\sqrt{5(x - k)} = x - k\)
The correct answer for this question is 7. To answer this question correctly,
students could recognize the structure of this equation and let \(y = x - k\).
Substituting y for x - k in the given equation yields \(\sqrt{5y} = y\). Squaring both sides
of this equation yields the quadratic equation \(5y = y^2\). Subtracting 5y from both
sides of this equation gives \(0 = y^2 - 5y\). Since both terms on the right-hand side of
this equation have a common factor of y, the equation can be rewritten as
\(0 = y(y - 5)\). Therefore, y = 0 or y - 5 = 0. Since y = x - k, substituting x - k for y
in these two equations yields x - k = 0 or x - k - 5 = 0. Therefore, x = k and
x = k + 5 are the solutions to the given equation. It’s given that k is a positive
constant, so k + 5 is greater than k. It’s also given that the greatest solution to the
equation is 12; therefore, 5 + k = 12, or k = 7.
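Because squaring both sides of a radical equation can introduce extraneous solutions, it may help to confirm the result numerically. The sketch below reads the given equation as the square root of 5(x - k) set equal to x - k, which is the form implied by the squaring step above; the function names are hypothetical.

```python
import math


def satisfies(x, k):
    """Check whether sqrt(5 * (x - k)) equals x - k."""
    y = x - k
    # A negative y makes the radicand negative and cannot equal a square root.
    return y >= 0 and math.isclose(math.sqrt(5 * y), y, abs_tol=1e-12)


def k_from_greatest_solution(greatest):
    # The solutions are x = k and x = k + 5, so the greatest one is k + 5.
    return greatest - 5
```

With a greatest solution of 12, `k_from_greatest_solution(12)` returns 7, and both x = 7 and x = 12 satisfy the equation when k = 7.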
Oh, oh, I’m gonna start by plugging in the greatest solution. If there—if
12 is a solution [then] we can get k. Hmm. So 12 - k = 12 - k .
Student M16 then attempts to solve the equation by squaring both sides. There
are multiple ways to solve this resulting quadratic equation; student M16 elects to
expand the squared binomial and combine like terms.
So based off that, I can just square to the right side. \(5(12 - k) = (12 - k)^2\).
So foiling that out [i.e., using the FOIL method of multiplying two
binomials], um, I get \(144 - 12k - 12k + k^2\). So \(k^2 - 24k + 144\). So
\(5(12 - k) = k^2 - 24k + 144\). Um, 5 × 12 - k is, um, 60 - 5k.
\(60 - 5k = k^2 - 24k + 144\). Um, I think I can just move that to the other side
and then solve, um, that for a quadratic. Um, -24 + 5, um, then 144 - 60.
So \(0 = k^2 - 84k\), oh, uh, +84. Okay. So \(k^2 - 19k = 84\).
At this point, student M16 determines that the solutions are 7 and 12 but doesn’t
describe how they got from the last equation articulated to that solution. Student
M16 then appears to check these two values of k in the given equation to arrive at
a key of 7. Student M16 doesn’t verbalize how they came to this conclusion, but
since it was given that the greatest solution to the equation is 12, it’s possible that
they chose 7 as it is less than 12.
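One way to fill in the step student M16 leaves unstated is sketched below. It assumes the student's quadratic was meant as k² - 19k + 84 = 0, which is what moving all terms of 60 - 5k = k² - 24k + 144 to one side produces, and it uses the earlier result that the solutions in x are k and k + 5. The function names are hypothetical.

```python
import math


def integer_roots(b, c):
    """Roots of k**2 + b*k + c = 0, assuming a perfect-square discriminant."""
    root = math.isqrt(b * b - 4 * c)
    return sorted({(-b - root) // 2, (-b + root) // 2})


def greatest_solution(k):
    # The solutions in x are k and k + 5, so the greatest is k + 5.
    return k + 5


candidates = integer_roots(-19, 84)
valid = [k for k in candidates if greatest_solution(k) == 12]
```

The candidates are 7 and 12, but only k = 7 makes 12 the greatest solution; with k = 12 the equation would also have the solution x = 17.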
Um, based off of those two solutions, what is the value of k? I guess I can
test—okay, so the only solution that seems to work is 7, I think, ’cause,
um, 12 - 12 would result in 5 × 0. So the 0 = 12 - 12, 0. Oh, that would
work. Well, 7 [corrects self] 12 - 7, 5. So 5 × 5 = [corrects self] 5 × 5 = 5
Yeah. That was two answers. So I think I’m just gonna go with 7.
MC = multiple-choice
Table 29 indicates that the Rewrite question included in the study didn’t perform as
expected, with a differential of 6. None of the students who answered the question
correctly demonstrated any of the expected behaviors.
Question 19
Question 19 is a hard (PSB 7) multiple-choice question without a context. The
question gives two equivalent expressions written in different forms, both with
unknown constants, some of which are specified as integer constants. The answer
choices are expressions with unknown constants, and using the given information
about those constants, students are to decide which choice must be an integer in
this situation.
C) \(\frac{45}{h}\)
D) \(\frac{45}{k}\)
In their attempt to find h and j, student M29 recognizes that when the binomial
factors are multiplied, the coefficient of this \(x^2\) term will equal the coefficient of
the \(x^2\) term in the expression \(4x^2 + bx - 45\), in this case 4.
As student M29 continues, they make the connection that -45 is the product of
two constants, although without specifically identifying the constants as j and k.
Student M29 then focuses heavily on determining the value of b even though this
information isn’t necessary to select the correct answer.
And then b is just what you add up, and it has to—when you multiplied it
to numbers, it has to be -45. A integer, so b just disappears. That makes
it harder. So 45 could just be -9 × 5, but it could also be -5 × 9. I’m
assuming it should be +9 - 5, as b is positive. But since we’re not given
b, I cannot be 100 percent sure. The following must be an integer. So b
[is] 45.
Student M29 makes an error in thinking the value of b is the sum of j and k,
forgetting that the value of h is 4, not 1. In doing so, they conclude that b = 4, h = 4,
k = 9, and j = -5 and proceed to evaluate the answer choices using those values
to determine which must be an integer. Both choice A and choice D result in
integers, and the student incorrectly selects choice A.
While this was a failed attempt, student M29 showed signs of the kinds of
cognitively complex thinking meant to be elicited by the question.
The Problem-Solving and Data Analysis content domain assesses knowledge and
skills in using ratios, rates, proportional relationships, unit analysis, percentages,
probability and conditional probability, one- and two-variable data, scatterplots,
models, inference from sample statistics, and evaluating statistical claims.
Unlike topics covered in the Algebra and Advanced Math content domains, the
topics addressed by the digital SAT Suite in Problem-Solving and Data Analysis
aren’t aligned to those covered in a specific secondary-level math course. State
education systems include the topics covered in this domain in a variety of
courses, starting with middle school/junior high school math and continuing
through high school coursework. The test questions in the Problem-Solving and
Data Analysis domain range in difficulty from relatively easy to relatively complex
and challenging and test a wide range of reasoning skills.
Fit a Model
To answer the Fit a Model question as intended, students are expected to
demonstrate at least one of the following behaviors:
Table 30 indicates that the Fit a Model question included in the study performed
as intended, with a differential of 0. Every student who demonstrated one or both
expected behaviors answered the question correctly.
Question 2
Question 2 is a medium-difficulty (PSB 4) multiple-choice question without a
context. The question gives a scatterplot with ten data points in a downward trend
and doesn’t show a line of best fit. The answer choices differ in the slope and in
the y-coordinate of the y-intercept of the line.
A) y = 0.9 + 9.4x
B) y = 0.9 - 9.4x
C) y = 9.4 + 0.9x
D) y = 9.4 - 0.9x
Choice D is the correct answer. To answer this question correctly, students could
conceptualize a line of best fit and approximate two points on that line to compute
the slope and y-intercept. Alternatively, given the answer choices, students could
recognize that the y-intercept is close to (0, 10), thereby allowing them to eliminate
choices A and B, and then recognize that the slope is negative, which results in
choice D being the best answer.
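The elimination strategy described above can be sketched in Python. The answer choices come from the question. The scatterplot data aren't reproduced in this report, so the approximate intercept of 10 is taken from the explanation above, and the helper name `plausible` and its tolerance are illustrative assumptions.

```python
# Answer choices for question 2, stored as (y-intercept, slope).
choices = {
    "A": (0.9, 9.4),
    "B": (0.9, -9.4),
    "C": (9.4, 0.9),
    "D": (9.4, -0.9),
}

def plausible(intercept, slope, approx_intercept=10.0, tolerance=2.0):
    """Keep a line only if its intercept is near the value read off the
    scatterplot and its slope matches the downward trend.
    The tolerance of 2.0 is an assumed, illustrative cutoff."""
    near_intercept = abs(intercept - approx_intercept) <= tolerance
    downward = slope < 0
    return near_intercept and downward

remaining = [label for label, (b, m) in choices.items() if plausible(b, m)]
print(remaining)
```

Checking the intercept first removes choices A and B, and the sign of the slope then separates choice C from choice D.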
Unit Rates
To answer the Unit Rates question as intended, students are expected to
demonstrate at least one of the following behaviors:
Table 31 summarizes how students performed on the Unit Rates question included
in the study.
Table 31 indicates that the Unit Rates question included in the study performed
as intended, with a differential of 0. Every student who demonstrated one or both
expected behaviors answered the question correctly.
One of a planet’s moons orbits the planet every 252 days. A second
moon orbits the planet every 287 days. How many more days does it
take the second moon to orbit the planet 29 times than it takes the first
moon to orbit the planet 29 times?
The correct answer to this question is 1,015 (which students would enter without
a comma in the provided field). To answer this question correctly, students
must recognize that they need to find the number of days it takes each moon to
orbit the planet 29 times and then subtract the two values to find the positive
difference. To do this, students should multiply each of the given rates (the number
of days it takes each moon to orbit the planet) by the number of times the moons
orbit the planet (29): (252)(29) and (287)(29). Once each of these values is found,
students can find the difference in the number of days it took each moon to orbit
the planet 29 times: 8,323 - 7,308, or 1,015.
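As a quick check of the arithmetic above, the following Python sketch computes the answer both ways: multiplying each orbital period by 29 and subtracting, and multiplying the per-orbit difference by 29. The variable names are illustrative only.

```python
days_first_moon = 252   # days per orbit, first moon
days_second_moon = 287  # days per orbit, second moon
orbits = 29

# Path 1: total days for each moon, then subtract.
total_first = days_first_moon * orbits
total_second = days_second_moon * orbits
difference = total_second - total_first

# Path 2: per-orbit difference first, then scale by the number of orbits.
shortcut = (days_second_moon - days_first_moon) * orbits

print(total_first, total_second, difference, shortcut)
```

Both paths give the same value because multiplication distributes over subtraction.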
So right away, just off of reading the question, I can tell that I’m going
to have to multiply each planet’s orbit by 29 and then just subtract the
difference.
Student M21 then performs the necessary calculations and arrives at the correct
answer.
So I’ll just go ahead and do 252 × 29 to find out the amount of days it
takes to orbit that moon. So, 1 second. So 252 × 29. 7,308 for the first
moon, so I’ll just go ahead and write that down. A second one orbits the
planet every 287 days. And now for the second moon, I will do 287 × 29
to figure out how long it takes to orbit. So 287 × 29 is 8,323. So I’ll just
subtract the two. And I’m left with 7,308 - 8,323. 1,015. . . . I’ll go with
1,015.
It’s worth noting that student M21 verbalizes the required subtraction in the wrong
order but has the presence of mind to enter the correct, positive value. Student
M21 also says “orbit that moon” even though the question says each moon is
orbiting the planet.
Next, student M28 recognizes that they need to multiply this difference by 29, the
number of times each moon orbits the planet.
Table 32 indicates that the Probability question included in the study performed
as intended, with a differential of 0. Every student who demonstrated one or more
expected behaviors answered the question correctly.
Question 9
Question 9 is a medium-difficulty (PSB 5) student-produced response question set
in a real-world context. The question indicates the total number of attendees at a
conference and states that each attendee is assigned to one of three groups. The
probability of selecting an attendee at random assigned to two of these groups
is given. Students are to determine how many attendees are assigned to the third
group.
Student M24 begins their successful approach to question 9 by first adding the
given probabilities and then using the complement of the event, in the process
demonstrating a clear command of the skill assessed by the question.
Next, student M24 multiplies the total number of attendees by the probability of
selecting an attendee in group C.
So let’s see. I think I’m gonna do 275 ÷ 3. Well, no. 275 × .44 is gonna
be 121. So for group A, there’s 121 attendees. And then for group B, I’m
gonna do 275 × .24, which gives me 66.
Student M30 then subtracts the number of attendees in groups A and B from
the total number of attendees at the conference, which results in the number of
attendees assigned to group C.
Finally, student M30 checks their answer via the alternative solution path
discussed above by multiplying the sum of the given probabilities by the total
number of attendees. However, the student only verbalizes how they arrived at the
number of attendees not in group C and doesn’t verbally include a step for arriving
at a value of 88.
And then just to double-check, I’m gonna do .44 + .24, which is .68.
275 × .68 is 187, which would mean that the first choice was right. So
I’m gonna leave it at, um, 88 attendees for group C.
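The two solution paths that appear in the vignettes for question 9 can be compared in a short Python sketch. The values 275, 0.44, and 0.24 are taken from the student transcripts; the variable names are illustrative, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

total_attendees = 275
p_group_a = Fraction(44, 100)
p_group_b = Fraction(24, 100)

# Path 1: count the attendees in groups A and B, then subtract from the total.
group_a = total_attendees * p_group_a
group_b = total_attendees * p_group_b
group_c_by_subtraction = total_attendees - group_a - group_b

# Path 2: use the complement of "assigned to group A or group B".
p_group_c = 1 - (p_group_a + p_group_b)
group_c_by_complement = total_attendees * p_group_c

print(group_a, group_b, group_c_by_subtraction, group_c_by_complement)
```

Both paths arrive at the same number of attendees in group C.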
Table 33 indicates that the Sample Proportion question included in the study
performed as intended, with a differential of 5. Every student who answered
correctly and demonstrated one or more expected behaviors showed evidence
of having engaged in cognitively complex thinking. Some students demonstrated
one or more expected behaviors but ultimately didn’t answer the question
correctly. Five of the students answered the question correctly but didn’t
demonstrate an expected behavior.
Question 11
Question 11 is a hard (PSB 6) multiple-choice question set in a social studies
context. The question describes a situation in which a sample of 1,000 people was
chosen at random from a population of 50,000 and surveyed about support for a
proposed piece of legislation. The question then gives an estimated proportion
of the sampled population that supports the legislation and an associated margin
of error. Students are then asked to identify which of a given set of numbers is a
plausible value for the total number of people in the population who support the
proposed legislation.
A) 350
B) 650
C) 16,750
D) 31,750
Choice C is the correct answer. To answer this question correctly, students need
to comprehend the context and apply the understanding that when a statistic for
an estimated proportion is given with a margin of error, it means that there is a
range of plausible values for the true population value. Students could determine
a plausible value for the total number of people in the population who support
the legislation by using the sample proportion: 35% of 50,000, or 17,500. Then
students could use the margin of error percentage to find the range of plausible
values: ±3% of 50,000, or ±1,500, resulting in a range of 16,000 to 19,000.
Alternatively, students could use the margin of error to identify the range of the
proportion of the population that supports the proposed legislation. Students
could find the range by adding 3% to and subtracting 3% from 35%: 32% to 38%.
Then students could apply the least proportion of the range and the greatest
proportion of the range to the population total to find the range of plausible
values for the number of people from the population who support the proposed
legislation: 32% of 50,000 is 16,000, and 38% of 50,000 is 19,000.
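The range computation described above can be sketched in Python and used to screen the answer choices. The numbers come from the question as described; the variable names are illustrative.

```python
population = 50_000
sample_proportion_pct = 35  # estimated percent supporting the legislation
margin_of_error_pct = 3

# Plausible range for the number of supporters in the whole population.
low = population * (sample_proportion_pct - margin_of_error_pct) // 100
high = population * (sample_proportion_pct + margin_of_error_pct) // 100

choices = {"A": 350, "B": 650, "C": 16_750, "D": 31_750}
plausible = [label for label, value in choices.items() if low <= value <= high]

print(low, high, plausible)
```

Only one answer choice falls inside the interval from 16,000 to 19,000.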
So that’s a lot to digest. I’m going to break it down. First and foremost, I
have a population of 50,000, right? So that’s a given. 50,000 people. Out
of those 50,000, 1,000 were chosen at random, right? So the target group
is 1,000, which is a straight number, and that’s easy to calculate. Based
on the survey, it estimated that 35% of people in the population, so it
would be 3,500 out of the 1,000 here. And then it would be 35% of 50,000,
right? They support the legislation. And there is an associated margin of
error of 3%. Okay. Based on the results, they want the plausible value for
the total number of people in the population who support the proposed
legislation. Okay, so they’re speaking population, which would be 50,000.
If it were the 1,000 that were chosen, it would be 350, but it’s not. So I
would just do 35% of 50,000.
Student M26 next talks through the concept of margin of error and how to use
that in solving the problem. They find 35% of the population and then find the
lower bound of the plausible range, or 32% of the population. Student M26 then
Student M30 begins their successful and efficient approach by first finding the
estimated population mean and then reasoning that a small margin of error means
the plausible value would be close to the estimated population value.
So it’s asking you for the total number of people in the population who
support the proposed legislation. So the sample size is 50,000. Well,
the, the total population. And then the sample is 1,000. Um, okay. So
50,000 × .35, since that’s what it’s saying is the, like, number of people
who were estimated to support the legislation. So 50,000 × .35 is 17,500.
And there can be an estimated margin of error of 3%. So 350 would be
way off. 650 would be way off. 31,000 would be way off. So the closest
answer choice is 16,750, so I’m gonna go with that.
Derived Units
To answer the Derived Units question as intended, students are expected to
demonstrate at least one of the following behaviors:
Table 34 indicates that the Derived Units question included in the study performed
as intended, with a differential of 4. All three students who demonstrated one or
more of the expected behaviors answered the question correctly. An additional
four students answered the question correctly but didn’t demonstrate any of the
expected behaviors.
Question 14
Question 14 is a hard (PSB 7) multiple-choice question set in a science context.
This question is challenging because it requires multiple steps to solve, involves
using derived units to solve a problem, and requires some geometry skills in order
to find a side length from the volume of a cube. The question gives the density
of a certain type of wood and the mass of a cube-shaped sample of this wood.
Students are then asked to find the length of one edge of the sample.
The density of a certain type of wood is 353 kilograms per cubic meter.
A sample of this type of wood is in the shape of a cube and has a mass
of 345 kilograms. To the nearest hundredth of a meter, what is the
length of one edge of this sample?
A) 0.98
B) 0.99
C) 1.01
D) 1.02
Choice B is the correct answer. To answer this question correctly, students should
use the density of the wood and the mass of the sample, in their corresponding
units, to determine the unknown volume of the sample. Students may know the
formula for density ($\text{density} = \frac{\text{mass}}{\text{volume}}$), or they may use the derived unit for the
density value to write a proportion. Students can represent the unknown volume
of the sample with a variable (such as V) and then write a proportion to represent
the density relationships from the given information: $\frac{353\ \text{kg}}{1\ \text{m}^3} = \frac{345\ \text{kg}}{V\ \text{m}^3}$. Therefore, the
Student M16 successfully solves this problem by first identifying the relationship
between mass and volume with respect to density.
Student M16 next makes a connection between the given density for the wood
and the density of the sample and sets up a proportion. The solution process isn’t
articulated clearly, but student M16 nonetheless performs the steps correctly.
So 353 = $\frac{345}{\text{volume}}$. Okay, I can swap that. 345 ÷ 353. So the volume is 0.977.
What is the length of—oh, so I think based off of that volume. Wait, that
doesn’t make any sense. Density equals [mass over volume]. Yeah, I did
that correctly.
After finding the volume for the sample, student M16 falters a bit before finding
their way to the next step and then uses the value for the volume to find the side
length of the cube.
Am I supposed to find the area of this [stuff or that?]? Oh, okay. So, so
one of the answer choices is cubed, I think. Pretty sure it’s this [0.99].
Wait, I—I’m just gonna do the cube root of—the cube root of 0.977. So
0.99—yeah, yeah, that’s it.
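The two steps in this solution path, finding the volume from the density and mass and then taking the cube root, can be sketched in Python. The given values come from the question; the variable names are illustrative.

```python
density_kg_per_m3 = 353  # given density of the wood
mass_kg = 345            # given mass of the cube-shaped sample

# density = mass / volume, so volume = mass / density.
volume_m3 = mass_kg / density_kg_per_m3

# For a cube, volume = edge ** 3, so edge is the cube root of the volume.
edge_m = volume_m3 ** (1 / 3)

print(round(volume_m3, 3), round(edge_m, 2))
```

Rounding the edge length to the nearest hundredth of a meter gives the value in choice B.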
Student M19 also successfully answered question 14. Student M19 remembers
the formula for density and applies it to the given information but lacks some
confidence in this solution path.
Student M19 then uses the formula for the volume of a cube to find the cube’s side
length.
Percentages
To answer the Percentages question as intended, students are expected to
demonstrate at least one of the following behaviors:
Table 35 indicates that the Percentages question included in this study didn’t
perform as intended. None of the students who exhibited either behavior were
able to answer the question correctly. The five students who exhibited expected
behavior 2 understood that a decrease in value by r% is computed using a factor
of $\left(1 - \frac{r}{100}\right)$. None of the nineteen students answering the question used the
correct factor of $\left(1 + \frac{r}{100}\right)$ to compute an increase in a value of r% where r is
greater than 100 (expected behavior 1). Many students gave evidence of holding
the misconception that the factor to use to compute an increase in a value of
r% where r is greater than 100 is just $\frac{r}{100}$ rather than $\left(1 + \frac{r}{100}\right)$. Thirteen of the
The included student vignettes for this question illustrate the misconceptions that
students held while answering.
The value of a collectible comic book increased by 167% from the end
of 2011 to the end of 2012 and then decreased by 16% from the end of
2012 to the end of 2013. What was the net percentage increase in the
value of the collectible comic book from the end of 2011 to the end of
2013?
A) 124.28%
B) 140.28%
C) 151.00%
D) 209.72%
Choice A is the correct answer. The question’s difficulty stems from two factors:
one, the question requires multiple steps to solve and, two, the question is prone
to the application of two common misconceptions. To answer this question,
students should first assign a variable to the value of the collectible comic book at
the end of 2011, say x. Then they should write an expression that represents the
comic book’s value at the end of 2012: $\left(1 + \frac{167}{100}\right)x$, or $2.67x$. Then students should
write an expression that represents a decrease of 16% from the value at the end of
2012 to the end of 2013. This would be written as $\left(1 - \frac{16}{100}\right)(2.67x)$, or
$(0.84)(2.67x)$. This is equivalent to $2.2428x$. This means that the comic book’s
value at the end of 2013 can be written as $(1 + 1.2428)x$, or $\left(1 + \frac{124.28}{100}\right)x$, so the
net percentage increase is 124.28%.
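The computation above, together with the misconception of using $\frac{r}{100}$ as the increase factor, can be contrasted in a brief Python sketch. The percentages come from the question; the variable names are illustrative, and `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal

increase_pct = Decimal("167")
decrease_pct = Decimal("16")

# Correct factors: (1 + r/100) for an increase, (1 - r/100) for a decrease.
growth_factor = 1 + increase_pct / 100
decay_factor = 1 - decrease_pct / 100
net_factor = growth_factor * decay_factor
net_increase_pct = (net_factor - 1) * 100

# Misconception: treating r/100 alone as the increase factor.
mistaken_factor = (increase_pct / 100) * decay_factor
mistaken_answer_pct = mistaken_factor * 100

print(net_increase_pct, mistaken_answer_pct)
```

The correct factors lead to choice A, while the mistaken factor reproduces the value in choice B.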
book’s value. Student M16 does, however, correctly use the factor $\left(1 - \frac{r}{100}\right)$ for
the decrease in value. Instead of assigning a variable to the value of the collectible
comic book at the end of 2011, student M16 starts by assigning a value of
$1 to the comic book. This can be a very effective strategy for simplifying the
computation in this type of question, even though student M16 was ultimately not
successful in solving the question.
So, uh, if we just assume that our initial value is 1, then we multiply that
by 1.67, and then it gets—it gets decreased by 16%. So, um, multiply that
by 0.84. So what was the net percent increase? So based off of that, I got
1.4028, and since my initial value was 1, I can assume that to just be a
percentage. So I move that over to the right two decimal places. That’s
140.28%.
Test questions in the Geometry and Trigonometry content domain involve applying
skills and knowledge in finding areas, perimeters, volumes, and surface areas;
using concepts and theorems related to lines, angles, and triangles (PSAT 8/9
includes triangle angle sum theorem only); solving problems using right triangles
(PSAT 8/9 includes Pythagorean theorem only); solving problems using special
right triangles and right triangle trigonometry (SAT, PSAT/NMSQT, and PSAT 10
only); calculating using sine, cosine, and tangent (SAT only); solving problems
using radian measure and trigonometric ratios in the unit circle (SAT only); and
using definitions, properties, and theorems relating to circles (SAT only). These
test questions vary in difficulty from easy to very hard and allow students to
demonstrate problem-solving skills and knowledge using a variety of solving
strategies.
Equation of a Circle
To answer the Equation of a Circle question as intended, students are expected to
demonstrate at least one of the following behaviors:
1. Use the coordinates of the center of a circle to write the equation for the circle.
2. Solve for the radius of a circle.
3. Identify the correct equation of a circle.
4. Identify the correct equation for a circle by substituting a given point in the
answer choices.
Table 36 indicates that the Equation of a Circle question included in the study
performed as intended, with a differential of 1. Nine of the ten students who
answered the question correctly demonstrated one or more expected behaviors.
Not all students articulated solving for the value of the radius (expected
behavior 2) through their explanations captured during the interview; rather, they
used the numbers in the answer choices to assume the length of the radius of the
circle. All students who answered correctly needed to identify the correct equation
of the circle, but not all these students successfully articulated the reasons for
their choice. Two of the students used an alternative strategy of substituting the
given point into the equations in the answer choices to determine which equation
represented the correct answer.
Question 8
Question 8 is a medium-difficulty (PSB 5) multiple-choice question without a
context. The question asks students to identify the correct equation of a circle in
the xy-plane when given the coordinates for the center of the circle and a point on
the circle.
A circle in the xy-plane has its center at (-4, 5) and the point (-8, 8)
lies on the circle. Which equation represents this circle?
A) $(x - 4)^2 + (y + 5)^2 = 5$
B) $(x + 4)^2 + (y - 5)^2 = 5$
C) $(x - 4)^2 + (y + 5)^2 = 25$
D) $(x + 4)^2 + (y - 5)^2 = 25$
Well, first, the left side of the equation has to be $(x + 4)^2 + (y - 5)^2$
because that point’s the center of the circle.
Student M20 then talks through finding the length of the radius of the circle. They
sketch a graph and then note that they’ll use the distance formula. They recognize
that the vertical distance and the horizontal distance between the two given points
form the legs of a right triangle, identify a common Pythagorean triple in the form
of the lengths of the two legs of the right triangle, and thereby determine the
length of the hypotenuse, which is also the radius of the circle.
And then I would . . . just draw it out so I can visualize the circle a little
bit. Putting the center here, (-4, 5) and (-8, 8), [counts out gridlines] 2,
3, 4, 5, 6, 7, 8. And then I’d use distance formula to get the distance
between these two. So the bottom leg, its distance is 4 because
-4 - (-8) = 4. And then the vertical leg would be 3 because 8 is 3 away
from 5. And then since this is a 3-4-5 triangle, the distance between the
radius and the outside point on the circle is 5, which means that it’s this
[choice D] because r is squared . . . on the right side of the equation and
$5^2$ is 25.
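Both strategies observed for question 8, finding the radius with the distance formula and substituting the given point into each answer choice, can be sketched in Python. The center, point, and answer choices come from the question; the tuple encoding and the helper name `on_circle` are illustrative assumptions.

```python
import math

center = (-4, 5)
point = (-8, 8)

# Strategy 1: the radius is the distance from the center to the given point.
radius = math.hypot(point[0] - center[0], point[1] - center[1])

# Each choice is (h, k, r_squared) for (x - h)^2 + (y - k)^2 = r^2.
choices = {
    "A": (4, -5, 5),
    "B": (-4, 5, 5),
    "C": (4, -5, 25),
    "D": (-4, 5, 25),
}

def on_circle(h, k, r_squared, x, y):
    """Return True if the point (x, y) satisfies the circle equation."""
    return (x - h) ** 2 + (y - k) ** 2 == r_squared

# Strategy 2: substitute the given point into every answer choice.
matches = [label for label, (h, k, r2) in choices.items()
           if on_circle(h, k, r2, *point)]

print(radius, matches)
```

The substitution check alone is enough to single out one choice, which is consistent with the alternative strategy some students used.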
Student M26 takes a circuitous route to the correct answer. Student M26 first
takes stock of the given information and thinks about a possible formula for finding
the equation.
Okay, so I’m given two points. I’m going to write these down for
reference. Writing down −4 and then 5. And I’m actually going to put a
little bit of information as far as what these points represent. So the circle
in the xy-plane has its center at (-4, 5). So that’s the center, right? Then
Okay. Since it’s a xy-plane and I’m seeing here with these answer
choices, they’re taking the points and they’re actually adding them
together, right? So $(x - 4)^2 + (y + 5)^2$, and that equals 5. What I don’t see,
actually, is the other point, (-8, 8). All the answer choices, they stem
from -4 and 5. And then that would equal a certain value.
Student M26 then thinks about how a calculator might be helpful and eventually
decides to use the given point to evaluate which answer choice contains the
correct equation for this circle. Note that in their first try of substituting the point
(-8, 8) in answer choice D, student M26 makes an error by saying that $3^2$ would
be 6 but later recognizes and corrects the error.
Okay. Now, with this one, I’m not too entirely sure what course of action
I would take, but what I’m going to do is I’m going to take a look at what
I have here with the center. And then I’m going to see which of these
answer choices makes the most sense as far as it being truthful. With
D, for example: $(x + 4)^2 + (y - 5)^2 = 25$. Obviously if I put that in the
calculator, I wouldn’t get the actual answer that it equals 25. I have to
know what x is and, therefore, what y is.
Now, I’m also thinking that perhaps the x and the y would be taken from
the point like (-8, +8), but then if I did—yeah, if I did 8, right, for the
y in this case, if I did 8 - 5, that would be 3. $3^2$ would be 6. And if I did
-8 + 4, that would be $(-4)^2$; that’s 16. 16 plus 6, that’s now, what, 22.
I’m going to try that. I’m going to try that and go through each answer
choice, work backwards a little bit, and see what I’m able to find. So I’ve
established that D wouldn’t work because -8, like I said, x plus—if -8
was x, then that would be -4. $(-4)^2$, 4 times 4, that would be 16. And
then 16 plus—oh, it was 9, actually. Yeah, 16 + 9 is 25. Yeah. Okay, so it
might be D. That’s an answer choice. Let me try the same thing here. So if
I had -8 - 4, right, that would give me -12. And then $(-12)^2$, I believe
that’s, like, 144 plus—yeah, that would be a really big answer choice, so
I wouldn’t go with that. So far, I’m liking D, and I’ll write it out here in
the calculator. It was -8 + 4, which gives me -4. $(-4)^2$ would be 16.
I had the other thing—let me write that down. I had the other variable,
Table 37 indicates that the Special Right Triangles question included in the
study didn’t perform as intended. Many of the students who answered the
question correctly didn’t demonstrate either of the expected behaviors. Both of
the students who got this question correct and who also exhibited one of the
behaviors ultimately found the correct answer by checking the given perimeter
against the side lengths provided in the answer choices and then using their
knowledge of the ratio of the lengths of the sides in a special right triangle. The
majority of students reasoned their way to the correct answer by first realizing that
choices C and D were too large and then deciding they needed a radical in their
answer (choice B) because there was one in the given perimeter value.
Question 10
Question 10 is a hard (PSB 7) multiple-choice question without a context. The
question gives the perimeter of an isosceles right triangle and asks for the length
of one leg of the triangle.
A) 47
B) $47\sqrt{2}$
C) 94
D) $94\sqrt{2}$
Choice B is the correct answer. To answer this question correctly, students are
expected to recognize that an isosceles triangle is a triangle with two sides of
the same length. An isosceles right triangle is also known as a 45°-45°-90° right
triangle, since the angle measures of all isosceles right triangles are 45°, 45°, and
90°. To successfully solve this question using the expected behaviors, students
must recognize the pattern of the lengths of the sides of a 45°-45°-90° right
triangle. If one leg has a length of x units, the other leg also has a length of x units,
and the hypotenuse has a length of $x\sqrt{2}$ units. Students would then write an
equation for the perimeter of the triangle and set it equal to the given perimeter:
$x + x + x\sqrt{2} = 94 + 94\sqrt{2}$. This equation can be simplified on the left-hand side,
giving $2x + x\sqrt{2} = 94 + 94\sqrt{2}$. Factoring out a common factor on the left-hand side
gives $x(2 + \sqrt{2}) = 94 + 94\sqrt{2}$. Next, dividing both sides of the equation by $2 + \sqrt{2}$
gives $x = \frac{94 + 94\sqrt{2}}{2 + \sqrt{2}}$. This value can be rewritten by multiplying the right-hand side
of the equation by $\frac{2 - \sqrt{2}}{2 - \sqrt{2}}$, a technique used to remove the radical expression
from the denominator, which gives $x = \frac{94 + 94\sqrt{2}}{2 + \sqrt{2}} \times \frac{2 - \sqrt{2}}{2 - \sqrt{2}}$. This simplifies to
$x = \frac{188 + 188\sqrt{2} - 94\sqrt{2} - 188}{4 + 2\sqrt{2} - 2\sqrt{2} - 2}$, or $x = \frac{94\sqrt{2}}{2}$, which is equivalent to $x = 47\sqrt{2}$. Therefore,
the length of each leg of the isosceles right triangle is $47\sqrt{2}$.
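The result can be checked numerically with a small Python sketch: solve the perimeter equation for the leg length, then rebuild the perimeter from a leg of $47\sqrt{2}$. The variable names are illustrative, and floating-point comparisons use a tolerance.

```python
import math

sqrt2 = math.sqrt(2)
given_perimeter = 94 + 94 * sqrt2

# Solve x(2 + sqrt(2)) = 94 + 94*sqrt(2) for the leg length x.
leg = given_perimeter / (2 + sqrt2)

# Rebuild the perimeter from the claimed answer, 47*sqrt(2).
claimed_leg = 47 * sqrt2
hypotenuse = claimed_leg * sqrt2          # leg * sqrt(2) in a 45-45-90 triangle
rebuilt_perimeter = 2 * claimed_leg + hypotenuse

print(math.isclose(leg, claimed_leg), math.isclose(rebuilt_perimeter, given_perimeter))
```

The check also shows why the perimeter looks unfamiliar: the two legs together contribute $94\sqrt{2}$, and the hypotenuse contributes the 94.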
I can’t remember my triangles. Isosceles, I’m not sure if that’s with all
of them correct. I’m not sure. I’m going to try different numbers. This
references trig. Isosceles? It’s not equilateral. It’s not scalene. Wait, unless
that’s not—is the isosceles with two? I’m going to just assume it’s with
two—if not, whatever.
Student M7 then continues effortfully to figure out how the perimeter is related to
the side lengths. Their response exemplifies the confusion that many students in
the sample had in solving this question: the perimeter value shows $\sqrt{2}$ but then
doesn’t match the pattern of $x + x + x\sqrt{2}$ in an obvious way because the $\sqrt{2}$
seems to be in the wrong place.
It’s going to be 47 because when finding the perimeter, you add all sides
together. And the only way to have $94 + 94\sqrt{2}$ is by—the 94 could be a
combination of the two sides. But then why is $94\sqrt{2}$ if x is actually 47?
$47\sqrt{2}$, unless it’s just my wrong memory. However, $94 + 94\sqrt{2}$. Two are
the same. Hypotenuse is, I believe, just one side times $\sqrt{2}$. Therefore, it’s
that, what should be $94\sqrt{2} + 94\sqrt{2}$. Oh, actually, I’ll come back to this
one. But I’m going to say it’s 47. I don’t know if it’s as simple as it would
just be the 94 because if I wanted to add the perimeter, it’d be 94 twice.
I think I’ll just do with 47 and come back to it and hope maybe another
question has a triangle that gives me the correct dimensions.
That’s so weird because the s had to be 94 in order for this to make any
sense. But if it’s two 94s—it says an isosceles right triangle has a
perimeter of 94. What is the length in inches of one leg of this triangle?
Okay. . . . What would happen if I do $47\sqrt{2}$ plus $47\sqrt{2}$? Yeah, that’s the
main thing I would expect. Isosceles right triangle. It’d [be] $2s + s\sqrt{2}$ . . . .
Oh, I’m done. It’s $47\sqrt{2}$. It has to be because then I know it’s
$47\sqrt{2} + 47\sqrt{2}$ or each multiplied by that 2, that by 2, which is $94\sqrt{2}$. And
then if you were to just multiply that by $\sqrt{2}$, which is the hypotenuse,
then that ends up getting 94. And that’s why it’s $94 + 94\sqrt{2}$. I thought
94 + 2 was the hypotenuse. That’s actually the two side lines added
together.
I’m looking at the square root and maybe we divide—94 ÷ 2, and that
will be 47. And so we look at the answer choices with 47 and so—I’m
assuming it would be $47\sqrt{2}$ ’cause if the right triangle has a perimeter of
$94 + 94\sqrt{2}$ inches, then I’m assuming it should look similar to this. All
right.
Table 38 indicates that the Volume question included in the study performed as
intended, with a differential of 3. Nine of the twelve students who answered the
question correctly demonstrated one or more of the expected behaviors.
Question 13
Question 13 is a hard (PSB 6) multiple-choice question without a context. This
question is challenging because it requires multiple steps to solve. The question
describes a three-dimensional figure composed of a cube with a solid sphere
inside of it that touches the center of each cube face. The question asks students
to find the volume of the space in the cube not taken up by the sphere.
A) 149,796
B) 164,500
C) 190,955
D) 310,800
Student M25 answers this question correctly by first finding the volume of the
cube using the formula V = lwh, which for a cube is equivalent to $V = s^3$. Note that
the student records the answer of 68 × 68 × 68 incorrectly as 314,422. The actual
value of the volume of the cube is 314,432.
Student M25 then finds the volume of the sphere. They initially give the volume
formula as $\frac{4}{3}\pi r^2$, with the radius squared, but the radius should actually be cubed.
From student M25’s response, it’s clear that they correctly cubed the radius to get
39,304 in an interim calculation.
Student M25 next explains that subtraction is needed to find the space not taken
up by the sphere.
. . . but we’re trying to find the space that is not taken up. So you would
subtract this number and the number we got earlier. So it would be
314,422 -, which is the volume of the cube, 164,636, which would be
149,796, which is right here [option A].
Student M29 uses a similar strategy to answer this question. They start by
identifying the needed formulas and then talk through a plan for finding the
volumes.
Student M29 then finds the volume for the sphere using the formula correctly,
demonstrating skill at thinking about the reasonableness of the answer.
So the volume should be—it seems a little high, but 164,636. Oh, looking
at the answers, it does make sense that it is pretty high.
Student M29 next rereads the question and reasons that the correct answer can’t
be greater than the volume of the sphere. Then they proceed to find the volume
of the cube. Finally, they subtract the volume of the sphere from the volume of the
cube to find the proper answer.
And then to the nearest cube, what is the amount of the space in the
cube not taken up by the sphere? So just looking at the answers, I know
it cannot be higher than the volume of the sphere itself. Actually, never
mind. Okay. Let me see if I can figure out the volume of the cube because
then I could easily just subtract the two. Cube has an edge length. Oh.
Okay. So it kind of slipped my mind, but now looking back on it, a
cube, all of the sides have to be the same length. So if one of them is
68 inches, I know it’s basically just 68 × 68 × 68, and let me put that into
a calculator. It would be 314,432. And then I’m just going to subtract
these, the cube and the sphere. All right. And I get exactly 149,796, which
is option A.
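The calculation both students carry out can be reproduced in a short Python sketch. The edge length of 68 inches comes from the student transcripts; taking the sphere's diameter to equal the cube's edge follows from the description of the sphere touching the center of each face. The variable names are illustrative.

```python
import math

edge_in = 68            # edge length of the cube, in inches
radius_in = edge_in / 2  # the sphere touches each face, so diameter = edge

cube_volume = edge_in ** 3
sphere_volume = (4 / 3) * math.pi * radius_in ** 3
unoccupied_volume = cube_volume - sphere_volume

print(cube_volume, round(sphere_volume), round(unoccupied_volume))
```

Rounded to the nearest cubic inch, the unoccupied space matches choice A.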
As table 39 indicates, all examined Reading and Writing questions had differentials
from 0 to 5, indicating that the questions performed as intended per the
methodology established for this study. Vignettes presented in this report for each
of the questions further substantiate the claim that the questions were able to
elicit cognitively complex thinking in line with the questions’ intended constructs.
Math
Table 40 summarizes the quantitative results for the Math section’s questions.
Two questions—question 19, Advanced Math: Rewrite, and question 10, Geometry
and Trigonometry: Special Right Triangles—had differentials in excess of 5,
the threshold established by this report’s methodology for acceptable (“low”)
differentials. Possible issues with these two questions are discussed below. A third
question—question 18, Problem-Solving and Data Analysis: Percentages—also
bears discussion, as no student in the sample was able to answer the question
correctly.
Although this question didn’t behave as intended per the study’s methodology,
students answering correctly did nonetheless exhibit aspects of cognitively
complex thinking, as evidenced in the accompanying vignettes. Furthermore,
as observed above, students who reached the correct answer by distractor
elimination tended to show a fundamentally clear understanding of how
the rewriting process should work, ruling out distractors on the basis of an
understanding of the nature of constants and a correct assessment of the likely
value of one of the variables. Finally, it’s worth noting that the incorrect answer
choices included in the question represent surface-reasonable misinterpretations
of what must be an integer based on the rewriting of the expression, an argument
against this question having a flaw that students exploited.
Three factors again militate against the conclusion that question 10 had significant
flaws that precluded students from demonstrating cognitively complex thinking.
First, the vignettes associated with the students who correctly answered the
question and demonstrated one or both expected behaviors indicate that the
question is capable of eliciting cognitively complex thinking in accordance
with the question’s intended construct. Second, those students who answered
question 10 correctly but failed to demonstrate one or both of the expected
behaviors nonetheless demonstrated aspects of cognitively complex thinking
and mathematical understanding—critically, a strong sense of reasonable values
in the given scenario, whereby two distractors were deemed implausible and the
correct answer was deemed necessary because it included a radical. Third, as in
question 19, question 10’s answer choices represent a range of at least surface-
plausible options, and only either enacting the question’s intended construct or
SUBSECTION SUMMARY
To reiterate, seventeen of the twenty Math questions included in the study
performed as expected, with differentials of 5 or lower and vignettes supportive
of the claim that these questions elicited cognitively complex thinking. Two of the
questions that didn’t perform as expected (question 19 in Advanced Math and
question 10 in Geometry and Trigonometry) were deemed to lack significant flaws,
while the third (question 18 in Problem-Solving and Data Analysis) seems simply to
have been too difficult for the participating students, at least under the pressure
of simultaneously solving and thinking aloud, though the question is sound and
still has assessment value in terms of helping measure the achievement of the
highest-performing digital SAT Suite test takers.
Policymakers
The results and discussion sections of this report provide a strong basis for the
conclusion that the digital SAT Suite’s Reading and Writing and Math sections
include numerous questions that elicit cognitively complex thinking from students
in accordance with both the requirements of college and career readiness in
general and the U.S. Department of Education’s expectations for large-scale
standardized assessments used as part of state educational accountability
systems. All examined Reading and Writing questions and the vast majority
(85 percent) of examined Math questions performed as expected and in line
with intended question-level constructs designed to elicit cognitively complex
behavior. The two examined Math questions whose student responses exceeded
the differential threshold of 5 were shown to lack significant flaws while still being
able to elicit aspects of cognitively complex thinking, while a third question simply
proved too difficult for the sampled students to answer during the study.
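The 85 percent figure follows directly from the counts reported above. A minimal sketch of that tally, using only the question numbers named in this report (the variable names are illustrative, and no differential values are assumed):

```python
# Counts and question numbers as reported in this section; names are illustrative.
total_math_questions = 20
exceeded_differential_threshold = {19, 10}  # differentials greater than 5
no_correct_responses = {18}                 # too difficult for the sampled students

not_as_expected = exceeded_differential_threshold | no_correct_responses
performed_as_expected = total_math_questions - len(not_as_expected)
share = performed_as_expected / total_math_questions

print(f"{performed_as_expected} of {total_math_questions} ({share:.0%})")
```

This prints "17 of 20 (85%)".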
Researchers
The methodology employed in this and a prior study (College Board and HumRRO
2020) is proposed as a reasonable, vetted, albeit time- and effort-intensive way
to ascertain whether a given assessment’s (or assessment system’s) questions
are capable of eliciting cognitively complex thinking from test takers. It builds on
a robust research base supporting the validity of using verbalizations obtained
from concurrent think-aloud studies to surface and analyze aspects of cognition
that would otherwise be difficult if not impossible to recover. It also establishes
and provides a rationale for a derived metric—the differential—that lends a useful
quantitative complement to an otherwise strictly qualitative analysis of student
responses. As Section 3: Methodology and Section 4: Results make clear, this
metric, created by abstracting from rigorous qualitative coding and analysis,
provides a useful way to identify successful question functioning in relation to
Undertaking this study has also identified methodological refinements that seem
likely to enhance the soundness and quality of future results.
First, the number of test questions included in the study—twenty for each subject
area—resulted in some students rushing to complete the activity or failing to
complete it in the allotted time. In addition, some students were able to finish—
or finish additional questions—only because early in the interview process,
College Board and its vendor, Vidlet, jointly determined that priority should be
given to allowing students as much of the study time as possible for answering
test questions, meaning that postexperience interview questions were often not
asked (and haven’t been analyzed for this study). Variance in n-counts for student
responses by question is largely a product of some students running out of time;
some additional variance resulted from the College Board researchers concluding
that a small number of student responses to individual questions couldn’t be
coded due to ambiguous transcripts.
Because ninety minutes is likely near the upper limit of the time that student
volunteers would be willing and able to engage productively in this activity, the
College Board research team has concluded that the question sets used in
subsequent studies should be pared down to approximately fifteen or sixteen
questions to ensure that all students can give their best effort and pay full
attention throughout the activity.
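To make the effect of this reduction concrete, the sketch below computes the average time available per question. It assumes the think-aloud window stays at the 70 minutes allotted in this study's interview protocol, which this report does not state for future studies. The function name is hypothetical.

```python
THINK_ALOUD_MINUTES = 70  # assumption: same window as in this study's protocol


def minutes_per_question(question_count: int) -> float:
    """Average minutes available per question in the think-aloud window."""
    return THINK_ALOUD_MINUTES / question_count


for count in (20, 16, 15):
    print(f"{count} questions: {minutes_per_question(count):.2f} minutes each")
```

Under that assumption, moving from twenty questions to fifteen or sixteen raises the average from 3.5 minutes per question to somewhat more than 4 minutes.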
Second, the proportion of hard questions in the Math sample bears further
examination. Twelve of the twenty Math questions examined in this study came
from performance score bands 6 and 7, the two highest; by contrast, only six of
the twenty Reading and Writing questions came from these bands. High-difficulty
Math questions were disproportionately selected for the study because they
were deemed most likely to elicit cognitively complex behavior; however, students
often struggled to answer these questions correctly at all, and question 18—a
PSB 7 student-produced response question in the Problem-Solving and Data
Analysis content domain—elicited no correct responses from the sampled
students. It’s likely that just as cognitively simple or routine tasks are too “easy”
to elicit cognitively complex thought—or, indeed, much conscious reflection at
all—questions that are highly cognitively demanding risk flummoxing the majority
of student respondents, at least under the think-aloud conditions of the study.
When the set of Math questions is reduced in number for use in subsequent
cognitive interview studies, College Board will closely consider whether some of
the highest-difficulty questions (including question 18) should be eliminated from
the study. This wouldn’t be done to “improve” (bias) the results of subsequent
studies, as the overall results of this study have shown that hard questions can
elicit cognitively complex thought; rather, it would be to carefully limit the range of
presented questions to more closely mimic what a typical test-taking population
would likely be able to solve while under the added pressure of thinking aloud to an
interviewer.
Third, due mainly to the fact that College Board hadn’t yet completed the transition
to the digital-suite tests when this study was being conceptualized and conducted,
some of the Reading and Writing test questions examined here had not yet been
Fourth, regarding sample selection, a volunteering student who had a previous SAT
Math section score of 200—the lowest possible—was included in the study. It’s
most likely that this student didn’t attempt the Math section during their previous
testing. Their inclusion was an oversight, and future College Board studies will
avoid such inclusions.
Fifth, the authors of this study assumed that participants were highly
unlikely to have been previously exposed to the test questions sampled in the
study. This assumption seemed warranted because despite these questions
being available publicly as samples or as part of College Board–supported test
preparation, the first domestic testing using the digital SAT Suite tests was roughly
half a year away. This assumption may not hold, however, for subsequent studies
in this vein. Given that, College Board will carefully consider whether the questions
(or subsets of the questions) examined in this study can still safely be used for this
purpose.
Beck, Isabel L., Margaret G. McKeown, and Linda Kucan. 2013. Bringing Words to
Life: Robust Vocabulary Instruction, 2nd ed. New York: Guilford.
Bettman, James R., and C. Whan Park. 1980. “Effects of Prior Knowledge and
Experience and Phase of the Choice Process on Consumer Decision Processes:
A Protocol Analysis.” Journal of Consumer Research 7, no. 3 (December): 234–48.
https://www.jstor.org/stable/2489009.
Biggs, Stanley F., and Theodore J. Mock. 1983. “An Investigation of Auditor
Decision Processes in the Evaluation of Internal Controls and Audit Scope
Decisions.” Journal of Accounting Research 21, no. 1 (Spring): 234–55.
https://doi.org/10.2307/2490945.
Branch, Jennifer L. 2001. “Junior High Students and Think Alouds: Generating
Information-Seeking Process Data Using Concurrent Verbal Protocols.” Library
and Information Science Research 23, no. 2 (Summer): 107–22.
https://doi.org/10.1016/S0740-8188(01)00065-2.
Branch, Jennifer L. 2013. “The Trouble with Think Alouds: Generating Data Using
Concurrent Verbal Protocols.” In Proceedings of the Annual Conference of CAIS
/ Actes Du congrès Annuel De l’ACSI. Edmonton: University of Alberta Library.
https://doi.org/10.29173/cais8.
College Board. 2019. College Board National Curriculum Survey Report 2019.
New York: College Board. https://satsuite.collegeboard.org/media/pdf/national-
curriculum-survey-report.pdf.
College Board. 2022. 2022 SAT Suite of Assessments Annual Report: Total Group.
New York: College Board. https://reports.collegeboard.org/media/pdf/2022-total-
group-sat-suite-of-assessments-annual-report.pdf.
College Board. 2023a. Assessment Framework for the Digital SAT Suite, version
2.0 (August 2023). New York: College Board. https://satsuite.collegeboard.org/
media/pdf/assessment-framework-for-digital-sat-suite.pdf.
College Board. 2023b. Skills Insight for the Digital SAT Suite. New York: College
Board. https://satsuite.collegeboard.org/media/pdf/skills-insight-digital-sat-
suite.pdf.
College Board and HumRRO. 2020. The Complex Thinking Required by Select
SAT Items: Evidence from Student Cognitive Interviews. New York: College Board.
https://satsuite.collegeboard.org/media/pdf/sat-cognitive-lab-report.pdf.
Ericsson, K. Anders, and Herbert A. Simon. 1993. Protocol Analysis: Verbal Reports
as Data, rev. ed. Cambridge, MA: MIT Press.
Goos, Merrilyn, and Peter Galbraith. 1996. “Do It This Way! Metacognitive
Strategies in Collaborative Mathematical Problem Solving.” Educational Studies in
Mathematics 30, no. 3 (April): 229–60. https://www.jstor.org/stable/3482842.
Johnstone, Christopher, Kristi Liu, Jason Altman, and Martha Thurlow. 2007.
Student Think Aloud Reflections on Comprehensible and Readable Assessment
Items: Perspectives on What Does and Does Not Make an Item Readable. Technical
Report 48. Minneapolis: University of Minnesota, National Center on Educational
Outcomes. https://files.eric.ed.gov/fulltext/ED499410.pdf.
Kletzien, Sharon Benge. 1991. “Strategy Use by Good and Poor Comprehenders
Reading Expository Text of Differing Levels.” Reading Research Quarterly 26, no. 1
(Winter): 67–86. http://www.jstor.com/stable/747732.
Leow, Ronald P., and Kara Morgan-Short. 2004. “To Think Aloud or Not to
Think Aloud: The Issue of Reactivity in SLA Research Methodology.” Studies in
Second Language Acquisition 26, no. 1 (March): 35–57. https://doi.org/10.1017/
S0272263104026129.
Magliano, Joseph P., and Keith K. Millis. 2003. “Assessing Reading Skill with a
Think-Aloud Procedure and Latent Semantic Analysis.” Cognition and Instruction
21, no. 3: 251–83. https://www.jstor.org/stable/3233811.
Nguyen, Lemai, and Graeme Shanks. 2007. “Using Protocol Analysis to Explore
the Creative Requirements Engineering Process.” In Information Systems
Foundations: Theory, Representation, and Reality, edited by Dennis N. Hart and
Shirley D. Gregor, 133–52. Canberra: Australian National University Press.
Nisbett, Richard E., and Timothy DeCamp Wilson. 1977. “Telling More than We Can
Know: Verbal Reports on Mental Processes.” Psychological Review 84, no. 3 (May):
231–59. https://doi.org/10.1037/0033-295X.84.3.231.
Özcan, Zeynep Çiğdem, Yeşim Imamoğlu, and Vildan Katmer Bayraklı. 2017.
“Analysis of Sixth Grade Students’ Think-Aloud Processes While Solving a Non-
Routine Mathematical Problem.” Kuram Ve Uygulamada Eğitim Bilimleri [Journal
of Educational Sciences: Theory and Practice] 17, no. 1: 129–44. https://jestp.com/
menuscript/index.php/estp/article/view/492/444.
Pressley, Michael, and Peter Afflerbach. 1995. Verbal Protocols of Reading: The
Nature of Constructively Responsive Reading. Hillsdale, NJ: Erlbaum.
Russo, J. Edward, Eric J. Johnson, and Debra L. Stephens. 1989. “The Validity of
Verbal Protocols.” Memory and Cognition 17, no. 6 (November): 759–69.
https://doi.org/10.3758/BF03202637.
Stratman, James F., and Liz Hamp-Lyons. 1994. “Reactivity in Concurrent Think-
Aloud Protocols: Issues for Research.” In Speaking about Writing: Reflections on
Research Methodology, edited by Peter Smagorinsky, 89–111. Thousand Oaks,
CA: Sage.
Dear [student],
You’re eligible to earn a $100 digital gift card for participating in an online research
study that will take no more than an hour and a half. Your input will help us ensure
the quality of our assessments for future students.
Learn More
This study will be conducted in early April. Learn about the study, and sign up to
participate.
On the day of the study, you must have 90 continuous minutes to participate, as well
as:
This study will be conducted entirely online and consists of an interview in which
you’ll be asked to describe your approach to answering a series of SAT® questions.
On successful completion of the activity, you’ll receive an email with a $100 digital
gift card that can be used at a variety of retailers.
Complete this form by Wednesday, March 29, to review the details of what you’ll
be asked to do and to sign up to participate. There’s limited space in this research
study, and you’ll be informed on April 3 if you’ve been selected to participate. If you
aren’t selected, we’ll notify you if other opportunities arise in the future.
Sincerely,
College Board
This study will be conducted entirely online and consists of a 90-minute interview
where you’ll be asked to describe how you respond to SAT questions. Upon
successful completion of the activity, you will receive an email with a link to redeem
a $100 digital gift card with a retailer of your choice.
Q2 Are you sure you want to close this form without signing up?
Q3 If selected to participate, your name and email address will be shared with our
contractor, Vidlet, who will send a link to schedule your interview and a consent
form which must be signed by your parent or guardian, or you if you are over 18.
You will be notified on April 3 if you have been selected.
Daytime and evening interview sessions will be held between April 6 and 16 and
are available on a first-come, first-served basis. Once you schedule your session
and return your signed consent form, you will receive a Zoom link to join at the time
of your interview.
To complete your interview and earn a $100 digital gift card, you must have
90 continuous minutes to work as well as:
After you successfully complete the interview, Vidlet will process your gift card
through digital payment platform [redacted]. You will receive a link from [redacted]
to the email address provided, which you can use to redeem your payment in the
form of a bank transfer, PayPal deposit, or a gift card of your choice – [redacted]
has over 300 gift card options for you to choose from.
If you are selected to participate, College Board reserves the option to cancel your
participation in its sole discretion if your participation is no longer needed. In such
case, you will not receive a gift card.
I agree (1)
Q4 Great. Let’s confirm your information. The information you provide here will be
used to confirm your eligibility for the study, and if selected your name and email
address will be shared with Vidlet. Please note that ALL fields are required.
Q5 First Name
Q6 Last Name
Q7 Email
Q8 Grade Level
12th (1)
11th (2)
During the interview, you will meet over Zoom one-on-one with a researcher from
Vidlet. The interviewer will send a link to you for you to access the sample SAT
questions. You will need to share your screen, and the interviewer will ask you to
describe how you would approach answering the questions. The sessions will be
recorded.
If you successfully complete the interview, you will receive a link to redeem a $100
digital gift card through [redacted] at your designated email address.
If you are selected to participate, College Board reserves the option to cancel your
participation in its sole discretion if your participation is no longer needed. In such
case, you will not receive a gift card.
By clicking submit, you are signing up to participate in this research study, your
information will be used to confirm your eligibility to participate, and if selected will
be shared with Vidlet.
We thank you for your time spent taking this survey. Your response has been
recorded.
Termination Page:
Thank you for your interest. We’re sorry to hear that you cannot participate. Please
watch your email for other opportunities.
EXHIBIT 3
The following is a copy of the consent form signed by participating students and
their parents/guardians. As before, the name of the gift card vendor has been
redacted.
By signing this agreement, the student identified below (“Student”), with consent
of their parent/guardian (“Parent/Guardian”), agree to Student’s participation in
SAT Question Interviews, a research study for College Board (“Study”). The Study
involves the Student providing feedback to College Board on SAT questions,
including but not limited to, providing feedback via a screen-sharing session with
a College Board researcher where students may be asked questions or provide
feedback about how they answer SAT questions. The study will be conducted
entirely online. The activity will take no more than an hour and a half, and on
successful completion of the activity, payment will be made via digital payment
platform, [redacted]. Student will receive a link from [redacted] to the email
address provided which can be used to redeem payment in the form of a bank
transfer, PayPal deposit, or a gift card of choice – [redacted] has over 300 gift card
options.
Student and Parent/Guardian hereby give their full and complete permission to
College Board and its agents to photograph, record (audio and video) Student’s
participation (“Images”). Student and Parent/Guardian grant College Board and its
designees, affiliates, agents, subcontractors, and licensees (collectively, “College
Board”) the right to use, transcribe, edit, reproduce, broadcast, publish, exhibit,
publicize, and otherwise distribute, without compensation to Student and Parent/
Guardian, any Images, along with Student responses, statements and comments
Student makes during or in connection with the Study (together with the Images,
“Information”). The rights hereby granted to College Board are perpetual and
worldwide.
Any Images will be stored securely consistent with College Board policies and only
College Board personnel involved in the Study and related research and product
development will access the recordings. Images will be kept for one year and then
securely destroyed. Transcriptions will be kept for two years and then securely
destroyed.
Student and Parent/Guardian acknowledge that College Board will rely on this
permission and that College Board, in its sole discretion, may decide whether or
As the session will include use of live video during the screen-sharing session,
please be mindful of your background (for example, avoid having other individuals
in the room, secure any personal items and information from view of the camera
and other similar safeguards the Student and Parent/Guardian may wish to
consider in their discretion), understanding and acknowledging that the researcher
will be able to view the Student’s background through the Student’s camera.
This Student Research Group Agreement is the full and complete understanding
between College Board, Student, and Parent/Guardian. Student and Parent/
Guardian each represent they have had adequate time to read this document
carefully and to ask any questions that they may have.
Please Print:
All roman-font text in this script indicates directions for the interviewer.
These should not be read aloud to the student.
If, at any time, deviations from this script occur, the interviewer should document
them on the timing and irregularity form (a copy of which is attached as appendix
B). This form should be labeled with the student’s identification code and the date
and time of the interview.
Each student should be assigned a unique identifier for use in reporting on the
study.
Late-Arriving Students
If the student arrives 15 or fewer minutes late to their scheduled interview, the
interviewer should still conduct the interview. Omit the postexperience interview
questions (section D) if necessary to allow the student the full 70 minutes for the
think-aloud activity (section C).
During the interview, sit within camera view. Provide the student with the following
overview:
Thank you for taking time to participate in this research study today. Before I
explain the activity, I want to give you some background. The purpose of this
research study is to help College Board, the makers of the SAT, learn more
about how students like you approach questions on the test—specifically,
Reading and Writing test questions. I’ll be reading a lot from this document
today. This is to help ensure that all students participating in this research
activity have as similar an experience as possible.
This research is to evaluate the test questions, not you, so don’t be concerned
about whether you answer a particular question correctly.
During our time together, please keep your computer’s camera on and the
microphone unmuted as much as possible. Please silence your phone, try to
avoid distractions and interruptions, and don’t allow others to join you in this
activity. Please also close all computer applications except the ones being used
in this activity so that you can better focus on this task.
At any time during this study, you’re welcome to take a break, use the restroom,
or choose to stop participating. All you need to do is let me know. All the
information we collect today will be used only for research purposes, and you
will not be identified by name or other personally identifying information in our
final report. After successfully completing all steps in the study, you’ll receive a
$100 gift card.
Overview
Read the following text to the student:
In this activity, you’ll think aloud as you work through each test question. You’ll
verbally share any and all thoughts you have about each question as you read
and answer it. In doing so, you’ll describe all the steps you take to obtain your
answer as well as any other thoughts about the question that occur to you.
Your goal today is to think aloud as fully, honestly, and freely as possible as you
work through each question. Remember: We’re evaluating the questions, not you.
I realize you may not have participated in a think-aloud study before, so let’s
consider a couple of examples. First, I’ll demonstrate thinking aloud using a
sample test question. Then I’ll give you a sample question so that you can
practice thinking aloud before you begin answering the rest of the questions.
Direct the student to the interviewer practice question (“IP”) in the Qualtrics survey
so the student can follow along.
I’ll start by reading the test directions, passage, and question aloud and then
narrate what I’m thinking as I answer the question.
Passage and question: The following text is from F. Scott Fitzgerald’s 1925
novel The Great Gatsby.
[Jay Gatsby] was balancing himself on the dashboard of his car with that
resourcefulness of movement that is so peculiarly American—that comes,
I suppose, with the absence of lifting work in youth and, even more, with the
formless grace of our nervous, sporadic games. This quality was continually
breaking through his punctilious manner in the shape of restlessness.
As used in the text, what does the word “quality” most nearly mean?
A) Characteristic
B) Standard
C) Prestige
D) Accomplishment
Reading this passage and question, it looks like I’m being asked to figure out
how the word “quality” is used in the text. “Quality” appears in the last sentence
of the passage: “This quality was continually breaking through his punctilious
manner in the shape of restlessness.” I have no idea what “punctilious” means,
but I think I can still answer the question about “quality” without knowing that.
Going back through the passage, I realize that “this quality” refers to Gatsby’s
“resourcefulness of movement.”
I’m now looking at the answer choices and trying to figure out which one is
the best answer here. “Characteristic,” choice A, makes sense, because the
passage is describing something that Gatsby regularly shows, like a trait. The
passage tells us that Gatsby’s restlessness is “continually breaking through
Notice how when I was thinking aloud, I didn’t try to simply summarize what I
did after I was done answering. Instead, as I approached this question, I told
you exactly what I was thinking as I thought it. I first read the passage and the
question aloud and then explained what I thought the question was asking, how
I went about answering the question, and why I came up with the answer that
I did. I want you to do the same sort of thing when you read and answer test
questions today.
After addressing any questions or concerns, continue to the first student practice
question.
Now I’d like you to practice thinking aloud using this practice question.
Remember: Try to say everything that goes through your mind as you read and
answer the question. Please begin by reading the question and answer choices
out loud. Continue by thinking aloud as you answer the question.
The first student practice question (“SP1”) is reprinted in appendix A for your
information but should be presented to the student onscreen.
For this activity, you’ll read several passages and respond to 20 questions.
Each passage or pair of passages below is followed by a single question. After
reading each passage or pair of passages, choose the best answer to each
question based on what is stated or implied in the passage or passages and in
any accompanying graphics (such as a table or graph).
You’ll have 70 minutes to complete the Reading and Writing test questions.
While you’re working, I’ll be using a timer to keep track of how long you take to
answer each question. This is just so that we have a better sense of how long
each question is taking students to answer.
Answer as many questions as fully and completely as you can. Answer each
question to the best of your ability, and then move on to the next. You should
have enough time for all 20 questions, but don’t worry if you don’t make it all the
way through the full set. If you finish early, you may review your answers if you
wish.
Remember to verbalize any and everything that comes to mind as you work
through each question. If you stop talking for a bit, I’ll prompt you to keep
thinking aloud.
After addressing any questions or concerns, direct the student to advance to the
first actual test question (“1”) in the Qualtrics survey.
Take out the timing and irregularity form (a copy of which appears in appendix B) to
track the student’s time on each question, and commence the activity by reading
the following to the student:
After you finish the activity, I’ll ask you a few questions about your experience
today.
I’ll turn on the timer as soon as you begin reading the first test question aloud.
START RECORDING
Don’t forget to read each passage, question, and answer choice out loud.
Please begin.
The interviewer should watch and listen attentively to the student as they work
through the questions.
§ The interviewer should prompt the student to talk if it’s obvious they’re
making progress but not verbalizing. Such prompts should be minimal and
nondirective—for example, “Please remember to say out loud what you’re
thinking.”
§ If the student goes silent and appears stuck on a question, allow them
approximately 15 seconds of silence before probing. Then say “Remember to
keep talking,” “Please continue,” “Go on,” or the like.
As students work on the questions, the interviewer should maintain the timing
and irregularity form.
§ Record the start time, to the nearest second, for each question.
§ Record the stop time, to the nearest second, for each question.
§ When students complete their first attempt at answering a question, enter the
start and stop times from the timer in the “Attempt 1” column. If the student
returns to a question to complete it or to check their work, use the “Attempt 2”
and “Attempt 3” columns, as needed, to track the time.
§ If a student makes more than one attempt on a particular question, flag this on
the form so that College Board can easily see that the student’s efforts aren’t
confined to the first attempt.
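One way to picture the record-keeping described above is as a small data structure. The following is a hypothetical sketch, not the actual timing and irregularity form: the class and field names are assumptions, and times are represented as seconds elapsed on the interviewer's timer.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionTiming:
    """Timing record for one question (hypothetical layout, not the real form)."""
    question_id: str
    # Each attempt is a (start_seconds, stop_seconds) pair read from the timer.
    attempts: list[tuple[int, int]] = field(default_factory=list)

    def record_attempt(self, start_seconds: int, stop_seconds: int) -> None:
        """Add one attempt, rejecting a stop time earlier than its start time."""
        if stop_seconds < start_seconds:
            raise ValueError("stop time precedes start time")
        self.attempts.append((start_seconds, stop_seconds))

    @property
    def multiple_attempts(self) -> bool:
        """Flag showing that effort was not confined to the first attempt."""
        return len(self.attempts) > 1

    @property
    def total_seconds(self) -> int:
        """Time spent on the question across all recorded attempts."""
        return sum(stop - start for start, stop in self.attempts)


timing = QuestionTiming("1")
timing.record_attempt(0, 95)        # first attempt
timing.record_attempt(3600, 3640)   # student returns to check their work
print(timing.multiple_attempts, timing.total_seconds)
```

In this example the record is flagged as having more than one attempt, and the total time is the sum of both attempts (135 seconds).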
When the student finishes, even if they finish early, continue to section D.
Note that students may ask for input on how well they did or for the answer to one
or more questions. Inform them that they’re not being scored on their answers and
that their verbal responses are what we’re interested in.
Time’s up. Please stop working. Don’t worry about any questions you didn’t
answer.
Continue to section D.
You’ve finished the part of the interview dedicated to answering the study’s
Reading and Writing test questions.
Now I’d like to ask you a few questions about your experience today.
Ask the following retrospective questions, in order. If time runs short, omit later
questions as needed to allow the session to end at the approximately 90-minute
mark.
1. Please tell me a bit about the experience you just had. What was it like to answer
those questions?
STOP RECORDING
Continue to section E.
Say:
That concludes our interview. Thank you for participating. Your input regarding
these Reading and Writing questions is valuable to us.
Gift Card
[Vidlet to describe the gift card procedure]
Conclude Session
Thank the student again for their participation.
Complete Paperwork
Ensure that the timing and irregularity form is complete and appropriately stored.
Finish Up
[Vidlet to provide directions to the interviewer for what to do next]
***
SP1
Some studies have suggested that posture can influence cognition, but we
should not overstate this phenomenon. A case in point: In a 2014 study,
Megan O’Brien and Alaa Ahmed had subjects stand or sit while making risky
simulated economic decisions. Standing is more physically unstable and
cognitively demanding than sitting; accordingly, O’Brien and Ahmed
hypothesized that standing subjects would display more risk aversion during
the decision-making tasks than sitting subjects did, since they would want to
avoid further feelings of discomfort and complicated risk evaluations. But
O’Brien and Ahmed actually found no difference in the groups’ performance.
A) It presents the study by O’Brien and Ahmed to critique the methods and
results reported in previous studies of the effects of posture on cognition.
B) It argues that research findings about the effects of posture on cognition
are often misunderstood, as in the case of O’Brien and Ahmed’s study.
C) It explains a significant problem in the emerging understanding of
posture’s effects on cognition and how O’Brien and Ahmed tried to solve
that problem.
D) It discusses the study by O’Brien and Ahmed to illustrate why caution is
needed when making claims about the effects of posture on cognition.
(Key: D)
SP2
Many animals, including humans, must sleep, and sleep is known to have a
role in everything from healing injuries to encoding information in long-term
memory. But some scientists claim that, from an evolutionary standpoint,
deep sleep for hours at a time leaves an animal so vulnerable that the known
benefits of sleeping seem insufficient to explain why it became so widespread
in the animal kingdom. These scientists therefore imply that ______
(Key: B)
Directions:
Timing: Using a timer, record in the “Attempt 1” column the start and stop times,
to the nearest second, for the student’s work on a given test item. If
the student returns to an item, also record the start and stop times under
“Attempt 2” and, if necessary, “Attempt 3” in the same manner. Flag those rows
for easy identification by College Board.
Irregularities: Record any deviations from the interview script in the irregularities
section. Indicate what impact, if any, such irregularities had on the session.