Papers by Meghan Rector Federer

Our study investigates the challenges introduced by students’ use of lexically ambiguous language in evolutionary explanations. Specifically, we examined the meanings students assigned to five key terms incorporated into their written evolutionary explanations: pressure, select, adapt, need, and must. We utilized a new technological tool known as the Assessment Cascade System (ACS) to investigate the frequency with which biology majors spontaneously used lexically ambiguous language in evolutionary explanations, as well as their definitions and explanations of what they meant when they used such terms. Three categories of language were identified and examined in this study: terms with Dual Ambiguity, Incompatible Ambiguity, and Unintended Ambiguity. In the sample of 1282 initial evolutionary explanations, 81% of students spontaneously incorporated lexically ambiguous language at least once. Furthermore, the majority of these initial responses were judged to be inaccurate from a scientific point of view. While not significantly related to gender, age, or reading/writing ability, students’ use of contextually appropriate evolutionary language (pressure and adapt) was significantly associated with academic performance in biology. Comparisons of initial responses to follow-up responses demonstrated that the majority of student explanations were not reinterpreted after consideration of the follow-up response; nevertheless, a sizeable minority was interpreted differently. Most cases of interpretation change were a consequence of resolving initially ambiguous responses, rather than a change in accuracy, resulting in an improved understanding of students’ evolutionary explanations. We discuss a series of implications of lexical ambiguity for evolution education.
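The ACS workflow described above (detect ambiguous terms in an initial explanation, then ask the student what they meant) can be pictured as a small text-mining pass over each response. The following Python sketch is purely illustrative: the term patterns, function names, and prompt wording are assumptions, not the actual ACS implementation.

```python
import re

# The five lexically ambiguous terms examined in the study, with simple
# stem patterns to catch common inflections (e.g., "selected", "adaptation").
AMBIGUOUS_TERMS = {
    "pressure": r"\bpressures?\b",
    "select":   r"\bselect(?:s|ed|ion|ive)?\b",
    "adapt":    r"\badapt(?:s|ed|ive|ations?)?\b",
    "need":     r"\bneed(?:s|ed)?\b",
    "must":     r"\bmust\b",
}

def mine_ambiguous_terms(explanation: str) -> list[str]:
    """Return the ambiguous terms that appear in one written explanation."""
    text = explanation.lower()
    return [term for term, pattern in AMBIGUOUS_TERMS.items()
            if re.search(pattern, text)]

def follow_up_prompts(explanation: str) -> list[str]:
    """One follow-up question per detected term, mirroring the ACS idea."""
    return [f"You used the word '{term}'. What did you mean by it?"
            for term in mine_ambiguous_terms(explanation)]

response = ("The cheetahs needed to run faster, so selective pressure "
            "made them adapt over many generations.")
print(mine_ambiguous_terms(response))  # ['pressure', 'select', 'adapt', 'need']
print(follow_up_prompts(response))
```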

Evolution: Education and Outreach, Jan 1, 2010
The notion of “pressure” as an evolutionary “force” that “causes” evolution is a pervasive linguistic feature of biology textbooks, journal articles, and student explanatory discourse. We investigated the consequences of using a textbook and curriculum that incorporate so-called force-talk. We examined the frequency with which biology majors spontaneously used notions of evolutionary “pressures” in their explanations, students’ definitions and explanations of what they meant when they used such pressures, and the structure of explanatory models that incorporated evolutionary pressures and forces. We found that 12–20 percent of undergraduates spontaneously used “pressures” and/or “forces” as explanatory factors, but significantly more often in trait-gain scenarios than in trait-loss scenarios. The majority of explanations using “force-talk” were characterized by faulty evolutionary reasoning. We discuss the conceptual similarity between faulty notions of evolutionary pressures and linguists’ force-dynamic models of everyday reasoning, and ultimately question the appropriateness of force-talk in evolution education.
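As a worked illustration of the kind of frequency comparison reported above (force-talk in trait-gain vs. trait-loss scenarios), the sketch below runs a chi-square test of independence on hypothetical counts. Only the 12–20 percent range is taken from the abstract; the cell values themselves are invented for demonstration.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts, for illustration only: rows are scenario types,
# columns are [used "pressure"/"force", did not use force-talk].
table = [
    [48, 192],   # trait-gain items: 20% force-talk
    [29, 211],   # trait-loss items: 12% force-talk
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```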

Behavioral Ecology and …, Jan 1, 2010
Animal color patterns often reflect a compromise between natural selection for crypsis or inconspicuousness to predators and sexual selection for conspicuousness to potential mates. In leaf litter-dwelling wolf spider species like Schizocosa ocreata, body coloration often closely matches the background coloration of a generally brown environment. However, body parts used in communication should exhibit high contrast against background coloration. We used spectral analysis to examine male and female S. ocreata for matching and contrasting coloration against leaf litter. Values were plotted in multivariate color space, based on reflectivity in different frequency ranges. When viewed from above, the colors of both males and females overlap with values for dead brown leaf litter and soil, suggesting cryptic coloration when viewed by potential predators. However, when viewed from a lateral perspective, both males and females show color values that are polar opposites of litter backgrounds, suggesting higher contrast when viewed by other spiders. Moreover, the male secondary characters used in visual signaling by S. ocreata (tibia brushes) show the highest level of background contrast. These findings suggest that S. ocreata wolf spiders have color patterns that provide both crypsis and background contrast at the same time, depending on receiver viewing perspective.
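The “multivariate color space, based on reflectivity in different frequency ranges” can be approximated by binning each reflectance spectrum into broad wavelength ranges and ordinating the binned values. This sketch uses synthetic spectra and PCA; the authors’ actual spectral-analysis pipeline is not specified here, so the bin boundaries and ordination choice are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical reflectance spectra (300-700 nm in 10 nm steps) for spider
# body regions and leaf-litter samples; real data would come from a
# spectrometer. Shape: (n_samples, n_wavelengths).
wavelengths = np.arange(300, 701, 10)
spectra = rng.uniform(0.05, 0.6, size=(30, wavelengths.size))

# Summarize each spectrum as mean reflectance in broad wavelength bins
# (UV, blue, green, yellow, red), one common way to build a color space.
bins = [(300, 400), (400, 475), (475, 550), (550, 625), (625, 700)]
features = np.column_stack([
    spectra[:, (wavelengths >= lo) & (wavelengths < hi)].mean(axis=1)
    for lo, hi in bins
])

# Project into a 2-D multivariate color space; overlapping clusters would
# suggest background matching, separated clusters background contrast.
coords = PCA(n_components=2).fit_transform(features)
print(coords[:5])
```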

Behavioral Ecology and …, Jan 1, 2011
For visual signaling to be effective, animal signals must be detected and discriminated by receivers, often against complex visual backgrounds with varying light levels. Accordingly, in many species, conspicuous visual displays and ornaments have evolved as a means to enhance background contrast and thereby increase the detection and discrimination of male courtship signals by females. Using video playbacks, we tested the hypothesis that visual courtship displays and leg decorations of male Schizocosa ocreata wolf spiders are more conspicuous against complex leaf-litter backgrounds. Video exemplars of courting males with manipulated leg tufts were superimposed on different backgrounds (complex leaf litter in sun or shade, or a featureless gray background) and presented to female spiders. Females were more likely to orient to males presented against lighter backgrounds (litter in sun, gray) than against the darker one (litter in shade). Males with larger tufts were also more likely to be detected, as latency to orient was shortest for enlarged tufts and longest for removed tufts.
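The latency-to-orient comparison across tuft treatments could be tested nonparametrically; the sketch below applies a Kruskal-Wallis test to hypothetical latencies (the study’s actual statistics and values may differ).

```python
from scipy.stats import kruskal

# Hypothetical latency-to-orient data (seconds) for three tuft treatments;
# the abstract reports shortest latencies for enlarged tufts and longest
# for removed tufts.
enlarged = [3.1, 4.2, 2.8, 5.0, 3.6]
normal   = [5.2, 6.1, 4.8, 7.3, 5.9]
removed  = [9.4, 8.1, 11.2, 7.9, 10.3]

h, p = kruskal(enlarged, normal, removed)
print(f"H = {h:.2f}, p = {p:.4f}")
```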
Talks by Meghan Rector Federer

Understanding sources of performance bias in science assessment is a major challenge for science education reforms. Prior research has documented several limitations of instrument types on the measurement of students’ scientific knowledge (Liu et al., 2011; Messick, 1995; Popham, 2010). Similarly, gender differences in problem-solving strategies and achievement are well documented in a variety of disciplines (Halpern, 2000; Hedges & Nowell, 1995), particularly for performance on verbal and written assessments (Weaver & Raptis, 2001; Penner, 2003). Despite these documented biases, much has yet to be determined for constructed-response (CR) assessments in biology and their use for evaluating students’ conceptual understanding of scientific practices (such as explanation). Understanding differences in science achievement provides important insights into whether science curricula and/or assessments are valid representations of student abilities. In order to investigate gender differences in CR biology performance, we collected student responses to the ACORNS CR instrument (Nehm et al., 2012). Three versions of the CR instrument, each consisting of four isomorphic items, were administered to a sample of undergraduate biology majors and non-majors (G1: n=662 [Female=51.6%]; G2: n=184 [F=55.9%]; G3: n=642 [F=55.1%]), resulting in a total of 5,952 evolutionary explanations for analysis. We used Rasch analysis to evaluate differential item functioning (DIF) and differential test functioning (DTF) patterns. Preliminary analyses revealed no gender DIF in performance for the majority of items. However, there appeared to be a female advantage on unfamiliar items (e.g., prosimian/tarsi). In contrast, DTF analyses indicated that the test did not function equivalently for males and females, favoring male respondents. Overall, our initial results suggest that while gender differences could be a significant source of bias in CR biology assessment, they may be related to individual item features, combinations of items, and types of assessment. This corroborates previous findings of gender differences and highlights the importance of identifying potential sources of assessment bias when measuring students’ achievement in biology.
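The study’s DIF/DTF analyses were Rasch-based; as a simpler stand-in, the sketch below screens one item for uniform DIF with logistic regression (item response conditioned on total score plus a gender term), a common alternative DIF method. All data are simulated, and the approach is a substitute for, not a reproduction of, the Rasch analysis.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500

# Simulated respondents: gender (0 = male, 1 = female), latent ability,
# and a total-score proxy for ability.
gender = rng.integers(0, 2, n)
ability = rng.normal(0, 1, n)
total = ability + rng.normal(0, 0.3, n)

# Simulate one dichotomous item with a small female advantage (uniform DIF).
logit = 1.2 * ability + 0.4 * gender - 0.2
item = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Logistic-regression DIF screen: after conditioning on total score, a
# significant gender coefficient flags the item for possible DIF.
X = sm.add_constant(np.column_stack([total, gender]))
result = sm.Logit(item, X).fit(disp=0)
print(result.summary(xname=["const", "total", "gender"]))
```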

Empirical studies have revealed two major limitations with extant multiple-choice Concept Inventories (CIs) designed to measure students’ knowledge of and misconceptions about natural selection. First, while oral interviews and open-response assessments indicate that many students’ mental models of natural selection comprise both naïve and scientific ideas (“mixed models”), instruments like the CINS only allow students to choose a right or a wrong answer option. Second, while students’ evolutionary thinking has been shown to be strongly influenced by the taxa, traits, and polarities of evolutionary change contained in the items (e.g., reasoning about plant trait gain vs. animal trait loss), instruments like the CINS only test student thinking using one context (familiar animals). In order to address these previously documented limitations, we developed and evaluated a new Multiple-True-False (MTF) formative assessment instrument for measuring undergraduates’ knowledge and misconceptions of natural selection and evolution. The instrument attempts not only to reveal the structure of student thinking (e.g., pure scientific models, mixed models, and pure naïve models) but also to document student reasoning across different surface features (e.g., trait gain and loss in plants and animals). The instrument development team included experts from evolutionary biology, science education, assessment, and cognitive psychology, and produced a MTF instrument with 50 response choices that may be completed in 25 minutes or less. In order to examine the reliability and validity of this new instrument, we pilot tested the new MTF instrument, along with the open-response ACORNS instrument and the multiple-choice CINS instrument, with 126 undergraduate students enrolled in an evolution class. Using students’ responses to the MTF statements, we categorized students’ reasoning into three levels: (1) pure scientific reasoning (“Key Concepts” [KCs] only), (2) “mixed model” reasoning (both KCs and Misconceptions [MIS]), and (3) pure naïve ideas (MIS only). Reliability of the new instrument (Cronbach’s alpha) was 0.902, and convergent validity analyses revealed strong and significant associations with the other two instruments (MTF/ACORNS: 0.54, p < 0.001; MTF/CINS: 0.47, p < 0.001). In addition, the MTF instrument detected patterns of students’ reasoning about trait gain and loss comparable to the open-response ACORNS instrument (MTF: t124=2.69, Cohen’s d=0.48, p < 0.01; ACORNS-KC: t124=2.46, Cohen’s d=0.44). Overall, our new MTF instrument covers the same content as the CINS but can be completed in less time, and can detect students’ (1) mixed models and (2) contextual reasoning abilities.
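Two of the statistics reported above, Cronbach’s alpha and Cohen’s d, are straightforward to compute directly. This sketch applies them to a simulated 126-student by 50-statement 0/1 response matrix, so the printed values are illustrative rather than the study’s.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_students, n_items) score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Cohen's d with a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

rng = np.random.default_rng(2)
# Simulated 0/1 responses: 126 students x 50 statements, driven by a shared
# ability factor so the items cohere (alpha well above zero).
ability = rng.normal(size=(126, 1))
responses = (ability + rng.normal(scale=0.8, size=(126, 50)) > 0).astype(float)
print(f"alpha = {cronbach_alpha(responses):.3f}")

# Stand-ins for trait-gain vs. trait-loss subscores.
gain, loss = responses[:, :25].mean(axis=1), responses[:, 25:].mean(axis=1)
print(f"Cohen's d (gain vs. loss) = {cohens_d(gain, loss):.2f}")
```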

Assessment of students’ scientific knowledge and reasoning processes is a complex task, yet essential to evidence-based teaching and learning (NRC, 2001, 2007). Indeed, measurement of students’ knowledge is known to be influenced by a variety of factors, which has spurred efforts to understand how different assessment formats and item features differentially inform our inferences about student understanding, as well as efforts to develop new tools and practices to measure more authentic problem-solving performances. Furthermore, new science education standards emphasizing practices are pushing for increased use of constructed-response (CR) assessment instruments in science education and other content areas (NRC, 2012). While a large body of work has explored a variety of issues relating to the risks and benefits of multiple-choice assessments of student performance, comparatively less work has explored these issues using written explanations. We report on the results from three studies of undergraduate students (S1: n=309; S2: n=262; S3: n=156) that examine the effects of, and interactions among, item sequencing and surface features on students’ biological explanations. Collectively, the results from the three studies identify multiple factors influencing measures of student performance using CR assessments. First, assessments containing items with similar surface features produced greater sequencing effects relative to item sequences that differed in surface features, with response accuracy declining over the item sequence (S1; Wilcoxon signed-rank test, p < 0.001). However, the decreasing accuracy of student performance across an item sequence was mitigated at the population level of analysis by use of a counterbalancing design (e.g., Latin square). Second, in the context of evolutionary change, accuracy of student performance at the individual level was correlated with the diversity of surface features used in an item sequence, allowing for a more detailed analysis of a variety of mental models (S2, S3). Importantly, while student use of scientifically accurate concepts about natural selection was positively related to both item location and variation in surface features, naïve ideas were more robust to item location. Third, our studies emphasize the importance of considering response verbosity when evaluating student performance on scientific practices such as explanation (S1, S2, S3). In particular, item sequencing had a greater effect on response verbosity when items possessed similar surface features, which directly corresponded with decreased use of scientifically accurate conceptions about evolutionary change (Spearman’s rank correlation, r=0.63). As assessment in science education shifts toward evaluating scientific practices (NRC, 2012), additional research is needed to explore how best to evaluate such competencies.
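The two named tests can be sketched directly: a Wilcoxon signed-rank test for accuracy decline from the first to the fourth item, and a Spearman rank correlation between verbosity and key-concept use. The data below are simulated with a built-in decline, so the resulting statistics are illustrative only.

```python
import numpy as np
from scipy.stats import wilcoxon, spearmanr

rng = np.random.default_rng(3)
n = 300

# Simulated key-concept counts on the first and fourth items of a
# four-item sequence, with a decline built into the fourth item.
first = rng.poisson(2.4, n)
fourth = np.maximum(first - rng.binomial(1, 0.35, n), 0)
w, p = wilcoxon(first, fourth)
print(f"Wilcoxon W = {w:.1f}, p = {p:.2g}")

# Simulated verbosity (word counts) loosely coupled to key-concept use.
verbosity = rng.poisson(60, n) + 12 * first
rho, p = spearmanr(verbosity, first)
print(f"Spearman rho = {rho:.2f}, p = {p:.2g}")
```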

Innovations in automated scoring and related technologies are facilitating increased use of constructed-response (CR) assessment instruments in science education and other content areas. While a large body of work has explored a variety of issues relating to the effects of multiple-choice item sequencing on student performance, comparatively less work has explored these issues using constructed-response assessments. We report on the results from three studies of undergraduate students that examine the effects of item sequencing, difficulty, and surface features on student performance using constructed-response items in biology education. Our first study examined the extent to which item sequencing was associated with student performance on isomorphic, constructed-response instruments. In our response corpus, verbosity (number of words) and use of accurate scientific ideas (key concepts) declined significantly from the first to the fourth item. In contrast, use of naïve ideas was not correlated with verbosity or the use of scientific ideas. Our second study examined the interaction between item sequencing and the familiarity of items. Use of key concepts in this sample was significantly related to response verbosity and item familiarity, but not item sequencing. The use of naïve ideas was also correlated with familiarity, with more familiar items eliciting more naïve ideas. Our third study examined the interaction between item sequencing and the polarity of item features. In our collection of student responses, verbosity was significantly related to item sequencing but concept use was not. In contrast, the polarity of the item significantly affected the use of key concepts and naïve ideas, corroborating previous research on the gain and loss of traits. Together, the results from our three studies identify multiple factors influencing student performance on CR items. Assessments containing items with similar surface features were subject to greater order effects relative to item sequences that differed in surface features. Additionally, item sequencing had a greater effect on response verbosity, which corresponded with the use of accurate scientific ideas. We conclude with a conceptual model accounting for factors influencing student success on CR assessments.
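The counterbalancing these studies lean on (a Latin-square rotation of item order, mentioned in the preceding abstract) is simple to generate; the sketch below builds a cyclic Latin square for four placeholder items and rotates the resulting forms across students so every item appears in every position equally often.

```python
def latin_square(items):
    """Cyclic Latin square: every item appears once in every position."""
    k = len(items)
    return [[items[(row + col) % k] for col in range(k)] for row in range(k)]

forms = latin_square(["item_A", "item_B", "item_C", "item_D"])
for i, order in enumerate(forms, start=1):
    print(f"form {i}: {' -> '.join(order)}")

# Rotate forms across students so each item position is balanced
# over the whole sample.
assignments = {f"student_{s}": forms[s % len(forms)] for s in range(8)}
```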

"Words that have specific meaning(s) in everyday language (e.g., adapt) may also hold specialized... more "Words that have specific meaning(s) in everyday language (e.g., adapt) may also hold specialized meaning(s) in the biological sciences, often with few explicit cues to signify which meaning (everyday or scientific) is truly intended. Such words are characterized by lexical ambiguity – they hold more than one meaning within scientific and everyday language(s). The resulting complexity of biological discourse may complicate scientific communication in the classroom, inhibit comprehension of science concepts, and hinder the valid assessment of scientific knowledge. We explored undergraduate biology majors’ uses and definitions of five terms characterized by lexical ambiguity that are common to evolutionary explanations: pressure, select, adapt, need, and must. Specifically, we asked the following research questions:
1. Are different types of lexically ambiguous terms used in different frequencies?
2. Are question features (e.g., familiar vs. unfamiliar organisms) associated with patterns of lexically ambiguous term use?
3. Is there an association between the use of lexically ambiguous terms and biology course performance?
4. Can technology be used to enhance the interpretation of word meaning?
Our sample of evolutionary explanations was gathered using an online assessment cascade system that captured initial answers (n=716) to a set of four validated open-response items (see Nehm & Ha, 2011) and also “mined” students’ responses (in real time) for lexically ambiguous terms. For students who spontaneously used such terms in their initial explanations, the system asked them to explain what they meant when they used them (n=747). All student responses were coded independently by two raters (initial Kappa agreement values were > 0.75 for all terms, and all discrepancies were resolved prior to data analysis). We also calculated a composite score to capture a holistic judgment of the scientific accuracy of each initial and follow-up response pair. Quantitative statistical analyses of the data were performed using ordinal logistic regression, Mantel-Haenszel tests of independence, correlations, and percentage comparisons. All analyses were performed in PASW (SPSS Inc.). For the first research question, we found that terms with both scientific and everyday meanings (e.g., adapt) represented 78% of cases. Terms that have no specific scientific meanings (e.g., need and must) were less frequently used. For our second research question, we found that students’ use of pressure (p=0.26) and adapt (p=0.83) did not differ significantly across the four assessment items (Mantel-Haenszel test), whereas the use of select (p<0.001), need (p=0.01), and must (p<0.001) varied significantly across prompts. Our third research question examined the scientific accuracy of explanations that used lexically ambiguous terms; among explanations containing at least one multivalent term, a large number (43.4%) were judged to be inaccurate. Accurate use of pressure (p<0.001) and select (p<0.001) was significantly item-dependent in the initial student responses, but not in the composite evaluation (ordinal logistic regression). The relationship between overall explanation accuracy and course grade was also significant for pressure (p<0.01) and adapt (p<0.01) (ordinal logistic regression). For our fourth research question, we found that the use of follow-up prompts did result in a significant resolution of ambiguous explanations for pressure (16%) and select (83%). In summary, our results demonstrated that students use lexically ambiguous terms in abundance, that such use is associated with course performance, and that many biology majors have not yet learned how to align meanings appropriately. Furthermore, we found that follow-up questioning is helpful in resolving what students mean when they use lexically ambiguous language, thereby enhancing the scoring of students’ evolutionary explanations.
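The two-rater coding step above (Kappa agreement > 0.75 before discrepancy resolution) corresponds to Cohen’s kappa; the sketch below computes it for a dozen invented rater codes.

```python
from sklearn.metrics import cohen_kappa_score

# Invented codes from two independent raters for 12 responses
# (1 = scientific use of the term, 0 = everyday use).
rater1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater2 = [1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1]

print(f"Cohen's kappa = {cohen_kappa_score(rater1, rater2):.2f}")
```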

Our study empirically investigates evolutionary discourse practices associated with the notion that lexically ambiguous terms are numerous and diverse in scientific language. Specifically, we investigated students’ use and meaning of five terms common to evolutionary explanations (pressure, select, adapt, need, and must) in a sample of 320 responses produced by undergraduate biology students. We employed a new technological tool known as the Assessment Cascade System (ACS) to investigate the frequency with which biology majors spontaneously used lexically ambiguous language in evolutionary explanations, as well as their definitions and explanations of what they meant when they used these terms. Our first analysis indicated that 81% of undergraduate biology students spontaneously used these so-called ‘multivalent terms’ at least once in their evolutionary explanations, and that the majority of these uses were judged to be inaccurate. The accurate use of biological terms such as “pressure” and “select” was significantly context-dependent in students’ initial explanations, and the percentage of accurately used multivalent terms was significantly associated with course grade. Our second analysis examined the utility of the ACS in resolving ambiguous explanations that incorporated the terms “pressure” and “select.” Results differed by term, with 20% of student responses containing “pressure” and 47% of responses containing “select” changing in accuracy. Students’ conceptualizations of evolutionary change that incorporated ‘selective’ or ‘pressure-related’ ideas represented the majority of explanations that were affected by the lexical ambiguity of scientific language, whereas words and meanings recruited from everyday language, such as ‘need’, resulted in student explanations that were more erroneous in evolutionary contexts.