2008
What is our warrant for saying “Student X deserves a Grade C”? It must be based on evidence, and the only evidence we see is what students produce during the exam. For valid assessment two criteria must be met: the examination must elicit proper evidence of the trait, and we must evaluate that evidence properly. This highlights the importance of ensuring quality in the mark schemes with which we evaluate the evidence as well as in the questions which elicit it. Our recent research shows that improving mark schemes can have a greater impact on validity than further work on improving questions. In this paper we will outline a procedural model for maximising construct validity: at its heart is the concept of Outcome Space, the range of evidence that students produce. The model aims to ensure that our mark schemes evaluate this evidence properly in terms of the achievement trait we want to assess. This model has been developed in consultation with senior examiners and exam board personnel. ...
We are indebted to too many people to list here for enlightening discussions of topics addressed in this paper. We would like to acknowledge Lyle Bachman, Irwin Kirsch, Mary Schedl, and John Norris with regard to issues in language assessment, and, for their comments on an earlier draft, the editor Mark Wilson and two anonymous referees.
Psychometrics in practice at RCEC, 2012
To evaluate the quality of educational assessments, several evaluation systems are available. These systems, however, each focus on the evaluation of a single type of test. Furthermore, within these systems quality is defined as a fixed construct, whereas in this paper it is argued that the evaluation of test quality should depend on the test's purpose. In this paper we compare several available evaluation systems. From this comparison, design principles are derived to guide the development of a new, comprehensive quality evaluation system. The paper concludes with an outline of the new evaluation system, which is intended to incorporate an argument-based approach to quality.
Applied Measurement in Education, 2019
Despite the call for an argument-based approach to validity over 25 years ago, few examples exist in the published literature. One possible explanation for this outcome is that the complexity of the argument-based approach makes implementation difficult. To counter this claim, we propose that the Assessment Triangle can serve as the overarching framework for operationalizing and instantiating the argument-based approach to validation. Integrating these frameworks can streamline the validation process by providing a conceptual lens for identifying, collecting, and evaluating relevant sources of evidence throughout the testing process. To fully examine this proposed conceptualization, we apply the integrated framework to an example case of a universal screener for middle-school mathematics. We articulate an interpretation and use argument for the universal screener, and then present relevant sources of evidence to evaluate the plausibility of the inferences and warrants underlying test score use. Based on this applied example, the strengths and limitations of the integrated framework are considered, and recommendations are made for future instantiations.

Multiple frameworks exist in educational assessment for making sense of validity. Perhaps the most comprehensive treatment of validity within the broader context of educational assessment is the National Research Council's report titled Knowing What Students Know (Pellegrino, Chudowsky, & Glaser, 2001). In this report, validity (interpretation) is integrally connected to theories of learning (cognition) and students' response processes on tested items, tasks, or situations (observation) in the Assessment Triangle. Interpretation is defined as the process of making meaning from the evidence obtained about students' knowledge, skills, and abilities. Cognition refers to the models of thinking and learning that underlie students' development of knowledge, skills, and abilities in the domain. Observation includes the items, tasks, or situations designed to elicit students' knowledge, skills, and abilities. Representing these components as the Assessment Triangle underscores the interdependent relationship and importance of alignment between these dimensions of educational assessment. Although more than 25 years have passed since the introduction of this conceptual model of educational assessment, the three integrated components are often examined separately. However, some connections between the components have begun to receive greater attention in the past 15 years. For example, recent research in mathematics and science education has called for designing classroom assessments based on learning trajectories and learning progressions (cf., Corcoran, Mosher, & Rogat, 2009; Daro, Mosher, & Corcoran, 2011), especially as it relates to providing teachers with instructionally relevant information. This work seeks to connect the cognition and interpretation vertices of the Assessment Triangle. Relatedly, researchers investigating students' response processes when interacting with items, tasks, or situations are attempting to examine the alignment between the intended and elicited cognitive processes, thereby connecting the cognition and observation vertices (cf., Leighton & Gierl, 2011; Padilla & Benitez, 2014).
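As an illustrative aside (not the authors' analysis), one common source of evidence for a universal screener's score use is classification accuracy against a later criterion measure. A minimal sketch, in which the cut score, screener scores, and outcome labels are all hypothetical:

```python
# A minimal sketch (hypothetical data) of classification-accuracy evidence
# for a universal screener: sensitivity and specificity of flagging students
# who score below a cut, judged against a later "at risk" criterion.

def classification_accuracy(scores, at_risk, cut):
    """Sensitivity and specificity of flagging students scoring below `cut`."""
    flagged = [s < cut for s in scores]
    tp = sum(f and r for f, r in zip(flagged, at_risk))          # flagged, truly at risk
    fn = sum((not f) and r for f, r in zip(flagged, at_risk))    # missed, truly at risk
    tn = sum((not f) and (not r) for f, r in zip(flagged, at_risk))
    fp = sum(f and (not r) for f, r in zip(flagged, at_risk))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical screener scores and end-of-year outcomes for ten students.
scores  = [12, 25, 30, 8, 22, 15, 28, 10, 27, 18]
at_risk = [True, False, False, True, False, True, False, True, False, False]
sens, spec = classification_accuracy(scores, at_risk, cut=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```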
Journal of Vocational Education & Training, 2020
School- and college-based vocational and technical qualifications (VTQs) in England are required to award successful candidates a grade rather than a simple pass or fail. Ensuring the reliability and validity of these grades is considered vital, particularly in light of the high-stakes purposes for which school assessment results in England are used. Whilst previous research has shown how mark scheme design can support examiner judgement, school- and college-based VTQs are assessed to a large extent by non-exam assessment, which is marked or graded by an assessor within the candidate's school or college. This article addresses the question of how mark scheme design can most effectively support reliable and valid assessment, by considering the distinctive characteristics of assessors' marking or grading task in this particular context. The article synthesises research on the nature of internal VTQ assessment, empirically validated studies in mark scheme design, and theoretical arguments, to produce a framework for mark scheme review. The framework constitutes a tool for evaluating design decisions, intended to help assessment designers in the process of mark scheme improvement, and hence to support school- and college-based VTQ assessment.
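As an illustrative aside (not from the article), the reliability of internally awarded grades is often quantified as chance-corrected agreement between an internal assessor and a moderator. A minimal sketch of Cohen's kappa, with hypothetical grade data:

```python
# A minimal sketch (hypothetical data) of Cohen's kappa: chance-corrected
# agreement between two raters who graded the same set of scripts.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' labels over the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if both raters assigned grades at their marginal rates.
    expected = sum(freq_a[g] * freq_b[g] for g in set(freq_a) | set(freq_b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades awarded to ten scripts by an assessor and a moderator.
assessor  = ["A", "B", "B", "C", "A", "C", "B", "A", "C", "B"]
moderator = ["A", "B", "C", "C", "A", "B", "B", "A", "C", "B"]
print(f"kappa = {cohens_kappa(assessor, moderator):.2f}")
```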
Psicothema, 2014
Validation is the process of providing evidence that tests and questionnaires are adequately and appropriately fulfilling the purposes for which they are developed. In this special issue, experts from several countries describe specific approaches to test validation and provide examples of their approach. These approaches and examples illustrate the validation framework implied by the Standards for Educational and Psychological Testing. We describe the Standards' approach to building a validity argument from evidence based on test content, response processes, internal structure, relations to other variables, and testing consequences. The five articles provide comprehensive examples of gathering data regarding these five sources of evidence and how they contribute to the validation of the use of test scores for particular purposes. These five articles provide concrete examples of how the five sources of validity evidence suggested by the Standards can be used to dev...
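As an illustrative aside (not from the special issue), the "relations to other variables" source of evidence is frequently gathered as a convergent correlation with an established measure of the same construct. A minimal sketch, with hypothetical scores:

```python
# A minimal sketch (hypothetical data) of convergent-validity evidence:
# the Pearson correlation between a new test and an established measure.

from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores on a new test and an established measure of the construct.
new_test    = [55, 62, 47, 71, 66, 58, 49, 75]
established = [52, 60, 50, 69, 64, 55, 46, 72]
print(f"convergent r = {pearson_r(new_test, established):.2f}")
```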
How can we effectively tell whether learners have acquired, and can exhibit, the outcomes that were initially established for them and to which instruction was tailored? This question leads to the assessment of learning outcomes or instructional results. It is, for instance, held that an outcomes-based approach requires assessing, in authentic ways, what is considered most important in students' attainments. Unfortunately, the use of inappropriate assessment/test items/instruments is a widespread phenomenon and has become a practice/malpractice most urgently in need of improvement. Ensuring such improvement means satisfying the most important criterion in assessment/test administration: validity. The prevailing assessment culture, however, is still steeped in a preoccupation with reliability. This is due to the notion that for an assessment to be valid it must first be reliable, and the subsequent assumption that the reliability of an assessment invariably ensures its validity, since there is no structured/formulaic way of determining validity. It is known, however, that an assessment can be reliable without necessarily being valid. This paper therefore attempts to fill this validity void by presenting two well-structured models/flowcharts: one for verifying the validity or usefulness/appropriateness of assessment items, and the other for the construction/writing of valid/appropriate assessment items.
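As a worked illustration (a standard classical test theory result, not from this paper), reliability only caps validity; it never guarantees it. If $r_{XX'}$ and $r_{YY'}$ are the reliabilities of the test and the criterion, the validity coefficient $r_{XY}$ obeys

$$ r_{XY} \;\le\; \sqrt{r_{XX'}\, r_{YY'}} $$

so a perfectly reliable test ($r_{XX'} = 1$) may still correlate arbitrarily weakly with the criterion: reliability is necessary for high validity, but never sufficient.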
2000
Many authors over the past twenty years have argued that the prevailing ‘psychometric’ paradigm for educational assessment is inappropriate and have proposed that educational assessment should develop its own distinctive paradigm. More recently (and particularly within the last five years) it has become almost commonplace to argue that changes in assessment methods are required because of changing views of human cognition, and in particular, the shift from ‘behaviourist’ towards ‘constructivist’ views of the nature of human learning. However, these changes are still firmly rooted within the psychometric paradigm, since within this perspective, the development of assessment is an essentially ‘rationalist’ project in which values play only a minor (if any) role. The validation of an assessment proceeds in a ‘scientific’ manner, and the claim is that the results of any validation exercise would be agreed by all informed observers. Building on the work of Samuel Messick, in this paper...