2004, International Journal of Testing
In computer-based interactive environments meant to support learning, students must bring a wide range of relevant knowledge, skills, and abilities to bear jointly as they solve meaningful problems in a learning domain. To function effectively as an assessment, a computer system must additionally be able to evoke and interpret observable evidence about targeted knowledge in a manner that is principled, defensible, and suited to the purpose at hand (e.g., licensure, achievement testing, coached practice). This paper concerns the grounding for the design of an interactive computer-based assessment of design and troubleshooting in the domain of computer networking. The application is a prototype for assessing these skills as part of an instructional program, as interim practice tests and as chapter or end-of-course assessments. An Evidence Centered Design (ECD) framework was used to guide the work. An important part of this work is a cognitive task analysis designed to (a) tap the knowledge computer network specialists and students use when they design and troubleshoot networks, and (b) elicit behaviors that manifest this knowledge. After summarizing its results, we discuss implications of this analysis, as well as information gathered through other methods of domain analysis, for designing psychometric models, automated scoring algorithms, and task frameworks, and for the capabilities required for the delivery of this example of a complex computer-based interactive assessment.
Assessment in Education: Principles, Policy & Practice, 2019
Our goal in this chapter is to provide a validity-centered overview of key design decisions in the development, evaluation, and deployment of automated scoring systems. In particular, we focus on systems that identify, extract, and synthesize evidence via automated means about students' proficiencies. In our discussions we consider the scoring of students' performances in response to a prompt: the products they submit (e.g., item responses, essays, designs, etc.), the actions they take (e.g., keystrokes, mouse moves, clicks, and so on) in producing a product, or both. We use the term work product more broadly to encompass the form(s) of products and performances that are to be evaluated. We take a perspective wherein we do not view scoring as an isolated process, or just an efficiency-enhancing process, but rather as the foundation for valid interpretations, inferences, and decisions made about students. Our primary goal is to highlight the many challenges in the lifecycle of an automated scoring engine from initial design to final deployment. Ideally, readers will be able to generalize from the examples discussed here to their particular application, whether they are designing an assessment from scratch, revising an assessment to include automated scoring, evaluating an assessment that completely or partially relies on automated scoring, or evaluating vendors to collaborate with in the implementation of automated scoring. We have organized the chapter into four main sections. In the first section, we present an overview of automated scoring, ranging from the earliest applications of essay scoring to more recent applications in simulation- and game-based assessments. In the second section, we emphasize the importance of designing scoring with validity in mind, especially validity-as-argument as presented by Kane (2006). Similarly, we lay out the relevance and usefulness of the evidence-centered design (ECD) framework (Mislevy, Steinberg, & Almond, 2003) in the automated scoring context. In the third section, ...
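As a deliberately minimal illustration of the identify-extract-synthesize framing described above, the sketch below scores a toy work product from both product features (the submitted text) and process features (the action log). The WorkProduct structure, feature names, and weights are hypothetical choices made for this example only; they are not drawn from the chapter or from any operational scoring engine.

```python
# Illustrative sketch only: a toy "identify -> extract -> synthesize" scoring
# pipeline. Feature names, weights, and the WorkProduct structure are
# hypothetical, not taken from any operational automated scoring system.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class WorkProduct:
    text: str                                           # the product (e.g., a short essay)
    actions: List[str] = field(default_factory=list)    # the process (e.g., clicks, edits)

def extract_features(wp: WorkProduct) -> Dict[str, float]:
    """Identify and extract observable evidence from the work product."""
    return {
        "length": float(len(wp.text.split())),                      # product feature
        "keyword_hits": float(sum(k in wp.text.lower()              # product feature
                                  for k in ("hypothesis", "evidence", "because"))),
        "revisions": float(wp.actions.count("delete")),             # process feature
    }

def synthesize_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Synthesize the extracted evidence into a single provisional score."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

wp = WorkProduct(text="The evidence supports the hypothesis because ...",
                 actions=["type", "delete", "type"])
weights = {"length": 0.1, "keyword_hits": 1.0, "revisions": -0.2}    # hypothetical weights
print(synthesize_score(extract_features(wp), weights))
```

In a real engine the synthesis step would typically be a trained statistical model rather than a fixed weighted sum, but the division of labor among identification, extraction, and synthesis is the point of the sketch.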
National Center For Research on Evaluation Standards and Student Testing, 2007
This paper describes the relationships between research on learning and its application in assessment models and operational systems. These have been topics of research at the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) for more than 20 years and form a significant part of the intellectual foundation of our present research Center supported by the Institute of Education Sciences, as well as of many of the research studies on learning and assessment funded by the Office of Naval Research. This description is intended not to advocate a general approach (although I do), but rather to serve as the context for subsequent presentations about CRESST efforts in building the POWERSOURCE© assessment system, as described in papers delivered at Session N2 of the 2006 annual meeting of the National Council on Measurement in Education (NCME). Part 1, Rationale for Model-Based Assessment (MBA): Background of Our Efforts to Incorporate Learning Psychology Into Assessment Systems. Definitions of Model: CRESST R&D in assessment and learning fits two different but compatible definitions of the term "model." The first definition of model relates to the science it ... (Paper presented at the 2006 NCME annual meeting. Note that the first extended section of this piece is for those who have not heard or read the many discussions of model-based assessment as practiced at CRESST. The second section can be best understood by first understanding how the model is used. The section on relating learning to assessment design and use is most directly relevant to the presentation as described in the conference program.)
Assessing Model-Based Reasoning using Evidence- Centered Design, 2017
International Encyclopedia of Education, 2010
This paper illustrates how Evidence Centered Design (ECD) can be used to design and develop high-quality learning-centered assessments that have qualities such as: (1) learning-effectiveness and efficiency, (2) validity of assessment results, (3) accessibility for students with special needs, and (4) student engagement. There is a need for a framework that helps address such a broad set of concerns in a coherent and integrated manner. This paper draws on Cognitive Load Theory to provide a research-based rationale for this approach. If validated, this approach may lay a foundation for learning-centered systems that are not only more accessible for students with disabilities but also more learning-effective and efficient, valid, and engaging for all students.
2012
Evidence-centered design (ECD) is a comprehensive framework for describing the conceptual, computational and inferential elements of educational assessment. It emphasizes the importance of articulating inferences one wants to make and the evidence needed to support those inferences. At first blush, ECD and educational data mining (EDM) might seem in conflict: structuring situations to evoke particular kinds of evidence, versus discovering meaningful patterns in available data. However, a dialectic between the two stances increases understanding and improves practice. We first introduce ECD and relate its elements to the broad range of digital inputs relevant to modern assessment. We then discuss the relation between EDM and psychometric activities in educational assessment. We illustrate points with examples from the Cisco Networking Academy, a global program in which information technology is taught through a blended program of face-to-face classroom instruction, an online curricul...
ETS Research Report Series, 2003
There is growing interest in educational assessments that coordinate substantive considerations, learning psychology, task design, and measurement models. This paper concerns an analysis of responses from an assessment of mixed-number subtraction that was created by Kikumi Tatsuoka in light of cognitive analyses of students' problem solutions. In particular, we fit a binary-skills multivariate latent class model to the data and compare results to those obtained with an item response theory model and a modified latent class model suggested by model criticism indices. Markov chain Monte Carlo (MCMC) techniques are used to estimate the parameters in the model in a Bayesian framework that integrates information from substantive theory, expert judgment, and empirical data.
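To make the modeling idea concrete, here is a small numerical sketch of a conjunctive ("binary skills") latent class model of the kind the abstract describes, with a toy Q-matrix and fixed slip and guess parameters. In the reported work such parameters are estimated with MCMC from data, expert judgment, and substantive theory; the items, skills, and values below are invented purely for illustration.

```python
# A minimal numerical sketch of a binary-skills (conjunctive, DINA-like)
# latent class model. Q-matrix, slip/guess values, and prior are made up;
# the paper estimates such parameters with MCMC rather than fixing them.
import numpy as np
from itertools import product

Q = np.array([[1, 0],    # item 1 requires skill A
              [0, 1],    # item 2 requires skill B
              [1, 1]])   # item 3 requires both skills
slip, guess = 0.1, 0.2            # P(wrong | mastered), P(right | not mastered)
responses = np.array([1, 0, 1])   # one examinee's item scores

profiles = list(product([0, 1], repeat=Q.shape[1]))   # all skill patterns
prior = np.full(len(profiles), 1.0 / len(profiles))   # uniform prior over classes

def likelihood(resp, profile):
    """P(response pattern | skill profile) under the conjunctive model."""
    alpha = np.asarray(profile)
    p = 1.0
    for j, q in enumerate(Q):
        mastered = np.all(alpha >= q)                  # has all required skills?
        p_correct = (1 - slip) if mastered else guess
        p *= p_correct if resp[j] == 1 else (1 - p_correct)
    return p

like = np.array([likelihood(responses, a) for a in profiles])
posterior = prior * like / np.sum(prior * like)        # Bayes rule by enumeration
for a, post in zip(profiles, posterior):
    print(a, round(post, 3))
```

Running the sketch prints the posterior probability of each skill profile for the single response pattern, which is the kind of inference the full Bayesian treatment supports at scale once the parameters themselves are estimated.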
Assessment in Education: Principles, Policy & Practice, 2003
Computer-based simulations can give a more nuanced understanding of what students know and can do than traditional testing methods. These extended, integrated tasks, however, introduce particular problems, including producing an overwhelming amount of data, multidimensionality, and local dependence. In this paper, we describe an approach to understanding the data from complex performances based on evidence-centered design (Mislevy, Almond, & Lukas, in press), a methodology for devising assessments and for using the evidence observed in complex student performances to make inferences about proficiency. We use as an illustration the NAEP Problem-Solving in Technology-Rich Environments Study, which is being conducted to exemplify how nontraditional skills might be assessed in a sample-based national survey. The paper focuses on the inferential uses of ECD, especially how features are extracted from student performance, how these extractions are evaluated, and how the evaluations are accumulated to make evaluative judgments.
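The extraction-evaluation-accumulation chain mentioned above can be illustrated with a toy example: features are pulled from a raw event log, turned into scored observables by simple rules, and then accumulated into a belief about a single binary proficiency using Bayes rule. The log entries, rules, and conditional probabilities are invented for this sketch; operational ECD applications typically use richer evidence rules and multivariate student models.

```python
# Toy illustration of extraction -> evaluation -> accumulation for one
# binary proficiency. Event log, rules, and probabilities are invented.
log = ["open_tool", "run_test", "fix_error", "run_test", "submit"]

# Extraction: pull features from the raw performance data.
features = {"ran_tests": log.count("run_test"), "fixed_error": "fix_error" in log}

# Evaluation: turn features into scored observables via simple rules.
observables = {
    "systematic_testing": features["ran_tests"] >= 2,
    "debugging": features["fixed_error"],
}

# Accumulation: update belief about proficiency with each observable (Bayes rule).
p_master = 0.5                                   # prior P(proficient)
cond = {  # P(observable True | proficient), P(observable True | not proficient)
    "systematic_testing": (0.9, 0.3),
    "debugging": (0.8, 0.2),
}
for name, observed in observables.items():
    p_true, p_false = cond[name]
    like_m = p_true if observed else 1 - p_true
    like_n = p_false if observed else 1 - p_false
    p_master = (like_m * p_master) / (like_m * p_master + like_n * (1 - p_master))
print(round(p_master, 3))   # posterior P(proficient) after accumulating evidence
```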
In computer-based simulations, students must bring a wide range of relevant knowledge, skills, and abilities to bear jointly as they solve meaningful problems in a learning domain. To function effectively as an assessment, a simulation system must additionally be able to evoke and interpret observable evidence about targeted knowledge in a manner that is principled, defensible, and suited to the purpose at hand (e.g., licensure, achievement testing, coached practice). This study focused on the grounding for a simulation-based assessment of design and troubleshooting in the domain of computer networks. The application was designed as a prototype for assessing these skills in an instructional program, as interim practice tests and as chapter or end-of-course assessments. An evidence-centered assessment design framework was used to guide the work. An important part of this work is a cognitive task analysis, designed to tap the knowledge network engineers and students use when they desi...
Computers in Human Behavior, 1999
To function effectively as a learning environment, a simulation system must present learners with situations in which they use relevant knowledge, skills, and abilities. To function effectively as an assessment, such a system must additionally be able to evoke and interpret observable evidence about targeted knowledge in a manner that is principled, defensible, and fitting to the purpose at hand (e.g., licensure, achievement testing, coached practice). This article concerns an evidence-centered approach to designing a computer-based performance assessment of problem-solving. The application is a prototype licensure test, with supplementary feedback, for prospective use in the field of dental hygiene. We describe a cognitive task analysis designed to (a) tap the knowledge hygienists use when they assess patients, plan treatments, and monitor progress, and (b) elicit behaviors that manifest this knowledge. After summarizing the results of the analysis, we discuss implications for designing student models, evidentiary structures, task frameworks, and simulation capabilities required for the proposed assessment.
The current pace of technological advance has provided an unprecedented opportunity to use innovative simulated tasks in computerized assessment. A primary challenge for the successful use of innovation in assessment rests with the application of sound principles of design to produce a valid assessment. An additional challenge is to maximize the utility from the investment in innovative design by leveraging successful innovation to new assessment tasks and new educational tools. This paper describes the Evidence Centered Design (ECD) approach to the design of an innovative simulation-based assessment of computer networking ability. The paper emphasizes the design components and how these components may be leveraged for reusability in a variety of ways, including the generation of new assessment tasks, application to alternative purposes within the domain of computer networking, and use as a basis for extending knowledge of the proficiencies needed for performance in the domain, and the exte...
Theoretical and Practical Advances in Computer-based Educational Measurement, 2019
Quality assurance systems for psychological and educational tests have been available for a long time. The original focus of most of these systems, be it standards, guidelines, or formal reviewing systems, was on psychological testing. As a result, these systems are not optimally suited to evaluate the quality of educational tests, especially exams. In this chapter, a formal generic reviewing system is presented that is specifically tailored to this purpose: the RCEC review system. After an introduction with an overview of some important standards, guidelines, and review systems, and their common backgrounds, the RCEC review system for the evaluation of educational tests and exams is described. The underlying principles and background of this review system are explained, as well as the reviewing procedure with its six criteria. Next, the system is applied to review the quality of a computer-based adaptive test: Cito's Math Entrance Test for Teachers Colleges. This is done to illustrate how the system operates in practice. The chapter ends with a discussion of the benefits and drawbacks of the RCEC review system.
Tasks are the most visible element in an educational assessment. Their purpose, however, is to provide evidence about targets of inference that cannot be directly seen at all: what examinees know and can do, more broadly conceived than can be observed in the context of any particular set of tasks. This paper concerns issues in assessment design that must be addressed for assessment tasks to serve this purpose effectively and efficiently. The first part of the paper describes a conceptual framework for assessment design, which includes models for tasks. Corresponding models appear for other aspects of an assessment, in the form of a student model, evidence models, an assembly model, a simulator/presentation model, and an interface/environment model. Coherent design requires that these models be coordinated to serve the assessment's purpose. The second part of the paper focuses attention on the task model. It discusses the several roles that task model variables play to achieve the needed coordination in the design phase of an assessment, and to structure task creation and inference in the operational phase.
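As a schematic companion to the models named above, the sketch below represents a few of the design objects as plain data structures and shows how task model variables can both generate a concrete task and stay linked to the evidence tasks are expected to produce. The fields, variable names, and example values are hypothetical and only gesture at the coordination the paper discusses; they are not the paper's specification.

```python
# Schematic sketch of coordinated ECD-style design objects. All field names
# and example values are hypothetical, chosen only to show the linkage.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class StudentModel:
    proficiencies: List[str]                 # what we want to infer

@dataclass
class EvidenceModel:
    observables: List[str]                   # what we score
    informs: Dict[str, List[str]]            # observable -> proficiencies it updates

@dataclass
class TaskModel:
    variables: Dict[str, List[str]]          # task model variables and allowed values
    work_products: List[str]                 # what the examinee produces

student = StudentModel(proficiencies=["network_design", "troubleshooting"])
evidence = EvidenceModel(
    observables=["fault_isolated", "config_correct"],
    informs={"fault_isolated": ["troubleshooting"],
             "config_correct": ["network_design"]},
)
task = TaskModel(
    variables={"topology_size": ["small", "large"], "fault_type": ["cable", "routing"]},
    work_products=["final_configuration", "action_log"],
)

# Coordination check: every observable must map onto the student model.
for obs, profs in evidence.informs.items():
    assert set(profs) <= set(student.proficiencies)

def instantiate_task(tm: TaskModel, settings: Dict[str, str]) -> Dict[str, str]:
    """Create one concrete task by fixing each task model variable."""
    assert all(settings[v] in vals for v, vals in tm.variables.items())
    return settings

print(instantiate_task(task, {"topology_size": "small", "fault_type": "routing"}))
```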
Computers in Human Behavior
PsycEXTRA Dataset
People use external knowledge representations (EKRs) to identify, depict, transform, store, share, and archive information. Learning how to work with EKRs is central to becoming proficient in virtually every discipline. As such, EKRs play central roles in curriculum, instruction, and assessment. Five key roles of EKRs in educational assessment are described: 1. An assessment is itself an EKR, which makes explicit the knowledge that is valued, ways it is used, and standards of good work. 4. "Design EKRs" can be created to organize knowledge about a domain in forms that support the design of assessment. 5. EKRs from the discipline of assessment design can guide and structure the domain analyses noted in (2), task construction (3), and the creation and use of design EKRs noted in (4). The third and fourth roles are discussed and illustrated in greater detail, through the perspective of an "evidence-centered" assessment design framework that reflects the fifth role. Connections with automated task construction and scoring are highlighted. Ideas are illustrated with two examples: "generate examples" tasks and simulation-based tasks for assessing computer network design and troubleshooting skills.

1.0 Introduction and Overview

Knowledge representation is a central theme in cognitive psychology. Internal knowledge representation refers to the way that information about the world is represented in our brains, and as such lies at the center of learning, interacting, and ...
Advances in Health Sciences Education, 2017
Assessment of complex tasks integrating several competencies calls for a programmatic design approach. As single instruments do not provide the information required to reach a robust judgment of integral performance, 73 guidelines for programmatic assessment design were developed. When simultaneously applying these interrelated guidelines, it is challenging to keep a clear overview of all assessment activities. The goal of this study was to provide practical support for applying a programmatic approach to assessment design, not bound to any specific educational paradigm. The guidelines were first applied in a postgraduate medical training setting, and a process analysis was conducted. This resulted in the identification of four steps for programmatic assessment design: evaluation, contextualisation, prioritisation and justification. Firstly, the (re)design process starts with sufficiently detailing the assessment environment and formulating the principal purpose. Key stakeholders with sufficient (assessment) expertise need to be involved in the analysis of strengths and weaknesses and identification of developmental needs. Central governance is essential to balance efforts and stakes with the principal purpose and decide on prioritisation of design decisions and selection of relevant guidelines. Finally, justification of assessment design decisions, quality assurance and external accountability close the loop, to ensure sound underpinning and continuous improvement of the assessment programme.
2002
This paper introduces the concept of a reusable assessment framework (RAF). An RAF contains a library of linked assessment design objects that express: (1) a specific set of proficiencies (i.e., the knowledge, skills, and abilities of students for a given content or skill area); (2) the types of evidence that can be used to estimate those proficiencies; and (3) features of tasks that will aid in the design of activities (e.g., features that need to be present in order for students to produce the evidence, features that affect task difficulty, etc.). While RAFs can speed the design of many kinds of assessments, in this paper the focus is on their use to aid instructional designers in embedding assessments within computer-based learning environments. The RAF concept is based upon the evidence-centered design methodology described in Mislevy, Steinberg, Almond, Haertel, & Penuel (2001).
Applied Measurement in Education, 2002
Advances in cognitive psychology deepen our understanding of how students gain and use knowledge, and broaden the range of performances and situations we want to see in order to acquire evidence about their developing knowledge. At the same time, advances in technology make it possible to capture more complex performances in assessment settings, by including, as examples, simulation, interactivity, and extended responses. The challenge is making sense of the complex data that result. This presentation concerns an evidence-centered approach to the design and analysis of complex assessments. It presents a design framework that incorporates integrated structures for modeling knowledge and skills, designing tasks, and extracting and synthesizing evidence. The ideas are illustrated in the context of a project with the Dental Interactive Simulation Corporation (DISC), assessing problem-solving in dental hygiene with computer-based simulations. After reviewing the substantive grounding of this effort, we describe the design rationale, statistical and scoring models, and operational structures for the DISC assessment prototype.