Four instruments were selected or developed for the summative evaluation of the Mathematical Problem Solving Project and pilot tested. The instruments were: (1) the Student Attitude Questionnaire (SAQ), developed and validated by the MPSP evaluation staff; (2) problems selected from the National Longitudinal Study of Mathematics Achievement (NLSMA); (3) the problem-solving subtest of the Stanford Achievement Test (SAT); and (4) the Problem Solving Survey (PSS), developed by the MPSP evaluation staff. A detailed report on the choice, development, and analysis of the instruments is included.
Upper elementary students are the focus of this paper, intended to present a few studies that have been done in the last five years that are related to two aspects of problem solving: strategies used by the problem solver and tasks used in studying problem solving. The review is not intended to be exhaustive, but is designed to present examples of strategies and tasks that have been studied or used in recent research. (MP)
One of the implicit goals of the Mathematical Problem Solving Project, MPSP, was to increase the teachers' awareness of what problem solving is and of different ways of solving problems. The problem solving sort task was developed to assess change due to involvement with MPSP in the teachers' perception of what problems they would use to teach problem solving and which problems they felt their students would be interested in solving. Instrument development, administration, analysis, and results are given and conclusions stated. (MP)
Educational Measurement: Issues and Practice, Oct 26, 2015
The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do not use fixed test forms. This article describes the decisions made in the development of CATs that influence and might threaten content alignment. It outlines a process for evaluating alignment that is sensitive to these threats and gives an empirical example of the process.
Alignment studies were held in two 3-day sessions on June 16-19, 2010, in Dover, Delaware. Twenty-eight reviewers, including classroom teachers, reading specialists, and reading assessment experts, analyzed the relationships between the Delaware Content Standards, performance indicators, and the item banks for grades 2-10. Four groups of 6-7 reviewers each were established to conduct the analysis by grade cluster (2-4, 5-6, 7-8, and 9-10). Two group leaders were from out of state; the two other leaders and the reviewers were from Delaware. To facilitate the review process, test items were grouped in units of about 50 items each. The order of the items was arbitrary from the item bank, not a pseudo test form. The report includes summary results by grade and standard and an evaluation of the alignment using the four criteria. The appendices include data on each item, the Depth-of-Knowledge (DOK) level, and the performance indicator and standard to which the item was assigned. The alignment between the two reading standards and the item bank across the nine grades generally needed some improvement. The results indicate an acceptable level of DOK across all grades for Standard 2 and five grades (2, 3, 4, 7, and 8) for Standard 4. According to the reviewers' judgments, the criterion for Range of Knowledge was acceptable for both standards in grades 5, 9, and 10, but only weakly met or unacceptable in six grades (2, 3, 4, 6, 7, and 8), mainly for Standard 4. The item banks for grades with low or weak Range-of-Knowledge Correspondence would support two to five test forms that would be considered to have acceptable alignment, but not to be fully aligned. Up to five additional items are needed for those grades to improve the alignment from an acceptable level to a fully aligned level.
The two reading standards were very similar across grades; however, there were discrepancies in the way the groups interpreted some performance indicators. Even with the variation in coding among the groups, each of which followed its own decision rules, the summary of the alignment results may not have been significantly influenced. It is not surprising that the reviewers coded 70-80% of the items to Standard 2. Standard 2 had 22 to 27 performance indicators, while Standard 4 had 8 to 13 performance indicators. In addition, according to the state, the assessments were designed to focus on 15-18 performance indicators for Standard 2 and only 3-6 for Standard 4. The state indicated it was a conscious decision informed by the appropriateness of assessing certain performance indicators on a statewide, high-stakes reading assessment. The reading standards did require some interpretation.
Many individuals made this alignment study possible. First, we would like to thank all the reviewers and team leaders who took their task seriously and worked very hard to complete this alignment review. These educators (all classroom teachers and supervisors) generously gave up three days of their summer vacation without compensation in order to participate in this project and learn from the exercise. We are most grateful for their professionalism and dedication. We are also deeply grateful to Professor Norm Webb of the University of Wisconsin for his support during the study. He traveled to New Mexico and introduced the reviewers to the Depth of Knowledge approach to examining alignment between standards and assessments. He provided guidance throughout the project, from planning to project completion.
A major goal of Library Power was to increase the collaboration among classroom teachers and librarians. The research reported in this article supports the conclusion that Library Power was successful in achieving this goal. Analysis of data from over 400 schools (including collaboration logs completed by librarians and questionnaires completed by principals, librarians, and teachers) shows that participation in Library Power increased the percentage of schools where teachers and librarians collaborated to plan instruction and to develop the library collection. Library Power also apparently increased the percentage of teachers who collaborated with the librarian in schools where collaboration already existed. Collaborative logs supported the conclusion that library skills were integrated into the curriculum at all grade levels.
findings and opinions expressed in this report do not reflect the positions or policies of the National
Investigations into Assessment in Mathematics Education, 1993
... Deductive proof is prominent in mathematics in establishing truth, whereas the sciences depend heavily on observation and ... mathematics as they work on a project and their responses to probing questions are more authentic indicators of their ability to do mathematics than a ...
The impact of curricular reform efforts incorporated into the Statewide Systemic Initiatives (SSIs) program was examined at the item level using data from the 1990, 1992, and 1996 NAEP state administrations. The item-level data from the State NAEP for grade 8 from 1990, 1992, and 1996 and for grade 4 from 1992 and 1996 were examined by applying methodology developed for the study of differential item functioning (DIF) under Item Response Theory. The results suggest that there are some important differences in item functioning between SSI and non-SSI states. Eighth-grade students in SSI states in 1992 and 1996 appear more likely to perform better than those in non-SSI states on items reflecting curricular reforms associated with the NCTM Standards. At grade 4, the earlier differences between SSI states and non-SSI states declined, indicating that SSI states became more comparable with the non-SSI states in the underlying constructs being tested.
The GK-12 program of the National Science Foundation is an innovative program for enriching the value of graduate and advanced undergraduate students' education while simultaneously enriching science and mathematics teaching at the K-12 level. GK-12 is a fellowship program that offers graduate students and advanced undergraduates the opportunity to serve as resources for K-12 teachers of science and mathematics. An evaluation was conducted to provide information about GK-12. One component of the evaluation was the qualitative analysis of case studies from 12 purposively selected sites, and the other was a quantitative analysis of survey data from all project sites. Findings show that the areas most often cited as strongest program areas were: (1) content knowledge gains for teachers; (2) positive role models for students; (3) improved school-university relationship; and (4) improved communication and instructional skills of Fellows. The two areas most often cited as "less ...
This is a report of the results of a three-day Alignment Analysis Institute conducted September 27-29, 2006, in Springfield, Illinois. Five people, including language arts content experts, district language arts supervisors, and language arts teachers, met to analyze the agreement between the state's reading standards and assessments for grades 3-8. This analysis indicates that the alignment needs some improvement except for grade 8. The alignment at grade 8 was considered reasonable. The Balance criterion was not satisfied for Goal 1 across all the grades, primarily due to the overabundance of assessment items asking for simple inferences about a passage's meaning. For grades 3-6 the Range of Knowledge Correspondence criterion was also not satisfied, meaning that too high a proportion of benchmarks were not addressed by assessment items. The depth-of-knowledge levels were low compared to the complexity of the benchmarks for Goal 2 at grade 4 and grade 7. These alignment findings were supported and detailed by reviewer debriefing comments. These alignment weaknesses could be addressed by replacing 3 to 8 items at each grade level. It is the conclusion of this analysis that the alignment between the Illinois reading standards and assessments needs some improvement.
Papers by Norman Webb