The use of loglinear latent class models to detect item bias was studied. Purposes of the study were to: (1) develop procedures for use in assessing item bias when the grouping variable with respect to which bias occurs is not observed; (2) develop bias detection procedures that relate to a conceptually different assessed trait, a categorical attribute; and (3) exemplify the use of these procedures with real-world data. Models are formulated so that the attribute to be measured may be continuous, as in a Rasch model, or categorical, as in latent class models. The item bias to be studied may correspond to a manifest grouping variable, a latent grouping variable, or both. Likelihood-ratio tests for assessing the presence of various types of bias are described. These methods are illustrated through analysis of a "real world" data set from a study of multiplication items administered to 286 Dutch undergraduates. Bias was related to a manifest grouping variable by giving 143 of the subjects some training in Roman numerals, in which some of the multiplication problems were written. Results indicate that it was possible to explain item bias through differences in item difficulties or error rates across levels of grouping variables. The model presented can be extended to include several observed and unobserved variables. Ten tables present information about the models and findings of the study. A 39-item list of references is included.
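The likelihood-ratio tests mentioned above compare a restricted (no-bias) model against a fuller model that allows item bias via the G² statistic. A minimal sketch, assuming the fitted log-likelihoods are already available; the closed-form chi-square tail used here holds only for even degrees of freedom, an assumption made for brevity:

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function; closed form valid for even df."""
    m = df // 2
    return math.exp(-x / 2.0) * sum((x / 2.0) ** i / math.factorial(i)
                                    for i in range(m))

def lr_test(ll_restricted, ll_full, df_diff):
    """Likelihood-ratio (G^2) test of a restricted (no-bias) model
    nested within a fuller model that allows item bias."""
    g2 = 2.0 * (ll_full - ll_restricted)
    return g2, chi2_sf_even_df(g2, df_diff)
```

For example, a gain of 5 log-likelihood points for 2 extra parameters gives G² = 10, which is significant at the 0.05 level.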
Applied psychological …, 1985
Possible underlying causes of item bias were examined using a simulation procedure. Data sets were generated to conform to specified factor structures and mean factor scores. Comparisons between the item parameters of various data sets were made with one data set representing the "majority" group and another data set representing the "minority" group. Results indicated that items that required a secondary ability, on which two groups differed in mean level, were generally more biased than those items that do not require a secondary ability. Items with different factor structures in two groups were not consistently identified as more biased than those having similar factor structures. A substantial amount of agreement was found among the bias indices used in the study.
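The simulation design described above can be sketched as follows; all parameter names and values are illustrative assumptions of this sketch, not the study's actual specification. An item that loads on a secondary ability, on which the focal group has a lower mean, will show depressed performance even when primary ability distributions are identical:

```python
import math
import random

def simulate_item(n, primary_load, secondary_load, secondary_mean, difficulty):
    """Generate dichotomous responses to one item under a two-factor
    model. When secondary_load > 0 and the groups differ in
    secondary_mean, the item will tend to look biased even though the
    primary ability distributions are identical."""
    responses = []
    for _ in range(n):
        theta1 = random.gauss(0.0, 1.0)             # primary ability
        theta2 = random.gauss(secondary_mean, 1.0)  # secondary ability
        z = primary_load * theta1 + secondary_load * theta2 - difficulty
        p = 1.0 / (1.0 + math.exp(-z))
        responses.append(1 if random.random() < p else 0)
    return responses
```

With secondary_load set to zero, the group difference in secondary_mean has no effect, reproducing the unbiased baseline condition.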
Latent class analysis (LCA) has been used to model measurement error, to identify flawed survey questions, and to compare mode effects. Using data from a survey of University of Maryland alumni together with alumni records we evaluate this technique to determine its accuracy and effectiveness for detecting bad questions in the survey context. Our results showed good qualitative results for the latent class models -the items that the model deemed the worst were the worst according to the true scores -but weaker quantitative estimates of the error rates for a given item.
1989
The stability of bias estimates from J. Scheuneman's chi-square method, the transformed Delta method, Rasch's one-parameter residual analysis, and the Mantel-Haenszel procedure was compared across small and large samples for a data set of 30,000 cases. Bias values for 30 samples were estimated for each method, and means and variances of item bias were computed across all the samples, for comparisons contrasting sample size, sex, and race. The point estimates of item bias, based on 30 replications for each method, were also correlated across random samples, and classification techniques compared the results for agreement. The results showed that none of the methods consistently flagged more or fewer items as biased, though at the larger sample sizes the Mantel-Haenszel and Rasch methods were particularly sensitive at detecting item bias and in high agreement. Reliabilities of the Modified Delta method were generally lower than those of the other methods, as were the correlations between the Modified Delta and the other indices. The results showed that not until the number of cases in each comparison group reached 1,000 did the reliabilities for any technique approach 0.80. (Contains 5 tables and 22 references.) (Author/SLD)
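For reference, the Mantel-Haenszel procedure mentioned above pools 2x2 tables across total-score strata into a common odds ratio. A minimal sketch, assuming counts have already been cross-tabulated by stratum; the tuple layout is this sketch's own convention:

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across total-score strata.
    Each stratum is (a, b, c, d): reference-group right/wrong and
    focal-group right/wrong counts. A value near 1.0 suggests no
    uniform DIF on the item."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den
```

In practice the statistic is usually reported after a log transform to the ETS delta scale, but the odds ratio itself is the core quantity.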
1993
The subject of this dissertation is the examination of differential item functioning (DIF) through the use of loglinear Rasch models with latent classes. DIF is present when the probability of a correct response differs across racial, ethnic, or gender groups for equally able test takers. Because the usual methods of detecting DIF give little information about why an item is biased, use of the solution-error response-error (SERE) model of H. Kelderman is proposed. It is demonstrated that the SERE model can show whether DIF is caused by the difficulty of the item, the attractiveness of its alternatives, or both. The large amount of computer memory required makes this method impractical for a large number of items. A new method is therefore proposed based on dividing the whole item set into several subsets, which is made possible by the collapsibility of the SERE model. With subsets of items, the parameters of the entire SERE model can be obtained only by simultaneous estimation of the parameters of the collapsed SERE models through pseudo-likelihood theory. A simulation study demonstrates that a distinction can be made between the two types of DIF using the new approach. A generalization of the SERE model applicable to polytomously scored latent states, which may be explained with a multidimensional latent space, is discussed. Five appendices illustrate applications of these models with reference to existing tests and the collapsed SERE model. (Contains 167 references.) (SLD)
Journal of Educational Statistics, 1984
Theoretically preferred IRT bias detection procedures were applied to both a mathematics achievement and vocabulary test. The data were from black and white seniors on the High School and Beyond data files. To account for statistical artifacts, each analysis was repeated on randomly equivalent samples of blacks and whites (n's = 1,500). Furthermore, to establish a baseline for judging bias indices that might be attributable only to sampling fluctuations, bias analyses were conducted comparing randomly selected groups of whites. To assess the effect of mean group differences on the appearance of bias, pseudo-ethnic groups were created, that is, samples of whites were selected to simulate the average black-white difference. The validity and sensitivity of the IRT bias indices were supported by several findings. A relatively large number of items (10 of 29) on the math test were found to be consistently biased; they were replicated in parallel analyses. The bias indices were substantia...
1977
Because it is a true score model employing item parameters which are independent of the examined sample, item characteristic curve (ICC) theory offers several advantages over classical measurement theory. In this paper an approach to biased-item identification using ICC theory is described and applied. The ICC theory approach is attractive in that it: (1) appears to be sensitive largely to cultural variations in the trait gauged by test items; (2) does not assume total scores to be valid indicators of true ability; (3) places the identified degree of item bias on a quantified metric; and (4) is applicable to items of sufficiently varying degrees of difficulty. While sensitive to some factors other than item bias, namely local independence, item inappropriateness, and poor parameter estimates, the approach may prove useful to the measurement field. (Author/RC)
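ICC-based approaches of the kind described above quantify bias as the gap between the groups' item characteristic curves. A minimal sketch; the unsigned-area index, the 2PL curve form, and the integration range are illustrative assumptions, not this paper's exact metric:

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def area_between_iccs(a_ref, b_ref, a_foc, b_foc, lo=-4.0, hi=4.0, n=2000):
    """Unsigned area between the reference- and focal-group ICCs,
    a common ICC-based bias index, approximated by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * h
        total += abs(icc(theta, a_ref, b_ref) - icc(theta, a_foc, b_foc))
    return total * h
```

Identical parameters in both groups yield an area of zero; a one-logit difficulty shift yields an area of roughly one over a wide theta range.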
2011
Latent trait theory is often referred to as item response theory (IRT) in the area of educational testing and psychological measurement. IRT models show the relationship between the unobserved constructs (e.g., an academic proficiency) and the observed variables (e.g., an item response of the examinee). Because IRT provides many advantages over classical test theory, IRT methods are used in many testing applications. One of the useful features of IRT is the comparability of the test scores obtained from different test forms. However, the parameters of test items need to be put onto a common metric in advance, namely the item parameter calibration. Among various IRT models, this study focuses on unidimensional IRT models for dichotomously (0/1) scored tests. Under the three-parameter logistic (3PL) model (Lord, 1980), the probability of a correct response to item j for the latent trait variable θ is defined as P_j(θ) = c_j + (1 − c_j) / (1 + exp[−D a_j(θ − b_j)]), where a_j is the item discrimination, b_j the item difficulty, c_j the lower asymptote (pseudo-guessing) parameter, and D a scaling constant (commonly 1.7).
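The 3PL response probability named above can be written directly in code; a minimal sketch, with function and parameter names of this sketch's own choosing:

```python
import math

def p_correct_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic model: probability of a correct
    response given latent trait theta. a = discrimination,
    b = difficulty, c = lower asymptote (pseudo-guessing),
    D = scaling constant (1.7 approximates the normal ogive)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
```

At theta equal to the item difficulty b, the probability is the midpoint between c and 1; far below b it approaches the guessing floor c.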
Participatory Educational Research, 2020
In this study, TIMSS 2015 data from three countries (Singapore, Turkey, and South Africa) were examined for the existence of latent classes using Mixture Item Response Theory (MixIRT) models (Rasch, 1PL, 2PL, and 3PL) on 18 multiple-choice items in the science subtest. Based on the findings, it was concluded that the data obtained from the TIMSS 2015 8th-grade science subtest have a heterogeneous structure consisting of two latent classes. Examination of the item difficulty parameters in the two classes showed that for Singapore the items were considerably easy for students in Class 1 and easy for those in Class 2; for Turkey the items were easy for students in Class 1 and difficult for those in Class 2; and for South Africa the items were somewhat easy for students in Class 1 and considerably difficult for those in Class 2. The findings were discussed in the context of the assumption of parameter invariance and test validity.
Journal of the Royal Statistical Society: Series A (Statistics in Society), 2008
Latent class analysis has been used to model measurement error, to identify flawed survey questions and to estimate mode effects. Using data from a survey of University of Maryland alumni together with alumni records, we evaluate this technique to determine its usefulness for detecting bad questions in the survey context. Two sets of latent class analysis models are applied in this evaluation: latent class models with three indicators and latent class models with two indicators under different assumptions about prevalence and error rates. Our results indicated that the latent class analysis approach produced good qualitative results for the latent class models: the item that the model deemed the worst was the worst according to the true scores. However, the approach yielded weaker quantitative estimates of the error rates for a given item.
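A minimal sketch of the kind of latent class model described above, for two classes and three dichotomous indicators; the EM starting values and iteration count are arbitrary assumptions of this sketch:

```python
def lca_em(data, n_iter=200):
    """EM estimation for a 2-class latent class model with three
    dichotomous indicators, assuming local independence within class.
    data: list of (x1, x2, x3) tuples with entries 0/1. Returns the
    class-1 prevalence and P(x_j = 1 | class) for both classes; when
    classes are anchored to true states, these conditionals play the
    role of item error rates."""
    pi = 0.6                                  # starting prevalence
    p = [[0.8, 0.7, 0.6], [0.3, 0.2, 0.4]]    # starting conditionals
    for _ in range(n_iter):
        # E-step: posterior probability of class 1 for each record
        post = []
        for x in data:
            l1, l2 = pi, 1.0 - pi
            for j in range(3):
                l1 *= p[0][j] if x[j] else 1.0 - p[0][j]
                l2 *= p[1][j] if x[j] else 1.0 - p[1][j]
            post.append(l1 / (l1 + l2))
        # M-step: re-estimate prevalence and conditional probabilities
        pi = sum(post) / len(data)
        for c, w in ((0, post), (1, [1.0 - t for t in post])):
            tot = sum(w)
            for j in range(3):
                p[c][j] = sum(wi for wi, x in zip(w, data) if x[j]) / tot
    return pi, p
```

With three indicators the model is just identified, which is why the two-indicator variants in the study require extra assumptions about prevalence and error rates.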
Applied Psychological Measurement, 1984
Item bias research has compared methods empirically using both computer simulation with known amounts of bias and real data with unknown amounts of bias. This study extends previous research by "planting" biased items in the realistic context of math word problems. "Biased" items are those in which the reading level is too high for a group of students, so that the items are unable to assess the students' math knowledge. Of the three methods assessed (Angoff's transformed difficulty, Camilli's full chi-square, and Linn and Harnisch's item response theory (IRT) approach), only the IRT approach performed well. Removing the biased items had a minor effect on the validity for the minority group.
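Angoff's transformed difficulty method compares items on the delta scale; a sketch of the transform (the function name is this sketch's own; the classical convention is delta = 13 + 4z, with z the normal deviate for the proportion answering incorrectly, and bias is then judged from each item's distance to the major axis of the two groups' delta plot):

```python
from statistics import NormalDist

def delta_value(p_correct):
    """Angoff's transformed item difficulty: delta = 13 + 4z, where z
    is the normal deviate corresponding to the proportion answering
    incorrectly. Larger delta means a harder item."""
    z = NormalDist().inv_cdf(1.0 - p_correct)
    return 13.0 + 4.0 * z
```

An item answered correctly by half the group sits at delta 13; harder items fall above 13 and easier items below.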