
-
Assessing the Accuracy of Parameter Estimates in the Presence of Rapid Guessing Misclassifications Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-04-21 Joseph A. Rios
The presence of rapid guessing (RG) poses a challenge to practitioners in obtaining accurate estimates of measurement properties and examinee ability. In response to this concern, researchers have utilized response times as a proxy for RG and have attempted to improve parameter estimation accuracy by filtering RG responses using popular scoring approaches, such as the effort-moderated item response
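For context, the effort-moderated scoring rule referenced here is usually written as below (a sketch in standard notation, not taken from this article; Δ_ij is a response-time-based solution-behavior flag and m_j the number of response options):

```latex
% Effort-moderated item response function (after Wise & DeMars):
% \Delta_{ij} = 1 when response time indicates solution behavior,
% \Delta_{ij} = 0 when it indicates rapid guessing.
P(X_{ij}=1 \mid \theta_i)
  = \Delta_{ij}\,\frac{e^{a_j(\theta_i-b_j)}}{1+e^{a_j(\theta_i-b_j)}}
  + (1-\Delta_{ij})\,\frac{1}{m_j}
```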
-
Evaluation of Second- and Third-Level Variance Proportions in Multilevel Designs With Completely Observed Populations: A Note on a Latent Variable Modeling Procedure Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-04-21 Tenko Raykov, Natalja Menold, Jane Leer
Two- and three-level designs in educational and psychological research can involve entire populations of Level-3 and possibly Level-2 units, such as schools and educational districts nested within a given state, or neighborhoods and counties in a state. Such a design is of increasing relevance in empirical research owing to the growing popularity of large-scale studies in these and cognate disciplines
-
Sample Size Requirements for Simple and Complex Mediation Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-04-19 Mikyung Sim, Su-Young Kim, Youngsuk Suh
Mediation models have been widely used in many disciplines to better understand the underlying processes between independent and dependent variables. Despite their popularity and importance, the appropriate sample sizes for estimating those models are not well known. Although several approaches (such as Monte Carlo methods) exist, applied researchers tend to use insufficient sample sizes to estimate
-
Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-04-19 Ulrich Schroeders, Christoph Schmidt, Timo Gnambs
Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary or paradata (e.g., response times), or
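A rough sketch of how such detection can be framed (not the authors' pipeline): careless responding becomes a supervised classification problem over response-pattern features. The features, simulated labels, and tuning values below are all hypothetical:

```python
# Sketch: screening for careless responders with stochastic gradient
# boosting; features are hypothetical stand-ins for common indicators.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 1000
longstring = rng.poisson(3, n).astype(float)  # longest identical-answer run
irv = rng.normal(1.2, 0.3, n)                 # intra-individual response SD
med_rt = rng.lognormal(1.0, 0.4, n)           # median response time (s)
X = np.column_stack([longstring, irv, med_rt])

# Simulated ground truth: carelessness more likely with long runs,
# low within-person variability, and fast responding.
logit = 0.8 * (longstring - 3.0) - 2.0 * (irv - 1.2) - 0.5 * (med_rt - 3.0)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(logit - 2.0))))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                 subsample=0.8)
clf.fit(X_tr, y_tr)
print("holdout AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```

Subsampling rows on each boosting iteration (subsample < 1.0) is what makes the procedure "stochastic" gradient boosting in Friedman's sense.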
-
The Use of Theory of Linear Mixed-Effects Models to Detect Fraudulent Erasures at an Aggregate Level Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-03-29 Luyao Peng, Sandip Sinharay
Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index of Wollack et al. (2015) to suggest three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on the research of Wollack and Eckerly (2017) and Sinharay (2018) and suggests
-
Testing for Differential Item Functioning Under the D-Scoring Method Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-03-26 Dimiter M. Dimitrov, Dimitar V. Atanasov
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as D-scoring method (DSM). Under the proposed approach, called P–Z method of testing for DIF, the item response functions of two groups (reference and focal) are compared by transforming their probabilities of correct item response, estimated under the DSM, into
-
A Multidimensional Item Response Theory Model for Continuous and Graded Responses With Error in Persons and Items Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-03-10 Pere J. Ferrando, David Navarro-González
Item response theory “dual” models (DMs) in which both items and individuals are viewed as sources of differential measurement error so far have been proposed only for unidimensional measures. This article proposes two multidimensional extensions of existing DMs: the M-DTCRM (dual Thurstonian continuous response model), intended for (approximately) continuous responses, and the M-DTGRM (dual Thurstonian
-
Robustness of Latent Profile Analysis to Measurement Noninvariance Between Profiles Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-03-09 Yan Wang, Eunsook Kim, Zhiyao Yi
Latent profile analysis (LPA) identifies heterogeneous subgroups based on continuous indicators that represent different dimensions. It is a common practice to measure each dimension using items, create composite or factor scores for each dimension, and use these scores as indicators of profiles in LPA. In this case, measurement models for dimensions are not included and potential noninvariance across
-
Nominal Factor Analysis of Situational Judgment Tests: Evaluation of Latent Dimensionality and Factorial Invariance Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-25 Javier Revuelta, Alicia Franco-Martínez, Carmen Ximénez
Situational judgment tests have gained popularity in educational and psychological measurement and are widely used in personnel assessment. A situational judgment item presents a hypothetical scenario and a list of actions, and the individuals are asked to select their most likely action for that scenario. Because actions have no explicit order, the item generates nominal responses consisting of the
-
Evaluating Different Scoring Methods for Multiple Response Items Providing Partial Credit Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-22 Joe Betts, William Muntean, Doyoung Kim, Shu-chuan Kao
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw scores. This research will evaluate several approaches
-
Prediction With Mixed Effects Models: A Monte Carlo Simulation Study Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-16 Anthony A. Mangino, W. Holmes Finch
In many fields of the social and natural sciences, data are often obtained within a nested structure (e.g., students within schools). To effectively analyze data with such a structure, multilevel models are frequently employed. The present study utilizes a Monte Carlo simulation to compare several novel multilevel classification algorithms across several varied data conditions for the purpose
-
Large-Sample Variance of Fleiss Generalized Kappa Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-15 Kilem L. Gwet
Cohen’s kappa coefficient was originally proposed for two raters only and was later extended to an arbitrarily large number of raters to become what is known as Fleiss’ generalized kappa. Fleiss’ generalized kappa and its large-sample variance are still widely used by researchers and have been implemented in several software packages, including, among others, SPSS and the R package “rel.” The purpose of
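For reference, the generalized kappa is commonly written as follows (standard notation: r raters classify each of n subjects, with r_ik of them placing subject i in category k):

```latex
p_k = \frac{1}{nr}\sum_{i=1}^{n} r_{ik}, \qquad
\bar{P} = \frac{1}{n}\sum_{i=1}^{n}
          \frac{\sum_{k} r_{ik}(r_{ik}-1)}{r(r-1)}, \qquad
\hat{\kappa} = \frac{\bar{P}-\sum_{k}p_k^{2}}{1-\sum_{k}p_k^{2}}
```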
-
KR20 and KR21 for Some Nondichotomous Data (It’s Not Just Cronbach’s Alpha) Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-15 Robert C. Foster
This article presents some equivalent forms of the common Kuder–Richardson Formula 21 and 20 estimators for nondichotomous data belonging to certain other exponential families, such as Poisson count data, exponential data, or geometric counts of trials until failure. Using the generalized framework of Foster (2020), an equation for the reliability of a subset of the natural exponential family has
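The two classical estimators being generalized take their familiar dichotomous forms (k items, item difficulties p_i with mean p̄, total-score variance σ²_X; KR-21 additionally assumes equal item difficulties):

```latex
\mathrm{KR20} = \frac{k}{k-1}
  \left(1-\frac{\sum_{i=1}^{k} p_i(1-p_i)}{\sigma_X^{2}}\right),
\qquad
\mathrm{KR21} = \frac{k}{k-1}
  \left(1-\frac{k\,\bar{p}(1-\bar{p})}{\sigma_X^{2}}\right)
```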
-
Determining the Number of Factors When Population Models Can Be Closely Approximated by Parsimonious Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-15 Yan Xia
Despite the existence of many methods for determining the number of factors, none outperforms the others under every condition. This study compares traditional parallel analysis (TPA), revised parallel analysis (RPA), Kaiser’s rule, minimum average partial, sequential χ2, and sequential root mean square error of approximation, comparative fit index, and Tucker–Lewis index under a realistic scenario
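As orientation for this comparison, TPA retains factors for as long as the observed eigenvalues exceed those of random data. A minimal sketch, using the mean random eigenvalue as the threshold (RPA and percentile variants differ in exactly this choice):

```python
# Sketch of traditional parallel analysis (TPA).
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    sims = np.empty((n_sims, p))
    for s in range(n_sims):
        r = np.corrcoef(rng.standard_normal((n, p)), rowvar=False)
        sims[s] = np.linalg.eigvalsh(r)[::-1]
    thresh = sims.mean(axis=0)  # percentile variants: np.percentile(..., 95)
    k = 0
    while k < p and obs[k] > thresh[k]:
        k += 1
    return k

# Quick check on synthetic two-factor data (should usually print 2).
rng = np.random.default_rng(1)
signal = rng.standard_normal((500, 2)) @ rng.uniform(0.5, 0.8, (2, 10))
print(parallel_analysis(signal + rng.standard_normal((500, 10))))
```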
-
A Simulation Study on the Performance of Different Reliability Estimation Methods Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-15 Ashley A. Edwards, Keanan J. Joyner, Christopher Schatschneider
The accuracy of certain internal consistency estimators has been questioned in recent years. The present study tests the accuracy of six reliability estimators (Cronbach’s alpha, omega, omega hierarchical, Revelle’s omega, and greatest lower bound) in 140 simulated conditions of unidimensional continuous data with uncorrelated errors with varying sample sizes, number of items, population reliabilities
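For concreteness, the first of those estimators can be computed directly from a persons-by-items score matrix; a minimal sketch:

```python
# Sketch: Cronbach's alpha from an (n_persons x k_items) matrix.
import numpy as np

def cronbach_alpha(scores):
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

rng = np.random.default_rng(0)
true_score = rng.standard_normal((300, 1))
items = true_score + rng.standard_normal((300, 8))  # 8 parallel-ish items
print(round(cronbach_alpha(items), 3))  # approx. 0.89 for this setup
```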
-
Is Differential Noneffortful Responding Associated With Type I Error in Measurement Invariance Testing? Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-12 Joseph A. Rios
Low test-taking effort as a validity threat is common when examinees perceive an assessment context to have minimal personal value. Prior research has shown that in such contexts, subgroups may differ in their effort, which raises two concerns when making subgroup mean comparisons. First, it is unclear how differential effort could influence evaluations of scale property equivalence. Second, if attaining
-
A Polytomous Scoring Approach to Handle Not-Reached Items in Low-Stakes Assessments Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-12 Guher Gorgun, Okan Bulut
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of not-reached items is high, these traditional approaches may yield
-
Examining the Impact of and Sensitivity of Fit Indices to Omitting Covariates Interaction Effect in Multilevel Multiple-Indicator Multiple-Cause Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-12 Chunhua Cao, Eun Sook Kim, Yi-Hsin Chen, John Ferron
This study examined the impact of omitting covariates interaction effect on parameter estimates in multilevel multiple-indicator multiple-cause models as well as the sensitivity of fit indices to model misspecification when the between-level, within-level, or cross-level interaction effect was left out in the models. The parameter estimates produced in the correct and the misspecified models were compared
-
On the Relationship Between Item Stem Formulation and Criterion Validity of Multiple-Component Measuring Instruments Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-08 Natalja Menold, Tenko Raykov
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item stems. The case of complex item stems involving two stimuli
-
A Short Note on Optimizing Cost-Generalizability via a Machine-Learning Approach Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-02-08 Zhehan Jiang, Dexin Shi, Christine Distefano
The costs of an objective structured clinical examination (OSCE) are of concern to health profession educators globally. As OSCEs are usually designed under generalizability theory (G-theory) framework, this article proposes a machine-learning-based approach to optimize the costs, while maintaining the minimum required generalizability coefficient, a reliability-like index in G-theory. The authors
-
Multiple Group Analysis in Multilevel Data Across Within-Level Groups: A Comparison of Multilevel Factor Mixture Modeling and Multilevel Multiple-Indicators Multiple-Causes Modeling Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-01-19 Sookyoung Son, Sehee Hong
The purpose of this two-part study is to evaluate methods for multiple group analysis when the comparison group is at the within level with multilevel data, using a multilevel factor mixture model (ML FMM) and a multilevel multiple-indicators multiple-causes (ML MIMIC) model. The performance of these methods was evaluated integrally by a series of procedures testing weak and strong invariance models
-
Detecting Rater Biases in Sparse Rater-Mediated Assessment Networks Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-01-19 Stefanie A. Wind, Yuan Ge
Practical constraints in rater-mediated assessments limit the availability of complete data. Instead, most scoring procedures include one or two ratings for each performance, with overlapping performances across raters or linking sets of multiple-choice items to facilitate model estimation. These incomplete scoring designs present challenges for detecting rater biases, or differential rater functioning
-
Bayesian Hierarchical Multidimensional Item Response Modeling of Small Sample, Sparse Data for Personalized Developmental Surveillance Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-01-19 Patricia Gilholm, Kerrie Mengersen, Helen Thompson
Developmental surveillance tools are used to closely monitor the early development of infants and young children. This study provides a novel implementation of a multidimensional item response model, using Bayesian hierarchical priors, to construct developmental profiles for a small sample of children (N = 115) with sparse data collected through an online developmental surveillance tool. The surveillance
-
On the Detection of the Correct Number of Factors in Two-Facet Models by Means of Parallel Analysis Educ. Psychol. Meas. (IF 1.941) Pub Date : 2021-01-05 André Beauducel, Norbert Hilger
Methods for optimal factor rotation of two-facet loading matrices have recently been proposed. However, the problem of the correct number of factors to retain for rotation of two-facet loading matrices has rarely been addressed in the context of exploratory factor analysis. Most previous studies were based on the observation that two-facet loading matrices may be rank deficient when the salient loadings
-
How Days Between Tests Impacts Alternate Forms Reliability in Computerized Adaptive Tests Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-12-15 Adam E. Wyse
An essential question when computing test–retest and alternate forms reliability coefficients is how many days there should be between tests. This article uses data from reading and math computerized adaptive tests to explore how the number of days between tests impacts alternate forms reliability coefficients. Results suggest that the highest alternate forms reliability coefficients were obtained
-
Generalized Linear Factor Score Regression: A Comparison of Four Methods Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-12-11 Gustaf Andersson, Fan Yang-Wallentin
Factor score regression has recently received growing interest as an alternative for structural equation modeling. However, many applications are left without guidance because of the focus on normally distributed outcomes in the literature. We perform a simulation study to examine how a selection of factor scoring methods compare when estimating regression coefficients in generalized linear factor
-
Growth Mixture Modeling With Nonnormal Distributions: Implications for Data Transformation Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-12-08 Yeji Nam, Sehee Hong
This study investigated the extent to which class-specific parameter estimates are biased by the within-class normality assumption in nonnormal growth mixture modeling (GMM). Monte Carlo simulations for nonnormal GMM were conducted to analyze and compare two strategies for obtaining unbiased parameter estimates: relaxing the within-class normality assumption and using data transformation on repeated
-
Combined Approach to Multi-Informant Data Using Latent Factors and Latent Classes: Trifactor Mixture Model Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-11-27 Eunsook Kim, Nathaniel von der Embse
Although collecting data from multiple informants is highly recommended, methods to model the congruence and incongruence between informants are limited. Bauer and colleagues suggested the trifactor model that decomposes the variances into common factor, informant perspective factors, and item-specific factors. This study extends their work to the trifactor mixture model that combines the trifactor
-
A Comparison of Label Switching Algorithms in the Context of Growth Mixture Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-11-16 Kristina R. Cassiday, Youngmi Cho, Jeffrey R. Harring
Simulation studies involving mixture models inevitably aggregate parameter estimates and other output across numerous replications. A primary issue that arises in these methodological investigations is label switching. The current study compares several label switching corrections that are commonly used when dealing with mixture models. A growth mixture model is used in this simulation study, and the
-
Explaining Variability in Response Style Traits: A Covariate-Adjusted IRTree Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-11-04 Allison J. Ames, Aaron J. Myers
Contamination of responses due to extreme and midpoint response style can confound the interpretation of scores, threatening the validity of inferences made from survey responses. This study incorporated person-level covariates in the multidimensional item response tree model to explain heterogeneity in response style. We include an empirical example and two simulation studies to support the use and
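As background, IRTree models expand each observed rating into binary pseudo-items before covariates enter the model. A minimal sketch of the common three-node decomposition of a 5-point response (midpoint, direction, extremity); the mapping is standard, the helper code hypothetical:

```python
# Sketch: Boeckenholt-style pseudo-item expansion for a 5-point scale.
# None marks nodes undefined for a response (no direction or extremity
# node once the midpoint is chosen). Each pseudo-item column is
# subsequently fit with its own IRT model; person-level covariates can
# then predict the style (midpoint/extremity) dimensions.
TREE_MAP = {
    1: (0, 0, 1),        # strongly disagree: non-mid, disagree, extreme
    2: (0, 0, 0),
    3: (1, None, None),  # midpoint
    4: (0, 1, 0),
    5: (0, 1, 1),        # strongly agree: non-mid, agree, extreme
}

def expand(responses):
    """Map raw 1-5 responses to (midpoint, direction, extreme) triples."""
    return [TREE_MAP[r] for r in responses]

print(expand([3, 5, 1, 4]))
# [(1, None, None), (0, 1, 1), (0, 0, 1), (0, 1, 0)]
```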
-
Assessing Preknowledge Cheating via Innovative Measures: A Multiple-Group Analysis of Jointly Modeling Item Responses, Response Times, and Visual Fixation Counts Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-10-31 Kaiwen Man, Jeffrey R. Harring
Many approaches have been proposed to jointly analyze item responses and response times to understand behavioral differences between normally and aberrantly behaved test-takers. Biometric information, such as data from eye trackers, can be used to better identify these deviant testing behaviors in addition to more conventional data types. Given this context, this study demonstrates the application
-
A Simple Model to Determine the Efficient Duration of Exams Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-10-16 Jules L. Ellis
This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in terms of the student’s time. Based on a classical test theory model, enriched with assumptions on the context, the costs of the exam can be expressed as a function of various
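A toy illustration of the trade-off described (emphatically not the author's model): weigh testing time against a hypothetical cost of measurement error, extrapolating reliability over duration with the Spearman–Brown formula:

```python
# Toy sketch: total cost of an exam as a function of duration T (hours).
# rel0 is the reliability at reference length T0; the error-cost weight
# c (student hours per unit of error-variance share) is hypothetical.
def total_cost(T, T0=1.0, rel0=0.80, c=10.0):
    k = T / T0
    rel_T = k * rel0 / (1.0 + (k - 1.0) * rel0)  # Spearman-Brown
    return T + c * (1.0 - rel_T)  # time spent + cost of error variance

for T in (0.5, 1.0, 2.0, 3.0):
    print(f"T={T:.1f}h  cost={total_cost(T):.2f}")
```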
-
Model Selection and Average Proportion Explained Variance in Exploratory Factor Analysis Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-10-10 Tenko Raykov, Lisa Calvocoressi
A procedure for evaluating the average R-squared index for a given set of observed variables in an exploratory factor analysis model is discussed. The method can be used as an effective aid in the process of model choice with respect to the number of factors underlying the interrelationships among studied measures. The approach is developed within the framework of exploratory structural equation modeling
-
Performance of the S−χ2 Statistic for the Multidimensional Graded Response Model Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-09-23 Shiyang Su, Chun Wang, David J. Weiss
S−χ2 is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S−χ2 for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was to evaluate the performance of S−χ2 under two practical misfit scenarios: first, all items are
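The statistic itself conditions on summed scores; in its familiar unidimensional form (Orlando and Thissen's version, not this article's multidimensional extension), with O_jk and E_jk the observed and model-implied proportions correct on item j among the N_k examinees at summed score k:

```latex
S\text{-}\chi^{2}_{j} = \sum_{k} N_k\,
  \frac{(O_{jk}-E_{jk})^{2}}{E_{jk}\,(1-E_{jk})}
```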
-
Performance of the Grade of Membership Model Under a Variety of Sample Sizes, Group Size Ratios, and Differential Group Response Probabilities for Dichotomous Indicators Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-09-16 W. Holmes Finch
Social scientists are frequently interested in identifying latent subgroups within the population, based on a set of observed variables. One of the more common tools for this purpose is latent class analysis (LCA), which models a scenario involving k finite and mutually exclusive classes within the population. An alternative approach to this problem is presented by the grade of membership (GoM) model
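The contrast with LCA shows up directly in the response function: rather than assigning each person to one class, GoM mixes pure-type probabilities through partial membership scores (standard formulation for dichotomous indicators):

```latex
% g_{ik} \ge 0,\ \sum_k g_{ik} = 1: person i's partial memberships;
% \lambda_{kj}: endorsement probability of item j for pure type k.
P(X_{ij}=1) = \sum_{k=1}^{K} g_{ik}\,\lambda_{kj}
```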
-
Parameter Estimation Accuracy of the Effort-Moderated Item Response Theory Model Under Multiple Assumption Violations Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-09-02 Joseph A. Rios, James Soland
As low-stakes testing contexts increase, low test-taking effort may serve as a serious validity threat. One common solution to this problem is to identify noneffortful responses and treat them as missing during parameter estimation via the effort-moderated item response theory (EM-IRT) model. Although this model has been shown to outperform traditional IRT models (e.g., two-parameter logistic [2PL])
-
Design of Paper-Based Visual Analogue Scale Items Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-09-02 Klemens Weigl, Thomas Forstner
Paper-based visual analogue scale (VAS) items were developed 100 years ago. Although they gained great popularity in clinical and medical research for assessing pain, they were scarcely applied in other areas of psychological research for several decades. However, since the beginning of digitization, VAS have attracted growing interest among researchers for carrying out computerized and paper-based
-
Modeling Item Revisit Behavior: The Hierarchical Speed–Accuracy–Revisits Model Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-08-31 Ummugul Bezirhan, Matthias von Davier, Irina Grabovsky
This article presents a new approach to the analysis of how students answer tests and how they allocate resources in terms of time on task and revisiting previously answered questions. Previous research has shown that in high-stakes assessments, most test takers do not end the testing session early, but rather spend all of the time they were assigned to take the test. Rather than being an indication
-
The Poor Fit of Model Fit for Selecting Number of Factors in Exploratory Factor Analysis for Scale Evaluation Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-08-12 Amanda K. Montoya, Michael C. Edwards
Model fit indices are being increasingly recommended and used to select the number of factors in an exploratory factor analysis. Growing evidence suggests that the recommended cutoff values for common model fit indices are not appropriate for use in an exploratory factor analysis context. A particularly prominent problem in scale evaluation is the ubiquity of correlated residuals and imperfect model
-
Corrigendum to Negative Binomial Models for Visual Fixation Counts on Test Items Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-08-07
Man, K., & Harring, J.R. (2019). Negative binomial models for visual fixation counts on test items. Educational and Psychological Measurement, 79(4), 617-635. doi: 10.1177/0013164418824148
-
Evaluating Restrictive Models in Educational and Behavioral Research: Local Misfit Overrides Model Tenability Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-08-01 Tenko Raykov, Christine DiStefano
The frequent practice of overall fit evaluation for latent variable models in educational and behavioral research is reconsidered. It is argued that since overall plausibility does not imply local plausibility and is only necessary for the latter, local misfit should be considered a sufficient condition for model rejection, even in the case of omnibus model tenability. The argument is exemplified with
-
Can High-Dimensional Questionnaires Resolve the Ipsativity Issue of Forced-Choice Response Formats? Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-24 Niklas Schulte, Heinz Holling, Paul-Christian Bürkner
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high. To determine the necessary number of traits under varying
-
Examining Multidimensional Measuring Instruments for Proximity to Unidimensional Structure Using Latent Variable Modeling Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-24 Tenko Raykov, Matthias Bluemke
A widely applicable procedure of examining proximity to unidimensionality for multicomponent measuring instruments with multidimensional structure is discussed. The method is developed within the framework of latent variable modeling and allows one to point and interval estimate an explained variance proportion-based index that may be considered a measure of proximity to unidimensional structure. The
-
Incorporating Uncertainty Into Parallel Analysis for Choosing the Number of Factors via Bayesian Methods Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-22 Roy Levy, Yan Xia, Samuel B. Green
A number of psychometricians have suggested that parallel analysis (PA) tends to yield more accurate results in determining the number of factors in comparison with other statistical methods. Nevertheless, all too often PA can suggest an incorrect number of factors, particularly in statistically unfavorable conditions (e.g., small sample sizes and low factor loadings). Because of this, researchers
-
Disentangling Item and Testing Effects in Inoculation Research on Online Misinformation: Solomon Revisited Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-16 Jon Roozenbeek, Rakoen Maertens, William McClanahan, Sander van der Linden
Online misinformation is a pervasive global problem. In response, psychologists have recently explored the theory of psychological inoculation: If people are preemptively exposed to a weakened version of a misinformation technique, they can build up cognitive resistance. This study addresses two unanswered methodological questions about a widely adopted online “fake news” inoculation game, Bad News
-
Exploring the Impact of Missing Data on Residual-Based Dimensionality Analysis for Measurement Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-13 Stefanie A. Wind, Randall E. Schumacker
Researchers frequently use Rasch models to analyze survey responses because these models provide accurate parameter estimates for items and examinees when there are missing data. However, researchers have not fully considered how missing data affect the accuracy of dimensionality assessment in Rasch analyses such as principal components analysis (PCA) of standardized residuals. Because adherence to
-
Latent D-Scoring Modeling: Estimation of Item and Person Parameters Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-13 Dimiter M. Dimitrov, Dimitar V. Atanasov
This study presents a latent (item response theory–like) framework of a recently developed classical approach to test scoring, equating, and item analysis, referred to as D-scoring method. Specifically, (a) person and item parameters are estimated under an item response function model on the D-scale (from 0 to 1) using marginal maximum-likelihood estimation and (b) analytic expressions are provided
-
Developing and Validating a Novel Anonymous Method for Matching Longitudinal School-Based Data Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-08 Jon Agley, David Tidd, Mikyoung Jun, Lori Eldridge, Yunyu Xiao, Steve Sussman, Wasantha Jayawardene, Daniel Agley, Ruth Gassman, Stephanie L. Dickinson
Prospective longitudinal data collection is an important way for researchers and evaluators to assess change. In school-based settings, for low-risk and/or likely-beneficial interventions or surveys, data quality and ethical standards are both arguably stronger when using a waiver of parental consent—but doing so often requires the use of anonymous data collection methods. The standard solution to
-
On the Pitfalls of Estimating and Using Standardized Reliability Coefficients Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-04 Tenko Raykov, George A. Marcoulides
The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability coefficient, or alternatively substantially lower than
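For orientation, the unstandardized coefficient at issue for a unidimensional scale with uncorrelated errors is typically the omega-type ratio below (factor variance fixed at 1); the standardized version applies the same expression to the completely standardized solution, which is why the two can diverge:

```latex
% \lambda_i: factor loadings; \theta_i: error variances.
\rho = \frac{\left(\sum_{i}\lambda_i\right)^{2}}
            {\left(\sum_{i}\lambda_i\right)^{2} + \sum_{i}\theta_i}
```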
-
The Appropriateness of Sum Scores as Estimates of Factor Scores in the Multiple Factor Analysis of Ordered-Categorical Responses Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-07-03 Pere J. Ferrando, Urbano Lorenzo-Seva
Unit-weight sum scores (UWSSs) are routinely used as estimates of factor scores on the basis of solutions obtained with the nonlinear exploratory factor analysis (EFA) model for ordered-categorical responses. Theoretically, this practice results in a loss of information and accuracy, and is expected to lead to biased estimates. However, the practical relevance of these limitations is far from clear
-
Improvement of Norm Score Quality via Regression-Based Continuous Norming Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-06-16 Wolfgang Lenhard, Alexandra Lenhard
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random samples with varying sizes either with a conventional ranking
-
Using the Standardized Root Mean Squared Residual (SRMR) to Assess Exact Fit in Structural Equation Models Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-06-08 Goran Pavlov, Alberto Maydeu-Olivares, Dexin Shi
We examine the accuracy of p values obtained using the asymptotic mean and variance (MV) correction to the distribution of the sample standardized root mean squared residual (SRMR) proposed by Maydeu-Olivares to assess the exact fit of structural equation models. In a simulation study, we found that under normality, the MV-corrected SRMR statistic provides reasonably accurate Type I errors even in small samples and
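The sample statistic under study takes its usual form (s_ij sample covariances, σ̂_ij model-implied covariances, s_i = √s_ii, p observed variables):

```latex
\mathrm{SRMR} = \sqrt{\frac{\sum_{i\le j}
  \bigl((s_{ij}-\hat{\sigma}_{ij})/(s_i s_j)\bigr)^{2}}{p(p+1)/2}}
```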
-
It Matters: Reference Indicator Selection in Measurement Invariance Tests Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-06-05 Yutian T. Thompson, Hairong Song, Dexin Shi, Zhengkui Liu
Conventional approaches for selecting a reference indicator (RI) could lead to misleading results in testing for measurement invariance (MI). Several newer quantitative methods have been available for more rigorous RI selection. However, it is still unknown how well these methods perform in terms of correctly identifying a truly invariant item to be an RI. Thus, Study 1 was designed to address this
-
Differential Item Functioning Effect Size From the Multigroup Confirmatory Factor Analysis for a Meta-Analysis: A Simulation Study Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-06-04 Sung Eun Park, Soyeon Ahn, Cengiz Zopluoglu
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across the studies, the differences in the estimated factor
-
Latent Variable Modeling and Adaptive Testing for Experimental Cognitive Psychopathology Research Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-06-02 Michael L. Thomas, Gregory G. Brown, Virginie M. Patt, John R. Duffy
The adaptation of experimental cognitive tasks into measures that can be used to quantify neurocognitive outcomes in translational studies and clinical trials has become a key component of the strategy to address psychiatric and neurological disorders. Unfortunately, while most experimental cognitive tests have strong theoretical bases, they can have poor psychometric properties, leaving them vulnerable
-
Testing Measurement Invariance Across Unobserved Groups: The Role of Covariates in Factor Mixture Modeling Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-05-28 Yan Wang, Eunsook Kim, John M. Ferron, Robert F. Dedrick, Tony X. Tan, Stephen Stark
Factor mixture modeling (FMM) has been increasingly used to investigate unobserved population heterogeneity. This study examined the issue of covariate effects with FMM in the context of measurement invariance testing. Specifically, the impact of excluding and misspecifying covariate effects on measurement invariance testing and class enumeration was investigated via Monte Carlo simulations. Data were
-
The Performance of the Semigeneralized Partial Credit Model for Handling Item-Level Missingness Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-05-15 Sherry Zhou, Anne Corinne Huggins-Manley
The semi-generalized partial credit model (Semi-GPCM) has been proposed as a unidimensional modeling method for handling not applicable scale responses and neutral scale responses, and it has been suggested that the model may be of use in handling missing data in scale items. The purpose of this study is to evaluate the ability of the unidimensional Semi-GPCM to aid in the recovery of person parameters
-
Seeing the Forest and the Trees: Comparison of Two IRTree Models to Investigate the Impact of Full Versus Endpoint-Only Response Option Labeling Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-05-02 Elisabeth M. Spratto, Brian C. Leventhal, Deborah L. Bandalos
In this study, we examined the results and interpretations produced from two different IRTree models—one using paths consisting of only dichotomous decisions, and one using paths consisting of both dichotomous and polytomous decisions. We used data from two versions of an impulsivity measure. In the first version, all the response options had labels; in the second version, only the endpoints were labeled
-
A Mixture IRTree Model for Extreme Response Style: Accounting for Response Process Uncertainty Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-04-27 Nana Kim, Daniel M. Bolt
This paper presents a mixture item response tree (IRTree) model for extreme response style. Unlike traditional applications of single IRTree models, a mixture approach provides a way of representing the mixture of respondents following different underlying response processes (between individuals), as well as the uncertainty present at the individual level (within an individual). Simulation analyses
-
A Mixture IRTree Model for Performance Decline and Nonignorable Missing Data Educ. Psychol. Meas. (IF 1.941) Pub Date : 2020-04-24 Hung-Yu Huang
In educational assessments and achievement tests, test developers and administrators commonly assume that test-takers attempt all test items with full effort and leave no blank responses with unplanned missing values. However, aberrant response behavior—such as performance decline, dropping out beyond a certain point, and skipping certain items over the course of the test—is inevitable, especially