-
A Note on Standard Errors for Multidimensional Two-Parameter Logistic Models Using Gaussian Variational Estimation Applied Psychological Measurement (IF 1.0) Pub Date : 2024-07-24 Jiaying Xiao, Chun Wang, Gongjun Xu
Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy (Cho et al., 2021). However, the SE estimation procedure has yet to be fully addressed. To tackle this issue, the present
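The item response function at the core of the multidimensional two-parameter logistic (M2PL) model has a standard form; a minimal sketch, with parameter values that are purely illustrative and not taken from the article:

```python
import numpy as np

def m2pl_prob(theta, a, d):
    """Probability of a correct response under the M2PL model:
    P(y = 1 | theta) = sigmoid(a . theta + d), where `a` holds the
    item's discrimination (slope) parameters across the latent
    dimensions and `d` is the item intercept."""
    return 1.0 / (1.0 + np.exp(-(np.dot(a, theta) + d)))

# A two-dimensional item with illustrative parameters
p = m2pl_prob(theta=np.array([0.5, -0.2]), a=np.array([1.2, 0.8]), d=0.3)
# p is about 0.677
```

GVEM estimates the parameters of models like this via a variational lower bound on the marginal likelihood; the note concerns SEs for those estimates.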
-
Are Large-Scale Test Scores Comparable for At-Home Versus Test Center Testing? Applied Psychological Measurement (IF 1.0) Pub Date : 2024-05-11 Katherine E. Castellano, Matthew S. Johnson, Rene Lawless
The COVID-19 pandemic led to a proliferation of remote-proctored (or “at-home”) assessments. The lack of standardized setting, device, or in-person proctor during at-home testing makes it markedly distinct from testing at a test center. Comparability studies of at-home and test center scores are important in understanding whether these distinctions impact test scores. This study found no significant
-
Test Security and the Pandemic: Comparison of Test Center and Online Proctor Delivery Modalities Applied Psychological Measurement (IF 1.0) Pub Date : 2024-04-24 Kirk A. Becker, Jinghua Liu, Paul E. Jones
Published information is limited regarding the security of testing programs, and even less on the relative security of different testing modalities: in-person at test centers (TC) versus remote online proctored (OP) testing. This article begins by examining indicators of test security violations across a wide range of programs in professional, admissions, and IT fields. We look at high levels of response
-
How Scoring Approaches Impact Estimates of Growth in the Presence of Survey Item Ceiling Effects Applied Psychological Measurement (IF 1.0) Pub Date : 2024-03-16 Kelly D. Edwards, James Soland
Survey scores are often the basis for understanding how individuals grow psychologically and socio-emotionally. A known problem with many surveys is that the items are all “easy”—that is, individuals tend to use only the top one or two response categories on the Likert scale. Such an issue could be especially problematic, and lead to ceiling effects, when the same survey is administered repeatedly
-
Evaluating the Douglas-Cohen IRT Goodness of Fit Measure With BIB Sampling of Items Applied Psychological Measurement (IF 1.0) Pub Date : 2024-03-15 John R. Donoghue, Adrienne Sgammato
Methods to detect item response theory (IRT) item-level misfit are typically derived assuming fixed test forms. However, IRT is also employed with more complicated test designs, such as the balanced incomplete block (BIB) design used in large-scale educational assessments. This study investigates two modifications of Douglas and Cohen’s 2001 nonparametric method of assessing item misfit, based on A)
-
Detecting Differential Item Functioning in Multidimensional Graded Response Models With Recursive Partitioning Applied Psychological Measurement (IF 1.0) Pub Date : 2024-03-14 Franz Classe, Christoph Kern
Differential item functioning (DIF) is a common challenge when examining latent traits in large scale surveys. In recent work, methods from the field of machine learning such as model-based recursive partitioning have been proposed to identify subgroups with DIF when little theoretical guidance and many potential subgroups are available. On this basis, we propose and compare recursive partitioning
-
Linking Methods for Multidimensional Forced Choice Tests Using the Multi-Unidimensional Pairwise Preference Model Applied Psychological Measurement (IF 1.0) Pub Date : 2024-03-12 Naidan Tu, Lavanya S. Kumar, Sean Joo, Stephen Stark
Applications of multidimensional forced choice (MFC) testing have increased considerably over the last 20 years. Yet there has been little, if any, research on methods for linking the parameter estimates from different samples. This research addressed that important need by extending four widely used methods for unidimensional linking and comparing the efficacy of new estimation algorithms for MFC
-
Using Interpretable Machine Learning for Differential Item Functioning Detection in Psychometric Tests Applied Psychological Measurement (IF 1.0) Pub Date : 2024-03-12 Elisabeth Barbara Kraus, Johannes Wild, Sven Hilbert
This study presents a novel method to investigate test fairness and differential item functioning combining psychometrics and machine learning. Test unfairness manifests itself in systematic and demographically imbalanced influences of confounding constructs on residual variances in psychometric modeling. Our method aims to account for resulting complex relationships between response patterns and demographic
-
Benefits of the Curious Behavior of Bayesian Hierarchical Item Response Theory Models—An in-Depth Investigation and Bias Correction Applied Psychological Measurement (IF 1.0) Pub Date : 2024-01-20 Christoph König, Rainer W. Alexandrowicz
When using Bayesian hierarchical modeling, a popular approach for Item Response Theory (IRT) models, researchers typically face a tradeoff between the precision and accuracy of the item parameter estimates. Given the pooling principle and variance-dependent shrinkage, the expected behavior of Bayesian hierarchical IRT models is to deliver more precise but biased item parameter estimates, compared to
-
Detecting uniform differential item functioning for continuous response computerized adaptive testing Applied Psychological Measurement (IF 1.0) Pub Date : 2024-01-18 Chun Wang, Ruoyi Zhu
Evaluating items for potential differential item functioning (DIF) is an essential step in ensuring measurement fairness. In this article, we focus on a specific scenario, namely, the continuous response, severely sparse, computerized adaptive testing (CAT). Continuous response items are increasingly used in performance-based tasks because they tend to generate more information than traditional dichotomous
-
Location-Matching Adaptive Testing for Polytomous Technology-Enhanced Items Applied Psychological Measurement (IF 1.0) Pub Date : 2024-01-16 Hyeon-Ah Kang, Gregory Arbet, Joe Betts, William Muntean
The article presents adaptive testing strategies for polytomously scored technology-enhanced innovative items. We investigate item selection methods that match examinee’s ability levels in location and explore ways to leverage test-taking speeds during item selection. Existing approaches to selecting polytomous items are mostly based on information measures and tend to experience an item pool usage
-
Comparing Test-Taking Effort Between Paper-Based and Computer-Based Tests Applied Psychological Measurement (IF 1.0) Pub Date : 2024-01-13 Sebastian Weirich, Karoline A. Sachse, Sofie Henschel, Carola Schnitzler
The article compares the trajectories of students’ self-reported test-taking effort during a 120-minute low-stakes large-scale assessment of English comprehension between a paper-and-pencil (PPA) and a computer-based assessment (CBA). Test-taking effort was measured four times during the test. Using a within-subject design, each of the N = 2,676 German ninth-grade students completed half of the test
-
Modeling Rating Order Effects Under Item Response Theory Models for Rater-Mediated Assessments Applied Psychological Measurement (IF 1.0) Pub Date : 2023-05-13 Hung-Yu Huang
Rater effects are commonly observed in rater-mediated assessments. By using item response theory (IRT) modeling, raters can be treated as independent factors that function as instruments for measur...
-
Using a Generalized Logistic Regression Method to Detect Differential Item Functioning With Multiple Groups in Cognitive Diagnostic Tests Applied Psychological Measurement (IF 1.0) Pub Date : 2023-05-13 Xiaojian Sun, Shimeng Wang, Lei Guo, Tao Xin, Naiqing Song
Items that exhibit differential item functioning (DIF) compromise the validity and fairness of a test. Studies have investigated the DIF effect in the context of cognitive diagnostic a...
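The generalized method builds on the classical logistic-regression DIF test, in which an item response is regressed on a matching variable with and without group terms and the model fits are compared. A minimal uniform-DIF sketch along those lines, on simulated toy data (the article's actual method handles multiple groups and cognitive diagnostic tests):

```python
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Fit a logistic regression by Newton-Raphson and return the
    maximized log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                      # IRLS weights
        H = X.T @ (X * W[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Simulated data (illustrative only): matching score plus a group flag
rng = np.random.default_rng(0)
n = 500
score = rng.normal(size=n)
group = rng.integers(0, 2, size=n).astype(float)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(score + 0.8 * group)))).astype(float)

ll_null = fit_logistic(score[:, None], y)                   # matching variable only
ll_dif = fit_logistic(np.column_stack([score, group]), y)   # + group term
lr_stat = 2.0 * (ll_dif - ll_null)  # ~ chi-square(1) under no uniform DIF
```

A large likelihood-ratio statistic flags the item for uniform DIF; adding a score-by-group interaction term extends the same comparison to nonuniform DIF.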
-
Enhancing Computerized Adaptive Testing with Batteries of Unidimensional Tests Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-24 Pasquale Anselmi, Egidio Robusto, Francesca Cristante
The article presents a new computerized adaptive testing (CAT) procedure for use with batteries of unidimensional tests. At each step of testing, the estimate of a certain ability is updated on the...
-
A Testlet Diagnostic Classification Model with Attribute Hierarchies Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-21 Wenchao Ma, Chun Wang, Jiaying Xiao
In this article, a testlet hierarchical diagnostic classification model (TH-DCM) was introduced to take both attribute hierarchies and item bundles into account. The expectation-maximization algori...
-
Confidence Screening Detector: A New Method for Detecting Test Collusion Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-20 Yongze Xu, Ying Cui, Xinyi Wang, Meiwei Huang, Fang Luo
Test collusion (TC) is a form of cheating in which examinees operate in groups to alter normal item responses. TC is becoming increasingly common, especially within high-stakes, large-scale examin...
-
Online Parameter Estimation for Student Evaluation of Teaching Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-19 Chia-Wen Chen, Chen-Wei Liu
Student evaluation of teaching (SET) assesses students’ experiences in a class to evaluate teachers’ performance. SET essentially comprises three facets: teaching proficiency, student rati...
-
The Impact of Item Model Parameter Variations on Person Parameter Estimation in Computerized Adaptive Testing With Automatically Generated Items Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-17 Chen Tian, Jaehwa Choi
Sibling items developed through automatic item generation share similar but not identical psychometric properties. However, considering sibling item variations may bring huge computation difficulti...
-
A Mixed Sequential IRT Model for Mixed-Format Items Applied Psychological Measurement (IF 1.0) Pub Date : 2023-03-17 Junhuan Wei, Yan Cai, Dongbo Tu
To provide more insight into an individual’s response process and cognitive process, this study proposed three mixed sequential item response models (MS-IRMs) for mixed-format items consisting of a...
-
Heywood Cases in Unidimensional Factor Models and Item Response Models for Binary Data Applied Psychological Measurement (IF 1.0) Pub Date : 2023-01-29 Selena Wang, Paul De Boeck, Marcel Yotebieng
Heywood cases are known from the linear factor analysis literature as variables with communalities larger than 1.00, and in present-day factor models, the problem also shows in negative residual varian...
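In a standardized one-factor solution, a Heywood case shows up exactly as described: a communality (squared loading) above 1.00, or equivalently a negative residual variance. A minimal check with illustrative loadings:

```python
import numpy as np

# Illustrative standardized loadings from a one-factor solution;
# the third item is constructed to be a Heywood case.
loadings = np.array([0.62, 0.75, 1.04, 0.58])

communalities = loadings ** 2          # variance explained by the factor
residual_var = 1.0 - communalities     # negative value => Heywood case
heywood = residual_var < 0
# heywood.tolist() -> [False, False, True, False]
```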
-
A Likelihood Approach to Item Response Theory Equating of Multiple Forms Applied Psychological Measurement (IF 1.0) Pub Date : 2023-01-24 Michela Battauz, Waldir Leôncio
Test equating is a statistical procedure to make scores from different test forms comparable and interchangeable. Focusing on an IRT approach, this paper proposes a novel method that simultaneously...
-
A New Approach to Desirable Responding: Multidimensional Item Response Model of Overclaiming Data Applied Psychological Measurement (IF 1.0) Pub Date : 2023-01-19 Kuan-Yu Jin, Delroy L. Paulhus, Ching-Lin Shih
A variety of approaches have been presented for assessing desirable responding in self-report measures. Among them, the overclaiming technique asks respondents to rate their familiarity with a larg...
-
A Comparison of Confirmatory Factor Analysis and Network Models for Measurement Invariance Assessment When Indicator Residuals are Correlated Applied Psychological Measurement (IF 1.0) Pub Date : 2023-01-14 W. Holmes Finch, Brian F. French, Alicia Hazelwood
Social science research is heavily dependent on the use of standardized assessments of a variety of phenomena, such as mood, executive functioning, and cognitive ability. An important assumption wh...
-
The Effects of Rating Designs on Rater Classification Accuracy and Rater Measurement Precision in Large-Scale Mixed-Format Assessments Applied Psychological Measurement (IF 1.0) Pub Date : 2023-01-12 Wenjing Guo, Stefanie A. Wind
In standalone performance assessments, researchers have explored the influence of different rating designs on the sensitivity of latent trait model indicators to different rater effects as well as ...
-
autoRasch: An R Package to Do Semi-Automated Rasch Analysis Applied Psychological Measurement (IF 1.0) Pub Date : 2022-10-10 Feri Wijayanto, Ioan Gabriel Bucur, Perry Groot, Tom Heskes
The R package autoRasch has been developed to perform a Rasch analysis in a (semi-)automated way. The automated part of the analysis is achieved by optimizing the so-called in-plus-out-of-questionn...
-
Evaluating Equating Transformations in IRT Observed-Score and Kernel Equating Methods Applied Psychological Measurement (IF 1.0) Pub Date : 2022-10-04 Waldir Leôncio, Marie Wiberg, Michela Battauz
Test equating is a statistical procedure to ensure that scores from different test forms can be used interchangeably. There are several methodologies available to perform equating, some of which ar...
-
Empirical Priors in Polytomous Computerized Adaptive Tests: Risks and Rewards in Clinical Settings Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-30 Niek Frans, Johan Braeken, Bernard P. Veldkamp, Muirne C. S. Paap
The use of empirical prior information about participants has been shown to substantially improve the efficiency of computerized adaptive tests (CATs) in educational settings. However, it is unclea...
-
Targeted Double Scoring of Performance Tasks Using a Decision-Theoretic Approach Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-23 Sandip Sinharay, Matthew S. Johnson, Wei Wang, Jing Miao
Targeted double scoring, or, double scoring of only some (but not all) responses, is used to reduce the burden of scoring performance tasks for several mastery tests (Finkelman, Darby, & Nering, 20...
-
An Investigation Into the Impact of Test Session Disruptions for At-Home Test Administrations Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-20 Katherine E. Castellano, Sandip Sinharay, Jiangang Hao, Chen Li
In response to the closures of test centers worldwide due to the COVID-19 pandemic, several testing programs offered large-scale standardized assessments to examinees remotely. However, due to the ...
-
Modified Item-Fit Indices for Dichotomous IRT Models with Missing Data Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-19 Xue Zhang, Chun Wang
Item-level fit analysis not only serves as a complementary check to global fit analysis but is also essential in scale development, because the fit results guide item revision and/or deletion (...
-
The Standardized S-X2 Statistic for Assessing Item Fit Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-17 Zhuangzhuang Han, Sandip Sinharay, Matthew S. Johnson, Xiang Liu
The S-X2 statistic (Orlando & Thissen, 2000) is popular among researchers and practitioners who are interested in the assessment of item fit. However, the statistic suffers from the Chernoff–Lehman...
-
Attenuation-Corrected Estimators of Reliability Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-15 Jari Metsämuuronen
The estimates of reliability are usually attenuated and deflated because the item–score correlation (ρgX, Rit) embedded in the most widely used estimators is affected by several sources of mechanic...
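The general idea of deattenuation goes back to Spearman's classical correction, which divides an observed correlation by the square root of the product of the two reliabilities; a sketch with illustrative numbers (the article's attenuation-corrected estimators are more refined than this):

```python
import math

def deattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation: the observed correlation
    divided by the square root of the product of the two reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Illustrative values: observed r = 0.42, reliabilities 0.80 and 0.70
r_true = deattenuate(r_xy=0.42, rel_x=0.80, rel_y=0.70)
# r_true is about 0.561
```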
-
Modeling Rapid Guessing Behaviors in Computer-Based Testlet Items Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-09 Kuan-Yu Jin, Chia-Ling Hsu, Ming Ming Chiu, Po-Hsi Chen
In traditional test models, test items are independent, and test-takers slowly and thoughtfully respond to each test item. However, some test items have a common stimulus (dependent test items in a...
-
Efficient Metropolis-Hastings Robbins-Monro Algorithm for High-Dimensional Diagnostic Classification Models Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-08 Chen-Wei Liu
The expectation-maximization (EM) algorithm is a commonly used technique for the parameter estimation of the diagnostic classification models (DCMs) with a prespecified Q-matrix; however, it requir...
-
Outlier Detection Using t-test in Rasch IRT Equating under NEAT Design Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-06 Chunyan Liu, Daniel Jurich
In equating practice, the existence of outliers in the anchor items may deteriorate the equating accuracy and threaten the validity of test scores. Therefore, stability of the anchor item performan...
-
Applying Negative Binomial Distribution in Diagnostic Classification Models for Analyzing Count Data Applied Psychological Measurement (IF 1.0) Pub Date : 2022-09-06 Ren Liu, Ihnwhi Heo, Haiyan Liu, Dexin Shi, Zhehan Jiang
Diagnostic classification models (DCMs) have been used to classify examinees into groups based on their possession status of a set of latent traits. In addition to traditional item-based scoring ap...
-
Item Selection With Collaborative Filtering in On-The-Fly Multistage Adaptive Testing Applied Psychological Measurement (IF 1.0) Pub Date : 2022-08-28 Jiaying Xiao, Okan Bulut
An important design feature in the implementation of both computerized adaptive testing and multistage adaptive testing is the use of an appropriate method for item selection. The item selection me...
-
Flexible Item Response Models for Count Data: The Count Thresholds Model Applied Psychological Measurement (IF 1.0) Pub Date : 2022-08-07 Gerhard Tutz
A new item response theory model for count data is introduced. In contrast to models in common use, it does not assume a fixed distribution for the responses as, for example, the Poisson count mode...
-
An Empirical Identification Issue of the Bifactor Item Response Theory Model Applied Psychological Measurement (IF 1.0) Pub Date : 2022-07-10 Wenya Chen, Ken A. Fujimoto
Using the bifactor item response theory model to analyze data arising from educational and psychological studies has gained popularity over the years. Unfortunately, using this model in practice comes with challenges. One such challenge is an empirical identification issue that is seldom discussed in the literature, and its impact on the estimates of the bifactor model’s parameters has not been demonstrated
-
Uncovering the Complexity of Item Position Effects in a Low-Stakes Testing Context Applied Psychological Measurement (IF 1.0) Pub Date : 2022-07-04 Thai Q. Ong, Dena A. Pastor
Previous researchers have only either adopted an item or examinee perspective to position effects, where they focused on exploring the relationships among position effects and item or examinee variables separately. Unlike previous researchers, we adopted an integrated perspective to position effects, where we focused on exploring the relationships among position effects, item variables, and examinee
-
Diagnostic Classification Models for a Mixture of Ordered and Non-ordered Response Options in Rating Scales Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-24 Ren Liu, Haiyan Liu, Dexin Shi, Zhehan Jiang
When developing ordinal rating scales, we may include potentially unordered response options such as “Neither Agree nor Disagree,” “Neutral,” “Don’t Know,” “No Opinion,” or “Hard to Say.” To handle responses to a mixture of ordered and unordered options, Huggins-Manley et al. (2018) proposed a class of semi-ordered models under the unidimensional item response theory framework. This study extends the
-
Two New Models for Item Preknowledge Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-22 Kylie Gorney, James A. Wollack
To evaluate preknowledge detection methods, researchers often conduct simulation studies in which they use models to generate the data. In this article, we propose two new models to represent item preknowledge. Contrary to existing models, we allow the impact of preknowledge to vary across persons and items in order to better represent situations that are encountered in practice. We use three real
-
Item-Fit Statistic Based on Posterior Probabilities of Membership in Ability Groups Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-20 Bartosz Kondratek
A novel approach to item-fit analysis based on an asymptotic test is proposed. The new test statistic, χ²w, compares pseudo-observed and expected item mean scores over a set of ability bins. The item mean scores are computed as weighted means with weights based on test-takers’ a posteriori density of ability within the bin. This article explores the properties of χ²w in case of dichotomously
-
Characterizing Sampling Variability for Item Response Theory Scale Scores in a Fixed-Parameter Calibrated Projection Design Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-20 Shuangshuang Xu, Yang Liu
A common practice of linking uses estimated item parameters to calculate projected scores. This procedure fails to account for the carry-over sampling variability. Neglecting sampling variability could consequently lead to understated uncertainty for Item Response Theory (IRT) scale scores. To address the issue, we apply a Multiple Imputation (MI) approach to adjust the Posterior Standard Deviations
-
Item Response Theory True Score Equating for the Bifactor Model Under the Common-Item Nonequivalent Groups Design Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-17 Kyung Yong Kim
Applying item response theory (IRT) true score equating to multidimensional IRT models is not straightforward due to the one-to-many relationship between a true score and latent variables. Under the common-item nonequivalent groups design, the purpose of the current study was to introduce two IRT true score equating procedures that adopted different dimension reduction strategies for the bifactor model
-
Termination Criteria for Grid Multiclassification Adaptive Testing With Multidimensional Polytomous Items Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-16 Zhuoran Wang, Chun Wang, David J Weiss
Adaptive classification testing (ACT) is a variation of computerized adaptive testing (CAT) that is developed to efficiently classify examinees into multiple groups based on predetermined cutoffs. In multidimensional multiclassification (i.e., more than two categories exist along each dimension), grid classification is proposed to classify each examinee into one of the grids encircled by cutoffs (lines/surfaces)
-
Investigating the Effect of Differential Rapid Guessing on Population Invariance in Equating Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-16 Jiayi Deng, Joseph A. Rios
Score equating is an essential tool in improving the fairness of test score interpretations when employing multiple test forms. To ensure that the equating functions used to connect scores from one form to another are valid, they must be invariant across different populations of examinees. Given that equating is used in many low-stakes testing programs, examinees’ test-taking effort should be considered
-
Evaluation of the Linear Composite Conjecture for Unidimensional IRT Scale for Multidimensional Responses Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-15 Tyler Strachan, Uk Hyun Cho, Terry Ackerman, Shyh-Huei Chen, Jimmy de la Torre, Edward H. Ip
The linear composite direction represents, theoretically, where the unidimensional scale would lie within a multidimensional latent space. Using compensatory multidimensional IRT, the linear composite can be derived from the structure of the items and the latent distribution. The purpose of this study was to evaluate the validity of the linear composite conjecture and examine how well a fitted unidimensional
-
Application of Sampling Variance of Item Response Theory Parameter Estimates in Detecting Outliers in Common Item Equating Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-15 Chunyan Liu, Daniel Jurich
In common item equating, the existence of item outliers may impact the accuracy of equating results and bring significant ramifications to the validity of test score interpretations. Therefore, common item equating should involve a screening process to flag outlying items and exclude them from the common item set before equating is conducted. The current simulation study demonstrated that the sampling
-
The Optimal Design of Bifactor Multidimensional Computerized Adaptive Testing with Mixed-format Items Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-14 Xiuzhen Mao, Jiahui Zhang, Tao Xin
Multidimensional computerized adaptive testing (MCAT) using mixed-format items holds great potential for the next-generation assessments. Two critical factors in the mixed-format test design (i.e., the order and proportion of polytomous items) and item selection were addressed in the context of mixed-format bifactor MCAT. For item selection, this article presents the derivation of the Fisher information
-
Multistage Testing in Heterogeneous Populations: Some Design and Implementation Considerations Applied Psychological Measurement (IF 1.0) Pub Date : 2022-06-13 Leslie Rutkowski, Yuan-Ling Liaw, Dubravka Svetina, David Rutkowski
A central challenge in international large-scale assessments is adequately measuring dozens of highly heterogeneous populations, many of which are low performers. To that end, multistage adaptive testing offers one possibility for better assessing across the achievement continuum. This study examines the way that several multistage test design and implementation choices can impact measurement performance
-
Dual-Objective Item Selection Methods in Computerized Adaptive Test Using the Higher-Order Cognitive Diagnostic Models Applied Psychological Measurement (IF 1.0) Pub Date : 2022-05-20 Chongqin Xi, Dongbo Tu, Yan Cai
To efficiently obtain information about both the general abilities and detailed cognitive profiles of examinees from a single model that uses a single-calibration process, higher-order cognitive diagnostic computerized adaptive testing (CD-CAT) that employs higher-order cognitive diagnostic models has been developed. However, the current item selection methods used in higher-order CD-CAT adaptively
-
Bayesian Item Response Theory Models With Flexible Generalized Logit Links Applied Psychological Measurement (IF 1.0) Pub Date : 2022-05-20 Jiwei Zhang, Ying-Ying Zhang, Jian Tao, Ming-Hui Chen
In educational and psychological research, the logit and probit links are often used to fit the binary item response data. The appropriateness and importance of the choice of links within the item response theory (IRT) framework has not been investigated yet. In this paper, we present a family of IRT models with generalized logit links, which include the traditional logistic and normal ogive models
-
glca: An R Package for Multiple-Group Latent Class Analysis Applied Psychological Measurement (IF 1.0) Pub Date : 2022-05-11 Youngsun Kim, Saebom Jeon, Chi Chang, Hwan Chung
Group similarities and differences may manifest themselves in a variety of ways in multiple-group latent class analysis (LCA). Sometimes, measurement models are identical across groups in LCA. In other situations, the measurement models may differ, suggesting that the latent structure itself is different between groups. Tests of measurement invariance shed light on this distinction. We created an R
-
Factor Retention Using Machine Learning With Ordinal Data Applied Psychological Measurement (IF 1.0) Pub Date : 2022-05-04 David Goretzko, Markus Bühner
Determining the number of factors in exploratory factor analysis is probably the most crucial decision when conducting the analysis as it clearly influences the meaningfulness of the results (i.e., factorial validity). A new method called the Factor Forest that combines data simulation and machine learning has been developed recently. This method based on simulated data reached very high accuracy for
-
Detecting Examinees With Item Preknowledge on Real Data Applied Psychological Measurement (IF 1.0) Pub Date : 2022-04-21 Dmitry I. Belov, Sarah L. Toton
Recently, Belov & Wollack (2021) developed a method for detecting groups of colluding examinees as cliques in a graph. The objective of this article is to study how the performance of their method on real data with item preknowledge (IP) depends on the mechanism of edge formation governed by a response similarity index (RSI). This study resulted in the development of three new RSIs and demonstrated
-
Measurement of Ability in Adaptive Learning and Assessment Systems when Learners Use On-Demand Hints Applied Psychological Measurement (IF 1.0) Pub Date : 2022-04-18 Maria Bolsinova, Benjamin Deonovic, Meirav Arieli-Attali, Burr Settles, Masato Hagiwara, Gunter Maris
Adaptive learning and assessment systems support learners in acquiring knowledge and skills in a particular domain. The learners’ progress is monitored through them solving items matching their level and aiming at specific learning goals. Scaffolding and providing learners with hints are powerful tools in helping the learning process. One way of introducing hints is to make hint use the choice of the
-
Combining Cognitive Diagnostic Computerized Adaptive Testing With Multidimensional Item Response Theory Applied Psychological Measurement (IF 1.0) Pub Date : 2022-04-18 Hao Luo, Daxun Wang, Zhiming Guo, Yan Cai, Dongbo Tu
The new generation of tests focuses not only on general ability but also on the process of finer-grained skills. Guided by this idea, researchers have developed a dual-purpose CD-CAT (Dual-CAT). In existing Dual-CATs, the models used for overall ability estimation are unidimensional IRT models, which cannot accommodate multidimensional tests. This article intends to develop a multidimensional
-
The Potential for Interpretational Confounding in Cognitive Diagnosis Models Applied Psychological Measurement (IF 1.0) Pub Date : 2022-04-15 Qi (Helen) Huang, Daniel M. Bolt
Binary examinee mastery/nonmastery classifications in cognitive diagnosis models may often be an approximation to proficiencies that are better regarded as continuous. Such misspecification can lead to inconsistencies in the operational definition of “mastery” when binary skills models are assumed. In this paper we demonstrate the potential for an interpretational confounding of the latent skills when