-
A Psychometric Framework for Evaluating Fairness in Algorithmic Decision Making: Differential Algorithmic Functioning J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-05-10 Youmi Suk, Kyung T. Han
As algorithmic decision making is increasingly deployed in every walk of life, many researchers have raised concerns about fairness-related bias from such algorithms. But there is little research o...
-
Modeling Item-Level Heterogeneous Treatment Effects With the Explanatory Item Response Model: Leveraging Large-Scale Online Assessments to Pinpoint the Impact of Educational Interventions J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-05-09 Joshua B. Gilbert, James S. Kim, Luke W. Miratrix
Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard sta...
-
Model Misspecification and Robustness of Observed-Score Test Equating Using Propensity Scores J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-05-09 Gabriel Wallin, Marie Wiberg
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for l...
-
Chance-Constrained Automated Test Assembly J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-05-09 Giada Spaccapanico Proietti, Mariagiulia Matteucci, Stefania Mignani, Bernard P. Veldkamp
Classical automated test assembly (ATA) methods assume fixed and known coefficients for the constraints and the objective function. This hypothesis is not true for the estimates of item response th...
-
Cognitive Diagnosis Testlet Model for Multiple-Choice Items J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-05-09 Lei Guo, Wenjie Zhou, Xiao Li
The testlet design is very popular in educational and psychological assessments. This article proposes a new cognitive diagnosis model, the multiple-choice cognitive diagnostic testlet (MC-CDT) mod...
-
Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-25 Qianru Liang, Jimmy de la Torre, Nancy Law
To expand the use of cognitive diagnosis models (CDMs) to longitudinal assessments, this study proposes a bias-corrected three-step estimation approach for latent transition CDMs with covariates by...
-
A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-25 Youmi Suk
Machine learning (ML) methods for causal inference have gained popularity due to their flexibility to predict the outcome model and the propensity score. In this article, we provide a within-group ...
-
Diagnosing Primary Students’ Reading Progression: Is Cognitive Diagnostic Computerized Adaptive Testing the Way Forward? J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-20 Yan Li, Chao Huang, Jia Liu
Cognitive diagnostic computerized adaptive testing (CD-CAT) is a cutting-edge technology in educational measurement that targets at providing feedback on examinees’ strengths and weaknesses while i...
-
Using Item Scores and Distractors to Detect Item Compromise and Preknowledge J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-20 Kylie Gorney, James A. Wollack, Sandip Sinharay, Carol Eckerly
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high dem...
-
Is It Who You Are or Where You Are? Accounting for Compositional Differences in Cross-Site Treatment Effect Variation J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-10 Benjamin Lu, Eli Ben-Michael, Avi Feller, Luke Miratrix
In multisite trials, learning about treatment effect variation across sites is critical for understanding where and for whom a program works. Unadjusted comparisons, however, capture “compositional...
-
An Explicit Form With Continuous Attribute Profile of the Partial Mastery DINA Model J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-10 Tian Shu, Guanzhong Luo, Zhaosheng Luo, Xiaofeng Yu, Xiaojun Guo, Yujun Li
Cognitive diagnosis models (CDMs) are the statistical framework for cognitive diagnostic assessment in education and psychology. They generally assume that subjects’ latent attributes are dichotomo...
-
A Diagnostic Tree Model for Adaptive Assessment of Complex Cognitive Processes Using Multidimensional Response Options J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-04-05 Mark L. Davison, David J. Weiss, Joseph N. DeWeese, Ozge Ersan, Gina Biancarosa, Patrick C. Kennedy
A tree model for diagnostic educational testing is described along with Monte Carlo simulations designed to evaluate measurement accuracy based on the model. The model is implemented in an assessme...
-
Optimizing Diagnostic Classification Models Application Considering Real-Life Constraints J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-03-30 Kun Su, Robert A. Henson
This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for construc...
-
Finding the Right Grain-Size for Measurement in the Classroom J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-03-30 Mark Wilson
This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standa...
-
The Restricted DINA Model: A Comprehensive Cognitive Diagnostic Model for Classroom-Level Assessments J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-03-30 Pablo Nájera, Francisco J. Abad, Chia-Yi Chiu, Miguel A. Sorrel
The nonparametric classification (NPC) method has been proven to be a suitable procedure for cognitive diagnostic assessments at a classroom level. However, its nonparametric nature impedes the obt...
-
Expertise on Offer: Why Isn’t Anyone Buying? J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-03-29 Henry Braun
It is a much-lamented fact that research with the potential to inform or influence education policy instead remains policy inert. There are many reasons for this frustrating state of affairs, inclu...
-
Detecting Item Preknowledge Using Revisits With Speed and Accuracy J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-02-28 Onur Demirkaya, Ummugul Bezirhan, Jinming Zhang
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research ...
-
Handling Missing Data in Cross-Classified Multilevel Analyses: An Evaluation of Different Multiple Imputation Approaches J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-02-16 Simon Grund, Oliver Lüdtke, Alexander Robitzsch
Multiple imputation (MI) is a popular method for handling missing data. In education research, it can be challenging to use MI because the data often have a clustered structure that need to be acco...
-
A Causal Latent Transition Model With Multivariate Outcomes and Unobserved Heterogeneity: Application to Human Capital Development J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-02-09 Francesco Bartolucci, Fulvia Pennoni, Giorgio Vittadini
In order to evaluate the effect of a policy or treatment with pre- and post-treatment outcomes, we propose an approach based on a transition model, which may be applied with multivariate outcomes a...
-
Assessing Inter-rater Reliability With Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-02-09 Patrícia Martinková, František Bartoš, Marek Brabec
Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater’s or ratee’s gender, major, or experience....
-
Handling Missing Data in Growth Mixture Models J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-02-08 Daniel Y. Lee, Jeffrey R. Harring
A Monte Carlo simulation was performed to compare methods for handling missing data in growth mixture models. The methods considered in the current study were (a) a fully Bayesian approach using a ...
-
Clinical (In)Efficiency in the Prediction of Dangerous Behavior J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-01-11 Ehsan Bokhari
The prediction of dangerous and/or violent behavior is particularly important to the conduct of the U.S. criminal justice system when it makes decisions about restrictions of personal freedom, such...
-
A Randomization P-Value Test for Detecting Copying on Multiple-Choice Exams J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2023-01-09 Joseph B. Lang
This article is concerned with the statistical detection of copying on multiple-choice exams. As an alternative to existing permutation- and model-based copy-detection approaches, a simple randomiz...
-
A Case Study of Nonresponse Bias Analysis in Educational Assessment Surveys J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-12-15 Yajuan Si, Roderick J. A. Little, Ya Mo, Nell Sedransk
Nonresponse bias is a widely prevalent problem for data on education. We develop a ten-step exemplar to guide nonresponse bias analysis (NRBA) in cross-sectional studies and apply these steps to th...
-
Computational Strategies and Estimation Performance With Bayesian Semiparametric Item Response Theory Models J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-12-08 Sally Paganin, Christopher J. Paciorek, Claudia Wehrhahn, Abel Rodríguez, Sophia Rabe-Hesketh, Perry de Valpine
Item response theory (IRT) models typically rely on a normality assumption for subject-specific latent traits, which is often unrealistic in practice. Semiparametric extensions based on Dirichlet p...
-
Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-11-27 Yu Wang, Chia-Yi Chiu, Hans Friedrich Köhn
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The eff...
-
Breaking Our Silence on Factor Score Indeterminacy J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-11-07 Niels G. Waller
Although many textbooks on multivariate statistics discuss the common factor analysis model, few of these books mention the problem of factor score indeterminacy (FSI). Thus, many students and cont...
-
Deep Reinforcement Learning for Adaptive Learning Systems J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-11-03 Xiao Li, Hanchen Xu, Jinming Zhang, Hua-hua Chang
The adaptive learning problem concerns how to create an individualized learning plan (also referred to as a learning policy) that chooses the most appropriate learning materials based on a learner’...
-
Power Approximations for Overall Average Effects in Meta-Analysis With Dependent Effect Sizes J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-10-17 Mikkel Helding Vembye, James Eric Pustejovsky, Therese Deocampo Pigott
Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power app...
-
Commentary on “Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces” J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-10-06 Ziwei Zhang, Corissa T. Rohloff, Nidhi Kohli
To model growth over time, statistical techniques are available in both structural equation modeling (SEM) and random effects modeling frameworks. Liu et al. proposed a transformation and an invers...
-
Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-10-03 Qingrong Tan, Yan Cai, Fen Luo, Dongbo Tu
To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application...
-
Estimating Heterogeneous Treatment Effects Within Latent Class Multilevel Models: A Bayesian Approach J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-08-17 Weicong Lyu, Jee-Seon Kim, Youmi Suk
This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and ...
-
A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-08-17 Harold Doran
This article is concerned with a subset of numerically stable and scalable algorithms useful to support computationally complex psychometric models in the era of machine learning and massive data. ...
-
Cognitive Diagnosis Modeling Incorporating Response Times and Fixation Counts: Providing Comprehensive Feedback and Accurate Diagnosis J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-07-28 Peida Zhan, Kaiwen Man, Stefanie A. Wind, Jonathan Malone
Respondents’ problem-solving behaviors comprise behaviors that represent complicated cognitive processes that are frequently systematically tied to one another. Biometric data, such as visual fixat...
-
Testing Differential Item Functioning Without Predefined Anchor Items Using Robust Regression J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-07-18 Weimeng Wang, Yang Liu, Hongyun Liu
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize...
-
Zero and One Inflated Item Response Theory Models for Bounded Continuous Data J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-07-15 Dylan Molenaar, Mariana Cúri, Jorge L. Bazán
Bounded continuous data are encountered in many applications of item response theory, including the measurement of mood, personality, and response times and in the analyses of summed item scores. A...
-
Forced-Choice Ranking Models for Raters’ Ranking Data J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-07-07 Su-Pin Hung, Hung-Yu Huang
To address response style or bias in rating scales, forced-choice items are often used to request that respondents rank their attitudes or preferences among a limited set of options. The rating sca...
-
Pooling Interactions Into Error Terms in Multisite Experiments J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-07-04 Wendy Chan, Larry Vernon Hedges
Multisite field experiments using the (generalized) randomized block design that assign treatments to individuals within sites are common in education and the social sciences. Under this design, th...
-
Improving Accuracy and Stability of Aggregate Student Growth Measures Using Empirical Best Linear Prediction J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-06-27 J. R. Lockwood, Katherine E. Castellano, Daniel F. McCaffrey
Many states and school districts in the United States use standardized test scores to compute annual measures of student achievement progress and then use school-level averages of these growth meas...
-
Speed–Accuracy Trade-Off? Not So Fast: Marginal Changes in Speed Have Inconsistent Relationships With Accuracy in Real-World Settings J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-06-08 Benjamin W. Domingue, Klint Kanopka, Ben Stenhaug, Michael J. Sulik, Tanesia Beverly, Matthieu Brinkhuis, Ruhan Circi, Jessica Faul, Dandan Liao, Bruce McCandliss, Jelena Obradović, Chris Piech, Tenelle Porter, Project iLEAD Consortium, James Soland, Jon Weeks, Steven L. Wise, Jason Yeatman
The speed–accuracy trade-off (SAT) suggests that time constraints reduce response accuracy. Its relevance in observational settings—where response time (RT) may not be constrained but respondent sp...
-
Jenss–Bayley Latent Change Score Model With Individual Ratio of the Growth Acceleration in the Framework of Individual Measurement Occasions J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-06-06 Jin Liu
Longitudinal data analysis has been widely employed to examine between-individual differences in within-individual changes. One challenge of such analyses is that the rate-of-change is only availab...
-
Two Statistical Tests for the Detection of Item Compromise J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-05-11 Wim J. van der Linden
Two independent statistical tests of item compromise are presented, one based on the test takers’ responses and the other on their response times (RTs) on the same items. The tests can be used to monitor an item in real time during online continuous testing but are also applicable as part of post hoc forensic analysis. The two test statistics are simple intuitive quantities as the sum of the responses
-
Statistical Inference for G-indices of Agreement J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-04-29 Douglas G. Bonett
The limitations of Cohen’s κ are reviewed and an alternative G-index is recommended for assessing nominal-scale agreement. Maximum likelihood estimates, standard errors, and confidence intervals for a two-rater G-index are derived for one-group and two-group designs. A new G-index of agreement for multirater designs is proposed. Statistical inference methods for some important special cases of the
-
A Critical View on the NEAT Equating Design: Statistical Modeling and Identifiability Problems J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-04-29 Ernesto San Martín, Jorge González
The nonequivalent groups with anchor test (NEAT) design is widely used in test equating. Under this design, two groups of examinees are administered different test forms with each test form containing a subset of common items. Because test takers from different groups are assigned only one test form, missing score data emerge by design rendering some of the score distributions unavailable. The partially
-
Regression Discontinuity Designs With an Ordinal Running Variable: Evaluating the Effects of Extended Time Accommodations for English-Language Learners J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-04-27 Youmi Suk, Peter M. Steiner, Jee-Seon Kim, Hyunseung Kang
Regression discontinuity (RD) designs are commonly used for program evaluation with continuous treatment assignment variables. But in practice, treatment assignment is frequently based on ordinal variables. In this study, we propose an RD design with an ordinal running variable to assess the effects of extended time accommodations (ETA) for English-language learners (ELLs). ETA eligibility is determined
-
Statistical Power for Estimating Treatment Effects Using Difference-in-Differences and Comparative Interrupted Time Series Estimators With Variation in Treatment Timing J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-02-08 Peter Z. Schochet
This article develops new closed-form variance expressions for power analyses for commonly used difference-in-differences (DID) and comparative interrupted time series (CITS) panel data estimators. The main contribution is to incorporate variation in treatment timing into the analysis. The power formulas also account for other key design features that arise in practice: autocorrelated errors, unequal
-
What Is Actually Equated in “Test Equating”? A Didactic Note J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2022-02-08 Wim J. van der Linden
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. The definition is in contrast with Lord’s foundational paper which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions of scale and score is not trivial. The difference is
-
Analyzing Longitudinal Social Relations Model Data Using the Social Relations Structural Equation Model J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-12-14 Steffen Nestler, Oliver Lüdtke, Alexander Robitzsch
The social relations model (SRM) is very often used in psychology to examine the components, determinants, and consequences of interpersonal judgments and behaviors that arise in social groups. The standard SRM was developed to analyze cross-sectional data. Based on a recently suggested integration of the SRM with structural equation models (SEM) framework, we show here how longitudinal SRM data can
-
Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-12-13 Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example,
-
A New Multiprocess IRT Model With Ideal Points for Likert-Type Items J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-12-09 Kuan-Yu Jin, Yi-Jhen Wu, Hui-Fang Chen
For surveys of complex issues that entail multiple steps, multiple reference points, and nongradient attributes (e.g., social inequality), this study proposes a new multiprocess model that integrates ideal-point and dominance approaches into a treelike structure (IDtree). In the IDtree, an ideal-point approach describes an individual’s attitude and then a dominance approach describes their tendency
-
Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-12-01 Jin Liu, Robert A. Perera, Le Kang, Roy T. Sabo, Robert M. Kirkpatrick
This study proposes transformation functions and matrices between coefficients in the original and reparameterized parameter spaces for an existing linear-linear piecewise model to derive the interpretable coefficients directly related to the underlying change pattern. Additionally, the study extends the existing model to allow individual measurement occasions and investigates predictors for individual
-
On the Generalized S−X² Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-10-25 Jochen Ranger, Kay Brauer
The generalized S−X² test is a test of item fit for items with polytomous responses format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S−X² test depends on how sparse cells are pooled. We propose alternative implementations
-
Reporting Proficiency Levels for Examinees With Incomplete Data J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-10-24 Sandip Sinharay
Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on these tests. The reporting of proficiency levels to the
-
Comparison of Within- and Between-Series Effect Estimates in the Meta-Analysis of Multiple Baseline Studies J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-08-05 Seang-Hwane Joo, Yan Wang, John Ferron, S. Natasha Beretvas, Mariola Moeyaert, Wim Van Den Noortgate
Multiple baseline (MB) designs are becoming more prevalent in educational and behavioral research, and as they do, there is growing interest in combining effect size estimates across studies. To further refine the meta-analytic methods of estimating the effect, this study developed and compared eight alternative methods of estimating intervention effects from a set of MB studies. The methods differed
-
Block What You Can, Except When You Shouldn’t J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-07-07 Nicole E. Pashley, Luke W. Miratrix
Several branches of the potential outcome causal inference literature have discussed the merits of blocking versus complete randomization. Some have concluded it can never hurt the precision of estimates, and some have concluded it can hurt. In this article, we reconcile these apparently conflicting views, give a more thorough discussion of what guarantees no harm, and discuss how other aspects of
-
Mean Comparisons of Many Groups in the Presence of DIF: An Evaluation of Linking and Concurrent Scaling Approaches J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-06-08 Alexander Robitzsch, Oliver Lüdtke
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance, concurrent scaling based on partial invariance using the RMSD
-
Analyzing Cross-Sectionally Clustered Data Using Generalized Estimating Equations J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-06-04 Francis L. Huang
The presence of clustered data is common in the sociobehavioral sciences. One approach that specifically deals with clustered data but has seen little use in education is the generalized estimating equations (GEEs) approach. We provide a background on GEEs, discuss why it is appropriate for the analysis of clustered data, and provide worked examples using both continuous and binary outcomes. Comparisons
-
Using Sequence Mining Techniques for Understanding Incorrect Behavioral Patterns on Interactive Tasks J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-05-03 Esther Ulitzsch, Qiwei He, Steffi Pohl
Interactive tasks designed to elicit real-life problem-solving behavior are rapidly becoming more widely used in educational assessment. Incorrect responses to such tasks can occur for a variety of different reasons such as low proficiency levels, low metacognitive strategies, or motivational issues. We demonstrate how behavioral patterns associated with incorrect responses can, in part, be understood
-
A Rating Scale Mixture Model to Account for the Tendency to Middle and Extreme Categories J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-03-31 Roberto Colombi, Sabrina Giordano, Gerhard Tutz
A mixture of logit models is proposed that discriminates between responses to rating questions that are affected by a tendency to prefer middle or extremes of the scale regardless of the content of the item (response styles) and purely content-driven preferences. Explanatory variables are used to characterize the content-driven way of answering as well as the tendency to middle or extreme categories
-
Item Characteristic Curve Asymmetry: A Better Way to Accommodate Slips and Guesses Than a Four-Parameter Model? J. Educ. Behav. Stat. (IF 2.116) Pub Date : 2021-03-29 Xiangyi Liao, Daniel M. Bolt
Four-parameter models have received increasing psychometric attention in recent years, as a reduced upper asymptote for item characteristic curves can be appealing for measurement applications such as adaptive testing and person-fit assessment. However, applications can be challenging due to the large number of parameters in the model. In this article, we demonstrate in the context of mathematics assessments