Abstract
Person fit statistics are frequently used to detect aberrant behavior when assuming an item response model generated the data. A common statistic, \(l_z\), has been shown in previous studies to perform well under a myriad of conditions. However, it is well known that \(l_z\) does not follow a standard normal distribution when an estimated latent trait is used. As a result, corrections of \(l_z\), called \(l_z^*\), have been proposed in the literature for specific item response models. We propose a more general correction that is applicable to many types of data, namely surveys or tests with multiple item types and underlying latent constructs, and that subsumes previous work by others. In addition, we provide corrections for multiple estimators of \(\theta \), the latent trait, including MLE, MAP and WLE. We provide analytical derivations that justify our proposed correction, as well as simulation studies that examine its performance with finite test lengths. An applied example is also provided as a proof of concept. We conclude with recommendations for practitioners on the conditions under which the asymptotic correction works well, and with future directions.
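As a minimal sketch of the uncorrected statistic, the block below computes \(l_z\) for dichotomous items under a two-parameter logistic model with a known \(\theta \), using the standardization of Drasgow, Levine, and Williams (1985). The function names, item parameters, and response patterns are hypothetical and chosen only for illustration; the block does not implement the \(l_z^*\) correction proposed in this article.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a keyed response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lz(responses, theta, a_params, b_params):
    """Standardized log-likelihood person fit statistic for 0/1 items."""
    l0 = expected = variance = 0.0
    for x, a, b in zip(responses, a_params, b_params):
        p = p_2pl(theta, a, b)
        q = 1.0 - p
        # Observed log-likelihood contribution of this item
        l0 += x * math.log(p) + (1 - x) * math.log(q)
        # Its expectation and variance under the model at theta
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    return (l0 - expected) / math.sqrt(variance)


# Hypothetical item parameters and two response patterns at theta = 0
a_params = [1.0, 1.2, 0.8, 1.5, 1.0, 0.9]
b_params = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
consistent = [1, 1, 1, 0, 0, 0]  # easy items endorsed, hard items not
aberrant = [0, 0, 0, 1, 1, 1]    # reversed pattern

lz_consistent = lz(consistent, 0.0, a_params, b_params)
lz_aberrant = lz(aberrant, 0.0, a_params, b_params)
print(lz_consistent, lz_aberrant)
```

Large negative values flag response patterns that are unlikely under the model. The sketch treats \(\theta \) as known; replacing it with an estimate is exactly the situation in which the standard normal reference distribution no longer holds.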
References
Albers, C. J., Meijer, R. R., & Tendeiro, J. N. (2016). Derivation and applicability of asymptotic results for multiple subtests person-fit statistics. Applied Psychological Measurement, 40(4), 274–288. https://doi.org/10.1177/0146621615622832.
Baer, R. A., Ballenger, J., Berru, D., & Wetter, M. W. (1997). Detection of random responding on the MMPI-A. Journal of Personality Assessment, 68(1), 139–151.
Bedrick, E. J. (1997). Approximating the conditional distribution of person fit indexes for checking the Rasch model. Psychometrika, 62, 191–199. https://doi.org/10.1007/BF02295274.
Berry, D. T. R., Wetter, M. W., Baer, R. A., Larsen, L., Clark, C., & Monroe, K. (1992). MMPI-2 random responding indices: Validation using a self-report methodology. Psychological Assessment, 4(3), 340–345. https://doi.org/10.1037/1040-3590.4.3.340.
Bhattacharya, R., Lin, L., & Victor, P. (2016). A course in mathematical statistics and large sample theory. Berlin: Springer.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. Novick (Eds.), Statistical theories of mental test scores (pp. 397–472).
Casella, G. & Berger, R. (2001). Statistical Inference (No. 141). https://doi.org/10.1057/pt.2010.23
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6). https://doi.org/10.18637/jss.v048.i06.
Cheng, Y., & Yuan, K. H. (2010). The impact of fallible item parameter estimates on latent trait recovery. Psychometrika, 75, 280–291. https://doi.org/10.1007/s11336-009-9144-x.
Conijn, J. M., Emons, W. H., & Sijtsma, K. (2014). Statistic lz-based person-fit methods for noncognitive multiscale measures. Applied Psychological Measurement, 38(2), 122–136. https://doi.org/10.1177/0146621613497568.
Conrad, K. J., Bezruczko, N., Chan, Y. F., Riley, B., Diamond, G., & Dennis, M. L. (2010). Screening for atypical suicide risk with person fit statistics among people presenting to alcohol and other drug treatment. Drug and Alcohol Dependence, 106, 92–100. https://doi.org/10.1016/j.drugalcdep.2009.07.023.
Drasgow, F., Levine, M. V., & McLaughlin, M. E. (1991). Appropriateness measurement for some multidimensional test batteries. Applied Psychological Measurement, 15(2), 171–191. https://doi.org/10.1177/014662169101500207.
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38(1), 67–86. https://doi.org/10.1111/j.2044-8317.1985.tb00817.x.
Goldberg, L. R. (1992). The development of markers for the big-five factor structure. Psychological Assessment, 4(1), 26–42. https://doi.org/10.1037/1040-3590.4.1.26.
Goldberg, L. R., & Kilkowski, J. M. (1985). The prediction of semantic consistency in self-descriptions. Characteristics of persons and of terms that affect the consistency of responses to synonym and antonym pairs. Journal of Personality and Social Psychology, 48(1), 82–98. https://doi.org/10.1037/0022-3514.48.1.82.
Hong, M., Steedle, J. T., & Cheng, Y. (2019). Methods of detecting insufficient effort responding: Comparisons and practical recommendations. Educational and Psychological Measurement, 80(2), 312–345. https://doi.org/10.1177/0013164419865316.
Jeon, M., & De Boeck, P. (2019). Evaluation on types of invariance in studying extreme response bias with an IRTree approach. British Journal of Mathematical and Statistical Psychology, 72(3), 517–537. https://doi.org/10.1111/bmsp.12182.
Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16(4), 277–298. https://doi.org/10.1207/S15324818AME1604.
Magis, D., Raîche, G., & Béland, S. (2012). A didactic presentation of Snijders’s \(l_z^*\) index of person fit with emphasis on response model selection and ability estimation. Journal of Educational and Behavioral Statistics, 37(1), 57–81. https://doi.org/10.3102/1076998610396894.
Magnus, J., & Neudecker, H. (1988). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.
Meijer, R. R. (1996). Person-fit research: An introduction. Applied Measurement in Education, 9(1), 1–2.
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25(2), 107–135.
Molenaar, I. W., & Hoijtink, H. (1990). The many null distributions of person fit indices. Psychometrika, 55, 75–106. https://doi.org/10.1007/BF02294745.
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in web-based questionnaires: Which method to use? Journal of Research in Personality, 63, 1–11. https://doi.org/10.1016/j.jrp.2016.04.010.
Reckase, M. (2009). Multidimensional item response theory. https://doi.org/10.1007/978-0-387-89976-3.
Reise, S. P. (1990). A comparison of item- and person-fit methods of assessing model-data fit in IRT. Applied Psychological Measurement, 14(2), 127–137. https://doi.org/10.1177/014662169001400202.
Rizopoulos, D. (2006). ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software, 17(5). https://doi.org/10.18637/jss.v017.i05.
Rupp, A. A. (2013). A systematic review of the methodology for person fit research in item response theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55(1), 3–38.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Vol. 35) (No. 1). https://doi.org/10.1007/BF02290599
Shao, C., Li, J., & Cheng, Y. (2016). Detection of test speededness using change-point analysis. Psychometrika, 81(4), 1118–1141. https://doi.org/10.1007/s11336-015-9476-7.
Sinharay, S. (2016a). Asymptotically correct standardization of person-fit statistics beyond dichotomous items. Psychometrika, 81, 992–1013. https://doi.org/10.1007/s11336-015-9465-x.
Sinharay, S. (2016b). Some remarks on applications of tests for detecting a change point to psychometric problems. Psychometrika, 82, 1–13. https://doi.org/10.1007/s11336-016-9531-z.
Snijders, T. A. B. (2001). Asymptotic null distribution of person fit statistics with estimated person parameter. Psychometrika, 66(3), 331–342. https://doi.org/10.1007/BF02294437.
Tendeiro, J. N. (2017). The lz(p)* person-fit statistic in an unfolding model context. Applied Psychological Measurement, 41(1), 44–59. https://doi.org/10.1177/0146621616669336.
Tendeiro, J. N., Meijer, R. R., & Niessen, A. S. M. (2016). PerFit: An R package for person-fit analysis in IRT. Journal of Statistical Software, 74(5), 1–27. https://doi.org/10.18637/jss.v074.i05.
von Davier, M., & Molenaar, I. W. (2003). A person-fit index for polytomous Rasch models, latent class models, and their mixture generalizations. Psychometrika, 68, 213–228. https://doi.org/10.1007/BF02294798.
Wang, C. (2015). On latent trait estimation in multidimensional compensatory item response models. Psychometrika, 80, 428–449. https://doi.org/10.1007/s11336-013-9399-0.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450. https://doi.org/10.1007/BF02294627.
Yu, X., & Cheng, Y. (2019). A change-point analysis procedure based on weighted residuals to detect back random responding. Psychological Methods, 5, 658–674. https://doi.org/10.1037/met0000212.
Zhang, J., & Stout, W. (1999). The theoretical detect index of dimensionality and its application to approximate simple structure. Psychometrika, 64, 213–249. https://doi.org/10.1007/BF02294536.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is under review in Psychometrika.
Ying Cheng is supported by the National Science Foundation Grant SES-1853166. The contribution of Lizhen Lin was supported by NSF grant DMS Career 1654579.
Appendix A
1.1 Formulas for Different Estimators of \(\varvec{\theta }\): \(\varvec{\theta }_{MAP}\) and \(\varvec{\theta }_{WLE}\)
The following section is based on work done by Sinharay (2016a) and Wang (2015). Suppose \(\varvec{\theta }\) is estimated by \(\varvec{{\hat{\theta }}}\), where \(\varvec{{\hat{\theta }}}\) satisfies the following condition:
which can be rewritten as:
for some functions \(t_{0}(\varvec{\theta })= (t_{01}, t_{02}, \ldots , t_{0S})'\) and \(t_{ij}(\varvec{\theta }) = (t_{ij1}, t_{ij2}, \ldots , t_{ijS})'\). For instance, \(\hat{\varvec{\theta }}_{ML}\) is the value of \(\varvec{\theta }\) for which:
The equality in Equation (64) holds where \(t_{0}(\varvec{\theta })\) and \(t_{ij}(\varvec{\theta })\) satisfy:
Similarly, \(\varvec{{\hat{\theta }}}_{MAP}\) satisfies the following:
where \(\pi (\varvec{\theta })\) is a prior distribution for \(\varvec{\theta }\). Equation (64) holds for \(\varvec{{\hat{\theta }}}_{MAP}\) where:
Note that if the prior is a standard multivariate normal distribution, then \( \nabla \log \pi (\varvec{\theta })|_{\varvec{\theta }} = -\varvec{\theta }\). \(\varvec{{\hat{\theta }}}_{WLE}\) satisfies Equation (64) where:
where \(\bar{\mathbf {I}}_p\) is the average information about \(\varvec{\theta }\) in the sample, \(\bar{\mathbf {I}}_p = \sum _{i=1}^p \mathbf {I}_i(\varvec{\theta })/p\), and \(\mathbf {B}(\varvec{\theta }) = [B(\theta _1), B(\theta _2), \ldots , B(\theta _S)]'\) is an S-dimensional vector whose \(s^{th}\) element is:
Therefore this satisfies Equation (64) where
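To make the role of the three estimators concrete, the following sketch solves the corresponding estimating equations in a deliberately simplified setting: a unidimensional two-parameter logistic model with dichotomous items. These simplifications are assumptions of the sketch, not the multidimensional, mixed-format formulas of this appendix. The ML estimate sets the score function to zero, the MAP estimate adds the gradient of a standard normal log-prior (which is \(-\theta \)), and the WLE adds Warm's (1989) term \(J(\theta )/(2I(\theta ))\). All function names, item parameters, and the bisection solver are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a keyed response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def score_ml(theta, x, a, b):
    """Derivative of the 2PL log-likelihood with respect to theta."""
    return sum(ai * (xi - p_2pl(theta, ai, bi)) for xi, ai, bi in zip(x, a, b))


def score_map(theta, x, a, b):
    """ML score plus the gradient of a standard normal log-prior (-theta)."""
    return score_ml(theta, x, a, b) - theta


def score_wle(theta, x, a, b):
    """ML score plus Warm's term J / (2 I), written out for the 2PL model."""
    info = 0.0
    j = 0.0
    for ai, bi in zip(a, b):
        p = p_2pl(theta, ai, bi)
        pq = p * (1.0 - p)
        info += ai ** 2 * pq                   # test information I(theta)
        j += ai ** 3 * pq * (1.0 - 2.0 * p)    # sum of P' P'' / (P Q)
    return score_ml(theta, x, a, b) + j / (2.0 * info)


def solve(score, x, a, b, lo=-6.0, hi=6.0, tol=1e-10):
    """Find a root of the estimating equation by bisection on [lo, hi]."""
    f_lo = score(lo, x, a, b)
    if f_lo * score(hi, x, a, b) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, x, a, b)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Hypothetical item parameters and one mixed response pattern
a = [1.0, 1.2, 0.8, 1.5, 1.0, 0.9]
b = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
x = [1, 1, 1, 1, 0, 0]

theta_ml = solve(score_ml, x, a, b)
theta_map = solve(score_map, x, a, b)
theta_wle = solve(score_wle, x, a, b)
print(theta_ml, theta_map, theta_wle)
```

In this example the MAP estimate lies between zero and the ML estimate, reflecting shrinkage toward the prior mean. Bisection requires a response pattern that is neither all zeros nor all ones for the ML score to change sign on the bracket.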
Cite this article
Hong, M., Lin, L. & Cheng, Y. Asymptotically Corrected Person Fit Statistics for Multidimensional Constructs with Simple Structure and Mixed Item Types. Psychometrika 86, 464–488 (2021). https://doi.org/10.1007/s11336-021-09756-3
DOI: https://doi.org/10.1007/s11336-021-09756-3