Differential Item Functioning Analysis Without A Priori Information on Anchor Items: QQ Plots and Graphical Test

  • Theory and Methods
  • Published in: Psychometrika

Abstract

Differential item functioning (DIF) analysis is an important step in establishing the validity of measurements. Most traditional methods for DIF analysis use an item-by-item strategy via anchor items that are assumed to be DIF-free. If the anchor items are flawed, these methods yield misleading results due to biased scales. In this article, based on the fact that an item’s relative change of difficulty difference (RCD) does not depend on the mean ability of individual groups, a new DIF detection method (RCD-DIF) is proposed by comparing the observed differences against those from simulated data that are known to be DIF-free. The RCD-DIF method consists of a D-QQ (quantile-quantile) plot that permits the identification of internal reference points (similar to anchor items), an RCD-QQ plot that facilitates visual examination of DIF, and an RCD graphical test that synchronizes DIF analysis at the test level with that at the item level via confidence intervals on individual items. The RCD procedure visually reveals the overall pattern of DIF in the test and the size of DIF for each item, and it is expected to work properly even when the majority of the items possess DIF and the DIF pattern is unbalanced. Results of two simulation studies indicate that the RCD graphical test has Type I error rates comparable to those of existing methods but with greater power.
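
The core comparison behind the QQ plots can be sketched in a few lines of Python. The following is a minimal illustration of the general idea under stated assumptions, not the authors' implementation: difficulties are estimated crudely from proportions correct rather than by the two-group 2PL estimation used in the paper, and median-centering of the differences is only a stand-in for the paper's relative-change (RCD) transformation.

    # Sketch: compare the observed ordered between-group difficulty
    # differences against their Monte Carlo distribution under a
    # DIF-free null, in the spirit of the RCD-QQ plot.
    import numpy as np

    rng = np.random.default_rng(1)

    def simulate_responses(b, theta):
        # Rasch responses for abilities theta and difficulties b.
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
        return (rng.random(p.shape) < p).astype(int)

    def crude_difficulty(y):
        # Rough difficulty estimate from proportions correct (sketch only).
        p = y.mean(axis=0).clip(0.01, 0.99)
        return -np.log(p / (1.0 - p))

    def centered_sorted_diffs(y1, y2):
        # Median-centering removes (approximately) the common shift that a
        # group mean-ability difference induces in the difficulty estimates,
        # a crude stand-in for the invariance property motivating the RCD.
        d = crude_difficulty(y2) - crude_difficulty(y1)
        return np.sort(d - np.median(d))

    M, n, R = 20, 500, 200            # items, examinees per group, replications
    b_pop = rng.normal(0.0, 1.0, M)   # population difficulties

    # "Observed" data: group 2 has a mean-ability shift and two DIF items.
    b2 = b_pop.copy()
    b2[:2] += 0.6
    y1 = simulate_responses(b_pop, rng.normal(0.0, 1.0, n))
    y2 = simulate_responses(b2, rng.normal(0.3, 1.0, n))
    d_obs = centered_sorted_diffs(y1, y2)

    # Null distribution of the ordered differences, both groups DIF-free.
    d_null = np.empty((R, M))
    for r in range(R):
        z1 = simulate_responses(b_pop, rng.normal(0.0, 1.0, n))
        z2 = simulate_responses(b_pop, rng.normal(0.3, 1.0, n))
        d_null[r] = centered_sorted_diffs(z1, z2)

    # QQ-style comparison: each observed order statistic against its null
    # mean and a pointwise 95% Monte Carlo band; points outside suggest DIF.
    lo = np.percentile(d_null, 2.5, axis=0)
    hi = np.percentile(d_null, 97.5, axis=0)
    mid = d_null.mean(axis=0)
    for j in range(M):
        flag = "*" if (d_obs[j] < lo[j] or d_obs[j] > hi[j]) else " "
        print(f"d_({j + 1:2d}): obs {d_obs[j]:+.2f}  "
              f"null {mid[j]:+.2f} [{lo[j]:+.2f}, {hi[j]:+.2f}] {flag}")

In this toy setup the two shifted items are expected to fall outside the band at the upper tail while the remaining order statistics track the null quantiles; the paper's procedure additionally uses the D-QQ plot to pick internal reference points before the item-level comparison.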


Notes

  1. The parenthesized index (j) denotes the jth smallest of the \(d_{j}\), following the standard notation for order statistics.

  2. Note that \(\hat{d}_{(1)}\) does not necessarily correspond to the smallest \(b_{j}^{(2)}-b_{j}^{\left( 1 \right) }\) in the population, and similarly \(\hat{d}_{(M)}\) does not necessarily correspond to the largest \(b_{j}^{(2)}-b_{j}^{\left( 1 \right) }\) in the population. In addition to the true sizes of the \(b_{j}^{(2)}-b_{j}^{\left( 1 \right) }\) in the population, sampling error also affects the ordering of the estimated differences.

  3. Alternatively, we can use the \(\hat{b}_{j}^{{(1)}}\) or \(\hat{b}_{j}^{{(2)}}\) obtained in step 1 as the population values when group 1 or group 2 is the reference group. These choices yield essentially the same results in both the Monte Carlo simulations (Sect. 4) and the illustrative data analysis (Sect. 3.2), simply because each choice closely matches the underlying population and satisfies the requirements of the DIF-free null hypothesis.

  4. In our initial study, we experimented with 1000, 2000, and 5000 replications and obtained essentially the same results.

  5. A condition is called balanced if both the number of items and the sizes of DIF that favor group one equal those that favor group two, so that the effect of DIF on the whole test cancels out; otherwise, the condition is called unbalanced.

  6. Because the distribution of \(\hat{d}_{(j)}\) is approximated by Monte Carlo simulation, slight differences might be observed for a particular dataset depending on whether \(\hat{b}_{j}\), \(\hat{b}_{j}^{(1)}\), or \(\hat{b}_{j}^{(2)}\) is chosen as the population parameters, or on the seeds used to generate the DIF-free data \(Y_{ij}^{(1)}\) and \(Y_{ij}^{(2)}\) in step 2 of Sect. 2.3. Such differences disappear when averaging across replications (see footnote 3).

  7. We used the D-QQ plot to choose the reference points; examining the plots for the first 10 replications, we found that they always pointed to items that are DIF-free in Study 1 and Study 2. We also explored the effect of different numbers of reference points (10, 4, and 2) in our initial analysis and found that they led to essentially the same results. Thus, instead of examining the D-QQ plot for every replication (which is impractical), we simply chose the middle four items, which are most likely DIF-free, as the reference points in simulation Studies 1 and 2.

  8. With the DIPF procedure, the total number of comparisons is \(30\times 29/2=435\); for each item the number of comparisons is 29. The Type I error rates reported in Table 4 and the power values in Table 6 are for each item paired with the other 29 items, with the significance level corrected for 29 comparisons. If the Type I error rates and power values were instead calculated over all 435 comparisons and corrected accordingly using Holm’s method, they would be much smaller than those reported in Tables 4 and 6 (a generic sketch of Holm’s step-down adjustment is given after these notes).

  9. Without the Holm adjustment, the Type I error rates at the test level of the MH, the LRT, and the Wald methods exceed 60%, 60%, and 40%, respectively.

  10. The 2PL model can be equivalently expressed via discrimination and difficulty parameters (a, b), with \(a_{j}^{\left( g \right) }=\alpha _{j}^{\left( g \right) }\) and \(b_{j}^{(g)}=-\beta _{j}^{\left( g \right) } \big / \alpha _{j}^{\left( g \right) }\). However, presenting the extension under the (a, b) parameterization involves more complicated equations (the equivalence is written out after these notes).
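
Footnotes 8 and 9 refer to Holm’s (1979) step-down correction. For reference, here is a generic sketch of the standard adjustment; this is textbook code, not code from the paper:

    # Holm's (1979) step-down adjustment for m simultaneous comparisons.
    def holm_adjust(pvals):
        # Returns Holm-adjusted p-values in the original input order;
        # reject H0_i at level alpha when adjusted[i] <= alpha.
        m = len(pvals)
        order = sorted(range(m), key=lambda i: pvals[i])
        adjusted = [0.0] * m
        running_max = 0.0
        for rank, i in enumerate(order):
            # kth smallest p-value is scaled by (m - k + 1), then the
            # running maximum enforces monotonicity of the adjusted values.
            running_max = max(running_max, (m - rank) * pvals[i])
            adjusted[i] = min(1.0, running_max)
        return adjusted

    # Example with m = 4 comparisons: prints [0.004, 0.03, 0.06, 0.06].
    print([round(p, 3) for p in holm_adjust([0.001, 0.010, 0.030, 0.040])])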
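
For footnote 10, the equivalence between the two 2PL parameterizations can be written out explicitly. This is the standard identity, stated under the slope-intercept form \(\alpha _{j}^{(g)}\theta _{i}+\beta _{j}^{(g)}\) that the conversion in footnote 10 implies:

\[
P\left( Y_{ij}=1 \mid \theta _{i} \right) = \frac{1}{1+\exp \left[ -\left( \alpha _{j}^{(g)}\theta _{i}+\beta _{j}^{(g)} \right) \right] } = \frac{1}{1+\exp \left[ -a_{j}^{(g)}\left( \theta _{i}-b_{j}^{(g)} \right) \right] }, \qquad a_{j}^{(g)}=\alpha _{j}^{(g)}, \quad b_{j}^{(g)}=-\beta _{j}^{(g)} \big / \alpha _{j}^{(g)},
\]

so the two forms describe the same response curve and yield identical likelihoods.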

References

  • Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29(1), 67–91.

  • Barnard, G. A. (1963). Discussion on “The spectral analysis of point processes” (by M. S. Bartlett). Journal of the Royal Statistical Society B, 25, 294–296.

  • Barnett, V., & Lewis, T. (1994). Outliers in statistical data (3rd ed.). Chichester: Wiley.

  • Bauer, D. J., Belzak, W. C. M., & Cole, V. T. (2019). Simplifying the assessment of measurement invariance over multiple background variables: Using regularized moderated nonlinear factor analysis to detect differential item functioning. Structural Equation Modeling: A Multidisciplinary Journal, 27(1), 43–55.

  • Bechger, T. M., & Maris, G. (2015). A statistical test for differential item pair functioning. Psychometrika, 80(2), 317–340.

  • Belzak, W. C. M., & Bauer, D. J. (2020). Improving the assessment of measurement invariance: Using regularization to select anchor items and identify differential item functioning. Psychological Methods. https://doi.org/10.1037/met0000253

  • Cai, L. (2017). flexMIRT 3.51: Flexible multilevel and multidimensional item response theory analysis and test scoring [Computer software]. Chapel Hill, NC: Vector Psychometric Group LLC.

  • Cai, L., du Toit, S. H. L., & Thissen, D. (2009). IRTPRO: Flexible, multidimensional, multiple categorical IRT modeling [Computer software]. Chicago: Scientific Software International.

  • Candell, G. L., & Drasgow, F. (1988). An iterative procedure for linking metrics and assessing item bias in item response theory. Applied Psychological Measurement, 12(3), 253–260.

  • Cao, M., Tay, L., & Liu, Y. (2017). A Monte Carlo study of an iterative Wald test procedure for DIF analysis. Educational and Psychological Measurement, 77(1), 104–118.

  • Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29.

  • Choi, S. W., Gibbons, L. E., & Crane, P. K. (2011). lordif: An R package for detecting differential item functioning using iterative hybrid ordinal logistic regression/item response theory and Monte Carlo simulations. Journal of Statistical Software, 39(2), 1–30.

  • Clauser, B. E., & Mazor, K. M. (1998). Using statistical procedures to identify differential item functioning test items. Educational Measurement: Issues and Practice, 17(1), 31–44.

  • Clauser, B., Mazor, K., & Hambleton, R. K. (1993). The effects of purification of matching criterion on the identification of DIF using the Mantel-Haenszel procedure. Applied Measurement in Education, 6(4), 269–279.

  • Da Costa, P. D., & Araújo, L. (2012). Differential item functioning (DIF): What functions differently for immigrant students in PISA 2009 reading items (Report EUR 25565 EN). Retrieved from https://core.ac.uk/display/38627538

  • Davey, A., & Savla, J. (2009). Estimating statistical power with incomplete data. Organizational Research Methods, 12(2), 320–346.

  • Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge: Cambridge University Press.

  • DeMars, C. E. (2011). An analytic comparison of effect sizes for differential item functioning. Applied Measurement in Education, 24(3), 189–209.

  • Doebler, A. (2019). Looking at DIF from a new perspective: A structure-based approach acknowledging inherent indefinability. Applied Psychological Measurement, 43(4), 303–321.

  • Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

  • Falk, C. F., & Cai, L. (2016). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81(2), 434–460.

  • Fidalgo, A. M., Mellenbergh, G. J., & Muñiz, J. (2000). Effects of amount of DIF, test length, and purification type on robustness and power of Mantel-Haenszel procedures. Methods of Psychological Research Online, 5(3), 43–53.

  • Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29(4), 278–295.

  • Fischer, G. H., & Molenaar, I. W. (1995). Rasch models: Foundations, recent developments, and applications. New York, NY: Springer.

  • Frederickx, S., Tuerlinckx, F., De Boeck, P., & Magis, D. (2010). RIM: A random item mixture model to detect differential item functioning. Journal of Educational Measurement, 47(4), 432–457.

  • French, B. F., & Maller, S. J. (2007). Iterative purification and effect size use with logistic regression for differential item functioning detection. Educational and Psychological Measurement, 67(3), 373–393.

  • Frick, H., Strobl, C., & Zeileis, A. (2015). Rasch mixture models for DIF detection: A comparison of old and new score specifications. Educational and Psychological Measurement, 75(2), 208–234.

  • Gnanadesikan, R. (1997). Methods for statistical data analysis of multivariate observations (2nd ed.). New York: Wiley.

  • González-Betanzos, F., & Abad, F. J. (2012). The effects of purification and the evaluation of differential item functioning with the likelihood ratio test. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 8(4), 134–145.

  • Hall, P., & Wilson, S. R. (1991). Two guidelines for bootstrap hypothesis testing. Biometrics, 47, 757–762.

  • Hancock, G. R., Stapleton, L. M., & Arnold-Berkovits, I. (2009). The tenuousness of invariance tests within multi-sample covariance and mean structure models. In T. Teo & M. S. Khine (Eds.), Structural equation modeling: Concepts and applications in educational research (pp. 137–174). Rotterdam: Sense Publishers.

  • Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago: The University of Chicago Press.

  • Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.

  • Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65–70.

  • Hope, A. C. A. (1968). A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society B, 30(3), 582–598.

  • Huang, X., Wilson, M., & Wang, L. (2016). Exploring plausible causes of differential item functioning in the PISA science assessment: Language, curriculum or culture. Educational Psychology, 36(2), 378–390.

  • Huang, P. H. (2018). A penalized likelihood method for multi-group structural equation modelling. British Journal of Mathematical and Statistical Psychology, 71(6), 499–522.

  • Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.

  • Jalal, S., & Bentler, P. (2018). Using Monte Carlo normal distributions to evaluate structural equation models with nonnormal data. Structural Equation Modeling: A Multidisciplinary Journal, 25(4), 541–557.

  • Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351a), 631–639.

  • Kim, J., & Oshima, T. C. (2013). Effect of multiple testing adjustment in differential item functioning detection. Educational and Psychological Measurement, 73(3), 458–470.

  • Kopf, J., Zeileis, A., & Strobl, C. (2013). Anchor methods for DIF detection: A comparison of the iterative forward, backward, constant and all-other anchor class (Technical Report 141). Munich: Department of Statistics, LMU Munich.

  • Kopf, J., Zeileis, A., & Strobl, C. (2015a). A framework for anchor methods and an iterative forward approach for DIF detection. Applied Psychological Measurement, 39(2), 83–103.

  • Kopf, J., Zeileis, A., & Strobl, C. (2015b). Anchor selection strategies for DIF analysis: Review, assessment, and new approaches. Educational and Psychological Measurement, 75(1), 22–56.

  • Le, L. T. (2009). Investigating gender differential item functioning across countries and test languages for PISA science items. International Journal of Testing, 9(2), 122–133.

  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.

  • Magis, D., & De Boeck, P. (2012). A robust outlier approach to prevent Type I error inflation in differential item functioning. Educational and Psychological Measurement, 72(2), 291–311.

  • Magis, D., & Facon, B. (2013). Item purification does not always improve DIF detection: A counterexample with Angoff’s delta plot. Educational and Psychological Measurement, 73(2), 293–311.

  • Magis, D., Béland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42(3), 847–862.

  • Magis, D., Tuerlinckx, F., & De Boeck, P. (2015). Detection of differential item functioning using the lasso approach. Journal of Educational and Behavioral Statistics, 40(2), 111–135.

  • May, H. (2006). A multilevel Bayesian item response theory method for scaling socioeconomic status in international studies of education. Journal of Educational and Behavioral Statistics, 31(1), 63–79.

  • Millsap, R. E., & Meredith, W. (1992). Inferential conditions in the statistical detection of measurement bias. Applied Psychological Measurement, 16(4), 389–402.

  • Muthén, B. O. (1985). A method for studying the homogeneity of test items with respect to other relevant variables. Journal of Educational Statistics, 10, 121–132.

  • Navas-Ara, M. J., & Gómez-Benito, J. (2002). Effects of ability scale purification on the identification of DIF. European Journal of Psychological Assessment, 18(1), 9–15.

  • Oshima, T. C., Kushubar, S., Scott, J. C., & Raju, N. S. (2009). DFIT8 for Windows user’s manual: Differential functioning of items and tests. St. Paul, MN: Assessment Systems Corporation.

  • Özdemir, B. (2015). A comparison of IRT-based methods for examining differential item functioning in TIMSS 2011 mathematics subtest. Procedia-Social and Behavioral Sciences, 174, 2075–2083.

  • Price, E. A. (2014). Item discrimination, model-data fit, and Type I error rates in DIF detection using Lord’s \(\chi ^{2}\), the likelihood ratio test, and the Mantel-Haenszel procedure (Doctoral dissertation). Ohio University, ProQuest Dissertations Publishing.

  • R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

  • Rogers, H. J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17(2), 105–116.

  • Roussos, L. A., Schnipke, D. L., & Pashley, P. J. (1999). A generalized formula for the Mantel-Haenszel differential item functioning parameter. Journal of Educational and Behavioral Statistics, 24, 293–322.

  • Santoso, A. (2018). Equivalence testing for anchor selection in differential item functioning detection (Doctoral dissertation). Retrieved from https://curate.nd.edu/downloads/und:5712m61688h

  • Schauberger, G., & Mair, P. (2020). A regularization approach for the detection of differential item functioning in generalized partial credit models. Behavior Research Methods, 52(4), 279–294.

  • Schmetterer, L. (1974). Introduction to mathematical statistics (K. Wickwire, Trans.). New York: Springer.

  • Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58(2), 159–194.

  • Shih, C. L., & Wang, W. C. (2009). Differential item functioning detection using the multiple indicators, multiple causes method with a pure short anchor. Applied Psychological Measurement, 33(3), 184–199.

  • Sinharay, S., Dorans, N. J., Grant, M. C., Blew, E. O., & Knorr, C. M. (2006). Using past data to enhance small-sample DIF estimation: A Bayesian approach (ETS RR-06-09). Princeton, NJ: Educational Testing Service.

  • Soares, T. M., Gonçalves, F. B., & Gamerman, D. (2009). An integrated Bayesian model for DIF analysis. Journal of Educational and Behavioral Statistics, 34(3), 348–377.

  • Strobl, C., Kopf, J., & Zeileis, A. (2015). Rasch trees: A new method for detecting differential item functioning in the Rasch model. Psychometrika, 80(2), 289–316.

  • Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361–370.

  • Tay, L., Huang, Q., & Vermunt, J. K. (2016). Item response theory with covariates (IRT-C): Assessing item recovery and differential item functioning for the three-parameter logistic model. Educational and Psychological Measurement, 76(1), 22–42.

  • Tay, L., Meade, A. W., & Cao, M. (2015). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods, 18(1), 3–46.

  • Thissen, D., Steinberg, L., & Gerrard, M. (1986). Beyond group-mean differences: The concept of item bias. Psychological Bulletin, 99(1), 118–128.

  • Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Toland, M. (2008). Determining the accuracy of item parameter standard error of estimates in BILOG-MG3.

  • Tutz, G., & Berger, M. (2016). Item-focused trees for the identification of items in differential item functioning. Psychometrika, 81(3), 727–750.

  • Tutz, G., & Schauberger, G. (2015). A penalty approach to differential item functioning in Rasch models. Psychometrika, 80(1), 21–43.

  • Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70.

  • Wang, W. C., & Su, Y. H. (2004). Effects of average signed area between two item characteristic curves and test purification procedures on the DIF detection via the Mantel-Haenszel method. Applied Measurement in Education, 17(2), 113–144.

  • Wang, W. C., Shih, C. L., & Sun, G. W. (2012). The DIF-free-then-DIF strategy for the assessment of differential item functioning. Educational and Psychological Measurement, 72(4), 687–708.

  • Wang, W. C., Shih, C. L., & Yang, C. C. (2009). The MIMIC method with scale purification for detecting differential item functioning. Educational and Psychological Measurement, 69(5), 713–731.

  • Woods, C. M. (2009). Testing for differential item functioning with measures of partial association. Applied Psychological Measurement, 33(1), 538–554.

  • Woods, C. M., & Grimm, K. J. (2011). Testing for nonuniform differential item functioning with multiple indicator multiple cause models. Applied Psychological Measurement, 35(5), 339–361.

  • Woods, C. M., Cai, L., & Wang, M. (2013). The Langer-improved Wald test for DIF testing with multiple groups: Evaluation and comparison to two-group IRT. Educational and Psychological Measurement, 73(3), 532–547.

  • Yuan, K.-H., & Chan, W. (2008). Structural equation modeling with near singular covariance matrices. Computational Statistics & Data Analysis, 52(10), 4842–4858.

  • Yuan, K.-H., Hayashi, K., & Bentler, P. M. (2007). Normal theory likelihood ratio statistic for mean and covariance structure analysis under alternative hypotheses. Journal of Multivariate Analysis, 98(6), 1262–1282.

  • Zhang, G. (2018). Testing process factor analysis models using the parametric bootstrap. Multivariate Behavioral Research, 53(2), 219–230.

  • Zieky, M. (1993). DIF statistics in test development. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ: Erlbaum.

  • Zwick, R., & Thayer, D. T. (2002). Application of an empirical Bayes enhancement of Mantel-Haenszel DIF analysis to a computerized adaptive test. Applied Psychological Measurement, 26(1), 57–76.

  • Zwick, R., Thayer, D. T., & Lewis, C. (2000). Using loss functions for DIF detection: An empirical Bayes approach. Journal of Educational and Behavioral Statistics, 25(2), 225–247.

Acknowledgements

Funding was provided by the National Natural Science Foundation of China (Grant Nos. 31971029 and 32071091).

Author information

Corresponding author

Correspondence to Hongyun Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

K.-H. Yuan: His research centers on developing better and more valid methods for analyzing messy data and non-standard samples in the social and behavioral sciences. Most of his work is on factor analysis, structural equation modeling, and multilevel modeling.

H. Liu: Her research interests are educational measurement and advanced statistical methods.

Y. Han: Her research interests are psychometrics and educational measurement.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 2047 KB)

About this article

Cite this article

Yuan, KH., Liu, H. & Han, Y. Differential Item Functioning Analysis Without A Priori Information on Anchor Items: QQ Plots and Graphical Test. Psychometrika 86, 345–377 (2021). https://doi.org/10.1007/s11336-021-09746-5
