Original Article

Generalized Linear Mixed Models for Randomized Responses

Published Online: https://doi.org/10.1027/1614-2241/a000153

Abstract. Response bias (nonresponse and social desirability bias) is one of the main concerns when asking sensitive questions about behavior and attitudes. Self-reports on sensitive issues in health research (e.g., drug and alcohol abuse) and in the social and behavioral sciences (e.g., attitudes toward refugees, academic cheating) can be expected to be subject to considerable misreporting. To reduce misreporting in self-reports, indirect questioning techniques such as the randomized response techniques have been proposed. These techniques avoid a direct link between an individual’s response and the sensitive question, thereby protecting the individual’s privacy. Alongside the development of these innovative data collection methods, methodological advances have been made that enable multivariate analyses relating responses to sensitive questions to other variables. It is shown that these developments can be represented by a general response probability model, covering all common designs, which is extended to a generalized linear model (GLM) or a generalized linear mixed model (GLMM). The general methodology is based on modifying common link functions to relate a linear predictor to the randomized response. This approach makes it possible to use existing software for GLMs and GLMMs to model randomized response data. The R package GLMMRR makes the advanced methodology available to applied researchers. The extended models and software are expected to substantially improve the application of the randomized response methodology. Three empirical examples are given to illustrate the methods.
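
The idea of a modified link function can be illustrated with a minimal sketch. It assumes that the general response probability model can be written as P(Y = 1) = c + d·π, where π is the true prevalence of the sensitive characteristic and c and d are constants fixed by the randomizing device. This parameterization, the function names, and the use of Python rather than R are illustrative assumptions and do not reproduce the GLMMRR interface. Under this assumption, a logit link is modified by first mapping the observed response probability back to π.

```python
import math


def expit(eta):
    """Standard inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Standard logit link."""
    return math.log(p / (1.0 - p))


def rr_inverse_link(eta, c, d):
    """Probability of an observed 'yes' under the assumed form c + d * pi,
    with the true prevalence pi = expit(eta)."""
    return c + d * expit(eta)


def rr_link(mu, c, d):
    """Modified logit link: recover pi from the observed response
    probability mu, then apply the ordinary logit."""
    return logit((mu - c) / d)


def warner_params(p):
    """Warner-type design (assumed setup): with probability p the respondent
    answers the sensitive statement, otherwise its negation.
    P(yes) = p*pi + (1 - p)*(1 - pi) = (1 - p) + (2p - 1)*pi."""
    return 1.0 - p, 2.0 * p - 1.0


def forced_response_params(p_truth, p_forced_yes):
    """Forced response design (assumed setup): truthful answer with
    probability p_truth, otherwise a forced answer that is 'yes' with
    probability p_forced_yes.
    P(yes) = p_truth*pi + (1 - p_truth)*p_forced_yes."""
    return (1.0 - p_truth) * p_forced_yes, p_truth


# Hypothetical design values, chosen only for illustration.
eta = logit(0.2)                      # linear predictor for a true prevalence of 0.2
c_w, d_w = warner_params(0.8)
c_f, d_f = forced_response_params(0.75, 0.5)

mu_warner = rr_inverse_link(eta, c_w, d_w)
mu_forced = rr_inverse_link(eta, c_f, d_f)
print(mu_warner, mu_forced)
```

Because the design enters only through the two constants, the same pair of link functions serves every design that fits the assumed form, which is what would allow standard GLM or GLMM fitting routines to be reused with a custom link.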
