Original Article

Generalized Linear Mixed Models for Randomized Responses

Jean-Paul Fox

Department of Research Methodology, Measurement and Data Analysis, University of Twente, Enschede, The Netherlands

Duco Veen

Department of Methods and Statistics, University of Utrecht, The Netherlands

Konrad Klotzke

Department of Research Methodology, Measurement and Data Analysis, University of Twente, Enschede, The Netherlands

Published online: December 12, 2018. https://doi.org/10.1027/1614-2241/a000153

Abstract

Response bias (nonresponse and social desirability bias) is one of the main concerns when asking sensitive questions about behavior and attitudes. Self-reports on sensitive issues, as in health research (e.g., drug and alcohol abuse) and in the social and behavioral sciences (e.g., attitudes toward refugees, academic cheating), can be expected to be subject to considerable misreporting. To reduce misreporting in self-reports, indirect questioning techniques such as the randomized response techniques have been proposed. These techniques avoid a direct link between an individual's response and the sensitive question, thereby protecting the individual's privacy. Alongside the development of these data collection methods, methodological advances have made it possible to relate responses to sensitive questions to other variables in a multivariate analysis. It is shown that these developments can be represented by a general response probability model that covers all common designs and that can be extended to a generalized linear model (GLM) or a generalized linear mixed model (GLMM). The general methodology is based on modifying common link functions so that they relate a linear predictor to the randomized response. This approach makes it possible to use existing software for GLMs and GLMMs to model randomized response data. The R package GLMMRR makes the methodology available to applied researchers. The extended models and software should substantially improve the applicability of the randomized response methodology. Three empirical examples illustrate the methods.
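To make the idea of a modified link function concrete, the following is a minimal Python sketch, not the GLMMRR implementation. It assumes the general response probability model can be written as P(observed "yes") = c + d * pi, where pi is the true probability of the sensitive characteristic and c and d are constants fixed by the design; this parametrization, the function names, and the design probabilities used here are illustrative assumptions. With a logit model for pi, the modified inverse link maps the linear predictor eta to c + d * expit(eta).

```python
import math


def expit(eta):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Standard logit function."""
    return math.log(p / (1.0 - p))


def rr_inverse_link(eta, c, d):
    """Probability of an observed 'yes' given the linear predictor eta.

    Assumed model: P(yes) = c + d * pi, with pi = expit(eta).
    """
    return c + d * expit(eta)


def rr_link(mu, c, d):
    """Modified logit link: maps the observed-response mean back to eta."""
    return logit((mu - c) / d)


def warner_design(p):
    """Design constants for Warner's (1965) design.

    With probability p the respondent answers the sensitive statement,
    otherwise its negation: P(yes) = p*pi + (1-p)*(1-pi).
    """
    return 1.0 - p, 2.0 * p - 1.0


def forced_response_design(p_truth, p_forced_yes):
    """Design constants for a forced response design (illustrative).

    With probability p_truth the respondent answers truthfully; otherwise
    the answer is forced, and it is 'yes' with probability p_forced_yes.
    """
    return (1.0 - p_truth) * p_forced_yes, p_truth


def prevalence_from_observed(prop_yes, c, d):
    """Moment estimate of the true prevalence from the observed 'yes' rate."""
    return (prop_yes - c) / d


# Hypothetical design: 75% truthful answers, forced answers split evenly.
c, d = forced_response_design(0.75, 0.5)
eta = logit(0.2)                    # linear predictor for a true prevalence of 0.2
mu = rr_inverse_link(eta, c, d)     # implied probability of observing 'yes'
print(mu, rr_link(mu, c, d), prevalence_from_observed(mu, c, d))
```

Because only the link changes, the linear predictor and the binomial likelihood for the observed responses stay as in an ordinary GLM, which is what allows standard GLM and GLMM fitting routines to be reused for randomized response data.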

Volume 15, Issue 1, January 2019

ISSN: 1614-1881, eISSN: 1614-2241

History

Received: October 21, 2017
Revised: May 30, 2018
Accepted: July 18, 2018
Published online: December 12, 2018
