Linking Scores with Patient-Reported Health Outcome Instruments: A Validation Study and Comparison of Three Linking Methods

Schalet, Benjamin D.; Lim, Sangdon; Cella, David; Choi, Seung W.

doi:10.1007/s11336-021-09776-z

Application Reviews and Case Studies
Published: 26 June 2021

Volume 86, pages 717–746, (2021)
Psychometrika

Abstract

The psychometric process used to establish a relationship between the scores of two (or more) instruments is generically referred to as linking. When two instruments with the same content and statistical test specifications are linked, these instruments are said to be equated. Linking and equating procedures have long been used for practical benefit in educational testing. In recent years, health outcome researchers have increasingly applied linking techniques to patient-reported outcome (PRO) data. However, these applications have some noteworthy purposes and associated methodological questions. Purposes for linking health outcomes include the harmonization of data across studies or settings (enabling increased power in hypothesis testing), the aggregation of summed score data by means of score crosswalk tables, and score conversion in clinical settings where new instruments are introduced, but an interpretable connection to historical data is needed. When two PRO instruments are linked, assumptions for equating are typically not met and the extent to which those assumptions are violated becomes a decision point around how (and whether) to proceed with linking. We demonstrate multiple linking procedures—equipercentile, unidimensional IRT calibration, and calibrated projection—with the Patient-Reported Outcomes Measurement Information System Depression bank and the Patient Health Questionnaire-9. We validate this link across two samples and simulate different instrument correlation levels to provide guidance around which linking method is preferred. Finally, we discuss some remaining issues and directions for psychometric research in linking PRO instruments.
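
To make the first of the three named procedures concrete, the following is a minimal sketch of equipercentile linking under a single-group design, not the authors' implementation. It assumes mid-percentile ranks and linear interpolation, omits the smoothing that applied work commonly adds, and uses toy data; the function names `mid_percentile_ranks` and `equipercentile_crosswalk` are hypothetical.

```python
import numpy as np


def mid_percentile_ranks(scores, support):
    """Mid-percentile rank (0-100) of each possible score in `support`."""
    scores = np.asarray(scores)
    support = np.asarray(support)
    # Count observations strictly below and exactly at each possible score.
    below = (scores[:, None] < support[None, :]).sum(axis=0)
    at = (scores[:, None] == support[None, :]).sum(axis=0)
    return 100.0 * (below + 0.5 * at) / scores.size


def equipercentile_crosswalk(x_scores, y_scores, x_support, y_support):
    """Map each possible X score to the Y value with the same percentile rank."""
    pr_x = mid_percentile_ranks(x_scores, x_support)
    pr_y = mid_percentile_ranks(y_scores, y_support)
    # Linear interpolation of the inverse Y percentile-rank function.
    # Assumption: every score in `y_support` is observed at least once,
    # so that pr_y is strictly increasing. Ranks outside the observed
    # Y range are clamped to the lowest or highest Y score.
    return np.interp(pr_x, pr_y, np.asarray(y_support, dtype=float))


# Toy single-group data (hypothetical, not from the study).
x = np.array([0, 1, 1, 2, 2, 2, 3, 3, 4])
y = 2 * x + 10  # in this toy case Y is a linear rescaling of X
crosswalk = equipercentile_crosswalk(x, y, np.arange(5), np.arange(10, 19, 2))
print(dict(zip(range(5), crosswalk)))
```

Because the toy Y scores are a linear rescaling of X, the crosswalk should simply reproduce that rescaling. With real instruments the two score distributions differ in shape, and the resulting table is generally nonlinear.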

Acknowledgements

We wish to clarify that Seung W. Choi served as the senior author on this manuscript.

Author information

Authors and Affiliations

Department of Medical Social Sciences, Northwestern University, Feinberg School of Medicine, 625 N Michigan Ave, 21st Floor, Chicago, IL, 60611, USA
Benjamin D. Schalet
Department of Educational Psychology, The University of Texas at Austin, 1912 Speedway, Stop D5800, Austin, TX, 78712-1289, USA
Sangdon Lim
Department of Medical Social Sciences, Northwestern University, Feinberg School of Medicine, 625 N Michigan Ave, 21st Floor, Chicago, IL, 60611, USA
David Cella
Department of Educational Psychology, The University of Texas at Austin, 1912 Speedway, Stop D5800, Austin, TX, 78712-1289, USA
Seung W. Choi

Corresponding author

Correspondence to Benjamin D. Schalet.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 76 KB)

Supplementary material 2 (zip 4395 KB)

Supplementary material 3 (pdf 45 KB)

About this article

Cite this article

Schalet, B.D., Lim, S., Cella, D. et al. Linking Scores with Patient-Reported Health Outcome Instruments: A Validation Study and Comparison of Three Linking Methods. Psychometrika 86, 717–746 (2021). https://doi.org/10.1007/s11336-021-09776-z

Received: 14 June 2020
Revised: 03 March 2021
Accepted: 19 May 2021
Published: 26 June 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11336-021-09776-z
