Differential Item Functioning Analyses of the Patient-Reported Outcomes Measurement Information System (PROMIS®) Measures: Methods, Challenges, Advances, and Future Directions, Psychometrika



Psychometrika (IF 3), Pub Date: 2021-07-12, DOI: 10.1007/s11336-021-09775-0
Jeanne A Teresi (1, 2, 3, 4), Chun Wang (5), Marjorie Kleinman (4), Richard N Jones (6), David J Weiss (7)


Several methods used to examine differential item functioning (DIF) in Patient-Reported Outcomes Measurement Information System (PROMIS®) measures are presented, including effect size estimation. A summary of factors that may affect DIF detection and challenges encountered in PROMIS DIF analyses, e.g., anchor item selection, is provided. An issue in PROMIS was the potential for inadequately modeled multidimensionality to result in false DIF detection. Section 1 is a presentation of the unidimensional models used by most PROMIS investigators for DIF detection, as well as their multidimensional expansions. Section 2 is an illustration that builds on previous unidimensional analyses of depression and anxiety short-forms to examine DIF detection using a multidimensional item response theory (MIRT) model. The Item Response Theory-Log-likelihood Ratio Test (IRT-LRT) method was used for a real data illustration with gender as the grouping variable. The IRT-LRT DIF detection method is a flexible approach to handle group differences in trait distributions, known as impact in the DIF literature, and was studied with both real data and in simulations to compare the performance of the IRT-LRT method within the unidimensional IRT (UIRT) and MIRT contexts. Additionally, different effect size measures were compared for the data presented in Section 2. A finding from the real data illustration was that using the IRT-LRT method within a MIRT context resulted in more flagged items as compared to using the IRT-LRT method within a UIRT context. The simulations provided some evidence that while unidimensional and multidimensional approaches were similar in terms of Type I error rates, power for DIF detection was greater for the multidimensional approach. Effect size measures presented in Section 1 and applied in Section 2 varied in terms of estimation methods, choice of density function, methods of equating, and anchor item selection. Despite these differences, there was considerable consistency in results, especially for the items showing the largest values. Future work is needed to examine DIF detection in the context of polytomous, multidimensional data. PROMIS standards included incorporation of effect size measures in determining salient DIF. Integrated methods for examining effect size measures in the context of IRT-based DIF detection procedures are still in early stages of development.
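The abstract names the IRT-LRT method without spelling out its test statistic. As a rough sketch of the general likelihood ratio idea (not the authors' implementation): a compact model that constrains the studied item's parameters to be equal across groups is compared with an augmented model that frees them, and the statistic G² = -2(lnL_compact - lnL_augmented) is referred to a chi-square distribution with degrees of freedom equal to the number of freed parameters. The function names `chi2_sf` and `lrt_dif`, the log-likelihood values, and the parameter count below are hypothetical and chosen only for illustration; fitting the IRT models themselves is assumed to happen elsewhere.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from a closed form of the regularized upper incomplete gamma Q(a, z):
    # Q(1, z) = exp(-z) for even df, Q(1/2, z) = erfc(sqrt(z)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lrt_dif(ll_compact: float, ll_augmented: float, n_freed: int,
            alpha: float = 0.05):
    """Likelihood ratio test for one studied item.

    ll_compact   -- log-likelihood with the item's parameters equal across groups
    ll_augmented -- log-likelihood with the item's parameters freed across groups
    n_freed      -- number of parameters freed (the test's degrees of freedom)
    """
    g2 = -2.0 * (ll_compact - ll_augmented)
    p_value = chi2_sf(g2, n_freed)
    return g2, p_value, p_value < alpha


if __name__ == "__main__":
    # Hypothetical values: a five-category item with one slope and four
    # thresholds freed across groups (5 parameters).
    g2, p, flagged = lrt_dif(ll_compact=-5012.4, ll_augmented=-5006.1, n_freed=5)
    print(f"G2 = {g2:.2f}, p = {p:.4f}, flagged = {flagged}")
```

A significant result only flags the item statistically. As the abstract notes, PROMIS standards also call for effect size measures when judging whether DIF is salient, and this sketch does not cover that step.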


Updated: 2021-07-12
