Querying multiple sets of P-values through composed hypothesis testing
Bioinformatics (IF 5.8), Pub Date: 2021-09-03, DOI: 10.1093/bioinformatics/btab592
Tristan Mary-Huard 1, 2 , Sarmistha Das 3 , Indranil Mukhopadhyay 3 , Stéphane Robin 1, 4

Motivation: Combining the results of different experiments to exhibit complex patterns or to improve statistical power is a typical aim of data integration. The starting point of the statistical analysis is often a set of P-values resulting from previous analyses, which need to be combined flexibly to explore complex hypotheses while guaranteeing a low proportion of false discoveries.

Results: We introduce the generic concept of composed hypothesis, which corresponds to an arbitrarily complex combination of simple hypotheses. We rephrase the problem of testing a composed hypothesis as a classification task and show that finding items for which the composed null hypothesis is rejected boils down to fitting a mixture model and classifying the items according to their posterior probabilities. We show that inference can be performed efficiently and provide a thorough classification rule to control for type I error. The performance and the usefulness of the approach are illustrated in simulations and on two different applications. The method is scalable, does not require any parameter tuning, and provides valuable biological insight on the considered application cases.

Availability and implementation: The QCH methodology is available in the qch package hosted on CRAN. Additionally, R code to reproduce the Einkorn example is available on the personal webpage of the first author: https://www6.inrae.fr/mia-paris/Equipes/Membres/Tristan-Mary-Huard.

Supplementary information: Supplementary data are available at Bioinformatics online.
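The classification step mentioned in the Results can be illustrated with a minimal sketch. It is not the qch package API and not necessarily the authors' exact rule: it assumes a mixture model has already been fitted and has produced, for each item, posterior probabilities over all 0/1 configurations of the simple hypotheses. The function names `composed_posterior` and `select_items`, the toy posterior values, and the choice of rule (rejecting the largest top-ranked set whose average posterior probability of the composed null stays below a level `alpha`) are all assumptions made for illustration.

```python
from itertools import product


def composed_posterior(config_posteriors, configs, in_h1):
    """Per item, sum the posterior mass of the configurations that
    belong to the composed alternative (hypothetical helper)."""
    keep = [j for j, c in enumerate(configs) if in_h1(c)]
    return [sum(row[j] for j in keep) for row in config_posteriors]


def select_items(h1_posterior, alpha):
    """Rank items by decreasing posterior of the composed alternative and
    keep the longest prefix whose mean posterior null probability is at
    most alpha (an assumed, commonly used Bayesian FDR-style rule)."""
    order = sorted(range(len(h1_posterior)),
                   key=lambda i: h1_posterior[i], reverse=True)
    selected, running_error = [], 0.0
    for rank, i in enumerate(order, start=1):
        running_error += 1.0 - h1_posterior[i]
        # The running mean is non-decreasing along the ranking,
        # so the first violation ends the selection.
        if running_error / rank > alpha:
            break
        selected.append(i)
    return selected


# Two sets of P-values give four configurations: (0,0), (0,1), (1,0), (1,1).
configs = list(product((0, 1), repeat=2))

# Toy posteriors over the four configurations for four items (made-up values).
posteriors = [
    [0.01, 0.02, 0.02, 0.95],
    [0.90, 0.05, 0.04, 0.01],
    [0.02, 0.03, 0.05, 0.90],
    [0.10, 0.40, 0.30, 0.20],
]

# Composed alternative: both simple hypotheses are non-null.
both = composed_posterior(posteriors, configs, lambda c: sum(c) == 2)
print(select_items(both, alpha=0.1))
```

Changing the predicate passed as `in_h1` (for example `lambda c: sum(c) >= 1` for "at least one non-null") is what lets the same posteriors answer different composed hypotheses without refitting the model.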

Updated: 2021-09-03