A Semiparametric Kernel Independence Test With Application to Mutational Signatures
Journal of the American Statistical Association (IF 3.7), Pub Date: 2021-02-16, DOI: 10.1080/01621459.2020.1871357
DongHyuk Lee, Bin Zhu

Abstract

Cancers arise owing to somatic mutations, and the characteristic combinations of somatic mutations form mutational signatures. Despite many mutational signatures being identified, the mutational processes underlying a number of them remain unknown, which hinders the identification of interventions that may reduce somatic mutation burdens and prevent the development of cancer. We demonstrate that the unknown cause of a mutational signature can be inferred from associated signatures with known etiology. However, existing association tests have low statistical power because of excess zeros in mutational signature data. To address this limitation, we propose a semiparametric kernel independence test (SKIT). The SKIT statistic is defined as the integrated squared distance between mixed probability distributions and is decomposed into four disjoint components to pinpoint the source of dependency. We derive the asymptotic null distribution and prove the asymptotic convergence of power. Due to slow convergence to the asymptotic null distribution, a bootstrap method is employed to compute p-values. Simulation studies demonstrate that when zeros are prevalent, SKIT is more resilient to power loss than existing tests and robust to random errors. We applied SKIT to The Cancer Genome Atlas mutational signatures data for over 9000 tumors across 32 cancer types, and identified a novel association between signature 17 curated in the Catalogue of Somatic Mutations in Cancer and apolipoprotein B mRNA editing enzyme (APOBEC) signatures in gastrointestinal cancers. This finding indicates that APOBEC activity is likely associated with the unknown cause of signature 17. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
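
To make the general idea of a resampling-based independence test on zero-heavy data concrete, the following is a minimal sketch and not the authors' SKIT. It assumes a simple stand-in statistic (the mean squared gap between the empirical joint CDF and the product of the empirical marginal CDFs) and swaps in a permutation scheme where the abstract describes a bootstrap. The function names, the toy data generator, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def cdf_distance_stat(x, y):
    """Stand-in statistic (an assumption, not the SKIT statistic): mean squared
    gap between the empirical joint CDF and the product of the empirical
    marginal CDFs, evaluated at the observed points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # le_x[i, j] is True when x[j] <= x[i]; likewise for y.
    le_x = x[None, :] <= x[:, None]
    le_y = y[None, :] <= y[:, None]
    joint = (le_x & le_y).mean(axis=1)
    product = le_x.mean(axis=1) * le_y.mean(axis=1)
    return float(np.mean((joint - product) ** 2))


def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value: shuffling y breaks any dependence on x, so the
    shuffled statistics approximate the null distribution."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = cdf_distance_stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        if cdf_distance_stat(x, rng.permutation(y)) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


def zero_inflated_pair(n, dependent, seed=0):
    """Hypothetical toy data with roughly 60% zeros in each variable."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(2.0, 1.0, n) * (rng.random(n) > 0.6)
    if dependent:
        # y is zero exactly where x is zero, and tracks x elsewhere.
        y = (x + rng.gamma(2.0, 0.2, n)) * (x > 0)
    else:
        y = rng.gamma(2.0, 1.0, n) * (rng.random(n) > 0.6)
    return x, y


if __name__ == "__main__":
    for dependent in (True, False):
        x, y = zero_inflated_pair(200, dependent=dependent, seed=1)
        p = permutation_pvalue(x, y, n_perm=200, seed=2)
        print(f"dependent={dependent}: p-value = {p:.3f}")
```

Because the statistic is built from empirical CDFs, ties at zero need no special handling, which is why it serves as a convenient placeholder here. It does not reproduce the four-component decomposition described in the abstract.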




Updated: 2021-02-16