Post-selection inference for changepoint detection algorithms with application to copy number variation data
Biometrics ( IF 1.4 ) Pub Date : 2021-01-12 , DOI: 10.1111/biom.13422
Sangwon Hyun 1 , Kevin Z Lin 2 , Max G'Sell 3 , Ryan J Tibshirani 3

Changepoint detection methods are used in many areas of science and engineering, for example, in the analysis of copy number variation data to detect abnormalities in copy numbers along the genome. Despite the broad array of available tools, methodology for quantifying our uncertainty in the strength (or the presence) of given changepoints post-selection is lacking. Post-selection inference offers a framework to fill this gap, but the most straightforward application of these methods results in low-powered hypothesis tests and leaves open several important questions about practical usability. In this work, we carefully tailor post-selection inference methods toward changepoint detection, focusing on copy number variation data. To accomplish this, we study commonly used changepoint algorithms: binary segmentation and two of its most popular variants (wild and circular binary segmentation), as well as the fused lasso. We implement some of the latest developments in post-selection inference theory, mainly auxiliary randomization. This improves power but requires implementing Markov chain Monte Carlo algorithms (importance sampling and hit-and-run sampling) to carry out our tests. We also provide recommendations for improving practical usability, detailed simulations, and example analyses on array comparative genomic hybridization as well as sequencing data.
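
For readers unfamiliar with binary segmentation, the first algorithm named in the abstract, the following is a minimal sketch of the standard greedy procedure using a CUSUM contrast. It is an illustration under stated assumptions, not the authors' implementation: the function names `cusum_stat` and `binary_segmentation` and the fixed-number-of-steps stopping rule are hypothetical choices made here, and the sketch covers only the changepoint selection step, not the post-selection tests the paper develops.

```python
import numpy as np


def cusum_stat(y, s, b, e):
    """CUSUM contrast for splitting y[s:e] into y[s:b] and y[b:e].

    Hypothetical helper: a scaled difference between the right and
    left segment means.
    """
    n_left, n_right = b - s, e - b
    scale = np.sqrt(n_left * n_right / (n_left + n_right))
    return scale * (y[b:e].mean() - y[s:b].mean())


def binary_segmentation(y, n_steps):
    """Greedy binary segmentation (illustrative sketch).

    Returns up to n_steps changepoints in the order they are found.
    Each changepoint b is the index of the first point of the right
    segment. The fixed n_steps stopping rule is an assumption.
    """
    y = np.asarray(y, dtype=float)
    segments = [(0, len(y))]  # half-open intervals [s, e)
    changepoints = []
    for _ in range(n_steps):
        best = None  # (|statistic|, split index, segment position)
        for i, (s, e) in enumerate(segments):
            for b in range(s + 1, e):
                stat = abs(cusum_stat(y, s, b, e))
                if best is None or stat > best[0]:
                    best = (stat, b, i)
        if best is None:  # no segment can be split further
            break
        _, b, i = best
        s, e = segments.pop(i)
        segments += [(s, b), (b, e)]  # split the winning segment at b
        changepoints.append(b)
    return changepoints
```

Because the changepoints are chosen by maximizing a statistic over the same data, a naive test of the mean difference at a selected location ignores that selection; this is the issue that post-selection inference is meant to correct.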

Updated: 2021-01-12