Scalable network estimation with L0 penalty, Statistical Analysis and Data Mining

Scalable network estimation with L0 penalty
Statistical Analysis and Data Mining (IF 1.3), Pub Date: 2020-10-21, DOI: 10.1002/sam.11483
Junghi Kim, Hongtu Zhu, Xiao Wang, Kim-Anh Do

With the advent of high‐throughput sequencing, an efficient computing strategy is required to deal with large genomic data sets. The challenge of estimating a large precision matrix has garnered substantial research attention because of its direct application to discriminant analyses and graphical models. Most existing methods either use a lasso‐type penalty, which may lead to biased estimators, or are computationally intensive, which prevents their application to very large graphs. We propose scalnetL0, a method that uses an L₀ penalty to estimate an ultra‐large precision matrix. We apply scalnetL0 to RNA‐seq data from breast cancer patients represented in The Cancer Genome Atlas and find improved classification accuracy for survival times. The estimated precision matrix provides information about a large‐scale co‐expression network in breast cancer. Simulation studies demonstrate that scalnetL0 provides more accurate and efficient estimators, yielding shorter CPU time and lower Frobenius loss in sparse learning for large‐scale precision matrix estimation.
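The abstract does not describe the scalnetL0 algorithm itself, so the following is only a toy illustration of two ideas it mentions: the bias of lasso‐type penalties and the Frobenius loss. It assumes the textbook entry‐wise proximal operators (soft thresholding for the lasso penalty, hard thresholding for the L₀ penalty) and a made‐up 3×3 precision matrix; the function names and the numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of lam * |x| (lasso): shrinks every entry toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def hard_threshold(x, lam):
    """Proximal operator of lam * 1[x != 0] (L0): keeps x unchanged if x**2 > 2 * lam."""
    return np.where(x ** 2 > 2.0 * lam, x, 0.0)


def frobenius_loss(estimate, truth):
    """Frobenius norm of the estimation error."""
    return np.linalg.norm(estimate - truth, ord="fro")


# Hypothetical sparse (tridiagonal) precision matrix, for illustration only.
truth = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])

# A noisy version of it, standing in for an unpenalized estimate.
noisy = truth + 0.05

# Both rules zero out entries with magnitude at most 0.1,
# but only soft thresholding also shrinks the entries it keeps.
soft_est = soft_threshold(noisy, lam=0.1)
hard_est = hard_threshold(noisy, lam=0.005)  # sqrt(2 * 0.005) = 0.1

print("soft (lasso) Frobenius loss:", frobenius_loss(soft_est, truth))
print("hard (L0)    Frobenius loss:", frobenius_loss(hard_est, truth))
```

In this toy case both rules recover the same sparsity pattern, but the soft‐thresholded estimate has the larger Frobenius loss because its nonzero entries are shrunk. This is the kind of bias the abstract attributes to lasso‐type penalties; it says nothing about how scalnetL0 actually optimizes its objective.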

Updated: 2020-10-21
