Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics


Cell Systems (IF 9.0), Pub Date: 2020-07-24, DOI: 10.1016/j.cels.2020.06.013
Mi Yang ₁, Francesca Petralia ₂, Zhi Li ₃, Hongyang Li ₄, Weiping Ma ₂, Xiaoyu Song ₅, Sunkyu Kim ₆, Heewon Lee ₆, Han Yu ₇, Bora Lee ₈, Seohui Bae ₉, Eunji Heo ₁₀, Jan Kaczmarczyk ₁₁, Piotr Stępniak ₁₁, Michał Warchoł ₁₁, Thomas Yu ₁₂, Anna P Calinawan ₂, Paul C Boutros ₁₃, Samuel H Payne ₁₄, Boris Reva ₂, Emily Boja ₁₅, Henry Rodriguez ₁₅, Gustavo Stolovitzky ₁₆, Yuanfang Guan ₄, Jaewoo Kang ₆, Pei Wang ₂, David Fenyö ₃, Julio Saez-Rodriguez ₁₇

Affiliations

Faculty of Biosciences, Heidelberg University, 69120 Heidelberg, Germany; Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany.
Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA.
Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Department of Computer Science and Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea.
Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263, USA.
Deargen, Daejeon 34051, Republic of Korea.
Deargen, Daejeon 34051, Republic of Korea; Department of Biological Science, Department of Bio-Brain Engineering, KAIST, Daejeon, Republic of Korea.
Deargen, Daejeon 34051, Republic of Korea; Department of AI, KAIST, Daejeon 34141, Republic of Korea.
Ardigen, Kraków 30-394, Poland.
Sage Bionetworks, Seattle, WA 98121, USA.
Ontario Institute of Cancer Research, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada; Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada; Department of Human Genetics, University of California, Los Angeles, CA 90095, USA; Department of Urology, University of California, Los Angeles, CA 90095, USA; Institute for Precision Health, University of California, Los Angeles, CA, USA; Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA 90095, USA.
Department of Biology, Brigham Young University, Provo, UT 84604, USA.
Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA.
IBM Research, IBM Thomas J Watson Research Center, Yorktown Heights, NY 10598, USA.
Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany; European Molecular Biology Laboratory-The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Institute for Computational Biomedicine, Heidelberg University Hospital and Heidelberg University, Faculty of Medicine, Bioquant Heidelberg, Heidelberg 69120, Germany.

Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics.

A record of this paper’s transparent peer review process is included in the Supplemental Information.
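
To make the idea of "predictability" concrete, the following is a minimal sketch of the simplest kind of predictor the abstract mentions: each protein level is predicted from the transcript level of its own gene, and the prediction is scored on held-out samples. It is not the challenge's scoring code or any participant's model. The data are synthetic, the per-protein Pearson correlation is an assumed evaluation metric, and all function and variable names (`fit_per_gene_linear`, `pearson_per_column`, `simulate`, `coupling`) are hypothetical.

```python
import numpy as np


def fit_per_gene_linear(rna_train, prot_train):
    """Fit protein ~ slope * rna + intercept independently for each gene (column)."""
    rna_mean = rna_train.mean(axis=0)
    prot_mean = prot_train.mean(axis=0)
    rna_centered = rna_train - rna_mean
    cov = (rna_centered * (prot_train - prot_mean)).sum(axis=0)
    var = (rna_centered ** 2).sum(axis=0)
    slope = cov / var
    intercept = prot_mean - slope * rna_mean
    return slope, intercept


def pearson_per_column(a, b):
    """Pearson correlation between matching columns of two sample-by-gene arrays."""
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)
    num = (a_c * b_c).sum(axis=0)
    den = np.sqrt((a_c ** 2).sum(axis=0) * (b_c ** 2).sum(axis=0))
    return num / den


def simulate(n_samples, coupling, rng):
    """Synthetic data only: protein = coupling * rna + unit-variance noise."""
    rna = rng.normal(size=(n_samples, coupling.size))
    prot = coupling * rna + rng.normal(size=rna.shape)
    return rna, prot


rng = np.random.default_rng(0)
# Assumed mRNA-to-protein coupling per gene; 0.0 mimics a protein whose level
# is set entirely by processes the transcript does not capture.
coupling = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
rna_train, prot_train = simulate(200, coupling, rng)
rna_test, prot_test = simulate(200, coupling, rng)

slope, intercept = fit_per_gene_linear(rna_train, prot_train)
pred_test = slope * rna_test + intercept
scores = pearson_per_column(pred_test, prot_test)

for c, r in zip(coupling, scores):
    print(f"coupling={c:.1f}  held-out Pearson r={r:.2f}")
```

In this toy setting, proteins with weak coupling to their transcript score near zero regardless of how the model is fitted, which mirrors the abstract's point that modest performance may reflect regulation through translational control and degradation rather than a weakness of the method.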


Updated: 2020-07-24
