Development of a Gene Expression Panel, for the Prediction of Protein Abundances in Cancer Cell Lines,Current Bioinformatics

当前位置： X-MOL 学术 › Curr. Bioinform. › 论文详情

Our official English website, www.x-mol.net, welcomes your feedback! (Note: you will need to create a separate account there.)

Development of a Gene Expression Panel, for the Prediction of Protein Abundances in Cancer Cell Lines
Current Bioinformatics ( IF 2.4 ) Pub Date : 2021-06-30 , DOI: 10.2174/1574893616666210517162530
Gunhee Lee ₁ , Yeun-Jun Chung ₂ , Minho Lee ₃

Affiliation

Background: Due to the ease of quantifying mRNA expression in comparison with that of protein abundances, many studies have utilized it to infer protein product quantification. However, the mRNA expression values for a gene and its protein products are not known to have a strong relationship, because of the complex mechanisms required to regulate the amounts of protein levels, from translation to post-translational modifications.

Methods: We have developed, in this study, models to predict protein levels from mRNA expression levels using the transcriptome and reverse phase protein arrays (RPPA)-based on protein levels in pancancer cell lines. When predicting the abundance of a protein expression, in addition to using RNA expression of the corresponding gene, we also used RNA expression levels of a particular set of other genes. By applying support vector regression, we have identified a 47-gene expression panel that contributes to the improved performance of the prediction, and its optimal subsets specific to each protein species.

Result and Conclusion: Eventually, our final prediction models doubled the number of predictable protein expressions (r > 0.7). Due to the weaknesses of RPPA, our model had some limitations, however, we expect that these prediction models and the panel can be widely used in the future to infer protein abundances.

中文翻译：

开发基因表达面板，用于预测癌细胞系中的蛋白质丰度

背景：由于与蛋白质丰度相比，量化 mRNA 表达更容易，许多研究已经利用它来推断蛋白质产物的量化。然而，基因及其蛋白质产物的 mRNA 表达值之间的关系尚不清楚，因为调节蛋白质水平的数量需要复杂的机制，从翻译到翻译后修饰。

方法：在本研究中，我们开发了基于泛癌细胞系中蛋白质水平的模型，使用转录组和反相蛋白质阵列 (RPPA) 从 mRNA 表达水平预测蛋白质水平。在预测蛋白质表达的丰度时，除了使用相应基因的 RNA 表达外，我们还使用了一组特定其他基因的 RNA 表达水平。通过应用支持向量回归，我们确定了一个 47 基因表达面板，有助于提高预测性能，以及每个蛋白质物种特有的最佳子集。

结果和结论：最终，我们的最终预测模型使可预测的蛋白质表达数量增加了一倍（r > 0.7）。由于 RPPA 的弱点，我们的模型有一些局限性，但是，我们希望这些预测模型和面板在未来可以广泛用于推断蛋白质丰度。

更新日期：2021-06-30

点击分享查看原文

点击收藏

阅读更多本刊最新论文本刊介绍/投稿指南11