An integrative proteogenomics approach reveals peptides encoded by annotated lincRNA in the mouse kidney inner medulla.
Physiological Genomics (IF 2.5), Pub Date: 2020-08-31, DOI: 10.1152/physiolgenomics.00048.2020
Cameron T Flower, Lihe Chen, Hyun Jun Jung, Viswanathan Raghuram, Mark A Knepper, Chin-Rang Yang

Long noncoding RNAs (lncRNAs) are intracellular transcripts longer than 200 nucleotides that lack the capacity to encode protein. A subclass of lncRNAs known as long intergenic noncoding RNAs (lincRNAs) is transcribed from genomic regions that share no overlap with annotated protein-coding genes. Increasing evidence has shown that some annotated lincRNA transcripts do in fact contain open reading frames (ORFs) encoding functional short peptides in the cell. Few robust methods for lincRNA-encoded peptide identification have been reported, and the tissue-specific expression of these peptides has been largely unexplored. Here we propose an integrative workflow for lincRNA-encoded peptide discovery and test it on the mouse kidney inner medulla (IM). In brief, low-molecular-weight protein fractions were enriched from IM homogenate and trypsinized into shorter peptides, which were characterized using high-resolution liquid chromatography-tandem mass spectrometry (LC-MS/MS). A key challenge is curating a hypothetical lincRNA-encoded peptide database for peptide-spectrum matching following LC-MS/MS. We performed RNA-Seq on IM, computationally filtered out reads overlapping with annotated protein-coding genes, and re-mapped the remaining reads to a database of mouse noncoding transcripts. The mapped transcripts are likely to be lincRNAs and were further searched for ORFs using an existing rule-based algorithm to supply candidates for peptide-spectrum matching. Peptides identified by LC-MS/MS were further evaluated using several quality control criteria and bioinformatics methods. We discovered three novel lincRNA-encoded peptides, which are conserved in mouse, rat, and human. The workflow can be adapted for discovery of small protein-coding genes in any species or tissue where noncoding transcriptome information is available.
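
The database-building step described above (searching candidate transcripts for ORFs, then digesting the predicted peptides with trypsin) can be illustrated with a minimal Python sketch. The abstract does not name the rule-based ORF algorithm, so the rule used here (first in-frame ATG to the next in-frame stop codon, forward strand only) is an assumption for illustration, as are the function names `find_orfs` and `tryptic_digest`, the `min_aa` threshold, and the toy transcript. It is not the authors' implementation.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}


def find_orfs(seq, min_aa=10):
    """Return peptides for ORFs in the three forward frames.

    Assumed rule: an ORF starts at the first in-frame ATG and ends at the
    next in-frame stop codon. ORFs with no stop codon are discarded.
    """
    seq = seq.upper().replace("U", "T")
    peptides = []
    for frame in range(3):
        current = None  # amino acids of the ORF being read, if any
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X = ambiguous codon
            if current is None:
                if aa == "M":
                    current = ["M"]
            elif aa == "*":
                if len(current) >= min_aa:
                    peptides.append("".join(current))
                current = None
            else:
                current.append(aa)
    return peptides


def tryptic_digest(protein):
    """In-silico trypsin digest: cleave after K or R unless followed by P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


# Toy transcript (hypothetical, not from the study).
transcript = "CC" + "ATGGCTAAAGGTCGTCCTTTATAA" + "G"
for peptide in find_orfs(transcript, min_aa=5):
    print(peptide, tryptic_digest(peptide))
```

In a real search, the digested fragments of every predicted ORF would form the theoretical peptide set that a peptide-spectrum matching tool compares against the measured MS/MS spectra; missed cleavages and modifications, which such tools normally handle, are omitted here.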

Updated: 2020-09-01