Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2.
PeerJ Computer Science (IF 3.8), Pub Date: 2016-02-16, DOI: 10.7717/peerj-cs.33
Elisha D O Roberson

CRISPR/Cas9 is emerging as one of the most widely used methods of genome modification in organisms ranging from bacteria to human cells. However, editing efficiency varies tremendously from site to site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. The authors of that report also highlighted that previously published gRNAs with high editing efficiency had this motif. I designed a Python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof of concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the tool's utility in non-model organisms. I identified more than 60 million single-match 3'GG motifs in these genomes. More than 61% of all protein-coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein-coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and ngg2 makes it possible to identify 3'GG editing sites in any species with an available genome sequence.
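
The kind of scan described above can be illustrated with a minimal Python sketch. It is not ngg2's implementation, and the abstract does not describe the tool's options or output format. The sketch assumes the 3'GG motif means a 20-nt target whose last two bases are GG, immediately followed by an NGG PAM (23 nt in total), and it searches both strands of a plain string rather than an indexed FASTA file. The function names and the tuple layout are hypothetical.

```python
import re

# Assumption: a 3'GG site is 18 arbitrary bases + GG (the 3' end of the
# 20-nt target), followed by an NGG PAM. The lookahead makes the regex
# report overlapping sites instead of skipping past each match.
SENSE = re.compile(r"(?=([ACGT]{18}GG[ACGT]GG))")
# Reverse complement of the same pattern, for sites on the minus strand.
ANTISENSE = re.compile(r"(?=(CC[ACGT]CC[ACGT]{18}))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_3gg_sites(seq):
    """Return sorted (start, end, strand, site) tuples.

    Coordinates are 0-based, half-open, on the plus strand.
    `site` is always written 5'->3' on its own strand.
    """
    seq = seq.upper()
    sites = []
    for match in SENSE.finditer(seq):
        sites.append((match.start(1), match.end(1), "+", match.group(1)))
    for match in ANTISENSE.finditer(seq):
        site = reverse_complement(match.group(1))
        sites.append((match.start(1), match.end(1), "-", site))
    return sorted(sites)


if __name__ == "__main__":
    example = "A" * 18 + "GG" + "TGG"  # one plus-strand site, made up for illustration
    for start, end, strand, site in find_3gg_sites(example):
        print(start, end, strand, site)
```

Bases other than A, C, G, and T (for example N) never match, so masked or ambiguous regions are skipped. Deciding whether a site is unique genome-wide, as the counts in the abstract require, would need a second pass over all reported sites and is left out here.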

Updated: 2019-11-01