Variant Library Annotation Tool (VaLiAnT): an oligonucleotide library design and annotation tool for Saturation Genome Editing and other Deep Mutational Scanning experiments
bioRxiv - Genomics Pub Date : 2021-01-19 , DOI: 10.1101/2021.01.19.427318
Luca Barbon , Victoria Offord , Elizabeth J. Radford , Adam P. Butler , Sebastian S. Gerety , David J. Adams , Matthew E. Hurles , Hong Kee Tan , Andrew J. Waters

Motivation: Recent advances in CRISPR/Cas9 technology allow for the functional analysis of genetic variants at single-nucleotide resolution whilst maintaining genomic context (Findlay et al., 2018). This approach, known as saturation genome editing (SGE), is a distinct type of deep mutational scanning (DMS) that systematically alters each position in a target region to explore its function. SGE experiments require the design and synthesis of oligonucleotide variant libraries, which are introduced into the genome by homology-directed repair (HDR). This technology is broadly applicable to diverse research fields such as disease variant identification, drug development, structure-function studies, synthetic biology, evolutionary genetics and the study of host-pathogen interactions. Here we present the Variant Library Annotation Tool (VaLiAnT), which can be used to generate saturation mutagenesis oligonucleotide libraries from user-defined genomic coordinates and standardised input files. This software package is intentionally versatile: the species, genomic reference sequences and transcriptomic annotations are all specified by the user. Genomic ranges, directionality and frame information are considered to allow perturbations at both the nucleotide and amino acid level.

Results: Coordinates for a genomic range, which may include exonic and/or intronic sequence, are provided by the user in order to retrieve a corresponding oligonucleotide reference sequence. A user-specified range within this sequence is then subjected to systematic nucleotide and/or amino acid saturating mutator functions, with each discrete mutation returned to the user as a separate sequence, building up the final oligo library. If desired, variant accessions from genetic information repositories, such as ClinVar and gnomAD, that fall within the user-specified ranges will also be incorporated into the library.
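The nucleotide-level saturation step described above can be illustrated with a minimal Python sketch. This is not VaLiAnT's code or API: the function name `snv_library`, the 0-based half-open `start`/`end` convention and the tuple layout are assumptions made for illustration only. It shows one kind of mutator (every single-nucleotide substitution in a target range), with each mutation returned as a separate full-length sequence.

```python
from typing import Iterator, Tuple

NUCLEOTIDES = "ACGT"


def snv_library(
    ref_seq: str, start: int, end: int
) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref_base, alt_base, variant_sequence) for every
    single-nucleotide substitution in ref_seq[start:end].

    Hypothetical helper: positions are 0-based and the range is half-open,
    which is an assumption of this sketch, not a VaLiAnT convention.
    """
    ref_seq = ref_seq.upper()
    if not 0 <= start <= end <= len(ref_seq):
        raise ValueError("target range must lie within the reference sequence")
    for pos in range(start, end):
        ref_base = ref_seq[pos]
        for alt_base in NUCLEOTIDES:
            if alt_base == ref_base:
                continue  # skip the reference allele itself
            # Sequence outside the mutated position is left unchanged.
            variant = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
            yield pos, ref_base, alt_base, variant


if __name__ == "__main__":
    for record in snv_library("ACGTAC", 2, 4):
        print(record)
```

A target range of n bases with unambiguous reference bases yields 3n variants under this scheme; amino acid level mutators would additionally need the frame and strand information mentioned in the abstract.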
For SGE library generation, base reference sequences can be modified to include PAM (Protospacer Adjacent Motif) and protospacer protection edits that prevent Cas9 from cutting incorporated oligonucleotide tracts. Mutator functions modify this protected reference sequence to generate variant sequences. Constant regions are designated for non-editing to allow specific adapter annealing for downstream cloning and amplification from the library pool. A metadata file is generated, delineating annotation information for each variant sequence to aid computational analysis. In addition, a library file is generated, which contains unique sequences (any exact duplicate sequences are removed) ready for submission to commercial synthesis platforms. A VCF file listing all variants is also generated for analysis and quality control processes. The VaLiAnT software package provides a novel means to systematically retrieve, mutate and annotate genomic sequences for oligonucleotide library generation. Specific features for SGE library generation can be employed, with other diverse applications possible.

Availability and Implementation: VaLiAnT is a command line tool written in Python. Source code, testing data, example library input and output files, and executables are available at https://github.com/cancerit/VaLiAnT. A user manual with step-by-step instructions for software use is available at https://github.com/cancerit/VaLiAnT/wiki. The software is freely available for non-commercial use (see Licence for more details, https://github.com/cancerit/VaLiAnT/blob/develop/LICENSE).
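The relationship between the metadata file and the de-duplicated library file described in the Results can be sketched as follows. This is a simplified illustration under stated assumptions, not VaLiAnT's implementation or file format: the function `build_library`, the variant names and the adapter sequences are all hypothetical. Every variant keeps its own metadata row, while the synthesis-ready list holds each exact oligonucleotide sequence only once.

```python
from typing import Dict, List, Tuple


def build_library(
    variants: List[Tuple[str, str]],
    adapter_5: str,
    adapter_3: str,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (metadata_rows, unique_oligos).

    variants: (variant_name, mutated_target_sequence) pairs.
    adapter_5 / adapter_3: constant flanking sequences that are never
    mutated (hypothetical values supplied by the caller).
    """
    metadata_rows: List[Tuple[str, str]] = []
    first_seen: Dict[str, str] = {}  # oligo -> first variant name producing it
    for name, target_seq in variants:
        oligo = adapter_5 + target_seq + adapter_3
        metadata_rows.append((name, oligo))  # one row per variant, duplicates kept
        first_seen.setdefault(oligo, name)   # exact duplicates collapse here
    # dict preserves insertion order, so oligos stay in first-seen order
    return metadata_rows, list(first_seen)


if __name__ == "__main__":
    rows, oligos = build_library(
        [("snv_1", "ACGT"), ("aa_1", "ACGT"), ("snv_2", "ACTT")],
        adapter_5="GG",
        adapter_3="CC",
    )
    print(rows)
    print(oligos)
```

Keeping the two outputs separate means that two distinct annotations producing the same sequence (for example, a nucleotide-level and an amino acid-level mutation) remain traceable in the metadata even though only one oligonucleotide is synthesised.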

Updated: 2021-01-20