DNA sequence, physics, and promoter function: Analysis of high-throughput data on T7 promoter variants activity
Journal of Bioinformatics and Computational Biology ( IF 1 ) Pub Date : 2020-01-31 , DOI: 10.1142/s0219720020400016
Mikhail A Orlov 1 , Anatoly A Sorokin 1

RNA polymerase/promoter recognition is a basic problem of molecular biology. Despite decades of effort in the area, certain challenges persist. Choosing the most suitable model systems is pivotal for this research. The system of bacteriophage T7 RNA polymerase and the native T7 promoter is an exceptional example. It is the most thoroughly studied system of this kind and has been successfully applied in biotechnology and bioengineering, owing to both the structural simplicity and the high specificity of this molecular pair. Despite the highly similar sequences of the distinct native T7 promoters, T7 RNA polymerase binds each promoter in a highly specific and adjustable manner. One explanation is that the process relies primarily on DNA physical properties rather than on nucleotide sequence. Here, we address the issue by analyzing the large dataset recently published by Komura and colleagues. That original study employed next-generation sequencing (NGS) to quantify the activity of promoter variants, including ones with multiple substitutions. Our analysis revealed a substantial bias in the co-occurrence of single-nucleotide substitutions: the highest rate of co-occurrence was observed within the specificity loop of the binding region, and the lowest in the initiation region of the promoter. When both the location and the identity of the nucleotides involved in a replacement (initial and resulting) are taken into account, N-to-A substitutions are the most preferred across the whole 19-bp sequence. At the same time, N-to-C substitutions are tolerated only at a crucial position in the recognition loop of the binding region, and N-to-G substitutions are uniformly the least tolerated. The complete set of variants was then split into groups with mutations (1) exclusively in the binding region; (2) exclusively in the melting region; (3) in both regions.
Of these three groups, the second comprises extremely few variants (fewer by up to a triple-digit factor than the other two groups: 46 versus over one thousand and over six thousand). Yet all of them are promoters with substantial to high activity. Group two proved heterogeneous in primary sequence; indeed, upon further subdivision into above-average and below-average activity subgroups, the first was found to comprise promoters with negligible conservation at the −2 position of the melting region, while the second was hardly conserved in this region at all. When we consider the perfect consensus sequence of the class III T7 promoter with only the −2 nucleotide randomized (all four variants are present in one to several copies in the previously published source dataset), the picture becomes even more pronounced. We therefore suggest that mutations at this position do not cause significant changes in promoter activity. At the same time, such modifications dramatically change the DNA physical properties calculated in our study (namely, electrostatic potential and propensity to bend). One possible suggestion is that the −2 nucleotide functions as a generic switch; if so, the −2A to −2T substitution has important regulatory consequences. Notably, the −2 base pair is the most evident difference between class II and class III promoters of the T7 genome, and it also distinguishes the class III promoters of the T7 genome from the promoters of the related but reproductively isolated bacteriophage T3. In other words, it appears feasible that a mutation at the −2 nucleotide does not impede promoter activity yet alters the promoter's physical properties, thus affecting differential RNA polymerase/promoter interaction.
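
The grouping and substitution tallies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The 19-bp window (taken here as positions -17 to +2 of the class III consensus), the region boundaries, the "other" label for substitutions outside both regions, and all function names are assumptions made only for illustration.

```python
from collections import Counter

# Class III T7 promoter consensus; treating the 19-bp window as -17..+2
# is an assumption, not something stated in the abstract.
CONSENSUS = "TAATACGACTCACTATAGG"
FIRST_POS = -17  # coordinate of CONSENSUS[0]; promoter numbering has no 0

# Hypothetical region boundaries, chosen only for illustration.
BINDING = range(-17, -4)  # positions -17..-5
MELTING = range(-4, 0)    # positions -4..-1


def index_to_pos(i):
    """Map a 0-based string index to a promoter coordinate, skipping 0."""
    pos = FIRST_POS + i
    return pos if pos < 0 else pos + 1


def substitutions(variant, consensus=CONSENSUS):
    """Return (position, original, replacement) for every mismatch."""
    if len(variant) != len(consensus):
        raise ValueError("variant and consensus must have equal length")
    return [
        (index_to_pos(i), ref, alt)
        for i, (ref, alt) in enumerate(zip(consensus, variant))
        if ref != alt
    ]


def classify(variant):
    """Group a variant by the regions in which its substitutions fall."""
    positions = [pos for pos, _, _ in substitutions(variant)]
    if not positions:
        return "consensus"
    if any(p not in BINDING and p not in MELTING for p in positions):
        return "other"  # label added for this sketch only
    in_binding = any(p in BINDING for p in positions)
    in_melting = any(p in MELTING for p in positions)
    if in_binding and in_melting:
        return "both"
    return "binding only" if in_binding else "melting only"


def substitution_counts(variants):
    """Tally (position, replacement) pairs across many variants."""
    counts = Counter()
    for variant in variants:
        for pos, _, alt in substitutions(variant):
            counts[(pos, alt)] += 1
    return counts
```

For instance, `classify(CONSENSUS[:15] + "A" + CONSENSUS[16:])` places a variant whose only change is at position −2 in the "melting only" group under these assumed boundaries.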

Updated: 2020-01-31