Machine learning-based promoter strength prediction derived from a fine-tuned synthetic promoter library in Escherichia coli
bioRxiv - Synthetic Biology. Pub Date: 2020-07-01, DOI: 10.1101/2020.06.25.170365
Mei Zhao, Shenghu Zhou, Longtao Wu, Yu Deng

Promoters are among the most critical regulatory elements controlling metabolic pathways. In recent years, however, researchers have focused on optimizing promoter strength while overlooking the relationship between a promoter's internal sequence and its strength. To address this, we constructed and characterized a mutant library of the Ptrc promoter through dozens of mutation-construction-screening-characterization engineering cycles. After excluding invalid mutation sites, we established a synthetic promoter library of 3665 distinct variants whose strengths span more than two orders of magnitude. The strongest variant was 1.52-fold stronger than a PT7 promoter induced with 1 mM isopropyl-β-D-thiogalactoside. The library exhibited superior applicability when expressing different reporters, both on plasmids and in the genome. We built and optimized several machine learning models to explore the relationship between promoter sequence and transcriptional strength. The XGBoost model performed best, and we used it to precisely predict the strength of artificially designed promoter sequences. Our work provides a powerful platform for the predictable tuning of promoters to achieve optimal transcriptional strength.

Updated: 2020-07-02