Extracting ‘legacy loci’ from an invertebrate sequence capture data set
Zoologica Scripta (IF 2.3), Pub Date: 2021-09-17, DOI: 10.1111/zsc.12513
Caroline D. Miller 1, Michael Forthman 1,2, Christine W. Miller 1, Rebecca T. Kimball 3

Sequence capture studies result in rich data sets comprising hundreds to thousands of targeted genomic regions that are superseding Sanger-based data sets comprised of a few well-known loci with historical uses in phylogenetics (‘legacy loci’). However, integrating sequence capture and Sanger-based data sets is of interest as legacy loci can include different types of loci (e.g. mitochondrial and nuclear) across a potentially larger sample of species from past studies. Sequence capture data sets include nontargeted sequences, and there has been recent interest in extracting legacy loci from invertebrate data sets. Here, we use published legacy data from leaf-footed bugs (Hemiptera: Coreoidea) to recover 15 mitochondrial and seven nuclear legacy loci from off-target sequences in a sequence capture data set, explore approaches to improve legacy locus recovery, and combine these loci with sequence capture data for phylogenetic analysis. Two nuclear loci were determined to already be targeted by sequence capture baits. Most of the remaining loci were successfully recovered from off-target sequences, but this recovery varied greatly. Additionally, complementing complete mitogenomes with additional reference mitochondrial sequences from a genetic depository did not offer improvement for most of our taxa; however, supplementing these reference sequences with extracted legacy loci offered ≥6% improvement across taxa for a given mitochondrial locus (negligible improvement for nuclear loci). Phylogenetic analysis of legacy and sequence capture data produced a topology generally congruent with recent studies, but support was lower. Thus, future studies may employ the approaches used in this study to integrate legacy data with newly generated sequence capture data sets without added expenses.

Updated: 2021-09-17