Variation Patterns of NLR Clusters in Arabidopsis thaliana Genomes
Plant Communications (IF 10.5), Pub Date: 2020-06-24, DOI: 10.1016/j.xplc.2020.100089
Rachelle R Q Lee, Eunyoung Chae

The nucleotide-binding domain and leucine-rich repeat (NLR) gene family is highly expanded in the plant lineage with extensive sequence and structure polymorphisms. To survey the landscape of NLR expansion, we mined the published long-read data generated by the resistance gene enrichment sequencing of 64 diverse Arabidopsis thaliana accessions. We found that the hot spots of massive multi-gene NLR cluster expansion did not typically span the whole cluster; instead, they were restricted to a handful of, or only one, dominant radiation(s). All sequences in such a radiation were distinct from other genes in the cluster but not from each other in the clade, making it difficult to assign trustworthy reference-based orthologies when multiple reference genes were present in the radiation. Consequently, NLR genes can be broadly divided into two types: radiating or high-fidelity, where high-fidelity genes are well conserved and well separated from other clades. A similar distinction could be made for NLR clusters, depending on whether cluster size was determined primarily by extensive radiation or the presence of numerous high-fidelity genes. We also identified groups of well-conserved NLR clades that were missing from the Columbia-0 reference genome. This suggests that the classification of NLRs using gene IDs from a single reference accession can rarely capture all major paralogs in a cluster accurately and representatively and that a reference-agnostic perspective is required to properly characterize these additional variations. Finally, we present a quantitative visualization method for differentiating these situations in a given clade of interest.




Updated: 2020-06-24