Unsupervised detection of ancestry tracks with the GHap R package
Methods in Ecology and Evolution ( IF 6.6 ) Pub Date : 2020-08-13 , DOI: 10.1111/2041-210x.13467
Yuri Tani Utsunomiya 1, 2 , Marco Milanesi 1, 2 , Mario Barbato 3 , Adam Taiti Harth Utsunomiya 1, 2 , Johann Sölkner 4 , Paolo Ajmone‐Marsan 3 , José Fernando Garcia 1, 2, 5

  1. The identification of ancestry tracks is a powerful tool to assist the inference of evolutionary events in the genomes of animals and plants. However, algorithms for ancestry track detection typically require labelled reference population data. This dependency prevents the analysis of genomic data lacking prior information on genetic structure, and may produce classification bias when samples in the reference data are inadvertently admixed.
  2. We combined heuristics with K‐means clustering to deploy a method that can detect ancestry tracks without the provision of lineage labels for reference population data. The resulting algorithm uses phased genotypes to infer individual ancestry proportions and local ancestry. By piling up ancestry tracks across individuals, our method also allows for mapping loci with excess or deficit ancestry from specific lineages.
  3. Using both simulated and real genomic data, we found that the proposed method was accurate in inferring genetic structure, assigning chromosomal segments to lineages and estimating individual ancestry, especially in cases where ancestry tracks resulted from recent admixture of highly divergent lineages.
  4. The method is implemented as part of the v2 release of the GHap R package (available at https://cran.r-project.org/package=GHap and https://bitbucket.org/marcomilanesi/ghap/src/master/).
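
The idea summarised in points 2 and 3 can be illustrated with a small toy sketch. This is not the GHap implementation and does not use its API: the function names (`kmeans`, `local_ancestry`, `ancestry_proportions`, `pileup`), the window size, the farthest-point initialisation and the noise-free toy haplotypes are all assumptions made for illustration. Phased haplotypes are clustered with K-means without lineage labels, each chromosomal window of each haplotype is assigned to the nearest cluster prototype, and the calls are then summarised per haplotype (ancestry proportions) and per window (pile-up across individuals).

```python
import numpy as np

def kmeans(X, k, n_iter=20):
    """Lloyd's K-means with a deterministic farthest-point start (illustrative choice)."""
    X = np.asarray(X, dtype=float)
    centroids = [X[0]]
    for _ in range(1, k):
        # squared distance of every haplotype to its nearest chosen centroid
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids  # one prototype (mean allele vector) per cluster

def local_ancestry(haps, centroids, window):
    """Assign each window of each haplotype to the nearest cluster prototype."""
    haps = np.asarray(haps, dtype=float)
    calls = []
    for start in range(0, haps.shape[1], window):
        seg = haps[:, start:start + window]
        proto = centroids[:, start:start + window]
        dist = ((seg[:, None, :] - proto[None, :, :]) ** 2).sum(axis=2)
        calls.append(dist.argmin(axis=1))
    return np.stack(calls, axis=1)  # shape: haplotypes x windows

def ancestry_proportions(calls, k):
    """Fraction of windows assigned to each cluster, per haplotype."""
    return np.stack([(calls == j).mean(axis=1) for j in range(k)], axis=1)

def pileup(calls, k):
    """Fraction of haplotypes assigned to each cluster, per window."""
    return np.stack([(calls == j).mean(axis=0) for j in range(k)], axis=1)

# Toy phased data (hypothetical): 3 haplotypes of one lineage (all 0 alleles),
# 3 of a highly divergent lineage (all 1 alleles), and 1 recently admixed
# haplotype carrying a segment of the second lineage at its last 4 markers.
haps = np.array([[0] * 12] * 3 + [[1] * 12] * 3 + [[0] * 8 + [1] * 4])

centroids = kmeans(haps, k=2)
calls = local_ancestry(haps, centroids, window=4)
props = ancestry_proportions(calls, k=2)
pile = pileup(calls, k=2)

print(calls[6])   # window-level calls for the admixed haplotype
print(props[6])   # its ancestry proportions
print(pile[:, 1]) # share of haplotypes called as cluster 1 in each window
```

In this toy case the admixed haplotype is called as cluster 0 in its first two windows and cluster 1 in the third, so its estimated proportions are 2/3 and 1/3. Cluster indices are arbitrary labels, as in any unsupervised setting. The clean result depends on the lineages being highly divergent and the admixture recent, which matches the conditions under which the abstract reports the best accuracy.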


Updated: 2020-08-13