Joint identification of sex and sex-linked scaffolds in non-model organisms using low depth sequencing data
bioRxiv - Evolutionary Biology. Pub Date: 2021-03-04, DOI: 10.1101/2021.03.03.433779
Casia Nursyifa, Anna Brüniche-Olsen, Genis Garcia Erill, Rasmus Heller, Anders Albrechtsen

Being able to assign sex to individuals and to identify autosomal and sex-linked scaffolds is essential in most population genomic analyses. Non-model organisms often have genome assemblies only at scaffold level and lack characterization of sex-linked scaffolds. Previous methods to identify sex and sex-linked scaffolds have relied on, e.g., sequence similarity between the non-model organism and a closely related species, or on prior knowledge about the sex of the samples. In the latter case, the difference in depth of coverage between the autosomes and the sex chromosomes is used. Here we present "Sex Assignment Through Coverage" (SATC), a method to identify sample sex and sex-linked scaffolds from next-generation sequencing (NGS) data. The method only requires a scaffold-level reference assembly and whole genome sequencing (WGS) data sampled from both sexes. We use the sequencing depth distribution across scaffolds to jointly identify: i) male and female individuals and ii) sex-linked scaffolds. This is achieved by projecting the scaffold depths into a low-dimensional space using principal component analysis (PCA), followed by Gaussian mixture clustering. We demonstrate the applicability of our method using data from five mammal species and a bird species complex. The method is open source and freely available at https://github.com/popgenDK/SATC
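
The idea of projecting scaffold depths with PCA and then clustering with a Gaussian mixture can be illustrated with a minimal Python sketch. This is not the SATC implementation (see the repository above for that); it is a toy version under stated assumptions. The function names, the choice of the first `n_ref` scaffolds as a normalization reference, the one-dimensional two-component mixture fitted on the first principal component, the `ratio_cutoff` threshold, and the simulated depth matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def normalize_depth(depth, n_ref=5):
    """Scale each sample by its mean depth over the first n_ref scaffolds
    (assumed here to be long, mostly autosomal reference scaffolds)."""
    depth = np.asarray(depth, dtype=float)
    return depth / depth[:, :n_ref].mean(axis=1, keepdims=True)

def pca_scores(x, n_components=1):
    """Project samples (rows) onto the leading principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def two_gaussian_em(values, n_iter=100):
    """Fit a 1-D two-component Gaussian mixture with EM; return hard labels."""
    values = np.asarray(values, dtype=float)[:, None]
    mu = np.array([values.min(), values.max()])
    sigma = np.full(2, values.std() + 1e-6)
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_dens = np.log(weight) - np.log(sigma) - 0.5 * ((values - mu) / sigma) ** 2
        log_dens -= log_dens.max(axis=1, keepdims=True)
        resp = np.exp(log_dens)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update means, standard deviations, and mixing weights
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * values).sum(axis=0) / nk
        sigma = np.sqrt((resp * (values - mu) ** 2).sum(axis=0) / nk) + 1e-6
        weight = nk / nk.sum()
    return resp.argmax(axis=1)

def assign_sex(depth, n_ref=5, ratio_cutoff=0.75):
    """Cluster samples into two groups and flag scaffolds whose mean
    normalized depth differs strongly between the groups."""
    norm = normalize_depth(depth, n_ref)
    labels = two_gaussian_em(pca_scores(norm, 1)[:, 0])
    mean0 = norm[labels == 0].mean(axis=0)
    mean1 = norm[labels == 1].mean(axis=0)
    ratio = np.minimum(mean0, mean1) / np.maximum(mean0, mean1)
    sex_linked = np.flatnonzero(ratio < ratio_cutoff)
    # The cluster with lower depth on the flagged scaffolds is treated as
    # the heterogametic sex (e.g. XY in mammals, ZW in birds).
    hetero_cluster = 0 if mean0[sex_linked].sum() < mean1[sex_linked].sum() else 1
    return labels == hetero_cluster, sex_linked

# Simulated example: 20 samples, 30 scaffolds, the last three sex-linked.
rng = np.random.default_rng(0)
is_hetero = np.repeat([True, False], 10)
depth = rng.uniform(2, 8, size=(20, 1)) * rng.normal(1.0, 0.05, size=(20, 30))
depth[np.ix_(is_hetero, [27, 28, 29])] *= 0.5  # single copy in heterogametic sex

heterogametic, sex_linked = assign_sex(depth)
print("heterogametic samples:", np.flatnonzero(heterogametic))
print("candidate sex-linked scaffolds:", sex_linked)
```

The sketch relies on the expectation that a scaffold present in one copy in the heterogametic sex shows roughly half the normalized depth there, so the sexes separate along the first principal component. Real data would also need filtering of short or abnormal scaffolds and a statistical test in place of the fixed ratio cutoff.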

Updated: 2021-03-05