Using Hidden Markov Modeling for Biogeographical Ancestry Analysis
Journal of Humanistic Mathematics (IF 0.3), Pub Date: 2019-07-01, DOI: 10.5642/jhummath.201902.06
Melvin Currie

This paper describes a methodology for analyzing X chromosome data to establish biogeographical contributions to the author’s X chromosome. We present an exposition of how Hidden Markov Modeling (HMM) can be used as a black box for ancestry analysis and focus on a set of conditions that are not universal but fairly common. The first condition is that the ancestral populations are drawn from regions that have had very little or no contact with each other since prehistoric times. The second condition is that the number of possible ancestral populations is small. In this analysis, we assume that the ancestral populations are Native North American, Northwestern European, and West African. We compare the result of our analysis with the analyses carried out by the companies 23andMe and deCODEme for the same data. Finally, we point to a mechanism for reducing noise by adjusting the data before applying HMM.

This paper, describing the author’s analysis of his X chromosome, is the result of a marriage between two spheres. The author is a mathematician and an avid genealogist. His formal education is in pure mathematics: he wrote a PhD dissertation in that domain and then spent a period in academia conducting related research. However, he spent the last 25 years of his career before retirement applying mathematics to cryptanalysis and cryptographic design at the National Security Agency. The year before his retirement he wrote an in-depth paper on Hidden Markov Modeling that covered in gory detail, with all the derivations and proofs, everything from the alpha-pass to the Baum-Welch convergence.¹

¹ This was an internal NSA paper but is available upon request from the author.

Journal of Humanistic Mathematics, Volume 9, Number 2 (July 2019)
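For readers who want a feel for what the HMM "black box" might be doing, the following is a minimal sketch and not the author's implementation. It uses Viterbi decoding (a standard HMM technique that the abstract does not itself name) to assign each marker along a chromosome to one of the three ancestral populations. The function name `viterbi`, the allele frequencies, and the transition probabilities are all made-up assumptions chosen only for illustration.

```python
import math

# Labels for the three ancestral populations named in the abstract.
STATES = ["Native North American", "Northwestern European", "West African"]


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path (as state indices).

    obs     : observed alleles (0 or 1), one per marker
    start_p : start_p[s] = probability of starting in state s
    trans_p : trans_p[r][s] = probability of moving from state r to state s
    emit_p  : emit_p[t][s] = probability that state s shows allele 1 at marker t
    All probabilities are assumed to lie strictly between 0 and 1.
    """
    n = len(start_p)

    def log_emit(t, s):
        p = emit_p[t][s]
        return math.log(p if obs[t] == 1 else 1.0 - p)

    # Best log-score of any path ending in each state at the first marker.
    score = [math.log(start_p[s]) + log_emit(0, s) for s in range(n)]
    back = []  # back[t-1][s] = best predecessor of state s at marker t

    for t in range(1, len(obs)):
        prev, score, pointers = score, [], []
        for s in range(n):
            best_r = max(range(n), key=lambda r: prev[r] + math.log(trans_p[r][s]))
            pointers.append(best_r)
            score.append(prev[best_r] + math.log(trans_p[best_r][s]) + log_emit(t, s))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy data: six markers, all observed as allele 1.
obs = [1, 1, 1, 1, 1, 1]
start_p = [1 / 3, 1 / 3, 1 / 3]
# "Sticky" transitions: ancestry rarely changes between adjacent markers.
trans_p = [[0.90, 0.05, 0.05],
           [0.05, 0.90, 0.05],
           [0.05, 0.05, 0.90]]
# Invented allele-1 frequencies per marker for each population.
emit_p = [[0.9, 0.1, 0.1]] * 3 + [[0.1, 0.1, 0.9]] * 3

path = viterbi(obs, start_p, trans_p, emit_p)
for marker, s in enumerate(path):
    print(marker, STATES[s])
```

The sticky transition matrix reflects the idea that ancestry along a chromosome comes in contiguous segments; in a real analysis these parameters would have to be estimated from reference data rather than chosen by hand.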

Updated: 2019-07-01