A Tale of Two Processes
Systematic Biology (IF 6.1), Pub Date: 2005-12-01, DOI: 10.1080/10635150500234682
Peter Lockhart, Mike Steel

The “long-branch attraction” (LBA) phenomenon in phylogeny reconstruction is well cited, but its causes have been poorly characterized. In this article, we point out that different biological processes can lead to similar forms of long-branch attraction. That is, although sequences generated by different processes look similar “through the eyes of parsimony,” the ensemble of sequence site patterns (not just the parsimony sites) can distinguish between these processes.

In 1978, Felsenstein described an evolutionary scenario under which unequal amounts of change in nonadjacent lineages would mislead tree-building methods based on parsimony (or on uncorrected distances). Other authors have since shown that when substitution models are misspecified, maximum likelihood and distance-based methods can be similarly misled (e.g., Hillis et al., 1994; Lockhart et al., 1996; Bruno and Halpern, 1999; Swofford et al., 2001; Sullivan and Swofford, 2001; Ho and Jermiin, 2004). LBA problems may also arise because of sparse and/or unbalanced taxon sampling (Hendy and Penny, 1989; Holland et al., 2003; Lockhart and Penny, 2005) and/or because of lineage-specific differences in rates or processes of evolution (e.g., Hasegawa and Hashimoto, 1993; Steel et al., 1993, 2000).

Felsenstein (1978) assumed that whilst evolutionary rates varied across a tree, individual sites in sequences could be ascribed a rate of change that was the same at other sites in the same sequence. That is, if a lineage was fast (or slow) evolving, then the evolution of sites in a sequence belonging to that lineage was also fast (or slow). A site position that has evolved under this scenario can be seen as a special case of “heterotachy.” This is a property of individual sequence positions, and the term literally means “different speeds.” It is the concept of sequence evolution at a given site undergoing substitution at different rates in different parts of the tree (Lopez et al., 2002). Interestingly, Simon et al. (1996) have also described this phenomenon and referred to it as “mosaic evolution.”

It is important to note that variation in the substitution rate of a site throughout the tree is distinct from rate variation across sites (as modeled, for example, by a gamma distribution). In the latter case there is a site-specific substitution rate that varies randomly across the sites, but at any site it applies equally to all the branch lengths of the tree (the branch lengths at the site are all multiplied by the site-specific rate). Consequently, the ratio of substitution rates on two different branches is constant across sites in such models, even when one allows both rate variation in the tree (as in Felsenstein’s scenario) and an independent process of rate variation (e.g., gamma distribution) across sites. In contrast, with more general forms of heterotachy, the ratio of substitution rates on different branches of the tree may vary across sites.

The special case of heterotachy assumed by Felsenstein (1978) is different from another very special type of heterotachy recently explored in simulations by Kolaczkowski and Thornton (2004). These authors envisaged a four-taxon tree, for which the external lineages evolved in such a way that sites in the sequences accumulated substitutions at one of two rates, either slow or fast. As Spencer et al. (2005) point out, the frequencies of patterns expected under the simulation model studied by Kolaczkowski and Thornton (2004) are a small snapshot of the full range of possibilities when all possible combinations of short and long branches are considered, and most of these do not cause LBA (further concerns regarding the findings of Kolaczkowski and Thornton [2004] have been discussed by Steel, 2005).
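
The distinction drawn above between rate variation across sites and more general heterotachy can be illustrated with a small numerical sketch. It is not taken from the article: the five branch lengths, the gamma parameters, the two multipliers (0.5 and 2.0), and all function names are hypothetical choices made only for illustration. Under rates-across-sites, one site-specific rate scales every branch, so the ratio of two branch lengths should be the same at every site; when each branch draws its own multiplier, that ratio can differ from site to site.

```python
import random


def rates_across_sites(base_lengths, site_rate):
    """Scale every branch at a site by the same site-specific rate."""
    return [site_rate * b for b in base_lengths]


def general_heterotachy(base_lengths, rng, multipliers=(0.5, 2.0)):
    """Give each branch at a site its own independently drawn multiplier."""
    return [b * rng.choice(multipliers) for b in base_lengths]


def ratio_spread(per_site_lengths, i, j):
    """Range (max - min) over sites of the ratio: branch i length / branch j length."""
    ratios = [site[i] / site[j] for site in per_site_lengths]
    return max(ratios) - min(ratios)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical four-taxon tree: two long and two short external branches,
# plus a short internal branch (the unequal-rates setting Felsenstein described).
base = [0.05, 0.60, 0.05, 0.60, 0.05]
n_sites = 1000

# Rate variation across sites: one gamma-distributed rate per site (mean 1),
# applied to all branches of the tree at that site.
gamma_sites = [
    rates_across_sites(base, rng.gammavariate(2.0, 0.5)) for _ in range(n_sites)
]

# General heterotachy: the multiplier varies by branch as well as by site.
hetero_sites = [general_heterotachy(base, rng) for _ in range(n_sites)]

gamma_spread = ratio_spread(gamma_sites, 0, 1)
hetero_spread = ratio_spread(hetero_sites, 0, 1)

print("ratio spread, rates across sites:", gamma_spread)
print("ratio spread, general heterotachy:", hetero_spread)
```

In this sketch the first spread is zero up to floating-point rounding, whereas the second is clearly nonzero, which is the sense in which the ratio of rates on two branches is constant across sites in one class of models and may vary in the other.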
Thus, the patterns that Kolaczkowski and Thornton (2004) studied are different from those expected under the standard stationary covarion (or “covarion drift”) models, which have been the subject of much recent study (e.g., Tuffley and Steel, 1998; Penny et al., 2001; Gaucher et al., 2001; Huelsenbeck, 2002; Galtier, 2001; Misof et al., 2002; Inagaki et al., 2004; Ane et al., 2005; Guindon et al., 2004). These standard covarion models have reversible stationary substitution rates among character states that are switched “on” (variable), and a reversible stationary process between the state of “off” (invariable) and “on.” The latter condition will maintain

Updated: 2005-12-01