A reconsideration of inapplicable characters, and an approximation with step-matrix recoding
Cladistics ( IF 3.9 ) Pub Date : 2021-04-10 , DOI: 10.1111/cla.12456
Pablo A Goloboff 1, 2 , Jan De Laet 3 , Duniesky Ríos-Tamayo 1 , Claudia A Szumik 1

Evidence for phylogenetic analysis comes in the form of observed similarities, and trees are selected to minimize the number of similarities that cannot be accounted for by homology (homoplasies). Thus, the classical argument for parsimony directly links homoplasy with explanatory power. When characters are hierarchically related, a first character may represent a primary structure such as tail absence/presence and a secondary (subordinate) character may describe tail colour; this makes tail colour inapplicable when tail is absent. It has been proposed that such character hierarchies should be evaluated on the same logical basis as standard characters, maximizing the number of similarities accounted for by secondary homology, i.e. common ancestry. Previous evaluations of the homology of a given ancestral reconstruction contain the unintuitive quantity “subcharacters” (number of regions of applicability). Rather than counting subcharacters, this paper proposes an equivalent but more intuitive formulation, based on counting the number of changes into each separate state. In this formulation, x-transformations, the homoplasy for the reconstruction is simply the number of changes into the state beyond the first, summed over all states. There is thus no direct connection between homoplasy and number of steps, only between homoplasy and extra steps. The link between the two formulations is that, for any region of applicability of any character, a subcharacter can be interpreted as the change into the state that is plesiomorphic in that region. Although some authors have claimed that the equivalence between maximizing explanatory power and minimizing independent originations of similar features (i.e. the standard justification of parsimony) does not hold for inapplicable characters, evaluating homoplasy with x-transformations clearly connects the two sides of that equation. 
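The x-transformation count described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a rooted tree given as a hypothetical `parent` mapping, a fully specified ancestral reconstruction in `state`, that the state present at the root counts as the first origin of that state, and that the inapplicable token `-` is not itself counted as a state. All names are illustrative.

```python
from collections import Counter

def x_transformation_homoplasy(parent, state, inapplicable="-"):
    """Sum, over all states, the changes into that state beyond the first.

    parent: dict mapping each node to its parent (the root maps to None).
    state:  dict mapping each node to its reconstructed state.
    Assumption: the root's state counts as one origin of that state, and
    the inapplicable token is skipped rather than treated as a state.
    """
    origins = Counter()
    for node, par in parent.items():
        s = state[node]
        if s == inapplicable:
            continue
        # An origin is a branch whose descendant end has state s while the
        # ancestral end has anything else (or the node is the root).
        if par is None or state[par] != s:
            origins[s] += 1
    # Every counted state has at least one origin, so n - 1 >= 0.
    return sum(n - 1 for n in origins.values())

# Tree ((A,B),(C,D)); the character is tail colour.
parent = {"root": None, "n1": "root", "n2": "root",
          "A": "n1", "B": "n1", "C": "n2", "D": "n2"}

# Red tails arise twice from a tailless background: one extra origin.
two_origins = {"root": "-", "n1": "red", "n2": "-",
               "A": "red", "B": "red", "C": "red", "D": "-"}

# Red tail present from the root, lost once in D: no extra origin of red.
one_origin = {"root": "red", "n1": "red", "n2": "red",
              "A": "red", "B": "red", "C": "red", "D": "-"}

print(x_transformation_homoplasy(parent, two_origins))
print(x_transformation_homoplasy(parent, one_origin))
```

In the second reconstruction the loss of the tail in D would be scored in the primary (tail absent/present) character, not in tail colour.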
Furthermore, as the evaluation with x-transformations provides a direct count and a straightforward interpretation of homoplasy, it extends naturally into implied weighting, and sheds light on problems with additive, step-matrix or continuous characters. It also allows deriving transformation costs for recoding hierarchies as step-matrix characters (where recoded states correspond to permissible combinations of states in primary and secondary characters), so that homology of the original observations is properly measured. Those transformation costs set the cost of gaining the primary structure to the maximum difference between “present” states plus cost of loss, and difference between “present” states to the sum of user-defined transformation costs between secondary features. With such recoding, invoking multiple independent derivations of the structure and similar features will cost as many extra “steps” as the instances of similarities (in both original characters) that are not being homologized. The step-matrix recoding also can take into account nested dependences. We present a simple convention for naming characters, which TNT can use to automatically convert the original data into a step-matrix form and set the proper transformation costs. Finally, the basic elements for handling inapplicable characters in the context of maximum-likelihood inference are outlined, and some quantitative comparisons between different approaches to inapplicables are provided.
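The transformation costs stated above can be sketched in Python as a literal reading of that one sentence. This is not TNT's implementation and may omit details given in the full paper. The function name `step_matrix`, the `ABSENT` label, the default loss cost of 1, and the default unit cost between secondary states are all assumptions made for illustration.

```python
from itertools import product

ABSENT = "absent"  # hypothetical label for "primary structure absent"

def step_matrix(secondary_states, loss_cost=1,
                cost=lambda k, a, b: int(a != b)):
    """Build step-matrix costs for one primary structure and its secondaries.

    secondary_states: list of state lists, one per secondary character.
    loss_cost: assumed user-defined cost of losing the primary structure.
    cost(k, a, b): assumed user-defined cost of a -> b in secondary k.
    """
    # Each "present" state is one combination of secondary states.
    present = list(product(*secondary_states))

    def diff(p, q):
        # Difference between two "present" states: sum of secondary costs.
        return sum(cost(k, a, b) for k, (a, b) in enumerate(zip(p, q)))

    max_diff = max(diff(p, q) for p in present for q in present)
    states = [ABSENT] + present
    matrix = {}
    for s in states:
        for t in states:
            if s == t:
                c = 0
            elif t == ABSENT:
                c = loss_cost              # loss of the structure
            elif s == ABSENT:
                c = max_diff + loss_cost   # gain of the structure
            else:
                c = diff(s, t)             # change among "present" states
            matrix[(s, t)] = c
    return states, matrix

states, matrix = step_matrix([["red", "blue"], ["long", "short"]])
for s in states:
    print(s, [matrix[(s, t)] for t in states])
```

Under these assumed defaults a gain costs more than any change among "present" states, so a tree that derives the tail twice pays for both the repeated structure and the repeated secondary features.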

Updated: 2021-04-10