Prototyping as subtyping strategy for studying heterogeneity in autism
Autism Research (IF 4.7), Pub Date: 2021-06-02, DOI: 10.1002/aur.2535
Michael V Lombardo

"All models are wrong but some are useful" (Box, 1979, p. 202). This statement is particularly apt when it comes to studying heterogeneity in autism. Expanding on ideas initially presented in Mottron and Bzdok (2020), Mottron's (2021) new commentary in this journal explicitly calls out that the current model for how we study autism is wrong and that other models (i.e., "prototypical autism") may be more useful. I think the field would resoundingly agree that the current model, resting on blind usage of the single diagnostic label, is less than ideal when it comes to furthering scientific research. Happé et al.'s (2006) paper was certainly one of the first to draw my attention to the issue. Many subsequent papers have since elaborated on the idea that heterogeneity in autism is a central scientific challenge for the field and have discussed ways to address it (e.g., Geschwind & Levitt, 2007; Gillberg, 2010; Lai et al., 2013; Lombardo et al., 2019; London, 2007; Müller & Amaral, 2017; Waterhouse & Gillberg, 2014).

While we would agree that the single diagnostic label is not ideal, we likely disagree somewhat with how a concept like "prototypical autism" might be an important alternative to the current situation. By studying prototypical autism, Mottron (2021) suggests we would reap the benefits of much larger effect sizes and thus gain more clarity in terms of a mechanistic understanding of autism. The claim of bigger effect sizes as one goes further back into the past (Rødgaard et al., 2019) is not without problems, though. Sample size in autism research has also increased over time, and it is well known that effect sizes can be inflated (sometimes by very large margins) in situations with small sample sizes (see simulations within Lombardo et al., 2019). In Figure 1, I have re-plotted data from Rødgaard et al. (2019) to show both the year-by-effect size and sample size-by-effect size relationships, with Spearman's ρ annotated on each plot. For the reproducible data and code for this re-analysis, see here: https://github.com/IIT-LAND/autism_es_time_n. For studies of social cognition and of P3b amplitude, relationships are present both between year and effect size and between sample size and effect size. Also notable is the decrease in variance of effect size as sample size increases, showing the well understood phenomenon that as sample size increases, the precision of the estimated statistic increases and likely converges on the true population parameter. This effect alone underscores the primary importance of larger sample sizes. Since effect size inflation is a known issue (Lombardo et al., 2019), it is unclear whether the higher effect sizes of the past are indeed precise estimates: effect sizes are more imprecise with smaller sample sizes, and older studies have much smaller sample sizes than studies of today (i.e., in all cases except studies of non-social cognitive tasks, sample size and year are also significantly positively correlated). Trying to statistically covary out sample size, as Rødgaard et al. (2019) did, is not really the best remedy for this situation, since removing variation associated with sample size also removes variance shared with year, given that year and sample size are positively correlated.
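To make the imprecision point concrete, here is a minimal simulation sketch (not the actual simulations from Lombardo et al., 2019): two groups are drawn from populations separated by an assumed true effect of d = 0.3, and the spread of observed effect sizes is summarized across many simulated studies at different sample sizes. The true effect size, the sample sizes, and the number of simulations are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_d = 0.3   # assumed true population effect size (illustrative)
n_sims = 5000  # number of simulated studies per sample size

def observed_d(n_per_group):
    """Simulate one two-group study and return its observed Cohen's d."""
    a = rng.normal(true_d, 1.0, n_per_group)  # e.g., autism group
    b = rng.normal(0.0, 1.0, n_per_group)     # e.g., comparison group
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

for n in (10, 30, 100, 500):
    ds = np.array([observed_d(n) for _ in range(n_sims)])
    # Small-n studies scatter widely around the true effect; any filter that
    # favors "significant" or headline-worthy results will then tend to
    # report inflated effect sizes.
    print(f"n={n:>3}/group: SD of observed d = {ds.std():.2f}, "
          f"95% range = [{np.percentile(ds, 2.5):.2f}, {np.percentile(ds, 97.5):.2f}]")
```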

FIGURE 1. Relationships between time, sample size, and effect size in autism research (a re-analysis of data from Rødgaard et al., 2019). Panel (a) shows relationships for studies classified as social cognition (e.g., emotion recognition and theory of mind). Panel (b) shows the same plots for studies of non-social cognitive tasks (e.g., flexibility, planning, inhibition). Panel (c) shows plots for studies of P3b amplitude, while panel (d) shows plots for studies of brain size. In each plot, each study is a single dot and Spearman's ρ is shown to describe the relationship. Dot color reflects log10(sample size) (hotter colors reflect larger sample sizes), and dot size also scales with sample size. The linear best-fit line is estimated with robust linear modeling in order to be insensitive to outliers. Reproducible data and analysis code for this re-analysis can be found here: https://github.com/IIT-LAND/autism_es_time_n
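For readers who want to see the shape of this re-analysis, the following is a minimal Python sketch of the two relationships plotted in Figure 1; the actual reproducible code lives in the linked repository and may differ, and the input file name and column names here are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import spearmanr

# Hypothetical input: one row per study, with publication year, total
# sample size (n), and standardized effect size (Cohen's d).
df = pd.read_csv("effect_sizes.csv")  # assumed columns: year, n, d

# Spearman's rho for the two relationships annotated in Figure 1.
rho_year, p_year = spearmanr(df["year"], df["d"])
rho_n, p_n = spearmanr(df["n"], df["d"])
print(f"year vs effect size:        rho = {rho_year:.2f} (p = {p_year:.3f})")
print(f"sample size vs effect size: rho = {rho_n:.2f} (p = {p_n:.3f})")

# Robust linear fit (Huber weighting) for the best-fit line, so that a few
# outlying studies do not drive the slope. Fitting on log10(n) mirrors the
# log10(sample size) scaling used for the dot colors (an assumption here).
X = sm.add_constant(np.log10(df["n"]))
rlm_fit = sm.RLM(df["d"], X, M=sm.robust.norms.HuberT()).fit()
print(rlm_fit.summary())
```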

However, let us assume that studying prototypical autism does result in larger effect sizes. There is still a primary conceptual issue at the heart of this argument. Mottron's view of prototypical autism squarely dismisses ideas about heterogeneity, diversity, and individual differences as being nothing more than artifact; in fact, Mottron and Bzdok (2020) declare this take-home message unequivocally in the title of their paper: "Autism spectrum heterogeneity: fact or artifact?" Mottron's view implies that autism is best conceived of as a singular, unitary entity and not as a spectrum with important individual differences. The success of this kind of prototypical autism model depends somewhat on whether this underlying assumption is indeed valid and more correct than other models that do not assume autism is one singular entity.

Besides this issue, I can see other problems with applying Mottron's approach in practice. First, what is prototypical autism? Mottron suggests that prototypical autism is a distinction that is clear to clinical experts, but it is never given a proper, explicit definition. Second, amongst the multiple levels at which one could examine autism, the proposition to study prototypical autism places priority on levels like behavior, cognition, and the clinical phenotype. What about other models that instead look for genetic or neural distinctions (e.g., Hong et al., 2020)? Why would the approach of looking at what is "prototypical" about autism at cognitive, behavioral, or clinical levels be more important than a definition of what is "prototypical" or most common about autism at other, more biological levels of analysis?

A hardliner approach would be to declare that studying Mottron's "prototypical autism" is the best and only way to move forward. Taking this hardliner route is probably just as wrong as the current situation. I would rather view what Mottron proposes in a slightly different light, as just one of many possible approaches for understanding heterogeneity in autism. Focusing on "prototypical autism" is just one of many possible "supervised" models for studying heterogeneity in autism (for more on "supervised" vs. "unsupervised" approaches to studying heterogeneity in autism, please see Lombardo et al., 2019). With the right set of justifications and explicit end goals pre-specified, along with nuanced interpretation and care not to overgeneralize, there is no reason why studying "prototypical autism" could not yield insights for the field.

In terms of how to construe what is "prototypical," we could benefit from looking at analogies taken from other approaches. For example, normative modeling approaches (Bethlehem et al., 2020; Marquand et al., 2016) are one of many ways to study heterogeneity and are well suited for detecting outlier individuals that markedly differ from age-appropriate norms defined from a non-autistic comparison population. With normative modeling, one first defines what the non-autistic comparison population norms are, and autistic individuals are then described statistically relative to those norms. The concept of "prototypical autism" seems somewhat analogous with respect to the idea that a norm or prototype first needs to be defined, and this would help make the definition of the "prototype" a bit more objective. However, the population on which the norm is estimated would need to be the autism population itself. If one could define the degree to which an autistic individual conforms to the autism norm or prototype, a whole range of phenomena could then be studied with respect to how individuals deviate from that prototype. An example of how this can be done comes from our recent work, where we characterized the joint distribution of social-communication (SC) and restricted repetitive behavior (RRB) scores in the autism population and then used those norms to create subgroups (Bertelsen et al., 2021). An advantage of construing the idea of "prototypical autism" within this kind of "normative modeling" framework is that it allows the norm to be defined on the basis of any collection of measured variables, which could come from any level of analysis. Furthermore, the definition of the norm or prototype is now empirically specified in statistical terms (e.g., a z-score based on the mean and SD of the autism population). The approach also does not a priori place higher importance on a few phenotypic levels (e.g., behavior, cognition) and could be implemented on any set of variables characterizing the autism population (e.g., cortical thickness phenotypes derived from structural MRI). Thus, rather than rejecting the idea of heterogeneity and assuming autism is one thing, this approach could take the concept of "prototypical autism," formally define it based on empirically derived statistical norms from the autism population, and then allow researchers to select and restrict their samples accordingly. The inferences one could generate from this approach would then need to be specific to the subset of the population and the set of variables that originally defined the norm or prototype under study. In this way, the model's usefulness is preset based on a priori, supervised justifications for how and why the researcher first defines the prototype under study. The model may not be a panacea for all the problems we face, and thus it may be wrong in many respects, but at least it would be explicitly clear from the start what it is useful for.
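As a toy illustration of this idea (and not the actual method used in Bertelsen et al., 2021, which modeled the joint SC-RRB distribution), the sketch below defines the "prototype" as the autism sample's own means and SDs on two measures, scores each individual's deviation from it, and applies one possible supervised selection rule. The file name, column names, and the 1-SD cutoff are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row per autistic individual, with social-communication
# (SC) and restricted repetitive behavior (RRB) scores.
autism = pd.read_csv("autism_scores.csv")  # assumed columns: SC, RRB

# Define the "prototype" empirically as the autism-population norm:
# z-score each measure against the autism sample's own mean and SD.
norms = autism[["SC", "RRB"]].agg(["mean", "std"])
z = (autism[["SC", "RRB"]] - norms.loc["mean"]) / norms.loc["std"]

# Continuous deviation from the prototype (Euclidean distance in z-space).
autism["dist_from_prototype"] = np.sqrt((z ** 2).sum(axis=1))

# One possible supervised selection rule: keep individuals within 1 SD of the
# norm on both measures as the "prototypical" subset for a given study.
autism["prototypical"] = (z.abs() <= 1).all(axis=1)
print(autism["prototypical"].value_counts())
```

Because the norm is defined statistically, the same recipe could be applied to any set of variables characterizing the autism population, and the selection rule (or the choice to keep the continuous deviation score instead) can be pre-specified to match the study's stated goals.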

In sum, a view that interprets prototypical autism to be one of many possible strategies for dealing with heterogeneity in autism could reap significant insights if applied in a nuanced way, where the goals and justifications are clearly spelled out, the disadvantages are clearly understood, and the interpretations are not over- or under-generalized. There is room for many possible models. All will likely be wrong in some way, but it is very possible that different models are useful in different ways. We should exploit all possible models for their unique utility until better models come along that are empirically demonstrated to be less wrong.



