Can the predictive processing model of the mind ameliorate the value-alignment problem?
Ethics and Information Technology (IF 3.633). Pub Date: 2021-09-06. DOI: 10.1007/s10676-021-09611-0
William Ratoff

How do we ensure that future generally intelligent AI share our values? This is the value-alignment problem. It is a weighty matter. After all, if AI are neutral with respect to our wellbeing, or worse, actively hostile toward us, then they pose an existential threat to humanity. Some philosophers have argued that one important way in which we can mitigate this threat is to develop only AI that share our values, or whose values 'align with' ours. However, there is nothing to guarantee that this policy will be universally implemented; in particular, 'bad actors' are likely to flout it. In this paper, I show how the predictive processing model of the mind, currently ascendant in cognitive science, may ameliorate the value-alignment problem. In essence, I argue that there is a plurality of reasons why any future generally intelligent AI will possess a predictive processing cognitive architecture (e.g., because we decide to build them that way; because it is the only possible cognitive architecture that can underpin general intelligence; or because it is the easiest way to create AI). I also argue that if future generally intelligent AI possess a predictive processing cognitive architecture, then they will come to share our pro-moral motivations (valuing humanity as an end, avoiding maleficent actions, etc.), regardless of their initial motivation set. Consequently, these AI will pose a minimal threat to humanity. In this way, I conclude, the value-alignment problem is significantly ameliorated under the assumption that future generally intelligent AI will possess a predictive processing cognitive architecture.
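
The core idea of the predictive processing framework, on which the argument turns, is that a cognitive system maintains a generative model of its inputs and continually updates its internal states so as to minimize prediction error. The toy sketch below is a minimal illustration of that error-minimization loop, not anything from the paper: the linear generative model g(mu) = w * mu, the learning rate, and all variable names are invented for the example.

```python
import numpy as np

# Toy predictive-processing loop: the agent holds a belief `mu` about a
# hidden cause and predicts its sensory input via a generative model
# g(mu) = w * mu. Each step, the prediction error drives a gradient update
# of the belief, minimizing squared prediction error (a simple stand-in
# for the free-energy quantity used in the literature).

rng = np.random.default_rng(0)
w = 2.0           # fixed generative weight (assumed known here)
mu = 0.0          # initial belief about the hidden cause
lr = 0.05         # learning rate for belief updating
true_cause = 3.0  # the hidden cause generating the agent's sensations

for step in range(200):
    sensation = w * true_cause + rng.normal(0.0, 0.1)  # noisy sensory input
    prediction = w * mu                                # top-down prediction
    error = sensation - prediction                     # prediction error
    mu += lr * w * error        # gradient step on 0.5 * error**2 w.r.t. mu

print(f"inferred cause: {mu:.2f} (true cause: {true_cause})")
```

On this picture, perception and belief updating are both in the business of driving prediction error toward zero; the paper's argument concerns what motivations an agent with this kind of architecture would converge on.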




Updated: 2021-09-07