Fairness in Machine Learning with Tractable Models
arXiv - CS - Symbolic Computation. Pub Date: 2019-05-16, DOI: arxiv-1905.07026
Michael Varley, Vaishak Belle

Machine learning techniques have become pervasive across a range of different applications, and are now widely used in areas as disparate as recidivism prediction, consumer credit-risk analysis and insurance pricing. The prevalence of machine learning techniques has raised concerns about the potential for learned algorithms to become biased against certain groups. Many definitions of fairness have been proposed in the literature, but the fundamental task of reasoning about probabilistic events is a challenging one, owing to the intractability of inference. The focus of this paper is taking steps towards the application of tractable models to fairness. Tractable probabilistic models have emerged that guarantee that conditional marginals can be computed in time linear in the size of the model. In particular, we show that sum product networks (SPNs) enable an effective technique for determining the statistical relationships between protected attributes and other training variables. If a subset of these training variables is found by the SPN to be independent of the protected attribute, then they can be considered `safe' variables, from which we can train a classification model without concern that the resulting classifier will produce disparate outcomes for different demographic groups. Our initial experiments on the `German Credit' data set indicate that this processing technique significantly reduces disparate treatment of male and female credit applicants, with a small reduction in classification accuracy compared to the state of the art. We also motivate the concept of "fairness through percentile equivalence", a new definition predicated on the notion that individuals at the same percentile of their respective distributions should be treated equivalently, which prevents unfair penalisation of those individuals who lie at the extremities of their respective distributions.
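To make the safe-variable idea concrete: the paper tests, via SPN conditional marginals, whether each training variable is statistically independent of the protected attribute, and trains the classifier only on variables that pass the test. The sketch below substitutes empirical mutual information for SPN inference as the independence screen (a stand-in, not the paper's method), and assumes hypothetical column names (`sex`, `credit_risk`) and file name for the German Credit data:

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression

# Assumed file and column names; categorical features are assumed
# to be numerically encoded already.
df = pd.read_csv("german_credit.csv")
protected = df["sex"].to_numpy()          # protected attribute S
y = df["credit_risk"].to_numpy()          # class label
X = df.drop(columns=["sex", "credit_risk"])

# Estimate the dependence of each training variable on the protected
# attribute. The paper does this with SPN conditional marginals;
# empirical mutual information answers the same independence question.
mi = mutual_info_classif(X.to_numpy(), protected, random_state=0)

# Variables whose estimated dependence falls below a tolerance are
# treated as 'safe'. The threshold here is an assumption; the paper's
# exact criterion may differ.
EPS = 0.01
safe_cols = [c for c, m in zip(X.columns, mi) if m < EPS]

# Train the downstream classifier on safe variables only.
clf = LogisticRegression(max_iter=1000).fit(X[safe_cols], y)
```

Because the classifier never sees variables correlated with the protected attribute, its predictions cannot encode that attribute, at the cost of discarding some predictive signal.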

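"Fairness through percentile equivalence" can likewise be illustrated with a small sketch: an applicant at the 80th percentile of their own group's score distribution is treated the same as an applicant at the 80th percentile of any other group's. The within-group percentile transform below is a minimal reading of that notion, not necessarily the paper's exact construction:

```python
import numpy as np

def percentile_scores(scores: np.ndarray, groups: np.ndarray) -> np.ndarray:
    """Map each raw score to its percentile within its own group.

    Decisions made on the returned percentiles treat individuals at the
    same within-group percentile equivalently, regardless of group.
    """
    out = np.empty_like(scores, dtype=float)
    for g in np.unique(groups):
        mask = groups == g
        s = scores[mask]
        # Empirical CDF value of each score within its group.
        out[mask] = np.searchsorted(np.sort(s), s, side="right") / s.size
    return out

# Toy example: two groups with shifted score distributions.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(1.0, 1.0, 500)])
groups = np.array([0] * 500 + [1] * 500)

pct = percentile_scores(scores, groups)
accept = pct >= 0.8  # one threshold accepts the top 20% of *each* group
print(accept[groups == 0].mean(), accept[groups == 1].mean())  # both ~0.2
```

Because the threshold is applied to within-group percentiles rather than raw scores, individuals at the extremes of a shifted group distribution are not penalised for the shift itself.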
Updated: 2020-01-14