Deep Clustering based Fair Outlier Detection, arXiv - CS - Computers and Society


arXiv - CS - Computers and Society. Pub Date: 2021-06-09, DOI: arxiv-2106.05127
Hanyu Song, Peizhao Li, Hongfu Liu

In this paper, we focus on fairness issues in unsupervised outlier detection. Traditional algorithms, which are not designed with algorithmic fairness in mind, can implicitly encode and propagate statistical bias in the data and raise societal concerns. To correct such unfairness and deliver a fair set of potential outlier candidates, we propose Deep Clustering based Fair Outlier Detection (DCFOD), which learns a representation that maximizes utility while enforcing that the learned representation is subgroup-invariant with respect to the sensitive attribute. Given the coupled and reciprocal relationship between clustering and outlier detection, we leverage deep clustering to discover the intrinsic cluster structure and the out-of-structure instances. Meanwhile, adversarial training erases the sensitive-attribute pattern from the instances for fairness adaptation. Technically, we propose an instance-level weighted representation learning strategy to enhance the joint deep clustering and outlier detection, in which a dynamic weight module re-emphasizes the contributions of likely inliers while mitigating the negative impact of outliers. In experiments on eight datasets against 17 outlier detection algorithms, DCFOD consistently achieves superior performance in both outlier detection validity and two types of fairness notions in outlier detection.
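The instance-level weighting idea can be illustrated with a small sketch. This is not the DCFOD implementation: the abstract does not specify the weight formula, so the softmax-over-negative-distance rule, the fixed centroids, and all function names below are assumptions chosen only to show how points far from any cluster could contribute less to a clustering loss while also scoring as likely outliers.

```python
import math


def nearest_centroid_distances(points, centroids):
    """Euclidean distance from each point to its closest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


def instance_weights(distances):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Softmax over negative, mean-scaled distances: points far from every
    centroid receive small weights, and all weights sum to 1.
    """
    scale = sum(distances) / len(distances) or 1.0  # avoid dividing by zero
    exps = [math.exp(-d / scale) for d in distances]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_clustering_loss(distances, weights):
    """Weighted sum of squared distances to the nearest centroid."""
    return sum(w * d ** 2 for w, d in zip(weights, distances))


# Toy data: two tight groups plus one point far from both.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
centroids = [(0.05, 0.0), (1.05, 1.0)]  # assumed fixed for illustration

distances = nearest_centroid_distances(points, centroids)
weights = instance_weights(distances)
loss = weighted_clustering_loss(distances, weights)

# The point farthest from all centroids is the strongest outlier candidate.
outlier_index = max(range(len(points)), key=lambda i: distances[i])
```

In a full system the centroids and the representation would be learned jointly and the weights recomputed during training; the sketch omits that, along with the adversarial fairness component.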


Updated: 2021-06-10
