Semi-supervised Co-Clustering on Attributed Heterogeneous Information Networks
Information Processing & Management (IF 7.4), Pub Date: 2020-07-17, DOI: 10.1016/j.ipm.2020.102338
Yugang Ji, Chuan Shi, Yuan Fang, Xiangnan Kong, Mingyang Yin

Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently by exploiting structural similarity search, it ignores the correlations between nodes of different types. In this paper, we focus on the problem of co-clustering heterogeneous nodes, where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two aspects. First, the similarity or relevance of nodes is not only associated with multiple meta-path-based structures but also related to numerical and categorical attributes. Second, clustering and similarity/relevance search usually reinforce each other.

To address this problem, we first design a learnable overall relevance measure that integrates the structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measures and co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering.
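As a rough illustration of the factorization family the abstract refers to, the sketch below runs plain non-negative matrix tri-factorization with multiplicative updates, approximating a relevance matrix R between two node types as F S G^T. It is not the SCCAIN algorithm: the orthogonality constraints, the semi-supervised constraints, and the learnable relevance measure are all omitted, and the function names, toy matrix, and hyperparameters are assumptions made for this example.

```python
import numpy as np


def nmtf(R, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Plain NMTF: find non-negative F, S, G with R ~= F @ S @ G.T.

    Illustrative only; no orthogonality or supervision constraints.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    # Random strictly positive initialization.
    F = rng.random((n, k_rows)) + eps       # row-node cluster indicators
    S = rng.random((k_rows, k_cols)) + eps  # cluster-to-cluster association
    G = rng.random((m, k_cols)) + eps       # column-node cluster indicators
    for _ in range(n_iter):
        # Multiplicative updates keep every factor non-negative.
        F *= (R @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (R.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ R @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


def reconstruction_error(R, F, S, G):
    """Squared Frobenius norm of the residual R - F S G^T."""
    return float(np.linalg.norm(R - F @ S @ G.T) ** 2)


# Hypothetical toy relevance matrix: 6 row nodes, 8 column nodes,
# two blocks of high relevance plus a small background value.
R_toy = np.kron(np.eye(2), np.ones((3, 4))) + 0.01
F, S, G = nmtf(R_toy, k_rows=2, k_cols=2)
row_clusters = F.argmax(axis=1)  # hard assignment for row nodes
col_clusters = G.argmax(axis=1)  # hard assignment for column nodes
```

Taking the arg-max over the rows of F and G gives a hard co-clustering of the two node types at once, which is the sense in which a tri-factorization partitions heterogeneous nodes simultaneously.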




Updated: 2020-07-17