On 2-Clubs in Graph-Based Data Clustering: Theory and Algorithm Engineering
arXiv - CS - Data Structures and Algorithms Pub Date : 2020-06-26 , DOI: arxiv-2006.14972
Aleksander Figiel, Anne-Sophie Himmel, André Nichterlein, Rolf Niedermeier

Editing a graph into a disjoint union of clusters is a standard optimization task in graph-based data clustering. Here, complementing classic work where the clusters shall be cliques, we focus on clusters that shall be 2-clubs, that is, subgraphs of diameter two. This naturally leads to the two NP-hard problems 2-Club Cluster Editing (the allowed editing operations are edge insertion and edge deletion) and 2-Club Cluster Vertex Deletion (the allowed editing operations are vertex deletions). Answering an open question from the literature, we show that 2-Club Cluster Editing is W[2]-hard with respect to the number of edge modifications, thus contrasting the fixed-parameter tractability result for the classic Cluster Editing problem (considering cliques instead of 2-clubs). Then focusing on 2-Club Cluster Vertex Deletion, which is easily seen to be fixed-parameter tractable, we show that under standard complexity-theoretic assumptions it does not have a polynomial-size problem kernel when parameterized by the number of vertex deletions. Nevertheless, we develop several effective data reduction and pruning rules, resulting in a competitive solver, clearly outperforming a standard CPLEX solver in most instances of an established biological test data set.
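To make the cluster notion concrete, here is a minimal Python sketch of the target property: a graph is a 2-club cluster graph if every connected component is a 2-club. It assumes the usual reading of "diameter two" as "diameter at most two", so single vertices and cliques also count. The adjacency-dictionary representation and the function names `is_2club`, `connected_components`, and `is_2club_cluster_graph` are illustrative assumptions, not taken from the paper, and this is only a checker for the property, not the authors' solver.

```python
from itertools import combinations


def is_2club(adj, vertices):
    """Return True if the subgraph induced by `vertices` has diameter at most 2.

    `adj` maps each vertex to the set of its neighbors (undirected graph).
    """
    vs = set(vertices)
    for u, v in combinations(vs, 2):
        if v in adj[u]:
            continue  # adjacent: distance 1
        # Non-adjacent pairs need a common neighbor inside the vertex set.
        if not (adj[u] & adj[v] & vs):
            return False
    return True


def connected_components(adj):
    """Return the connected components of the graph as a list of vertex sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in component:
                    component.add(y)
                    stack.append(y)
        seen |= component
        components.append(component)
    return components


def is_2club_cluster_graph(adj):
    """Return True if every connected component induces a 2-club."""
    return all(is_2club(adj, comp) for comp in connected_components(adj))


# A path on three vertices is a 2-club; a path on four vertices is not,
# because its endpoints are at distance 3.
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

Under this definition, 2-Club Cluster Vertex Deletion asks for a smallest vertex set whose removal makes `is_2club_cluster_graph` hold, and 2-Club Cluster Editing asks the same for a smallest set of edge insertions and deletions.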

Updated: 2020-06-29