Clustering in Block Markov Chains
Annals of Statistics (IF 4.5), Pub Date: 2020-12-01, DOI: 10.1214/19-aos1939
Jaron Sanders, Alexandre Proutière, Se-Young Yun

This paper considers cluster detection in Block Markov Chains (BMCs). These Markov chains are characterized by a block structure in their transition matrix. More precisely, the $n$ possible states are divided into a finite number $K$ of groups, or clusters, such that states in the same cluster exhibit the same transition rates to other states. One observes a single trajectory of the Markov chain, and the objective is to recover the initially unknown clusters from this observation alone. In this paper we devise a clustering procedure that accurately, efficiently, and provably detects the clusters. We first derive a fundamental information-theoretic lower bound on the detection error rate that holds for any clustering algorithm. This bound identifies the BMC parameters and trajectory lengths for which it is possible to detect the clusters accurately. We next develop two clustering algorithms that together accurately recover the cluster structure from the shortest possible trajectories, whenever the parameters allow detection. These algorithms thus reach the fundamental detectability limit, and are optimal in that sense.
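
To make the model concrete, the following is a minimal Python sketch of how a trajectory of a BMC might be simulated. It is not the authors' code and it does not implement their clustering algorithms. It assumes a simplified transition rule that the abstract does not spell out: from a state in cluster $k$, the chain first picks a target cluster $l$ with probability $p_{k,l}$ and then picks a state uniformly inside that cluster, with self-transitions allowed. The names `simulate_bmc`, `empirical_cluster_matrix`, `cluster_of`, and `p` are hypothetical.

```python
import random


def simulate_bmc(cluster_of, p, length, seed=0):
    """Simulate a trajectory of a block Markov chain (simplified sketch).

    cluster_of[x] is the cluster index of state x.
    p[k][l] is the probability of jumping from cluster k to cluster l.
    Assumption of this sketch: inside the target cluster the next state
    is drawn uniformly, and self-transitions are allowed.
    """
    rng = random.Random(seed)
    num_clusters = len(p)
    # List the member states of every cluster once, up front.
    members = [
        [x for x, c in enumerate(cluster_of) if c == k]
        for k in range(num_clusters)
    ]
    state = rng.randrange(len(cluster_of))  # arbitrary starting state
    path = [state]
    for _ in range(length - 1):
        k = cluster_of[state]
        # Choose the next cluster according to row k of p.
        target = rng.choices(range(num_clusters), weights=p[k])[0]
        # Choose the next state uniformly within that cluster.
        state = rng.choice(members[target])
        path.append(state)
    return path


def empirical_cluster_matrix(path, cluster_of, num_clusters):
    """Estimate cluster-to-cluster transition frequencies from a path.

    This uses the true cluster labels, so it only serves as a sanity
    check on the simulation; it is not a cluster recovery method.
    """
    counts = [[0] * num_clusters for _ in range(num_clusters)]
    for x, y in zip(path, path[1:]):
        counts[cluster_of[x]][cluster_of[y]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    cluster_of = [0, 0, 0, 1, 1, 1]   # n = 6 states, K = 2 clusters
    p = [[0.1, 0.9], [0.8, 0.2]]      # hypothetical cluster-level kernel
    path = simulate_bmc(cluster_of, p, length=10_000)
    print(empirical_cluster_matrix(path, cluster_of, 2))
```

With a long enough trajectory, the printed frequencies should be close to the rows of `p`. The clustering problem described above is the reverse task: only `path` is observed, and `cluster_of` has to be recovered from it.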

Updated: 2020-12-01