Sampling Sparse Representations with Randomized Measurement Langevin Dynamics
ACM Transactions on Knowledge Discovery from Data (IF 4.0), Pub Date: 2021-02-10, DOI: 10.1145/3427585
Kafeng Wang, Haoyi Xiong, Jiang Bian, Zhanxing Zhu, Qian Gao, Zhishan Guo, Cheng-Zhong Xu, Jun Huan, Dejing Dou

Stochastic Gradient Langevin Dynamics (SGLD) has been widely used for Bayesian sampling from probability distributions, using derivatives of the log-posterior. By evaluating the gradient of the log-posterior, SGLD methods generate samples by acting as a thermostat dynamics that traverses the gradient flow of the log-posterior under a controllable perturbation. Even when the density is not known, existing solutions can first learn a kernel density model from the given dataset and then produce new samples by running SGLD over the kernel density derivatives. In this work, instead of exploring new samples in kernel spaces, a novel SGLD sampler, Randomized Measurement Langevin Dynamics (RMLD), is proposed to sample high-dimensional sparse representations from the spectral domain of a given dataset. Specifically, given a random measurement matrix for sparse coding, RMLD first derives a novel likelihood evaluator of the probability distribution from the LASSO loss function, then samples from the high-dimensional distribution using stochastic Langevin dynamics with derivatives of the log-likelihood and Metropolis–Hastings sampling. In addition, new samples in the low-dimensional measurement space can be regenerated from the sampled high-dimensional vectors and the measurement matrix. The algorithm analysis shows that RMLD projects a given dataset into a high-dimensional Gaussian distribution with a Laplacian prior, then draws new sparse representations from the dataset by performing SGLD over that distribution. Extensive experiments on real-world datasets evaluate the proposed algorithm. Performance comparisons on three real-world applications show that RMLD outperforms the baseline methods.
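
The sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' RMLD implementation: it is a generic Metropolis-adjusted Langevin sampler over an assumed density of the form exp(-||y - Ax||²/(2σ²) - λ||x||₁), that is, a Gaussian likelihood with a Laplacian prior, the negated LASSO loss. The function names, the step size, `sigma2`, `lam`, the use of the full gradient rather than a stochastic mini-batch gradient, and the subgradient `sign(x)` for the L1 term are all assumptions made for illustration.

```python
import numpy as np


def log_density(x, A, y, sigma2, lam):
    """Unnormalized log-density: negated LASSO loss (assumed form)."""
    r = y - A @ x
    return -0.5 * (r @ r) / sigma2 - lam * np.abs(x).sum()


def grad_log_density(x, A, y, sigma2, lam):
    """Gradient of log_density; sign(x) is a subgradient of the L1 term."""
    return A.T @ (y - A @ x) / sigma2 - lam * np.sign(x)


def mala_sample(A, y, n_steps=2000, step=1e-3, sigma2=1.0, lam=1.0, seed=0):
    """Langevin proposals with a Metropolis-Hastings accept/reject step.

    Returns the chain of high-dimensional samples and the acceptance rate.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for t in range(n_steps):
        g = grad_log_density(x, A, y, sigma2, lam)
        # Langevin proposal: drift along the gradient plus Gaussian noise.
        prop = x + step * g + np.sqrt(2.0 * step) * rng.standard_normal(x.size)
        g_prop = grad_log_density(prop, A, y, sigma2, lam)
        # Log-densities of the forward and reverse Gaussian proposals.
        log_q_fwd = -np.sum((prop - x - step * g) ** 2) / (4.0 * step)
        log_q_rev = -np.sum((x - prop - step * g_prop) ** 2) / (4.0 * step)
        log_alpha = (log_density(prop, A, y, sigma2, lam)
                     - log_density(x, A, y, sigma2, lam)
                     + log_q_rev - log_q_fwd)
        if np.log(rng.uniform()) < log_alpha:
            x = prop
            accepted += 1
        samples[t] = x
    return samples, accepted / n_steps


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    m, d = 10, 50                       # low-dimensional measurements, high-dimensional code
    A = rng.standard_normal((m, d))     # random measurement matrix (assumed Gaussian)
    y = rng.standard_normal(m)          # one toy measurement vector
    chain, rate = mala_sample(A, y, n_steps=500)
    y_new = A @ chain[-1]               # regenerate a sample in the measurement space
    print(chain.shape, y_new.shape, round(rate, 2))
```

The last lines mirror the abstract's regeneration step: a sampled high-dimensional vector is mapped back to the low-dimensional measurement space by multiplying with the measurement matrix. How the paper constructs its likelihood evaluator and handles mini-batches is not stated in the abstract, so those details are left out here.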

Updated: 2021-02-10