Toward Bayesian Data Compression
Annalen der Physik (IF 2.2), Pub Date: 2021-02-08, DOI: 10.1002/andp.202000508
Johannes Harth‐Kitzerow 1, 2, 3, 4 , Reimar H. Leike 1 , Philipp Arras 1, 2, 5 , Torsten A. Enßlin 1, 2, 4

In order to handle the large datasets omnipresent in modern science, efficient compression algorithms are necessary. Here, a Bayesian data compression (BDC) algorithm that adapts to the specific measurement situation is derived in the context of signal reconstruction. BDC compresses a dataset under conservation of its posterior structure with minimal information loss given the prior knowledge on the signal, the quantity of interest. Its basic form is valid for Gaussian priors and likelihoods. For constant noise standard deviation, basic BDC becomes equivalent to a Bayesian analog of principal component analysis. Using metric Gaussian variational inference, BDC generalizes to non‐linear settings. In its current form, BDC requires the storage of effective instrument response functions for the compressed data and of the corresponding noise, which together encode the posterior covariance structure. Their memory demand counteracts the compression gain. To improve this, sparsity of the compressed responses can be obtained by separating the data into patches and compressing them separately. The applicability of BDC is demonstrated by applying it to synthetic data and radio astronomical data. Still, the algorithm needs further improvement, as the computation time of the compression and subsequent inference exceeds the time of the inference with the original data.
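To make the linear Gaussian setting of the abstract concrete, the following is a minimal NumPy sketch, not the authors' algorithm. It assumes a linear measurement d = R s + n with Gaussian prior covariance S and noise covariance N, and it compresses the data by keeping the k leading eigen-directions of the prior-whitened information matrix, a standard low-rank posterior approximation used here as a stand-in. The function names `posterior` and `compress`, the toy dimensions, and the truncation rule are all illustrative assumptions; the demo uses a constant noise standard deviation, the case the abstract relates to a Bayesian analog of principal component analysis.

```python
import numpy as np


def posterior(R, N, S, d):
    """Posterior mean and covariance for d = R s + n,
    with s ~ N(0, S) and n ~ N(0, N) (standard linear Gaussian result)."""
    N_inv = np.linalg.inv(N)
    D = np.linalg.inv(np.linalg.inv(S) + R.T @ N_inv @ R)
    m = D @ R.T @ N_inv @ d
    return m, D


def compress(R, N, S, d, k):
    """Illustrative compression (an assumption, not the paper's derivation):
    keep the k leading eigen-directions of the prior-whitened information
    matrix A = L^T R^T N^-1 R L, where S = L L^T.
    Returns a compressed response R_c, unit noise covariance N_c, and data d_c."""
    L = np.linalg.cholesky(S)
    N_inv = np.linalg.inv(N)
    A = L.T @ R.T @ N_inv @ R @ L
    lam, V = np.linalg.eigh(A)            # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:k]       # indices of the k largest eigenvalues
    lam_k, V_k = lam[idx], V[:, idx]
    # Compressed response and data, scaled so the compressed noise is white.
    R_c = (np.sqrt(lam_k)[:, None] * V_k.T) @ np.linalg.inv(L)
    d_c = (V_k.T @ (L.T @ R.T @ N_inv @ d)) / np.sqrt(lam_k)
    N_c = np.eye(k)
    return R_c, N_c, d_c


# Toy problem: 50 data points constraining a 5-dimensional signal.
rng = np.random.default_rng(0)
n_signal, n_data = 5, 50
R = rng.normal(size=(n_data, n_signal))           # hypothetical response
S = np.diag(np.linspace(2.0, 0.5, n_signal))      # hypothetical prior covariance
sigma = 0.5                                       # constant noise std
N = sigma**2 * np.eye(n_data)
s_true = np.linalg.cholesky(S) @ rng.normal(size=n_signal)
d = R @ s_true + sigma * rng.normal(size=n_data)

# Posterior from the original 50 data points.
m, D = posterior(R, N, S, d)

# Keeping all 5 informative directions reproduces the original posterior.
R_full, N_full, d_full = compress(R, N, S, d, k=n_signal)
m_full, D_full = posterior(R_full, N_full, S, d_full)

# Keeping only 2 directions is lossy: the posterior covariance cannot shrink.
R_c, N_c, d_c = compress(R, N, S, d, k=2)
m_c, D_c = posterior(R_c, N_c, S, d_c)
```

In this sketch the 50 original data points are replaced by k numbers, but the dense k × 5 matrix `R_c` has to be stored alongside them. That is a small-scale picture of the memory demand the abstract says counteracts the compression gain, and of why sparse, patch-wise responses are attractive.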

Updated: 2021-03-11