Restricted Boltzmann Machine method for dimensionality reduction of large spectroscopic data
Spectrochimica Acta Part B: Atomic Spectroscopy (IF 3.2), Pub Date: 2020-05-01, DOI: 10.1016/j.sab.2020.105849
J. Vrábel, P. Pořízka, J. Kaiser

Abstract Multivariate data obtained using, for instance, Laser-Induced Breakdown Spectroscopy (LIBS) are bulky and complex. Advanced processing of spectroscopic data demands a multidisciplinary approach, covering not only modern machine learning tools but also a deep understanding of the underlying physical mechanisms. Dimensionality reduction and visualization of large datasets is a task of significant interest in spectroscopic data processing. Commonly employed linear techniques (e.g., Principal Component Analysis, PCA) cannot capture the higher-order correlations present in the data. Moreover, computational cost and memory limitations become far more relevant given the size of "modern" LIBS data (millions of high-dimensional spectra). Methods based on Artificial Neural Networks (ANN) seem suitable for this task and, owing to their success, are receiving considerable attention within the spectroscopic community. We propose a new methodology based on the Restricted Boltzmann Machine (RBM, an ANN method) for dimensionality reduction of spectroscopic data and compare it to standard PCA. As an extension of successful reconstruction, we demonstrate the generation of new (unseen) spectra by an RBM model trained on a large spectroscopic dataset. This data generation is useful not only for extending measured datasets but also as confirmation that the model has been properly trained.
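To make the comparison in the abstract concrete, the following is a minimal sketch of RBM-based versus PCA-based dimensionality reduction. It is not the authors' implementation: the abstract does not state the network architecture, training procedure, or hyperparameters, so the sketch assumes a Bernoulli RBM trained with one-step contrastive divergence (CD-1), and it runs on synthetic toy "spectra" rather than LIBS data. All names (`TinyRBM`, `make_toy_spectra`, `pca_reconstruct`) and all parameter values are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with CD-1 (an assumption, not the paper's stated setup)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def encode(self, v):
        # Hidden-unit activation probabilities: the low-dimensional code.
        return sigmoid(v @ self.W + self.b_h)

    def decode(self, h):
        # Visible-unit probabilities: the reconstructed spectrum.
        return sigmoid(h @ self.W.T + self.b_v)

    def fit(self, X, epochs=50, lr=0.1, batch_size=32):
        n = X.shape[0]
        for _ in range(epochs):
            order = self.rng.permutation(n)
            for start in range(0, n, batch_size):
                v0 = X[order[start:start + batch_size]]
                h0 = self.encode(v0)
                # Sample binary hidden states, then take one Gibbs step back.
                h_sample = (self.rng.random(h0.shape) < h0).astype(float)
                v1 = self.decode(h_sample)
                h1 = self.encode(v1)
                m = v0.shape[0]
                # CD-1 update: data statistics minus one-step model statistics.
                self.W += lr * (v0.T @ h0 - v1.T @ h1) / m
                self.b_v += lr * (v0 - v1).mean(axis=0)
                self.b_h += lr * (h0 - h1).mean(axis=0)
        return self


def pca_reconstruct(X, k):
    """Project onto the first k principal components and map back."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    codes = (X - mean) @ Vt[:k].T
    return codes, codes @ Vt[:k] + mean


def make_toy_spectra(n=200, n_channels=64, seed=1):
    """Synthetic stand-in data: three Gaussian 'lines' with random amplitudes."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_channels)
    peaks = np.stack([np.exp(-0.5 * ((x - c) / 2.0) ** 2) for c in (12, 30, 50)])
    X = rng.random((n, 3)) @ peaks + 0.01 * rng.random((n, n_channels))
    return X / X.max()  # scale intensities into [0, 1]


X = make_toy_spectra()
rbm = TinyRBM(n_visible=X.shape[1], n_hidden=3).fit(X)
rbm_codes = rbm.encode(X)
rbm_recon = rbm.decode(rbm_codes)
pca_codes, pca_recon = pca_reconstruct(X, k=3)

rbm_mse = float(np.mean((X - rbm_recon) ** 2))
pca_mse = float(np.mean((X - pca_recon) ** 2))
print(f"RBM reconstruction MSE: {rbm_mse:.5f}")
print(f"PCA reconstruction MSE: {pca_mse:.5f}")
```

Both methods compress each 64-channel toy spectrum to a 3-dimensional code and reconstruct it, so the two mean squared errors can be compared directly. The toy data are a linear mixture of three peaks, so this example says nothing about how the methods compare on real LIBS spectra; it only illustrates the workflow.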

Updated: 2020-05-01