Scalable Variational Gaussian Processes for Crowdsourcing: Glitch Detection in LIGO
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2020-09-21, DOI: 10.1109/tpami.2020.3025390
Pablo Morales-Álvarez, Pablo Ruiz, Scott Coughlin, Rafael Molina, Aggelos K. Katsaggelos

In recent years, crowdsourcing has been transforming the way classification training sets are obtained. Instead of relying on a single expert annotator, crowdsourcing shares the labelling effort among a large number of collaborators. For instance, it is being applied at the laureate Laser Interferometer Gravitational-Wave Observatory (LIGO) to detect glitches that might hinder the identification of true gravitational waves. The crowdsourcing scenario poses new challenges, as it has to deal with differing opinions from a heterogeneous group of annotators with unknown degrees of expertise. Probabilistic methods, such as Gaussian processes (GPs), have proven successful in modeling this setting. However, GPs do not scale well to large data sets, which hampers their broad adoption in real-world problems (LIGO in particular). This has led to the very recent introduction of deep-learning-based crowdsourcing methods, which have become the state of the art for this type of problem. However, the accurate uncertainty quantification provided by GPs has been partially sacrificed. This is an important aspect for astrophysicists at LIGO, since a glitch detection system should provide very accurate probability distributions for its predictions. In this work, we first leverage a standard sparse GP approximation (SVGP) to develop a GP-based crowdsourcing method that factorizes into mini-batches. This makes it able to cope with previously prohibitive data sets. This first approach, which we refer to as scalable variational Gaussian processes for crowdsourcing (SVGPCR), brings GP-based methods back to a state-of-the-art level and excels at uncertainty quantification. SVGPCR is shown to outperform both deep-learning-based methods and previous probabilistic ones when applied to the LIGO data. Its behavior and main properties are carefully analyzed in a controlled experiment based on the MNIST data set. Moreover, recent GP inference techniques are also adapted to crowdsourcing and evaluated experimentally.
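
The abstract says the method "factorizes into mini-batches" but does not spell out what that buys. The sketch below illustrates only the general idea, not the authors' implementation: when an objective contains a sum of per-instance terms, rescaling the sum over a random mini-batch by N/B gives an unbiased estimate of the full sum, so each optimization step can touch a small subset of the data. The per-instance values, function names, and batch size here are hypothetical stand-ins, not quantities taken from the paper.

```python
import random


def minibatch_estimate(per_point_terms, batch_indices):
    """Estimate sum(per_point_terms) from one mini-batch.

    The batch sum is rescaled by N / B, which makes the estimate
    unbiased when the batch is drawn uniformly at random.
    """
    n = len(per_point_terms)
    b = len(batch_indices)
    return (n / b) * sum(per_point_terms[i] for i in batch_indices)


def epoch_batches(n, batch_size, rng):
    """Shuffle indices 0..n-1 and split them into consecutive batches."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[s:s + batch_size] for s in range(0, n, batch_size)]


rng = random.Random(0)

# Hypothetical stand-ins for per-instance terms of an objective
# (for example, expected log-likelihood contributions).
terms = [rng.uniform(-2.0, 0.0) for _ in range(1000)]
full_sum = sum(terms)

# One pass over the data in 10 equal-sized batches of 100 instances.
batches = epoch_batches(len(terms), 100, rng)
estimates = [minibatch_estimate(terms, b) for b in batches]

# Averaged over the equal-sized batches of one epoch, the rescaled
# estimates recover the full sum up to floating-point error.
mean_estimate = sum(estimates) / len(estimates)
```

A single mini-batch estimate is noisy, which is why such objectives are typically optimized with stochastic gradient methods over many steps rather than evaluated once.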

Updated: 2020-09-21