Nonparametric semi-supervised classification with application to signal detection in high energy physics
Statistical Methods & Applications (IF 1.1), Pub Date: 2021-08-25, DOI: 10.1007/s10260-021-00585-3
Giovanna Menardi, Alessandro Casa

Model-independent searches in particle physics aim to complete our knowledge of the universe by looking for possible new particles not predicted by current theories. Such particles, referred to as signal, are expected to behave as a deviation from the background, which represents the known physics. Information available on the background can be incorporated into the search in order to identify potential anomalies. From a statistical perspective, the problem is recast as a peculiar classification problem in which only partial information is accessible. A semi-supervised approach is therefore adopted, either by strengthening the assumptions underlying clustering methods or by relaxing those underlying classification methods. In this work, following the first route, we semi-supervise nonparametric clustering in order to identify a possible signal. The main contribution consists of tuning a nonparametric estimate of the density underlying the experimental data to identify a partition that guarantees a signal warning while allowing for an accurate classification of the background. As a side contribution, a variable selection procedure is presented. The whole procedure is tested on a dataset mimicking proton–proton collisions performed within a particle accelerator. Although motivated by particle physics, the approach is applicable to various scientific domains where similar anomaly detection problems arise.




Updated: 2021-08-26