• Open Access

Bias and priors in machine learning calibrations for high energy physics

Rikab Gambhir, Benjamin Nachman, and Jesse Thaler
Phys. Rev. D 106, 036011 – Published 15 August 2022

Abstract

Machine learning offers an exciting opportunity to improve the calibration of nearly all reconstructed objects in high-energy physics detectors. However, machine learning approaches often depend on the spectra of examples used during training, an issue known as prior dependence. This is an undesirable property of a calibration, which needs to be applicable in a variety of environments. The purpose of this paper is to explicitly highlight the prior dependence of some machine-learning-based calibration strategies. We demonstrate how some recent proposals for both simulation-based and data-based calibrations inherit properties of the sample used for training, which can result in biases for downstream analyses. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence, whereas prior-independent data-based calibration remains an open problem.
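The prior dependence described above can be illustrated with a toy sketch. This is not the paper's setup: it assumes a one-dimensional true value z drawn from a Gaussian prior, a measured value x equal to z plus Gaussian smearing, and a calibration obtained by a least-squares linear fit of z on x. The function name `fit_mse_calibration` and all numerical values are hypothetical choices made for illustration.

```python
import random


def fit_mse_calibration(prior_mean, prior_width, resolution, n=50_000, seed=0):
    """Least-squares linear fit predicting the true value z from the measured x.

    Toy assumption: z ~ Normal(prior_mean, prior_width) and
    x = z + Normal(0, resolution). For this Gaussian case the
    mean-squared-error optimum is linear in x, so a linear fit suffices.
    """
    rng = random.Random(seed)
    zs = [rng.gauss(prior_mean, prior_width) for _ in range(n)]
    xs = [z + rng.gauss(0.0, resolution) for z in zs]

    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    cov_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n

    slope = cov_xz / var_x
    intercept = mean_z - slope * mean_x
    return lambda x: slope * x + intercept


# Same detector response (resolution), two different training spectra.
calib_low = fit_mse_calibration(prior_mean=50.0, prior_width=10.0,
                                resolution=10.0, seed=1)
calib_high = fit_mse_calibration(prior_mean=100.0, prior_width=10.0,
                                 resolution=10.0, seed=2)

# Apply both calibrations to the same measured value.
x_measured = 75.0
pred_low = calib_low(x_measured)
pred_high = calib_high(x_measured)
print(f"trained on low spectrum:  {pred_low:.1f}")
print(f"trained on high spectrum: {pred_high:.1f}")
```

In this toy model the fitted calibration approaches the posterior mean, which pulls the prediction toward the mean of the training spectrum. With equal prior width and resolution, the two predictions for the same measured value land near 62.5 and 87.5, even though the detector response is identical in both cases.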

  • Received 23 May 2022
  • Accepted 20 July 2022

DOI:https://doi.org/10.1103/PhysRevD.106.036011

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI. Funded by SCOAP3.

Physics Subject Headings (PhySH)

Particles & Fields

Authors & Affiliations

Rikab Gambhir1,2,*, Benjamin Nachman3,4,†, and Jesse Thaler1,2,‡

  • 1Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  • 2The NSF AI Institute for Artificial Intelligence and Fundamental Interactions
  • 3Physics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
  • 4Berkeley Institute for Data Science, University of California, Berkeley, California 94720, USA

  • *rikab@mit.edu
  • †bpnachman@lbl.gov
  • ‡jthaler@mit.edu

See Also

Learning Uncertainties the Frequentist Way: Calibration and Correlation in High Energy Physics

Rikab Gambhir, Benjamin Nachman, and Jesse Thaler
Phys. Rev. Lett. 129, 082001 (2022)

Issue

Vol. 106, Iss. 3 — 1 August 2022

Reuse & Permissions

It is not necessary to obtain permission to reuse this article or its components as it is available under the terms of the Creative Commons Attribution 4.0 International license. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and the published article's title, journal citation, and DOI are maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission from the rights holder directly for these figures.
