1 Introduction

Liquid scintillator (LS) detectors, composed of LS and photosensors, such as photomultiplier tubes (PMTs), have been widely used in neutrino experiments since the first detection of reactor antineutrinos [1]. Among the reactor neutrino experiments, KamLAND has measured the \(\theta _{12}\) driven oscillation [2], and \(\theta _{13}\) was measured at Daya Bay [3], RENO [4] and Double Chooz [5]. Besides, Borexino has made remarkable contributions in solar neutrino measurements [6, 7]. LS detectors also play important roles in the searches for neutrinoless double-beta (\(0\nu \beta \beta \)) decays, such as KamLAND-Zen [8] and SNO+ [9]. In the future, JUNO will build the largest LS detector to determine neutrino mass ordering (NMO) by precisely measuring the fine oscillation patterns with reactor neutrinos, which requires the uncertainty of the positron kinetic energy scale better than 1% and an unprecedented energy resolution of \(3\%\) at 1 MeV of visible energy [10,11,12,13,14].

For LS detectors used in reactor neutrino experiments, \(e^+\) from the inverse beta decay (IBD) interaction is the target signal, however, \(\gamma \)s instead of \(\beta \)s are the most common calibration sources. Charged particles \(e^\pm \) lose energy mostly by ionizing and exciting the solvent molecules, while \(\gamma \)s generate \(e^-\) (and at sufficiently high energies \(e^\pm \) pairs) firstly. [15]. The deexcited molecules release scintillation photons. Charged particles moving faster than light in the medium produce Cherenkov photons. Photons propagate in the detector and undergo series of optical processes until they hit PMTs and get converted into photoelectrons (PEs). The mapping between the number of photoelectrons (\(N_\textrm{pe}\)) and the particle deposited energy reflects the energy response of the LS detector, which is crucial to the spectral analysis of neutrino oscillations.

Due to different energy deposition processes, energy response in LS is particle-dependent. The nonlinearities in energy releases of \(e^\pm \) and \(\gamma \) have been thoroughly studied [14, 16]. For LS detectors with relatively poor energy resolution, the reported energy resolution is usually calibrated with mono-energetic \(\gamma \) sources without considering particle types, e.g., see Ref. [17]. LS detectors with high energy resolution will have the potential to distinguish the energy resolution curves of different particle types, and the spectral analysis for physics topics like NMO, solar neutrino and \(0\nu \beta \beta \) decays, etc, will benefit from more comprehensive knowledge of energy resolution. Thus, it is necessary to construct a unified energy resolution model.

The \(N_\textrm{pe}\) of mono-energetic charged particles like \(e^\pm \) is determined by the underlying energy deposition and optical processes in LS. \(\gamma \)s can be described by \(e^\pm \) because they deposit energies by generating primary \(e^\pm \) firstly. However, each \(\gamma \) has a different deposition mode, among which the multiplicities and energies of primary \(e^\pm \) are different. Therefore, the \(N_\textrm{pe}\) distribution of mono-energetic \(\gamma \)s needs to be derived from the collection of all deposition modes instead of a one-dimensional probability density function (PDF) of the kinetic energy of the primary \(e^\pm \) (e.g., see Fig. 7 in Ref. [16]). All in all, the nonlinearity and resolution are intrinsically correlated.

In this paper, we develop a unified energy model to describe particle-dependent energy response for LS detectors. Following a similar strategy as Ref. [16], we construct the energy response model for \(e^-\) firstly, then derive the \(\gamma \) and \(e^+\) models from it and calibrate the model with \(\gamma \) sources and continuous \(\beta \) spectra. By taking into account the dispersions in \(\gamma \) energy deposition modes mentioned above, the model is capable to describe nonlinearity and resolution simultaneously. The structure of this paper is as follows: in Sect. 2 details of the Geant4-based [18] Monte Carlo simulation are presented. In Sect. 3, we elaborate on the connections among particles in LS. The methods of model construction are described in Sect. 4. Then in Sect. 5 the calibration procedures and performances of the model are presented. Furthermore, we discuss the implications of calibration inputs and various energy resolution scenarios in Sect. 6. Finally we give some further remarks and summarize our main conclusions in Sect. 7.

2 Monte Carlo configurations

A set of Geant4-based (version 4.10.p02) Monte Carlo simulation software is developed. For simplicity, a liquid scintillator sphere is implemented as the target which is surrounded by photosensors. PMT geometry and its response are not accounted in this work in order to exclude the corresponding contribution to the LS energy model.

The Geant4 physics packages and custom codes jointly describe the physics processes in the detector. The low energy electromagnetic processes are described by the default Livermore model. A custom scintillation process covers the quenching effect, photon emission, absorption and re-emission. Other optical processes including Cherenkov process, Rayleigh scattering and boundary interactions are depicted with the official Geant4 codes. Besides, the optical parameters of the liquid scintillator are key to the optical propagation. Wavelength-dependent refractive indices are derived from measurement values and the dispersion relation [19], and the attenuation length and Rayleigh scattering length are taken from Refs. [19,20,21]. Scintillation properties are taken from the measurements including emission spectrum [22], fluorescence quantum yield [23,24,25] and time profile [26, 27]. Flexible studies with varied Monte Carlo inputs are feasible using our simulation software.

By tuning the detection efficiency of sensitive detectors, the photon statistics and hence the energy resolution vary among different simulation samples. Series of different detector configurations have been simulated under this framework by covering the range of energy resolution of several typical large-scale liquid scintillator detectors, for example JUNO (\(\sim 3.0\%\)) [10], Borexino (\(\sim 6.0\%\)) [7], KamLAND (\(\sim 6.5\%\)) [2] and Daya Bay (\(\sim 8.4\%\)) [3]. A baseline light yield is set as 1400 photoelectrons/MeV which corresponds to an energy resolution of electron-positron annihilation with zero kinetic energy around \(2.9\%\) in the current simulation framework. The other three configurations with light yields scaled by 1/2, 1/4 and 1/8 have the corresponding energy resolutions \(3.8\%\), \(5.5\%\) and \(7.8\%\). It is worth mentioning that the real corresponding relation between the light yield and the energy resolution depends on the specific experimental details, the values mentioned above are obtained under the simulation framework in this study.

3 Connections among \({e^\pm }\) and \({\gamma }\) in liquid scintillator

For generality and simplicity, the start vertices of all particles are assumed at the detector center and true \(N_\textrm{pe}\) detected by PMTs is used. The detector-dependent instrumental effects require specific studies for each experiment and can be added to this model.

To convert \(N_\textrm{pe}\) into the energy dimension, a general definition of “visible energy” is introduced as,

$$\begin{aligned} E_\textrm{vis}\equiv & {} {N_\textrm{pe}} / Y, \end{aligned}$$
(1)

where Y is the photoelectrons yield per \(\textrm{MeV}\) and is calibrated by the neutron capture on hydrogen using neutron calibration sources. The energy-dependent nonlinearity f and resolution R for the mono-energetic particles are defined as,

$$\begin{aligned} f(E)\equiv & {} \frac{\overline{E_\textrm{vis}}}{E}, \end{aligned}$$
(2)
$$\begin{aligned} R(E_\textrm{vis})\equiv & {} \frac{\sigma }{\overline{E_\textrm{vis}}}, \end{aligned}$$
(3)

where \(\overline{E_\textrm{vis}}\) and \(\sigma \) are the expected value and standard deviation of \(E_\textrm{vis}\), respectively.

Similar to Ref. [16], the \(e^-\) is regarded as a basic particle, which has energy response described by Eqs. (2) and (3) directly. In our model, we consider the \(e^+\) deposits its kinetic energy T equivalently as \(e^-\) approximately, followed by two identical \(0.511\,\textrm{MeV}\) annihilation \(\gamma \)s. Therefore, the energy response of \(e^+\) can be predicted with that of \(e^-\) directly,

$$\begin{aligned} f^{e^+}\equiv & {} \frac{\overline{E^{e^+}_\textrm{vis}}}{E^{e^+}} = \frac{\overline{E^{e^-}_\textrm{vis}}\left( T\right) +2\cdot \overline{E^{\gamma }_\textrm{vis}}\left( m_e\right) }{T+2m_e},\end{aligned}$$
(4)
$$\begin{aligned} R^{e^+}\equiv & {} \frac{\sigma ^{e^+}}{\overline{E^{e^+}_\textrm{vis}}} = \frac{\sqrt{\left[ \sigma ^{e^-}\left( T\right) \right] ^2+ 2\left[ \sigma ^{\gamma }(m_e)\right] ^2 }}{\overline{E^{e^-}_\textrm{vis}}\left( T\right) + 2\cdot \overline{E^{\gamma }_\textrm{vis}}\left( m_e\right) }, \end{aligned}$$
(5)

where the kinetic energy and annihilation energies are considered as two independent parts. One way to describe the two \(0.511\,\textrm{MeV}\) annihilation \(\gamma \)s is employing the \(^{\textrm{68}}\)Ge source. The \(^{\textrm{68}}\)Ge source is a positron source while the kinetic energy is mostly deposited in the enclosure, and the two annihilation \(\gamma \)s drift out and deposit energy in LS. The explicit formulae can be found in Sect. 4.

There are two cases that might introduce systematic bias, annihilation in flight and formation of positronium. Annihilation in flight will generate two \(\gamma \)s with higher energies, and the fraction of this effect is around 1% per MeV of the kinetic energy from the simulation [16]. If positronium is formed, there are two spin states: singlet (p-Ps) and triplet (o-Ps). Around \(2\%\) triplets decay to three \(\gamma \)s with the total energy of \(1.022\,\textrm{MeV}\), while two \(0.511\,\textrm{MeV}\) \(\gamma \)s are released in the other cases. To include these effects, one way is to parameterize the energy response of \(\gamma \). If one has the full knowledge of \(\gamma \) energy distributions from above two effects, the energy response of \(e^+\) can be corrected furthermore. In the simplified consideration of Eqs. (4)–(5), these two effects are ignored due to their relatively small contributions.

The \(\gamma \)s deposit energy mainly by generating primary \(e^\pm \) through three interactions: photoelectric effect, Compton scattering, and pair conversion. The multiplicities and energies of the primary \(e^\pm \) from mono-energetic \(\gamma \)s are different event by event (named “deposition mode”) resulting from the competition of those three interactions. According to the energy-dependent response functions in Eqs. (2) and (3), each collection of primary \(e^\pm \) in a deposition mode corresponds to a specific visible energy distribution:

$$\begin{aligned} \left[ E^\gamma _\textrm{vis}\right] _j = \sum _{l^j}\left[ E^{e^-}_\textrm{vis}\right] _{l^j}+\sum _{n^j}\left[ E^{e^+}_\textrm{vis}\right] _{n^j}, \end{aligned}$$
(6)

where j specifies the given deposition mode, while \(l^j\) and \(n^j\) enumerate all the primary \(e^-\) and \(e^+\) in the j-th deposition mode, respectively. The positron term in Eq. (6) can be also described by the \(e^-\) response according to Eqs. (4)–(5). The expected value of \(\left[ E_\textrm{vis}^\gamma \right] _j\) and its variance are calculated assuming that primary \(e^\pm \) are independent:

$$\begin{aligned} \overline{\left[ E^\gamma _\textrm{vis}\right] _j}&= \sum _{l^j}\overline{\left[ E^{e^-}_\textrm{vis}\right] _{l^j}}+\sum _{n^j}\overline{\left[ E^{e^+}_\textrm{vis}\right] _{n^j}}, \end{aligned}$$
(7)
$$\begin{aligned} \left[ \sigma ^\gamma \right] ^2_{j}&= \sum _{l^j}\left[ \sigma ^{e^-}\right] ^2_{l^j}+\sum _{n^j}\left[ \sigma ^{e^+}\right] ^2_{n^j}. \end{aligned}$$
(8)

The total visible energy distribution is supposed to be a mixed distribution for all the deposition modes. In previous nonlinearity studies, only the expected value \(\overline{E_\textrm{vis}^\gamma }\) is considered thus a one-dimensional PDF averaging over all deposition modes is sufficient for calculation [16]. However, to involve the resolution into consideration, each deposition mode requires to be calculated as an independent distribution so that the fluctuations among different modes are not washed out. Consequently, the nonlinearity and resolution for \(\gamma \)s can be calculated with the knowledge of all the possible deposition modes:

$$\begin{aligned} f^\gamma&= \frac{\overline{E_\textrm{vis}^\gamma }}{E^\gamma } = \frac{\sum _j w_j \overline{\left[ E^\gamma _\textrm{vis}\right] _j}}{E^\gamma }, \end{aligned}$$
(9)
$$\begin{aligned} R^\gamma&= \frac{\sigma ^\gamma }{\overline{E_\textrm{vis}^\gamma }} \nonumber \\&= \frac{\sqrt{\sum _jw_j\left[ \overline{\left[ E_\textrm{vis}^\gamma \right] _j}-\overline{E_\textrm{vis}^\gamma }\right] ^2+ \sum _jw_j \left[ \sigma ^\gamma \right] ^2_{j}}}{\overline{E_\textrm{vis}^\gamma }}\end{aligned}$$
(10)
$$\begin{aligned}&= \frac{\sqrt{\sigma _\textrm{nonl}^2 + \sigma _\textrm{ave}^2}}{\overline{E_\textrm{vis}^\gamma }}, \end{aligned}$$
(11)

where \(w_j\) is the weight of the j-th deposition mode. The two terms in the numerator of Eq. (10) are redefined as \(\sigma _\textrm{nonl}^2\) and \(\sigma _\textrm{ave}^2\) in Eq. (11), respectively. The second term \(\sigma _\textrm{ave}^2\) is the weighted sum of variance for all primary \(e^\pm \). The first term \(\sigma _\textrm{nonl}^2\) is the weighted sum of the square of mean value deviation which reflects the dispersion of all deposition modes. Apparently, this term requires an individual calculation of each event rather than the one-dimensional PDF, because different collections of the primary \(e^\pm \) generated by the same \(E^\gamma \) will transfer into different visible energy due to nonlinearity. The \(\sigma _\textrm{nonl}^2\) term will disappear if the energy response is linear in LS detectors. Therefore, the coupling of nonlinearity and resolution of \(e^-\) is embedded in the \(\gamma \) resolution.

4 Model construction

According to discussions in Sect. 3, the energy response for both \(e^+\) and \(\gamma \) can be deduced from that of \(e^-\). Therefore, it is crucial to construct nonlinearity and resolution models for \(e^-\) firstly. Details are described in the following.

4.1 Energy response model of \(\varvec{e^-}\)

4.1.1 Nonlinearity

The two luminescent processes in LS are scintillation and Cherenkov processes. It is necessary to discuss the contributions to energy nonlinearity from the two effects separately.

1. Quenching effect. Some fraction of the deposited energy is transferred into heat and only the remaining quenched energy \(E_\textrm{q}\) is visible by generating photons [28]. The quenching induced nonlinearity \(f_\textrm{q}\) is defined as \(f_\textrm{q} = \overline{E_\textrm{q}}/E\), and the expected value of visible energy from scintillation photons (\(\overline{E_\textrm{s}}\)) can be expressed by introducing the scintillation light yield \(Y_\textrm{s}\), which is the number of scintillation PEs per unit visible energy. As a comparison, Y is the total PEs per unit visible energy:

$$\begin{aligned} \overline{E_\textrm{s}} = f_\textrm{q}(k_\textrm{B}, E) \cdot E \cdot Y_\textrm{s} / Y, \end{aligned}$$
(12)

where \(k_\textrm{B}\) is the Birks’ coefficient and larger \(k_\textrm{B}\) yields larger quenching. Implementations of the quenching effect in Geant4 are based on the Birks’ model in discrete steps. Besides, we used a numerical integral of Birks’ model with stopping power data from ESTAR [29] for cross-check. The different calculation results of \(f_\textrm{q}(E)\) are shown in Fig. 1. The curves have apparent discrepancies in the low energy region, while similar best-fit curves of the total nonlinearity and resolution have been obtained in the follow-up analysis, which validates the feasibility and robustness of our model. Hereafter, the results are based on the numerical integral curves.

Fig. 1
figure 1

The quenching induced nonlinearity as a function of the deposited energy of electrons. Solid lines are derived from the numerical integral with the ESTAR \(\textrm{d}E/\textrm{d}x\) and dashed lines are Geant4-based simulation results with the production cut of \({0.1}{\textrm{mm}}\). The red, blue and gray lines are nonlinearity curves with \(k_\textrm{B}=5.5\times 10^{-3}\) g/cm\(^2\)/MeV, \(k_\textrm{B}=6.5\times 10^{-3}\) g/cm\(^2\)/MeV (inherent value in simulation) and \(k_\textrm{B}=7.5\times 10^{-3}\) g/cm\(^2\)/MeV, respectively. The kinks at 80 keV in the simulation curves are caused by the production cut setting in Geant4

2. Cherenkov radiation. Cherenkov effect heavily relies on LS refractive index and the absorption re-emission probability, especially in the ultraviolet region. The Frank–Tamm formula [30] is implemented in Geant4 for Cherenkov intensity calculation. The shape of the Cherenkov photoelectron yield as a function of the electron deposited energy \(f_\textrm{C}\left( E \right) \) can be obtained from Geant4 simulation and is shown in Fig. 2. To avoid the heavy dependency on the LS optical properties in simulation, an empirical parametrization of the Cherenkov light is constructed as,

$$\begin{aligned} \begin{array}{l} f_\textrm{C}\left( E \right) = \left\{ \begin{aligned} &{} 0, E^\prime < 0, \\ &{} \frac{p_0\cdot E^\prime }{ E^\prime + p_1\cdot e^{-p_2\cdot E^{\prime }}}, E^{\prime }\ge 0, \end{aligned} \right. \end{array} \end{aligned}$$
(13)

where \(E^\prime = E-E_0\), and \(E_0\) is the energy threshold of the Cherenkov effect. The fitting curve is also shown in Fig. 2, and there is a good agreement between the simulation and our model.

Fig. 2
figure 2

The Cherenkov PEs yield \(f_\textrm{C}\) for \(e^-\) from simulation (blue markers). The parametrization is shown as the red line

Fig. 3
figure 3

a Decomposition of \(e^-\) energy resolution. The black line is the total energy resolution from the simulation. The three shadow regions represent the Cherenkov part \((\sigma _\textrm{C}/E_\textrm{vis})^2\) (red), scintillation part \((\sigma _\textrm{s}/E_\textrm{vis})^2\) (blue) and covariance part \(2\cdot \textrm{Cov}\left[ E_\textrm{s}, E_\textrm{C} \right] /E_\textrm{vis}^2\) (gray). The boundary markers are the Monte Carlo simulation, blue inverted triangles for \(E_\textrm{s}\) and red triangles for \(E_\textrm{C}\). The blue dashed line and red dashed line refer to the Poisson statistics for scintillation and Cherenkov photons. Large excess w.r.t. the Poisson statistics exists in the Cherenkov case. b Parametrization of the energy resolution of \(e^-\) (simulation data as red markers, the best fit of the parametrization as the black line)

In summary, the expected visible energy is the sum of the scintillation part and the Cherenkov part, as shown in Eq. (14),

$$\begin{aligned} \overline{E_\textrm{vis}}&= \overline{E_\textrm{s}} + \overline{E_\textrm{C}} \nonumber \\&= f_\textrm{q}\left( k_\textrm{B}, E\right) \cdot E \cdot Y_\textrm{s} / Y + f_\textrm{C}\left( E, \varvec{p}\right) \cdot E / Y, \end{aligned}$$
(14)

where the parameter \(\varvec{p}\) are the four parameters involved in Eq. (13). A six-parameter energy nonlinearity model has been established for \(e^-\) based on the physical origins.

4.1.2 Resolution

Similar to the energy nonlinearity model, the contributions to the energy resolution from scintillation photons and Cherenkov photons have been studied separately. Considering the correlation between \(E_\textrm{s}\) and \(E_\textrm{C}\), the total fluctuation of the visible energy can be decomposed as:

$$\begin{aligned} \sigma ^2 = \sigma _\textrm{s}^2 + \sigma _\textrm{C}^2 + 2 \cdot \textrm{Cov}\left[ E_\textrm{s}, E_\textrm{C} \right] , \end{aligned}$$
(15)

where \(\sigma _s\) and \(\sigma _C\) represent the standard deviation of \(E_\textrm{s}\) and \(E_\textrm{C}\), and \(\textrm{Cov}\left[ E_\textrm{s}, E_\textrm{C} \right] \) is the covariance defined as \(\textrm{Cov}\left[ E_\textrm{s}, E_\textrm{C} \right] =\overline{E_\textrm{s}\cdot E_\textrm{C}}-\overline{E_\textrm{s}}\cdot \overline{E_\textrm{C}}\). In Fig. 3a the three parts in Eq. (15) are shown as the three shadow regions. The scintillation process is close to Poissonian as expected (blue inverted triangles). The Cherenkov photons have a rather large smearing due to particle track length fluctuation (red circles). A longer track length leads to more Cherenkov photons and relatively smaller \(\textrm{d}E/\textrm{d}x\), which induces a smaller quenching effect. Thus, a positive correlation between \(E_\textrm{s}\) and \(E_\textrm{C}\) degrades the energy resolution furthermore (gray region). The energy resolution for LS detectors is sensitive to the photon composition, and a higher Cherenkov ratio yields worse resolution with the same photon statistics.

Due to the strong correlation, decoupling of the three terms in Eq. (15) is difficult without separate benchmark measurements. Instead, the energy resolution for LS detectors is often empirically modeled as

$$\begin{aligned} R^2= & {} \left( \frac{\sigma }{\overline{E_\textrm{vis}}}\right) ^2 =\frac{a^2 / Y}{\overline{E_\textrm{vis}}} + \frac{b^2 \left( Y\right) ^{n-2}}{\left( \overline{E_\textrm{vis}}\right) ^{2-n}} + \frac{c^2 / Y^2 }{\overline{E_\textrm{vis}}^2}, \end{aligned}$$
(16)
$$\begin{aligned}= & {} R_{\textrm{stat}}^2 + R_{\mathrm {non-Pois}}^2 + R_{\textrm{noise}}^2, \end{aligned}$$
(17)

where the three terms in Eq. (16) are redefined as the corresponding terms in Eq. (17). The a-related term is mainly induced by statistical fluctuation, the b-related term is dominated by a non-Poisson fluctuation of Cherenkov PEs and c is caused by PMT dark noises. By the insertion of Y in Eq. (16), ab and c are all dimensionless variables. The parameter n in Eq. (16) is often fixed as 2 in previous studies [14, 16, 17], while it is released as a free parameter in this analysis for more flexible description extending to the higher energy region.

Suppose that PMT dark noises in the readout window obey Poisson statistics, the noise term is particle independent and can be estimated as

$$\begin{aligned} c^2 = R_\textrm{DN} \times T \times N_\textrm{PMT}, \end{aligned}$$
(18)

with the total number of PMTs \(N_\textrm{PMT}\), average dark noise rate \(R_\textrm{DN}\) and readout time window length T. There will be a trade-off between less dark noises in shorter windows and more signal photons in longer windows during the optimization of energy reconstruction, which requires a separate study. In the following, the \(R_\textrm{noise}\) term is neglected as no PMT noise is considered. But it is straightforward to include dark noise in the future by introducing the c-related term into the model. Figure 3b shows the resolution of \(e^-\) from simulation, together with the parametrization based on Eq. (16).

4.2 Energy response model of \(\varvec{\gamma }\)

The connections between energy resolution of \(\gamma \) and \(e^-\) have been constructed in Sect. 3. In Eqs. (9) and (10), the calculation of energy response for mono-energetic \(\gamma \)s requires applying \(e^-\) nonlinearity (Eq. (14)) and resolution (Eq. (16)) on the primary \(e^\pm \) collections of all deposition modes. To avoid complex modeling, the method adopted here is to sample deposition modes from simulation based on a certain physical model. With sufficient sampling statistics, the calculation results are supposed to converge to the true values.

Fig. 4
figure 4

Multiplicity distribution (a) and energy distribution (b) for \({8}{\textrm{MeV}}~\gamma \)s with Livermore model (blue) and Penelope model (red). The bin width in panel (b) is 80 keV. The relative bias is defined as \((L-P)/{L}\) where L represents the Livermore model and P represents the Penelope model, and displayed in the upper panels

To test the dependency on physical models of energy deposition of \(\gamma \)s, two common low-energy electromagnetic models in Geant4, namely Livermore and Penelope, have been compared. It is claimed by the Geant4 group that these two models can provide reliable results covering electrons and photons physics from \(250\,\textrm{eV}\) to \(1\,\textrm{GeV}\). Taking \(8\,\textrm{MeV}\) \(\gamma \)s as an example, consistent results of the energy and multiplicity distributions of primary \(e^\pm \) have been obtained between the two models in Fig. 4. Besides, the interaction cross sections for \(\gamma \) energy deposition are well theoretically calculated and measured in Ref. [31]. An algorithmic calculation of primary \(e^\pm \) distributions has been developed as in Ref. [32] and has excellent agreements with Geant4 simulation. Based on the above cross-validations on the \(\gamma \)s interactions in the reactor antineutrino energy range, we choose the Livermore model in the following simulation.

Decomposition of \(\gamma \) energy resolution as Eq. (11) is shown in Fig. 5. The energy resolution of \(e^-\) is also displayed as the dark green dotted line for comparison. The excess of \(\gamma \) resolution compared with \(e^-\) mainly comes from the nonlinearity induced smearing term \(\sigma _\textrm{nonl}^2\), which contributes more than \(20\%\) in the energy range below \(10\,\textrm{MeV}\). The remaining term \(\sigma _\textrm{ave}^2\) (red cross markers) is slightly smaller than \(e^-\) resolution with the same visible energy, because lower energy \(e^\pm \) in \(\gamma \) energy deposition has fewer Cherenkov photons.

Fig. 5
figure 5

Lower panel: decomposition of \(\gamma \) energy resolution according to Eq. (11), with \(\sigma ^2\) in blue solid line with round markers and \(\sigma _\textrm{ave}^2\) in red cross markers. The \(e^-\) energy resolution curve is shown as the dark green dotted line for comparison. The worse resolution for \(\gamma \) than \(e^-\) comes from \(\sigma _\textrm{nonl}^2 = \sigma ^2 - \sigma _\textrm{ave}^2\) (gray region) which is caused by the dispersion among deposition modes. The remaining \(\sigma _\textrm{ave}^2\) is slightly smaller than \(\left[ \sigma ^{e^-}\right] ^2\) due to less Cherenkov photons. Upper panel: the ratio of \(\sigma _\textrm{nonl}^2/\sigma ^2\), which is larger than 20% for all energies below 10 MeV

4.3 Energy response model of \(\varvec{e^+}\)

Precision measurement of reactor neutrino oscillations requires a deep understanding of the energy resolution of \(e^+\). Meanwhile, knowledge of the individual contributors to the resolution will provide guidelines for future improvement. As discussed in Sect. 3, the energy response of \(e^+\) in LS can be conveniently expressed by the energy response of \(e^-\) and calibration source \(^{68}\)Ge (Eqs. (4) and (5)). Therefore, by adding calibration data of \(^{68}\)Ge upon Eqs. (14) and (16), the detailed expressions of \(e^+\) energy resolution can be written as,

$$\begin{aligned} f^{e^+}&= \frac{\overline{E_\textrm{vis}^{e^-}}(T) + \overline{E_\textrm{vis}^{^{68}\textrm{Ge}}} }{T+2m_e} \nonumber \\&= \frac{\left[ f_\textrm{q}(k_\textrm{B}, T)\cdot T \cdot Y_\textrm{s}/Y + f_\textrm{C}(T, {\textbf{p}})\cdot T/Y \right] + \overline{E_\textrm{vis}^{^{68}\textrm{Ge}}}}{T + 2m_e}, \end{aligned}$$
(19)
$$\begin{aligned} R^{e^+}&{=}\nonumber \\&\frac{\sqrt{ \left[ a^2/Y\cdot \overline{E_\textrm{vis}^{e^-}}(T) {+} b^2Y^{n-2}\cdot (\overline{E_\textrm{vis}^{e^-}}(T))^n\right] {+} \left[ \sigma ^{{^{68}\textrm{Ge}}}\right] ^2}}{\overline{E_\textrm{vis}^{e^-}}(T) {+} \overline{E_\textrm{vis}^{^{68}\textrm{Ge}}}}, \end{aligned}$$
(20)

where \(\overline{E_\textrm{vis}^{^{68}\textrm{Ge}}}\) and \(\sigma ^{{^{68}\textrm{Ge}}}\) represent the expected value and standard deviation of the calibrated visible energy spectrum of \(^{68}\)Ge.

5 Model tuning and results

5.1 Model inputs

The radioactive gamma sources are the most important ones for LS detector calibration. In our Monte Carlo simulation, the gamma sources from Ref. [14] are used, as shown in Table 1. For simplicity, all \(\gamma \) sources are considered as bare sources without enclosures and simulated at the detector center. To reach \(0.01\%\) statistical uncertainty of the energy peaks, \(5\times 10^4\) events for each source are generated.

Table 1 List of the calibration sources from [14]

Besides the regular gamma calibration sources, some isotopes with continuous \(\beta \) spectra induced by energetic cosmic muons can also serve as the model inputs. One important source is \(^\textrm{12}\textrm{B}\), which decays with a Q value about \({13.4}{\textrm{MeV}}\) and lifetime around \({29}{\textrm{ms}}\). The visible energy spectrum of \(^\textrm{12}\textrm{B}\) spans from \({0}{\textrm{MeV}}\) to around \({14}{\textrm{MeV}}\). Another useful sample is Michel \(e^-\) generated by the decay of stopped cosmic muons. Michel \(e^-\) has a wide energy spectrum with the cut-off energy at \({52.8}{\textrm{MeV}}\). To achieve relatively low statistical uncertainties, the \(^\textrm{12}\textrm{B}\) samples and Michel \(e^-\) events are assumed as \({100}{\textrm{k}}\) and \({400}{\textrm{k}}\) respectively in this study. Taking the JUNO detector as an instance, where the muon rate is estimated as \({3.6}{\textrm{Hz}}\) [33], the charge ratio is approximated as the value at the sea level [34] and the muon stopping rate is around \(4\%\), the above statistics is equivalent to three months running period approximately. All the isotope spectra are simulated at the detector center without considering the detector geometric effects.

5.2 Statistical approach via \({\chi ^2}\) minimization

The inputs for model fitting are the energy spectra of the simulated sources mentioned in Sect. 5.1. The whole energy spectra of \(\gamma \) sources are utilized. For \(^\textrm{12}\textrm{B}\) spectrum, an energy window from \(3\,\textrm{MeV}\) to \(12\,\textrm{MeV}\) is chosen, both for suppression of the natural radioactivity and for covering the high energy range of reactor antineutrinos.

Fig. 6
figure 6

Comparison of \(\gamma \) energy response and \(^\textrm{12}\textrm{B}\) energy spectrum between the simulated results (red markers) and the best-fit model (black lines) (a gamma energy nonlinearity, b energy resolution, c \(^\textrm{12}\textrm{B}\) visible energy spectrum). The relative residual bias, defined as \((F-T)/T\) (T is the simulation value, F is the best-fit model prediction), is within \(0.05\%\) and \(1\%\) for nonlinearity and resolution, respectively. For nonlinearity of each multi-\(\gamma \) source, like \(^{60}\)Co, \(^{68}\)Ge and n\(^{12}\)C, the average energy of all its \(\gamma \)s is displayed. The resolution of multi-\(\gamma \) sources deviates from that of the single-\(\gamma \) source, mainly due to different Cherenkov PEs ratios

To determine the parameters in our energy model, a \(\chi ^2\) statistics is defined as Eq. (21) by comparing predictions (P) and measurements (M) of energy spectra for all input sources,

$$\begin{aligned} \chi ^2(Y_\textrm{s}, k_\textrm{B}, \varvec{p}, a, b, n)= & {} \sum _i \left[ \sum _{j^i} \frac{ \left( P_{j^i}^\gamma - M_{j^i}^\gamma \right) ^2}{\left( \sigma _{j^i}^\gamma \right) ^2} \right] \nonumber \\{} & {} + \sum _k \left[ \frac{\left( P_k^{^\textrm{12}\textrm{B}} - M_k^{^\textrm{12}\textrm{B}}\right) ^2}{\left( \sigma _k^{^\textrm{12}\textrm{B}}\right) ^2} \right] . \end{aligned}$$
(21)

The first term in Eq. (21) is for \(\gamma \)s. The summation of i enumerates all calibration \(\gamma \) sources, and the summation of \(j^i\) enumerates all energy bins for the i-th source. \(M_{j^i}\) and \(\sigma _{j^i}\) are the count of j-th energy bin and its statistical uncertainty. Neglecting the energy leakage in large LS detectors via a fiducial volume, the ideal energy spectra of \(\gamma \) sources are approximately expected as normal distributions \({\mathcal {N}}\left( \overline{E_\textrm{vis}^\gamma }, \sigma ^\gamma \right) \). Thus, the count of j-th energy bin (\(P_{j^i}\)) can be predicted according to \(\overline{E_\textrm{vis}^\gamma }\) and \(\sigma ^\gamma \), where the \(\overline{E_\textrm{vis}^\gamma }\) and \(\sigma ^\gamma \) can be calculated using the \(e^-\) energy response. The second term in Eq. (21) is the spectrum comparison of \(^\textrm{12}\textrm{B}\). The summation of k enumerates all fitting energy bins of the \(^\textrm{12}\textrm{B}\) energy spectrum. \(M_k^{^\textrm{12}\textrm{B}}\) and \(\sigma _k^{^\textrm{12}\textrm{B}}\) are the count of k-th energy bin and its statistical uncertainty. The \(^\textrm{12}\textrm{B}\) energy spectrum (\(P_k^{^\textrm{12}\textrm{B}}\)) is predicted by first applying the \(e^-\) nonlinearity (Eq. (14)) on the theoretical energy spectrum, and then smearing the spectrum according to the \(e^-\) energy resolution function (Eq. (16)). The Michel \(e^-\) spectrum is not used as a fitting input considering uncertainties, more details are discussed in Sect. 6.1.

Table 2 Fitting parameters summary (best-fit values and uncertainties) from \(\chi ^2\) minimization

The description of \(e^-\) nonlinearity introduces scintillation PEs yield \(Y_\textrm{s}\), Birks’ coefficient \(k_\textrm{B}\) in the quenching effect and 4 parameters in the Cherenkov effect \(f_\textrm{C}\). Besides, the \(e^-\) resolution model requires 3 more parameters ab and n. Therefore, total 9 physical parameters are involved in the model fitting. The \(\chi ^2\) in Eq. (21) is minimized with TMinuit.

5.3 Fitting results

The fitting procedures have been proceeded in the four simulation configurations mentioned in Sect. 2, and the detector with the energy resolution around \(2.9\%\) are demonstrated in this section, where the value of Y is determined as 1400 photoelectrons\(/\textrm{MeV}\). The fitted nonlinearity and resolution values for all these calibration \(\gamma \)s are compared with those from simulation in Fig. 6a, b. A residual bias for nonlinearity and resolution of \(\gamma \)s of less than 0.1% and 1%, respectively, can be achieved. The nonlinearity-only calibration in Ref. [14] based on a 4-parameter empirical formula is quoted as a comparison. The 0.1% residual bias level of the nonlinearity in Fig. 6a are comparable with the 0.1% residual bias level there. Among all \(\gamma \) calibration sources, there are three multi-\(\gamma \) sources, \(^{60}\)Co, \(^{68}\)Ge and n\(^{12}\)C, which have slightly different behavior of energy response compared to the other single \(\gamma \) sources. The displayed energy E for multi-\(\gamma \) sources in Fig. 6a is calculated as the average value of all the \(\gamma \)s, while the sum of all the \(\gamma \)s are displayed for multi-\(\gamma \) sources in Fig. 6b.

The \(^{68}\)Ge and \(^{60}\)Co sources have better energy resolution compared with single \(\gamma \) sources with same deposited energy, because the Cherenkov PEs are less for two softer \(\gamma \)s, resulting in smaller non-Poisson fluctuation. While for n-\(^{12}\)C, the worse resolution originates from the \(E_\textrm{vis}\) discrepancy for the two branches (see Table 1). Furthermore, the best-fit \(^\textrm{12}\textrm{B}\) spectrum has great consistency with the simulation data (see Fig. 6c).

The best-fit parameters are listed in Table 2, and the correlation matrix is shown in Fig. 7. The correlation matrix manifests features of the block matrix, where parameters of the scintillation nonlinearity, Cherenkov nonlinearity and energy resolution are strongly correlated internally. The nonlinearity part and the resolution part are only weakly correlated. However, around \(60\%\) correlation is found between the amplitudes of scintillation and Cherenkov components (\(Y_\textrm{s}\) and \(p_0\)). Amounts of scintillation photons and Cherenkov photons are negatively correlated due to the total PEs constraints, and may have bias compared to the nominal settings. For example, the best-fit value of the Birks’ coefficient \(k_\textrm{B} = (5.76\pm 0.03)\times 10^{-3}\) g/cm\(^2\)/MeV deviates from the nominal value \(6.5\times 10^{-3}\) g/cm\(^2\)/MeV, while the total nonlinearity fitting performs well as in Fig. 6a. To disentangle the strong correlation, standalone measurements on different components are important.

Fig. 7
figure 7

Correlation coefficients of the 9 parameters

5.4 \({e^+}\) nonlinearity and resolution from data-driven calibration

To validate the model fitting, mono-energetic \(e^+\) samples are simulated at the detector center covering the reactor antineutrino energy range. According to Eqs. (19) and (20), the \(e^+\) energy response can be derived from the best-fit \(e^-\) energy model and the measured data of \(^{68}\)Ge calibration source. The predictions of the \(e^+\) nonlinearity and resolution, as well as the uncertainties, are shown in Fig. 8. To calculate the uncertainty bands, fitting parameters are sampled according to Fig. 7 considering the full correlations for \(10^4\) times. The variations of energy nonlinearity and resolution at each energy point can be evaluated from the sampled parameter sets. The uncertainty bands are defined as the symmetrical 68.3% confidence interval of the distributions. The predictions have excellent agreements with the simulation data, where the relative residual bias for \(e^+\) nonlinearity and resolution is less than \(0.1\%\) and \(2\%\), respectively. Again, the residual bias of the nonlinearity fitting of \(e^+\) is similar with the \(0.2\%\) residual bias level in Ref. [14] and the uncertainty is relatively smaller.

Fig. 8
figure 8

Comparison of \(e^+\) energy response between the prediction and the simulated data (a nonlinearity, b resolution). The relative residual bias, with the same definition in Fig. 6, is less than \(0.1\%\) and \(2\%\) for nonlinearity and resolution, respectively. The uncertainty bands are scaled for better visualization

6 Discussion

6.1 Impacts of fitting data inputs

The fitting inputs for our model are \(\gamma \) calibration sources and cosmogenic \(^\textrm{12}\textrm{B}\). During the model tuning, we have found that the spectra of \(\gamma \) sources are critical to constrain the resolution model due to the mono-energetic nature. However, the bias and uncertainty increase in the high energy region due to lack of calibration data. Unlike the case in Ref. [16], our Monte Carlo data does not include electronics nonlinearity, thus the \(^\textrm{12}\textrm{B}\) spectrum provides limited improvements on the nonlinearity fitting. Moreover, the \(^\textrm{12}\textrm{B}\) spectrum is insensitive to the energy resolution either because of the continuous spectrum property.

The fitting performance in the higher energy region is checked by the Michel \(e^-\) samples. The caveat is that the theoretical spectrum of Michel \(e^-\) could be significantly affected if muons decay in atomic-bound states [35]. Moreover, there are both \(\mu ^-\) and \(\mu ^+\) components in the cosmic muons, so the Michel \(e^-\) spectrum might be contaminated by \(e^+\), which introduces additional uncertainties into the model prediction. Therefore, the Michel \(e^-\) spectrum is not used as a fitting input. As a test, we use the ideal energy spectrum of Michel \(e^-\) without considering theoretical uncertainties or the \(e^+\) contamination, and compare it with the model prediction using best-fit values in Table 2. The best-fit spectrum has good agreement with the simulated data, as shown in Fig. 9.

Fig. 9
figure 9

Comparison of Michel \(e^-\) visible energy spectrum between the model prediction and the simulated data. The relative residual bias, with the same definition in Fig. 6, is shown in the upper panel

Fig. 10
figure 10

The best-fit nonlinearity (a) and resolution (b) for \(\gamma \) and \(e^-\), as well as the predicted \(e^+\) responses. Shadow regions on the lines represent the error bands, which have been scaled for better visualization

In the energy region of reactor antineutrinos, our model enables a high-precision calibration of the energy nonlinearity and resolution of \(e^+\). While extending to higher energy, the energy resolution is less constrained due to the lack of mono-energetic sources. This also motivates the development of new calibration strategies in the specific energy region of interest.

6.2 Energy response separation among different particles

In Fig. 10, we compare the best-fit nonlinearity and resolution for \(\gamma \) and \(e^-\), as well as the predicted \(e^+\) responses. The \(^{68}\)Ge calibration source anchors the starting points of the \(e^+\) nonlinearity and resolution. Some observations can be made below.

  • At the same \(E_\textrm{vis}\), the required energy of \(e^-\) is lower than that of \(e^+\), whereas the energy resolution of \(e^-\) is slightly worse than that of \(e^+\), due to more Cherenkov photons and larger non-Poisson fluctuations for \(e^-\), as discussed in Sect. 4.1.2.

  • The nonlinearity curve of \(\gamma \) is below that of \(e^-\) because its energy deposition is via multiple secondary \(e^\pm \) with smaller energies. At low energies, the deposited energy of \(e^+\) is dominated by the annihilation part, so \(e^+\) generates fewer photons than the single \(\gamma \)s with the same total energy. As energy increases, the nonlinearity curve of \(e^+\) will exceed that of \(\gamma \)s.

  • The \(\gamma \) resolution is notably worse than those of \(e^-\) and \(e^+\). It mainly results from the diversity of energy deposition modes as discussed in Sect. 3.

  • Similar to previous work [14, 16], the current model is able to separate the nonlinearity curves for \(\gamma \) and \(e^\pm \). Moreover, the new feature is the ability to obtain particle-dependent energy resolution curves, see Fig. 10b. High-resolution detectors like JUNO are capable of distinguishing the energy resolution between \(\gamma \) and \(e^+\) at the reactor antineutrino energy range.

Fig. 11
figure 11

Fitting results of energy resolution for \(e^+\) are displayed, and the shadow error regions on the lines have been scaled for better visualization. Calibration data of \(\gamma \) sources are displayed together for comparison. Four different detector configurations have been studied. As the energy resolution of detectors becomes worse, the separation ability between \(e^+\) and \(\gamma \) decreases

6.3 Model results on detectors with varied resolutions

By tuning the detection efficiency of the photosensors in the Monte Carlo software, it is convenient to study the model performances versus different resolutions. Four cases are simulated with the resolution values of zero kinetic energy positron being \(2.9\%, 3.8\%, 5.5\%\) and \(7.8\%\), respectively. The same fitting procedure is applied and the results are shown in Fig. 11. The best-fit curves of \(e^+\) resolution are shown as colored lines with error bars scaled for better visualization. For comparison, the resolution values of single \(\gamma \) sources are also superimposed. The statistics of \(N_\textrm{pe}\) dominates the energy resolution variation among the four cases. The discrimination ability between \(e^+\) and \(\gamma \) decreases as the energy resolution of detectors becomes worse. This can be simply understood that as the LS-induced nonlinearities among all detectors are similar, the fraction of \(\sigma ^2_\textrm{nonl}\) in \(\left[ \sigma ^\gamma \right] ^2\) becomes smaller for detectors with worse energy resolution according to Eq. (11). Results in Fig. 11 indicate that the gamma resolution curve is incapable to approximate \(e^+\) for high-resolution detectors. The proposed data-driven calibration strategy for \(e^+\) resolution will strongly advance the precision measurements of reactor antineutrinos.

7 Summary and prospect

A comprehensive study has been carried out to construct a unified energy model for \(\gamma \) and \(e^\pm \) in LS detectors. Based on Monte Carlo studies, we have demonstrated a promising data-driven calibration approach to predict the \(e^+\) nonlinearity and resolution simultaneously in the reactor antineutrino energy region. Without considering uncertainties from enclosures of \(\gamma \) sources, backgrounds in the \(^{12}\textrm{B}\) spectrum and other instrumental induced effects, the relative residual bias in nonlinearity and resolution can achieve within \(0.1\%\) and \(2\%\), respectively. For higher energy region, the good agreement with simplified Michel \(e^-\) spectrum has preliminarily validated the model performance, however, further exploration of dedicated calibration strategies will be needed. The study of energy resolution decomposition will also motivate our future developments on scintillation/Cherenkov photon discrimination algorithms.

In practice, the observed \(N_\textrm{pe}\) is often influenced by the PMT and electronics response, such as charge smearing and dark noises, and the geometric effects in LS detectors will influence the energy response such as the non-uniformity. Those instrumentally induced effects can be effectively eliminated by the reconstruction algorithms, e.g., see discussions in Refs. [14, 36, 37] for JUNO. For the application to a realistic LS detector, these effects that impact energy response should be considered carefully and added to our model.

The JUNO-TAO experiment [38], which is a ton-scale satellite experiment of JUNO, aims to precisely measure the reactor antineutrino spectrum. Its targeted effective energy resolution is <2% and a similar calibration strategy has been developed [39] to control the nonlinear energy scale to be <1%. Our energy model would be also applicable to JUNO-TAO, and its high-resolution data could be of great value to disentangle the Cherenkov and quenching effects in energy nonlinearity and resolution.