Implementation of Feldman-Cousins corrections and oscillation calculations in the HPC environment for the NOvA Experiment

Analysis of neutrino oscillation data involves a combination of complex fitting procedures and statistical correction techniques that are used to determine the full three-flavor PMNS parameters and constraint contours. These techniques rely on computationally intensive “multi-universe” stochastic modeling. The process of calculating these contours and corrections can dominate the final stages of the data analysis and become a bottleneck for examining the effect of systematic variations on the final results. As part of the DOE SciDAC-4 sponsored research program, we present a new implementation of a neutrino oscillation fitting framework to carry out calculations of Feldman-Cousins corrections. This implementation is specifically designed and optimized to operate on modern High-Performance Computing (HPC) facilities. We present the performance of the system in calculating allowed regions for the NOvA experiment based on 8.9×10²⁰ and 6.9×10²⁰ protons-on-target (POT) neutrino and antineutrino datasets, and compare the performance of this new implementation, run at the NERSC supercomputing facility, with that of the methods used previously by the NOvA collaboration running on grid computing facilities.

∗e-mail: alex.sousa@uc.edu


Neutrino Oscillations and the NOvA Experiment
Over the last 20 years, very compelling evidence has been assembled that neutrinos change flavor as they propagate from their production point. Neutrino oscillations require the three weak eigenstates (νe, νμ, ντ) to be distinct from the neutrino mass states (ν1, ν2, ν3). The complete set of experimental data available is almost perfectly described by oscillations driven by two independent mass-squared differences, ∆m²₃₂ and ∆m²₂₁, and by a 3×3 neutrino mixing matrix [1], where two of the mixing angles, θ23 and θ12, have large values, and the third mixing angle, θ13, is measured to have a relatively small but non-zero value of θ13 = 8.37° ± 1.62° [2]. Although we have now measured most of the neutrino mixing parameters, two fundamental parameters remain unknown: a) the CP-violating phase in the leptonic sector, δCP; and b) the sign of ∆m²₃₂, commonly referred to as the neutrino mass hierarchy.
If neutrino oscillations do not preserve CP symmetry, we will have the first evidence for CP violation in the leptonic sector, and may potentially explain the matter-antimatter asymmetry observed in the universe. The measurement of the mass hierarchy (normal for ∆m²₃₂ > 0 or inverted for ∆m²₃₂ < 0) has profound implications for our understanding of supernova explosions [3] and is essential for the interpretation of future results by neutrinoless double-beta decay experiments [4]. The key to determining these two parameters lies in precisely measuring the subdominant νμ → νe transition probability, which is sensitive to both CP violation and matter effects. Assuming a three-neutrino mixing scenario, the νμ → νe oscillation probability including matter effects is approximately given by Eq. 1:

$$
\begin{aligned}
P(\overset{\scriptscriptstyle(-)}{\nu}_{\mu} \to \overset{\scriptscriptstyle(-)}{\nu}_{e}) \approx\;& \sin^2 2\theta_{13}\,\sin^2 2\theta_{23}\,\frac{\sin^2[(A-1)\Delta]}{(A-1)^2}\\
&+ 2\alpha\,\sin\theta_{13}\cos\delta_{CP}\sin 2\theta_{12}\sin 2\theta_{23}\,\frac{\sin A\Delta}{A}\,\frac{\sin[(A-1)\Delta]}{A-1}\cos\Delta\\
&\overset{\scriptscriptstyle(+)}{-}\; 2\alpha\,\sin\theta_{13}\sin\delta_{CP}\sin 2\theta_{12}\sin 2\theta_{23}\,\frac{\sin A\Delta}{A}\,\frac{\sin[(A-1)\Delta]}{A-1}\sin\Delta, \qquad (1)
\end{aligned}
$$

where ∆ ≡ ∆m²₃₂L/4E, α ≡ ∆m²₂₁/∆m²₃₂, and A ≡ +(−) G_F nₑ L/(√2 ∆). The latter factor arises from the passage of neutrinos through matter. The symbols in parentheses indicate how the expression is modified in the case of antineutrino oscillations. It is clear that all terms in Eq. 1 depend on the value of the small mixing angle θ13. It is also apparent that the matter-effect contribution, determined by A, is directly proportional to the neutrino propagation length L, and that it changes sign depending on the neutrino mass hierarchy. In vacuum, CP violation in neutrino oscillations is equivalent to having P(νμ → νe) ≠ P(ν̄μ → ν̄e). This is only possible if the third term in Eq. 1 does not vanish, a condition verified if δCP is not an integer multiple of π. However, if matter effects are included, we can measure P(νμ → νe) ≠ P(ν̄μ → ν̄e) even in the absence of CP violation, leading to ambiguities that pose major challenges in simultaneously resolving the mass hierarchy and CP violation.
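Eq. 1 can be evaluated numerically. The following Python sketch implements it term by term; the default parameter values and the 1.52×10⁻⁴ matter-potential constant are illustrative assumptions, not values taken from the text:

```python
import math

def appearance_prob(L_km, E_GeV, dm2_32=2.5e-3, dm2_21=7.5e-5,
                    th13=0.146, th23=0.785, th12=0.584, dcp=0.0,
                    rho_Ye=1.4, antineutrino=False):
    """Sketch of Eq. 1: approximate P(numu -> nue) with matter effects.

    Angles in radians, mass splittings in eV^2, L in km, E in GeV.
    rho_Ye is the matter density times electron fraction (g/cm^3); the
    1.52e-4 constant in the matter term is an assumed standard value.
    """
    delta = 1.267 * dm2_32 * L_km / E_GeV      # Delta = dm2_32 L / 4E
    alpha = dm2_21 / dm2_32
    A = 1.52e-4 * rho_Ye * E_GeV / dm2_32      # matter term, +(-) convention
    if antineutrino:                           # antineutrinos: A -> -A and the
        A, dcp = -A, -dcp                      # sin(dcp) term flips sign
    s, c = math.sin, math.cos
    t1 = s(2*th13)**2 * s(2*th23)**2 * s((A-1)*delta)**2 / (A-1)**2
    common = (2*alpha*s(th13)*s(2*th12)*s(2*th23)
              * s(A*delta)/A * s((A-1)*delta)/(A-1))
    return t1 + common*c(dcp)*c(delta) - common*s(dcp)*s(delta)
```

For NOvA-like inputs (L = 810 km, E = 2 GeV) the matter term A is small but non-negligible, and its sign flip between neutrinos and antineutrinos is what carries the hierarchy sensitivity discussed above.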
The NuMI Off-Axis νe Appearance experiment (NOvA) has as its primary goal the measurement of νμ → νe oscillations, by looking for the appearance of electron neutrinos or antineutrinos in the Neutrinos at the Main Injector (NuMI) beam produced at Fermilab. This is accomplished using two detectors separated by 810 km, placed 14 mrad off the NuMI beam axis. The 300 ton Near Detector (ND), located underground next to the MINOS Near Detector hall at Fermilab, and the 14 kton Far Detector (FD), positioned at the surface in Ash River, Minnesota, are composed of extruded PVC modules filled with liquid scintillator. The scintillation light in each cell is collected by a loop of wavelength-shifting fiber, which in turn is read out by a 32-pixel avalanche photodiode (APD). By virtue of their off-axis positioning, the detectors sample a narrow-band beam peaked at 2 GeV, near the νμ → νe oscillation maximum for the 810 km baseline. NOvA began operations in 2014. Since 2016, the NuMI beam has been running at its nominal power of 700 kW, corresponding to an intensity of 4.9×10¹³ protons impinging on the beam target during each 10 μs pulse, delivered every 1.33 seconds. By taking advantage of the large neutrino matter effects afforded by NOvA's long 810 km baseline, and of the NuMI capability of running in neutrino-dominated or antineutrino-dominated configurations, NOvA can independently measure P(νμ → νe) and P(ν̄μ → ν̄e), and disentangle ambiguities between values of δCP and the mass hierarchy for some portions of the parameter space. In the analysis and results presented here, we use NOvA exposures of 8.9×10²⁰ protons-on-target (POT) in neutrino running, and 6.9×10²⁰ POT in antineutrino running. Besides measurements of νe appearance, NOvA is also able to measure the oscillation parameters ∆m²₃₂ and sin²2θ23 via νμ disappearance with unprecedented precision. NOvA can also search for beyond-Standard-Model physics, detect supernova explosions, and is carrying out a rich program of neutrino cross-section measurements using its Near Detector.

Parameter Extraction in NOvA
Long-baseline neutrino experiments such as NOvA, T2K, and, in the future, DUNE, are designed to measure the energy spectra of muon neutrinos (νμ) and muon antineutrinos (ν̄μ) produced in an accelerator beam, using a detector in close proximity to the neutrino source. In the absence of neutrino oscillations, the muon neutrino and antineutrino spectra extrapolated from the Near Detector measurement would be similar to the observed Far Detector spectrum. Similarly, in the absence of oscillations, the fractions of electron neutrinos (νe) and electron antineutrinos (ν̄e) would be consistent between the Near and Far measurements, and would correspond to the ∼1% fraction of νe and ν̄e intrinsic to the beam. In the presence of neutrino oscillations, an energy-dependent depletion of the νμ and ν̄μ spectra is observed in the Far Detector. At the same time, the oscillations enhance the rate measured for the νe and ν̄e populations in the same energy range.

Figure 1: Illustration of the disappearance analysis principle and an example of the actual estimated parameters from this procedure. The oscillation parameters distort the oscillated muon neutrino energy spectrum with respect to the predicted unoscillated spectrum, enabling measurement of the parameters through a shape fit.

Comparison of the predicted νμ and νe spectra from the Near Detector measurement with the Far Detector observation permits the measurement of the νμ and ν̄μ survival probabilities P(νμ → νμ) and P(ν̄μ → ν̄μ), and of the νe and ν̄e appearance probabilities P(νμ → νe) and P(ν̄μ → ν̄e). Extraction of the PMNS mixing parameters can then be done by analyzing the depletion and enhancement regions. The position of the depletion region is driven by the characteristic frequency of the oscillations and is used to extract the value of the neutrino mass splitting ∆m²₃₂, while the amplitude of the depletion is used to determine the mixing angles θ23 and θ13. In practice, long-baseline experiments compare predictions assuming different sets of oscillation parameters with the Far Detector data, with the level of agreement estimated using a negative log-likelihood. Minimizing the negative log-likelihood allows us to identify the set of oscillation parameters best describing the experimental observations, and to define confidence intervals for the parameter measurement.
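The comparison of oscillated predictions with the Far Detector spectrum via a negative log-likelihood can be sketched with a toy one-parameter scan; all numbers below are illustrative, not NOvA data:

```python
import math

# Toy binned numu disappearance scan (illustrative numbers only).
L, DM2 = 810.0, 2.5e-3                        # baseline [km], splitting [eV^2]
energies = [1.0 + 0.25*i for i in range(12)]  # bin centers [GeV]
unosc = [120.0]*len(energies)                 # unoscillated FD prediction per bin

def predict(s22th23):
    """Oscillated numu spectrum for a given sin^2(2*theta23)."""
    return [n * (1.0 - s22th23 * math.sin(1.267*DM2*L/e)**2)
            for n, e in zip(unosc, energies)]

def nll(pred, obs):
    """Poisson negative log-likelihood, -2 ln L (constant terms dropped)."""
    return 2.0*sum(p - o + (o*math.log(o/p) if o > 0 else 0.0)
                   for p, o in zip(pred, obs))

obs = predict(0.95)                           # pretend data, true value 0.95
grid = [0.80 + 0.005*i for i in range(41)]    # scan 0.80 .. 1.00
best = min(grid, key=lambda s: nll(predict(s), obs))
```

The scan recovers the injected value because the negative log-likelihood vanishes when prediction and observation coincide; the real fits minimize over many oscillation and nuisance parameters simultaneously.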

Statistical Interpretation of Results
The frequentist approach to assessing the level of agreement between data and prediction involves calculating the probability of observing a random variable x under a model subject to parameters θ. The probability P(x|θ), viewed as a function of the model's parameters, is the likelihood function L(θ). An estimate of the model's true parameters can be found as the set of parameters that maximizes the likelihood function L(θ). This can also be cast as a minimization problem by relating the log of the likelihood function to a chi-square statistic:

$$
\chi^2(\vec{\theta}) = -2\log\mathcal{L}(\vec{\theta}) \qquad (2)
$$

Confidence intervals can then be constructed using the test statistic:

$$
\Delta\chi^2 = \chi^2(\vec{\theta}) - \chi^2(\vec{\theta}_{\mathrm{best}}) \qquad (3)
$$

which, according to Wilks' theorem [5], follows a standard χ² distribution. Wilks' theorem holds in the limit of Gaussian errors, and in regions where the free parameters are far from physical boundaries. In this limit, the p-value can be calculated analytically, and converted to a statistical significance in terms of a standard error measured directly by √∆χ². This is referred to as the Gaussian χ² surface.

Figure 2: Empirically determined ∆χ² distribution compared with the predicted ∆χ² distribution for two degrees of freedom assuming Gaussian errors. The critical value ∆χ²crit = 4.61, corresponding to the 90% CL for a χ² distribution with two degrees of freedom, is compared to the ∆χ²FC value up to which 90% of pseudoexperiments are included.

In NOvA, parameter estimation for sin²θ23, ∆m²₃₂, and δCP proceeds either individually for each parameter, or in pairwise combinations to determine correlated 2D acceptance regions. However, the validity conditions of Wilks' theorem are not fully satisfied by the NOvA measurements. Consequently, the resulting ∆χ² surfaces are not necessarily distributed as a standard χ² distribution, and the confidence intervals taken from the Gaussian χ² surface may both overcover and undercover the true significance of the parameters being estimated.
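For the two-degree-of-freedom case relevant to the 2D contours, the Gaussian-limit p-value has a closed form, p = exp(−∆χ²/2), which reproduces the ∆χ²crit ≈ 4.61 threshold at 90% CL. A minimal sketch:

```python
import math

def wilks_pvalue_2dof(delta_chi2):
    """p-value of delta-chi^2 under a chi^2 distribution with 2 d.o.f."""
    return math.exp(-delta_chi2 / 2.0)

def critical_value_2dof(cl):
    """delta-chi^2 threshold containing a fraction `cl` of the distribution."""
    return -2.0 * math.log(1.0 - cl)

# 90% CL threshold for 2 d.o.f.: -2 ln(0.10) = 4.605...
```

For other numbers of degrees of freedom the survival function of the χ² distribution must be evaluated numerically; the closed form above is special to two degrees of freedom.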

The Feldman-Cousins Ordering Principle
Knowledge of the actual ∆χ 2 distribution for the observations is required to construct confidence intervals that correctly cover the true values of the parameters being estimated. The Feldman-Cousins unified approach [6] provides a method to determine empirically the correct distributions, and thereby compute confidence intervals with the correct coverage. The NOvA collaboration uses this approach when reporting neutrino oscillation results.
Implementation of the Feldman-Cousins prescription involves creating fake data spectra that include both statistical fluctuations and fluctuations within the expected systematic uncertainty envelopes. These spectra are often referred to as a collection of pseudoexperiments, denoted {Dᵢ(θ)}, and are generated for each point in parameter space as a function of the experimental parameters θ. For each pseudoexperiment, a ∆χ² is computed under the same fitting procedure used for estimating the oscillation parameters. This is done for thousands of pseudoexperiments, to generate a distribution of ∆χ²(θᵢ) at each point in the parameter space being probed. These distributions can then be used to compute the p-value pᵢ corresponding to the specific value of ∆χ² obtained from the fit to data, as shown in Figure 2. Construction of the confidence intervals for an observation thus requires the generation of these distributions and the mapping of p-values to χ² values at each point in the physics model's parameter space. The hypothesis test for any interval then reduces to applying the mapping χ² → P(σ) across the surface under the assumption that θ = θtrue.
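The empirical construction can be sketched as follows; here the per-pseudoexperiment ∆χ² values are simply drawn from a toy distribution, whereas in the real procedure each one is the result of a full fit:

```python
import random

def fc_critical_value(delta_chi2_samples, cl=0.90):
    """Empirical Feldman-Cousins critical value: the delta-chi^2 below
    which a fraction `cl` of the pseudoexperiments fall."""
    s = sorted(delta_chi2_samples)
    return s[min(int(cl * len(s)), len(s) - 1)]

def fc_pvalue(delta_chi2_data, delta_chi2_samples):
    """Fraction of pseudoexperiments with delta-chi^2 above the data value."""
    n_above = sum(1 for d in delta_chi2_samples if d > delta_chi2_data)
    return n_above / len(delta_chi2_samples)

# Toy stand-in for the fitted pseudoexperiments: an exponential with rate
# 1/2 has exactly the chi^2 (2 d.o.f.) shape, so the empirical critical
# value should land near the Gaussian 4.61 in this artificial case.
rng = random.Random(42)
samples = [rng.expovariate(0.5) for _ in range(4000)]
```

When the true ∆χ² distribution deviates from the χ² shape, as it does for the NOvA measurements, the empirical critical value shifts away from 4.61, which is precisely the correction the Feldman-Cousins procedure supplies.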

Computational Challenges
This multi-universe Feldman-Cousins approach was successfully implemented and used in the previous, 2017, NOvA analysis. The 2017 results were produced using exclusively the neutrino-mode data sample (as opposed to the combined neutrino and antineutrino samples of the 2018 analysis), simplifying the fits and limiting their computational cost. At the time, the Feldman-Cousins corrections were performed exclusively on the Fermilab computing grid. The computations took approximately four weeks to complete, causing a significant delay between the point when the data was first unblinded with its final selection criteria and the point at which the results of the analysis could be reviewed by the NOvA Collaboration.
The 2018 NOvA analysis combined neutrino and antineutrino oscillation measurements in both the νe and νμ channels into a unified fit. This more intricate measurement also required taking into account additional systematic uncertainties impacting the different measurement channels, which increased the number of nuisance parameters included in the final fit. The added complexity increased the fitting time of a single pseudoexperiment by an order of magnitude: the 2017 analysis fits had convergence times on the order of 2-3 minutes for the ∆m²₃₂ and θ23 parameter combinations, while the 2018 analysis fits in the same parameter combination were observed to have a mean convergence time of ∼40 minutes. This increase in the computational cost of the analysis is amplified by the number of oscillation parameters and parameter combinations extracted in the 2018 analysis. For single-parameter measurements (slices), with profiling over the other parameters, each parameter range was broken into 60 equally spaced regions. For measurements involving the simultaneous extraction of a pair of parameters (2-dimensional contours), each parameter range was subdivided into 30 equally spaced intervals, creating 900 sampling points across the surface.

The Gaussian χ² surface resulting from the primary fits is available prior to the computation of the pseudoexperiments. This allows the surface to be examined and regions of high ∆χ² to be identified. To reduce the computational requirements, points determined to be well beyond the 3σ confidence limit are removed from the computation. In this manner, the interesting region of the parameter space was reduced to a total of 471 points for the 2D contours. At each point, a minimum of 4,000 pseudoexperiments were run, with up to 10,000 pseudoexperiments being required at some points to provide adequate statistical coverage for the 3σ boundaries.
In total, a minimum of 6,684,000 pseudoexperiments were required to be generated and fitted by the NOvA analysis. This number of pseudoexperiments and their corresponding fits represents a computational problem that is not well matched to the shared grid computing facilities that are available to NOvA from Fermilab, the Open Science Grid, and other computing facilities.
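A back-of-envelope estimate of this workload, assuming the ∼40 minute mean convergence time applies to every fit (an oversimplification, since the one-dimensional slices are cheaper):

```python
# Back-of-envelope workload estimate from the figures quoted in the text.
n_fits = 6_684_000          # minimum number of pseudoexperiment fits
minutes_per_fit = 40        # mean 2018-analysis convergence time (varies)

core_hours = n_fits * minutes_per_fit / 60.0
# ~4.5 million core-hours of serial work, which illustrates why a shared
# grid allocation struggles to turn this around on an analysis timescale.
```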

Implementation in the HPC Environment - SciDAC-4 Project
The large scale of the Feldman-Cousins computations needed for the 2018 NOvA analysis motivated us to explore implementation of the computations on a high-performance computing (HPC) platform. This implementation was performed by a collaboration between NOvA scientists and members of the SciDAC-4 sponsored project "HEP Data Analytics on HPC". This SciDAC-4 project has been designed to bring together HEP and ASCR scientists to solve high-energy physics problems using HPC facilities.
The project targeted the NERSC facilities for the Feldman-Cousins computations. From its initial stages, the design and implementation were made to be compatible with the Ivy Bridge, Haswell, and Knights Landing architectures, along with the configurations of the compute nodes present on the Edison and Cori supercomputers. To do this, the decision was made to containerize the NOvA code instead of building it natively on each of the individual machines. The baseline NOvA fitting code was originally implemented within the ROOT [7] data analysis framework. The minimizations themselves are performed by the MINUIT2 library [8] through the ROOT framework's API. The inputs to the fits are constructed through a combination of Monte Carlo spectra, Near Detector data spectra, and the spectra produced from the data of selected neutrino events in the Far Detector. The likelihood function to be minimized then takes the form:

$$
\chi^2(\vec{\theta},\vec{\eta}) = 2\sum_{i}\left[N_i(\vec{\theta},\vec{\eta}) - O_i + O_i\ln\frac{O_i}{N_i(\vec{\theta},\vec{\eta})}\right] + \sum_{j}\frac{(\eta_j-\delta_j)^2}{\sigma_j^2} + \sum_{j}\frac{\delta_j^2}{\sigma_{\delta_j}^2}
$$

where Nᵢ and Oᵢ are the predicted and observed numbers of events in the i-th bin of the fitting spectra, as functions of the oscillation model parameters θ and the nuisance parameters η. The second sum in the likelihood function is the primary pull term: all of the systematic nuisance parameters are fit to their best-fit values δⱼ. The third sum then acts as a set of secondary pull terms that account for the uncertainties in these nuisance parameters. The generation of the pseudoexperiments was also performed within the legacy NOvA code, using a compiled C++ ROOT macro that handled the pseudoexperiment generation through a series of randomized selections of test values for each parameter being profiled over, drawn from the corresponding probability distributions. The NOvA workflow was organized by a wrapper script designed to run at the level of a system node.
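The pseudoexperiment generation step (random throws of the systematic parameters followed by Poisson fluctuations) might be sketched as follows; the function name, the linearized treatment of the systematic shifts, and all numbers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

def make_pseudoexperiment(predicted, syst_shifts, syst_sigmas):
    """Generate one pseudo-data spectrum: throw each systematic from its
    Gaussian prior, shift the prediction, then apply Poisson fluctuations.

    predicted   : nominal expected counts per bin (1D array)
    syst_shifts : per-systematic fractional effect on each bin (2D array)
    syst_sigmas : prior widths of the systematic pulls (1D array)
    """
    pulls = rng.normal(0.0, syst_sigmas)                 # one throw per systematic
    shifted = predicted * (1.0 + syst_shifts.T @ pulls)  # linearized shifts
    shifted = np.clip(shifted, 0.0, None)                # no negative expectations
    return rng.poisson(shifted)

# Toy inputs (illustrative only): 10 bins, 2 systematics
pred = np.full(10, 50.0)
shifts = np.array([[0.02]*10, [0.01]*10])   # 2% and 1% normalization effects
sigmas = np.array([1.0, 1.0])
pseudo = make_pseudoexperiment(pred, shifts, sigmas)
```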
The wrapper assigned each node in the computation a complete copy, in terms of fit parameter ranges, of the surface being corrected. A given number of pseudoexperiments was then calculated for, and assigned to, each of these surfaces. This number depended on the confidence level goal for the measurement, and was determined from the required total number of pseudoexperiments N_total^p and the number of nodes participating in the computation, broken out by architectural flavor, {n_ivy, n_has, n_knl}. Each of these flavors was assigned a weight according to its computational scaling, {w_ivy, w_has, w_knl}. The weights were determined from the average run times for a single fit cycle, normalized to the performance of the Knights Landing machines. They were combined with NERSC cost coefficients and used to form a constrained linear optimization problem that included the maximum level of concurrency on each platform. For the Haswell and Ivy Bridge architectures, this level of concurrency was set at 32 concurrent fits per node. For the Knights Landing architecture, the number of concurrent fits was capped at 45, corresponding to the limitation of running the computations within the 96 GB of memory available on each node (i.e., an approximate memory footprint of 2 GB per fit).
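The architecture-weighted assignment described above can be sketched as a simple proportional allocation with per-node concurrency caps; a full treatment would also fold in the NERSC cost coefficients and solve the constrained optimization exactly. Node counts and weights below are assumptions for illustration, while the concurrency caps are those quoted in the text:

```python
def allocate_pseudoexperiments(n_total, nodes, weights, slots):
    """Split n_total fits across architectures in proportion to aggregate
    throughput: nodes[f] * slots[f] concurrent fits, each running at
    relative speed weights[f] (normalized to Knights Landing = 1.0)."""
    capacity = {f: nodes[f] * slots[f] * weights[f] for f in nodes}
    total = sum(capacity.values())
    alloc = {f: int(round(n_total * c / total)) for f, c in capacity.items()}
    # Absorb any rounding remainder into the largest pool.
    biggest = max(alloc, key=alloc.get)
    alloc[biggest] += n_total - sum(alloc.values())
    return alloc

# Illustrative numbers only (node counts and weights are assumptions):
nodes   = {"ivy": 100, "has": 200, "knl": 300}
weights = {"ivy": 2.0, "has": 2.5, "knl": 1.0}   # speed relative to KNL
slots   = {"ivy": 32,  "has": 32,  "knl": 45}    # concurrency caps (from text)
plan = allocate_pseudoexperiments(1_000_000, nodes, weights, slots)
```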
Porting the NOvA analysis software to NERSC in a massively scalable and efficient format required building a Docker image containing the existing software and its dependencies. As part of the validation process of both the software and the workflow, the 2017 NOvA Feldman-Cousins corrections were successfully reproduced at NERSC in a dedicated run lasting a matter of hours. The Docker image was stored in a private registry hosted by NERSC, and pulled down to the Cori and Edison machines as Shifter images, in which the jobs were then run.

The corrected 2018 analysis results are shown in Figure 3, showing the significant effects of the application of the Feldman-Cousins approach for some regions of oscillation parameter space. These results were shown by the NOvA Collaboration at the Neutrino 2018 international conference [9], and are being prepared for publication. From benchmarking of the 2017 and 2018 analysis executables, we estimate that performing the same number of computations using conventional resources like FermiGrid would have taken 4-5 months instead of 52 hours. Thus, we conservatively estimate that the use of HPC resources accelerates NOvA's time-to-results by at least a factor of 50. Such a dramatic improvement may result in paradigm shifts in analysis workflows, enabling even more complex parameter extraction methods to be used, or the optimization of event classification on final physics sensitivities. It may also benefit the development of more complex models describing neutrino phenomenology when confronting experimental data.

Work is ongoing on a redesigned implementation of the fitting framework, with multiple subdomains (blocks) per MPI process rank and efficient data reduction over the pseudoexperiments. Additional efficiency will be gained by loading only the histograms required by the fitting algorithm, as well as by better handling of the final output files. Input files will also be converted to file formats popular in HPC, such as HDF5 [11].
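The planned block decomposition and reduction over pseudoexperiments can be illustrated with a small sketch in which a NumPy sum stands in for the cross-rank MPI reduction; the binning and the toy per-block inputs are assumptions:

```python
import numpy as np

# Sketch of the block decomposition: each MPI rank would own several blocks
# of pseudoexperiments and histogram its local delta-chi^2 values; np.sum
# over the per-rank histograms stands in for the MPI_Reduce across ranks.
edges = np.linspace(0.0, 20.0, 81)           # delta-chi^2 histogram binning

def local_histogram(delta_chi2_block):
    counts, _ = np.histogram(delta_chi2_block, bins=edges)
    return counts

rng = np.random.default_rng(3)
blocks = [rng.chisquare(2, size=500) for _ in range(8)]   # toy block results
per_rank = [local_histogram(b) for b in blocks]
reduced = np.sum(per_rank, axis=0)           # the cross-rank reduction step
```

Reducing compact histograms rather than shipping every per-pseudoexperiment fit result keeps the communication volume independent of the number of pseudoexperiments, which is the point of the reduction strategy described above.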
Together, these gains in efficiency will allow the full NOvA analysis to be performed in an HPC environment, enabling sophisticated optimization of the NOvA data selection criteria based on Feldman-Cousins-corrected sensitivities. Moreover, this framework will be modular enough to support other neutrino oscillation experiments and different neutrino model assumptions.