Life Outside the Golden Window: Statistical Angles on the Signal-to-Noise Problem

Lattice QCD simulations of multi-baryon correlation functions can predict the structure and reactions of nuclei without encountering the baryon chemical potential sign problem. However, they suffer from a signal-to-noise problem where Monte Carlo estimates of observables have quantum fluctuations that are exponentially larger than their average values. Recent lattice QCD results demonstrate that the complex phase of baryon correlations functions relates the baryon signal-to-noise problem to a sign problem and exhibits unexpected statistical behavior resembling a heavy-tailed random walk on the unit circle. Estimators based on differences of correlation function phases evaluated at different Euclidean times are discussed that avoid the usual signal-to-noise problem, instead facing a signal-to-noise problem as the time interval associated with the phase difference is increased, and allow hadronic observables to be determined from arbitrarily large-time correlation functions.


Introduction
Properties of large nuclei, nuclear matter, and colliding neutron stars could be predicted from first principles if lattice quantum chromodynamics (LQCD) calculations could be performed with nonzero baryon chemical potentials. However, the QCD action is complex in the presence of a baryon chemical potential. Monte Carlo (MC) integration methods employing importance sampling cannot be used to compute path integrals with complex integrands, an obstacle known as the sign problem.
Nuclei can be also studied in LQCD by calculating multi-baryon correlation functions averaged over zero-density importance sampled gluon field configurations. QCD zero-momentum nucleon correlation functions G(t) have path integral and spectral representations given by eigenstates, E n denotes their energies, Z n ( Z n ) are overlap factors for source (sink) interpolating operators, ∼ denotes proportionality as t → ∞, and M N is the nucleon mass. Lattice units are used throughout. The path integral over gluon field configurations can be numerically approximated using a MC ensemble of i = 1, · · · , N correlation functions C i (t) = C(t, U i ) computed in zero-density importance sampled gluon field configurations, where equality holds in the limit N → ∞. An effective mass that becomes equal to the nucleon mass in the limit t → ∞ can be constructed as M(t) = −∂ t ln G(t) = ln G(t + 1) − ln G(t).
Parisi [1] and Lepage [2] showed that the signal-to-noise (StN) ratios of correlation functions in MC calculations can be understood physically. The variance of the real part of C i includes contributes from C 2 i , which has two-nucleon quantum numbers, and from |C i | 2 , which has nucleon-antinucleon quantum numbers. The lowest-energy state with nucleon-antinucleon quantum numbers and no valence quark annihilation is composed of three pions, and so the signal-to-noise (StN) ratio of nucleon correlation functions decays exponentially as LQCD studies have revealed a golden window of intermediate times where C i is described by groundstate time evolution but |C i | 2 is not dominated by its 3π ground state [3][4][5]. The StN ratio degrades exponentially less quickly in the golden window than at large times. Studies of small nuclei through the golden window have shown recent success, and it is possible to extend the golden window by reducing overlap with the variance ground state [6,7]. Still, the golden window shrinks with increasing baryon number, and , different analysis strategies may be required when studying large nuclei whose correlation functions have no apparent golden window.

Complex Correlation Function Statistics
Baryon correlation functions are complex in generic gluon field configurations, and their statistics can be usefully analyzed in terms of their log-magnitudes and complex phases [8],  Figure 1. The left plot shows the nucleon effective mass as well as the magnitude and phase effective masses of Eq. (6) for the m π ∼ 450 MeV LQCD ensemble detailed in Ref. [9]. Shaded red and purple regions show the central value and uncertainties of 3 2 m π and M N − 3 2 m π respectively as given in Ref. [9]. The middle plot shows the covariance of the magnitude and phase factor C i − e R i e iθ i divided by the full correlation function C i . The right plot shows the effective mass minus the sum of the magnitude and phase effective masses.
The t dependence of the magnitude and phase can be studied numerically using the effective masses Fig. 1 shows results for M R and M θ employing N ∼ 500, 000 correlation functions previously computed by the NPLQCD collaboration from Gaussian-smeared sources and point sinks on isotropicclover gauge-field configurations at a pion mass of m π ∼ 450 MeV generated by the College of William and Mary/JLab lattice group and the NPLQCD collaboration, see Ref. [9] for further details. At large t, both M R and M θ appear approximately t independent and consistent with M θ ∼ M N − 3 2 m π and M R ∼ 3 2 m π . The magnitude has no significant StN problem, while the phase has a StN problem with the same severity as the full correlation function. As shown in Fig. 1, M − M R − M θ is consistent with zero at large t. These results suggest the Parisi-Lepage StN problem can be identified with the StN problem arising from reweighting the complex correlation function sign problem associated with e iθ i . Similar results are expected to apply to generic complex correlation functions. This is consistent with results for canonical and grand canonical partition functions, where reweighting approaches to sign problems lead to StN problems that become exponentially worse with increasing spacetime volume at a rate determined by an energy difference between the full theory and a phasequenched theory [10][11][12][13][14].
Positive-definite correlation functions is a wide variety of quantum field theories have to found to be approximately log-normally distributed. 1 LQCD pion correlation functions are known to be approximately log-normally distributed [15] with a variance set by the ππ scattering length [16]. Lognormal distributions have also been observed and usefully exploited in MC calculations of unitary fermions [17,18] and condensed matter systems [19,20]. At small t, the imaginary parts of baryon correlation functions negligible are the real parts are approximately log-normally distributed [21]. At large t, the imaginary parts are non-negligible and the real parts are better described by heavy-tailed stable distributions [22]. The distribution of the real parts can further be shown to be increasingly broad and symmetric at large t by extensions of Parisi-Lepage scaling to higher moments [23,24]. Similar results of log-normally distributed positive-definite correlation functions and broad, symmetric real parts of complex correlation functions are seen in QCD-like theories [25]. LQCD calculations show that |C i | 2 and further |C i | are approximately log-normally distributed at all t, consistent with results for positive-definite correlation functions [8]. The phase is observed to be normally distributed at small t with a variance that increases until the distribution resembles a uniform distribution over the 2π range of definition of θ i . This suggests that complex correlation functions have a normally distributed complex log, subject to the restriction that θ i is a circular random variable defined modulo 2π. The wrapped-normal distribution of circular statistics describes the distribution of θ i at all t, and a complex-log-normal distribution approximately describing C i can be constructed as a product of a log-normal magnitude and a wrapped-normal phase factor, where the parameters µ R , σ 2 R , and σ 2 θ are given in the infinite statistics limit by The distribution of Re C i is obtained from Eq. (7) by performing a change of variables and marginalizing over the imaginary part. Fig. 2 shows histograms of the real part of baryon correlation functions along with fits to complex-log-normal distributions using Eqs. 7-8 as well as maximum-likelihood fits to stable distributions. Complex-log-normal distributions roughly capture the shape of the distribution at all t, but only describe the tails of the distribution well at large t. This is expected from statistical arguments based off central limit theorems that suggest that log-normal distributions and their complex generalizations describe the large t limit of correlation functions where they can be viewed as products of many independent time evolution factors [17,18].
In the infinite-statistics limit, parameter inference with circular random variables proceeds exactly as with ordinary random variables defined on the real line. Away from this limit, circular statistics includes finite N effects not seen for ordinary random variables. Under the assumption of a wrappednormal distribution, the variance of the sample mean of cos θ i can be computed, More generally, cos 2 θ i = 1 2 + 1 2 cos 2θ i and the decay of cos 2θ i = cos(arg(C 2 i )) ∼ e −2(M N − 3 2 m π )t shows that the variance of cos θ i includes t-independent contributions. In order for this noise to not dominate the exponentially decreasing signal cos θ i , the statistical ensemble should be large enough that Eq. (10) must be satisfied in order to reliably perform parameter inference for random variables sampled from generic wrapped distriubtions, see Ref. [26] for more details. For finite N and large lattice time extent, there is a "noise region" where t is sufficiently large that Eq. (10) is violated. In the noise region, finite sample size effects lead to violations of Parisi-Lepage scaling and make standard estimates of ground-state energies unreliable. There is still useful information contained in θ i at large t. LQCD results indicate that the distribution of ∂ t θ i approaches a t-independent shape at large t. The wrapped normal variance of ∂ t θ i can be computed at all t and approaches a constant value at large t, but the distribution of ∂ t θ i has heavy tails that are not adequately described by a wrapped-normal distribution. Similar heavy tails are seen for ∂ t R i . The distributions of ∂ t R i and ∂ t θ i can be described by heavy-tailed stable and wrapped-stable distributions respectively. The continuum limit distributions of ∂ t R i and ∂ t θ i are left to future work.
The finite difference interval used to define the lattice derivative can be extended beyond a single lattice spacing, Results for the distribution of ∆R i with increased ∆t suggest that for ∆t m −1 π , the distribution of ∆R i approaches a normal distribution. This is expected for a difference of uncorrelated, normally distributed variables. Over hadronic scales, the time evolution of R i can be described as a random walk with independent, identically distributed steps drawn from the large-t distribution of ∆R i . ∆θ i is fit by a heavy-tailed wrapped-stable distribution even for ∆t m −1 π , one expects that for ∆t m −1 π the time evolution of θ i can still be treated as a random walk with independent, identically distributed steps drawn from the large-t distribution of ∆θ i . The presence of heavy tails in ∆θ i signals that this random walk does not correspond to Brownian motion and is better described as a heavy-tailed Lévy flight. The physical origin of heavy-tailed time evolution of ln C i remains to be understood.

Phase Reweighting
Useful physical information can be extracted from the noise region if C i can be related to properties of the t-independent distributions describing ∆R i and ∆θ i at large t. Rather than extracting the spectrum from the phase difference between the phases θ i (0) and θ i (t), one could instead compare phases at very large t to a more closely spaced "source" phase at θ i (t − ∆t). By the approximate t-independence of the distribution of ∂ t θ i at large t, results are be expected to be independent of t − ∆t provided that ∆t and t are both large compared to hadronic scales. With a source phase convention θ i (0) = 0, results for θ i can then be obtained by taking precise results for ∆θ i in the regime m −1 π ∆t t where it has a narrow, t-independent distribution and performing the extrapolation ∆t → t to recover ∆θ i (t, t) = θ i (t).
Treating ∆R i the same as ∆θ i leads to an estimator An alternative estimator involving ∆θ i but instead using the full magnitude R i for any ∆t is given by [27] M PR (t, ∆t) = −∂ t+∆t ln e R i (t)+i∆θ i (t,∆t This effective mass can be constructed from a phase-reweighted correlation function G PR as Physical results are recovered as ∆t → t, where G PR (t, t) = G(t) and M PR (t, t) = M(t).
Phase reweighting is similar to Green's Function Monte Carlo (GFMC) methods where the phase of the wavefunction is held fixed for initial evolution and then allowed to evolve without constraints once the system is close to it's ground state [28][29][30][31]. Similar techniques are also used in GFMC calculations that extrapolate between a Hamiltonian without a sign problem and the desired Hamiltonian with a sign problem [32]. There are also similarities between phase reweighting an hierarchical integration, where locality is exploited to decompose correlation functions into products of correlation functions defined on sub-volumes that can be averaged independently [33,34].
where δE is the gap to the first nucleon excited state that has appreciable coupling to the non-local source produced in [0, t − ∆t]. Eq. (14) turns this into an ansatz for M PR , Fits to this form for the m π ∼ 450 MeV nucleon are shown in Fig. 3. The correlation-function-ratio estimator of Eq. (12) is also shown with fits of the same form Eq. (15). Fits to M give consistent results with factor of two larger uncertainties. Phase reweighting has also been applied to two-baryon systems, and consistent results for finite-volume energy shifts have been extracted using phase reweighting and golden window methods for the ΞΞ( 1 S 0 ) system. Encouragingly, the ∆t dependence of the binding energy is much more mild than the ∆t dependence of single-particle masses shown here.

Conclusion
The StN problem is a challenge for many LQCD calculations, particularly large nuclei made of light quarks. LQCD calculations of the magnitudes and phases of nucleon correlation functions show that the StN problem arises from reweighting a sign problem and affects phases but not magnitudes. The time evolution of the phase resembles a Lévy flight on hadronic scales, and the time derivative of the phase does not possess a StN problem.
Calculations of correlation functions reweighted with their inverse phase determined at an earlier time measure finite differences of phases over t-independent length intervals and therefore have no StN degradation with t. Phase-reweighted correlation functions do have a StN problem in ∆t, the length of the phase interval, with the same severity as the standard baryon StN problem; however, they allow additional spectral information to be extracted from correlation functions at arbitrarily large t. This may prove useful in calculations of nuclei where a golden window cannot be identified.