Local multiboson factorization of the quark determinant

We discuss the recently proposed multiboson domain-decomposed factorization of the gauge-field dependence of the fermion determinant in lattice QCD. In particular, we focus on the case of a lattice divided in an arbitrary number of thick time slices. As a consequence, multiple space-time regions can be updated independently. This allows to address the exponential degradation of the signal-to-noise ration of correlation functions with multilevel Monte Carlo sampling. We show numerical evidence of the effectiveness of a two-level integration for pseudoscalar propagators with momentum and for vector propagators, in a two active regions setup. These results are relevant to lattice computation of the hadronic contributions to the anomalous magnetic moment of the muon and to heavy meson decay form factors.


Introduction
It is well known that the standard Monte Carlo (MC) evaluation of a generic multi-point function of QCD has an exponential signal-to-noise ratio (S/N) problem. Namely, the signal decays with the distance between points with a faster exponential rate than its statistical error, or noise [1,2]. This currently limits the obtainable statistical accuracy at long distances. A way to tackle the signal-tonoise problem that involves a modification of the MC integration procedure, the so-called multilevel algorithm [3,4], is briefly introduced in Section 3.
Algorithms of this kind have been applied to lattice Yang-Mills theory [3][4][5][6][7][8], thanks to the fact that the action of a purely bosonic theory is manifestly local. However, in numerical lattice QCD the Grassmann quark fields are first integrated out analytically. This leads to a non-local fermion determinant and non-local Wick contractions. Both non-localities prevent a straightforward application of multilevel sampling.
In the first part of these proceedings (Sec. 4), we present the factorization introduced in Refs [9,10], focusing on the case of independent updates of an arbitrary number of thick-time-slice regions. To this purpose, the fermion determinant is factorized in contributions that are localized to a spacetime region of a domain decomposed (DD) lattice [11], while interactions between different domains are mediated Speaker, e-mail: marco.ce@uni-mainz.de, HIM-2017-06, CERN-TH-2017-221, DESY-  by multiboson (MB) fields [12,13]. In the second part of these proceedings (Sec. 5) we show new numerical evidence that multilevel sampling can solve the S/N problem for the connected pseudoscalar correlator with momentum and for the vector correlator.

The signal-to-noise ratio problem
As a prototypical manifestation of the S/N problem, consider the mesonic correlator on a fixed background gauge field configuration where • f denotes the fermion path integral. The mesonic two-point function is estimated by averaging Eq. (1) on gauge field configurations. At large distances, it decays exponentially with the energy of the lightest state with the specified quantum numbers At the same time, the variance decays exponentially with the lightest state with at least four quark lines, i.e. a pair of zero-momentum pions, irrespective of Γ and p, Therefore, neglecting finite volume effects on the ππ state energy, any connected mesonic two-point function with the exclusion of pseudoscalar ones, e.g. Γ = γ 5 , γ 0 γ 5 , has an expectation value that is decaying faster than its standard deviation. Assuming sampling over n independent configurations with a standard MC, the S/N is

The multilevel Monte Carlo algorithm
The multilevel algorithm addresses the S/N problem modifying the MC sampling and boosting the standard n −1/2 scaling of the statistical error in Eq. (4). This is achieved by splitting the MC estimator in two (or more) levels of sampling. At the lowest level-0, the observable is averaged on n 0 configurations that span the whole lattice gauge field. At level-1 (and higher), n 1 gauge field configurations are generated for each level-0 configuration. The level-1 updates are characterized by the independent update of different spacetime regions of the lattice, and they are used to independently average the observable in the distinct regions.
To this end, we divide the lattice in thick-time-slice regions, as pictured in Figure 1. In general, we can write any global observable, such as O = W Γ (x 0 , y 0 , p) in Eq. (1), as i.e. the product 1 of observables O i , which are local to a thick time slice, plus a non-factorized correction O rest . We introduced a way to build the factorized W Γ (x 0 , y 0 , p) from factorized quark propagators in Refs [14,15] for the two relevant regimes: when source and sink are both in the same region, and when x 0 x 0 y 0 y 0 y 0 y 0 y 0 y 0 Figure 1. Connected meson two-point functions in a two-level setup with multiple thick-time-slice regions.
they are in different regions. In both cases, O rest can be made small and it is easily dealt with using a standard algorithm. We focus here on connected contributions at large distances, so we are interested in the latter regime. 2 The multilevel estimator for O fact is Since the number of thick-time-slice regions is roughly proportional to the source-sink separation, m M |x 0 − y 0 |, the noise decreases with distance exponentially faster with respect to the standard MC. We expect to haveC mlv fact,Γ (x 0 , y 0 , p) σ C mlv fact,Γ (x 0 , y 0 , p)/n 1/2 Provided that is possible to boost the exponential rate of the noise to the level of the one of the signal, the S/N problem is completely solved. In general, the optimal value of n 1 is observable specific.

The factorization of the lattice QCD action
In order to implement multilevel sampling in QCD with dynamical fermions, we need to be able to update the gauge field in different spacetime regions independently. To this purpose, in Ref. [9] we introduced a factorization of the fermion determinant and its dependency on gauge links in different regions.

The domain decomposition
As a first step, we set up a domain decomposition [11] in multiple active regions bordered by buffer regions with frozen gauge links, e.g. the decomposition of the lattice in even and odd thick time slices shown in Figure 2. Only the links in the Λ i with even i are going to be active gauge links at level-1.
The determinant representing a single quark flavour is decomposed exactly in the product where Q Ω * i is the hermitian Dirac operator Q = γ 5 (D W + m) restricted to a three thick-time-slice region and supported on the inner boundaries of active regions Λ i , even i. In Eq. (8), the determinant in each thick time slice is naturally factorized, but for a global residual contribution det W 1 . We argue that det W 1 is parametrically close to one. To motivate this, we partition the active regions between doubly even ones Λ and singly even ones Λ +2 , where = 4k, k ∈ Z. We define 1 − w as the Schur complement of the singly even block of W 1 . This leads to the identity [16] det The operator w is supported on the internal boundaries of doubly even regions Λ only, and it has a peculiar structure: the first factor propagates a quark line to the internal boundary of a singly even Λ +2 region, and the second factor propagates it back. This implies an exponential suppression with a rate proportional to M π /2 times 2∆, where ∆ is the thickness of the buffer regions Λ j with odd j. A way to express this is that the eigenvalues δ i of w satisfy |δ i | O(δ), withδ = e −M π ∆ . Moreover, with algebra analogous to Eq. (13) in Ref. [9], w can be written as the product of two Hermitian operators. This implies that w is similar to w † , thus the δ i are either real or complex conjugate pairs. These constraints on the spectrum have been verified numerically in the simpler case of a three thick-time-slice setup, see Refs [9,10] for the details.

The multiboson representation
The fact that the condition number of 1 − w is (1 +δ)/(1 −δ) = O(1) suggests as a next step to apply to it the MB technique [12], in its complex-valued variant [13]. This is obtained by approximating the operator inverse with a polynomial A simple and good choice is a truncated geometric series P N (z) = N p=1 (1 − z) p , which has the nice property of being exponentially more accurate close to z = 1, where the bulk of the UV eigenvalues of 1 − w are concentrated. This polynomial choice has a constant accuracy on circumferences in the complex plane |1 − z| = const, but different choices might provide slightly better approximations.
Chosing N even, we can write down the MB action for det{1 − w} where the MB fields χ k are N/2 coloured Weyl spinor fields 3 supported on the inner boundaries of the active Λ i . The derivation of Eq. (12) relies on the fact that w and w † are similar. Moreover, we used det{z 2 1 − w} ∝ det W z to switch back to the operator W z defined in Eq. (9). This last step is necessary to have a truly factorized MB action. Indeed, introducing χ i,k = P ∂Λ i χ k as the MB fields on Λ i , the MB action couples MB fields on Λ i−2 to up to Λ i+2 . However, each term in Eq. (13) depends only on the gauge links in a single active region Λ i , and it does not depend on the gauge links in any other active region Λ i , i i. Because of this it is possible to update independently all the Λ i regions with even i. The remaining factors in Eq. (8) can be represented using ordinary algorithm based on pseudofermion fiels, restricting them to Ω * i or Λ i regions. The resulting configurations are sampled with an action that differs from the Wilson action by a small reweighing factor W N = det{1 − R N+1 (1 − w)} for each dynamical quark flavour. However, given a reasonable ∆ ≈ 0.5-1 fm, our tests show that 10 MB fields for each light flavour are sufficient to obtain an a high-precision approximation, resulting in a negligible W N . We do not expect this figure to change significantly with lighter quarks.

Numerical tests in the quenched theory
Numerical tests are needed to asses the effectiveness of multilevel MC sampling in addressing the S/N problem of fermionic observables. To this purpose, we studied connected two-point functions such as the pion and baryon propagator [14,15], as well as disconnected two-point functions [9,14,15]. In all these cases, multilevel sampling proved its value in solving or mitigating the S/N problem.
Here, we extend previous results considering two additional interesting observables: • the pseudoscalar correlator at non-zero momentum, which is a building block of transition matrix elements such as heavy meson decay form factors; • the vector correlator, at zero momentum and averaged over i = 1, 2, 3, which plays a prominent role in the computation of the hadron vacuum polarization (HVP) contribution to g − 2 of the muon.
We tested the two-level sampling on these two-point functions in the quenched approximation. This has proven instrumental both to disentangle from the complications introduced by the factorization of the fermion determinant, and to limit the computational resources needed, while being still affected by the same S/N problem. The test was performed on n 0 = 50, T × L 3 = 64 × 24 3 a 4 level-0 configurations with open boundary conditions in the time direction, generated with the Wilson gauge action at β = 6, which corresponds to a 0.093 fm assuming r 0 = 0.5 fm. The valence quark hopping parameter is set to κ = 0.1560, which corresponds to M π 455 MeV [14]. To keep things simple, we started with a two active thick-time-slice setup. For each level-0 configuration, we generated n 1 = 30 level-1 configurations updating independently the gauge field in Λ 0 = {x : x 0 ∈ (0, 15a)} and Λ 2 = {x : x 0 ∈ (24a, 63a)}. Gauge links in Λ 1 = {x : x 0 ∈ (16a, 23a)} have been kept frozen. 4 For each active region, n 1 = 30 level-1 configurations have been used for measurements. A large number of MDUs have been skipped between measurements in order to reduce autocorrelation effects.
We computed the connected two-point function of the pseudoscalar density and of the local vector current using 12 spin-and colour-diluted random sources with U(1) noise, defined on the source time slice x 0 = 8a. Both sink and source were projected on momentum p 2 = n(2π/L) 2 , with n up to 3. We show in these proceedings two specific correlators: C γ 5 (x 0 , y 0 , p) with non-zero momentum p = 2π(1, 1, 0)/L, and C γ i (x 0 , y 0 , 0) at zero momentum and averaged over i = 1, 2, 3.

Pseudoscalar correlator with momentum
The pseudoscalar correlator at zero momentum is a special case since already with a standard MC both the signal and the statistical error decay with the same exponential rate, so no significant S/N degradation with distance is expected. Our tests confirm that the multilevel algorithm results in no gain for the zero-momentum correlator. However, with non-zero momentum the S/N degrades with distance. For instance, with p = 2π(1, 1, 0)/L the S/N is expected to decrease with E γ 5 (p) − M π ≈ 0.213/a.
We show the results of the numerical computation in this case in Figure 3. In the right plot, the variance of the correlator is estimated on level-0 configurations after averaging on level-1 configurations, and it is normalized to the variance computed with n 1 = 1 in order to factor out the standard exponential suppression with distance. With 30 level-1 configurations included but a standard single-level averaging procedure, the variance is ≈ 1/30 as expected. On the other hand, with a two-level averaging procedure, y 0 /a the normalized variance depends on the sink y 0 coordinate. If both x 0 , y 0 ∈ Λ 0 , the independent updates in Λ 2 do not affect the variance. If y 0 ∈ Λ 2 , the independent updates in the two regions result in a significant variance reduction. While at medium distances the variance reduction is saturated by less than 30 level-1 updates, for y 0 38a the normalized variance drops to ≈ 1/30 2 , in perfect agreement with expectations. We did not attempt to saturate the variance at longer distances pushing beyond n 1 = 30. In the left plot of Figure 3 we show that the reduced variance translates into a smaller statistical error on the effective mass of the correlator at long source-sink separations. The signal stays constant for additional ≈ 0.5 fm before vanishing into the noise.

Vector correlator
In the setup considered here, the vector correlator decays asymptotically with the mass of the ρ meson state. Since the noise is reduced only according to M π , the vector correlator at zero momentum suffers a exponential S/N problem with a rate M ρ − M π ≈ 0.170/a. We addressed this problem with the multilevel. As in the previous case, in the right plot of Figure 4 we show the normalized variance of the correlator, with and without multilevel averaging. The same considerations of Section 5.1 apply: the two-level averaging procedure results in a variance perfectly scaling with n 2 1 when source and sink are in different regions. As shown in the left plot of Figure 4, the two-level estimator allows the signal in an effective mass plot to be followed for additional ≈ 0.9 fm with respect to the standard estimator, opening new perspectives for the determination of the HVP contribution to g − 2 of the muon.

Conclusions
We showed in Ref. [9] that it is possible to factorize the gauge field dependence of the quark determinant, introducing a domain decomposition in overlapping domains and representing the residual contribution with multiboson fields. Combined with the quark propagator factorization introduced in Ref. [14], the quark determinant factorization allows multilevel sampling of fermionic correlators, for the first time with dynamical fermions. In these proceedings, we focused on a setup with multiple independent thick time slices, that results in the action in Eq. (13). The effectiveness of two-level sampling has been tested in Refs [9,14,15] for the baryon propagator and for the disconnected contribution to the flavour-singlet pseudoscalar propagator. In these proceedings, we tested in addition the two-point functions of pseudoscalar densities at non-zero momentum and of vector currents in a two-region setup. Results show an exponential gain in the S/N in both cases. This suggests new ways to approach computations such as the hadronic vacuum polarization contribution to the anomalous magnetic moment of the muon.
Variants on the combined domain decomposition and multiboson scheme might also address the limitations of standard HMC algorithms in master field simulations with dynamical fermions [17].