$\alpha_s$ from the Lattice Hadronic Vacuum Polarisation

We present a determination of the QCD coupling constant, $\alpha_s$, obtained by fitting lattice results for the flavour $ud$ hadronic vacuum polarisation function to continuum perturbation theory. We use $n_f=2+1$ flavours of Domain Wall fermions generated by the RBC/UKQCD collaboration and three lattice spacings $a^{-1}=1.79,\, 2.38$ and $3.15\ \text{GeV}$. Several sources of potential systematic error are identified and dealt with. After fitting and removing expected leading cut-off effects, we find for the five-flavour $\overline{\text{MS}}$ coupling the value $\alpha_s(M_Z)=0.1181(27)^{+8}_{-22}$, where the first error is statistical and the second systematic.


I. INTRODUCTION
The strong coupling constant, α s , of Quantum Chromodynamics (QCD) plays a defining rôle in the rich dynamics of the theory as it describes the interaction strength of the constituent fundamental particles, the quarks and gluons. As a key input to the calculation of perturbative contributions to physical processes, an accurate determination of its value is vital for precision tests of the Standard Model (SM). The lack of sufficient precision has been a bottleneck in accurately measuring the Higgs mass and its couplings through the branching ratios of h →cc,bb and gg [1]. It also represents one of the main uncertainties in determining the cross section of the gluon fusion process for Higgs production [2]. Significant improvements will, moreover, be required in order to take advantage of the few parts per mil determinations of Higgs partial branching fractions expected from a future ILC [3] to search for evidence of beyond-the-SM physics in those relative branching fractions [4].
Since α s runs with scale, it is conventional to facilitate comparisons by quoting results at a common, agreed-upon scale. We follow convention and quote our results for the coupling in the five-flavour theory at the Z boson mass scale, M Z , denoted by α s (M Z ) in what follows. The most recent FLAG average [5] lattice determination is α s (M Z ) = 0.1182 (12), and the most recent PDG averages [6] α s (M Z ) = 0.1181 (11) including lattice results and α s (M Z ) = 0.1174 (16) excluding them.
It is clear that several lattice determinations dominate the PDG world average [6]. Most lattice measurements (including the one detailed in this paper) have to match to perturbation theory at an available lattice scale where the perturbative expansion is still reliable (see, e.g., Refs. [7][8][9][10]). This can produce complications, since lattice simulations at large scales comparable to their inverse lattice spacing generally have large cut-off effects that must be taken into account. On the other hand, these measurements can take advantage of alreadygenerated gauge configurations used for other lattice studies. One could also define the coupling non-perturbatively using the Schrödinger functional [11,12] or the Gradient Flow coupling [12,13] schemes, which allows access to high matching scales, but these methods require expensive dedicated simulations. More discussion of recent results can be found in the FLAG review [5]. In principle, all measurements of α s should agree within their respective systematics. If they do, the combination of complementary yet orthogonal determinations should provide the most accurate and dependable evaluation of this quantity.
In this paper, we determine α s by analyzing lattice data for the vacuum polarization function (VPF) of the current-current two-point function of the flavor ud (isovector), vector current J µ =ūγ µ d using the continuum operator product expansion (OPE). This approach is closely related to the most precise of the non-lattice determinations, that from finite-energy sum rule (FESR) analyses of inclusive hadronic τ decay data involving exactly the same VPF [14][15][16][17][18][19].
In the τ analysis, weighted integrals of the experimental spectral distribution from s = 0 to some upper endpoint s = s 0 ≤ m 2 τ are related by analyticity to integrals of the product of the same weight and associated VPF over the circle |s| = s 0 in the complex-s plane. For large enough s 0 , and suitable choices of weight, the OPE (possibly supplemented with a large-N c -motivated representation of residual duality violations (DVs)) is used to represent the VPF. The main complication for the τ analysis comes from the kinematic restriction, s 0 ≤ m 2 τ , imposed by the τ mass. With m τ 1.78 GeV relatively small, it is not possible to choose s 0 large enough that higher dimension OPE contributions associated with unknown, or poorly known, higher dimension condensates, and residual DVs are safely negligible. In fact, if one removes the α s -independent, parton-model perturbative contribution from the measured experimental spectral distribution, the DV oscillations about the α s -dependent perturbative contributions and the perturbative contributions themselves are seen to be comparable in size (see, e.g., Fig. 2 of Ref. [19]). This observation makes clear the necessity of attempting to quantify the size of residual DVs in the weighted OPE integrals entering the FESR analysis. It is a disadvantage of the τ approach that, at present, this can only be done using models, albeit ones constrained by data, for the form of the DV contributions to the spectral distribution.
A key advantage of the alternative lattice approach to analyzing the same flavor ud VPF is the absence of the kinematic constraint limiting the τ -data-based analysis. Modern n f = 2 + 1 ensembles allow the VPF to be evaluated at (Euclidean) Q 2 considerably larger than m 2 τ . Not only will oscillatory DVs present in the spectral distribution at Minkowski Q 2 be exponentially suppressed for Euclidean Q 2 , but higher dimension OPE contributions can also be suppressed simply by working at higher Q 2 scales. The main complications for the lattice approach are (i) understanding at what Q 2 higher dimension OPE contributions become safely negligible, and (ii) dealing with and quantifying lattice discretization artifacts one expects to encounter at such high scales. Discretization effects can (as we will see) be successfully fitted and removed. It is also straightforward, in principle, to reduce the errors on α s associated with fitting and removing such discretization artifacts by suppressing their magnitude through the use, in future, of even finer lattices than those employed in the current study.
The FESR and lattice approaches turn out to be complementary when it comes to dealing with complication (i) above. The reason is that the freedom of weight choice in the FESR framework allows one to perform multiple analyses with different weights, the different weights producing integrated OPE expressions with different dependences on the higher dimension OPE condensates. This freedom can be used to fit the higher dimension condensates to data. One can then use these results, now at Euclidean Q 2 , to determine where such higher dimension OPE contributions become safely small, relative to the perturbative contributions from which one aims to determine α s . This FESR input turns out to be especially important since, with only the VPF at Euclidean Q 2 to work with, it is not possible to reliably fit higher dimension effective condensates in situations where multiple such condensates are simultaneously numerically relevant and have alternating signs leading to cancellations producing much weaker Q 2 dependence than that of the individual higher dimension terms. Such a behaviour, which, unfortunately, turns out to be a feature of the flavor ud VPF in nature, can lead, at lower Q 2 , to combined higher dimension contributions that can mimic perturbative Q 2 dependence and contaminate lower-scale versions of the lattice α s determination. The presence of such a combined higher dimension contribution is impossible to expose using lattice data alone. We return to this point in the next section.
In what follows, we will determine α s using n f = 2 + 1 flavour Domain Wall Fermions (DWF) with several lattice spacings. This work is built upon a previous determination by one of the authors using different ensembles (with Overlap fermions) [20,21] where continuum OPE results were fit to lattice data with a single lattice spacing a, with a −1 = 1.83 GeV, in a fit window of Q lying between ∼ 1.2 and 1.8 GeV. We will argue that, because of their low scale, these earlier analyses were susceptible to the systematic problem of contamination by residual higher dimension OPE effects resulting from the presence of multiple higher dimension contributions comparable in size and with sizable cancellations already discussed above. We stress that the mimicking of lower dimension contributions by cancelling sums of multiple higher dimension contributions makes it impossible to rule out the presence of such contamination using lattice data alone. We believe this is the reason the result, α s (M Z ) = 0.1118 +16 −17 [21], of this low-scale analysis lies well below the FLAG lattice and PDF world averages. We will address these systematics as well as several others we have identified, in providing a robust determination of α s using lattice HVP data. This paper is organized as follows. We first collect some continuum perturbation theory results, and outline the FESR analysis, in Sec. II. We then briefly describe the relevant lattice methodology in Sec. III, show the results of fitting the continuum perturbative expressions to the lattice data in Sec. IV, and conclude the paper in Sec. V.

A. Running coupling in Perturbation Theory
The running coupling, α s , depends on the renormalisation scale µ and its dependence runs with µ via the renormalisation group equation, The β-function is now known at five-loop order in the MS scheme [22], but our series coefficients (A ij in App. VII B) use only the four-loop β-coefficients so for consistency we will use the four-loop running [23] and three-loop threshold decoupling [24] to run our n f = 3 results to the five-flavour scale µ = M Z . In the end, the difference between using the four and five loop running will be small compared to the statistical and systematic errors of our measurement. The coefficients β i are n f -dependent and for n f = 3 are listed in Eq. 19 of Appendix VII A. The coupling can be run in perturbation theory (PT) from the renormalisation scale µ to some scale m by numerically integrating Eq. 1, but for scales m close to µ one can also use the series expansion, The coefficients A ij can be found in Eq. 20 of Appendix VII B.

B. Adler function and the perturbative series for the HVP
In the continuum, the current-current two-point function, Π µν , involving vector or axial vector currents j a µ (x) ≡ j µ (x), is defined by Generally, this has a Lorentz decomposition into J = 1 (transverse) and J = 0 (longitudinal) components, In this work we consider only the flavor ud vector current, in the isospin limit. Π µν then satisfies an exact Ward identity, q µ Π µν = 0, and is purely transverse, with Π (0) (q 2 ) = 0. The HVP, Π (1) (q 2 ), will be denoted by Π(q 2 ) in what follows.
The Adler function is a physical quantity related to the regularisation-and scheme- with dimension D ≥ 4 condensates, C D , parameterising non-perturbative contributions. At large enough Q 2 , the perturbative (dimension D = 0) series, D (0) , dominates the HVP and, at five loops, is known to the highest loop order in continuum perturbation theory. The D (2) series, which contains perturbative contributions second order in the quark masses, is known to three loops [25]. In Sec. II C, we will show how to use FESR results to help us restrict our attention to Q 2 for which D ≥ 4 condensate contributions are safely negligible. The D = 2 series is generally strongly suppressed by the presence of the squared-light-quark mass factors, but does display rather slow convergence. It is, however, known, from a comparison of the OPE to lattice data for the flavor-breaking ud − us HVP difference, which has a very similar slowly converging D = 2 series, that, despite this slow convergence, the 3-looptruncated version of the D = 2 series, combined with known D = 4 contributions, provides an excellent representation of the lattice data over a wide range of relevant Euclidean Q 2 [26].
It is thus possible to use the three-loop-truncated form to estimate D = 2 contributions to the flavor ud vector HVP, and confirm that they are completely negligible for all the ensembles considered, at the Q 2 we employ in our analysis (see Sec.II C). We will thus be able to analyze the lattice HVP data using only the the well-behaved, high-order D (0) part of the continuum OPE series, and, moreover, average the results obtained from the different ensembles. The main task will then be to quantify residual lattice discretization errors present at these larger scales.
The five-loop result for the perturbative series D (0) [27][28][29][30], written in terms of the fixedscale coupling α s (µ), is, where t = ln(Q 2 /µ 2 ) and the coefficients d ij (for three flavours) are given in Eq. 21 of Appendix VII B. The series for the HVP is then The unphysical integration constant C carries all of the scheme-and regularisation-scale dependence.
It is convenient to introduce the subtracted quantity, which is independent of the unphysical constant, C. The normalization is chosen so the leading term in the perturbative series for ∆(Q 2 1 , Q 2 , µ 2 ) is just α s (µ)/π, precisely the quantity we wish to measure. The perturbative series for ∆ has the form, The subtraction point Q 2 1 will be chosen below to ensure that higher dimension condensate contributions are safely negligible at scales Q 2 ≥ Q 2 1 .

C. FESR analysis
One of the drawbacks of the previous lattice analysis, Refs. [20,21], involving one of the current authors, was the low scale used, 1.2 GeV < Q < 1.8 GeV. The lattice data being analyzed showed clear signs of D ≥ 4 contributions, which were handled by fitting the condensate C 4 (which, in addition to the quark condensate, taken from other lattice calculations, involved the effective gluon condensate, which was not known from other sources) and fixing the analysis window such that neglecting D = 6 (and higher) contributions was self-consistent.
As noted above, there is a potential systematic problem with this analysis strategy, since it relies on the implicit assumption that, as one lowers Q 2 from high values, contributions from the non-perturbative condensates will turn on one at a time, so that a window of Q 2 will exist in which contributions from D > 4 condensates will be negligible relative to that of the lowest dimension (D = 4) condensate, with D = 6 and higher contributions becoming numerically relevant, one at a time as Q 2 is lowered, only at yet lower scales. Unfortunately, FESR results for the higher D condensates do not support this picture 1 . Instead, one finds C D≥4 which generate higher D contributions to D(Q 2 ) showing no sign of convergence with increasing dimension at the scales used in Refs. [20,21]. This is true whether one considers (i) the results for the C D obtained in Ref. [15] from analyzing the vector channel data neglecting DVs, (ii) the analogous V + A channel C D obtained in Ref. [17] in a framework including a model for DVs, or (iii) the C D obtained by performing an extension, analogous to that carried out for the V + A channel, of the vector channel analysis (including DVs) reported in Ref. [17] to generate additional C D beyond the C 6 and C 8 results reported in that paper.
These results, moreover, all show strong cancellations amongst terms of different dimension having different signs, leading to sums of non-perturbative contributions with much weaker Q 2 dependence than that of the individual terms in the sum.
These points are illustrated in Figure 1. The figure shows the D = 4 through 16 contributions to D(Q 2 ), obtained using the central C D values of case (iii) above, together with their much more weakly Q 2 -dependent sum. The Q 2 range displayed extends down to 1.55 GeV 2 , the lower edge of the fit window of Refs. [20,21]. Also shown, for comparison, is the α s -1 It is an advantage of the FESR framework that, through the freedom of weight choice, the relative sizes of different higher-dimension OPE contributions can be easily varied, making it possible to fit the higher D condensates, C D , to data. This freedom is not available in the approach relying only on lattice data for the HVP at Euclidean Q 2 . Thus, while the lattice approach has many advantages over the FESR one, we will still need FESR input to help us determine at what Q 2 potential systematic problems associated with higher D contributions become safely negligible in the lattice approach. Given this evidence for the lack of convergence with D of the D ≥ 4 series, and the failure of any single low-dimension condensate to dominate the sum, the only safe approach to using lattice HVP data to fix α s is to work at scales Q 2 sufficiently large that all D ≥ 4 contributions are simultaneously small relative to the α s -dependent D = 0 contributions of interest for determining α s . Working at lower Q 2 , where FESR results predict dangerous higher-D effects are almost certainly present, and almost certainly not amenable to being fitted and removed, makes unavoidable a potentially significant theoretical systematic uncertainty impossible to successfully quantify using lattice data alone. We deal with this issue by using FESR results to identify Q 2 sufficiently large that all of the estimated individual D ≥ 4 non-perturbative contributions to D(Q 2 ) are safely small, and restrict our analysis to Q 2 1 and Q 2 lying in this region. Scales above ∼ 3 to 3.5 GeV 2 appear sufficient for this purpose.   [31].

III. THE LATTICE HVP
In this work we will use n f = 2+1 dynamical Domain Wall fermion (DWF) configurations generated by the RBC/UKQCD collaborations and extensively detailed in Refs. [31,32]. The relevant ensemble parameters are listed in Tab. I. Since we work at the high scales suggested by the FESR analysis of the previous section, where the D=2 series and higher terms in the OPE will be negligible, we expect to be able to safely average the results for α s obtained from ensembles with different masses. We have checked that these results are, indeed, consistent within one-sigma errors at the momentum scales used here.

A. Lattice computation
We start by considering the lattice analog of the continuum HVP of Eq. 3 with flavour ud content.
One can define a conserved vector current in the DWF discretisation [33,34] with finite fifth dimension extension L s , via where U is an SU (3) link matrix. We consider the two-point function involving the product of this conserved current and the local vector current V ν (x) = Z V ψ(x)γ ν ψ(x). This combination has been shown to have no contact term [20,21,35].

B. Projection and cylinder cut
The Lorentz structure of the lattice two-point function should be equivalent to that of the continuum version, Eq. (4), up to lattice artifacts from the action and from the breaking of rotational symmetry O(4) under the discrete hyper-cubic group H 4 . Including such artifacts, the lattice two-point function is expected to take the form [20,21], To limit the impact of lattice artifacts, we utilize momentum reflections when projecting out the transverse piece, Π: where r µ is a reflection operator for Q µ in the µ direction, r µ Q µ = −Q µ . By design, this removes all (n + m)-even combinations of hypercubic artifact terms from Π.
In addition, to further minimize O(4)-breaking effects, we perform the reflection projection on purely off-axis, non-zero momenta that have been filtered by a cylinder cut procedure [37]. We accept only momenta lying within a radius (aw) of a body-diagonal vector, n µ , with norm 1, n µ = (±1, ±1, ±1, ±1)/2, explicitly: [38], By performing the cylinder cut before we convert our momenta to physical units we obtain a consistent physical radius. We find that a very narrow cylinder width, aw = 0.24, gives smooth results with very limited hyper-cubic artifacts while still providing plenty of data points to use in performing fits, as can be seen in Fig. 2.

C. Modelling cut-off effects
We will be attempting to fit the n f = 3 series for ∆ (Eq. 9) to our lattice data. In principle this fit would involve only one parameter, α s (µ), if our lattice ensembles were very close to the continuum limit, a 2 → 0. With the ensembles available, dictated by current lattice technology, we must, however, also fit residual lattice-spacing artifacts.
Considering the fact that we have projected out several combinations of the possible discretisation effects, and have limited the magnitude of the surviving O(4)-breaking terms, we postulate that the momentum-dependent lattice corrections will be the rotation-preserving terms C 1 a 2Q2 and C 2 (a 2Q2 ) 2 . Such effects, if present in our data, will be evident through linear and quadratic dependences onQ 2 . In principle, even higher-order corrections could be present. The data quality, however, does not permit us to identify and fit an additional (a 2Q2 ) 3 term.
Another possible cut-off effect in our data, which we will fit and remove, is a correction to the continuum coupling itself, Collecting all these correction terms, our fit to ∆ is performed using the fit form where ∆ Q 2 1 , Q 2 2 , µ 2 , αs(µ:a 2 ) π is the continuum perturbative expression for ∆ (Eq. 9), with lattice correction to the coupling. We will choose several subtraction points,Q 2 1 , and fit our data over a range ofQ 2 >Q 2 1 . We fit both the coupling and all of the correction terms simultaneously. Our final fit thus has only four-parameters: α s (µ), C 1 , C 2 and C α . We first consider fitting our data using a single renormalisation scale µ. We will find that the single-scale approach produces a large, unphysical, dependence of α s on the scale µ, and fix this problem using an alternate multiple-scale approach. Throughout this section all errors come from the bootstrap. Fig. 2 illustrates the quality of our data for the quantity ∆ once the reflection projection and cylinder cut procedures have been applied. It is evident that a large linearly-rising behaviour inQ 2 remains for all ensembles and that higher-order effects become significant for the coarse ensemble, and, to a lesser extent, for the fine ensemble data, at largeQ 2 .

IV. RESULTS
Throughout this work we will perform what is commonly called an "uncorrelated" or weighted fit to our data since we could not find suitable fit ranges where correlated fits were possible. This means that, while remaining a useful metric for goodness of fit, the nominal χ 2 /dof should not be taken as having the usual statistical interpretation it would have for a correlated fit. A nominal χ 2 /dof of order 1 or greater will, however, indicate a poor fit to our data.

A. Single-scale analysis
We choose to work with the representative fit-ranges shown in Tab. II. For now, we are forced to use somewhat lower than ideal values ofQ 2 1 to accommodate the coarse ensemble. This is to ensure the subtraction pointQ 2 1 is small enough that cut-off effects will be small atQ 2 1 , allowing us to fit the cut-off effects associated with the higher scaleQ 2 using their Q 2 dependence. We also aim to choose subtraction pointsQ 2 1 which are similar for all the ensembles, subject to the constraint that theQ 2 1 for each corresponds to a Fourier mode. In Fig. 3 we show the µ-dependence of our single-scale analysis results for α s (M Z ). Since the fit range is fixed, and the residual, truncation-induced µ-dependence is small in the continuum limit, the results should be essentially independent of µ if the single-scale analysis has been successful in bringing the artifacts under control.
What we see instead from Fig. 3 is that the single-scale approach, when applied to our data, produces results with a large-µ dependence. The µ-dependence is, moreover, much greater than the residual µ-dependence that exists in the continuum truncated perturbative expression 2 .
All of the fits underlying the results of Fig. 3 have a nominal χ 2 /dof < 1, so it is not that the fits do not describe the data well. Rather the fits lack sufficient information to successfully differentiate between the effects of the continuum coupling and the artifact terms. Similar unphysical µ-dependence is also observed for one or more of the artifact coefficients.
One might think that the results of Fig. 3 are more or less consistent within error, but this is not true, given the very strong correlation among the data points.

B. The multiple-scale analysis
The strong µ dependence of the single-scale analysis seen in Fig. 3 is clearly unphysical.
A good quality fit in a single-scale analysis of our data is thus insufficient to allow both the continuum coupling α s (µ) and the artifact parameters to be reliably constrained. The fact that the unphysical µ dependence is large, however, indicates that it must be possible to further constrain the analysis using the known smallness of the residual scale dependence of the five-loop-truncated continuum D = 0 series.
We thus extend the analysis to perform fits using multiple renormalisation scales m.
These are run to a common renormalisation scale µ using the continuum expression of Eq. 2. We use the values m = 2.0, 2.25, 2.5 and 2.75 GeV for the alternative scale, and for this analysis will vary the renormalisation scale µ between 1.5 and 3.5 GeV as we did for the single-scale analysis for comparison. We will also use the same fit ranges and subtraction points as the single-scale analysis above.
We investigate whether our simultaneous multiple-scale fitting fixes the spurious µdependence observed in the single-scale analysis results by again running the coupling to the common five-flavour scale M Z . The multiple-scale results for α s (M Z ) is shown in Fig. 4, for a typical choice of fit ranges.  C. Eliminating the coarse data As seen in Fig. 2, the results from the coarse ensemble have the strongest lattice artifacts.
Moreover, to safely fit the data only a small window inQ 2 is suitable. If we consider the same set ofQ 2 1 as before (Tab. II) and compare the results of the multi-scale fit using just the superfine and fine ensemble data to those of the corresponding fit using all three ensembles, we see, by varying the upper bound of the coarse-ensemble'sQ 2 fit window, that the coarse ensemble contributes little to the overall fit apart from marginally increasing the statistical uncertainty in the region where the result is stable. This is illustrated in Fig. 5.
As Fig. 5 shows, in the (small) fit window for the coarse ensemble (with the superfine and fine fit windows fixed), in the region where the fit appears to be stable the result is consistent with the result of the fit using only superfine and fine data but has slightly larger errors.
We conclude that cut-off effects for the coarse ensemble are too large to make employing this ensemble useful for this study. We thus eliminate the coarse ensemble from our fits, and base our final results on fits involving only the superfine and fine data. for the coupling from these varied fit ranges are given in Fig. 6.
Now that we know we can safely omit coarse ensemble data, the much larger cut-off scales of the superfine and fine ensembles make it possible to chooseQ 2 1 in the range where FESR results indicate the neglect of D ≥ 4 contributions should be very safe without incurring overly large discretisation effects. For our final results, we perform a multiple-scale analysis with multipleQ 2 1 choices to further constrain our fits. We investigate the fit ranges in Tab. III to understand our fit-range dependence. We vary the lower edge of the fit range in steps of 0.5 GeV 2 and the upper edge of the fit range in steps of 1 GeV 2 . We find good stability when varying the lower bound over this range, keeping the upper fit bound fixed, for the fine and superfine ensembles. In general, it is beneficial to have a low fit bound as close toQ 2 1 as possible as this reduces the contributions of the terms proportional to C 1 and C 2 , and allows the fit to constrain the coupling better. A histogram of the central values obtained for α s (M Z ) from a scan over these fit ranges is shown in Fig. 6. We see that some fit-range-dependence does remain, and this will need to be incorporated in our systematic error estimate. Since the distribution is skewed, the systematic error will be asymmetric.
Our central result, corresponding to the maximum of the histogram, is given by theQ 2 fit bounds of 6.04 GeV 2 → 9.74 GeV 2 for the fine data and 6.03 GeV 2 → 14.83 GeV 2 for the superfine. We use the reference scale µ = 2 GeV 3 .
For these fit ranges, we obtain the following fit parameter results, This fit has a nominal χ 2 /dof = 0.7(1).

E. Error budget
We have identified several sources of systematic uncertainty and have tabulated their effects in Tab. IV. A more in-depth discussion of how these entries were estimated can be found in the following subsections.

Fit range
Even though we have tamed the unphysical µ-dependence of the single-scale approach by working at multiple scales, we still see some dependence on our fit ranges, as shown in Fig. 6. All of these results correspond to reasonable fit ranges inQ 2 and a low nominal To estimate this systematic we will take the one-σ deviation of the histogram.

Lattice spacing error
We cannot propagate the measured distributions for the lattice spacing directly, so we estimate the the impact of the error on the lattice spacing on our final result as a systematic.
We do this by introducing a 2 fit parameters for the fine and superfine ensembles into the final fit with a prior width of the errors given in Tab. I. This introduces a negligible downward correction to the central value of α s (M Z ) of the order of 0.03%.

Perturbative truncation
If we vary the loop order in our analysis, using an estimated value of d 5 = 400 4 to extend this to six loops (in this case, still using just the combination of 4-loop running and 3-loop threshold matching ), we see that our results at three-loop order are consistent with those at six-loop order. Perturbative truncation is thus not a significant source of systematic uncertainty.

Higher order lattice artifacts
With our fit parameters (Eq. 17), we note that the O(a 2 ) correction term in α s (µ) is large even for our superfine ensemble. We found we could not stably fit an O(a 4 ) correction term to our data so we must estimate its possible effect. We note that C 2 0.16 C 1 and thus expect neglected, higher-order rotation-breaking and rotation-preserving contributions to be smaller than those included in our fits. This is to be expected from the lack of strong curvature in the fine and superfine data of Fig.2 within our chosen fit range.
To estimate the effects of an a 4 correction on the coupling α s (µ) we put a coefficient of where the first error is statistical and the second contains the systematic errors identified in the previous section combined in quadrature.

V. CONCLUSIONS
We have revisited the determination of α s from OPE fits to lattice data for the flavor ud vector current HVP, identifying and dealing with several sources of systematic error not recognized in previous implementations of this approach.
One key problem, of particular relevance to low-scale versions of this analysis, is the possibility that multiple numerically significant higher dimension OPE contributions can, through cancellation, produce a sum which mimics lower dimension contributions and hence potentially contaminates the determination of α s . Such effects, if present, cannot be fitted and removed using lattice data at Euclidean Q 2 alone. Continuum FESR results were (i) shown to confirm that this problem was almost certainly present at the low scales, Q 2 < 3 GeV 2 , of the previous versions of this analysis, then (ii) used further to identify those Q 2 at which all such higher dimension OPE contributions should be safely negligible. For such Q 2 , the lattice HVP should be dominated by dimension 0 (perturbative) OPE contributions and residual lattice artifact contributions, with the latter to be fitted and removed. We have introduced a momentum-reflection-based projection method for the lattice HVP and employed a very narrow cylinder cut to further reduce higher-order rotation-breaking cut-off effects.
A second key problem is the large, unphysical, µ dependence observed when employing the (previously used) single-scale approach to fitting the HVP data. This problem was rectified by implementing a multiple-scale analysis strategy.
We find that some residual fit-range dependence does remain, even after performing a multiple-scale, multiple-Q 2 1 analysis. For the lattice spacings of the present study, lattice artifacts are not small in the Q 2 range employed and these had to be carefully modelled to obtain a well-behaved result. Additional ensembles finer than the superfine ensemble employed here (which will have smaller lattice artifacts in the same Q 2 range) should help to further reduce the systematic uncertainty, as well as improve statistics. One significant advantage of our approach is that simulations at multiple unphysical (heavy) pion masses can be used simultaneously and averaged to improve statistics, since, for the Q 2 in our chosen fit windows, mass-dependent OPE contributions are negligible over the range of m π typical of modern simulations.
Our final result is consistent with both the PDG central value of 0.1181(11) [6] and the FLAG result of 0.1182(12) [5]. Our errors, both statistical and systematic, are currently larger, but checking for consistency is important for the determination of such a fundamental constant. We also believe we have identified the source of the tension between the world average and the results of the previous, lower-scale version of this analysis [21] (which gave α(M Z ) = 0.1118 +16 −17 ) and, with our improved method, have succeeded in solving the problems that led to this tension. Future improvements to both the statistical and systematic errors, moreover, appear technically straightforward.
We conclude by stressing the close relation of the lattice HVP approach discussed in this paper to the τ -decay-data-based FESR determination of α s . Both involve an analysis of the same object, the flavor ud HVP. The lattice approach, however, has certain important advantages. In particular, while the somewhat low scale of the τ mass makes unavoidable the presence of higher dimension OPE and DV contributions in the FESR analysis, DV contributions are exponentially suppressed at the Euclidean Q 2 of the lattice analysis and, with modern lattices, it is feasible to work at scales, Q 2 , where higher dimension OPE contributions are safely negligible. The price to pay for these advantages is the presence of residual lattice artifacts that have to be fit and removed from the lattice HVP data at these higher scales. This procedure, however, is subject to systematic improvement through increased statistics and the use of finer lattices with reduced artifact contributions. As such, it appears to us that, in the long run, the lattice approach represents the more favourable of the two methods for extracting α s from the light-quark HVP.

VII. APPENDICES
Here we put the relevant series coefficients for the perturbative series we have used in this work. All coefficients are for the n f = 3 variants.