Representation of complex probabilities and complex Gibbs sampling

Complex weights appear in Physics which are beyond a straightforward importance sampling treatment, as required in Monte Carlo calculations. This is the well-known sign problem. The complex Langevin approach amounts to effectively construct a posi\-tive distribution on the complexified manifold reproducing the expectation values of the observables through their analytical extension. Here we discuss the direct construction of such positive distributions paying attention to their localization on the complexified manifold. Explicit localized repre\-sentations are obtained for complex probabilities defined on Abelian and non Abelian groups. The viability and performance of a complex version of the heat bath method, based on such representations, is analyzed.

Abstract. Complex weights appear in Physics which are beyond a straightforward importance sampling treatment, as required in Monte Carlo calculations. This is the wellknown sign problem. The complex Langevin approach amounts to effectively construct a positive distribution on the complexified manifold reproducing the expectation values of the observables through their analytical extension. Here we discuss the direct construction of such positive distributions paying attention to their localization on the complexified manifold. Explicit localized representations are obtained for complex probabilities defined on Abelian and non Abelian groups. The viability and performance of a complex version of the heat bath method, based on such representations, is analyzed.

Motivation and existing approaches
Many physical problems reduce to computing weighted averages For a large number of entangled degrees of freedom the Monte Carlo method is often the only viable approach: x ∼ P (Importance sampling with respect to P(x)).
However complex probabilities do appear in certain cases. An outstanding example is lattice QCD at finite density [1]. Straightforward importance sampling is meaningless when P(x) is not a positive distribution and this is the well known sign problem [2]. Several proposals have been put forward to deal with complex weights within Monte Carlo, among other: Reweighting, complex Langevin equation (CLE) [3][4][5][6][7][8], Lefschetz thimbles and variants [9,10], reweighting the Complex Langevin equation [11], or particular treatments for special problems, e.g., worm algorithms [12], or other [13].
The sign problem is hard [2]. Nothing as robust and reliable as the Metropolis algorithm is available in the complex case and each of the abovementioned methods has well-known limitations or is too recent to asses its performance. It is fair to say that, at present, no existing method can be regarded as the complete solution. In this view new alternative approaches are worth pursuing.
Our goal is to explore a complex version of the heat bath method. The idea is to do the updates by means of a representation of the conditional complex probability. This requires studying representations of complex probabilities by themselves (i.e., quite independently of the CLE construction).

Direct representations
The CLE is based on representing P(x) = e −S (x) by an ordinary (real and positive) probability distribution ρ(z) defined on the complexified manifold, so that ideally A P = A ρ where A(z) is the analytical extension of A(x). In the CLE the distribution ρ(z) is not obtained nor needed explicitly. Abstracting the idea, one can define [14] a representation ρ(z) of a complex distribution P(x) on R n , as a distribution on C n such that Equivalently, Of interest for Monte Carlo are the positive representations, that is, those with ρ(z) ≥ 0.
As it turns out, positive representations exists very generally and constructions are available for many complex probabilities, including Gaussian times polynomial of any degree and in any number of dimensions (a dense set) [14], periodic distributions in any number of dimensions, complex distributions on compact Lie groups and coset spaces [15]. In such representations P(x) needs not be an analytical function, any (normalizable) complex distribution has positive representations sharing the same symmetries enjoyed by the complex probability itself. Moreover, the expectation values of the subset of the holomorphic function do not fully fix ρ(z): Representations of a given P(x) are by no means unique. E.g., if ρ is a positive representation, so is exp( t∇ 2 )ρ (t > 0) [14].

A concrete construction
Given a complex probability P(x) on R n let us choose P 0 (x) > 0 and H(x) such that Then it is easy to verify [15] that A(z) = A P with z = x + z H(x ), where x ∼ P 0 and z ∼ q, and q(z) is any positive representation of δ(x) + δ (x), such as q(z) Nevertheless, the sign problem is not solved in practice since the available constructions are not viable for large systems: (i) the normalization of P(x) is not usually known, (ii) the vector field H(x) is not easy to obtain for large n, and (iii) the dispersion of z increases exponentially with n, spoiling the Monte Carlo estimates.

Conditions on the support of positive representations
This is an important point in the representation approach: representations with smaller widths yield smaller variances and poor representations may not be better than reweighting. As a rule, the more complex P(x), the wider must be ρ(z) > 0. For given P(x), the width (size of the support in the imaginary direction) of any positive ρ(z) is bounded from below. For instance, the relation constrains how localized ρ can be. In particular A = e −ikx provides quite explicit bounds [16]. For local actions good quality positive representations (i.e., with bounded width as n increases) should exist, eventually outperforming reweighting: CLE being an example, when it works.

Two-branch representations: one-dimensional case
Attending to the issue of the width, let us consider representations with support along two horizontal lines, R ± iY (two-branch representations [16]): It is easy to check that such ρ(z) will represent P(x) provided For any Y, real functions Q ± (x) exist and they are essentially unique. In terms of Fourier modes: with P(x) = k e ikxP (k). For Y above some critical value, the Q ± (x) become positive: there are more e Y in denominator and eventually the positive constant mode dominates. The critical Y is the optimal choice since it has the smallest width. Two examples are displayed in figure 1.

Two-branch representations: higher-dimensional case
For complex probabilities defined on R n (n > 1) one can try a strict two-branch scheme [16] with A similar construction has been proposed in [17,18]. Once again, this is a formal solution but not a practical one. In addition there is a technical problem: there are no good choices for Y. The natural choice Y = (Y, . . . , Y) meets k· Y = 0 for some Fourier modes because these components are not moved into the complex manifold. Asymmetric Y are unnatural and also tend to require some Y i larger than necessary. An elegant solution is to use 2 n branches: For each Fourier mode k and variable Effectively one is adding a bit for each degree of freedom (say, each site), from x ∈ R n to ( x, σ) ∈ (R × Z 2 ) n . Having more branches is compensated by i) simplicity, ii) restoration of symmetry, iii) smaller values of Y, and iv) it is mandatory in the non periodic case.

Representation of complex probabilities on groups
Let P(g) be a complex probability defined on a matrix group G = {e −i a· T , a ∈ R m }. The observables can be analytically extended to the complexified groupG = {e −i a· T , a ∈ C m }, Haar measures can be adopted on G andG and representations can be defined as before.
The two-branch scheme can be adapted to the present case: Letting G I ≡ {e −i a· T , a ∈ iR m }, where A(gh −1 ), A(gh) refer to the analytical extension inG. So the Monte Carlo sampling is carried out using the real and positive weights Q ± (g) defined on G.
As an example, a two-branch representation on SU(2) for P(g) = 1 + β tr(σ 3 g) is Q + (g) is displayed in figure 3, using g = a 0 − i a · σ, and (a 0 , a) ∈ S 3 . As it would be expected subtleties arise in the non-Abelian case: regardless of the choice of h, certain components (singlet with respect to h ) will remain unchanged (would have Ω = 1 in Eq. (18)). Equation (15) can only be fulfilled if P is real along those components. For them a different element h ∈ G I has to be applied, subject to the condition 3 Complex heat bath approach

The complex Gibbs sampling method
The construction of direct representations just discussed becomes rapidly inviable as the dimension of the manifold increases. This suggests a heat bath approach. As in the standard Gibbs sampling, each variable is updated in turn, but, since the conditional probability is complex, a representation of it is used instead: (z 1 , . . . , z i , . . . , z n ) → (z 1 , . . . , z i , . . . , z n ), z i ∼ ρ rep (z i |{z j i }) where ρ rep (z i |{z j i }) is a representation (with respect to z i ) of the conditional probability P(z i |{z j i }) = P(z 1 , . . . , z i , . . . , z n ) P(z 1 , . . . , z i , . . . , z n ) .
By construction only one-variable representations are needed, however, the method relies on the analytical extension of P( x), just like CLE. Furthermore convergence to a representation of P(x) is not guaranteed as standard Markovian chain theorems do not apply. Also, there is a potential problem from zeroes of the marginal probabilities P(z 1 , . . . , z i , . . . , z n ).
To see the performance of the method we have studied a concrete model [16], namely, a complex action in a d-dimensional periodic lattice . In the free case (λ = 0), this conditional probability is a displaced real Gaussian and the marginal probability has no zeroes. The method works alright for a bounded region of the β plane (e −S is normalizable for bounded Re(β)).

Monte Carlo study of an interacting case
We further apply the complex Gibbs sampling approach for λ = 1 and imaginary β. Some checks from T -matrix calculations, for d = 1 only, show that the method works up to |β| ≈ 1.
For higher dimensions we have used checks from virial (or Schwinger-Dyson) relations The choices A = φ x and A = φ x+μ , give rise to four observables O 1,2 and I 1,2 such that I 1 = I 2 = 0. These observables are computed using Monte Carlo and results are displayed in table 1. Reweighting and complex Langevin results are also displayed for comparison. As expected reweighting stops working for the larger lattices. Complex heat bath works rather well for any lattice size for moderated couplings although several technical issues arise, including the need of a cutoff and filtering of rare (but catastrophic) cases (see [16] for details).
Some small bias is observed in I 1 (which ought to be zero). Likely, this is rooted in a real weakness of the method, related to the presence of zeroes in the marginal probabilities. To analyze this, let us assume, in a two-variable setting, that the current distribution ρ is a proper representation, A ρ = A P . One would like the Gibbs update to preserve this property. A Gibbs update (z 1 , z 2 ) → (z 1 , z 2 ), produces a new distribution ρ . Using the representation property one has: So spurious contributions may arise from poles in 1/P(z 2 ). This idea is fully sustained by a detailed analysis of P(x 1 , x 2 ) = (1 + β cos(x 1 ))(1 + β cos(x 2 ))(1 + β cos(x 1 − x 2 )). A bias is found whenever [−Y, Y] contains the zero of P(z 2 ) ∼ 1 + 1 2 β 2 cos(z 2 ), and this cannot be prevented for sufficiently large β (Y increases and the zero z 2 decreases).  To have a robust method it is mandatory to find a way to remove the spurious contributions from the poles in 1/P(z 1 , . . . , z i , . . . , z n ), when they exist. No such technique is available yet. On the other hand, it is not hard to monitor on-the-flight whether the marginal probabilities stay away from zero during the Markovian chain. This is displayed in figure 4.

Summary and outlook
Positive representations for any complex probability exist quite generally with variable degrees of quality, related to their extension on the complexified manifold. Good quality positive representations are expected to exist for local actions, or more precisely, when the correlation length does not increase with the size of the system. Unfortunately, there is no known practical way to obtain them beyond the low dimensional case. CLE would be such a construction when it works.
The complex heat bath method, which only needs one-variable representations, works for moderate couplings (moderately complex actions). The limitation is due to the problem of possible zeroes in the marginal probabilities used to do the updates. These zeroes introduce poles in the construction of the representations of the conditional probabilities. The validity of the method in each case can be assessed by monitoring the marginal probabilities on-the-flight. It would be worth studying whether the bias introduced by the zeroes of the marginal probability could somehow be removed. In another direction, perhaps the bias problem could be alleviated by considering larger clusters of variables to be updated at each step (which is also more costly).
Despite the impediments noted, may be the most interesting lesson to be learned from the direct representation approach is that such positive representations do exist, also in hard problems like QCD at finite density. This means that there exist real actions on the complexified manifold (however ugly, wide and non-local they could be) that correctly reproduce such real-life systems. Then one can attempt a different route, namely, to try to directly write down a real and local action (on the complexified manifold) in the same universality class as that of QCD at finite density. Such real action would have the same symmetries (e.g., invariance under real gauge transformations) and would include a chemical potential as an incorporated parameter but associated to the baryon number conserved charge. In addition the action should be compatible with reflection positivity (at least within the subset of holomorphic observables) and should have a support with bounded width in the large volume limit. While this does not seem to be an easy task, it should be recalled that, as we have argued, such actions definitely exist, albeit without the locality and width conditions.