Efficient and Scalable Approach to Equilibrium Conditional Simulation of Gibbs Markov Random Fields

We study the performance of an automated hybrid Monte Carlo (HMC) approach for conditional simulation of a recently proposed, single-parameter Gibbs Markov random field (Gibbs MRF). The MRF is based on a modified version of the planar rotator (MPR) model and is used for efficient gap filling in gridded data. HMC combines the deterministic over-relaxation method and the stochastic Metropolis update with dynamically adjusted restriction and performs automatic detection of the crossover to the targeted equilibrium state. We focus on the ability of the algorithm to efficiently drive the system to equilibrium at very low temperatures even with sparse conditioning data. These conditions are the most challenging computationally, requiring extremely long relaxation times if simulated by means of the standard Metropolis algorithm. We demonstrate that HMC has considerable benefits in terms of both computational efficiency and prediction performance of the MPR method.


Introduction
Gaussian Markov random fields (GMRFs) are used for modeling spatial data on regular grids [1]. GM-RFs are based on the principle of conditional independence and the enforcement of spatial correlations via local interactions. The latter translate into sparse precision matrices, which allow computationally efficient representations. While GMRFs have a long history [2], non-Gaussian Markov random fields (NGMRFs) have attracted less attention. Binary-valued Ising spin models, the q-state Potts model, and the continuous planar rotator are typical NGMRF examples. Widely studied in statistical physics, they have also found applications in areas such as image restoration [3][4][5] and spatial prediction [6].
Recently, we have introduced a novel Gibbs Markov random field for prediction of spatial data on regular grids, based on the modified planar rotator (MPR) model [7]. We also proposed an efficient and automated hybrid Monte Carlo (HMC) approach for the conditional simulation of the model. HMC has been shown to lead to fast relaxation, and the short-range nature of the interaction between the "spin" variables enabled vectorization. Consequently, the MPR computational time for both inference and simulation was found to scale approximately linearly with system size, which makes MPR-based prediction attractive for big and gappy data sets, such as satellite and radar images. Given a rectangular grid of size L x × L y , the problem of interest is to estimate by efficient updating the unknown values at e-mail: milan.zukovic@upjs.sk e-mail: dionisi@mred.tuc.gr Algorithm 1 HMC algorithm.Φ old is the current andΦ new is the new spin state.Φ old −p is the current state excluding the point labeled by p. U(0, 1) denotes the uniform probability distribution in (0, 1).
for p = 1, . . . , P do Loop over prediction sites 1:Φ p ← O{Φ old p } Perform over-relaxation update according to (1) 2: r 1 ← U(0, 1) Generate uniform random number 3:Φ p ←Φ p + 2π(r 1 − 0.5)/a (mod 2π) Propose spin update End of prediction loop 7: returnΦ new Return the updated state after one HMC sweep end procedure P prediction sites (missing data), while the conditioning values (sample data) at the remaining sites are kept fixed during the updating process.
Herein we focus on the HMC approach and study its performance via a vis the standard Metropolis algorithm. The latter is known to be inefficient at very low temperatures, which is the operating parameter region of the MPR method. We show that HMC updates can considerably reduce the relaxation times of the standard Metropolis approach; even more importantly, the number of HMC sweeps necessary to reach equilibrium is insensitive to grid size, i.e., the HMC algorithm is scalable.

Hybrid Monte Carlo
The standard Metropolis algorithm is often used in MC simulation due to its flexibility and applicability to a wide range of problems [8]. However, it can be rather inefficient in some situations, e.g. at low temperatures, due to very low acceptance rate (proportional to exp(−∆E/T ), where ∆E = E new − E old is the energy difference between the new and old states). This leads to extremely long relaxation times in the low-T limit, which is the typical parameter region for the MPR prediction (T ≈ 10 −2 ) [7]. Efficient use of the MPR method requires an updating scheme that is able to drive the system to equilibrium fast, i.e., with the shortest possible relaxation time.
To tackle this problem we proposed the HMC updating scheme (see Algorithm 1). HMC combines a flexible restricted form of stochastic Metropolis and the deterministic over-relaxation [9] methods. The former algorithm generates a proposal spin-angle state at the ith site according to the rule φ i = φ i + 2π(r − 0.5)/a, where r is a uniformly distributed random number r ∈ (0, 1). A tunable parameter a is automatically reset during the equilibration to maintain the acceptance rate above a predefined threshold value (arbitrarily set to 0.3). In the over-relaxation update, a new spin-angle value at the ith site is chosen so that the system energy is conserved. In the MPR model defined by the nearest-neighbor interaction Hamiltonian H = −J ∑ i, j cos[(φ i − φ j )/2], the over-relaxation update is achieved by means of the following transformation where nn(i) denote the nearest neighbors of {φ i } P i=1 , and arctan 2(·) is the four-quadrant inverse tangent: for any real x, y such that |x| + |y| > 0, arctan 2(y, x) is the angle (in radians) between the positive horizontal axis and the point (x, y).

Results and Conclusion
Efficiency of the HMC approach is demonstrated on Gaussian synthetic data Z ∼ N(m = 50, σ = 10) and exponential covariance C(r) = σ 2 exp(−r/ξ ) where ξ = 5, simulated on square grids with L nodes per side (L = 32, . . . , 2048). We simulate missing data by randomly removing c% of the L 2 values. Ensemble expectations are obtained by averaging over different sampling configurations. The relaxation time is expressed as the number of MC sweeps, N relax , necessary to reach equilibrium.
We focus on N relax in the limit of sparse samples and low temperatures. As shown in Figure 2(a), for c = 90% and T = 0.01, both N relax and its slope with increasing L are largest for standard Metropolis updating. On the other hand, N relax and its rate of increase are considerably suppressed by combining standard Metropolis with either over-relaxation or restricted updating. However, the hybrid method that combines all three approaches further suppresses N relax significantly and completely eliminates its dependence on L. Thus, relaxation by merely ≈ 60 hybrid MC sweeps suffices to equilibrate the sparsely conditioned system (for L ranging between 32 and 2048). Figure 1(b) shows the evolution of the specific energy for L = 2048. Relaxation from the random initial state to the equilibrium (flat) regime is detected automatically at the crossover point N = N relax , where the trend disappears. The inset shows that the standard Metropolis update converges to equilibrium much slower than the hybrid method; in addition, standard Metropolis fails to reach a perfectly flat regime even at N relax . This could be addressed by a stricter convergence criterion, which would further increase the relaxation times. Therefore, albeit large, N relax for standard Metropolis is in fact an underestimate. The differences in minimum energy values (reached at N relax ) between simpler methods and the hybrid approach are illustrated in Figure 2(a). Figure 2(b) shows the root mean square error RMSE = ∑ s p ∈G p Z( s p ) −Ẑ( s p ) 2 /P, where Z( s p ) represent true andẐ( s p ) predictions at the missing points s p , p = 1, . . . , P. Due to slower relaxation to equilibrium, MPR gap filling with either standard (S) or only partially hybrid (SO or SR) updating schemes yields larger RMSEs than the hybrid method.
In summary, we demonstrated that the HMC algorithm can significantly increase both the computational and prediction performance in the challenging limits of very low temperature and sparse conditioning data. Moreover, the HMC relaxation time is insensitive to grid size, i.e., the algorithm is scalable. Owing to the short-range nature of the interactions between variables, the computational efficiency of the HMC algorithm (and the entire MPR method), can be further increased by vectorization or parallelization on graphics processing units [10]. This advance will make MPR gap filling attractive for near real-time processing of big data sets, e.g. satellite and radar images.