Double-Folding Nucleus-Nucleus Optical Potential: Parallel MPI and OpenMP Implementations

The computation of the real part of the nucleus-nucleus optical potential based on the microscopic double-folding model was implemented using both the MPI and OpenMP parallel programming techniques. Test calculations of the total cross section of $^{6}$He + $^{28}$Si scattering at an energy of 50 A MeV show that both techniques provide a significant and comparable speedup of the calculations.


Introduction
The optical potential (OP) of elastic nucleus-nucleus scattering is one of the most important ingredients needed for understanding the mechanisms of the various types of nuclear interactions and for calculating the corresponding physical observables. Phenomenological OP forms (e.g. the Woods-Saxon potential) usually have six or more free parameters that must be adjusted to the experimental data. More realistic microscopic approaches take into account the internal structure of the interacting nuclei. The real part of the microscopic OP is derived within the double-folding model (DFM) [1,2]. In general, the DFM OP does not contain free parameters to be fitted, although in practice it is usually necessary to renormalize the depth of the calculated OP in order to get agreement with the experimental data.
The computer code for calculating the DFM OP was developed in 2007 [3]. It is used intensively within the hybrid model of the OP [2], which combines the DFM-based real part of the OP with the high-energy approximation [4] for calculating its imaginary part. Such an approach, despite the significant computer time required for the DFM OP calculation, is successfully used in studies of nuclear interactions at energies of 10-100 MeV/nucleon (see, e.g., [5] and references therein).
In this contribution, we present the parallel implementation of the DFM OP model in two versions based on two popular parallel programming techniques, MPI and OpenMP. The method of calculating the DFM OP is briefly described, and the results of methodological calculations confirming the efficiency of the MPI/C++ and OpenMP/C++ codes are presented.
The DFM OP $V^{DF}$ is constructed as the sum of isoscalar and isovector components, each of which includes a direct part $V^{D}$ and an exchange part $V^{EX}$. In the expressions for the isoscalar potential, each density function of the projectile nucleus $\rho_p$ and of the target nucleus $\rho_t$, with atomic masses $A_p$ and $A_t$, is the sum of the neutron and proton densities; $K$ is the local momentum of the relative motion, and $U_c$ is the Coulomb potential. The effective nucleon-nucleon potentials $\upsilon^{D}_{NN}$ and $\upsilon^{EX}_{NN}$ depend on the energy $E$ and on the densities of the interacting nuclei. The isovector DFM OP formulae are obtained by replacing in (2) the sum of the neutron and proton densities by their difference and by using the corresponding formulae for the effective nucleon-nucleon potentials $\upsilon^{D}_{NN}$ and $\upsilon^{EX}_{NN}$ [6].
The local momentum $K$ depends on the potential $V^{DF}$, i.e. the expression for $V^{EX}$ is a nonlinear integral equation, written schematically as equation (4), in which $\Phi$ is an integral expression of multiplicity 3. For the numerical solution of (4), a uniform grid is introduced along the radial coordinate. Numerical integration is carried out using the Simpson method, and equation (4) is solved by the fixed-point iteration (5), where $k$ denotes the iteration number; the direct potential $V^{D}$ is calculated by means of standard quadrature formulae. The iteration process (5) is stopped once the convergence condition is satisfied to within a given small parameter $\varepsilon > 0$.
Naturally, instead of the infinite integration interval $[0, \infty)$, the calculations are carried out on a finite interval $[0, R_{max}]$, where $R_{max}$ is a sufficiently large value that ensures the correct asymptotic behavior of the DFM OP, i.e. its decay to 0 in the region of $R_{max}$. Depending on the atomic masses of the interacting nuclei and the kinetic energy of the projectile nucleus, the calculation of the DFM OP takes several minutes or longer. The cost of the calculation grows toward heavier nuclei and higher energies, because a smaller step size of the discrete grid along the spatial coordinate and a longer integration interval are needed.
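The numerical scheme described above (uniform radial grid, Simpson integration, fixed-point iteration with a stopping parameter $\varepsilon$) can be illustrated by a minimal C++ sketch. It is not the DFM-POTM code: the kernel inside `phi`, the coupling constant `lambda`, the iteration form $V^{EX}_{k+1} = \Phi(V^{D} + V^{EX}_{k})$, the zero initial guess and the maximum-norm stopping test are all assumptions chosen for illustration, and the real operator $\Phi$ is a threefold integral rather than the one-dimensional toy used here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Composite Simpson rule for samples f[0..n-1] on a uniform grid with step h.
// Requires an odd number of nodes (an even number of subintervals).
double simpson(const std::vector<double>& f, double h) {
    assert(f.size() >= 3 && f.size() % 2 == 1);
    double sum = f.front() + f.back();
    for (std::size_t i = 1; i + 1 < f.size(); ++i)
        sum += (i % 2 == 1 ? 4.0 : 2.0) * f[i];
    return sum * h / 3.0;
}

// Toy stand-in for the integral operator Phi (hypothetical kernel, NOT the DFM one):
// Phi(U)(r_i) = lambda * integral over the grid of exp(-(r_i - s)^2) * sin(U(s)) ds
std::vector<double> phi(const std::vector<double>& u, double h, double lambda) {
    const std::size_t n = u.size();
    std::vector<double> out(n), integrand(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const double d = (static_cast<double>(i) - static_cast<double>(j)) * h;
            integrand[j] = std::exp(-d * d) * std::sin(u[j]);
        }
        out[i] = lambda * simpson(integrand, h);
    }
    return out;
}

struct FixedPointResult {
    std::vector<double> v_ex;  // exchange part on the grid
    int iterations;            // number of iterations performed
    bool converged;            // true if the stopping test was met
};

// Iterate V_EX(k+1) = Phi(V_D + V_EX(k)), starting from V_EX(0) = 0,
// until max_i |V_EX(k+1)_i - V_EX(k)_i| < eps (assumed stopping test).
FixedPointResult solve_exchange(const std::vector<double>& v_d, double h,
                                double lambda, double eps, int max_iter) {
    std::vector<double> v_ex(v_d.size(), 0.0), total(v_d.size());
    for (int k = 1; k <= max_iter; ++k) {
        for (std::size_t i = 0; i < v_d.size(); ++i) total[i] = v_d[i] + v_ex[i];
        std::vector<double> next = phi(total, h, lambda);
        double diff = 0.0;
        for (std::size_t i = 0; i < next.size(); ++i)
            diff = std::max(diff, std::fabs(next[i] - v_ex[i]));
        v_ex = std::move(next);
        if (diff < eps) return {v_ex, k, true};
    }
    return {v_ex, max_iter, false};
}
```

A call such as `solve_exchange(v_d, 0.1, 0.2, 1e-10, 200)` on a grid with an odd number of nodes returns the converged exchange part together with the iteration count. The toy iteration is a contraction only because `lambda` is small; convergence of the real DFM iteration is a property of the physical operator and is not demonstrated by this sketch.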
The parallel implementation is based on splitting the integration interval $[0, R_{max}]$ into blocks according to the number $P$ of parallel computing units (MPI processes or OpenMP threads). In practice, this means distributing the $N$ coordinate nodes (and the respective calculations) among the $P$ parallel units. In the MPI implementation this distribution is organized explicitly, whereas in the OpenMP implementation it is automated by means of special directives.

Numerical results
The calculations show that the MPI and OpenMP versions provide almost the same speedup, proportional to the number $P$ of parallel units, for $1 < P < 12$. For $P > 12$ the speedup grows more slowly. The maximal speedup $S$ that can be obtained for each $N$ in the MPI and OpenMP calculations is presented in Fig. 1, together with the respective numbers of parallel units $P$ and the values of the effectiveness $Q$, which characterizes the load balance of the parallel computational units. The quantities $S$ and $Q$ are calculated from $T_1$, the execution time in serial mode, and $T_P$, the execution time in the case of $P$ parallel units. In general, both the MPI and OpenMP techniques provide comparable values of the maximal speedup (20-25 times). However, for the MPI implementation the maximal speedup is reached at $40 \le P \le 44$, while in the OpenMP case it is obtained at $P = 28$ for all values of $N$. Increasing the number of OpenMP threads beyond $P = 28$ leads to a significant decrease in speedup, to values of 10-15 times.
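The two ways of distributing the $N$ grid nodes among $P$ parallel units mentioned above can be sketched as follows. This is an illustrative C++ fragment, not an excerpt from the published codes: `block_range`, `node_value`, `fill_by_blocks` and `fill_with_openmp` are hypothetical names, the per-node workload is a placeholder, and the explicit variant only emulates serially what separate MPI processes would each do for their own rank before the pieces are gathered.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Block { std::size_t begin, end; };  // half-open node range [begin, end)

// Explicit (MPI-style) distribution: nodes owned by unit `rank` out of n_units.
// The first N % P units receive one extra node, so block sizes differ by at most 1.
Block block_range(std::size_t n_nodes, int n_units, int rank) {
    assert(n_units > 0 && rank >= 0 && rank < n_units);
    const std::size_t p = static_cast<std::size_t>(n_units);
    const std::size_t r = static_cast<std::size_t>(rank);
    const std::size_t base = n_nodes / p, extra = n_nodes % p;
    const std::size_t begin = r * base + (r < extra ? r : extra);
    return {begin, begin + base + (r < extra ? 1 : 0)};
}

// Hypothetical per-node workload standing in for the potential at node i.
double node_value(std::size_t i, double h) {
    const double r = static_cast<double>(i) * h;
    return std::exp(-r * r);
}

// Explicit variant: each "rank" fills only its own block (serial emulation of
// the work that P separate processes would perform on their own blocks).
std::vector<double> fill_by_blocks(std::size_t n_nodes, double h, int n_units) {
    std::vector<double> v(n_nodes, 0.0);
    for (int rank = 0; rank < n_units; ++rank) {
        const Block b = block_range(n_nodes, n_units, rank);
        for (std::size_t i = b.begin; i < b.end; ++i) v[i] = node_value(i, h);
    }
    return v;
}

// Directive-based variant: the OpenMP runtime splits the loop over the nodes.
// If OpenMP support is not enabled at compile time, the pragma is ignored
// and the loop simply runs serially.
std::vector<double> fill_with_openmp(std::size_t n_nodes, double h) {
    std::vector<double> v(n_nodes, 0.0);
    const long n = static_cast<long>(n_nodes);
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i)
        v[static_cast<std::size_t>(i)] = node_value(static_cast<std::size_t>(i), h);
    return v;
}
```

For $N = 401$ nodes and $P = 44$ units, `block_range` assigns 10 nodes to each of the first 5 units and 9 nodes to the remaining 39. A spread of at most one node per unit is one simple way to keep the load balanced; the actual scheme used in the MPI code may differ.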
An example of the use of the DFM OP for the calculation of a physical observable is presented in the right panel of Fig. 2. Here, the $^{6}$He + $^{28}$Si total reaction cross sections $\sigma_{tot}$ computed using the DFM OP are compared with experimental data [9,10]. The cross sections have been calculated following [10] with the help of the DWUCK4 code [11]. To obtain the theoretical curve depicted in the right panel of Fig. 2, it was necessary to calculate the DFM OPs 24 times, at energies from 2 to 50 MeV/nucleon with an increment of 2 MeV/nucleon, with the number of nodes of the discrete grid being $N = 401$. The calculation of the required potentials would take about 8 hours in serial mode, while with 44 MPI processes it takes about 20 minutes.
Figure 2 (right panel): $^{6}$He + $^{28}$Si total reaction cross sections. Experimental data from [9] (squares) and [10] (triangles).
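As a rough consistency check of the quoted timings, one can assume the conventional definitions of speedup, $S = T_1/T_P$, and of parallel efficiency, $Q = S/P$. The definition of $Q$ is an assumption here, and the source's own formula may, for instance, express it in percent. With the approximate values $T_1 \approx 8$ h and $T_P \approx 20$ min for $P = 44$:

```latex
\[
S = \frac{T_1}{T_P} \approx \frac{8 \times 60\ \text{min}}{20\ \text{min}} = 24,
\qquad
Q = \frac{S}{P} \approx \frac{24}{44} \approx 0.55 .
\]
```

The estimate $S \approx 24$ lies within the 20-25 range of maximal speedup reported for the MPI version. Because both timings are quoted only approximately, the figures should be read as order-of-magnitude checks rather than measured values.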

Conclusions
It was found that both the MPI and OpenMP parallel versions of the DFM OP code are quite efficient, providing a more than 20-fold speedup of the calculations in comparison with the serial mode. The package of computer codes DFM-POTM, including the serial C++ code, the MPI code and the OpenMP code, is freely available from the JINRLIB library of the JINR.