Random sampling and validation of covariance matrices of resonance parameters

Analytically exact methods for random sampling of arbitrary correlated parameters are presented. Emphasis is given on one hand on the possible inconsistencies in the covariance data, concentrating on the positive semi-definiteness and consistent sampling of correlated inherently positive parameters, and on the other hand on optimization of the implementation of the methods itself. The methods have been applied in the program ENDSAM, written in the Fortran language, which from a file from a nuclear data library of a chosen isotope in ENDF-6 format produces an arbitrary number of new files in ENDF-6 format which contain values of random samples of resonance parameters (in accordance with corresponding covariance matrices) in places of original values. The source code for the program ENDSAM is available from the OECD/NEA Data Bank. The program works in the following steps: reads resonance parameters and their covariance data from nuclear data library, checks whether the covariance data is consistent, and produces random samples of resonance parameters. The code has been validated with both realistic and artificial data to show that the produced samples are statistically consistent. Additionally, the code was used to validate covariance data in existing nuclear data libraries. A list of inconsistencies, observed in covariance data of resonance parameters in ENDF-VII.1, JEFF-3.2 and JENDL-4.0 is presented. For now, the work has been limited to resonance parameters, however the methods presented are general and can in principle be extended to sampling and validation of any nuclear data.


Introduction
Compared to deterministic uncertainty propagation [1], random sampling is less time efficient, however may be more accurate and less cumbersome.
In the covariances of existing nuclear data libraries, different mathematical inconsistencies are a common phenomenon.Some of the inconsistencies can be overcome while performing deterministic uncertainty propagation, but they completely disable random sampling.One of the requirements for random sampling is positive semi-definiteness of the covariance/correlation matrix [2,3].When sampling inherently positive parameters (e.g.cross sections or resonance widths), their probability distributions is not governed by Gaussian function, but lognormal distribution can be used instead [4].This distribution requires additional restrictions on correlation coefficient values [5] which are not taken into account in all libraries [2].In practice, approximative methods avoiding lognormal distribution can be used, e.g.Gaussian distribution with truncated negative values.These methods generally work well with small uncertainties ( 30%), however create significant biases when sampling parameters with larger relative uncertainties.
In this paper, analytically exact methods to randomly sample nuclear data are presented.The methods have been a e-mail: lucijan.plevnik@ijs.sib Currently at EC-JRC Institute for Reference Materials and Measurements, Geel, Belgium; e-mail: gasper.zerovnik@ijs.siapplied in the program ENDSAM, written in the Fortran 95 language, which receives • a number of required samples n, • a file from a nuclear data library of a certain isotope in ENDF-6 format [6], as input and returns n new files in ENDF-6 format which contain values of random samples of resonance parameters in places of original values.The source code for the program ENDSAM is available from OECD/NEA Data Bank.
The code has been validated on three major contemporary nuclear data libraries: ENDF/B-VII.1 [7], JEFF-3.2[8], and JENDL-4.0u2[9].In Sect. 4 the results given by ENDSAM are compared with expected values, while Sect. 5 contains the list of all isotopes with inconsistent data from aforementioned libraries.
More details about this work can be found in the paper [10].

Normal and lognormal parameters
In nuclear data libraries, physical parameters are represented by mean values and (possibly) the corresponding covariances.The usual assumption is that they are distributed according to the normal distribution.However, normally distributed parameters may attain positive or negative values, while in reality some parameters, e.g.neutron widths, are inherently positive.If a parameter has a small relative uncertainty (< 30%), then normal distribution is a good approximation for most practical purposes.However, as emphasized in [2] and demonstrated with examples in Sect.4, in some rare cases relative uncertainties of parameters given in nuclear data libraries may be extremely large (even 150000%).Therefore, we describe positive parameters with a lognormal distribution [4], which means that their logarithms are normally distributed.
We use x , σ (x), ru (x) = σ (x) x , and corr (x, y) to denote mean of x, standard deviation of x, relative uncertainty of x, and correlation between x and y, respectively.
Suppose that x and y are lognormally distributed, that is x = e u and y = e v for normally distributed random parameters u and v, respectively.Then moments of normal parameters may be expressed in terms of moments of x and y in the following way: see [10] and the references therein.We now deduce from −1 ≤ corr (u, v) ≤ 1 and Eqs. ( 2) and ( 3) that correlations involving lognormal parameters should satisfy and Figure 1 shows upper and lower bound for correlation between lognormal parameters in dependency of their relative uncertainties.

Random sampling method
Consider a combination of normal and lognormal parameters x 1 , . . ., x n and a collection of normal parameters If x j is lognormal, then means and deviations of x j can be computed by respectively using Eq. ( 1).Moreover, if x i is another parameter, then depending on its type, correlation corr x i , x j has to satisfy either the condition in Eqs. ( 4) or (5).If it does, then corr x i , x j can be computed either by Eqs. ( 2) or ( 3), respectively.However, it was pointed out in [2] and will be demonstrated with examples in Sect. 5 that conditions in Eqs. ( 4) and ( 5) are often not taken into account in nuclear data libraries.In such cases we assume that the correlation matrix from data library is given for a collection of normal parameters, so it applies to the collection x 1 , . . ., x n and not to the original collection of parameters.
Another restriction which is sometimes not taken into account is that a correlation matrix must be positive semidefinite.Some cases appear when even a matrix given in a library has some negative eigenvalues and in some cases this happens to be the case after transformation of correlation coefficients.In this case we apply Higham's method [11] to find the nearest correlation matrix in Frobenius norm and work with the resulting matrix.This correction method may not be universally the best solution, however it is mathematically rigorous.
We summarize the following algorithm to randomly sample a collection x 1 , . . ., x n .Algorithm 3.1 1) For j = 1, . . ., n: if x j is lognormal, then compute x j and σ x j by respectively applying Eq. ( 1). 2) Check whether all correlations satisfy theoretical bounds for lognormal parameters from Eqs. ( 4) and ( 5).
• If not, then set C = C.
• If yes, then compute new corrlelation matrix C = corr x i , x j by applying Eqs. ( 2) or (3) when necessary.

3)Check if C is positive definite by trying to compute its
Cholesky factorization.
• If it is, then save its lower Cholesky factor into L. 4) Compute standard normal random vector v = (v 1 , . . ., v n ) using the Box-Muller method [12].5) Compute normal random vector u = (u 1 , . . ., u n ) by formula u j = x j + σ x j (Lv) j , j = 1, . . ., n. (7) 6)For j = 1, . . ., n: sample value of x j equals    u j : x j is normal, e u j : x j is lognormal. (8) Among resonance parameters in nuclear data libraries, resonance energy ER is the only parameter which is assumed to be normally distributed by ENDSAM, since its relative uncertainty is always very small.

Validation of the sampling code
The sampling code ENDSAM was validated by sampling different examples of resonance parameters from existing nuclear data libraries.Figure 2 shows relative errors of the first two moments.A "batch of samples" is referred to as a sequence of generated samples, e.g. 10 5 samples in the graph below in Fig. 2. In each batch, distribution moments of the first i samples were calculated, averaged over all batches, and plotted as a function of i.We considered the following resonance parameters of 35 Cl with different values of relative uncertainties from the ENDF B/VII.1 library: 13.1%: neutron width with resonance spin J = 1 at energy around 55 keV, and orbital angular momentum l = 0, 100%: proton width with J = 1 at around 183 keV, l = 0, 357%: proton width with J = 2 at around 15 keV, l = 0, 5000%: proton width with J = 2 at around 7.6 MeV, l = 0, 150000%: proton width with J = 1 at around 68 keV, l = 0.
Sample means of a parameter with relative uncertainty 100% converge to relative error less than 1% after less than 10 samples per batch, while error is still around 11% after 10 4 samples per batch for a parameter with relative uncertainty 150000%.Convergence of standard deviations is slower.For the first parameter, the relative error falls within 1% after 500 samples per batch, while it is nowhere near required value after 10 5 samples per batch for the second parameter.
On the other hand, convergence of correlations depends on more factors.We give analysis for two artificially chosen parameters x and y with different expected values of moments.
Convergence depends on corr (x, y) and ru (x) when x is lognormal and y is normal.If corr (x, y) has zero value, then absolute error falls within 0.01 almost immediately and does not vary significantly for different values of ru (x).If correlation is nonzero, then larger value of |ru (x) | results in slower convergence, see Fig. 3.
Finally, correlation and relative uncertainties influence rate of convergence of correlation between two lognormal parameters x and y.Again, zero value of corr (x, y) yields quick convergence independently of relative uncertainties and larger values of relative uncertainties imply slower convergence.Figure 3 shows absolute error as a function of number of samples for ru (x) = 100% and different values of ru (y) and corr (x, y).

Validation of nuclear data libraries
The correlation matrices of resonance parameters in ENDF/B-VII.1, JEFF 3.2 and JENDL 4.0 libraries have been analyzed and the list containing information of possible faults mentioned in Sect. 3 in the correlation/covariance matrices is given in Table 1.
For example 232 Th from ENDF/B-VII.1 and JEFF 3.2 is incompatible with lognormal distribution.The resonance covariance file for resolved energy range contains 2781 resonance parameters.Parameter No. 36 corresponds to radiation width of 12 th resonance and its relative uncertainty equals 0.13.On the other hand, parameter No. 797 corresponds to neutron width of 266 th resonance with (extremely large) relative uncertainty of 27.66.Both parameters are sampled according to lognormal distribution.Their correlation equals −0.205 which is smaller than required lower bound from Eq. ( 5): = −0.08.

Conclusions
An open source code, named ENDSAM, has been implemented, which: • reads resonance parameters and their covariance data from nuclear data libraries, • checks whether the covariance data is consistent, • produces random samples of resonance parameters.
In the future, the code is planned to be extended to sampling of other nuclear data, such as cross sections, angular and energy distributions, and other.The code has been validated against realistic and artificial data to show that the produced samples meet statistical expectations.Additionally, the code can be used to validate existing nuclear data libraries.A list of inconsistencies, spotted in covariance data of resonance parameters in ENDF-VII.1,JEFF-3.2 and JENDL-4.0 is presented.

Figure 1 .
Figure 1.Upper bound for correlation of lognormal and normal parameter (top); lower (bottom left) and upper (bottom right) bound for correlation of two lognormal parameters as functions of relative uncertainties.

Figure 3 .
Figure 3. Absolute error of sample correlations for different expected values of correlation and relative uncertainty.Up: normal and lognormal parameter.Below: two lognormal parameters.Averaged over 1000 batches of samples.(Data are generated artificially.)