Nuclear data evaluation methodology including estimates of covariances

Evaluated nuclear data rather than raw experimental and theoretical information are employed in nuclear applications such as the design of nuclear energy systems. Therefore, the process by which such information is produced and ultimately used is of critical interest to the nuclear science community. This paper provides an overview of various contemporary methods employed to generate evaluated cross sections and related physical quantities such as particle emission angular distributions and energy spectra. The emphasis here is on data associated with neutron-induced reaction processes, with consideration of the uncertainties in these data, and on the more recent evaluation methods, e.g., those that are based on stochastic (Monte Carlo) techniques. There is no unique way to perform such evaluations, nor are nuclear data evaluators united in their opinions as to which methods are superior to the others in various circumstances. In some cases it is not critical which approaches are used as long as there is consistency and proper use is made of the available physical information. However, in other instances there are definite advantages to using particular methods as opposed to other options. Some of these distinctions are discussed in this paper and suggestions are offered regarding fruitful areas for future research in the development of evaluation methodology.


Introduction
The understanding that evaluated nuclear data rather than raw experimental and theoretical results should be used for nuclear applications has a history spanning many decades. It is not the intent of this communication to review the history but rather to note that this work has led to the development of several evaluated nuclear data libraries which are widely used around the world. These include both national and regional libraries [1][2][3][4][5]. It should be understood that while these libraries have been generated by different projects, they are not totally independent since certain evaluations appear in several different libraries. Also, there have been many different releases of these libraries over the years as their content is updated and expanded. What ties these libraries together is their use of a common format known as the ENDF format [6]. The use of a common format has allowed these various libraries to be readily translated to input data formats used by nuclear applications codes, and this enables the performance of these libraries to be compared in various practical applications. However, it is not the intent of this contribution to dwell on such issues but rather to focus on how the information in these libraries is produced by evaluators. Among the issues to be addressed are: i) collection and adjustment of experimental data; ii) theoretical data from nuclear modeling; iii) the handling of nuclear model defects in estimating uncertainties in evaluated data; iv) deterministic evaluation methods; v) stochastic evaluation methods; and vi) advanced concepts for using nuclear data in nuclear energy applications.

Data from experiments
It was recognized early by the nuclear science community that it would be impractical for each evaluator to independently resort to the original literature to obtain the experimental data used in evaluations. Therefore, an international collaborative effort was established several decades ago to continually compile experimental data and provide them in standardized formats in a readily accessible library known as the EXFOR system [7]. This approach has been of great value to evaluators, but the system is far from perfect. One problem is that the information compiled in EXFOR is raw data, so there are often errors due to confusion over units, misreporting by original authors, miscopying by compilers, etc. Some of these data are simply bad due to defects in the experiments. Finally, since most neutron reaction experiments make use of decay and standards data in their interpretation, changes in the results due to changes in these reference data are to be expected. It is a tedious chore for a contemporary evaluator to sort all this out and thereby ensure that the best possible experimental information is used in the evaluation process. To make matters worse, evaluator corrections are usually not compiled into usable databases. An example of one evaluator's efforts to weed out bad (and discrepant) values and adjust others in an experimental database for the 64Zn(n,p)64Cu reaction is shown in Figs. 1 and 2 [8]. The details in these two plots are not of particular importance for present purposes, but the concept of weeding out bad data and performing needed adjustments to provide better experimental results for an evaluation is of critical importance. Evaluation methods are very sensitive to the use of discrepant data, which may distort derived results. Unfortunately, not all evaluators do this as thoroughly and conscientiously as they should.

To aid in this process, a subgroup of the NEA Working Party for Evaluation Cooperation (WPEC) is undertaking a collaborative project to improve the EXFOR database by eliminating some of the more obvious errors that it contains [9]. In spite of this valuable effort, reliance on compiled information in a database such as EXFOR will continue to be problematic because many of the associated problems with the data content cannot be resolved other than through communication with the original authors. Often this is impossible. So, evaluators are frequently required to make estimates concerning information that is missing from EXFOR, erroneous, or otherwise not readily accessible. These judgments are likely to be rather subjective for obvious reasons. Thus, it is important that experimenters of the future be informed that they need to take great care in their measurements, document them in sufficient detail, and provide credible estimates of uncertainties, in order to gain the confidence of the evaluators who will use these data. A careful assessment of the possible systematic and statistical uncertainties is required and should be properly documented. The consequences for the future of nuclear energy are significant. A final point to be made in this section is that information on experimental uncertainties, and especially correlations between them, is frequently lacking in EXFOR. This is a major impediment for evaluators attempting to provide credible uncertainty information for their evaluations.

EFNUDAT Workshop "Measurements and Models of Nuclear Reactions", Paris, 2010

Data from nuclear modeling
For myriad technical reasons, it is impossible to measure with sufficient accuracy, or sometimes at all, many of the physical parameters required for nuclear data applications. For this reason, nuclear modeling is being called upon ever more frequently to fill in these gaps. On the positive side, nuclear modeling has become quite sophisticated and capable of predicting the major features of important nuclear data. On the downside, these models remain deficient in several areas. In this section, we examine briefly two major sources of uncertainty associated with the prediction of physical quantities (e.g. cross sections) through the use of nuclear models.

Nuclear model parameter uncertainties
Although some progress is being made in developing models with predictive capabilities that are derived from fundamental nucleon-nucleon interactions, to a large extent theoretical calculations are based on parameterized phenomenological nuclear models. In former times, the predictions generated from models by different laboratories varied greatly, often by orders of magnitude.
Matters have improved lately, in large part because of a growing consensus concerning what the key model parameters ought to be as well as the use of similar algorithms in the major modeling codes used by evaluators [10][11][12]. Regarding model parameters, a significant step forward has been the adoption of the Reference Input Parameter Library (RIPL) as the "gold standard" for initial parameter estimates as well as their uncertainties [13]. Evaluators still "fine tune" these model parameters to provide closer agreement with data for particular isotopes and reaction channels, but RIPL has been found to provide an excellent global qualitative representation of nuclear data. As we shall see in Section 4, agreed upon estimates of uncertainties in these parameters provide essential input for estimating the uncertainties of evaluated results via the evaluation procedures to be described.

Nuclear model deficiency uncertainties
Estimates of uncertainty in nuclear model parameters alone are insufficient to provide overall estimates of the uncertainties in the predictions of nuclear models, since the deficiencies in these models also contribute to the uncertainty, but in a manner that is very difficult to assess. Models themselves are deficient in large part because of the approximations that have to be made for practical reasons. However, deficiencies also persist due to a lack of perfect understanding of the physical interactions in the various reaction channels (e.g., emission of complex particles). So, some attempt must be made to estimate these deficiencies. To date, the main approach used to estimate model deficiencies has been to examine the discrepancies that persist between model-calculated and experimental results when the best possible model-parameter formulations and good quality experimental data are used for such comparison purposes [14,15], e.g., see Fig. 3. It is clear from the example of Fig. 3 that these deficiencies are energy dependent. While estimates of this energy dependence can be provided in many cases, estimates of the correlations in these uncertainties are very difficult to determine. Consequently, full correlation is often assumed even though that is likely to be a very crude assumption.
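The full-correlation assumption mentioned above can be made concrete with a small numerical sketch. The energy grid and the fractional defect values d(E) below are purely hypothetical; the point is only that, under full correlation, the defect covariance is the rank-one outer product V_ij = d_i d_j, so every correlation coefficient is unity.

```python
import numpy as np

# Hypothetical energy grid (MeV) and fractional model-vs-experiment
# discrepancies d(E) used as energy-dependent model-defect estimates.
energies = np.array([1.0, 5.0, 10.0, 14.0, 20.0])
defect = np.array([0.02, 0.03, 0.05, 0.08, 0.12])

# Under the common full-correlation assumption, the model-defect
# covariance is the rank-one outer product V_ij = d_i * d_j.
V_defect = np.outer(defect, defect)

# The diagonal recovers the squared defect at each energy...
assert np.allclose(np.diag(V_defect), defect**2)
# ...and every correlation coefficient is exactly 1 (full correlation).
corr = V_defect / np.sqrt(np.outer(np.diag(V_defect), np.diag(V_defect)))
assert np.allclose(corr, 1.0)
```

A rank-one matrix of this kind is singular, which is one practical symptom of how crude the full-correlation assumption can be.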
In closing this section, we state the obvious fact that knowledge of model deficiencies can never be more than speculative because it is not known what the correct model should be, and if that fact were known there would be no model deficiency uncertainty! It has been said that the best way to reduce uncertainty is through better understanding of the fundamentals. Therefore, the best course for the nuclear science community to pursue is to continue research aimed at gaining a better understanding of nuclear interactions and to develop better models for use in future evaluations.

Nuclear reaction evaluation
We now turn to the main topic of this communication, i.e., a discussion of the most important methods used for neutron reaction data evaluation. Attention is given here only to modern methods that involve well-defined algorithms with minimal subjectivity employed in producing the evaluated results. The evaluation of data for reaction processes in the thermal, resolved resonance, and unresolved resonance energy regions (that is for low neutron energies) involves specialized techniques that are not treated in this paper. Therefore, the following discussion is applicable mainly for neutron energies above the 10-100 keV range (depending on the isotope). This is generally referred to as the fast-neutron region. Furthermore, evaluations for light nuclei (i.e., A < 15−20) also require specialized techniques so they will not be discussed either.
It is essential to understand that the main challenge of contemporary nuclear reaction data evaluation is to establish the best possible approaches for combining objective (measured) with subjective (theoretical or nuclear-model generated) information, and to generate reasonable estimates of the uncertainties in the final evaluated results.
Methods for evaluation of neutron reaction data in the fast-neutron region can be classified into two broad categories: deterministic and stochastic (Monte Carlo). Of these two approaches, the deterministic ones are better established and still the most widely used. However, the newer Monte Carlo techniques are gaining broader acceptance and, in some cases, they are beginning to be used on a regular basis for full-scale evaluations. Further development of stochastic methods of evaluation is in progress, and some of these techniques have been used so far only for simple, hypothetical examples to prove the principles of the methods. Evaluators have also pursued a hybrid approach in which both deterministic and stochastic techniques are employed, and this will be mentioned briefly below.

Deterministic evaluation methods
By and large, most contemporary neutron reaction evaluation procedures are based on various manifestations of the least-squares method. The simple least-squares (SLSQ) procedure produces evaluations that involve only experimental data whereas the generalized least-squares (GLSQ) approach is derived from the Bayesian concept, and it involves use of prior information (usually from nuclear modeling) that is "updated" or "enhanced" by introducing experimental data [16]. Since the SLSQ approach can be viewed as a special case of the GLSQ approach in which the prior information is assumed to have uncertainties that are so large that the prior data can be considered to be of no consequence (i.e., the "non-informative" prior), we will focus on providing an outline of the theoretical basis of the GLSQ approach and then mention briefly the contemporary applications for both the SLSQ and GLSQ deterministic methods.
Without going into all the details, evaluation by the GLSQ approach at its core can be viewed as solving the following mathematical problem [16]:

Q(σ) = [y_E − f(σ)]^T V_E^{-1} [y_E − f(σ)] + [σ − σ_C]^T V_C^{-1} [σ − σ_C] = minimum.    (1)

In this expression, σ represents the evaluated cross section set to be determined, y_E is the experimental data set with V_E its covariance matrix, σ_C is the calculated cross section set (i.e., the model prior) with its covariance matrix V_C, and f(σ) maps the evaluated cross sections to the measured observables [16,17].
Since the experimental data set may include complex values such as ratios and integral data, it is necessary to linearize the minimization problem in order to solve it in a straightforward deterministic manner. The approximations that must be applied are generally acceptable as long as the data uncertainties are not too large, but they may break down and this can lead to distortions if the uncertainties are large and/or complex experimental data, e.g., ratios or integral values, are involved. One should notice from Eq. (1) that complete covariance information, including correlations, is required when dealing with the evaluation of multiple parameter data sets.
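The GLSQ update described above can be sketched numerically for a linearized problem. All the numbers below are toy values invented for illustration, and the design matrix A (one measurement observing only the first cross section) is an assumption; minimizing the GLSQ functional with a linear mapping y = Aσ leads to the standard gain-matrix update used here.

```python
import numpy as np

# Toy GLSQ (Bayesian least-squares) update: prior cross sections sigma_C
# with covariance V_C, experimental data y_E with covariance V_E, and a
# linearized design matrix A mapping cross sections to observables.
sigma_C = np.array([1.00, 2.00])          # model prior (illustrative barns)
V_C = np.diag([0.10**2, 0.20**2])         # prior covariance
y_E = np.array([1.05])                    # one measurement of sigma_1
V_E = np.array([[0.05**2]])               # its covariance
A = np.array([[1.0, 0.0]])                # measurement observes sigma_1 only

# Minimizing Q = (y_E - A sigma)^T V_E^-1 (y_E - A sigma)
#              + (sigma - sigma_C)^T V_C^-1 (sigma - sigma_C)
# yields the standard update equations:
W = V_C @ A.T @ np.linalg.inv(A @ V_C @ A.T + V_E)   # gain matrix
sigma_post = sigma_C + W @ (y_E - A @ sigma_C)
V_post = V_C - W @ A @ V_C

# The posterior lies between prior and measurement, with reduced variance.
assert sigma_C[0] < sigma_post[0] < y_E[0]
assert V_post[0, 0] < V_C[0, 0]
assert np.isclose(V_post[1, 1], V_C[1, 1])  # unmeasured channel unchanged
```

Note that the unmeasured second channel keeps its prior uncertainty here only because the toy prior covariance is diagonal; with prior cross-channel correlations, a measurement of one channel would also update the other.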

Non-model fits to experimental data
As indicated above, SLSQ methods incorporate no prior information in the evaluation process so they rely entirely on experimental data. The mathematical condition to be satisfied corresponds to Eq. (1) with the second term of the expression eliminated. The most important recent example of using this approach is the evaluation of the international neutron cross section standards [18] using the code GMA. This was a simultaneous evaluation of several standard reaction processes for which high quality experimental data are available. Many of these data appear as ratios. The procedures used, including the selection and adjustment of the experimental data, estimation of their uncertainties as well as correlations, etc., are exhaustively described in the standards reference document [18]. It should be noted that since no prior information based on nuclear models was employed to effectively "smooth" the evaluated results, it was deemed desirable to smooth the final solution values mathematically, and somewhat artificially, in order to eliminate the insignificant and potentially misleading structure that usually emerges when an evaluation involves only experimental data.

Deterministic GLSQ evaluations
Straightforward GLSQ evaluation techniques are incorporated in various well-established data evaluation codes such as GLUCS [19], but these methods are well documented so there is no benefit in mentioning them further here. Of greater interest are the newer variations of this approach. One of these involves a combined use of the nuclear modeling code EMPIRE [10] and the Kalman Filter code KALMAN [20]. A very similar combination incorporates KALMAN with the model code GNASH [12]. These approaches are fully deterministic in that numerical uncertainty values associated with nuclear modeling are obtained by propagating model parameter uncertainties with the aid of deterministically calculated sensitivity coefficients. Another approach based on the GLSQ concept involves use of the GANDR code system [21]. GANDR incorporates a GLSQ routine in the code package, and it also employs a novel and quite sophisticated technique for preparing experimental data (since these are often available at energies incompatible with the chosen evaluation grid) so that they can be easily handled in the evaluation process. These GLSQ approaches have been used to produce a significant number of the more recent evaluations that will be included in next-generation evaluated nuclear data libraries [1][2][3][4][5].

Stochastic (Monte Carlo) evaluation methods
The use of Monte Carlo techniques for evaluating neutron reaction data is relatively recent [22], but they are becoming more widespread owing to the possibilities they offer for bypassing some of the limitations inherent in the older deterministic approaches. One of these limitations can be traced to the process of linearizing the problem, thereby ignoring non-linear effects that are certainly present in nuclear models. Five stochastic evaluation approaches are described briefly in this section.

Filtered Monte Carlo (FMC)
The Filtered Monte Carlo (FMC) approach is based on performing an evaluation and estimating the uncertainties mainly on the basis of nuclear model parameters and their uncertainties. The idea is quite simple [22]. A chain of sets of nuclear model parameters is generated by random sampling within the confines defined by the assumed model parameter uncertainties. Each of these sets of parameters is used to calculate corresponding sets of observable physical parameters (cross sections, angular distributions, etc.). This Markov chain of sets of values for the observables can then be used to generate averages and covariances that constitute the desired evaluation.
The concept of "filtering" derives from the fact that some of the stochastic parameter sets yield calculated results that are significantly discrepant with available experimental data so they are discarded (hence filtered). However, the methods for deciding what is and is not discrepant remain somewhat ad hoc at this stage in the development of this approach. The main disadvantage of the FMC is a complete neglect of experimental correlations. Figure 4 shows a family of computed random cross section excitation functions that can be compared with experimental data [23]. Notice the calculated central value and upper and lower limit uncertainty bands that are included in this plot.
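The sampling-and-filtering idea can be sketched with a stand-in analytic "model" in place of a real nuclear modeling code. Everything here is hypothetical: the model form, the parameter uncertainties, and the acceptance tolerance, whose arbitrariness mirrors the ad hoc nature of the filter noted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stand-in "nuclear model": sigma(E) = a * exp(-b * E), purely
# illustrative, with two uncertain parameters a and b.
def model(a, b, energies):
    return a * np.exp(-b * energies)

energies = np.linspace(1.0, 20.0, 5)
a0, b0 = 2.0, 0.05                 # nominal parameter values
da, db = 0.2, 0.01                 # assumed parameter uncertainties
y_exp = model(a0, b0, energies)    # pretend experimental data
tol = 0.15                         # ad hoc filter: max relative deviation

accepted = []
for _ in range(2000):
    # Sample parameters within their assumed uncertainties...
    a = rng.normal(a0, da)
    b = rng.normal(b0, db)
    calc = model(a, b, energies)
    # ...and filter out sets discrepant with the "experimental" data.
    if np.max(np.abs(calc - y_exp) / y_exp) < tol:
        accepted.append(calc)

accepted = np.array(accepted)
mean = accepted.mean(axis=0)       # evaluated central values
cov = np.cov(accepted.T)           # evaluated covariance matrix

assert len(accepted) > 100         # plenty of samples survive the filter
assert np.all(np.abs(mean - y_exp) / y_exp < tol)
```

The surviving set of curves plays the role of the family of excitation functions shown in Fig. 4, and its sample mean and covariance constitute the FMC evaluation.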
An additional advantage of the FMC approach is that it enables covariance information to be provided for physical properties such as those related to neutron angular distributions and neutron spectra just as readily as this information can be generated for cross sections. The FMC method has been the sole evaluation method employed in producing several versions of the comprehensive model-based nuclear data library known as TENDL [24].

Unified Monte Carlo (UMC)
The FMC approach suffers from two fundamental limitations. First, it leads to evaluations that are almost entirely based on nuclear modeling, with only very limited attention paid to experimental data. As such, evaluations performed by this approach tend to involve strong correlations stemming from the stiffness of the models. Second, it is clear that this approach does not give adequate consideration to the information that experiments can bring to the evaluation process in terms of mean values and uncertainties, neglecting known experimental correlations. Thus, a more rigorous approach is called for, and this is provided by the Unified Monte Carlo (UMC) method [25].
The details of the UMC method are documented elsewhere [25] so they need be mentioned here only briefly. The mathematical approach is based on invoking Bayes Theorem and the Maximum Entropy Principle from statistics [16]. A finite collection of physical parameters which, without loss of generality, are assumed here to be cross sections σ = (σ_1, σ_2, . . . , σ_m) is to be evaluated and its m × m covariance matrix V_σ also determined. In practical situations, the evaluation process usually involves both theoretical data (calculated using nuclear models) and experimental data, as is the case for the other methods discussed here.
First, a related set of theoretical values x_C = (x_C1, x_C2, . . . , x_Cm) related to cross sections is generated along with an m × m covariance matrix V_C. These data are considered to be governed by a prior probability density function p_0(σ|x_C, V_C). Next, a collection of relevant experimental values y_E = (y_E1, y_E2, . . . , y_En) with an n × n covariance matrix V_E is introduced into the evaluation procedure. It is assumed to be described by a probability density function L(y_E, V_E|σ), generally referred to in this situation as the Likelihood Function. The data sets x_C and y_E are taken to be explicitly uncorrelated with each other, but the matrices V_C and V_E may include non-zero internal correlations, i.e., they are usually non-diagonal. Bayes Theorem states that the resultant posterior probability density function p(σ) that describes the evaluated parameters σ is the following:

p(σ) = C L(y_E, V_E|σ) p_0(σ|x_C, V_C),    (2)

where C is a normalization constant. Traditionally, p_0 and L are both considered to be normal distributions. In fact, derivation of the widely used GLSQ evaluation method is based on this assumption. The assumption that data are normally distributed appears to be quite reasonable when their uncertainties are small, their probability distributions are highly localized, and both values and their uncertainties are provided in the input data sets. However, departures from normal distributions are known to occur when the uncertainties become large (or are missing) and when complex data types, e.g., ratios, are involved. In fact, it is possible that p_0 and/or L could be constructed as combinations of various other probability functions as well as the normal distribution. It should be mentioned that knowledge of the normalization constant C shown in Eq. (2) is irrelevant in the UMC approach regardless of which Monte Carlo sampling technique is used [25].
Investigations of the UMC approach suggest that the well-known Metropolis-Hastings sampling scheme [26,27] is both efficient and accurate for this application [28].
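A one-dimensional toy version of this sampling can make the scheme concrete (all numbers are hypothetical): with a normal prior and a normal likelihood, Metropolis-Hastings sampling of the posterior of Eq. (2) should reproduce the analytic GLSQ result, and the normalization constant C cancels in the acceptance ratio.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy UMC posterior for a single cross section sigma: a normal prior from
# the model (x_C, v_C) times a normal likelihood from experiment (y_E, v_E).
x_C, v_C = 1.00, 0.10**2           # model prior mean and variance
y_E, v_E = 1.08, 0.05**2           # experimental mean and variance

def log_post(s):
    # Unnormalized log posterior; the constant C of Eq. (2) cancels in MH.
    return -0.5 * ((s - x_C)**2 / v_C + (s - y_E)**2 / v_E)

# Metropolis-Hastings random-walk sampling of p(sigma).
s, chain = x_C, []
for _ in range(50000):
    prop = s + rng.normal(0.0, 0.05)
    if np.log(rng.uniform()) < log_post(prop) - log_post(s):
        s = prop
    chain.append(s)
chain = np.array(chain[5000:])     # discard burn-in

# For a normal prior and likelihood, UMC must reproduce the GLSQ result.
w = v_C / (v_C + v_E)
glsq_mean = x_C + w * (y_E - x_C)
glsq_var = v_C * v_E / (v_C + v_E)
assert abs(chain.mean() - glsq_mean) < 0.01
assert abs(chain.var() - glsq_var) < 0.001
```

The agreement demonstrated by the assertions is exactly the normal-data equivalence between UMC and GLSQ discussed in the literature; the advantage of UMC appears only when p_0 or L departs from normality.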
The UMC approach has not been used to date for a full-scale evaluation, but it has been studied extensively using simple, hypothetical examples [28,29]. These studies have demonstrated that if all input data are comprised only of cross sections, then the UMC and GLSQ methods yield essentially the same results, so there is no advantage to using the more computationally intensive UMC technique. However, when more complex data, or data that are not normally distributed, are involved in an evaluation, the UMC approach can be superior to GLSQ.

Hybrid approach (MC+GLSQ)
The Hybrid method MC+GLSQ is similar to the UMC and Filtered Monte Carlo approaches in dealing with modeling uncertainties to estimate the model prior. However, experimental data are incorporated into the evaluation by the GLSQ method, which combines properly weighted experimental and model covariance matrices. As discussed above, such a method has been demonstrated to be equivalent to the UMC method as long as one is dealing only with measured cross sections [28].
This method was employed by one group to generate stochastically a set of prior cross section values and their covariance matrix using the nuclear modeling code EMPIRE [10]. This prior information was then combined deterministically with available experimental data using the above-mentioned GANDR system [21] to provide the final evaluations [30].

Backward-Forward Monte Carlo (BFMC)
The basis of the Backward-Forward Monte Carlo (BFMC) approach relies on the assumption that the only source of uncertainty in the model-calculated cross sections is imperfect knowledge of the model parameters [31]. There are two phases to this approach. The Backward phase corresponds to determination of the model parameter covariance matrix, while the Forward phase propagates this matrix toward the determination of cross-section covariances by Monte Carlo sampling. Since the Forward stage is quite straightforward, a few words are in order to explain how the Backward phase is performed. In the early stages of development of this approach, attention was focused on optical model parameters (OMPs). A large collection of OMPs was generated at random by sampling values in an uncorrelated manner over a relatively wide expanse of parameter space. For each sample, calculated values of observable quantities, e.g., total, elastic-scattering, and (n,2n) cross sections, were determined and compared with actual experimental data. As is the case for the FMC approach described above, OMP sets that are at odds with experiment are weeded out in BFMC. The surviving parameter sets are then subjected to a statistical analysis that leads to the generation of mean values and covariances for the OMPs. This information is then used for the Forward phase of the analysis, as mentioned above. Work on the implementation of this approach is proceeding. For example, it has been used in evaluating selected reaction channels for n + 89Y [31].
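The two BFMC phases can be sketched with a one-parameter stand-in for an optical model (everything below is illustrative: real applications sample many OMPs at once and compare against several reaction channels, and the acceptance window here is an arbitrary choice).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy BFMC with one "optical model parameter" p whose stand-in
# observable is obs(p) = p**2 (purely illustrative).
def obs(p):
    return p**2

p_true = 3.0
y_exp, dy = obs(p_true), 0.5       # pretend measurement and its uncertainty

# Backward phase: sample p uncorrelated over a wide parameter range, then
# weed out samples whose calculated observable is at odds with experiment.
samples = rng.uniform(1.0, 5.0, 20000)
keep = samples[np.abs(obs(samples) - y_exp) < 2.0 * dy]

# Statistical analysis of the surviving set gives the parameter
# mean and variance (the parameter "covariance" in one dimension).
p_mean, p_var = keep.mean(), keep.var()

# Forward phase: propagate the surviving parameter sets to the
# observable mean and variance.
obs_kept = obs(keep)
obs_mean, obs_var = obs_kept.mean(), obs_kept.var()

assert abs(p_mean - p_true) < 0.1          # recovered near the true value
assert abs(obs_mean - y_exp) < 0.2
assert obs_var < np.var(obs(samples))      # filtering shrinks the spread
```

With several parameters, the statistical analysis of the surviving sets would also yield off-diagonal parameter covariances, which is the information the Forward phase propagates.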

Total Monte Carlo (TMC)
This method is very similar to the FMC approach. The only real difference stems from the recognition that once a Markov chain of random values for observable physical parameters has been generated, it can be used for much more than just producing a conventional evaluation [32]. In fact, these values can be used to generate random values for system operating parameters, and these, in turn, used to generate mean values and uncertainties of great interest to applied data users. For practical reasons, the TMC approach is currently implemented in the following way: each random set of physical parameters is used to produce an entire evaluated nuclear data file. About a thousand such data files in ENDF-6 format are generated, and each one is then used to perform a system analysis. The results from these many analyses are then compiled and analyzed statistically to determine the best values and their dispersions (uncertainties) for the analyzed nuclear system parameters.
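The TMC loop can be caricatured as follows. Here the "random evaluated file" is reduced to a single perturbed cross section value and the "system analysis" to an analytic response; both are stand-ins for ENDF-6 file generation and a full transport calculation, and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in "system analysis": the system operating parameter depends
# nonlinearly on the input cross section (illustrative response only).
def system_analysis(sigma):
    return 1.0 + 0.2 * sigma + 0.05 * sigma**2

sigma_0, dsigma = 2.0, 0.1         # central value and its uncertainty
n_files = 1000                     # "about a thousand" random files

results = []
for _ in range(n_files):
    random_file = rng.normal(sigma_0, dsigma)   # one random "data file"
    results.append(system_analysis(random_file))
results = np.array(results)

# Statistical compilation of the many analyses: best value and dispersion.
best, spread = results.mean(), results.std()

nominal = system_analysis(sigma_0)
assert abs(best - nominal) < 0.01   # mean close to the nominal result
assert 0.0 < spread < 0.1           # uncertainty propagated, modest size
```

Because each sample runs the full (nonlinear) system analysis, no sensitivity coefficients or linearization are needed, which is the point made at the end of this section.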
This approach has generated considerable interest within the nuclear science community, but it has also drawn some criticism. There are two main objections that have emerged. One is that the approach is largely based on nuclear modeling because it relies on FMC to combine model and experimental data; therefore the proposed TMC does not take into adequate consideration the available experimental data and their correlations, including integral data which are deemed important by applied data user communities. The second concern is that such an approach, while it gives potentially valuable information about system operating parameter uncertainties, should never be viewed as a substitute for detailed validation of well-defined and adopted nuclear data libraries that have been carefully screened by C/E comparisons with well characterized, accurate benchmark experiments. This latter, traditional approach has been the basis for judging the quality of nuclear data libraries for many decades so it is not likely to be abandoned in favor of a new and untested scheme such as TMC. There is some merit in this argument from the perspective of users, but it should also be realized that the TMC approach offers possibilities for exploring some of the more obscure, and potentially important, physical issues involved in the relationship between nuclear data and nuclear system performance, e.g., the effects of uncertainties in neutron angular distribution data, the effects of uncertainties in double differential cross sections (impact of energy and angular correlations), cross reaction and cross material uncertainty correlations, etc., that have been inaccessible to investigation by more traditional approaches. Moreover, TMC avoids the linearization approximation required by traditional sensitivity methods.

A future approach: UMC+TMC
An approach which merges the concepts of UMC and TMC is currently under development by the present authors and their collaborators, A. Koning and D. Rochman. This concept will facilitate the production of evaluated nuclear data libraries that incorporate both nuclear model results and experimental data in a rigorous manner, according to the UMC technique, while at the same time it will also offer the opportunity to bypass creation of explicit evaluated data libraries and proceed directly to the estimation of nuclear system operating parameters as embodied in the TMC method. The combined UMC+TMC approach draws upon the strengths of both component concepts and eliminates certain limitations that each of them experiences when employed alone.
Here is how it is envisioned that UMC+TMC would operate in practice: The experimental and model-calculated data are generated and assembled in the same manner described for the UMC method. A Markov chain of sets of possible solution values for the observable parameters would then be generated by sampling in accordance with the posterior probability distribution using the Metropolis-Hastings algorithm, again in the same manner as described for UMC. However, at this point the option is provided of either calculating the first and second moments of the probability distribution from the values generated by Monte Carlo, thus leading to the UMC outcome and a specific set of evaluated data, or using the collection of values, as described in the TMC approach, to calculate mean values and standard deviations of selected observable parameters for a chosen nuclear system. By this means it would be possible to overcome the objection voiced by some individuals that the pure TMC approach is too heavily dependent on nuclear modeling and does not properly consider the available differential and integral experimental data and known experimental correlations. It would also be possible to incorporate estimates for model defects within both the UMC and UMC+TMC approaches and thereby deflect any potential criticism that this source of uncertainty cannot be considered by these evaluation methods.
To date, the UMC+TMC approach has not been demonstrated, but there appear to be no foreseen technical impediments to doing this.

Summary
This paper has provided a very brief overview of several contemporary methods used to produce, and in one case to use, evaluated fast-neutron nuclear reaction data. An essential aspect of each of these methods is the gathering, and possibly adjustment, of all available experimental and theoretical (model-calculated) nuclear data that are relevant to the problem in question. If the input data are of poor quality, then no evaluation approach can yield good quality results. However, when good quality information is available it is important that it be used in a rigorous and optimal manner. This is the goal behind the development and implementation of all the methods discussed here. It is anticipated that attention will continue to be devoted to developing and refining nuclear reaction data evaluation methods, and that these new evaluation tools will lead to reduction of the uncertainties in the nuclear data for applications.