International criticality benchmark comparison exercise in support of nuclear data validation

The objective of this paper is to present the main results of the keff inter-comparison exercise performed in 2019-2022 by LANL, LLNL, ORNL and IRSN on a set of common benchmarks. The paper describes the methodology used for the inter-comparison and presents the set of selected cases along with the main tendencies that could be attributed to nuclear data. In addition, it identifies some issues in the processing and interpretation of nuclear data by the various codes of the inter-comparison.


Introduction
The development of bias estimation methodologies has been a great tool for data uncertainty assessments. In particular, keff data and their uncertainties are becoming paramount for criticality safety and reactor physics assessments. Moreover, the adjustment of cross sections implemented to match experimental keff within their uncertainty margins is important to produce the requested bias and associated uncertainty. A computation error on keff can strongly influence the calculated bias and its associated uncertainty. Therefore, it is highly recommended to have a high level of confidence in the calculated keff values. Many Monte Carlo and deterministic codes and various nuclear data libraries are available for assessing keff. Since the bias due to the Monte Carlo simulation is relatively small, the biases in a calculation are mostly connected to the nuclear data libraries.
The initial motivation arose in 2018 under the US DOE NCSP [1] to promote an inter-comparison exercise. The focus was mainly on keff values produced by various Monte Carlo codes over a three-year period.
The aims of the keff inter-comparison exercise were:
- To check the consistency between results from different codes using the same nuclear data libraries and thereby test the "independent modelling effect"; this task provided a rigorous basis for quality experiments with a view to validating nuclear data;
- If a lack of agreement exists, to find out the source of the discrepancy (Monte Carlo model, interpretation of the benchmark (detailed/simplified), differing revision number, processing of the nuclear data libraries) and correct the input decks so that all validation databases involve input decks describing the same benchmark model;
- To provide feedback on the nuclear data libraries (cross sections and thermal scattering) through the comparison of results using different nuclear data libraries and through trending analyses.
A progressive analysis was carried out from 2019 onwards, dealing with the keff comparison of critical benchmarks involving Pu and HEU (2019), LEU and IEU (2020), and MIX and 233U (2021) fissile media. Quite a few mistakes in input decks were highlighted, explaining some of the calculated discrepancies. Issues in nuclear data processing were also outlined, the processing by AMPX [2] of the incoherent elastic scattering cross sections of hydrogen being a good example.
The main conclusions of this intercomparison exercise are presented in this paper.

Laboratories and codes
The inter-comparison exercise is funded by the Nuclear Criticality Safety Program (NCSP) under the auspices of the US DOE. It gathers, in an international collaborative framework, four laboratories (LANL, LLNL, ORNL and IRSN) with the aim of providing quality validation databases of keff results from which information on nuclear data can be derived. Four neutron transport codes are involved in the determination of keff: MCNP 6.2 (LANL), COG 11 (LLNL), SCALE (ORNL) and MORET 5 (IRSN).

Nuclear data libraries
Three evaluations of nuclear data are considered within the inter-comparison: JEFF-3.3, ENDF/B-VII.1 and ENDF/B-VIII.0. Not all evaluations are used with all transport codes: SCALE and MCNP produce results only for the ENDF/B-VII.1 and ENDF/B-VIII.0 evaluations of nuclear data.

Selected cases
The selected benchmarks are reported in Table 1 for the four codes of the inter-comparison exercise. They all come from the ICSBEP Handbook [3]. It was decided to analyze only the cases that were common to the four codes. Only a few common cases were found for MIX, IEU and SPEC systems; as a result, the analysis mainly focuses on the other series of benchmarks.

Methodology
The comparison is based on keff results sent by the participants and gathered by the project leader. It was not always easy to know which revision of the benchmark was used for the modeling and which model (detailed/simplified) was used. A first step of the inter-comparison consisted in comparing results between the four codes based on the ENDF/B-VII.1 library, taken as a reference. This allowed issues to be identified independently of the nuclear data and corrected. The observed differences in keff values could originate from:
- an error of modeling in the input decks,
- the use of different revisions of the benchmark,
- an inconsistency in the case numbers due to a renumbering of cases by one of the teams,
- the use of different models (detailed/simplified).
Once the input decks were corrected, and assuming that the physical models implemented in the Monte Carlo codes were not a source of significant differences in keff values, the remaining keff differences could only be attributed to nuclear data effects, which in turn could be due to differences in the processing of the data. Indeed, different tools were used for processing nuclear data: NJOY [4] for MCNP [5], GAIA 1.1 (NJOY) for MORET [6], AMPX [2] for SCALE [7] and PREPRO [8] for COG [9], and it is interesting to know the impact of this step on the keff results.
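As a rough illustration of this first, code-to-code comparison step, the following sketch compares keff values obtained by the four codes with the ENDF/B-VII.1 library and flags differences above an arbitrary threshold. The data layout, the threshold and the numerical values are hypothetical and only serve to show the principle, not actual results of the exercise.

```python
# Minimal sketch of step 1 of the methodology: compare keff from the four codes
# for the same benchmark cases, all run with ENDF/B-VII.1, to flag code-to-code
# discrepancies that must come from modelling or processing rather than from
# the evaluation itself. All names, thresholds and values below are made up.
REFERENCE_CODE = "MCNP"      # any of the four codes can serve as reference
FLAG_THRESHOLD_PCM = 300     # illustrative threshold, not taken from the paper

# keff_by_code[case][code] -> calculated keff with ENDF/B-VII.1 (made-up values)
keff_by_code = {
    "PST-003-002": {"MCNP": 1.00012, "COG": 1.00041, "SCALE": 0.99987, "MORET": 1.00550},
    "HMF-025-001": {"MCNP": 0.99820, "COG": 0.99795, "SCALE": 0.99851, "MORET": 0.99808},
}

for case, results in keff_by_code.items():
    ref = results[REFERENCE_CODE]
    for code, keff in results.items():
        diff_pcm = (keff - ref) * 1.0e5  # keff difference expressed in pcm
        if code != REFERENCE_CODE and abs(diff_pcm) > FLAG_THRESHOLD_PCM:
            # A large code-to-code gap points to an input-deck error, a different
            # benchmark revision, or a processing issue, and is investigated.
            print(f"{case}: {code} differs from {REFERENCE_CODE} by {diff_pcm:+.0f} pcm")
```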
Then, in a third step, after discarding processing effects, the calculated keff values were compared with the benchmark keff and their associated uncertainties. If the absolute difference exceeded twice the combined Monte Carlo and experimental uncertainty, a bias could be identified; otherwise, one could conclude that the calculated keff and the benchmark keff are in good agreement.
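The acceptance criterion described above reduces to a simple check; the sketch below expresses it with a hypothetical helper function and illustrative uncertainty values.

```python
import math

# Minimal sketch of the bias criterion: a case is flagged as biased when
# |C - E| exceeds twice the combined Monte Carlo and benchmark (experimental)
# uncertainty. The function name and the numbers are illustrative only.
def is_biased(keff_calc, keff_bench, sigma_mc, sigma_exp):
    """Return (C-E, biased_flag) for the 2-sigma acceptance test."""
    c_minus_e = keff_calc - keff_bench
    sigma_combined = math.sqrt(sigma_mc**2 + sigma_exp**2)
    return c_minus_e, abs(c_minus_e) > 2.0 * sigma_combined

# Example: 20 pcm Monte Carlo standard deviation, 160 pcm benchmark uncertainty.
c_minus_e, biased = is_biased(1.00410, 1.00000, 0.00020, 0.00160)
print(f"C-E = {c_minus_e*1e5:+.0f} pcm, bias identified: {biased}")
```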
A trending parameter could be used to plot keff against the parameter of interest (for instance, reflector thickness or concentration of the fissile medium). This trending analysis offered the opportunity to identify issues in nuclear data that were emphasized by the sensitivity of keff to the parameter of interest.
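As an illustration of such a trending analysis, the sketch below fits C-E against a hypothetical reflector thickness with a simple linear regression; the data points are invented for the example and do not come from the benchmarks discussed in this paper.

```python
import numpy as np

# Minimal sketch of the trending analysis: fit C-E against a trending parameter
# (here, a hypothetical reflector thickness in cm) to see whether the residuals
# drift with the parameter, which would point to a cross-section issue for the
# reflector material. Data values are illustrative placeholders.
thickness_cm = np.array([1.27, 2.54, 5.08, 10.16, 20.32])
c_minus_e_pcm = np.array([150.0, 90.0, 10.0, -120.0, -310.0])

slope, intercept = np.polyfit(thickness_cm, c_minus_e_pcm, 1)
print(f"Trend: {slope:+.1f} pcm per cm of reflector (intercept {intercept:+.0f} pcm)")
# A slope significantly different from zero (relative to the scatter of the
# points) suggests a nuclear-data trend worth investigating.
```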

Analysis of results
The keff calculations were run targeting a Monte Carlo standard deviation of less than 0.00020 (20 pcm). This standard deviation is not reported in the following figures. However, the 1σ experimental uncertainty is indicated with a red dotted line.

Selected cases
Comparisons of the calculated keff to the benchmark keff for various fissile media are reported in Fig. 1 to Fig. 4. For Pu systems, generally good agreement is obtained, the C-E results remaining within the 1σ range of experimental uncertainties. Moreover, the newer JEFF-3.3 and ENDF/B-VIII.0 evaluations lead to results closer to the benchmark. An issue in the MORET input deck of PST-003-002 was also identified; the input was subsequently corrected. Finally, no resulting trend was identified.
For HEU systems, generally good agreement at the 2σ level between the calculated keff and the benchmark keff is obtained, except for experiments with Mo diluent (HMF-093) and experiments involving a vanadium reflector (HMF-025). One can notice the strong effect of the nuclear data library for HMF-025 (vanadium reflector) and HMF-011 (polyethylene reflector).
For 233U systems, a tendency to underestimate keff can be pointed out for experiments in the epithermal energy range (INTER), with a significant nuclear data library effect.
For LEU systems, generally good agreement between the calculated keff and the benchmark keff is observed, except perhaps for LCT-010-001 (lead reflector) and LST-002-002. For LCT-010-001 (lead reflector), a significant effect of the nuclear data library can be seen.

Analysis of experiments involving plutonium in thermal energy spectrum
A series of experiments involving plutonium nitrate solutions (PST-041) is analyzed in this section with the concentration of plutonium as the trending parameter. The 240Pu content of the plutonium solution is 3 %. The corresponding C-E results are plotted against the plutonium concentration in Fig. 5.

Vanadium and nickel reflectors
C-E results for vanadium and nickel reflectors are reported in Fig. 6 and Fig. 7.

Beryllium and Unat reflectors
C-E results for beryllium and natural or depleted uranium reflectors are reported in Fig. 8 and Fig. 9. Concerning the uranium reflector, one can notice a tendency for keff to decrease with the reflector thickness for all evaluations. The results nevertheless remain within the 2σ range of experimental uncertainties. One can also notice a significant library effect between the ENDF/B-VIII.0 results and the other ones. The ENDF/B-VIII.0 evaluation leads to results closer to the benchmark keff values.

Tungsten-reflected and ZEUS experiments
C-E results for tungsten-reflected experiments and for ZEUS experiments involving HEU disks moderated by graphite and reflected by copper are reported in Fig. 10 and Fig. 11. Concerning the tungsten reflector cases, the three evaluations are in good agreement. However, a tendency to overestimate keff can be pointed out, even if C-E remains within the 2σ range of experimental uncertainties. This tendency is directly linked to the increasing sensitivity to the capture cross sections of tungsten.
Concerning the ZEUS experiments (HMI-006), a tendency to overestimate keff can be observed as the EALF value increases. Better agreement with the benchmark keff is obtained with the JEFF-3.3 evaluation of nuclear data. Both the uranium and copper (reflector) cross sections are responsible for this significant improvement.

Issues due to processing of nuclear data
The inter-comparison exercise also offered the opportunity to highlight issues in the processing of cross sections. Two of them are illustrated in Fig. 12 and Fig. 13. As can be seen in Fig. 12, the KENO V.a (SCALE) results with ENDF/B-VII.1 are not in good agreement with those of the other codes. The issue lies in the processing with AMPX of the incoherent elastic contribution to the thermal scattering cross section of polyethylene [10]. For a thick polyethylene reflector, as in Fig. 12, a keff difference of about -400 pcm can be observed. This has been corrected with a patched version of the SCALE continuous-energy data.
Other processing issues relating to the thermal scattering cross sections of water can be suspected. Looking at Fig. 13, one can see that a significant difference exists between the COG results and those of MORET and SCALE for experiments involving lattices of UO2 rods (LCT) in light water. This difference is only observed for the ENDF/B-VII.1 and ENDF/B-VIII.0 evaluations, since the thermal scattering law (TSL) data of water are generated, for these evaluations, with LLNL's own internal method based on the ENDF-6 File 7 format, contrary to what is done with the NJOY code for the MCNP and MORET codes. Indeed, ACE data from the NEA processed with the NJOY code are used for the JEFF-3.3 evaluation.

Conclusion
The aim of this paper was to show the benefit for the nuclear data community of the inter-comparison exercise on keff values provided by four different codes. Through a selection of experimental cases common to the four codes of the NCSP partners, tendencies that can be attributed to nuclear data were identified for some materials acting as reflectors. These tendencies were plotted against a parameter of interest (reflector thickness, spectrum) and helped reveal issues in the capture or scattering cross sections of the materials, owing to the sensitivity of keff to these cross sections. Well-known issues in the processing of incoherent elastic thermal scattering data of polyethylene were also confirmed, as well as a significant difference in keff arising from different processing of the thermal scattering data of water.
It can also be noted that this work is complementary to the work done in the framework of the VaNDaL [12] project, whose aim is to ensure that existing validation databases are under quality assurance, and will also help improve the quality of the ICSBEP/DICE [3] database.
Finally, the need for a comparison of reactor physics kinetics parameters and shielding benchmark values has arisen quite recently, and it was proposed and accepted to extend the exercise until FY 2024.