Benchmarking and validation activities within JEFF project

The challenge for any nuclear data evaluation project is to periodically release a revised, fully consistent and complete library, with all needed data and covariances, and ensure that it is robust and reliable for a variety of applications. Within an evaluation effort, benchmarking activities play an important role in validating proposed libraries. The Joint Evaluated Fission and Fusion (JEFF) Project aims to provide such a nuclear data library, and thus requires a coherent and efficient benchmarking process. The aim of this paper is to present the activities carried out by the new JEFF Benchmarking and Validation Working Group, and to describe the role of the NEA Data Bank in this context. The paper will also review the status of preliminary benchmarking for the next JEFF-3.3 candidate cross-section files.


Introduction
The current evaluations of neutron-induced cross-sections are based on differential and integral experiments as well as nuclear reaction models. These evaluations may include compensating errors between different reaction cross-sections (capture, inelastic and/or elastic scattering, and fission), as well as in the average number of prompt neutrons and the prompt fission neutron spectra. These compensating effects are present in all evaluations, and the generally good performance of evaluated data in benchmarks is, to some extent, due to them. Nowadays, evaluation efforts are focused on identifying and correcting these effects while restoring good performance of the library in integral benchmarks [1]. The development of methodologies and computational tools, alongside the benchmarking and validation of new evaluated data, will be essential to obtain better accuracy and to support the design and operation of new nuclear facilities.
The JEFF project has assessed the needs for nuclear data improvements and brings together experts in different areas such as experiments, data evaluation, verification and compilation of the data, processing and benchmarking [2]. In the past, JEFF participants have used different benchmarking suites applied to criticality, fuel inventory, reactivity variation, shielding and activation, decay heat, etc. As a result, benchmarking consisted of summing isolated validation results, which gave poor coverage of the application space. In order to centralize, streamline and strengthen the benchmarking process, the JEFF Benchmarking and Validation Working Group (JEFF-B&V WG) has been created [3].
e-mail: oscar.cabellos@oecd.org
The NEA Data Bank has taken an increased role in the JEFF Project. In addition to carrying out the Project secretariat tasks, and as part of broader nuclear data services to participating countries, the NEA Data Bank also provides a number of services for the JEFF files, such as consistency checks, conversion to various pointwise or multigroup formats, file testing and benchmarking using open databases such as ICSBEP and SINBAD. To provide these services, the Nuclear Data Evaluation Cycle (NDEC) [4] platform is being developed and implemented at the Data Bank.
In this paper, benchmarking efforts from different participating institutions using the general-purpose test libraries JEFF-3.3T1 and T2 are presented. It can be concluded that good performance of the library is achieved.

NEA data bank: Tools and databases
The NEA Data Bank has supported several recent activities to enhance nuclear data services for participating countries. A comprehensive quality assurance (QA) process involving verification, testing and benchmarking relies on NEA tools and databases, which have been used to assess the production and selection of JEFF-3.3 file candidates. A brief summary of these tools follows (see Fig. 1):
• JANIS (JAva-based Nuclear Information Software) was developed by the NEA Data Bank to facilitate the visualization and manipulation of nuclear data. It gives access to evaluated nuclear data libraries, such as ENDF, JEFF, JENDL and TENDL, and also to experimental nuclear data (EXFOR) and their bibliographical references (CINDA). The on-line JANIS Books provide compilations of evaluated and experimental cross-sections for a number of evaluated libraries, nuclear reactions and associated reaction products.
• The Database for the International Criticality Safety Benchmark Evaluation Project (DICE) contains 567 evaluations representing 4874 critical, near-critical or subcritical configurations in a standardised format that supports criticality safety analysis. This database is easily used to validate calculation tools and to perform benchmarking that assesses the performance of evaluated nuclear data libraries. DICE also gives the user access to sensitivity coefficients (percent changes of k-effective due to an elementary change of basic nuclear data) for the major nuclides and nuclear processes in 30-group and 238-group energy structures. These data are currently available for only about 75% of the experimental configurations.
• The Nuclear Data Sensitivity Tool (NDaST) is a Java-based software designed to perform calculations on nuclear data sensitivity files for benchmark cases. These calculations are an estimation of the impact of nuclear data perturbations on the computed case results and/or a calculation of the uncertainty in the computed results due to evaluated nuclear covariance data. This gives nuclear data evaluators a simple and fast way to test the impact of revisions across a wide set of benchmarks. The tool has recently been applied [5] to estimate the impact of improved CIELO files (individual cross-sections, e.g., elastic, inelastic, fission, capture or nu-bar) against the computed or reference case, ENDF/B-VII.1, in around 1500 ICSBEP benchmarks taken from Ref. [6].
• The Nuclear Data Evaluation Cycle (NDEC) [4] is a systematised workflow for handling and diagnosing the quality of nuclear data through the different steps involved in the production of nuclear data libraries. These steps are the verification, processing, experimental differential validation and experimental integral benchmarking of evaluated nuclear data files. A criticality validation suite of 123 ICSBEP benchmarks is used to assess the reactivity impact of changes to the associated nuclear data libraries [7].
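The two kinds of calculation attributed to NDaST above can be illustrated with a minimal sketch. It assumes relative sensitivity coefficients per energy group (percent change of k-effective per percent change of a cross-section), a first-order perturbation estimate, and the standard "sandwich" rule for covariance propagation. The three-group numbers and the function names are hypothetical; they are not taken from DICE, and this is not NDaST's implementation.

```python
import math


def keff_change_percent(sens, rel_change_percent):
    """First-order estimate: sum over groups of S_g * (delta sigma / sigma)_g."""
    if len(sens) != len(rel_change_percent):
        raise ValueError("sensitivity and perturbation vectors differ in length")
    return sum(s * d for s, d in zip(sens, rel_change_percent))


def keff_uncertainty_percent(sens, rel_cov):
    """Sandwich rule: sqrt(S^T C S), with C a relative covariance matrix in %^2."""
    n = len(sens)
    if len(rel_cov) != n or any(len(row) != n for row in rel_cov):
        raise ValueError("covariance matrix must be n x n")
    variance = sum(sens[i] * rel_cov[i][j] * sens[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)


# Hypothetical 3-group data (illustrative values only).
sens = [0.10, 0.05, 0.02]    # % change in keff per % change in cross-section
delta = [2.0, -1.0, 0.0]     # % change in the cross-section, per group
cov = [[4.0, 1.0, 0.0],      # relative covariance of the cross-section, %^2
       [1.0, 9.0, 0.0],
       [0.0, 0.0, 1.0]]

dk = keff_change_percent(sens, delta)
unc = keff_uncertainty_percent(sens, cov)
print(f"estimated change in keff: {dk:.3f} %")
print(f"keff uncertainty from covariance data: {unc:.3f} %")
```

Because both quantities are simple sums over pre-computed sensitivities, they can be evaluated for many benchmarks without rerunning any transport calculation, which is what makes this kind of screening fast.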

JEFF benchmarking and validation working group
A challenge for the JEFF Project is to issue a fully consistent library, "complete" with all needed data and associated covariance information, which can be reliably used for a large spectrum of applications, and which has a proven performance level equal to or better than that of the present JEFF-3.2 library. For the JEFF-3.2 evaluation, many benchmarking activities were carried out on criticality, fuel inventory, reactivity variation, shielding and activation, decay heat, etc. However, these activities amounted to summing individual contributions, which resulted in poor coverage of the evaluation. In addition, the lack of a single consolidated reference document made this work difficult to cite and lowered the overall quality of its documentation.
Consequently, JEFF benchmarking activities have been revised to be more coherent and efficient, with the NEA Data Bank taking an increased role in servicing the Project, in particular file testing and benchmarking activities. To fulfil this purpose, the JEFF Benchmarking and Validation (JEFF-B&V) Working Group has been created.
The main goals of the JEFF-B&V WG are to exchange ideas on best practices and procedures for benchmarking, and to perform inter-comparisons among the different methodologies and validation suites used by its participating members. The working group will exchange and cross-check benchmark input decks, discuss the benchmarking suites to be used and the selection of pertinent validation cases for all nuclides in the library, and provide a multi-purpose validation suite that is as complete as possible. Institutions such as CIEMAT, ENEA, SCK·CEN, KIT, UKAEA, IRSN, KAERI, JSI, CEA, PSI, JRC, UPM, NRG and NEA have agreed to join efforts to provide thorough benchmarking for upcoming JEFF candidate files.
The centralization of these tasks relies on the NEA Data Bank to implement and co-ordinate a comprehensive process involving verification, testing and benchmarking tasks according to well-defined criteria, while assessing the needs for benchmarking efforts in participating institutions and helping streamline and rationalize these activities, in particular testing and benchmarking.
Here, the NDEC application [4], under development by the NEA, performs automated testing and diagnosis with human-readable outputs, aiding the selection of better JEFF files. At the same time, the DICE/IDAT benchmark databases are used through NDaST to provide a means of easily selecting benchmarks on the basis of their sensitivity coefficients and of quickly assessing the impact of nuclear data library changes on benchmark calculations.
The JEFF Project has encouraged users and parties interested in the JEFF evaluation to identify the benchmarking data that the new releases should comply with, and to provide sensitivity analyses that reflect the performance and deficiencies of the new evaluation.

Benchmarking and validation of JEFF-3.3
In this section, the status of benchmarking activities and preliminary results for the upcoming JEFF candidate cross-section files are reviewed. The major changes in JEFF-3.3T1/2, which contains a total of 559 isotopes, are summarized in Ref. [8]. Notably, it includes completely new evaluations of 22 isotopes (235,238U, 239Pu, Hf isotopes, etc.), 92 isotopes kept unchanged from JEFF-3.2 and 304 new isotopes from TENDL-2015. Feedback, bugs and deficiencies in the JEFF-3.3 beta version were reported and reviewed by the JEFF community.

General validation
New activities have been initiated at the NEA to validate JEFF-3.3 against standard, evaluated, microscopic and integral cross-sections. A more thorough analysis has been presented in Ref. [9] as an essential part of the activities to validate the TENDL library.

IRSN - Benchmarking [10]
A first suite of benchmarks was created to test and assess new nuclear data evaluations (e.g., 235U, 239Pu and 16O). The DICE database associated with the ICSBEP Handbook is used for this purpose, with three selection criteria: high sensitivity to the new evaluations, low sensitivity to the "background" cross-sections, and discarding experiments with excessively high experimental uncertainty. Figure 2 shows the IRSN suite used to test the new 235U evaluation. Large but discrepant values were observed in the HMI-006 series (Fig. 3); part of this effect can be attributed to the change in the 235U capture cross-section in the 1-100 keV range. The HMI-006 series has EALF values between 4 and 80 keV. Good results are obtained with JEFF-3.3T1/2. Additional analysis in Ref. [11], in which the graphite and copper files were replaced with other evaluations, concluded that these files might have important sensitivity in these cases. Therefore, additional analyses are needed to select the best files.

NEA - Benchmarking [7]
The second set of validation suites was created to provide a general indication of the overall performance of a given library. In addition, these suites can help to identify areas where improvements are needed, or unexpected discrepancies arising from changes to nuclear data. The benchmarks are divided according to the fissile material that produces the majority of fissions: Pu, HEU, IEU, LEU, U233 and MIX. Benchmarks are further classified by neutron spectrum (fast, intermediate and thermal), enrichment, reflector thickness or solution content.
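The grouping of benchmarks by fissile material and neutron spectrum can be sketched from the short case identifiers used in this paper (e.g., HMF-73, PST-9). The sketch below assumes the usual ICSBEP naming convention, in which the first letter of the short identifier encodes the fissile material and the third letter the spectrum; the `SPEC` and `mixed` entries come from that convention rather than from the list above. The regular expression and the `classify` function are hypothetical helpers, not part of DICE or NDEC.

```python
import re
from collections import Counter

# Assumed ICSBEP short-identifier convention: <fissile><form><spectrum>-<number>
FISSILE = {"P": "PU", "H": "HEU", "I": "IEU", "L": "LEU",
           "U": "U233", "M": "MIX", "S": "SPEC"}
SPECTRUM = {"F": "fast", "I": "intermediate", "T": "thermal", "M": "mixed"}

_SHORT_ID = re.compile(r"^([A-Z])([A-Z])([A-Z])-?(\d+)$")


def classify(short_id):
    """Return (fissile material, spectrum, evaluation number) for a short ID."""
    match = _SHORT_ID.match(short_id.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised benchmark identifier: {short_id!r}")
    fissile, _form, spectrum, number = match.groups()
    if fissile not in FISSILE or spectrum not in SPECTRUM:
        raise ValueError(f"unknown category letters in: {short_id!r}")
    return FISSILE[fissile], SPECTRUM[spectrum], int(number)


# Identifiers quoted in this paper.
cases = ["SMF-8", "HST-4", "PST-9", "HMF-73", "HMI-006"]
groups = Counter(classify(c)[:2] for c in cases)
for (fissile, spectrum), count in sorted(groups.items()):
    print(f"{fissile:5s} {spectrum:13s} {count}")
```

Grouping results this way makes it easier to see whether a trend in calculated k-effective is tied to one fissile material or one spectral region rather than to the library as a whole.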
NEA's criticality validation suite contains the 119 KAERI cases and 4 additional cases for testing purposes: 237Np (SMF-8), heavy-water solutions (HST-4), a very thermal Pu solution (PST-9 series) and the unmoderated ZEUS benchmark (HMF-73) for Cu cross-sections in the fast energy range. Figure 5 shows good performance of JEFF-3.3T1 and T2. Calculations are performed with MCNP6.1.

PSI - Benchmarking [13]
A third type of validation suite is defined to assess specific applications.

CEA - Benchmarking
CEA's suite is focused on thermal and fast systems (Fig. 6, only for uranium cases). The JEFF-3.3T1 evaluation is compared with JEFF-3.2 on a set of criticality experiments (ICSBEP, CEA/EOLE LWR mockup and CEA/MASURCA SFR mockup, ZPPR) using the TRIPOLI-4 code. Generally better performance is found for the Pu cases.

Shielding and Benchmarks for fusion application
The SINBAD database is used as a high-quality reference set of benchmarks for validating nuclear data used in radiation transport and shielding [18]. New iron cross-section evaluations (56Fe-CIELO and partial new evaluations of 54,57,58Fe) have been tested against recently re-evaluated SINBAD benchmarks such as ASPIS IRON-88. The validation revealed generally good performance of the new iron data [19,20]. The JEFF/Fusion group is working on the reanalysis of the HCPB and HCLL mockup experiments with recent nuclear data evaluations. In general, the current state-of-the-art nuclear data libraries FENDL-3.1b, ENDF/B-VII.1 and JEFF-3.2 showed similar results [21].
The Cu benchmark experiment, irradiated with 14 MeV neutrons at the Frascati Neutron Generator (FNG), is used to check the status of the nuclear data libraries for copper, in particular the latest release (JEFF-3.3T2) but also earlier data sets such as JEFF-3.1.1/3.2 and FENDL-3. C/E values of irradiated activation foils at different penetration depths, as well as neutron and gamma-ray spectra, were compared. The main conclusions of this work are: 1) an underestimation of up to 15% is found for high-threshold reactions using JEFF-3.1.1 and FENDL-3, and of up to 20% using JEFF-3.2, with a good improvement in JEFF-3.3T2; 2) the underestimation of the low-threshold reaction is more severe (30%), with a decreasing trend versus penetration depth, and JEFF-3.2 gives the largest underestimation; 3) non-threshold reactions show an even larger underestimation (60%), with the C/E decreasing as a function of penetration depth, and all the libraries produce the same results. It was concluded that the results of the Cu experiment call for a deep revision/re-evaluation of the copper cross-sections [22,23].
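The C/E comparison used above reduces to simple arithmetic, sketched here for a single reaction measured at several foil positions. The depths and reaction rates are hypothetical values chosen only to illustrate the calculation; they are not FNG data, and the function names are illustrative.

```python
def c_over_e(calculated, measured):
    """Ratio of calculated to experimental value for one foil position."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return calculated / measured


def underestimation_percent(ce):
    """Positive when the calculation underestimates the measurement."""
    return (1.0 - ce) * 100.0


# Hypothetical reaction rates (arbitrary units) at increasing penetration depth.
depths_cm = [5.0, 20.0, 35.0]
calculated = [0.90, 0.40, 0.14]
measured = [1.00, 0.50, 0.20]

ce_values = [c_over_e(c, m) for c, m in zip(calculated, measured)]

# A C/E that falls monotonically with depth is the trend reported above.
decreasing_with_depth = all(a > b for a, b in zip(ce_values, ce_values[1:]))

for depth, ce in zip(depths_cm, ce_values):
    print(f"depth {depth:5.1f} cm  C/E = {ce:.2f}  "
          f"underestimation = {underestimation_percent(ce):.0f} %")
print("C/E decreases with depth:", decreasing_with_depth)
```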
Within the JEFF Fusion WG two extensions to the MCNP code have been developed, MCSEN to carry out sensitivity analysis, and MCUNED to handle deuteron data libraries in the transport simulation. Extensive usage, validation and testing of these tools on well-defined benchmarks have been performed [24,25].

Burnup benchmarks
The performance of nuclear reaction data should be assessed in different depletion scenarios using powerful burnup/depletion codes [26]. A PWR burnup pin-cell benchmark proposed by UAM [27] has been used to calculate differences between evaluations. The keff calculated with JEFF-3.3T1 is ∼500 pcm lower than the JEFF-3.1.2 result at 60 GWd/t [28]. This effect might be mainly due to the new evaluation of 239Pu. More 239Pu disappears during burnup with JEFF-3.3T1 than with JEFF-3.1.2; in parallel, the products of 239Pu capture (240Pu, 241Pu and 242Pu) appear in larger concentrations with JEFF-3.3T1. The one-group integrated 239Pu capture cross-section with JEFF-3.3T1 is ∼1.5% larger than the value with JEFF-3.1.2.
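Differences between libraries such as the one quoted above are usually expressed in pcm (1 pcm = 10^-5). The paper does not state whether the ∼500 pcm refers to a difference in keff or in reactivity, so the sketch below shows both conventions. The two keff values are hypothetical and chosen only so that the keff difference is 500 pcm; the function names are illustrative.

```python
def delta_k_pcm(k_test, k_ref):
    """Difference in the multiplication factor, in pcm."""
    return (k_test - k_ref) * 1.0e5


def delta_rho_pcm(k_test, k_ref):
    """Difference in reactivity rho = (k - 1) / k, in pcm."""
    if k_test <= 0 or k_ref <= 0:
        raise ValueError("multiplication factors must be positive")
    return (1.0 / k_ref - 1.0 / k_test) * 1.0e5


# Hypothetical end-of-irradiation values for a reference and a test library.
k_reference = 0.90000
k_candidate = 0.89500

print(f"delta k   = {delta_k_pcm(k_candidate, k_reference):8.1f} pcm")
print(f"delta rho = {delta_rho_pcm(k_candidate, k_reference):8.1f} pcm")
```

The two conventions differ noticeably when keff is far from unity, so it helps to state which one is used when comparing results from different institutions.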
Finally, fuel assembly decay heat benchmarks are a valuable source of information for testing long-lived products and actinides. Preliminary results for the Ringhals 2&3 measurements on UO2 assemblies are presented in Ref. [29]; the JEFF-3.3T1 calculated values overpredict by 3-4% with respect to JEFF-3.2.

Activation benchmarks
A review of current decay data evaluations has shown the importance of identifying and filling pressing gaps, particularly for decay heat calculations, where beta decay and gamma contributions play an important role (e.g., TAGS, spectra or missing components of the spectra, particularly at high gamma energies) [30].

Other nuclear data validation
Several activities focused on the validation of effective kinetics parameters (βeff and ) based on IRPhE experiments [31] are still in progress; more effort is needed to analyse the effect of switching from 6 to 8 families of neutron precursors.

Reactor benchmarks and new designs
New reactor concept designs such as ASTRID (Advanced Sodium Technological Reactor for Industrial Demonstration) and MYRRHA (Multi-purpose hYbrid Research Reactor for High-tech Applications) have been used for testing purposes. In ASTRID, the impact of new 238Pu and 239Np files is assessed. Pb and Bi files are tested in the MYRRHA design [32].
In the framework of the JEFF project, new methodologies to propagate nuclear data covariances have been developed. The SANDY code [37], developed by SCK·CEN, has demonstrated high potential for assessing the response uncertainty in different nuclear systems.

Conclusion
The JEFF-B&V WG has addressed the benchmarking efforts of the institutions participating in the JEFF Project, whose benchmarking activities have been revised to be more coherent and efficient. This paper has collected the benchmarking activities performed by both the evaluation community and the user community within the JEFF project. Benchmarks from the open literature (such as the ICSBEP, SINBAD or IRPhE databases) or reactor experiments have been used in this activity. This work has given valuable feedback and trends for improving the nuclear data evaluations before the release of the JEFF-3.3 library.
The JEFF Project relies on the NEA Data Bank to implement and co-ordinate a comprehensive quality assurance (QA) process involving verification, testing and benchmarking tasks according to well-defined criteria. Errors and inconsistencies in the JEFF library are systematically analysed and resolved with the NDEC tool, which provides processed and verified files according to the processing standards recommended by the JEFF Processing & Verification WG. The DICE and NDaST tools have proved to be breakthrough methodologies with a large impact on reducing the time needed to develop a good evaluation: first to select the sensitive benchmarks and then to fine-tune evaluations in different energy ranges.
The assignment of the experimental correlations between benchmarks might be useful to assess the importance of selecting/removing experiments in the benchmarking activities [38]. The implementation in DICE of an extended set of correlations in ICSBEP experiments will lead to a better understanding of the real performance of nuclear data libraries.
It can be concluded that the first assessment of the JEFF-3.3T1/2 nuclear data library has shown good performance, although more effort is still needed to improve files such as Cu. New benchmarking activities are foreseen to assess the impact of new evaluations such as the 16O-IRSN and 238U-CIELO files.