IDENTIFICATION OF REACTOR PHYSICS BENCHMARKS FOR NUCLEAR DATA TESTING: TOOLS AND EXAMPLES

Measurements of reactor physics quantities, such as the reactivity worth of materials, spectral ratios of cross sections, and reactivity coefficients, have ensured that reactor physics codes can accurately predict the behavior of nuclear reactor systems. These measurements were critical in the absence of sufficiently accurate differential data, and underpinned the need for experiments through the 1950s, 60s, 70s and 80s. Data from experimental campaigns were routinely incorporated into nuclear data libraries, either through changes to general nuclear data libraries or, more commonly, in the local libraries generated by a particular institution or consortium interested in accurately predicting a specific nuclear system (e.g. fast reactors) or specific parameters (e.g. fission gas release, yields). Over the last three decades, the model has changed. Access to computing power and Monte Carlo codes rose dramatically in tandem. The Monte Carlo codes were well suited to computing k-eff and, owing to the availability of high-quality criticality benchmarks, these benchmarks were increasingly used to test the nuclear data. Meanwhile, there was a decline in the production of local libraries, as new nuclear systems were not being built and the existing systems were considered adequately predicted. The cost-to-benefit ratio of validating new libraries relative to their improved prediction capability became less attractive. These trends have continued. It is widely acknowledged that the checking of new nuclear data libraries is highly skewed towards testing against criticality benchmarks, and that many high-quality reactor physics benchmarks are ignored during the testing and production of general-purpose nuclear data libraries.
However, continued increases in computing power, advances in methodology (generalized perturbation theory, GPT), and the additional availability of reactor physics experiments from sources such as the International Handbook of Evaluated Reactor Physics Benchmark Experiments should result in better testing of new libraries and ensure their applicability to a wide variety of nuclear systems. Often they have not. Leveraging the wealth of historical reactor physics measurements represents perhaps the simplest way to improve the quality of nuclear data libraries in the coming decade. Resources at the Nuclear Energy Agency can be used to interrogate the reactor physics experiments handbook and identify the available benchmarks, expediting their use in verification and validation. Additionally, high-quality experimental campaigns that should be examined in validation are highlighted to illustrate potential improvements in the verification and validation process.


INTRODUCTION
Neutronics predictions underpin the safety, economics, and operation of reactor systems. Reactor physicists are concerned with the applicability and accuracy of the modelling and simulation tools that characterize the steady-state and transient behavior of the reactor core. Questions such as whether the system will be critical (k-eff), how often to refuel, and what to expect for the local power peaking, conversion ratio, response of the in-core detection and shutdown systems, void reactivity, Doppler feedback, isotopic composition of used nuclear fuel, and potential for medical isotope production are all central to the design and operation of nuclear systems. Analysts require tools that have undergone rigorous verification and validation. The consequences can be large, as uncertainties and errors in code predictions can potentially cost billions of dollars [1][2][3].
Many neutronics tools and methods perform adequately for existing systems. Despite improvements to methods and data, the cost-benefit ratio of the validation process is prohibitive for adopting these updates; a case in point is that the nuclear data library of reference for many applications in the UK is JEF-2.2, released in 1992. For new tools to be adopted, biases and uncertainties need to be determined and documented to at least the standard of the previous versions, which requires significant benchmarking against available experimental data. Furthermore, while outside the scope of this paper, as the codes are coupled to downstream applications such as fuel performance and thermal-hydraulics, the consequences of new biases and uncertainties will not be evident from neutronics testing alone. The lack of mechanisms to quickly provide feedback, in order to communicate and resolve differences, is a key obstacle to overcome.
During the golden age of reactor design (the 1960s), it was common to build mockups of reactor systems, followed by hundreds of measurements characterizing the system and its feedback coefficients. Neutronics designs became increasingly complex (fast reactors, molten salt systems, thorium systems, accelerator-driven systems, etc.), and mockups determined the applicability of the simulation tools of the day, which at the time were quite uncertain. Validation was big business, and methods to optimize the process were of the utmost economic importance. Perturbation theory, outlined by Wigner in 1945 [4], was extensively developed, expanded, and applied by Usachev [5], who used these methods to improve experimental design [6]. As perturbation theory methods were increasingly adopted in the 1970s and 80s, they were widely applied to improve the interpolation and extrapolation of experimental results based on the similarity of mockups to the reactors being designed, known as representativity. Usachev wisely recognized the value of sensitivity and uncertainty methods for improving communication between nuclear data evaluators and users [7], which he noted was of increasing importance as discipline specialization increased. During this era, the biases and uncertainties arose from a mix of analytic approximations and nuclear data, which limited the ability of perturbation theory methods to directly improve nuclear data, especially in thermal systems. Furthermore, to isolate the impact of nuclear data, credible uncertainties were needed. In the 1970s and earlier, covariance data were rarely included in the major evaluated nuclear data libraries. Occasionally covariances were available, but their applicability was limited. Nuclear data evaluators at the time were not universally enthusiastic about adding this information, as the following excerpt from Ref. [8] illustrates.
The discussion of uncertainties within ENDF/B spurred a vigorous debate circa 1974. CSEWG members were heard to say, "Uncertainties were too difficult to assign, and virtually impossible to assign over a complete range of data," and "Even if assigned, uncertainties would never be used. There simply was not sufficient interest to justify the enormous expense to implement uncertainties in reactor physics codes." A consensus on the path forward for nuclear data covariances was far from being reached, although many efforts to generate covariances continued. At the same time, user interest in and the need for covariances increased. The landscape changed rapidly in the mid-2000s, and many sources of covariance data became available to reactor physics analysts [9][10][11]. In parallel with the availability of uncertainty estimates in nuclear data, progress in quantifying and reducing the uncertainties from deterministic methodology advanced greatly, with an uptake of Monte Carlo modelling and methods.
EPJ Web of Conferences 247, 10028 (2021) PHYSOR2020 https://doi.org/10.1051/epjconf/202124710028
Reactor systems designed prior to 2000 made limited use of Monte Carlo predictions, as in the 1990s only a few dozen processors could be used [12]. The uncertainties and errors from solving the Boltzmann equation, often with the two-step method of coupling cell and diffusion codes, were at least as great as the errors in the nuclear data used in the equation. Two-step methods are still the de facto standard in neutronics design; however, they are now commonly verified against Monte Carlo codes. Such code-to-code checking was initially done for static integral parameters (k-eff), but is increasingly done for local and time-dependent quantities, although it will be some time before deterministic methods are replaced [13]. As a result of verification against Monte Carlo methods, the approximations made in solving the Boltzmann equation can be well quantified and are of decreasing importance. Thus, improving the nuclear data should directly translate into reduced biases and uncertainties of advanced modelling and simulation tools.
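To illustrate how the sensitivity profiles and covariance data discussed above are typically combined, the following minimal Python sketch applies the first-order "sandwich" rule to estimate the nuclear data contribution to the uncertainty of a response, and computes a representativity factor between an application and a mockup. The three-parameter sensitivity vectors and the diagonal covariance matrix are hypothetical values chosen purely for illustration, and the function names are assumptions that do not correspond to any particular code.

```python
import numpy as np


def sandwich_uncertainty(sens, rel_cov):
    """Relative standard deviation of a response: sqrt(S^T C S).

    sens    : relative sensitivity coefficients (dR/R per dsigma/sigma)
    rel_cov : relative covariance matrix of the nuclear data
    """
    s = np.asarray(sens, dtype=float)
    c = np.asarray(rel_cov, dtype=float)
    return float(np.sqrt(s @ c @ s))


def representativity(sens_app, sens_exp, rel_cov):
    """Correlation between application and experiment induced by shared data."""
    sa = np.asarray(sens_app, dtype=float)
    se = np.asarray(sens_exp, dtype=float)
    c = np.asarray(rel_cov, dtype=float)
    return float((sa @ c @ se) / np.sqrt((sa @ c @ sa) * (se @ c @ se)))


# Hypothetical data: three nuclear data parameters with 1%, 2% and 5%
# relative standard deviations and no correlations (an assumption).
cov = np.diag([0.01**2, 0.02**2, 0.05**2])
s_application = np.array([0.50, -0.10, 0.02])  # hypothetical reactor design
s_mockup = np.array([0.45, -0.12, 0.01])       # hypothetical mockup experiment

print("application uncertainty:", sandwich_uncertainty(s_application, cov))
print("representativity:", representativity(s_application, s_mockup, cov))
```

A representativity factor close to one suggests that the mockup is sensitive to the same nuclear data as the application, so its measured bias is more likely to carry over; real analyses would use multigroup sensitivity profiles and full covariance matrices rather than three scalar parameters.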
Switching tools and data is not a decision to be trivialized. For systems that already exist, measured data ensure that the biases in the modelling and simulation tools are well quantified for the observables in the system, although the sources of the biases might not always be well understood. Switching data libraries results in new, unquantified biases that may be detrimental and ultimately more impactful than the reduction in uncertainty. It is rightly considered risky to change the underlying nuclear data unless the change is accompanied by a verification and validation case. At the same time, the amount of measured data is inevitably far smaller than the number of reactor states, and advanced simulations based on more accurate data allow better interpolation and extrapolation away from the measured state points, which incentivizes the use of improved methods.
Given the above, it should be common practice for nuclear data evaluations to be vetted against the gauntlet of collected experimental data. In actuality, this is very much not the case [14,15], and it could be argued that initial integral validation efforts may even be less effective than in the past. Most nuclear data testing involves criticality experiments. This is done partly because: a) it was done before (tradition), b) the data are very good and well documented, with models and uncertainties (ICSBEP [16]), and c) it is relatively easy (fast, easy to interpret, and supported by tools to pick relevant benchmarks).

REACTOR PHYSICS EXPERIMENTAL BENCHMARKS FOR NUCLEAR DATA TESTING
As outlined in the previous section, nuclear data testing is skewed towards criticality benchmarks. For reactor physics benchmarks, the main source of high-quality benchmarks is compiled in the International Handbook of Evaluated Reactor Physics Benchmark Experiments [23]. The tables in the subsequent sections have been extracted from the Handbook using the IRPhEP Database and Analysis Tool (IDAT) [24], which can be used to interrogate the data contained within the Handbook.
The question can be asked: what type of experiments would be of greatest value to the nuclear data community? Recent efforts to answer this question in the context of nuclear data adjustment provide insight into a potential hierarchy of experiments for nuclear data testing. In the progressive incremental approach (PIA) [25], it is suggested that the following hierarchy be used for actinides in order to reduce compensating errors during nuclear data adjustment. The same order can be applied to prioritize the evaluation of experimental benchmarks and to direct nuclear data testing.

Fission spectral indices: sensitive to fission cross sections (but also to inelastic and fission spectrum, in
Of course, many other reactor physics experiments exist, often of excellent pedigree. However, in order for the nuclear data community to embrace these experiments, additional efforts are needed to transform the experimental results into benchmarks with well-defined uncertainties, as was done in the reactor physics handbook. Some high-quality experiments and experimental benchmarks are noted in the upcoming sections.
Finally, high-quality benchmarks are often not in the public domain. Restricted access to these benchmarks impedes the ability to improve nuclear data and tools. As institutions will eventually be pushed to adopt new tools and methods, the opportunity to efficiently integrate restricted benchmark data into the new tools will be missed. Eventually, questions will be asked by experts and regulatory bodies about the impacts of using antiquated information. To remedy this, it is suggested that, at a minimum, sensitivity profiles of the applications be made publicly available and collected. Existing tools [26] can then be used to provide feedback on the impact of nuclear data changes on these applications. This would improve adoption by industry and incentivize nuclear data projects.
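As a hedged illustration of how a published sensitivity profile could provide such feedback without access to the underlying benchmark model, the sketch below uses first-order perturbation theory to estimate the relative change in a response from relative changes in nuclear data. The profile entries, the library changes, and the function name are all hypothetical; this is not the interface of the tools cited in [26].

```python
def predicted_relative_change(sensitivities, relative_changes):
    """First-order estimate dR/R ~ sum_i S_i * (dsigma_i / sigma_i).

    sensitivities    : {(nuclide, reaction): relative sensitivity coefficient}
    relative_changes : {(nuclide, reaction): relative change in the data}
    Entries missing from relative_changes are treated as unchanged.
    """
    total = 0.0
    for key, coefficient in sensitivities.items():
        total += coefficient * relative_changes.get(key, 0.0)
    return total


# Hypothetical one-group sensitivity profile of k-eff for some application.
profile = {
    ("U-235", "fission"): 0.50,
    ("U-238", "capture"): -0.10,
    ("H-1", "elastic"): 0.20,
}

# Hypothetical changes between two library releases: fission up by 1%,
# capture down by 2%, hydrogen scattering unchanged.
library_update = {
    ("U-235", "fission"): 0.01,
    ("U-238", "capture"): -0.02,
}

dk_over_k = predicted_relative_change(profile, library_update)
print(f"predicted change in k-eff: {dk_over_k * 1e5:.0f} pcm")
```

In practice the profiles would be energy dependent and the estimate is only valid for small changes, but even this simple form lets a data evaluator flag an update that is likely to shift a restricted application.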

Criticality
Understandably, nuclear codes and data can predict k-eff well for existing systems, for which there is a wealth of data. In reactor physics, the prediction of k-eff affects aspects such as the critical mass and fueling rate of the reactor. For example, during the development of the ACR-1000 [27], a next-generation heavy water reactor, changing the nuclear data library from ENDF/B-VI.8 to ENDF/B-VII.0 was found to lower the anticipated burnup, and a significant redesign of the in-core reactivity devices was required to maintain the advertised burnup of the fuel. While k-eff is important for reactor systems and their behavior, it is not addressed further here, as it is already well tested for the major nuclear data libraries and new simulation tools.

Spectral Characteristics
Spectral characteristics can include ratios of reaction rates from activation foils, or detailed determination of the neutron spectrum from devices such as proportional counters. Reactor systems are often compared on the basis of their neutron spectrum as this quantity acts as a surrogate for the overall system properties.

Available Measurements
The reactor physics handbook has 233 spectral characteristics measurements, performed at 15 facilities, as shown in Table I. The majority of measurements are reaction rates relative to either Pu-239 fission or U-235 fission. Facilities such as BFS1 and ZPPR have multiple experimental benchmarks in which the configuration was changed specifically to alter the neutron spectrum. As illustrated in PIA, these measurements should be used to constrain any nuclear data adjustment process. The reactor physics community needs to prioritize the preservation of these experimental data and ensure they are made available to nuclear data testers in a convenient format. Many other high-quality spectral characteristics measurements exist, some in critical facilities currently in the ICSBEP Handbook. While there are many BFS and ZPPR benchmarks currently in the handbook, these flexible facilities have hosted numerous experimental campaigns, many with high-quality spectral characteristics measurements. The FCA reactor in Japan, similar in design to ZPPR, is also a potential source of high-quality data.
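As a minimal sketch of how such a measurement might be used in nuclear data testing, the Python example below forms a spectral index as a ratio of two reaction rates and then a calculated-to-experimental (C/E) ratio, combining relative uncertainties in quadrature. All numerical values are hypothetical, and the quadrature rule assumes the two quantities in each ratio are uncorrelated, which is often not strictly true for foils irradiated together.

```python
import math


def ratio_with_uncertainty(numerator, denominator, rel_unc_num, rel_unc_den):
    """Return (ratio, relative uncertainty), assuming uncorrelated inputs."""
    value = numerator / denominator
    rel_unc = math.hypot(rel_unc_num, rel_unc_den)
    return value, rel_unc


# Hypothetical measured reaction rates (arbitrary units): U-238 fission
# relative to U-235 fission, with 2% and 1% relative uncertainties.
measured_index, measured_unc = ratio_with_uncertainty(0.031, 1.000, 0.02, 0.01)

# Hypothetical calculated index with a 0.5% statistical uncertainty.
calculated_index, calculated_unc = 0.0325, 0.005

c_over_e, c_over_e_unc = ratio_with_uncertainty(
    calculated_index, measured_index, calculated_unc, measured_unc
)
print(f"C/E = {c_over_e:.3f} +/- {c_over_e * c_over_e_unc:.3f}")
```

A C/E that differs from unity by several combined standard deviations points to the nuclear data (or the model) for the reactions involved, which is what makes spectral indices useful for constraining adjustments.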

Reactivity Effects and Coefficients
When designing nuclear systems, as was previously noted, it is crucial to accurately predict the reactivity worth of different materials and the system feedback effects. Reactivity effect measurements are likely the most common neutronics experiments performed; however, they have only occasionally been evaluated into experimental benchmarks. For nuclide capture cross sections, experiments such as those performed using pile oscillators can provide a precise estimate of the reactivity worth in the given system.
Reactivity coefficients are often critical parameters in the safety case of reactor systems. Void reactivity, particularly in sodium fast reactors and pressurized heavy water reactors, is of the utmost importance in safety cases and often determines the operating envelope. As performing experiments for all possible core configurations and burnups is impractical, analysts rely on computational tools to interpolate and extrapolate the experimental results at hand. Many of these experiments remain proprietary, owing to their high commercial importance and safety implications.

Void Reactivity
The reactor physics handbook has 16 sodium void reactivity measurements made in 4 different facilities, as shown in Table II. Many other experimental campaigns have been performed to evaluate the sodium void effect in different systems; however, these experiments remain unevaluated. As a consequence, when cross sections are changed, the new data are rarely benchmarked against these configurations.
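For orientation, a calculated void reactivity is usually obtained from two eigenvalue calculations, one for the reference configuration and one for the voided configuration. The sketch below shows this under stated assumptions: the k-eff values and their uncertainties are hypothetical, the two calculations are treated as statistically independent, and the function name is illustrative only.

```python
import math


def void_reactivity_pcm(k_ref, k_void, sigma_ref=0.0, sigma_void=0.0):
    """Reactivity change on voiding, rho = 1/k_ref - 1/k_void, in pcm.

    Returns (rho, sigma_rho). The uncertainty is propagated to first order
    and assumes the two k-eff values are uncorrelated.
    """
    rho = (1.0 / k_ref - 1.0 / k_void) * 1.0e5
    sigma_rho = math.hypot(sigma_ref / k_ref**2, sigma_void / k_void**2) * 1.0e5
    return rho, sigma_rho


# Hypothetical Monte Carlo results for a reference and a voided core.
rho, sigma_rho = void_reactivity_pcm(1.00000, 1.00200, 0.00010, 0.00010)
print(f"void reactivity = {rho:.1f} +/- {sigma_rho:.1f} pcm")
```

Because the effect is a small difference between two large numbers, the statistical uncertainty of each k-eff must be much smaller than the void worth itself, and the result can be quite sensitive to modest changes in cross sections.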

Temperature Reactivity
The reactor physics handbook has approximately 50 measurements of system responses to changes in temperature, conducted at 8 facilities, as shown in Table III. The majority of these experiments are for increases in the total system temperature. While high-temperature measurements have been performed for both thermal and fast systems, these have not been evaluated as benchmarks.

Control Rod Worth
The reactor physics handbook has 50 control rod worth measurements made in 9 different facilities, as shown in Table IV. Often both the differential control rod worth and the total worth have been benchmarked. For nuclear data testing, control rod benchmarks are of less importance; however, the ability of codes to predict the transient response of systems is important and would benefit from enhanced benchmarking.

Burnup Reactivity
Only two benchmarks exist for burnup reactivity in the reactor physics handbook. The DUKE experimental benchmark was an extensive study of differences between measured and calculated neutron flux in different power reactors, and can be used to improve the integral prediction of the reactivity decrement during burnup. Many post-irradiation examination (PIE) measurements have been performed on fuel discharged from power reactors, and are available in the SFCOMPO database [30]. SFCOMPO measurements have not yet been transformed into benchmarks; however, an international technical review group has been formed and tasked with this.

Kinetics Parameters
Kinetics parameters play an important role in the safety analyses of nuclear reactors as they determine the transient response of the reactor. The delayed neutron fractions and time constants are often embedded in the interpretation of other experimental results, where knowledge of the kinetics parameters is used to deconvolve the measured signal into reactivity. Benchmarking of beta effective is occasionally performed, although additional efforts should be made to define the uncertainties in the data.
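One simple instance of this dependence is a stable-period measurement, where the inhour equation converts the measured period into reactivity using the delayed neutron data. The Python sketch below is illustrative only: the six-group decay constants and relative abundances are the commonly quoted Keepin-type values for thermal fission of U-235, and the effective delayed neutron fraction and generation time are assumed round numbers rather than values for any specific facility.

```python
# Six-group delayed neutron data (approximate, thermal fission of U-235).
DECAY_CONSTANTS = (0.0124, 0.0305, 0.111, 0.301, 1.14, 3.01)  # 1/s
RELATIVE_ABUNDANCES = (0.033, 0.219, 0.196, 0.395, 0.115, 0.042)


def inhour_reactivity(period_s, beta_eff=0.0065, generation_time_s=1.0e-4):
    """Reactivity (absolute units) for a stable positive period.

    rho = Lambda / T + sum_i beta_i / (1 + lambda_i * T)
    beta_eff and generation_time_s are assumed, illustrative defaults.
    """
    if period_s <= 0.0:
        raise ValueError("this sketch only handles positive stable periods")
    delayed = sum(
        beta_eff * a / (1.0 + lam * period_s)
        for a, lam in zip(RELATIVE_ABUNDANCES, DECAY_CONSTANTS)
    )
    return generation_time_s / period_s + delayed


for period in (10.0, 100.0):
    rho = inhour_reactivity(period)
    print(f"T = {period:5.0f} s -> rho = {rho * 1e5:.1f} pcm "
          f"({rho / 0.0065:.3f} $)")
```

Any error in the delayed neutron data propagates directly into the inferred reactivity, which is why the uncertainties of the kinetics parameters matter for the interpretation of reactivity-effect benchmarks.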

Reaction Rate and Power Distributions
Reaction-rate measurements include flux maps, fission chamber scans, and wire-activation fine-structure and macro-structure measurements. They provide validation of the ability of the codes and nuclear data to predict the spatial distribution of the neutron flux. Different materials can be employed, especially when seeking sensitivity to specific neutron spectra. Measurements can be either absolute or relative, the latter to either a normalizing position or an average.
These measurements are of particular interest near material interfaces, such as the core-reflector interface, or near structural components. Additionally, fine-structure measurements of the reaction rate or power distribution within the reactor fuel are of particular interest to fuel performance codes. Table VII provides an overview of the data available within the reactor physics handbook.

CONCLUSIONS
A rethink is needed about how nuclear data validation is done, and about the roles and priorities of the various stakeholders. A number of candidate reactor physics experimental benchmarks have been identified, as well as candidates for future evaluation. The reactor physics community needs to invest heavily in turning the highest-quality experiments of the past into benchmarks. If it does not, the nuclear data community will not test against these measurements, and new libraries will not perform for applications, creating a vicious circle of disincentivization. Meanwhile, the nuclear data community needs to engage with the reactor physics community, collect its needs, and develop rapid testing methods to ensure that predictions of well-known quantities do not get worse. With recent advances in perturbation theory, nuclear data covariances, the Monte Carlo method, and computing power, all the pieces are in place for the paradigm to shift.

ACKNOWLEDGMENTS
The International Reactor Physics Experiment Evaluation Project (IRPhEP) technical review group is acknowledged, as is the leadership of the project provided by its chair John Bess. A special thanks to the efforts of Enrico Sartori and Blair Briggs to preserve and evaluate experimental data is noted. Also, the financial contribution of the government of Japan to support the IRPhEP and the accessibility of the data is noted.