From fission yield measurements to evaluation: new statistical methodology applied to 235U(n_th, f) mass yields

The study of fission yields has a major impact on the characterization and understanding of the fission process and is mandatory for reactor applications. The mass and isotopic yields of the fission fragments have a direct influence on the predictions of fuel burn-up and decay heat. Moreover, these data are requested for other studies such as delayed neutron evaluation, antineutrino flux assessment or reactor programs. Today, the lack of covariance matrices associated with evaluated fission yields induces overestimated uncertainties on mass yields, since these observables result from the sum of isotopic and isomeric yields. Our collaboration has started a new program in the field of the evaluation of fission products, in addition to the current experimental program. The goal is to define a new evaluation methodology based on statistical tests, in order to provide the best estimation from consistent sets of measurements. A ranking of solutions with associated covariances, based on the Shannon entropy criterion, is proposed for the mass yields of the 235U(n_th, f) reaction.


Introduction
Fission yield evaluation represents the synthesis of experimental and theoretical knowledge in order to produce the best estimation of mass, isotopic and isomeric yields. Nevertheless, the estimation of these observables relies heavily on experimental data, since the modelling of the fission process is not predictive. Today, the output of fission yield evaluations is available as isotopic and isomeric yields. As a consequence, mass yields are obtained as the sum over isobaric nuclei, with uncertainties deduced from their quadratic sum. Without a correct covariance matrix, mass yield uncertainties are therefore greater than those of isotopic yields. This consequence is in contradiction with experimental knowledge, where mass yield measurements are clearly the most abundant; we thus expect the uncertainties on this observable to be lower than those of isotopic yields. The assessment of the covariance matrix depends on the evaluation process, and its validity assumes that all measurements are statistically in agreement. In recent years, different covariance matrices have been suggested, but their experimental part is neglected either in the covariance evaluation [1][2][3][4] or in applications [5][6]. In the first part we present an assessment methodology based on statistical tests. The consideration of experimental data is crucial in the definition of the covariance of evaluations when models are not predictive. A large range of data is listed in the EXFOR data bank, but many of them cover only a partial mass range.
Data are also provided for different incident neutron energies with different mass resolutions. The merging of data can generate non-unique solutions according to the consistent datasets, as will be presented in the second part. In the last part, a ranking of solutions with associated covariances is proposed according to the Shannon entropy criterion.

Table 1. Cumulative measurement and mass numbers for each dataset, and number of common measured masses between dataset pairs (indexed 0-7).

Dataset       Cumul. meas.  Cumul. mass  Index |  0   1   2   3   4   5   6   7
Maeck78             37           37        0   | 37
Thierens75          74           48        1   | 26  37
Diiorio77          138           64        2   | 37  36  64
Bail07             168           67        3   | 17  19  28  30
Tsoukatos68        232           67        4   | 37  36  64  28  64
Tsoukatos68b       260           67        5   | 23  20  28  13  28  28
Rosman83           267           72        6   |  0   0   1   2   1   0   7
Mathiews83         279           77        7   |  0   0   1   2   1   0   7  12

Experimental datasets
For this first work, the development of the evaluation method is focused on the 235U(n_th, f) reaction because of the large number of available measurements. For this reaction, the EXFOR [7] database makes it possible to cover the whole mass range of the fission products. The absolute mass yield Y(A) is obtained using the self-normalization of this observable according to the equation:

Y(A) = Ω · N(A) / Σ_{A'} N(A')     (1)

where Ω is the normalization factor, here Ω = 2 (two fragments are emitted per fission). Unfortunately, for most of these measurements only statistical uncertainties are provided, and systematic uncertainties are estimated in the best cases. Thus the estimation of the covariance matrix is non-trivial and represents an important task which will be described in a future work.
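As a minimal illustration, the self-normalization of eq. (1) can be sketched as follows (the relative rates N(A) used here are hypothetical values, not data from the cited measurements):

```python
# Self-normalization of measured relative mass rates N(A) into absolute
# mass yields Y(A), with normalization factor Omega = 2 (two fission
# fragments are emitted per fission, so the mass yields sum to 2).

def normalize_yields(n_of_a, omega=2.0):
    """n_of_a: {A: relative rate}. Returns {A: Y(A)} with sum_A Y(A) = omega."""
    total = sum(n_of_a.values())
    return {a: omega * n / total for a, n in n_of_a.items()}

# Hypothetical relative rates for three masses (illustration only).
rates = {95: 6.5, 134: 7.8, 140: 6.2}
yields = normalize_yields(rates)
assert abs(sum(yields.values()) - 2.0) < 1e-12
```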

Statistical test on the compatibility of available data
In order to reduce the data and to merge all the measurements for a given mass, it is necessary to test the compatibility of the data. Two kinds of data are present: i) full mass range yield measurements, but not necessarily with sufficient mass resolution (3σ_A < 1 amu); ii) incomplete mass ranges, generating relative measurements or relative normalizations by the authors. From the EXFOR [7] database, we chose to test the methodology on only 8 datasets: W.J. Maeck et al. [8], G. Diiorio et al. [9], H. Thierens et al. [10], A. Bail et al. [11], M.P. Tsoukatos [12] with two different datasets, K.J. Rosman [13] and C.K. Mathiews [14]. These data correspond to 279 measurements over 77 masses (from A=77 to A=154, with A=122 missing in the data used in this work). With this selection we cover both peaks, allowing the absolute normalization of our evaluation. Thus, assuming independent Gaussian distributions associated with the measurements, without explicit information on data correlations, we can calculate the χ² using the n_Â common measured mass numbers. This value is compared to the limiting χ² value (χ²_lim) given for a 99.5% confidence level. In practice, we calculate the [χ²_lim ; ∞] range of the χ² distribution for (n_Â − 1) degrees of freedom. The usable data are only those which pass the χ² test, and the common mass set Â is defined as follows:

{Â}_{i,j} = {A}_i ∩ {A}_j   such that   χ²({Â}_{i,j}) ≤ χ²_lim(CL)

where {A}_i (respectively {A}_j) are the measured masses of the i-th (respectively j-th) dataset and CL is the confidence level, chosen as CL = 0.995 for this work. The common mass number n_Â = Card({Â}) is presented in Table 1. Formally, the comparison of each dataset N_j(A) to the reference one N_i(A) generates p-values lower than 1 − CL = 0.005 for all datasets. We therefore introduce a cross-normalization factor k_{i,j} to maximize the number of measurements in agreement, considering all measurements as relative ones.
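A minimal sketch of such a pairwise compatibility statistic, assuming uncorrelated Gaussian uncertainties (the numerical data below are illustrative; the χ²_lim threshold must be supplied from the χ² quantile at CL = 0.995 for the relevant degrees of freedom):

```python
# Pairwise chi-square compatibility statistic between two datasets over
# their common measured masses, assuming independent Gaussian
# uncertainties (no experimental covariance available).

def chi2_compatibility(data_i, data_j):
    """data_*: {A: (value, sigma)}. Returns (chi2, dof) over common masses."""
    common = sorted(set(data_i) & set(data_j))
    chi2 = 0.0
    for a in common:
        yi, si = data_i[a]
        yj, sj = data_j[a]
        chi2 += (yi - yj) ** 2 / (si ** 2 + sj ** 2)
    return chi2, len(common) - 1  # (n_A - 1) degrees of freedom

# Illustrative datasets (not actual EXFOR values).
d1 = {95: (6.5, 0.1), 134: (7.8, 0.1), 140: (6.2, 0.1)}
d2 = {95: (6.4, 0.1), 134: (7.9, 0.1), 140: (6.3, 0.1), 150: (0.5, 0.05)}
chi2, dof = chi2_compatibility(d1, d2)
# The pair is kept only if chi2 <= chi2_lim(CL=0.995) for `dof` degrees
# of freedom.
```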
This comparison is described by the vector C_{i,j}, whose components over the common mass set are:

C_{i,j}(A) = N_i(A) − k_{i,j} · N_j(A),   A ∈ {Â}_{i,j}

The normalization factor is obtained by minimizing the generalized χ²_g over the {Â} measurements in agreement:

χ²_g = C_{i,j}ᵀ · Cov⁻¹ · C_{i,j}

where Cov⁻¹ is the inverse covariance matrix associated with C_{i,j}. Nevertheless, at this step, without the covariance of the measurements, we consider that:

Cov = diag( Var(N_i(A)) + k_{i,j}² · Var(N_j(A)) )

The standard deviations of the cross-normalization factors k_{i,j} are presented in Table 2. A discussion about usable data management is given in references [15][16].
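The cross-normalization step can be sketched under the same diagonal-covariance assumption, as a one-parameter χ² minimization (the datasets below are illustrative):

```python
# Cross-normalization factor k_ij between a reference dataset i and a
# dataset j, obtained by minimizing a diagonal-covariance chi-square:
# chi2_g(k) = sum_A (N_i(A) - k N_j(A))^2 / (Var_i(A) + k^2 Var_j(A)).

def cross_norm_factor(data_i, data_j, n_iter=20):
    """data_*: {A: (value, sigma)}. Returns k_ij over common masses."""
    common = sorted(set(data_i) & set(data_j))
    k = 1.0
    for _ in range(n_iter):
        # Freeze the weights at the current k, then solve the linear
        # least-squares condition d(chi2)/dk = 0 for the new k.
        num = den = 0.0
        for a in common:
            yi, si = data_i[a]
            yj, sj = data_j[a]
            w = 1.0 / (si ** 2 + k ** 2 * sj ** 2)
            num += w * yi * yj
            den += w * yj ** 2
        k = num / den
    return k

# Illustrative data: d1 is exactly twice d2, so k converges to 2.
d1 = {95: (6.6, 0.1), 134: (8.0, 0.1)}
d2 = {95: (3.3, 0.05), 134: (4.0, 0.05)}
k = cross_norm_factor(d1, d2)
```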

Cross-correlations of usable data
The relative normalization of each j-th dataset, N_j(A), to the i-th reference one, N_i(A), is defined as follows:

R^i_j(A) = k_{i,j} · N_j(A)

According to the perturbation theory [17], the covariance of two normalized measurements R^i_l(A) and R^i_j(A′) is developed in the appendix (see Sect. Appendix). For this study, without explicit experimental covariance matrices, most components of the variance-covariance are considered null:

Cov(N_j(A); N_l(A′)) = Var(N_j(A)) · δ_{AA′} · δ_{jl}   ∀ j, l

For n measurements of mass A [18], the mean normalized mass rate R̄(A) is equal to:

R̄(A) = Σ_{l=1}^{n} W_l · R^i_l(A)

For n measurements of mass A and m measurements of mass A′ [18], the covariance of the mean normalized mass rates is equal to (see Sect. Appendix, eq. 23):

Cov(R̄(A); R̄(A′)) = Σ_{l,j} W_l · C_{lj} · W_j

with here C_{lj} = Cov(R^i_l(A); R^i_j(A′)) for two different masses A and A′, thus (n × m) terms. The weights are defined as (see Sect. Appendix, eq. 24):

W_l = Σ_{j=1}^{n} (C⁻¹)_{lj} / Σ_{l,j=1}^{n} (C⁻¹)_{lj}

with here C_{lj} = Cov(R^i_l(A); R^i_j(A)) for the same mass A with (n × n) covariance terms (or for the same mass A′ with (m × m) covariance terms). Fig. 1 presents the cross-correlations of the R^i_l(A) data for two different reference sets. We note that the intensity of the correlation depends strongly on the choice of normalization and on the uncertainty of the normalization factor σ(k_{i,j}) (see Table 2).
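A sketch of the covariance-weighted mean of n normalized measurements of the same mass, under the weight definition above (a small pure-Python matrix inverse is used to keep the example self-contained; the numbers are illustrative):

```python
# Covariance-weighted mean R_bar(A) = sum_l W_l * R_l(A), with weights
# W_l = sum_j (C^-1)_lj / sum_lj (C^-1)_lj, where C is the covariance
# matrix of the n normalized measurements of the same mass A.

def invert(mat):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    n = len(mat)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def weighted_mean(values, cov):
    """Best estimate and its variance from correlated measurements."""
    inv = invert(cov)
    n = len(values)
    row_sums = [sum(inv[l][j] for j in range(n)) for l in range(n)]
    norm = sum(row_sums)
    weights = [s / norm for s in row_sums]
    mean = sum(w * v for w, v in zip(weights, values))
    var = 1.0 / norm  # variance of the weighted mean
    return mean, var

# Uncorrelated case: reduces to the usual inverse-variance weighting.
m, v = weighted_mean([6.4, 6.6], [[0.01, 0.0], [0.0, 0.01]])
```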

Ranking of solutions based on Shannon's entropy
According to the normalization of mass yields (see eq. 1), the generalized perturbation theory [17] makes it possible to describe the variance-covariance matrix associated with the evaluation of the mass yields:

Cov(Y(A); Y(A′)) = Σ_{A₁,A₂} S_{A,A₁} · Cov(R̄(A₁); R̄(A₂)) · S_{A′,A₂}

with the sensitivity of the mass yield Y(A) to the mean mass rates R̄(A) and R̄(A′):

S_{A,A′} = ∂Y(A)/∂R̄(A′) = Ω · (δ_{AA′} · Σ_{A₁} R̄(A₁) − R̄(A)) / (Σ_{A₁} R̄(A₁))²

where Ω is the normalization factor, here Ω = 2 (see eq. 1). Fig. 2 shows the results for two different reference sets, and we remark that the structures of these correlations are strongly different. Thus, the result of the mass yield evaluation depends on the initial datasets, but also on the path of the analysis. In order to discriminate between all possible evaluations, the Shannon entropy S_Sh is chosen as a useful criterion to assess the mixing of information [19]. It is given by the relation:

S_Sh = − Σ_{i=1}^{n} p_i · ln(p_i)

where n is the number of eigenvalues. We approximate the probability with the weight of each component of the eigenvalue decomposition to build a relative criterion. The weight of the information is provided according to the following equation:

p_i = λ_i / tr(Corr)

where tr(Corr) = 77 is the trace of the correlation matrix (in this study, 77 mass yields are evaluated).
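The entropy criterion above can be sketched directly from the eigenvalues of the correlation matrix (assumed precomputed; the two limiting cases below follow from the definition):

```python
# Shannon entropy of an evaluated correlation matrix, computed from its
# eigenvalues lambda_i with weights p_i = lambda_i / tr(Corr), where
# tr(Corr) = n is the number of evaluated mass yields.

import math

def shannon_entropy(eigenvalues):
    """S_Sh = -sum_i p_i ln(p_i), with p_i = lambda_i / sum(lambda)."""
    trace = sum(eigenvalues)
    return -sum((lam / trace) * math.log(lam / trace)
                for lam in eigenvalues if lam > 0.0)

# Identity correlation matrix (fully uncorrelated case): all eigenvalues
# equal 1, so the entropy reaches its maximum ln(n).
n = 77
assert abs(shannon_entropy([1.0] * n) - math.log(n)) < 1e-12
# Fully correlated case: one eigenvalue n, the rest 0, giving S_Sh = 0.
assert shannon_entropy([float(n)] + [0.0] * (n - 1)) == 0.0
```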

Results and discussion
Results of the pure experimental mass yield evaluation (model-free) are presented in Fig. 3. Among all solutions, we note that the maximum of the Shannon entropy corresponds to the minimum of the variance and correlation values. This result is consistent with the Cramér-Rao theorem, which fixes the limit on the minimal variance as the maximum of the Fisher information. The Shannon entropy corresponds to another quantification of the information of the analysis, and we expect that the best solution corresponds to the minimum of the variance-covariance and hence the maximum of information. In this work, the consideration of experimental data is crucial for the definition of the mass yield evaluation, its uncertainties and its correlations. The lack of experimental covariances could induce an underestimation of the evaluated mass yield uncertainties, since the information content is then overestimated: correlations in the data limit the knowledge provided by a dataset. The perspective of this work is therefore to build a priori experimental correlation matrices to fill these gaps in the analysis.
Appendix

The relative normalization of each j-th dataset, N_j(A), to the i-th reference one, N_i(A), is defined as follows:

R^i_j(A) = k_{i,j} · N_j(A)

Based on the minimum of the χ², the variance of the normalization factor k_{i,j} is given by:

Var(k_{i,j}) = [ Σ_{A∈{Â}} N_j(A)² / ( Var(N_i(A)) + k_{i,j}² · Var(N_j(A)) ) ]⁻¹

According to the perturbation theory, the covariance of two normalized measurements R^i_l(A) and R^i_j(A′) is described by the following equation [17]:

Cov(R^i_l(A); R^i_j(A′)) = k_{i,l} · k_{i,j} · Cov(N_l(A); N_j(A′)) + N_l(A) · N_j(A′) · Cov(k_{i,l}; k_{i,j})

For this study, without explicit experimental covariance matrices, most components of the variance-covariance are considered null:

Cov(N_j(A); N_l(A′)) = Var(N_j(A)) · δ_{AA′} · δ_{jl}   ∀ j, l

For n measurements of mass A, the mean normalized mass rate R̄(A) is equal to [18]:

R̄(A) = Σ_{l=1}^{n} W_l · R^i_l(A)

with the weights W_l defined below and C_{lj} = Cov(R^i_l(A); R^i_j(A)) for the same mass A; its variance is given by the following equation:

Var(R̄(A)) = Σ_{l,j=1}^{n} W_l · C_{lj} · W_j

For n measurements of mass A and m measurements of mass A′, the covariance of the mean normalized mass rates is equal to:

Cov(R̄(A); R̄(A′)) = Σ_{l,j=1}^{n,m} W_l · C_{lj} · W_j     (23)

with here C_{lj} = Cov(R^i_l(A); R^i_j(A′)) for two different masses A and A′, thus (n × m) terms; and the weights:

W_l = Σ_{j=1}^{n} (C⁻¹)_{lj} / Σ_{l,j=1}^{n} (C⁻¹)_{lj}     (24)

with here C_{lj} = Cov(R^i_l(A); R^i_j(A)) for the same mass A with (n × n) covariance terms (or for the same mass A′ with (m × m) covariance terms).