Applicability evaluation of Akaike’s Bayesian information criterion to covariance modeling in the cross-section adjustment method

The applicability of Akaike’s Bayesian Information Criterion (ABIC) to the covariance modeling in the cross-section adjustment method has been investigated. In the conventional cross-section adjustment method, the covariance matrices are assumed to be true. However, this assumption is not always appropriate. To improve the reliability of the cross-section adjustment method, the estimation of the covariance model using the metric ABIC has been introduced, and the performance of ABIC has been investigated through simple numerical experiments. This paper derives a formula to efficiently evaluate ABIC, which is represented by a lower-rank matrix, to enable numerical experiments with large samples in a realistic computation time. From the results of the numerical experiments, it has been confirmed that ABIC tends to select a covariance model with fewer hyperparameters and a smaller variance for the estimation error. However, it has also been found that this desirable property of ABIC is lost when the structure of the covariance model is far from the true one.


Background
Every measured or analyzed value has an uncertainty. Namely, the data follow a probability distribution with parameters such as a population mean and covariance. Unfortunately, we cannot know the true parameters. Therefore, when we analyze the data, we usually estimate and/or assume these parameters. The appropriateness of the covariance matrices set by analysts (e.g., nuclear-data covariance) has been widely discussed in the framework of the Sub-Groups (SGs) under the Working Party on International Nuclear Data Evaluation Cooperation (WPEC) in OECD/NEA (e.g., [1][2][3][4]). Recently, it has also been discussed in Japan through the activity of the covariance data utilization and promotion working group organized in the JENDL committee [5].
In the cross-section adjustment method, covariance matrices are also used. One of the most important things for a reliable cross-section adjustment method is giving suitable covariance matrices close enough to the true covariance matrices. To judge the goodness of the covariance modeling, a metric is desirable. As a candidate for this metric, we focus on Akaike's Bayesian Information Criterion (ABIC) [6] which is one of the information criteria in Bayesian inference, because the cross-section adjustment method is often discussed within the framework of Bayesian inference.
In the conventional cross-section adjustment method, incorporation of the analysis method errors (errors due to the core calculation method, e.g., discretization error in a deterministic code) as a covariance matrix still requires ad hoc treatment. In JAEA, the integral experimental database for fast reactors has been developed and the adjusted cross-section set ADJ2017 [7,8] has been created based on this database. Many of the core characteristics in the database have been analyzed by a deterministic method. Therefore, the predicted core characteristics have non-negligible uncertainties with correlations due to some numerical approximations. However, evaluating the uncertainties and their correlations is still a challenging issue. In addition, there would be unknown uncertainties that experimenters and analysts of reactor physics experiments were not able to recognize.
* Corresponding author: maruyama.shuhei@jaea.go.jp
To address these difficulties in the conventional adjustment method, we incorporate ABIC into the adjustment method. ABIC is expected to work as a metric for evaluating the goodness of the covariance matrix modeling related to the uncertainties. We aim to improve the reliability of the conventional cross-section adjustment method by using ABIC. This paper investigates the applicability of ABIC through several numerical experiments using random sampling techniques.
In the next section, we introduce ABIC and incorporate it into the conventional cross-section adjustment method. The applicability is evaluated through simple numerical experiments in Section 3. The conclusions of this paper are given in Section 4.

Methodology
Before we get to the main subject, we introduce the notation used in this paper. A covariance matrix with a tilde ("~") denotes a covariance model whose hyperparameters are determined later. A covariance matrix with a hat ("^") denotes the covariance selected by determining the hyperparameters. These covariances are not always "true." The true covariance is represented by a character without a tilde or a hat. For example, $\mathbf{V}$: a true covariance matrix, $\tilde{\mathbf{V}}$: a covariance model with undetermined hyperparameters, $\hat{\mathbf{V}}$: the covariance selected from a covariance model by determining its hyperparameters.

Review of conventional cross-section adjustment method
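When the true nuclear data set $\mathbf{T}$ follows a multivariate normal distribution with the mean $\mathbf{T}_0$ and covariance $\mathbf{M}$, the probability distribution of $\mathbf{T}$ is

$$P(\mathbf{T}) = \frac{1}{\sqrt{(2\pi)^{N_T}\,|\mathbf{M}|}} \exp\left[-\frac{1}{2}(\mathbf{T}-\mathbf{T}_0)^{T}\mathbf{M}^{-1}(\mathbf{T}-\mathbf{T}_0)\right], \qquad (1)$$

where $N_T$ is the dimension of $\mathbf{T}$ and $\mathbf{T}_0$, $|\mathbf{X}|$ is the determinant of a matrix $\mathbf{X}$, and the superscripts $-1$ and $T$ represent the inverse and transpose of the matrix, respectively. When integral experimental data sets are obtained, the likelihood function is represented as follows:

$$P(\mathbf{R}_e|\mathbf{T}) = \frac{1}{\sqrt{(2\pi)^{N_R}\,|\mathbf{V}_e+\mathbf{V}_m|}} \exp\left[-\frac{1}{2}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)^{T}(\mathbf{V}_e+\mathbf{V}_m)^{-1}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)\right]. \qquad (2)$$

Here, $\mathbf{R}_e$ is a set of experimental values and $\mathbf{V}_e$ is the experimental covariance matrix. $\mathbf{R}_c(\mathbf{T})$ is a set of calculational values if the true nuclear data set $\mathbf{T}$ is given, and $\mathbf{V}_m$ is the covariance matrix due to calculation methods. $N_R$ denotes the dimension of $\mathbf{R}_e$ and $\mathbf{R}_c$. From Bayes' theorem,

$$P(\mathbf{T}|\mathbf{R}_e) \propto P(\mathbf{R}_e|\mathbf{T})\,P(\mathbf{T}), \qquad (3)$$

the posterior probability is represented as follows:

$$P(\mathbf{T}|\mathbf{R}_e) \propto \exp\left[-\frac{1}{2}\left\{(\mathbf{T}-\mathbf{T}_0)^{T}\mathbf{M}^{-1}(\mathbf{T}-\mathbf{T}_0) + \left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)^{T}(\mathbf{V}_e+\mathbf{V}_m)^{-1}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)\right\}\right]. \qquad (4)$$

The cross-section and its covariance after adjustment are derived from the condition that maximizes the posterior probability of Eq. (4) as follows:

$$\mathbf{T}' = \mathbf{T}_0 + \mathbf{M}\mathbf{G}^{T}\left(\mathbf{G}\mathbf{M}\mathbf{G}^{T}+\mathbf{V}_e+\mathbf{V}_m\right)^{-1}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T}_0)\right), \qquad (5)$$

$$\mathbf{M}' = \mathbf{M} - \mathbf{M}\mathbf{G}^{T}\left(\mathbf{G}\mathbf{M}\mathbf{G}^{T}+\mathbf{V}_e+\mathbf{V}_m\right)^{-1}\mathbf{G}\mathbf{M}. \qquad (6)$$

Here, $\mathbf{G}$ denotes the nuclear-data sensitivity coefficients. Note that the linear approximation for $\mathbf{R}_c(\mathbf{T})$, i.e.,

$$\mathbf{R}_c(\mathbf{T}) \approx \mathbf{R}_c(\mathbf{T}_0) + \mathbf{G}(\mathbf{T}-\mathbf{T}_0), \qquad (8)$$

is assumed to derive Eqs. (5) and (6). The concept of the cross-section adjustment method is shown in Fig. 1.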
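The adjustment step reviewed above can be illustrated with a minimal NumPy sketch. It assumes the standard linear-Gaussian update, with the adjusted nuclear data $\mathbf{T}_0 + \mathbf{M}\mathbf{G}^T(\mathbf{G}\mathbf{M}\mathbf{G}^T+\mathbf{V}_e+\mathbf{V}_m)^{-1}(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T}_0))$ and the adjusted covariance $\mathbf{M}-\mathbf{M}\mathbf{G}^T(\cdot)^{-1}\mathbf{G}\mathbf{M}$. The function name `adjust`, the variable names, and all numbers are hypothetical and chosen only for illustration; they are not taken from the authors' code.

```python
import numpy as np

def adjust(T0, M, G, Re, Rc0, Ve, Vm):
    """Linear-Gaussian cross-section adjustment (illustrative sketch).

    Returns the adjusted nuclear data and its covariance.
    """
    S = G @ M @ G.T + Ve + Vm            # covariance of Re - Rc(T0)
    # Solve S X = G M rather than forming S^{-1} explicitly
    X = np.linalg.solve(S, G @ M)        # equals S^{-1} G M
    gain = X.T                           # M G^T S^{-1} (M and S are symmetric)
    T_adj = T0 + gain @ (Re - Rc0)
    M_adj = M - gain @ G @ M
    return T_adj, M_adj

# Toy problem with hypothetical numbers: 5 nuclear data, 2 integral experiments
rng = np.random.default_rng(0)
n_nd, n_exp = 5, 2
A = rng.normal(size=(n_nd, n_nd))
M = A @ A.T + n_nd * np.eye(n_nd)        # positive definite prior covariance
G = rng.normal(size=(n_exp, n_nd))       # sensitivity coefficients
Ve = 0.1 * np.eye(n_exp)
Vm = 0.2 * np.eye(n_exp)
T0 = np.zeros(n_nd)
Rc0 = G @ T0                             # calculated values at T0
Re = np.array([1.0, -0.5])               # "measured" values

T_adj, M_adj = adjust(T0, M, G, Re, Rc0, Ve, Vm)
```

The sketch works only with matrices of the size of the integral data set when solving the linear system, which is the usual choice when the number of experiments is much smaller than the number of nuclear data.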

Akaike's Bayesian information criterion
In the conventional cross-section adjustment method, the covariance matrices $\mathbf{M}$ (nuclear data), $\mathbf{V}_e$ (experiment), and $\mathbf{V}_m$ (calculation method) used for the adjustment are assumed to be true. However, this assumption is not always appropriate. Therefore, let us assume that the covariance matrices have hyperparameters that are determined to suit the observed data. To find a better covariance model by tuning the hyperparameters, we propose to use ABIC in this paper. As mentioned in the previous section, setting the covariance matrix due to a calculation method is a challenging issue in the conventional adjustment method; therefore, only $\mathbf{V}_m$ is assumed to be unknown and to be the estimation target using ABIC in this study. In other words, $\mathbf{M}$ and $\mathbf{V}_e$ in this study are assumed to be true. If the covariance model $\tilde{\mathbf{V}}_m$ has unknown positive hyperparameters $\theta_1^2, \theta_2^2, \cdots, \theta_K^2$, the likelihood function of Eq. (2) is represented as follows:

$$P(\mathbf{R}_e|\mathbf{T};\theta_1^2,\theta_2^2,\cdots,\theta_K^2) = \frac{1}{\sqrt{(2\pi)^{N_R}\,|\mathbf{V}_e+\tilde{\mathbf{V}}_m|}} \exp\left[-\frac{1}{2}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)^{T}(\mathbf{V}_e+\tilde{\mathbf{V}}_m)^{-1}\left(\mathbf{R}_e-\mathbf{R}_c(\mathbf{T})\right)\right], \qquad (9)$$

where $K$ is the number of hyperparameters, $\mathbf{R}_e$ and $\mathbf{R}_c(\mathbf{T})$ are the experimental and calculational values, and $N_R$ is their dimension. As shown later (subsection 3.1.3), we prepare positive hyperparameters expressed as squared values to guarantee the positive definiteness of the covariance matrix $\tilde{\mathbf{V}}_m$. As a metric of the goodness of an inference model including the covariance modeling, we employ an information criterion called ABIC,

$$\mathrm{ABIC} = -2\ln L(\theta_1^2,\theta_2^2,\cdots,\theta_K^2) + 2K. \qquad (10)$$

Here, $L$ is the marginal likelihood defined as

$$L(\theta_1^2,\theta_2^2,\cdots,\theta_K^2) = \int P(\mathbf{R}_e|\mathbf{T};\theta_1^2,\theta_2^2,\cdots,\theta_K^2)\,P(\mathbf{T})\,d\mathbf{T}. \qquad (11)$$

ABIC consists of two terms, related to $L$ and $K$. The marginal likelihood $L$ is proportional to the generation probability of the observed data (the integral experimental data set $\mathbf{R}_e$). Choosing a good inference model that increases this probability, i.e., a suitable covariance model, decreases ABIC. On the other hand, the number of hyperparameters $K$ works as a penalty for the complexity of an inference model. Therefore, ABIC prefers a simpler model having fewer hyperparameters. In a previous study, a cross-section adjustment method based on the optimization of the marginal likelihood was presented [9]. The method using ABIC proposed in this paper is an extension of that study.
Performing the integral in Eq. (11) under the assumption of Eq. (8), the marginal likelihood can be written in closed form (e.g., [10]) as Eq. (12), in which tr(X) denotes the trace of a matrix X. In general, the rank of $\mathbf{M}'$, the nuclear data covariance after adjustment, is equal to the dimension of the nuclear data set and is very large in the cross-section adjustment method. Therefore, the evaluation of $\ln|\mathbf{M}'|$ becomes expensive without any special treatment. In the hyperparameter tuning using ABIC, an iterative calculation is needed to determine the hyperparameters. Moreover, this iterative calculation is repeated for a large number of samples in this paper. Thus, with Eq. (12) the numerical experiments cannot be performed within a realistic computational cost. To avoid this issue, ABIC is rewritten as Eqs. (16) and (17) (see Appendix). The equivalence of Eq. (12) and Eq. (17) was confirmed by preliminary numerical experiments.
Generally, the dimension of the integral experimental data set $\mathbf{R}_e$ is much smaller than that of the nuclear data set $\mathbf{T}$. As described in Sec. 3, the total number of experimental data is 33 in the present numerical experiments, which is far smaller than the total number of nuclear data, 14,230. Therefore, ABIC is evaluated from Eqs. (16) and (17) instead of Eqs. (10) and (12) in this study.
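The benefit of working in the space of the integral experimental data can be sketched as follows. Under the linear approximation, the difference between the experimental values and the values calculated with the prior nuclear data is Gaussian with covariance $\mathbf{G}\mathbf{M}\mathbf{G}^T + \mathbf{V}_e + \tilde{\mathbf{V}}_m$, so $-2\ln L$ can be evaluated with matrices of the size of the experimental data set. The full-space variant below uses the matrix determinant lemma and the Woodbury identity and is included only to check the equivalence numerically. Both functions are assumptions made for illustration (names, arguments, and toy numbers are hypothetical) and are not necessarily identical in form to the expressions in the Appendix.

```python
import numpy as np

def abic_low_rank(r, G, M, Ve, Vm_model, n_hyper):
    """ABIC evaluated with matrices of the size of the experimental data set."""
    S = G @ M @ G.T + Ve + Vm_model
    _, logdet = np.linalg.slogdet(S)
    quad = r @ np.linalg.solve(S, r)
    return r.size * np.log(2.0 * np.pi) + logdet + quad + 2.0 * n_hyper

def abic_full_space(r, G, M, Ve, Vm_model, n_hyper):
    """Same quantity, written with the posterior covariance of the nuclear data."""
    V = Ve + Vm_model
    Vinv_G = np.linalg.solve(V, G)
    Vinv_r = np.linalg.solve(V, r)
    M_post = np.linalg.inv(np.linalg.inv(M) + G.T @ Vinv_G)   # adjusted covariance
    # ln|G M G^T + V| = ln|V| + ln|M| - ln|M_post|
    logdet = (np.linalg.slogdet(V)[1] + np.linalg.slogdet(M)[1]
              - np.linalg.slogdet(M_post)[1])
    b = G.T @ Vinv_r
    quad = r @ Vinv_r - b @ M_post @ b                        # Woodbury identity
    return r.size * np.log(2.0 * np.pi) + logdet + quad + 2.0 * n_hyper

# Toy check with hypothetical numbers
rng = np.random.default_rng(3)
n_nd, n_exp = 6, 3
A = rng.normal(size=(n_nd, n_nd))
M = A @ A.T / n_nd + 0.5 * np.eye(n_nd)
G = rng.normal(size=(n_exp, n_nd))
Ve = 0.05 * np.eye(n_exp)
Vm_model = 0.3 * np.eye(n_exp)
r = rng.normal(size=n_exp)               # Re - Rc(T0)

abic_small = abic_low_rank(r, G, M, Ve, Vm_model, n_hyper=1)
abic_large = abic_full_space(r, G, M, Ve, Vm_model, n_hyper=1)
```

In this toy setting the two values agree to rounding error, while only the first function avoids factorizing matrices of the size of the nuclear data set.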
The elements of a covariance matrix generally span a wide range of orders of magnitude. The determinant of such a matrix can therefore easily cause a numerical underflow or overflow even if the rank of the matrix is not very high, and some treatment is needed to evaluate ABIC stably. Various methods can be considered to avoid this numerical difficulty, but from the viewpoint of computation time, we adopted the "linalg.slogdet" function of the NumPy library [11]. This function computes the sign and the natural logarithm of a determinant via LU factorization using a LAPACK [12] routine.
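The underflow issue can be reproduced with a small example. The diagonal matrix below is hypothetical: 60 variances of order $10^{-8}$ give a determinant of $10^{-480}$, which is below the smallest representable double-precision number, whereas the log-determinant remains an ordinary floating-point value.

```python
import numpy as np

# Hypothetical covariance: 60 independent variances of order 1e-8
V = np.diag(np.full(60, 1.0e-8))

det_direct = np.linalg.det(V)            # 1e-480 underflows to 0.0
sign, logdet = np.linalg.slogdet(V)      # sign of det and ln|det|, computed stably

# ln|V| needed for ABIC is available even though det_direct is zero
expected_logdet = 60 * np.log(1.0e-8)
```

Taking `np.log(det_direct)` here would give minus infinity, while `logdet` matches `expected_logdet`.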
The procedure of the cross-section adjustment using ABIC consists of two steps, as shown in Fig. 2. A new covariance modeling step is added before the conventional cross-section adjustment procedure. In this step, a covariance model $\tilde{\mathbf{V}}_m$ with hyperparameters is assumed first. Then, the hyperparameters are determined so that ABIC is minimized. The SciPy library [13] is used as the solver of the ABIC minimization. Finally, based on the hyperparameters $\hat{\theta}_1^2, \hat{\theta}_2^2, \cdots, \hat{\theta}_K^2$ determined by ABIC, the suitable covariance matrix $\hat{\mathbf{V}}_m = \tilde{\mathbf{V}}_m(\hat{\theta}_1^2, \hat{\theta}_2^2, \cdots, \hat{\theta}_K^2)$ is selected. The cross-section adjustment is performed using this selected covariance.
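The two-step procedure can be sketched with SciPy as follows. The text does not state which SciPy solver was used; `scipy.optimize.minimize_scalar` with the bounded method is an assumption made here for a single hyperparameter. The covariance model (a squared hyperparameter times a fixed base matrix) and the toy numbers are also hypothetical; they are chosen so that the ABIC minimum is known analytically ($\theta^2 = 3$).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def abic(theta, r, GMGt, Ve, Vm_base, n_hyper=1):
    """ABIC for the one-hyperparameter model theta^2 * Vm_base (illustrative)."""
    S = GMGt + Ve + theta**2 * Vm_base
    _, logdet = np.linalg.slogdet(S)
    quad = r @ np.linalg.solve(S, r)
    return r.size * np.log(2.0 * np.pi) + logdet + quad + 2.0 * n_hyper

# Toy problem: G M G^T + Ve = I, so S = (1 + theta^2) I
n = 4
M = 0.5 * np.eye(n)
G = np.eye(n)
Ve = 0.5 * np.eye(n)
Vm_base = np.eye(n)
T0 = np.zeros(n)
r = np.array([2.0, -2.0, 2.0, -2.0])     # Re - Rc(T0)
GMGt = G @ M @ G.T

# Step 1: covariance modeling, i.e., choose theta by minimizing ABIC
res = minimize_scalar(abic, bounds=(1e-6, 10.0),
                      args=(r, GMGt, Ve, Vm_base), method="bounded")
theta_hat = res.x
Vm_hat = theta_hat**2 * Vm_base

# Step 2: conventional adjustment with the selected covariance
S_hat = GMGt + Ve + Vm_hat
T_adj = T0 + M @ G.T @ np.linalg.solve(S_hat, r)
```

For models with several hyperparameters, a multivariate routine such as `scipy.optimize.minimize` would take the place of `minimize_scalar`.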

Numerical experiments
To evaluate the applicability of ABIC to the cross-section adjustment method, numerical experiments using the random sampling method were performed. We generated virtual nuclear data sets and integral experimental data sets based on virtual true covariances, and performed the cross-section adjustment using these randomly sampled data sets. The performance of ABIC is then discussed from the relationship between ABIC and the variance of the estimation error resulting from the cross-section adjustment method.

Procedure of random sampling
First, to perform the random sampling, we virtually assumed the true covariance matrices $\mathbf{M}$, $\mathbf{V}_e$, and $\mathbf{V}_m$ (the nuclear-data, experimental, and calculation-method covariance matrices, respectively). We employed the covariance matrices in our integral experimental database [7,8] for fast reactors as the true covariance matrices. Sensitivity coefficients were also prepared from the database. As the integral experimental data for the numerical experiments, the experimental data related to criticality measurements used for the development of ADJ2017 were selected. The thirty-three experimental data listed in Table 1 were used.
It was assumed that these covariance matrices were all positive definite. The singular value decomposition was performed on each original covariance matrix, and all singular values smaller than a set value $\Lambda$ were replaced with $\Lambda$ in advance. We set $\Lambda = 1.0 \times 10^{\ldots}$ here. The rank of $\mathbf{M}$ is the same as the dimension of the nuclear data set, 14,230, and the ranks of $\mathbf{V}_e$ and $\mathbf{V}_m$ are the same as the dimension of the integral experimental data set, 33. The procedure of the random sampling based on these prepared covariance matrices is described below.
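The flooring of small singular values can be sketched as below. For a symmetric positive semidefinite matrix the singular values coincide with the eigenvalues, so this sketch uses a symmetric eigendecomposition (`numpy.linalg.eigh`) in place of a general SVD; that substitution, the function name `floor_spectrum`, and the toy matrix and threshold are assumptions made for illustration.

```python
import numpy as np

def floor_spectrum(C, floor):
    """Replace eigenvalues of a symmetric covariance matrix below `floor` by `floor`."""
    C = 0.5 * (C + C.T)                  # remove any asymmetry from rounding
    w, Q = np.linalg.eigh(C)             # eigenvalues in ascending order
    w_floored = np.maximum(w, floor)
    return (Q * w_floored) @ Q.T         # Q diag(w_floored) Q^T

# Hypothetical rank-1 (singular) covariance matrix
a = np.array([1.0, 2.0, 3.0])
C_singular = np.outer(a, a)

C_pd = floor_spectrum(C_singular, 1.0e-6)
L = np.linalg.cholesky(C_pd)             # succeeds only for positive definite input
eigenvalues = np.linalg.eigvalsh(C_pd)
```

After flooring, the Cholesky factorization succeeds, which is what the random sampling from a multivariate normal distribution requires.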
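The $i$-th sample of the nuclear data set, $\mathbf{T}_i$, was generated from the probability distribution of Eq. (1). Namely, the obtained sample follows the multivariate normal distribution

$$\mathbf{T}_i \sim \mathcal{N}(\mathbf{T}_0, \mathbf{M}), \qquad (18)$$

and the $i$-th sample of the integral experimental data set, $\mathbf{R}_{e,i}$, was generated from the probability distribution of Eq. (2). Namely, the obtained sample obeys the multivariate normal distribution

$$\mathbf{R}_{e,i} \sim \mathcal{N}\left(\mathbf{R}_c(\mathbf{T}_i), \mathbf{V}_e + \mathbf{V}_m\right). \qquad (19)$$

Here, $i = 1, 2, \ldots, N$. We set the sample size $N = 10{,}000$ in this paper. The sample of the calculational values $\mathbf{R}_c(\mathbf{T}_i)$ was obtained by correcting $\mathbf{R}_c(\mathbf{T}_0)$ based on the linear relationship of Eq. (8), i.e.,

$$\mathbf{R}_c(\mathbf{T}_i) = \mathbf{R}_c(\mathbf{T}_0) + \mathbf{G}(\mathbf{T}_i - \mathbf{T}_0). \qquad (20)$$

Applying these samples to the adjustment formula of Eq. (5), the adjusted nuclear data set $\mathbf{T}'_i$ was evaluated as

$$\mathbf{T}'_i = \mathbf{T}_0 + \mathbf{M}\mathbf{G}^{T}\left(\mathbf{G}\mathbf{M}\mathbf{G}^{T} + \mathbf{V}_e + \hat{\mathbf{V}}_{m,i}\right)^{-1}\left(\mathbf{R}_{e,i} - \mathbf{R}_c(\mathbf{T}_0)\right). \qquad (21)$$

Note that, since there is no way to know the true covariance matrix $\mathbf{V}_m$ in the real world, $\hat{\mathbf{V}}_{m,i}$ was determined by minimizing ABIC for each sample, and $\mathbf{T}'_i$ was estimated by Eq. (21), which includes $\hat{\mathbf{V}}_{m,i}$.
To evaluate the performance of ABIC, the estimation error of the cross-section adjustment was defined as $\Delta\mathbf{T}_i \equiv \mathbf{T}'_i - \mathbf{T}_i$. Here, $\mathbf{T}_i$ was generated from the true covariance $\mathbf{M}$, and $\mathbf{T}'_i$ was evaluated from the samples generated from the true covariances and from the estimated covariance $\hat{\mathbf{V}}_{m,i}$. To facilitate the interpretation of results, the estimation error was projected onto a sensitivity vector $\mathbf{g}$ as

$$\Delta R_i \equiv \mathbf{g}^{T}\Delta\mathbf{T}_i. \qquad (22)$$

Here, $\mathbf{g}$ was the sensitivity of the criticality of the JSFR core listed in Table 1. This core had been proposed as a design example for a next-generation sodium-cooled fast reactor core. $\Delta R_i$ means the estimation error of the target core characteristic.
Moreover, the statistics of $\Delta R_i$ were investigated. The following sample mean and sample variance were also calculated:

$$\overline{\Delta R} = \frac{1}{N}\sum_{i=1}^{N}\Delta R_i, \qquad (23)$$

$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(\Delta R_i - \overline{\Delta R}\right)^2. \qquad (24)$$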
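The sampling experiment can be sketched on a toy scale as follows. The sketch assumes one plausible reading of the procedure: true nuclear data are drawn around the prior mean, experimental values are drawn around the linearly corrected calculational values, every sample is adjusted, and the error is projected onto a target sensitivity vector. For brevity it uses the true method covariance in the adjustment (the reference situation) rather than an ABIC-selected one. All sizes, matrices, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_nd, n_exp, n_samples = 4, 2, 20000     # toy sizes, far smaller than in the study

# Hypothetical "true" covariances and sensitivities
A = rng.normal(size=(n_nd, n_nd))
M = A @ A.T / n_nd + 0.5 * np.eye(n_nd)
G = rng.normal(size=(n_exp, n_nd))
Ve = 0.05 * np.eye(n_exp)
Vm = 0.10 * np.eye(n_exp)
g_target = rng.normal(size=n_nd)         # sensitivity of the target characteristic
T0 = np.zeros(n_nd)
Rc0 = np.zeros(n_exp)                    # calculated values at T0 (arbitrary reference)

# Sampling: true nuclear data, then experimental values around Rc(T_i)
T_true = rng.multivariate_normal(T0, M, size=n_samples)
Rc_true = Rc0 + (T_true - T0) @ G.T      # linear correction of Rc(T0)
noise = rng.multivariate_normal(np.zeros(n_exp), Ve + Vm, size=n_samples)
Re = Rc_true + noise

# Adjustment of every sample with the true Vm
S = G @ M @ G.T + Ve + Vm
gain = np.linalg.solve(S, G @ M).T       # M G^T S^{-1}
T_adj = T0 + (Re - Rc0) @ gain.T
M_adj = M - gain @ G @ M

# Projected estimation error and its statistics
err = (T_adj - T_true) @ g_target
sample_mean = err.mean()
sample_var = err.var(ddof=1)             # unbiased sample variance, 1/(N-1)
reference_var = g_target @ M_adj @ g_target
```

With the true covariances, `sample_var` should agree with `reference_var` up to sampling noise, and `sample_mean` should be close to zero.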

Method to evaluate the performance of ABIC
We describe here the method to evaluate the performance of ABIC using the results obtained from random sampling. The performance was evaluated using the property, described below, of the sample variance $s^2$ of the estimation error $\Delta R$ of the target core characteristic. If the selected covariance matrix is equal to the true one, i.e., if $\hat{\mathbf{V}}_m = \mathbf{V}_m$, it is expected that the sample $\Delta R_i$ follows a Gaussian distribution with a mean of 0 and a variance of $\mathbf{g}^{T}\mathbf{M}'\mathbf{g}$, where $\mathbf{g}$ is the sensitivity vector of the target core characteristic. Note that $\mathbf{M}'$ is the nuclear data covariance after adjustment evaluated from the true covariance matrix $\mathbf{V}_m$. Hereafter, we call $\mathbf{g}^{T}\mathbf{M}'\mathbf{g}$ the "reference variance." The distribution of $\Delta R$ is shown in Fig. 3. It is found that the distribution follows the theoretical distribution as expected. The values of the sample variance $s^2$ and the reference variance are also consistent when $\hat{\mathbf{V}}_m = \mathbf{V}_m$, as shown in Table 2.
If we misestimate $\mathbf{V}_m$, for example, when $\hat{\mathbf{V}}_m = 0.1\,\mathbf{V}_m$ and $\hat{\mathbf{V}}_m = 10\,\mathbf{V}_m$, the value of $s^2$ becomes larger than the reference variance, as shown in Table 2 and Figs. 4 and 5. The value of $s^2$ becomes minimum when $\hat{\mathbf{V}}_m = \mathbf{V}_m$. It is expected that the value of $s^2$ becomes larger as $\hat{\mathbf{V}}_m$ gets further away from the true one. In particular, the underestimation of $\mathbf{V}_m$ has a larger impact on $s^2$ than the overestimation of $\mathbf{V}_m$.
Based on this property, we interpret the covariance model showing the smallest $s^2$ (i.e., closest to the reference variance) as the best model. We investigate the performance of ABIC based on whether the best model can be found by minimizing ABIC or not. Namely, we check the consistency of the minimum points of ABIC and $s^2$.
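The property used above, that the variance of the estimation error is smallest when the assumed method covariance equals the true one, can also be checked analytically in the linear-Gaussian setting. If the gain is built from an assumed covariance while the data are generated with the true one, the error covariance is $(\mathbf{I}-\mathbf{K}\mathbf{G})\mathbf{M}(\mathbf{I}-\mathbf{K}\mathbf{G})^T + \mathbf{K}(\mathbf{V}_e+\mathbf{V}_m)\mathbf{K}^T$, where $\mathbf{K}$ is the gain. The sketch below evaluates this expression for scaled versions of the true covariance; the function name, the scale factors, and the toy matrices are hypothetical.

```python
import numpy as np

def error_variance(g, M, G, Ve, Vm_true, Vm_assumed):
    """Variance of the projected estimation error for a gain built from Vm_assumed."""
    S_assumed = G @ M @ G.T + Ve + Vm_assumed
    K = np.linalg.solve(S_assumed, G @ M).T          # gain M G^T S^{-1}
    I_KG = np.eye(M.shape[0]) - K @ G
    # Prior spread not removed by the adjustment + noise passed through the gain
    cov = I_KG @ M @ I_KG.T + K @ (Ve + Vm_true) @ K.T
    return g @ cov @ g

# Toy problem with hypothetical numbers
rng = np.random.default_rng(2)
n_nd, n_exp = 5, 3
A = rng.normal(size=(n_nd, n_nd))
M = A @ A.T / n_nd + 0.5 * np.eye(n_nd)
G = rng.normal(size=(n_exp, n_nd))
Ve = 0.05 * np.eye(n_exp)
Vm = 0.20 * np.eye(n_exp)
g = rng.normal(size=n_nd)

# Assumed covariance = c * Vm for an under-, exact, and over-estimate
variances = {c: error_variance(g, M, G, Ve, Vm, c * Vm) for c in (0.1, 1.0, 10.0)}

S = G @ M @ G.T + Ve + Vm
reference_var = g @ (M - M @ G.T @ np.linalg.solve(S, G @ M)) @ g
```

For the exact covariance the result reduces to the reference variance, and any other assumed covariance cannot give a smaller value, because the gain built from the true covariances is the minimum-variance linear gain.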

Assumption of covariance models
The assumptions of the covariance models $\tilde{\mathbf{V}}_m$ corresponding to the unknown true covariance $\mathbf{V}_m$ of the calculation method are listed in Table 3. Three kinds of covariance models were prepared for the numerical experiments. Since the true covariance is unknown in practice, assumptions for the covariance model are necessary. Therefore, in the numerical tests, the covariance models were prepared with some modifications to the true covariance $\mathbf{V}_m$. The difference between the first and second cases is whether $\tilde{\mathbf{V}}_m$ can reproduce the true covariance $\mathbf{V}_m$ or not by tuning the hyperparameter. In these cases, the number of hyperparameters was assumed to be constant. On the other hand, the third case had multiple hyperparameters.
The covariance structure and the singular values of the true covariance matrix used in the numerical experiments are shown in Figs. 6 and 7, respectively.

The case of $\tilde{\mathbf{V}}_m = \theta^2 \mathbf{V}_m$
As described above, in the first case, the covariance model $\tilde{\mathbf{V}}_m$ can reproduce the true covariance $\mathbf{V}_m$ by tuning the hyperparameter.
The histogram of the determined hyperparameter $\hat{\theta}^2$ for each sample is shown in Fig. 8. A singular distribution was observed around 0. The mean of $\hat{\theta}^2$ became somewhat smaller than the expected value, 1. These could be caused by restricting the range of the hyperparameter to positive values. The evaluated value of the sample variance $s^2$ of the estimation error $\Delta R$ is listed in Table 4 with the reference variance. The histogram of $\Delta R$ is shown in Fig. 9. The shape of the histogram of $\Delta R$ is very close to that of the reference one. In fact, the sample variance and the mean of the reference variance are close. Consequently, in the first numerical experiment, ABIC showed good performance in finding an appropriate covariance model.

The case of $\tilde{\mathbf{V}}_m$ including a noise matrix
In the second case, a noise matrix was employed in the covariance model $\tilde{\mathbf{V}}_m$. Unlike $\tilde{\mathbf{V}}_m$ given in the first case, $\tilde{\mathbf{V}}_m$ in this case cannot reproduce the true covariance matrix $\mathbf{V}_m$ due to the noise matrix, even if the hyperparameter is tuned. Here, the noise matrix is the product of a constant and a unit matrix; the constant, $8.6 \times 10^{\ldots}$, was set equal to the maximum singular value of $\mathbf{V}_m$.
The histogram of the estimation error $\Delta R$ is shown in Fig. 10, and Table 5 shows the evaluated values of the mean of ABIC and the sample variance $s^2$ of $\Delta R$. The result of the first case is also shown in this table for comparison. The agreement of the histogram with the reference distribution becomes worse compared to the first case. The sample variance $s^2$ also becomes larger than that of the first case. Correspondingly, the mean value of ABIC becomes larger. This result indicates that ABIC reflects the poorer covariance modeling of the second case compared to that of the first case. As far as this result is concerned, ABIC seems to adequately reflect the goodness of covariance modeling.
However, a more detailed analysis reveals that the performance of ABIC becomes poor when the covariance structure differs greatly from the true one. For the first and second cases, Figs. 11 and 12 show the changes in the mean of ABIC and the sample variance $s^2$ according to the hyperparameter change, respectively. The minimum points of the mean of ABIC and the sample variance $s^2$ are shown as the white and black arrows in these figures. Figure 11 represents the results when we assumed the covariance model of the first case. The minimum points of the sample mean of ABIC and the sample variance coincide in the first case. This coincidence shows that ABIC can find the optimal hyperparameter. On the other hand, Fig. 12 represents the results when we assumed the covariance model of the second case. In contrast to the result in Fig. 11, a mismatch of the minimum points was observed in the second case. This mismatch represents the difficulty in finding the optimal hyperparameters using ABIC. The result of the second numerical experiment implies that the mismatch tends to increase as the covariance structure differs more from the true one.

The case of $\tilde{\mathbf{V}}_m = \sum_{k=1}^{K} \theta_k^2\, \mathbf{u}_k \mathbf{u}_k^{T}$
In the third case, we investigated whether ABIC can choose an optimal number of hyperparameters. To clarify this, the singular vectors of the true covariance $\mathbf{V}_m$ were used for the covariance model $\tilde{\mathbf{V}}_m$. Here, a singular vector having a smaller index $k$ corresponds to a larger singular value.
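A covariance model built from singular vectors can be sketched as follows. The exact functional form used in the study is not recoverable from the text; the sketch assumes a sum of rank-one terms, each a squared hyperparameter times the outer product of a singular vector with itself, taken over the $K$ leading singular vectors. The function name and the toy reference matrix are hypothetical.

```python
import numpy as np

def singular_vector_model(Vm_ref, theta_sq):
    """Build sum_k theta_k^2 u_k u_k^T from the leading singular vectors of Vm_ref."""
    Vm_ref = 0.5 * (Vm_ref + Vm_ref.T)
    U, s, _ = np.linalg.svd(Vm_ref)      # singular values in descending order
    theta_sq = np.asarray(theta_sq, dtype=float)
    K = theta_sq.size                    # number of hyperparameters
    U_K = U[:, :K]
    return (U_K * theta_sq) @ U_K.T

# Hypothetical reference covariance with singular values 4, 2, 1, 0.5
rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
spectrum = np.array([4.0, 2.0, 1.0, 0.5])
Vm_ref = (Q * spectrum) @ Q.T

model_full = singular_vector_model(Vm_ref, spectrum)      # K = 4 reproduces Vm_ref
model_k2 = singular_vector_model(Vm_ref, spectrum[:2])    # K = 2 truncates it
truncation_error = np.linalg.norm(Vm_ref - model_k2, 2)   # spectral norm
```

With all singular vectors and hyperparameters equal to the singular values, the model reproduces the reference matrix; with fewer terms, the best achievable spectral-norm error is the first omitted singular value, which illustrates why a small $K$ cannot reproduce the true covariance.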
The results of the sample mean of ABIC and the sample variance $s^2$ of the estimation error $\Delta R$ for $K = 1$ to $8$, where $K$ is the number of hyperparameters, are shown in Fig. 13. In addition, one more case is also shown in this figure as the case of $K = 0$. As $K$ increases, the mean of ABIC decreases rapidly at first due to the decrease of $\mathcal{L}$ (the first log-likelihood term on the right-hand side of Eq. (16)), and then increases slowly due to the increase of $2K$ (the last penalty term on the right-hand side of Eq. (16)). The mean of ABIC becomes minimum when the reduction of the value of $\mathcal{L}$ is saturated. The sample mean of ABIC becomes minimum when $K = 4$. The sample variance $s^2$ becomes minimum when $K = 3$. The minimum points of the mean of ABIC and $s^2$ were almost consistent. Thus, the third numerical experiment demonstrates that ABIC can appropriately suggest the number of hyperparameters. In this case, ABIC and $s^2$ are larger than in the first case. This is due to the following reasons. If the covariance modeling is equal to the true covariance matrix $\mathbf{V}_m$, the sample variance $s^2$ is consistent with the reference variance. However, for smaller $K$, the true covariance matrix $\mathbf{V}_m$ cannot be perfectly reproduced by $\tilde{\mathbf{V}}_m$ even if the hyperparameters are tuned. For larger $K$, there are too many hyperparameters to estimate accurately.

Conclusions
To improve the reliability of the cross-section adjustment method, the use of ABIC was proposed. We derived the formula to efficiently evaluate ABIC which was represented by a lower-rank matrix to enable numerical experiments with large samples in a realistic computation time.
To evaluate the applicability of ABIC to the cross-section adjustment method, some simple numerical experiments were performed in this study. Through these numerical experiments, the relationship between ABIC and the sample variance $s^2$ (i.e., the variance of the estimation error of a core characteristic) was investigated. More specifically, the applicability of ABIC was discussed based on the consistency of the minimum points of ABIC and $s^2$. Consequently, it was confirmed that ABIC tends to select a covariance model with fewer hyperparameters and a smaller $s^2$. However, it was also found that this desirable property of ABIC is lost if the structure of the covariance model is far from the true one. This is a problem to be solved in future research when ABIC is applied to the cross-section adjustment method.