USING ARTIFICIAL NEURAL NETWORKS TO ACCELERATE TRANSPORT SOLVES

Discontinuous Finite Element Methods (DFEM) have been widely used for solving SN radiation transport problems in participating and non-participating media. In this method, small matrix-vector systems are assembled and solved for each cell, angle, energy group, and time step while sweeping through the computational mesh. In practice, these systems are generally solved directly by Gaussian elimination, since iterative acceleration techniques are often inadequate for such small systems. Nonetheless, the computational cost of assembling and solving these local systems, repeated for each cell in the phase-space, can amount to a large fraction of the total computing time. In this paper, a Machine Learning algorithm based on Artificial Neural Networks (ANNs) is designed to accelerate the solution of the local systems. Its key idea is to train an ANN with a large set of solutions to random one-cell transport problems and then replace the assembly and solution of the local systems with the feedforward evaluation of the trained ANN. It is observed that the optimized ANNs are able to reduce the compute time by a factor of ∼3.6 per source iteration, while introducing mean absolute errors between 0.5–2% in the transport solutions.


INTRODUCTION
Machine Learning techniques are slowly paving their way into computational science. Beyond the current hype, Machine Learning techniques have proven effective in compressing and optimizing the solution of large-scale computational problems [1]. Among these problems is the deterministic computational solution of radiation transport.
Historically, the Discontinuous Finite Element Method (DFEM) has been widely used for solving S_N radiation transport problems. In this method, the transport equation is discretized into a set of algebraic equations that must be solved for each spatial cell, angular direction, energy group, and time step. Each of these problems is called a one-cell problem. For example, for trilinear DFEM in 3D hexahedral cells, the one-cell problem consists of a system of 8 coupled algebraic equations. Reducing the problem to one-cell units is effective in reducing the memory requirements of the radiation transport solution. However, iterative accelerators, designed for large systems of algebraic equations, are usually not effective at the scale of the small local systems obtained for one-cell problems. Moreover, the computational cost of assembling and solving the one-cell problem, repeated for each phase-space cell, accounts for most of the computational time in DFEM radiative transport solves. This imposes an upper bound on the potential acceleration of these methods when a fixed computational power is available.
However, when analyzing the local systems obtained for one-cell problems, an underlying structure can be observed, arising from the symmetries and directionality of the radiative transport process. In the present work, a Machine Learning algorithm designed to exploit this structure for accelerating radiation transport solves is presented. The key idea is to replace the assembly and solution of each local system by a pre-trained ANN recursively evaluated during the sweeping process.
The remainder of this paper is organized as follows. First, the DFEM discretization of the radiation transport problem is presented, and the structure appearing in the local systems is analyzed by means of correlation matrices. Then, the design of a shallow neural network that exploits this structure to accelerate the solution of one-cell problems is discussed. Finally, key results obtained by means of this acceleration are presented.

DISCONTINUOUS GALERKIN FINITE ELEMENT METHOD IN TRANSPORT SWEEPS
The one-group S_N transport equation, with isotropic scattering and sources, is given by

$$\vec{\Omega}_d \cdot \vec{\nabla} \psi^d(\vec{r}) + \sigma_t \, \psi^d(\vec{r}) = \frac{q(\vec{r})}{4\pi}, \qquad (1)$$

with $d$ the direction index in the angular quadrature. The scalar flux is obtained via numerical integration: $\phi = \sum_d w_d \psi^d$, with $w_d$ the quadrature weight [2]. For simplicity, in Eq. (1) the total source (scattering source + extraneous source) $q$ has been introduced, as the scattering source is often lagged one iteration behind, and thus merged with the extraneous source, during a source iteration process. After the DFEM [3] spatial discretization of Eq. (1), the following local linear system for cell $K$ in the phase-space (space, direction, energy, and time) is obtained,

$$\Big( \sigma_t M_K + G_K^d + \sum_{f \in F^+} F_f^d \Big) \Psi^d_K = M_K \, q_K + \sum_{f \in F^-} F_f^d \, \Psi^{\uparrow}_f, \qquad (2)$$

where piecewise-constant material properties have been assumed. The set of faces of cell $K$ is split into faces where radiation is incoming or outgoing, $F^{\mp} = \{ f \in \partial K \ \text{s.t.} \ \vec{\Omega}_d \cdot \vec{n}_f \lessgtr 0 \}$. Here, $\Psi$ denotes the DFEM solution in the current cell, while $\Psi^{\uparrow}_f$ denotes the upwinded flux (transmitted through faces from the upwind cell neighbors). The local matrices are defined as

$$(M_K)_{ij} = \int_K b_i \, b_j \, dV, \qquad (G_K^d)_{ij} = -\int_K \big( \vec{\Omega}_d \cdot \vec{\nabla} b_i \big) \, b_j \, dV, \qquad (F_f^d)_{ij} = \int_f \big| \vec{\Omega}_d \cdot \vec{n}_f \big| \, b_i \, b_j \, dA.$$

The DFEM basis functions on cell $K$ are denoted by $b_i(\vec{r})$, with $(i, j) \in [1, N_K]$, where $N_K$ is the number of spatial degrees of freedom for cell $K$. The size of the local system is $N_K \times N_K$. Here, we restrict ourselves to trilinear basis functions on hexahedral cells, so that $N_K = 8$.
In a more general form, Eq. (2) can be written as

$$A^d_K \, \Psi^d_K = b^d_K, \qquad (7)$$

where $A^d_K$ is the coefficient matrix of the local system and $b^d_K$ is the right-hand-side vector containing the incident fluxes and source. During a regular solution process, the right-hand side $b^d_K$ is computed for each cell $K$ by updating the incident upwinded fluxes and the scattering source from the previous iteration, and then the system in Eq. (7) is solved by Gaussian elimination (GE). In this case, this results in 729 computational operations. For simplicity, we assume an equivalent computational cost for all operations, although divisions are known to be more computationally expensive in modern processors [4]. This operation count is the upper bound that the Machine Learning algorithm must undercut.
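For reference, the baseline per-cell solve can be sketched as a direct Gaussian elimination with a simple operation counter. This is a minimal sketch: the function name, the operation bookkeeping, and the random diagonally dominant test system are illustrative assumptions, not the paper's implementation or its exact 729-operation count (which also includes assembly).

```python
import numpy as np

def solve_local_system_ge(A, b):
    """Solve the local one-cell system A @ psi = b by Gaussian elimination
    with partial pivoting, counting arithmetic operations as a rough proxy
    for the per-cell cost discussed in the text."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    ops = 0
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))        # partial pivoting
        M[[k, p]] = M[[p, k]]
        for i in range(k + 1, n):
            factor = M[i, k] / M[k, k]; ops += 1   # one division
            M[i, k:] -= factor * M[k, k:]          # mult + sub per entry
            ops += 2 * (n + 1 - k)
    psi = np.zeros(n)
    for i in range(n - 1, -1, -1):                 # back substitution
        psi[i] = (M[i, n] - M[i, i + 1:n] @ psi[i + 1:]) / M[i, i]
        ops += 2 * (n - 1 - i) + 1
    return psi, ops

# an 8x8 system, matching the trilinear DFEM cell size N_K = 8
rng = np.random.default_rng(0)
A = rng.random((8, 8)) + 8.0 * np.eye(8)           # diagonally dominant stand-in
b = rng.random(8)
psi, ops = solve_local_system_ge(A, b)
```

The elimination-plus-substitution count alone stays below the quoted 729 operations; the remainder of that budget corresponds to assembling the system in the first place.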
The starting point for designing an accelerator based on a Machine Learning algorithm is to demonstrate that there exists a structure in problem (7) and, thus, that a simpler mapping between the inputs ($A^d_K$, $b^d_K$) and the output ($\Psi^d_K$) is achievable. In order to study the system's structure, a set of random one-cell problems has been generated. When generating each random problem, the internal sources, incident fluxes, total cross section, direction vector, and cell dimensions are sampled from a uniform random distribution. Then, Eq. (2) is assembled and solved, and the solution vector $\Psi^d_K$ is collected for each of these problems. A total of $10^5$ random problems is necessary to obtain statistical convergence of the second-order correlations among the components of the discretized system in Eq. (7). The system structure is therefore analyzed in light of these correlations. In particular, Pearson's correlations and Student's t-tests are performed as two complementary statistical measures of correlation among the data. The correlation coefficients obtained between the entries of the system matrix ($A$), the right-hand-side vector ($b$), and the solution vector ($\Psi$) are presented in Figure 1. In Figure 1 it can be observed that a large correlation exists among the coefficients of the system matrix. This is a consequence of the way in which this matrix is assembled in Eq. (2) and of the fact that there are more coefficients than random inputs in the system. Moreover, a non-negligible correlation exists among the components of the right-hand-side vector. This results from the structure maintained in the construction of the right-hand-side vector (incident fluxes + sources). Furthermore, structures appear when looking at the correlations of these two with the solution $\Psi$. This is expected, since the solution vector is univocally determined by the coefficient matrix and the right-hand-side vector.
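The correlation study can be sketched as follows. This is an illustrative stand-in: the local matrix below is a generic diagonally dominant random matrix rather than the actual DFEM assembly of Eq. (2), the sampling bounds are assumptions (the paper does not list them), and a reduced sample count is used.

```python
import numpy as np

def random_one_cell_problem(rng):
    """Sample one stand-in random one-cell problem and return the
    flattened (A, b, psi) triple whose entries are correlated."""
    A = rng.uniform(0.0, 1.0, (8, 8)) + 8.0 * np.eye(8)  # stand-in local matrix
    b = rng.uniform(0.0, 1.0, 8)                          # incident fluxes + source
    psi = np.linalg.solve(A, b)                           # "collected" solution
    return np.concatenate([A.ravel(), b, psi])            # 64 + 8 + 8 = 80 entries

rng = np.random.default_rng(42)
samples = np.array([random_one_cell_problem(rng) for _ in range(10_000)])

# Pearson correlation matrix between all entries of A, b, and psi,
# i.e. the kind of matrix visualized in Figure 1
corr = np.corrcoef(samples, rowvar=False)
```

Off-diagonal blocks of `corr` linking the $b$ and $\Psi$ entries exhibit the nontrivial structure the text describes, since $\Psi$ is a deterministic function of $(A, b)$.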
The main conclusion of this section is that there is structure in the local systems obtained from the DFEM discretization of radiation transport. The aim of the next section is to exploit this structure to reduce the number of operations currently required in the solution of one-cell problems.

MACHINE LEARNING APPROACH FOR ACCELERATING TRANSPORT SWEEPS
The key idea of this work is to replace the assembly and solution by GE of the system in Eq. (2) with a shallow Artificial Neural Network (ANN). Note that, during a sweeping process, the same trained ANN provides the solution for each cell in the phase-space. This means that the sweeping process is replaced by a staggered, recurrent evaluation of the ANN.
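This recurrence can be illustrated with a one-dimensional caricature of a sweep, in which each cell's outgoing flux becomes the next cell's incident flux. The per-cell solver below is an exact step-characteristic stand-in rather than a trained network; it and all names here are assumptions for illustration only.

```python
import numpy as np

def ann_cell_solve(psi_inc, q, sigma_t, dx):
    """Stand-in for the trained ANN: the exact 1-D cell solution
    (mu = 1) so that the recurrence is well defined. A real
    implementation would call the trained network here."""
    tau = sigma_t * dx                         # cell optical thickness
    return psi_inc * np.exp(-tau) + (q / sigma_t) * (1.0 - np.exp(-tau))

def sweep(psi_in, q, sigma_t, dx, n_cells):
    """Recurrent evaluation along the sweep direction: the output of
    one cell feeds the next, as in the staggered ANN evaluation."""
    psi, fluxes = psi_in, []
    for _ in range(n_cells):
        psi = ann_cell_solve(psi, q, sigma_t, dx)
        fluxes.append(psi)
    return np.array(fluxes)

# pure absorber, no source: the flux should attenuate monotonically
fluxes = sweep(psi_in=1.0, q=0.0, sigma_t=1.0, dx=0.1, n_cells=50)
```

The recurrent structure is also why any per-cell approximation error can build up along the sweep, a point revisited in the results.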
The architecture of this ANN is optimized in order to reduce the number of operations involved in its feedforward evaluation, while keeping a bounded error in the solution. The resulting ANN is presented in Figure 2. It consists of an ANN with only one hidden layer (a shallow ANN). Hyperbolic tangents have been used as activation functions in the hidden layer, while Exponential Linear Units (ELUs) have been used as activation functions in the output layer. The reason for the former is that hyperbolic tangents provide the mixed linear-exponential behavior, as a function of their inputs, that is required in the solution of transport problems. Moreover, ELUs can scale the outputs of the ANN without an upper bound and, thus, give the ANN the ability to represent any solution for the angular fluxes (unlimited positive range). The key to the acceleration lies in determining appropriate inputs for the ANN. Rather than feeding in the linear system coefficients (matrix and vector entries), the input layer takes as inputs the physical variables that determine the solution of the linear system. Specifically, these are: (i) the incident fluxes over the three incident faces of a sweeping octant (Ψ^d_inc), (ii) the total source term at the DFEM nodes within the cell (q), (iii) two independent direction cosines determining the sweeping angle (µ, η), (iv) the three factors determining the size of the hexahedral cell (∆x, ∆y, ∆z), and (v) the total cross section in the cell (σ_t). Thus, the resulting ANN has 26 inputs, instead of the 72 inputs that would be required if the coefficients of the system matrix and right-hand-side vector had been used.
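A minimal sketch of this feedforward pass is given below, with random weights in place of trained ones. The split of the 26 inputs (12 incident-face flux degrees of freedom, i.e. 4 per incident face; 8 nodal sources; 2 direction cosines; 3 cell sizes; 1 cross section) is inferred from the text and should be treated as an assumption.

```python
import numpy as np

def elu(x, alpha=1.0):
    """Exponential Linear Unit: linear for x > 0, saturating to -alpha below."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def shallow_ann(x, W1, b1, W2, b2):
    """Feedforward pass of the shallow network described in the text:
    26 physical inputs -> 8 tanh hidden neurons -> 8 ELU outputs
    (the 8 nodal angular fluxes of the cell)."""
    h = np.tanh(W1 @ x + b1)
    return elu(W2 @ h + b2)

rng = np.random.default_rng(1)
x = rng.random(26)                                  # the 26 physical inputs
W1, b1 = rng.standard_normal((8, 26)), np.zeros(8)  # untrained placeholder weights
W2, b2 = rng.standard_normal((8, 8)), np.zeros(8)
psi = shallow_ann(x, W1, b1, W2, b2)
```

With 8 hidden neurons, the dominant cost is the two small matrix-vector products (8×26 and 8×8), which is what makes the evaluation cheaper than assembly plus GE.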
To support this idea, the number of operations required for the general assembly and solution by Gaussian elimination of the system is compared against the number of operations required for the feedforward evaluation of the ANN in Figure 3. It is observed that the feedforward evaluation of the ANN may reduce the number of operations by a factor of ∼3.2 when using 8 neurons in the hidden layer, as shown in Figure 2. The number of neurons in this hidden layer was set by balancing the fitting error produced by solving the local systems with the ANN against the acceleration it provides.
At this point, three aspects that differ from classical Machine Learning approaches are necessary to obtain accurate solutions and improve the acceleration when building a system based on this ANN: (i) the appropriate definition of a regularized loss function for training the weights of the ANN, (ii) a suitable algorithm to randomly build the training set while avoiding the introduction of biases that can cause overfitting, and (iii) the application of an adapted DropConnect that accelerates the feedforward evaluation of the ANN. The first point is discussed here, since it is of paramount importance in this work. The reader is referred to our paper [5] for more information on the other two.
The training of the ANN for each direction is performed on a set of 9 × 10^4 solutions to one-cell problems, and the testing set consists of 1 × 10^4 equivalent ones. The training is performed by backpropagation using the Adam optimization algorithm. For training and testing, the regularized loss function is defined as

$$\mathcal{L} = \sum_{n} \big\| \Psi_n^{\mathrm{ANN}} - \Psi_n \big\|_2^2 + \lambda_1 \sum_{n} e^{-\gamma_1 \| \Psi_n \|_1} \big\| \Psi_n^{\mathrm{ANN}} - \Psi_n \big\|_1 + \lambda_2 \sum_{n} e^{-\gamma_2 \tau_n} \big\| \Psi_n^{\mathrm{ANN}} - \Psi_n \big\|_1^{\gamma_3}, \qquad (8)$$

where $\tau$ is a characteristic optical length of the cell, taken in this case as $\tau = \sigma_t \Delta / \mu$, with the cell size parameter $\Delta = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$ and $\mu$ the direction cosine in the x-direction. Furthermore, the parameters $\lambda_1$, $\lambda_2$, $\gamma_1$, $\gamma_2$, and $\gamma_3$ are fitted to obtain the required behavior of the ANN. The values used in this case are presented in Table 1.
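The Adam update rule used during training can be sketched in isolation. This is a generic illustration of the optimizer named in the text: the objective, learning rate, and iteration count below are placeholders, not the paper's training setup.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and its square (v), with bias correction, scale the step size."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)            # bias-corrected first moment
    v_hat = v / (1 - b2**t)            # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimize f(theta) = ||theta||^2 as a stand-in for the ANN loss
theta = np.array([1.0, -2.0, 0.5])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 5001):
    grad = 2.0 * theta                 # gradient of the stand-in objective
    theta, m, v = adam_step(theta, grad, m, v, t)
```

In the actual training loop, `grad` would be the backpropagated gradient of the regularized loss of Eq. (8) with respect to the network weights.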
The first term on the right-hand side of Eq. (8) is a classical quadratic loss function. The second is a Lasso regularization that assigns higher weights to errors produced by the ANN when the fluxes are small in magnitude. It is introduced because large relative errors in small components of the solution vector are under-penalized by the classical quadratic definition of the loss function. In our case, this under-penalization results in an error buildup when recurrently evaluating the ANN in a sweeping process. The third term is also a Lasso regularizer, assigning higher values to errors produced in optically thin problems. This regularizer steers the ANN toward solutions that compress the inputs of the hidden layer toward zero. The parameters in both regularizing terms have been fitted after a sensitivity study.
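One plausible realization of this three-term loss is sketched below. The exact functional form and the parameter values of Table 1 are not reproduced here; the exponential weightings and the sample parameter values are assumptions consistent with the verbal description (quadratic term, Lasso term up-weighting small-magnitude fluxes, Lasso term up-weighting optically thin cells).

```python
import numpy as np

def regularized_loss(psi_pred, psi_true, tau, lam1, lam2, g1, g2, g3):
    """Assumed form of the three-term loss described in the text, for a
    single training sample with cell optical length tau."""
    err = psi_pred - psi_true
    quad = np.sum(err**2)                                            # quadratic loss
    small_flux = lam1 * np.sum(np.exp(-g1 * np.abs(psi_true)) * np.abs(err))
    thin_cell = lam2 * np.exp(-g2 * tau) * np.sum(np.abs(err))**g3   # thin-cell penalty
    return quad + small_flux + thin_cell

psi_true = np.array([1.0, 0.5, 0.01, 0.2])
# perfect prediction gives zero loss; hypothetical parameter values
loss0 = regularized_loss(psi_true, psi_true, tau=0.1,
                         lam1=1.0, lam2=1.0, g1=5.0, g2=1.0, g3=1.0)
```

With this form, a fixed error on a small-magnitude flux or in an optically thin cell costs more than the same error elsewhere, which is precisely the behavior the regularizers are meant to enforce.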
In the following section, the performance of a trained ANN in replacing the solution of one-cell problems is analyzed for a benchmark case.

RESULTS
In this paper, we present the results of applying the trained ANNs presented in the previous section to benchmark Problem 1 proposed in [6]. The geometry of the benchmark problem is shown in Figure 4. The problem has one internal source (Region I), followed by a void region (Region II) and a thick absorber (Region III). The parameters for the three regions are shown in Table 2. This benchmark case was selected because the ANN can be tested in limiting cases with very different properties, maximizing the probability of error buildup in the recurrent evaluation of the ANN during a transport sweep. The 3D domain was discretized into 100 × 100 × 100 cells.
Four different ANNs were trained, for the S_2, S_4, S_6, and S_8 angular quadratures. The average error for each discretization is shown in Table 3. These errors are larger than the ones obtained during the training and testing of the ANN on one-cell problems (∼10^-4). This indicates that an error buildup exists when recurrently evaluating the ANN. However, due to the regularizers in Eq. (8), the error buildup is small enough to yield acceptable solutions. Subsequent work will focus on techniques to further reduce this error. A large speed-up in computing time is obtained when employing the ANN. This speed-up is produced by two factors. First, the reduction in the number of operations required with the ANN (a factor of 3.2). Second, the more rapid convergence of the solutions produced with the ANNs during source iterations, due to the noise in the connections of the ANN, which acts with a diffusive character when recursively evaluating the network (a factor of 4.9). The reader is referred to our paper [5] for more detailed testing cases for this ANN.

CONCLUSION
In the present work, an ANN is proposed for accelerating radiation transport solves. It replaces the solution of the local systems for the one-cell problems obtained in the DFEM discretization of radiation transport. Thus, the sweeping process used to solve radiation transport is replaced by the recursive evaluation of the ANN. The ANN structure has been optimized by taking as inputs a reduced set of parameters that fully determines the solutions of the one-cell problems. Moreover, accurate solutions are obtained by applying Lasso regularizers to the loss function when training the ANN. In this paper, the results obtained with an ANN trained on 9 × 10^4 random one-cell radiative transfer problems and tested on 1 × 10^4 equivalent ones are presented. This ANN is then introduced in a large-scale 3D benchmark problem, where it is observed that replacing the solution of one-cell problems by the recurrent evaluation of the ANN introduces errors between ∼1–2%, while providing an acceleration factor of ∼15–16. Our future work is oriented toward improving the error of the ANNs, developing hybrid ANN-exact solutions, and using the ANNs as low-order operators in High-Order Low-Order schemes in radiative transfer.

Table 3: ANN results for the benchmark problem [6] with c = 0.999.