Rapid Evaluation of Particle Properties using Inverse SEM Simulations

The characteristic X-rays produced by the interactions of the electron beam with the sample in a scanning electron microscope (SEM) are usually captured with a variable-energy detector, a process termed energy dispersive spectrometry (EDS). The purpose of this work is to exploit inverse simulations of SEM-EDS spectra to enable rapid determination of sample properties, particularly elemental composition. This is accomplished using penORNL, a modified version of PENELOPE, and a modified version of the traditional Levenberg–Marquardt nonlinear optimization algorithm, which together are referred to as MOZAIK-SEM. The overall conclusion of this work is that MOZAIK-SEM is a promising method for performing inverse analysis of X-ray spectra generated within an SEM. In its current form, MOZAIK-SEM has been shown to calculate the elemental composition of an unknown sample to within a few percent of the actual composition.


EXECUTIVE SUMMARY
This report is the final deliverable of a 3-year project whose purpose was to investigate the possibility of using simulations of X-ray spectra generated inside a scanning electron microscope (SEM) as a means to perform quantitative analysis of the sample imaged in the SEM via an inverse analysis methodology. On the nine-point Technology Readiness Level (TRL) scale typically used by the US Department of Defense (DOD) and the National Aeronautics and Space Administration (NASA), this concept is now at a TRL of 3. In other words, this work has proven the feasibility of the concept, which is ready to be investigated further to address some of the issues highlighted by this initial proof of concept.
The work performed during the first and second year of this project is not discussed in great detail in this report because that work was all building to the inverse analysis methodology that is described in this report. References are provided that document the work performed during these initial 2 years. The most significant accomplishment during these years was the development of penORNL. In order to simulate the X-ray spectra generated inside a SEM, this project needed a fast running coupled electron / photon transport code that accurately simulated electron and photon physics at energies from hundreds of eV to tens of keV. The Monte Carlo transport code PENELOPE provided the requisite physics, but it was modified to run in parallel, add new tally and source capabilities, and improve the available variance reduction techniques. The result of these modifications is penORNL. With penORNL fully operational, the effort to implement an inverse analysis methodology to perform inverse quantitative analysis could begin.
Traditional quantitative analysis compares the X-ray spectrum produced in an SEM by an unknown sample to the spectra generated by well-characterized elemental or simple binary standards. The ratio of characteristic X-ray intensities of the unknown sample to the known standard is equal to the ratio of the elemental weight fraction of the unknown sample to the known standard. This relationship is used to determine the weight fraction of each element within the unknown sample. To account for the fact that the elements in the unknown sample are in a compound or mixture with other elements, unlike most of the standards, empirical matrix corrections are applied to the ratio of characteristic X-ray intensities. These empirical matrix corrections improve the results of the quantitative analysis.
At a very high level, the purpose of this project is to obviate the need for well-characterized elemental or simple binary standards and to replace empirical matrix corrections with detailed first-principles simulation of the electrons and photons within the unknown sample. Some refer to this as "standardless" quantitative analysis. The Initial Guess Module (IGM) of the inverse analysis methodology replaces the use of standards in this work. The IGM compares the measured X-ray spectrum for the unknown sample to a database of known characteristic X-rays and determines what elements might be in the unknown sample. Then the IGM compares the intensity of the characteristic X-rays in the measured spectrum from the unknown sample to simulations of X-ray spectra of pure elemental samples. The resulting elemental composition serves as the initial guess of the solution required by the inverse analysis methodology. This initial guess is similar to traditional quantitative analysis results without any matrix corrections. With an initial guess of the elemental composition, the inverse analysis can begin.
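The two tasks attributed to the IGM above can be illustrated with a minimal sketch. This is not the IGM implementation: the function names, the matching tolerance, and the final normalization to unit sum are assumptions made for illustration, and the line table holds only a few approximate K-alpha energies.

```python
# Hypothetical sketch of an initial-guess step; not the actual IGM code.

# Approximate K-alpha line energies in keV for a few elements.
K_ALPHA_KEV = {
    "O": 0.525,
    "F": 0.677,
    "Al": 1.487,
    "Si": 1.740,
    "P": 2.014,
    "Ca": 3.692,
    "Fe": 6.404,
    "Cu": 8.048,
}


def identify_elements(peak_energies_kev, tolerance_kev=0.03):
    """Return elements whose K-alpha line lies within tolerance of a peak."""
    found = []
    for element, line in K_ALPHA_KEV.items():
        if any(abs(peak - line) <= tolerance_kev for peak in peak_energies_kev):
            found.append(element)
    return found


def initial_guess(measured_peak_counts, pure_element_counts):
    """Ratio of measured to pure-element peak intensity for each element.

    No matrix correction is applied. Normalizing the ratios so that they
    sum to 1 is an assumption of this sketch.
    """
    ratios = {el: measured_peak_counts[el] / pure_element_counts[el]
              for el in measured_peak_counts}
    total = sum(ratios.values())
    return {el: r / total for el, r in ratios.items()}


# Peaks near 0.52, 1.75, and 3.70 keV match the O, Si, and Ca entries.
found = identify_elements([0.52, 1.75, 3.70])
guess = initial_guess({"Si": 300.0, "O": 100.0}, {"Si": 1000.0, "O": 500.0})
```

In practice the pure-element intensities would come from simulated spectra of pure elemental samples, as described above; here they are made-up numbers.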
The aforementioned matrix effects arise when characteristic X-rays produced by one element in an unknown sample cause a characteristic X-ray of a different element in the sample to be emitted. Therefore, the intensity of a characteristic photopeak is dependent on the amount (weight fraction) of every element in the unknown sample, but is primarily dependent on the amount of the element that produces the photopeak. The contributions to each characteristic photopeak by other elements (matrix effects) are a second-order effect but are not negligible. Therefore, the dependence of the characteristic X-ray's intensity on the amount of each element in the sample is nonlinear, which led to the choice of the Levenberg-Marquardt nonlinear optimization algorithm. The nonlinearity of this system requires a new X-ray spectrum to be calculated for each perturbation of the elemental composition; in other words, a penORNL simulation is required for each perturbation of the elemental composition. Furthermore, Levenberg-Marquardt requires the partial derivatives of the spectrum, and if these are calculated using a difference approximation, then an additional penORNL calculation must be performed for each element under consideration. This approach would be prohibitive since each new estimate of the elemental composition would require 2N penORNL simulations for an unknown sample with N elements. In order to reduce the computational overhead of this methodology, a simplifying assumption was made: the spectrum can be broken down into two parts, the photopeaks and the X-ray background. Furthermore, the intensity of each photopeak is assumed to depend only on the amount of the element that produces that characteristic X-ray, and the background only on the scattered X-rays in the sample. As described above, these dependencies hold approximately but not exactly, because of matrix effects.
This simplifying assumption reduces the number of penORNL calculations required. The resulting inverse analysis methodology, which is referred to as MOZAIK-SEM, has two levels of iterations, inner and outer iterations. At the beginning of each outer iteration, a penORNL simulation is performed to calculate the X-ray spectrum. This is followed by a number of inner iterations, applying Levenberg-Marquardt, that minimize the difference between the measured and calculated photopeak intensities, which is really a linear problem. The new elemental composition produced by the inner iterations is modeled by penORNL at the beginning of the next outer iteration, which updates the background portion of the X-ray spectrum and fully models the matrix effects. The X-ray spectrum that is produced by each penORNL simulation is compared to the measured X-ray spectrum to determine if MOZAIK-SEM satisfies the given convergence criterion for the optimum solution. Even though matrix effects are ignored during the Levenberg-Marquardt inner iterations, they are modeled by each outer iteration.
Verification simulations with MOZAIK-SEM revealed that this methodology can successfully identify the weight fractions of low-Z elements such as oxygen and fluorine, which is difficult for existing quantitative analysis methods. However, like other quantitative analysis methods, this method can also suffer when X-ray line energies for multiple elements are very close together. A problem unique to this method is in distinguishing small photopeaks due to trace elements from the Monte Carlo statistical noise in the X-ray background calculated by penORNL. These small photopeaks can be indistinguishable from the background if the Monte Carlo uncertainty of the background is large, that is, if the background Monte Carlo uncertainty is of the same order of magnitude as the weight fraction of a trace element. Simulating more particle histories during the inverse analysis helps reduce this issue, but that increases the computational burden in time or additional processors. Finally, a validation simulation with MOZAIK-SEM was less successful than the verification simulations, at least in part due to vaguely defined conditions of the experimental measurement. A number of prominent differences exist between the measured spectrum provided by the National Institute of Standards and Technology (NIST) and the spectrum simulated by penORNL. In the end, no source of these differences can be absolutely identified, so it is impossible to state if these differences are due to the measurements, MOZAIK-SEM methodology, or penORNL physics.
For the MOZAIK-SEM methodology to be improved, two issues need to be addressed. First, a better method to deal with X-ray line energies that are very close together should be investigated. The method as it exists now works fairly well, but there is clearly room for improvement. Second, and more importantly, additional validation simulations should be performed, but this should be done as part of a much closer collaboration between the experimentalist and modeler. For the validation effort to be as successful as possible, future work should include an energy calibration of the SEM-Energy Dispersive Spectroscopy (EDS) detector; an energy calibration of the SEM electron source; and details of the geometry, including the sample, detector, and internal configuration of the SEM sample chamber.

INTRODUCTION
Analysis of small particles and material surfaces has progressed rapidly with the development of the scanning electron microscope (SEM) over the past half-century. Images of surface features, heterogeneous phases, and other morphological information can be acquired at much higher resolution than with traditional (optical) microscopes. The X-rays produced by the interactions of the electron beam with the sample are usually captured with a variable-energy detector, a process termed energy dispersive spectrometry (EDS). Analysis by SEM-EDS provides additional information with which to characterize the sample by identifying the electron transitions occurring in it. This last feature, often of secondary importance to SEM analysts, forms the basis for the work under this project.
Measuring the chemical composition and crystallographic phase of small samples using X-rays in SEM instruments is termed electron probe microanalysis (EPMA). By focusing an electron beam to a diameter of a few nanometers and measuring X-rays fluoresced from the sample when it is struck by fast electrons, an analyst can determine the elements present (qualitative analysis) with a spatial resolution well below 1 µm. Depending on the electron beam energy used, most of the periodic table can be reliably detected this way. The problem of accurately measuring the relative elemental abundances (quantitative analysis) is more difficult and has only been conducted acceptably on samples that are flat, highly polished, and chemically uniform. When these conditions are met, the chemical composition can be measured with great precision by comparison with known standards that have been carefully prepared. However, most real-world samples do not conform to these restrictions, including the vast majority of those encountered in nonproliferation work. Hence, it is desired to obtain both qualitative and quantitative analyses without using standards and without meticulous sample preparation.
The purpose of this project is to exploit inverse simulations of SEM-EDS spectra to enable rapid determination of sample properties, including both qualitative and quantitative chemical analysis. This requires a method to model the electron interactions with small samples that occur in an SEM and the subsequent generation of X-rays, which will be provided by Monte Carlo (MC) simulations of the process by which SEM-EDS spectra are measured. The underlying physics of this three-dimensionally complex signal-generation process is reasonably well understood, and MC codes have been used to predict the distribution of both scattered electrons and emitted X-rays. That is, for samples of known composition and simple geometry, code inputs can be devised to reproduce the X-ray spectra that are observed. This work endeavors to advance such simulation capability to evaluate progressively more complex samples and to provide rapid and automated analysis.
This project used a phased approach to develop and test the requisite methods needed to accomplish these goals. Several existing MC codes for coupled electron/photon transport were tested for accuracy of the physics models and superior computational performance. A sensitivity study was performed to determine the important parameters that contribute to the SEM-EDS spectra, which provided insight about their contributions to inverse SEM-EDS simulations. The PENELOPE code [1] was identified as the best platform for further development, and numerous modifications were performed during the course of this project; the modified code is referred to as penORNL [2]. All of this work has been previously documented and is summarized in the subsections that follow.
The remainder of this report (Sections 2-4) focuses on the last stage of this project, which was to develop an inverse algorithm to identify sample elemental composition from a measured SEM-EDS spectrum. As stated before, this task requires a code to model coupled electron/photon transport physics, which in this case is penORNL. Also presented are computational results for the purposes of verification and validation of the inverse tool and analysis of the tool's performance, followed by recommendations for future work.

EVALUATION OF EXISTING COUPLED ELECTRON/PHOTON MONTE CARLO TRANSPORT CODES
Early on it was decided to evaluate only MC codes for this project to eliminate the approximations that are involved with deterministic transport codes. The codes evaluated were DTSA-II [3], MCNP versions 5 and 6 [4,5], ITS6 [6], and PENELOPE. Table 1 summarizes the strengths of each code [7]. The features evaluated were the availability of a graphical user interface (GUI); the available variance reduction techniques; the ability of the code to simulate low-energy physics; and whether the code could perform adjoint coupled electron/photon transport, run in parallel, and tally pulse heights. In Table 1 PENELOPE receives two checkmarks for variance reduction because of its superior variance reduction technique that biases the production of characteristic X-rays.
When this project began, MCNP6 had not been released, so the only codes evaluated with the requisite low-energy coupled electron/photon transport physics were DTSA-II and PENELOPE. PENELOPE was selected as the code of choice primarily because of its speed, which can be attributed to its transport algorithm [8] and unique variance reduction techniques for electron transport [9]. Once MCNP6 was released, its performance was compared with that of PENELOPE. Although its physics was improved, the execution speed of MCNP6 was still noticeably slower than that of PENELOPE. A comparison in Ref. 7 shows a case in which PENELOPE runs between 1.5 and 3 times faster than MCNP6. Note also that the last column of Table 1 compares these codes with penORNL, a modification of PENELOPE developed as part of this work and described in Section 1.3.

SENSITIVITY ANALYSIS OF SEM-EDS SPECTRA
An analysis was performed to determine which parameters associated with the samples imaged in an SEM are most important to the X-ray spectra generated. This sensitivity analysis [10] considered composition- and data-dependent parameters such as elemental weight percent, density, stopping power, and electron shell transition probabilities. It also considered geometry-dependent parameters such as surface roughness and material heterogeneity.
This sensitivity analysis provided some insight about which parameters will be impactful during an inverse simulation to determine the material composition based on a measured SEM spectrum. It was shown that some materials with very similar compositions can be distinguished, but this requires very small MC statistical uncertainties. For samples larger than the interaction volume of the SEM electrons (dimensions on the order of the electron range), mass density was not a major concern because there was little sensitivity to density for these large samples. The X-ray spectra were sensitive to the stopping power and electron shell transition probabilities, but they were more sensitive to the transition probabilities. Both of these parameters constitute physical data and are not computational parameters that will be adjusted during the inverse analysis. However, uncertainties in these input data can reduce the accuracy of the SEM MC simulations and therefore the accuracy of the inverse analysis.
The simulations of varying heterogeneity showed no issues that would preclude the practical application of inverse methods to simulations of SEM X-ray spectra if the heterogeneity/inclusion is properly identified and included in the geometry model of the sample. An example of this is illustrated in Fig. 1 and Fig. 2. This example is a U metal sample with a small Cu inclusion, as shown in Fig. 1. Fig. 2 shows the respective X-ray spectra simulated for a pure U metal sample and when the SEM electron source is placed at the four source positions labeled A, B, C, and D in Fig. 1. The spectra in Fig. 2 clearly illustrate how the characteristic X-ray lines change when the SEM electron source is centered on the Cu versus the U. The curves in Fig. 2 have been offset by different amounts to make each spectrum easy to see. If the presence of the element Cu is omitted from the penORNL computation, then the inverse simulation will not include any copper peaks in the sample spectrum. However, when the actual SEM focal point is at source position A or B, then Cu lines will appear in the spectrum, and there will be a significant difference between the measured and simulated spectra.
It was shown that a reasonable amount of surface roughness causes no significant issues when SEM X-ray spectra are being simulated. The scanning/raster pattern of the electron beam directs electrons onto the "peaks and valleys" of the rough sample. This scanning pattern creates characteristic X-rays in the peaks and valleys, so the measured and simulated X-ray spectrum is an average over the entire surface. Thus surface roughness is not a major issue if the surface roughness dimensions are small compared to the size of the bulk sample; the averaging effect minimizes the effect of surface roughness on both the measured and simulated X-ray spectra.

DEVELOPMENT OF PENORNL
Section 1.1 discussed the coupled electron/photon transport codes considered for this project and noted that the PENELOPE code was selected as a platform for further development. PENELOPE has only been developed as a serial code, since the primary focus of the developers was the electron/photon physics modeling. They assumed that most users would write their own main program or driver program, but they did provide two simple examples of main programs. Because MC codes can achieve massive speedups using multiple processors, a new main program was written to create a parallel version of PENELOPE. This new parallel version of the code is named penORNL and has been discussed in detail in Ref. 2. The development of penORNL created a code that had all the most important features mentioned in Table 1: low-energy physics and variance reduction from PENELOPE plus the parallel capability and pulse-height tallies available in other codes.
[Fig. 1. U metal sample with a small Cu inclusion, showing source positions A, B, C, and D.]
The type of parallel implementation in penORNL is common for MC codes and relies on domain replication and uses a master-slave approach. In domain replication, the same problem is replicated on a number of processors, and an MC simulation is performed on each replica by running some portion of the problem. In the master-slave approach, one process is allocated as the master process to set up the problem, establish communication among the processes, broadcast data for domain construction on each process, gather computed data from the other processes, and control the other processes for a successful parallel execution. The remaining processes are allocated as slave processes to perform only particle transport calculations on the given problem domain. The slave processes only communicate with the master process and not with the other slave processes. The master process can also perform particle transport calculations to improve the process utilization by decreasing the idle time of the node running the master process.
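The domain-replication pattern described above can be sketched in a few lines. This is only an illustration and not how penORNL is implemented: the function names and the toy "physics" are hypothetical, and Python threads stand in for the separate processes a production MC code would use. The master splits the histories, gives each replica its own random stream, and reduces the tallies returned by the workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_replica(seed, n_histories):
    """Worker: simulate n_histories on one replica and return local tallies.

    The physics is a stand-in: each history scores with probability 0.25.
    """
    rng = random.Random(seed)  # separate random stream for each replica
    hits = sum(1 for _ in range(n_histories) if rng.random() < 0.25)
    return hits, n_histories


def run_replicated(total_histories, n_workers, base_seed=12345):
    """Master: split the histories, dispatch the replicas, reduce the tallies."""
    share, extra = divmod(total_histories, n_workers)
    loads = [share + (1 if i < extra else 0) for i in range(n_workers)]
    seeds = [base_seed + i for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(run_replica, seeds, loads))
    hits = sum(h for h, _ in results)
    histories = sum(n for _, n in results)
    return hits / histories, histories


estimate, histories = run_replicated(40000, 4)
```

Because the workers never exchange data with each other, the only communication cost is the final reduction, which is why this pattern tends to scale well for MC transport.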
A few unique additional features were implemented in penORNL. Some of these are only of interest to this project or others simulating X-ray spectra in an SEM. However, some are more general and will be interesting to any user of penORNL or PENELOPE.
1. A new source module was written. This module defines a few sources that are specific to simulating an SEM.

2. A tally module was created, primarily to enhance the parallel capability of penORNL.
3. A pulse-height tally was added as a new tally type to the tally module. This tally is useful when simulating the detailed energy response of any detector.
4. One of the unique variance reduction techniques in PENELOPE was updated and corrected. This technique is an interaction-forcing mechanism that increases the production of characteristic X-rays; the correction was needed because the original implementation in PENELOPE violated energy conservation.
The use of penORNL can greatly speed up the inverse analysis required by this project, as discussed in Sections 2 and 3. The code has been shown to scale nearly linearly with an increasing number of processors, which is somewhat dependent on the problem being simulated [2]. It is not uncommon to perform inverse analysis using several hundred processors, and penORNL allows that analysis to be performed much more quickly.

DETECTOR RESPONSE FUNCTION
Before the inverse SEM simulations are performed, it is necessary to discuss the detector response function that is applied to MC simulation results. The detector response function simulates the Gaussian energy broadening inherent in all types of photon detectors but is absent from MC simulations.
To compare MC simulation results to experimental data from the SEM-EDS process, a detector response function is used to generate the detector pulse-height spectrum from the photon line spectrum calculated by the MC simulation of the X-rays incident on the detector. The detector response function is defined as the pulse-height distribution for any incident monoenergetic X-ray, indicated by $R(E',E)$, where $E'$ is the incident photon energy and $E$ is the pulse-height energy. All our calculations used a typical Si(Li) detector response, which was imported from the PENELOPE code package:

$$R(E',E) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(E-E')^2}{2\sigma^2}\right], \qquad (1)$$

where $\sigma$ is the standard deviation of the photopeak and its relation to the full width at half maximum is given as $\mathrm{FWHM} = 2.35\,\sigma$. To relate the detector photopeak standard deviation to the X-ray energy, a nonlinear model of $\sigma$ as a function of $E'$, with two adjustable coefficients, is used. A set of calculations was performed to optimize these two coefficients to obtain better agreement between the spectra computed by penORNL and the measured spectra. Table 2 presents the optimal values of these coefficients. The pulse-height spectrum $d(E)$ then follows from the computed line spectrum $s(E')$ through the convolution integral

$$d(E) = \int R(E',E)\, s(E')\, dE'. \qquad (3)$$
In reality, discrete energy bins are used to describe detector responses, and the continuous response function in Eq. (1) is replaced by the response matrix $\mathbf{R} = (R_{ij})$, with elements $R_{ij} = R(E'_j, E_i)$. The convolution integral in Eq. (3) is likewise replaced by the discretized form

$$d_i = \sum_{j} R_{ij}\, s_j,$$

which can be expressed in matrix form as

$$\mathbf{d} = \mathbf{R}\,\mathbf{s}, \qquad (4)$$

where $\mathbf{R} = (R_{ij})$, and $\mathbf{s} = (s_1, s_2, \ldots, s_N)^T$ and $\mathbf{d} = (d_1, d_2, \ldots, d_N)^T$ are vectors of computed line spectra and detector response spectra, respectively.
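As an illustration of the discretized response, the sketch below builds a Gaussian response matrix and applies it to a single-line spectrum. It is a sketch under stated assumptions, not the PENELOPE response: the broadening model sigma = sqrt(a + b*E) and its coefficients are placeholders chosen for illustration (the fitted model and the values in Table 2 are not reproduced here), and each column is normalized so that counts are conserved.

```python
import math


def sigma_kev(energy_kev, a=0.0012, b=0.0004):
    """ASSUMED broadening model sigma = sqrt(a + b*E); coefficients are made up."""
    return math.sqrt(a + b * energy_kev)


def response_matrix(bin_centers_kev):
    """R[i][j]: fraction of photons emitted in bin j that are recorded in bin i."""
    n = len(bin_centers_kev)
    R = [[0.0] * n for _ in range(n)]
    for j, e_in in enumerate(bin_centers_kev):
        s = sigma_kev(e_in)
        column = [math.exp(-0.5 * ((e_out - e_in) / s) ** 2)
                  for e_out in bin_centers_kev]
        norm = sum(column)  # normalize each column so counts are conserved
        for i in range(n):
            R[i][j] = column[i] / norm
    return R


def broaden(R, line_spectrum):
    """Discrete convolution d_i = sum_j R[i][j] * s_j."""
    return [sum(r_ij * s_j for r_ij, s_j in zip(row, line_spectrum)) for row in R]


def fwhm(sigma):
    """FWHM of a Gaussian: 2*sqrt(2*ln 2)*sigma, roughly 2.35*sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


# 250 bins of 0.02 keV; a single line with 1000 counts in the bin at 1.75 keV.
centers = [0.01 + 0.02 * i for i in range(250)]
line = [0.0] * 250
line[87] = 1000.0
R = response_matrix(centers)
d = broaden(R, line)
```

The broadened spectrum `d` keeps the total of 1000 counts and peaks in the same bin as the input line, with a width set by the assumed sigma model.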
A measured SEM-EDS spectrum and a typical MC line spectrum computed by penORNL for the same standard (apatite) are shown in Fig. 3(a). After the convolution operation was performed, the computed spectrum took on a more realistic shape, as presented in Fig. 3(b). The difference (%) shown in Fig. 3(b) between the measured spectrum and the convolved spectrum is mostly due to an improperly estimated bremsstrahlung background. Slight differences between the simulation model and the actual measurement conditions can cause differences in the bremsstrahlung background; therefore, such a difference is observed after the convolution process.
Fig. 4 is an example of how the shape of the X-ray spectrum can change for a given sample with different SEM electron energies. This figure presents the X-ray spectrum measured by the National Institute of Standards and Technology (NIST) for one of their standard glasses, K497. The measured data were reported as using 25 keV electrons. Two simulations were performed, at 25 keV (green curve) and 20 keV (red curve). Most of the peak locations in the spectra agree, but the slope of the measured background does not agree with either of the computed spectra. This difference could be due to a number of causes, including the detector response, the calibration of the source energy, use of a monoenergetic source in the simulations, and the physics models in penORNL. Additional data about the measurement, such as the results of a source energy calibration and detector calibration, are needed to fully explain the discrepancies. Unfortunately, these details are no longer available.

INVERSE ANALYSIS OF SEM-EDS SPECTRA
The most common measurement technique used to detect X-rays in SEM machines is Energy Dispersive Spectroscopy (EDS). It detects X-rays of any energy emitted from the sample due to fluorescing interactions of the electron beam. When inner-shell electrons in the sample are ejected, outer-shell electrons transition to the vacancies, which results in the emission of X-rays characteristic of the parent element. Detection and measurement of the X-ray energy permit elemental analysis. EDS can provide rapid qualitative and, with adequate standards, quantitative analysis of elemental composition with a sampling depth of a few microns. X-rays may also be used to form maps or line profiles, showing the elemental distribution in a sample surface.
Traditional SEM-EDS analysis uses properly calibrated standards to identify the elements in an unknown sample. Usually, X-ray energies from the unknown sample shown in the EDS spectrum are compared with known characteristic X-ray energy values to determine the presence of an element in a sample. In these analyses, elements with an atomic number as low as that of beryllium can be detected. Quantitative results can be obtained from peak heights, i.e., the relative X-ray counts at the characteristic energy levels for the sample constituents. In this process, first the continuum X-ray background signal is removed by processing the spectra of the specimen and standards, so that the measured intensities consist only of the characteristic signal. Subsequently, X-ray intensity ratios are calculated by using the sample intensity and the standard specimen intensity for each element present in the unknown sample. The measured intensity ratios are then taken as the ratios of the mass (weight) fractions of each constituent in the sample and in the standard, from which the weight fractions in the sample are determined. Some semiquantitative results are readily available without standards by using mathematical corrections based on the analysis parameters and the sample composition. One of the major disadvantages of these methods is the approximation inherent in these correction models.
An MC simulation of the complete response of an EDS is interesting from various points of view. Even simple MC models can provide quantitative analysis by taking the major fluorescence lines of each element into account and dispensing with simulation of scattering phenomena altogether. A more complete MC simulation of an EDS also covers the scattering of the primary radiation and includes second- and higher-order effects such as the enhancement of fluorescence lines by higher-energy fluorescence or scattered radiation. As a result, it is possible to predict the complete spectral response of an SEM-EDS spectrometer.
A significant advantage that the quantification scheme based on MC simulation has over traditional analysis methods is that the simulated spectrum can be compared directly to the experimental data in its entirety, taking into account not only the fluorescence line intensities but also the scattered background of the EDS spectrum. This is coupled with the fact that MC simulations are not limited to first- or second-order approximations or to ideal geometries.
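The background removal and intensity-ratio steps of the traditional procedure can be sketched as follows. The function names are hypothetical, and the flat background estimated from side windows is an assumption chosen for brevity; real analysis software uses more careful background models and applies matrix corrections afterward.

```python
def net_peak_counts(spectrum, lo, hi, width=3):
    """Counts in channels lo..hi-1 minus an estimated continuum background.

    The background per channel is taken as the mean of `width` channels on
    each side of the peak window (an assumed, deliberately simple model).
    """
    left = spectrum[lo - width:lo]
    right = spectrum[hi:hi + width]
    bg_per_channel = (sum(left) + sum(right)) / (len(left) + len(right))
    return sum(spectrum[lo:hi]) - bg_per_channel * (hi - lo)


def weight_fraction(sample_net, standard_net, standard_fraction=1.0):
    """First-order estimate C_sample = (I_sample / I_standard) * C_standard."""
    return sample_net / standard_net * standard_fraction


# Made-up spectra: a three-channel peak on a flat background.
sample = [10.0] * 5 + [60.0, 110.0, 60.0] + [10.0] * 5
standard = [20.0] * 5 + [120.0, 320.0, 120.0] + [20.0] * 5

sample_net = net_peak_counts(sample, 5, 8)
standard_net = net_peak_counts(standard, 5, 8)
fraction = weight_fraction(sample_net, standard_net)  # pure-element standard
```

With these numbers the net intensities are 200 and 500 counts, so the uncorrected estimate of the weight fraction is 0.4.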
In this section, we introduce our methodology for inverse analysis of SEM-EDS spectra and present some verification and validation results.

THEORY
Our inverse methodology is a multilevel iterative search algorithm that seeks a solution for the parameter vector to minimize the differences between the measured SEM-EDS X-ray spectrum and the computed spectrum of the unknown sample. The elements of the parameter vector correspond to the elemental inventories (mass or mole fractions) of the sample. This minimization operation is performed over the entire spectrum rather than using only characteristic X-ray peaks. This approach could reveal extra information from the background if the characteristic X-ray peaks of any element from the unknown sample do not appear as large peaks in the spectrum.
At a high level, the inverse analysis of a measured SEM-EDS spectrum proceeds via the following steps:
1. An initial guess of the elemental composition must be provided. Usually, this initial guess includes all the possible elements within the unknown sample. If an element is not contained in this initial guess, it will not be in the final solution. Effectively, this step constitutes a qualitative analysis of the spectrum.
2. The initial elemental composition is used as input to penORNL to calculate an SEM-EDS spectrum for the specified elemental composition. If the difference between the calculated and measured spectra is less than a prescribed tolerance, the elemental composition is assumed to be optimal and the analysis stops. Since specific elemental inventories have been provided, this step is a quantitative analysis. For the rest of this report, this step is referred to as a step or outer iteration. Thus a new step always begins each time a new SEM-EDS spectrum calculation begins.
3. If the convergence criterion in Step 2 is not met, then the elemental inventories are modified via an iterative algorithm to minimize the difference between the calculated and measured spectra. When some prescribed convergence criterion is met, then an updated elemental composition has been determined. Each of these iterations is referred to as an inner iteration or just an iteration.
4. To ensure that the new elemental composition is indeed the best result, this procedure returns to Step 2 to repeat the MC simulation using the elemental composition produced by Step 3 as the input in Step 2. In other words, a new step (outer iteration) is started.
The described procedure sometimes calculates negative values for the elemental composition. As a remedy, the search algorithm replaces negative values with very small positive values ($10^{-5}$) and continues the calculations.
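The control flow of the steps above can be sketched with a toy forward model. This is not MOZAIK-SEM: `simulate_peaks` is a hypothetical stand-in for a penORNL run with made-up sensitivities and matrix-effect coefficients, only peak intensities (not the full spectrum) are compared, and the inner Levenberg-Marquardt iterations are replaced by a simple rescaling that treats each peak as proportional to its own element.

```python
# Toy stand-in for the expensive transport simulation; all numbers are made up.
SENSITIVITY = [100.0, 80.0, 60.0]   # peak counts per unit weight fraction
ENHANCE = [[0.00, 0.10, 0.05],      # ENHANCE[i][j]: boost of peak i by element j
           [0.02, 0.00, 0.08],
           [0.04, 0.03, 0.00]]


def simulate_peaks(w):
    """Peak intensities for weight fractions w, including a weak matrix effect."""
    n = len(w)
    return [SENSITIVITY[i] * w[i]
            * (1.0 + sum(ENHANCE[i][j] * w[j] for j in range(n)))
            for i in range(n)]


def invert(measured, guess, tol=1e-8, max_outer=50):
    """Return (weight fractions, number of outer iterations used)."""
    w = list(guess)
    for outer in range(1, max_outer + 1):
        computed = simulate_peaks(w)            # Step 2: full simulation
        misfit = sum((m - c) ** 2 for m, c in zip(measured, computed))
        if misfit < tol:                        # converged: stop
            return w, outer
        # Step 3, simplified: each peak is assumed proportional to its own
        # element only, so the composition is rescaled peak by peak.
        w = [wi * m / c for wi, m, c in zip(w, measured, computed)]
        # Replace any non-positive value with a very small positive one.
        w = [wi if wi > 0.0 else 1e-5 for wi in w]
    return w, max_outer                         # Step 4 is the loop itself


true_w = [0.5, 0.3, 0.2]
measured = simulate_peaks(true_w)
estimate, n_outer = invert(measured, [1 / 3, 1 / 3, 1 / 3])
```

Although the inner update ignores matrix effects, each outer iteration re-runs the full model, so the neglected coupling is corrected as the loop proceeds. In this toy problem the composition is recovered in a handful of outer iterations.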
Details of the inverse analysis algorithm and additional definitions are presented in the following sections. The inner iteration in Step 3 is performed using the traditional Levenberg-Marquardt optimization algorithm. The algorithm itself is presented first and is followed by a description of how it has been modified for this project and an evaluation of the algorithm's performance.

Traditional Iterative Levenberg-Marquardt Nonlinear Optimization
The minimization problem for the SEM-EDS inverse calculations can be expressed as a sum-of-squared-error minimization with the following objective function:

$$E(\mathbf{p}) = \sum_{i} \left[ y_i - f_i(\mathbf{p}) \right]^2 , \tag{5}$$

which is minimized with respect to the parameter vector $\mathbf{p}$. Also, $y_i$ is the measured spectrum at channel $i$, and $f_i(\mathbf{p})$ is the computed spectrum for the given parameter vector at channel $i$; $f_i(\mathbf{p})$ can be computed from the MC line spectrum using Eq. (4). In other words, to compute $f_i(\mathbf{p})$ for a given parameter vector, a physics calculation must be performed using penORNL.
The problem described in Eq. (5) is a nonlinear least-squares problem and can be solved by the Levenberg-Marquardt nonlinear optimization algorithm [11,12]. After differentiating Eq. (5) and setting derivatives to zero, the method constructs an update strategy that includes terms from the Gauss-Newton quadratic approximation and the first-order steepest-descent method:

$$\left[ (\mathbf{J}^{(k)})^{T} \mathbf{J}^{(k)} + \lambda^{(k)} \mathbf{I} \right] \boldsymbol{\delta}^{(k)} = (\mathbf{J}^{(k)})^{T} \left[ \mathbf{y} - \mathbf{f}^{(k)} \right] , \tag{6}$$

where $k$ is the iteration number, $\lambda^{(k)}$ is the damping factor at iteration $k$, $\boldsymbol{\delta}^{(k)}$ is the increment/step vector at iteration $k$, which is used to estimate the parameter vector for the next iteration (iteration $k+1$), $\mathbf{y}$ is the measured spectrum, $\mathbf{f}^{(k)} = \mathbf{f}(\mathbf{p}^{(k)})$ is the computed spectrum at iteration $k$, $\mathbf{p}^{(k)}$ is the parameter vector at iteration $k$, and $\mathbf{J}^{(k)}$ is the Jacobian matrix at iteration $k$, whose rows are the gradients (with respect to $\mathbf{p}$) of the elements of $\mathbf{f}$, i.e., $\mathbf{J} = (J_{ij}) = \partial f_i / \partial p_j$.
Equation (6) can be solved for the increment vector, $\boldsymbol{\delta}^{(k)}$, and the parameter vector is updated for the next iteration using the following formula:

$$\mathbf{p}^{(k+1)} = \mathbf{p}^{(k)} + \boldsymbol{\delta}^{(k)} . \tag{7}$$
During the iterative search process, the damping factor is updated in the usual way (an increase or decrease based on whether the objective function was better or worse in the previous iteration). This procedure is repeated until a best parameter vector is found to satisfy the given minimization criteria. As stated, the Jacobian matrix depends nonlinearly on the parameter vector and so must be recalculated at each iteration. Furthermore, the derivatives must also be calculated, and if this is done using difference approximations, then an additional physics calculation must be performed for each element under consideration. Clearly, this approach would be prohibitive, since each computed solution involves a complete MC calculation.
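The cost argument above can be made concrete with a minimal sketch of a traditional Levenberg-Marquardt loop that builds its Jacobian by forward differences. Everything here is illustrative: the two-parameter exponential `toy` model stands in for a full MC calculation, and the names `CountingModel`, `lm_finite_difference`, and the step size `h` are assumptions rather than part of MOZAIK-SEM. The damping term uses the identity-matrix form of Eq. (6).

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def sse(y, f):
    """Sum of squared errors between measured and computed spectra."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, f))

class CountingModel:
    """Counts model evaluations; each one stands in for a full MC run."""
    def __init__(self, fn):
        self.fn = fn
        self.calls = 0
    def __call__(self, p):
        self.calls += 1
        return self.fn(p)

def lm_finite_difference(model, y, p0, iters=20, lam=1e-3, h=1e-6):
    p = list(p0)
    n, m = len(p), len(y)
    f = model(p)
    for _ in range(iters):
        # Forward-difference Jacobian: one extra model evaluation per parameter.
        cols = []
        for j in range(n):
            q = list(p)
            q[j] += h
            fq = model(q)
            cols.append([(fq[i] - f[i]) / h for i in range(m)])
        # Normal equations: (J^T J + lam I) delta = J^T (y - f).
        A = [[sum(cols[a][i] * cols[b][i] for i in range(m)) + (lam if a == b else 0.0)
              for b in range(n)] for a in range(n)]
        g = [sum(cols[a][i] * (y[i] - f[i]) for i in range(m)) for a in range(n)]
        delta = solve(A, g)
        trial = [p[j] + delta[j] for j in range(n)]
        f_trial = model(trial)  # one more evaluation for the trial point
        if sse(y, f_trial) < sse(y, f):
            p, f, lam = trial, f_trial, lam / 10.0  # better: relax damping
        else:
            lam *= 10.0                             # worse: increase damping
    return p

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

def toy(p):
    """Hypothetical nonlinear model, not an SEM-EDS spectrum."""
    return [p[0] * math.exp(-p[1] * x) for x in xs]

y_ref = toy([2.0, 0.5])
p_start = [1.0, 1.0]
model = CountingModel(toy)
p_fit = lm_finite_difference(model, y_ref, p_start)
evaluations = model.calls  # 1 initial + 20 iterations * (2 parameters + 1 trial)
```

With two parameters and 20 iterations, this loop performs 61 model evaluations. If each evaluation were a penORNL run taking minutes on 64 cores, and the parameter vector held a dozen or more elements, the total cost would grow accordingly.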

Modified Algorithm for SEM-EDS
Although the traditional optimization algorithm described in Sect. 3.1.1 is a reliable approach, it is not feasible for the scope of this project because it requires multiple physics calculations at each iteration step. Hence a modified version of the search procedure has been developed that utilizes a linearization of the computed solution. In this methodology, the MC line spectrum (from a penORNL simulation) is decomposed into two components and linearized:

$$s_i(\mathbf{p}^{(k)}) \approx b_i(\mathbf{p}^{(0)}) + \sum_{j} \frac{p_j^{(k)}}{p_j^{(0)}} \, \ell_{ij}(\mathbf{p}^{(0)}) , \tag{8}$$

where $s_i$ is the MC line spectrum in channel $i$, $b_i$ is the background spectrum, $\ell_{ij}$ is the line spectrum in channel $i$ due to element $j$, and the summation is taken over all elements considered. As noted by the dependence on $\mathbf{p}^{(0)}$, the background spectrum and the line spectrum remain constant through all inner iterations. The only quantity that changes is $p_j^{(k)}$, which is the updated inventory of element $j$ at iteration $k$. Inherent in this formulation are the following assumptions:
• The characteristic line intensities of the spectrum are approximately proportional to the amount of each element in the sample (parameter vector). As a result, the characteristic line spectrum can be expressed with a linear dependence on the parameter vector.
• The background component does not change with the parameter vector.
For each inner iteration, the formulation in Eq. (8) can be expressed as the matrix equation

$$\mathbf{s}^{(k)} \approx \mathbf{b} + \mathbf{L} \, \tilde{\mathbf{p}}^{(k)} , \tag{9}$$

where $\mathbf{b}$ is the background vector and $\mathbf{L}$ is the matrix whose columns are the characteristic spectral line values for each element under consideration. The vector $\tilde{\mathbf{p}}^{(k)}$ has elements $\tilde{p}_j^{(k)} = p_j^{(k)} / p_j^{(0)}$, i.e., the ratios of elemental inventories at iteration $k$ to those specified initially. Thus, $\mathbf{L} \tilde{\mathbf{p}}^{(k)}$ is the contribution to the computed line spectrum due directly to the presence of elemental inventories. Note that Eq. (9) constitutes a linear approximation to the computed line spectrum. Thus, it is possible to approximate an MC line spectrum for the updated parameter vector without performing additional penORNL simulations. The method assumes that an initial penORNL simulation has already been performed using the initial vector $\mathbf{p}^{(0)}$, which provides the background spectrum and the characteristic line spectrum. Even though this is a linear approximation, the nonlinear optimization scheme has been retained in case it is needed for any future developments.
Inserting Eq. (9) into Eqs. (4), (5), and (6) yields a procedure to solve the nonlinear least-squares problem iteratively without performing many additional penORNL simulations. Using the definition of $\mathbf{s}^{(k)}$ from Eq. (9) yields corresponding representations for the computed spectrum, the objective function, and the Jacobian. Thus, due to the linearization from Eq. (8), the Jacobian is now independent of the parameter vector (i.e., the elemental inventories) and therefore is constant throughout the iterative procedure at each step.
Also as a result of the linearization, the modified Levenberg-Marquardt iteration requires only a single physics calculation (penORNL simulation) at the beginning of each step. However, the linearization also implies that a globally optimal solution may not be achieved at the end of the search process. Rather, it only provides a best estimate for a given penORNL simulation because the background is assumed constant. To correct this problem, we perform another penORNL calculation using the optimal parameter vector from the previous step. This allows updating of the background and characteristic line spectra, after which a new optimal vector is determined using the modified (linearized) search process. This multilevel search process is repeated until no further improvement is obtained.
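A minimal sketch of the linearization of Eqs. (8) and (9) is given below. It works on the MC line spectrum only and omits the detector response of Eq. (4). The function names, the four-channel arrays, and the two-element inventories are hypothetical values chosen so the arithmetic is easy to follow.

```python
def linearized_spectrum(background, lines, p, p0):
    """Estimate the line spectrum for inventories p from one simulation at p0.

    background[i] : background counts in channel i (held fixed)
    lines[j][i]   : characteristic-line counts in channel i due to element j at p0
    """
    spectrum = list(background)
    for j, line_j in enumerate(lines):
        ratio = p[j] / p0[j]  # inventory of element j relative to the initial one
        for i, counts in enumerate(line_j):
            spectrum[i] += ratio * counts
    return spectrum

def constant_jacobian(lines, p0):
    """d(spectrum_i)/d(p_j) = lines[j][i] / p0[j]; it does not depend on p."""
    n_channels = len(lines[0])
    return [[lines[j][i] / p0[j] for j in range(len(p0))] for i in range(n_channels)]

# Hypothetical data: element 0 emits in channel 1, element 1 in channel 2.
background = [1.0, 1.0, 1.0, 1.0]
lines = [[0.0, 4.0, 0.0, 0.0],
         [0.0, 0.0, 6.0, 0.0]]
p0 = [0.5, 0.5]

estimate = linearized_spectrum(background, lines, [0.75, 0.25], p0)
jacobian = constant_jacobian(lines, p0)
```

Because the Jacobian depends only on the stored line spectra and the initial inventories, it can be built once per outer step and reused for every inner iteration.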
An outline of the multilevel search algorithm is depicted in Fig. 5. Starting with the fixed sample, detector/electron beam geometry, and an initial parameter vector (discussed below), the search algorithm performs a penORNL calculation to simulate the SEM-EDS process. Then, the algorithm computes the sum of the squares of the difference between the computed spectrum and the experimental data. If the desired convergence criterion is satisfied, then the calculation is terminated. Otherwise, the current parameter vector and corresponding computed spectrum are used as input for the modified Levenberg-Marquardt search process to seek a better solution vector. Another penORNL simulation is initiated using the updated elemental compositions, and the above procedures are repeated until the best solution vector is found.
In this methodology, there are two iteration loops: (1) outer iteration (or step), in which the penORNL calculation is performed, and (2) inner iteration (or just iteration), which is the iteration of the modified Levenberg-Marquardt search process. The characteristic line contribution to the spectrum is updated during the inner iterations of loop (2) through the linear dependence on the parameter vector, whereas the background component is only updated after a penORNL calculation is performed at the start of each outer iteration of loop (1). This arrangement might degrade the overall performance for some cases in which the background contribution is significantly higher than the contribution of the characteristic lines. Another problem arises if the characteristic lines of two or more elements are scored in the same channel of the MC line spectrum. To minimize the effects of this issue, the line emission probabilities from the penORNL simulation are provided to the search algorithm in order to make a first-order estimate of the intensity of the characteristic lines within the same channel of the MC line spectrum.
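The two nested loops can be sketched as follows for a two-element problem. This is a toy under stated assumptions, not the MOZAIK-SEM implementation: `toy_physics` is a hypothetical three-channel stand-in for a penORNL run in which the background and the line yields depend weakly on composition, the detector response is omitted, and the tolerance, iteration counts, and damping schedule are illustrative.

```python
def toy_physics(p):
    """Hypothetical stand-in for one penORNL run: returns (background, lines)."""
    b = 1.0 + 0.5 * p[0]  # background shifts with composition
    background = [b, b, b]
    lines = [
        [10.0 * p[0] * (1.0 + 0.2 * p[1]), 0.0, 0.0],  # element 0 peak, channel 0
        [0.0, 8.0 * p[1] * (1.0 + 0.1 * p[0]), 0.0],   # element 1 peak, channel 1
    ]
    return background, lines

def spectrum(background, lines, ratios):
    return [background[i] + sum(r * line[i] for r, line in zip(ratios, lines))
            for i in range(len(background))]

def sse(y, f):
    return sum((a - b) ** 2 for a, b in zip(y, f))

def inner_lm(y, background, lines, p_start, iters=20, lam=1e-3):
    """Inner iterations on the linearized model; no physics calls are made."""
    m = len(y)
    # Constant Jacobian of the linearized model (two elements only).
    J = [[lines[j][i] / p_start[j] for j in range(2)] for i in range(m)]
    p = list(p_start)
    for _ in range(iters):
        f = spectrum(background, lines, [p[j] / p_start[j] for j in range(2)])
        a00 = sum(J[i][0] ** 2 for i in range(m)) + lam
        a11 = sum(J[i][1] ** 2 for i in range(m)) + lam
        a01 = sum(J[i][0] * J[i][1] for i in range(m))
        g0 = sum(J[i][0] * (y[i] - f[i]) for i in range(m))
        g1 = sum(J[i][1] * (y[i] - f[i]) for i in range(m))
        det = a00 * a11 - a01 * a01
        trial = [p[0] + (g0 * a11 - a01 * g1) / det,
                 p[1] + (a00 * g1 - a01 * g0) / det]
        f_trial = spectrum(background, lines, [trial[j] / p_start[j] for j in range(2)])
        if sse(y, f_trial) < sse(y, f):
            p, lam = trial, lam / 10.0
        else:
            lam *= 10.0
    return p

def multilevel_search(y, p_init, physics, tol=1e-12, max_steps=30):
    p = list(p_init)
    for step in range(1, max_steps + 1):
        background, lines = physics(p)  # one physics call per outer step
        if sse(y, spectrum(background, lines, [1.0, 1.0])) < tol:
            return p, step
        p = inner_lm(y, background, lines, p)
        # Non-positive inventories are replaced with a small positive value.
        p = [v if v > 0.0 else 1e-5 for v in p]
    return p, max_steps

p_true = [0.866, 0.134]  # roughly the Pb and S weight fractions in PbS
bg_ref, lines_ref = toy_physics(p_true)
y_ref = spectrum(bg_ref, lines_ref, [1.0, 1.0])
p_fit, steps_used = multilevel_search(y_ref, [0.5, 0.5], toy_physics)
```

Each outer step refreshes the background and line spectra at the current estimate, so the error introduced by holding them fixed during the inner iterations shrinks from step to step.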

Fig. 5. Outline of the iterative search process.
The modular code framework MOZAIK-SEM was developed to perform the inverse calculations described above. It is patterned after MOZAIK, an existing code package that enables automatic geometry optimization for nuclear applications [13]. One of the benefits of this framework is its modularity, which allows module content to be replaced, depending on the requirements. For example, penORNL and its auxiliary modules (preprocessor and postprocessor modules) can be replaced with another simulation code without modifying the MOZAIK-SEM framework. Similarly, our optimization methodology can be replaced with another approach. This flexibility allows us to test several other options within the same framework for this project.

INITIAL TESTING OF MOZAIK-SEM
A set of sample problems (Table 3) was devised to test the MOZAIK-SEM components and the various options for its use with inverse calculations for SEM-EDS. In each case, a reference (sample) spectrum was computed by penORNL. Using this reference spectrum as the measured spectrum, inverse analysis was performed to estimate sample compositions.
For the reference simulations, cubic samples measuring 50 µm on all sides were placed 2 cm away from the electron beam source, and X-ray spectra were tallied for a generic Si(Li) detector. Calculations were performed with 1 billion source particles (electrons) to reduce the statistical noise in the X-ray spectra. Energy bin widths were set to 10 eV to match the typical channel width in SEM-EDS detectors. Fig. 6 shows the reference spectra computed by penORNL for these three samples: PbS, SrWO4, and apatite. In each spectrum, X-ray peaks and the background are shown with minimal noise.

Fig. 6. Reference spectra for the three samples used to test the functions of MOZAIK-SEM.
Results from MOZAIK-SEM calculations for the three samples are presented in the next sections. However, we first note the possible options for generating initial composition vectors.
• Case 1: Uniform initial guesses (identical weight fractions for each element).
• Case 2: Random initial guesses (the initial weight fraction of each element was generated randomly).
• Case 3: Extreme initial guesses (the weight fraction for one element was set very close to 1, and the others were randomly generated close to 0).
• Case 4: Additional elements (2-13 randomly selected elements were added to the initial parameter vector).
Cases 1 and 2 were used to demonstrate that MOZAIK-SEM's results were independent of the initial guess. Case 3 was used to test whether or not the methodology could recover the elements with very low initial weight fractions. In Cases 1 through 3, only elements known to be in the sample were included in the initial guess. Case 4 was designed to test if the methodology could minimize elements in the parameter vector that are not in the reference sample; it represents a more realistic calculation in which the elements of a sample are unknown. The selection operation could be either an automated process or based on user insight. Section 3.3 discusses some of the automation methodology for the initial guess selection.
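The four initialization cases can be sketched as small generator functions. The sampling ranges, the normalization to unit sum, the candidate `pool`, and the function names are assumptions made for illustration; the report does not specify how the random values were drawn.

```python
import random

def normalize(weights):
    """Scale weight fractions so they sum to 1 (an assumption of this sketch)."""
    total = sum(weights.values())
    return {el: w / total for el, w in weights.items()}

def uniform_guess(elements):
    """Case 1: identical weight fractions for each element."""
    return normalize({el: 1.0 for el in elements})

def random_guess(elements, rng):
    """Case 2: randomly generated weight fractions."""
    return normalize({el: rng.uniform(0.05, 1.0) for el in elements})

def extreme_guess(elements, dominant, rng):
    """Case 3: one element very close to 1, the others random and close to 0."""
    weights = {el: rng.uniform(1e-4, 1e-3) for el in elements if el != dominant}
    weights[dominant] = 1.0 - sum(weights.values())
    return weights

def with_additional_elements(elements, pool, n_extra, rng):
    """Case 4: add n_extra randomly selected elements, then start uniform."""
    candidates = [el for el in pool if el not in elements]
    extras = rng.sample(candidates, n_extra)
    return uniform_guess(list(elements) + extras)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = ["Pb", "S"]
pool = ["Al", "Ca", "Cr", "Cu", "Fe", "Mg", "Mn", "Ni", "O",
        "Si", "Ti", "Zn", "Zr", "Pb", "S"]  # hypothetical candidate list

case1 = uniform_guess(sample)
case2 = random_guess(sample, rng)
case3 = extreme_guess(sample, "S", rng)
case4 = with_additional_elements(sample, pool, 13, rng)
```

Here `case4` holds 15 elements with identical weight fractions, which mirrors the starting point described for the PbS Case 4 calculation.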
Test calculations were performed on 64 computational cores (2 nodes, each having 4 CPUs with 8 cores, 4 GB memory/core) with $10^7$ source particles (electrons). Such simulations result in relatively small uncertainties in the peaks and 1% to 3% uncertainties in the background for the tests discussed above. Some additional calculations were performed using $10^8$ source particles, which reduced the background uncertainties further and improved the accuracy of inverse calculations. Results from these calculations are presented in the following subsections.

Test of MOZAIK-SEM with PbS Sample
Analysis of the PbS sample was performed for the initialization Cases 1-3, with results shown in Table 4. These results show that the MOZAIK-SEM calculations produce excellent estimates of the reference values for both elements, with relative error less than 2.7%. Validity of results does not appear to depend on initialization values. Table 5 presents some performance results for the MOZAIK-SEM calculations with the PbS sample. For these cases, the penORNL calculation at each step took less than 3 min on 64 computational cores. As a result, inverse analysis for these three cases was completed in less than 1 h. Comparisons between the reference spectrum and the spectrum computed by MOZAIK-SEM for Case 3 are given in Fig. 7. At the beginning of the search process (first outer iteration), there are significant differences between the two spectra, as shown in Fig. 7(a). The results presented in Fig. 7(b) indicate that MOZAIK-SEM essentially found the correct weight fractions of Pb and S with about 2.2% error in the test sample in five inner iterations, and then it started fine-tuning with additional outer iterations. After 18 outer iterations (steps), the optimal solution vector was attained that reduces the difference between the computed and reference spectra significantly, as seen in Fig. 7(c). Variation of the parameter vectors (weight fractions of Pb and S in the sample) at each outer iteration is shown in Fig. 7(d). The first three cases were used to demonstrate that our inverse calculation methodology is feasible for quantitative analysis. The chemical elements were assumed to be known, and the procedure searched for the optimal weight fractions of these two elements.
Case 4 demonstrates the code performance with a more difficult problem. Thirteen additional elements were randomly selected and were added to the initial parameter vector, and then inverse analysis was performed. Results of this calculation (given in Table 6) show that MOZAIK-SEM estimates very small amounts of the additional elements and estimates the amount of Pb and S in the PbS sample with 1% relative differences compared to the reference weight fraction values. Fig. 8 shows the comparison of reference and computed spectra at the beginning of the calculation (15 elements of identical amounts), and at the end of the calculation (13 elements have been discarded, and primarily Pb and S remain). Initially there is significant difference between the reference and computed spectra due to several characteristic X-ray emissions from all 15 elements. MOZAIK-SEM concludes that X-ray lines from Pb and S minimize the differences between the reference and computed spectra, so it discards the other elements by limiting their weight fractions to small numbers. This test case demonstrates both qualitative and quantitative analysis of SEM-EDS samples.

Test of MOZAIK-SEM with SrWO4 Sample
MOZAIK-SEM calculations with a SrWO4 sample were also performed using the initialization Cases 1-3. This sample is important because correctly determining the oxygen weight fraction is sometimes difficult for experimentalists. Usually, electron probe microanalysis (EPMA) can quantitatively analyze elements from fluorine (Z = 9) to uranium (Z = 92) at routine levels as low as 100 ppm. Therefore, testing our methodology with a sample that contains oxygen (Z = 8) indicates that elements with even lower atomic numbers could be analyzed. Table 7 presents the simulation results estimated by MOZAIK-SEM using $10^7$ source particles and indicates that the oxygen was estimated with less than 3% relative error. Weight fractions of the other two elements were also estimated with less than 4% relative error. However, absolute assays of all elements were determined within 1% regardless of initialization case. The Case 1 calculation was repeated using $10^8$ source particles to lower the MC statistical error in the low-energy region (< 1 keV). Increasing the number of particles results in noticeably better estimation for oxygen, as shown in Table 8. While the estimate for W shows slightly higher error, the estimates for both Sr and W are also excellent predictions. Fig. 9 shows the results from the MOZAIK-SEM calculations at the initial and final steps, which converged after six steps (outer iterations). The computed spectrum at the initial step shows considerable deviation from the reference spectrum; however, the two spectra are indistinguishable after step 6.

Test of MOZAIK-SEM with Apatite Sample
MOZAIK-SEM calculations with an apatite sample were also performed for initialization Cases 1-3. This sample contains both oxygen and fluorine, which have characteristic X-ray emissions below 1 keV. Moreover, the weight fraction of fluorine is much smaller than the weight fraction of the other elements, and its characteristic X-ray peak energy overlaps that of oxygen (which makes the fluorine peak difficult to identify), as shown in Fig. 10. Table 9 presents details of each calculation and the results estimated by MOZAIK-SEM. Elemental compositions are estimated with less than 5% relative error except for fluorine, whose relative errors range from 8% to 18%. This is mainly due to (1) peak overlap with oxygen and (2) possible MC statistical uncertainty in the computed fluorine peaks because the amount of fluorine is small compared to the other elements. Table 10 presents the results from repeating the Case 3 calculation with $10^8$ source particles and indicates improvements for all elements. Most notably, the amount of F is now estimated within about 5% relative error, which is an excellent result for a sample of only 0.037 weight fraction. It is important to note that even for the results using $10^7$ source particles (Table 9), all absolute assays are determined within 1.5%.
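The distinction between relative error and absolute assay error can be illustrated with the fluorine numbers quoted above. The sketch assumes that "absolute" refers to the error expressed in weight-fraction units, which is an interpretation rather than a definition given in the report; the function name is hypothetical.

```python
def absolute_error(weight_fraction, relative_error):
    """Absolute error in weight-fraction units implied by a relative error."""
    return weight_fraction * relative_error

fluorine_fraction = 0.037  # reference weight fraction of F in the apatite sample

low = absolute_error(fluorine_fraction, 0.08)   # 8% relative error
high = absolute_error(fluorine_fraction, 0.18)  # 18% relative error

# Express the absolute errors as percentages of the total assay.
low_percent = 100.0 * low
high_percent = 100.0 * high
```

Under this interpretation, an 8% to 18% relative error on a 0.037 weight fraction corresponds to roughly 0.3% to 0.7% of the total assay, which is consistent with the statement that all absolute assays are determined within 1.5%.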

Conclusions
All these testing results show that the inverse calculation methodology for the SEM-EDS analysis is feasible. The methodology reduces the number of physics calculations significantly and makes inverse calculations with MOZAIK-SEM a viable option. Test results indicate that $10^7$ source particles are sufficient for most of the calculations, although increasing the number of source particles lowers the MC statistical uncertainty and results in convergence to the correct values with lower error. On average, a MOZAIK-SEM calculation with $10^7$ source particles takes 1-2 h (with parallel penORNL on 64 cores) for samples containing few elements. The same calculations with $10^8$ source particles in penORNL simulations take 10-20 h (with parallel penORNL on 64 cores). Increasing the number of computational cores should decrease the total calculation time since the parallelism in penORNL has demonstrated approximately linear scaling.

INITIAL GUESS ESTIMATION
Inverse calculations require specification of an initial parameter vector (the elements and their inventories). MOZAIK-SEM does support a user-defined initial parameter vector, although an automatic specification is more useful if no elemental knowledge is available. For this purpose, an initial guess-generation module (MOZAIK-IGM) was developed and integrated into MOZAIK-SEM. This module enables a fully automated search process, which represents both qualitative and quantitative analysis without user insight. It is divided into three submodules, described in the following subsections. The tasks described in the subsections are performed sequentially in the order described.

Peak Analysis
The Peak Analysis submodule performs the following tasks:
1. smooths the reference spectrum (measured spectrum) using one of several widely used algorithms;
2. analyzes the reference spectrum and marks the possible peak locations;
3. obtains (from the penORNL element database) X-ray line energies close to the peak locations that have been marked;
4. confirms the peak locations if they match any X-ray line energy in the penORNL element database; and
5. removes from the element list any element whose characteristic X-ray lines do not match any peaks in the reference spectrum.
The Peak Analysis submodule uses either exact X-ray line energies or the X-ray line energies within an energy interval (energy of the marked peak ± 50 eV) to confirm the marked peak as a characteristic X-ray. Fig. 11 demonstrates the peak identification performed by the Peak Analysis submodule for the reference spectrum of an Inconel sample. In this case, the reference spectrum was computed by penORNL with $10^9$ source particles, which allows the Peak Analysis submodule to skip the smoothing step since noise on the reference spectrum is minimal. The methodology was able to confirm all identified peak locations because each peak matches at least one characteristic X-ray line in the element database. The Peak Analysis submodule provides these results (identified peaks and elements associated with these peaks) as an input to the next submodule within the MOZAIK-IGM module.

Fig. 11. Characteristic X-ray peaks identified and marked by the Peak Analysis submodule for Inconel.
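The marking and matching tasks of the Peak Analysis submodule can be sketched as below, with smoothing skipped as in the low-noise case. The line table is a small illustrative subset with rounded energies, not the penORNL element database, and the synthetic spectrum, the threshold, and the function names are hypothetical.

```python
CHANNEL_WIDTH_EV = 10.0
WINDOW_EV = 50.0

# Illustrative K-line energies in eV (rounded); not the penORNL database.
LINE_TABLE_EV = {
    "Cr": [5415.0, 5946.0],
    "Mn": [5898.0, 6490.0],
    "Fe": [6409.0, 7058.0],
    "Ni": [7478.0, 8265.0],
}

def mark_peaks(counts, threshold):
    """Return energies of local maxima that rise above the threshold."""
    peaks = []
    for i in range(1, len(counts) - 1):
        if counts[i] > threshold and counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]:
            peaks.append(i * CHANNEL_WIDTH_EV)
    return peaks

def match_elements(peak_energies, table, window=WINDOW_EV):
    """Confirm peaks that lie within +/- window of a tabulated line.

    Returns the confirmed peaks with their candidate lines, and the sorted
    list of elements that matched at least one peak. Elements with no
    matching line are dropped from the list.
    """
    confirmed = {}
    for energy in peak_energies:
        hits = [(el, line) for el, lines in table.items()
                for line in lines if abs(line - energy) <= window]
        if hits:
            confirmed[energy] = hits
    elements = sorted({el for hits in confirmed.values() for el, _ in hits})
    return confirmed, elements

# Synthetic low-noise spectrum with three peaks on a flat background.
counts = [5.0] * 1000
counts[541] = 100.0  # 5410 eV
counts[594] = 60.0   # 5940 eV
counts[640] = 200.0  # 6400 eV

peaks = mark_peaks(counts, threshold=20.0)
confirmed, elements = match_elements(peaks, LINE_TABLE_EV)
```

In this toy case the 5940 eV peak matches lines of both Cr and Mn within the ± 50 eV window, so both elements stay in the list. A single peak can therefore introduce more than one candidate element, which a later step must sort out.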

Spectrum Reconstruction
An algorithm was developed to construct a spectrum from the individual spectra of all the elements identified by the Peak Analysis submodule. A simulated spectrum database has been generated by performing a penORNL calculation for each element with Z = 5-92 and convolving the result with the generic detector response function in Eq. (4).
After obtaining the element list from the Peak Analysis submodule, the Spectrum Reconstruction submodule constructs the sample spectrum by multiplying each spectrum from the database by the weight fraction of its element and adding. Then, it normalizes to obtain the final spectrum.
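The reconstruction step amounts to a weighted sum followed by a normalization, as in the sketch below. The report does not state which normalization is used, so scaling to unit total counts is an assumption here, and the four-channel database entries are hypothetical.

```python
def reconstruct_spectrum(database, weights):
    """Weighted sum of single-element spectra, normalized to unit total counts.

    database[el] : simulated spectrum of pure element el (list of counts)
    weights[el]  : trial weight fraction of element el
    """
    n_channels = len(next(iter(database.values())))
    total = [0.0] * n_channels
    for el, w in weights.items():
        for i, counts in enumerate(database[el]):
            total[i] += w * counts
    norm = sum(total)
    return [t / norm for t in total]

# Hypothetical four-channel single-element spectra.
database = {
    "Pb": [0.0, 2.0, 0.0, 2.0],
    "S": [4.0, 0.0, 0.0, 0.0],
}
composite = reconstruct_spectrum(database, {"Pb": 0.75, "S": 0.25})
```

Because each database spectrum comes from a pure-element simulation, the composite carries no information about interactions between elements in the actual sample.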
One of the deficiencies in this approach is that it does not include the matrix effect whereby one element's characteristic X-ray can excite another element's electrons, which can subsequently produce a characteristic X-ray from the other element. Therefore, this constructed spectrum is preliminary and must be refined by further calculation.

Nonlinear Least-Squares Search
This submodule uses a simple optimization scheme to find optimal weights for the spectrum construction process. The purpose of this step is to minimize the size of the parameter vector (by removing elements whose contribution to the composite spectrum is insignificant) rather than to obtain correct weights. The elemental weights are preliminary at this point (matrix effects have not been considered), and final values are only assigned after a full physics calculation is performed. This optimization in the Initial-Guess Module is necessary because the Peak Analysis submodule might add two or more elements that have characteristic lines matching the same identified peak in the reference spectrum.
Fig. 12 shows a snapshot from the initial guess estimation calculations for the Inconel sample; numerous discrepancies exist between computed and reference spectra. That snapshot was generated by MOZAIK-IGM after performing Peak Analysis and creating a bounding parameter vector with the elements that may be contained in the sample because one or more of their characteristic lines matched the identified peaks in the reference spectrum. The snapshot depicted in Fig. 12 includes five informative plots, including (1) bottom: a histogram showing the compositions of the elements in the sample, which are initially the same (the Peak Analysis submodule has assigned 25 elements to the parameter vector), and (2) center-left: a comparison of the constructed spectrum and reference spectrum, in which numerous discrepancies are apparent.
Note that the element list shown in the bottom plot does not include Mn, which is a known trace component of the reference sample. It appears that MOZAIK-IGM missed the presence of this element, likely due to one or more of the following reasons:
• characteristic Mn X-rays do not appear strongly enough in the reference spectrum to be discernible above background, due to the low weight fraction of this element,
• characteristic Mn X-rays overlap with other peaks in the reference spectrum, which totally overwhelmed their presence, or
• MOZAIK-IGM evaluated characteristic Mn X-ray peaks as the X-ray peaks from an element with similar characteristic X-rays.
Characteristic Mn line energies are Kα = 5.898 keV and Kβ = 6.490 keV, which are close to the characteristic line energies of Fe (Kα = 6.409 keV) and Cr (Kβ = 5.946 keV). Because the weight fractions of Fe and Cr are significantly larger than that of Mn, the Mn peaks disappear under the characteristic peaks of Fe and Cr. As a result, the Peak Analysis submodule removed this element from the parameter vector. Fig. 13 presents results at the end of the initialization phase for the Inconel sample. MOZAIK-IGM discards some of the elements from the parameter vector and adjusts the weights of the remaining elements. Table 11 gives the element list in the final parameter vector as estimated by MOZAIK-IGM. This vector is passed to the MOZAIK-SEM routine as the initial guess for the inverse calculations.

VERIFICATION AND VALIDATION CALCULATIONS
Results presented in the previous section demonstrate the feasibility of the proposed methodology for inverse calculations. This section presents results from the more extensive verification and validation calculations performed with MOZAIK-SEM. Verification calculations use simulated spectra as reference spectra, whereas validation calculations use actual measured spectra as reference spectra.

Verification Calculations
Samples listed in Table 3 were previously used to test the proposed methodology and its implementation within the MOZAIK-SEM code framework. However, for verification calculations, different (and more difficult) samples were used, which are listed in Table 12. As seen in previous sections, it is difficult to estimate the elemental compositions correctly for elements with very low weight fractions (i.e., trace levels). This issue arises when characteristic X-rays from multiple elements overlap, and when light elements such as oxygen are involved. It is complicated by the statistical uncertainties of the MC simulations, which create artificial noise in the computed spectra.
In the verification calculations, the geometric description of the problem is similar to the test problems used in the previous section. Similarly, the initial parameter vectors contained only the elements known to be in the sample.
MOZAIK-SEM results obtained for the first three samples, U3O8, Eu(PO3)3, and K227 glass, are presented in Table 13, Table 14, and Table 15. Results indicate that the calculated weight fractions have less than 1.2% relative error. In these calculations, $10^8$ source particles were used in each penORNL simulation, and the penORNL simulations were performed on 64 computational cores. For these respective samples, MOZAIK-SEM used 4, 18, and 13 steps (outer iterations). Fig. 14, Fig. 15, and Fig. 16 show the comparison of the computed spectra with the reference spectra for these three samples. In each plot, the relative differences between the computed and reference spectra are small for all energy values, and the discrepancies on the pulse-height plots are not discernible.
Results for the Inconel sample are given in Table 16, where uncertainties in the computed weight fractions were derived from the MC statistical errors. Although Inconel contains no low-Z material such as oxygen, for which quantitative analysis is difficult, it has consecutive elements in the periodic table, which causes multiple peak overlaps in the spectrum. Moreover, some of the elements in the sample have very low weight fractions compared to the others. Therefore, the characteristic X-ray peaks of these elements in the computed spectrum exhibit large uncertainties, posing an extra challenge for the inverse calculation. Table 16 shows that the calculated weight fractions have less than 1% relative error for Fe, Cr, Ni, and Ti. The weight fractions of these elements in the sample are relatively large compared to the other elements; therefore, their peaks exhibit very low uncertainty in the spectrum computed by penORNL. Weight fractions for the rest of the elements were estimated with larger relative error, including the 34% error for Mn. As described in Sect. 3.3.3, characteristic X-rays of Mn overlap with Fe and Cr X-ray peaks. Moreover, the weight fraction of Mn is very small compared to the weight fractions of Fe and Cr. However, it should be noted that for all elements (including Mn), the absolute predicted fractions are within 1% of the actual assay. Fig. 17 plots a comparison between the reference and computed spectra. Inverse analysis was accomplished in 13 steps.

The last sample considered for verification is the K497 glass. It is composed of 12 elements, and some of these elements have weight fractions significantly less than others. Moreover, several characteristic X-ray lines of the 12 elements overlap, contributing to the difficulty of this problem. The results for this sample indicate the following:
• All elements with weight fractions greater than 0.01 (O, Mg, Al, P) were estimated with relative error less than 3%; Fe also was estimated very well. The oxygen prediction with less than 1% error is especially significant.
• For trace elements (weight fractions < 0.01), the estimated weight fractions contained larger relative error (in some cases double- or triple-digit percentages). The element B was completely discarded. These errors are at least partly related to MC uncertainties, as the characteristic X-ray peaks of these elements were statistically indistinguishable from the background portion of the simulated spectrum.
• Unusual behavior was exhibited by P and Zr. These elements have characteristic X-ray line energies that are very close to each other, and their peaks overlap. The search process estimated the total weight fraction for these two elements with a 1.9% difference (reference, 0.3318 + 0.004 = 0.3358; MOZAIK-SEM, 0.3229 + 0.01507 = 0.33797) but could not resolve the contribution of each element individually: the Zr weight fraction is overestimated, and the P weight fraction is (slightly) underestimated.
• The absolute errors for all elements except Zr are less than 1% of the total assay. Zr is actually present as a trace quantity (fraction 0.004) but is predicted to have a small, but non-trace, assay fraction (1.5%).

All the results from the verification study are summarized below.
• The methodology within MOZAIK-SEM is a feasible tool for SEM-EDS inverse calculations.
• The MC errors in the results from penORNL simulations limit the performance of MOZAIK-SEM for trace elements. Better results can be obtained for elements with very low weight fractions by performing longer penORNL calculations that produce smaller MC errors for the background portion of the spectrum and legitimate small photopeaks within the spectrum.
• In these examples, MOZAIK-SEM estimated the weight fraction of the elements correctly if the fraction was larger than 0.01. The MC errors for large photopeaks were usually on the order of 0.01 or less. In almost every case, the predicted elemental composition was within 2% of the actual assay value.
• MOZAIK-SEM resolved many overlaps within the spectrum; however, it demonstrated some problems with the following challenging cases.
o Overlaps in the MC line spectrum. This occurs when two or more elements' characteristic X-ray line energies are so close that the MC code scores these lines in the same energy bin of the tally.
o Overlaps in the convoluted spectrum. This occurs when two or more elements' characteristic X-ray line energies are different enough that they appear in different energy bins of the MC line spectrum. However, when the MC line spectrum is convoluted with the detector response function, the characteristic X-ray energies are close enough that only a single photopeak appears in the convoluted spectrum.
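The two kinds of overlap can be told apart with a simple rule of thumb, sketched below. The 10 eV bin width follows the test problems in this report, but the 130 eV detector resolution and the criterion "separation smaller than one FWHM" are assumptions for illustration, not values or rules taken from MOZAIK-SEM. Line energies are given in eV and rounded.

```python
BIN_WIDTH_EV = 10       # tally bin width used for the test problems
ASSUMED_FWHM_EV = 130   # hypothetical detector resolution

def classify_overlap(e1_ev, e2_ev, bin_width=BIN_WIDTH_EV, fwhm=ASSUMED_FWHM_EV):
    """Classify a pair of characteristic lines by how they overlap."""
    if e1_ev // bin_width == e2_ev // bin_width:
        # Both lines are scored in the same tally bin.
        return "line-spectrum overlap"
    if abs(e1_ev - e2_ev) < fwhm:
        # Separate bins, but merged once the detector response is applied.
        return "convolved-spectrum overlap"
    return "resolved"

mn_ka_vs_cr_kb = classify_overlap(5898, 5946)  # Mn K-alpha and Cr K-beta
mn_kb_vs_fe_ka = classify_overlap(6490, 6409)  # Mn K-beta and Fe K-alpha
same_bin_pair = classify_overlap(5893, 5898)   # hypothetical pair 5 eV apart
far_pair = classify_overlap(5898, 6409)        # Mn K-alpha and Fe K-alpha
```

Under these assumptions, both Mn lines fall into the second category with respect to their Cr and Fe neighbors: they are scored in separate tally bins but would merge into a single photopeak after convolution.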

Verification of Initialization with MOZAIK-IGM
The Initial-Guess Module (MOZAIK-IGM) determined that the elements listed in Table 18 may be present in this sample and omitted the trace element Mn, which was actually in the sample.
The calculation converged in 15 steps (outer iterations), and results are presented in column 4 of Table 18. As occurred in the verification examples of Sect. 3.4.1, these results show that MOZAIK-SEM struggles to deal with peak overlaps and with small elemental weight fractions (where MC error is of similar magnitude to spectral peaks). Therefore, MOZAIK-SEM did not eliminate all of the elements not actually in the sample, although their weight fractions are quite small. However, the weight fractions of the elements actually in the unknown sample were estimated fairly accurately. The exceptions to this are Mn, which was not identified by MOZAIK-IGM, and elements with weight fractions of 0.01 or less. We note that all predictions of the elemental compositions are within 2% of the actual assay values.

Validation Calculations
In the validation calculations, a measured spectrum provided by NIST for a material standard was used to evaluate the performance of MOZAIK-SEM. The material considered is the mineral fluorapatite, Ca10(PO4)6F2, whose elemental weight fractions are given in the second column of Table 19. Using an initial guess distributed uniformly among the elements produces the estimated weight fractions given in the fourth column, which contain noticeable discrepancies. However, all elements are estimated to within 5% absolute error. The following are several possible issues that could have contributed to the error in the predicted weight fractions.
• As shown in Fig. 19, the peak centroid energies for the two spectra are slightly different, which means the energy bin structure in the penORNL simulation may be different from that of the reference spectrum. The energy bin structure of the simulation is based on data provided in the NIST data file. It is possible that the detector was not calibrated immediately before the measurement was taken, so the recorded bin structure may not be correct. Additionally, the relationship between channel number and X-ray energy may not be linear, which appears to be the assumption applied by NIST.
• All our calculations used a generic detector response function that works well for the verification calculations. A response function tailored to the particular detector, and possibly the SEM itself, might improve the results.
• It is possible that the electron source specification in our computational model is not consistent with the actual SEM used, because the backgrounds of the two spectra have different slopes (Fig. 20). The electron source was reported as a single energy, which is likely not correct. That energy is probably the mean electron energy, but no information about the energy distribution is available.
• Fig. 20 compares the measured spectrum with a computed spectrum using the known elemental composition. It indicates that the measured spectrum exhibits some peaks that are not shown in the computed spectrum. The energies of these peaks do not match the characteristic X-ray line energies of the elements in the apatite sample, so they do not appear in the computed penORNL spectrum.
o If these "extra" peaks do not belong to the elements in the sample, they might be sum peaks, which our generic response function does not account for.
o Another possibility is that these are characteristic X-rays of some contaminant element (in the sample itself or in the sample environment) that is not in our computational model.
o The final possibility is that there is an unidentified error (or errors) in the penORNL physics model or in the atomic data.
Any of these issues could explain why MOZAIK-SEM did not estimate the weight fractions for this sample more accurately. At this time, not enough information is available to rule out any of these explanations. Because the measured spectrum is an archived one acquired several years ago, it is not likely that additional information about it can be obtained.
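The channel-to-energy question raised in the first bullet above could, in principle, be checked by fitting the measured centroids of known characteristic lines with both a linear and a quadratic calibration and comparing the residuals. The following is a sketch only: the line energies are approximate tabulated values, while the detector nonlinearity coefficients, the resulting channel positions, and the function names are hypothetical and are not taken from the NIST data file.

```python
import numpy as np

# Approximate tabulated K-line energies in keV: O Ka, F Ka, P Ka, Ca Ka, Ca Kb
LINE_ENERGIES_KEV = np.array([0.525, 0.677, 2.014, 3.692, 4.013])

# Hypothetical detector with a slight quadratic nonlinearity, E = A1*c + A2*c**2.
# These coefficients are illustrative values only, not measured ones.
A1, A2 = 0.010, 2.0e-7   # keV/channel and keV/channel^2
# Synthetic peak centroids: invert the quadratic to find the channel of each line.
channels = (-A1 + np.sqrt(A1**2 + 4.0 * A2 * LINE_ENERGIES_KEV)) / (2.0 * A2)

def fit_calibration(chan, energies_kev, degree):
    """Least-squares polynomial mapping from channel number to energy."""
    return np.polynomial.Polynomial.fit(chan, energies_kev, degree)

def max_residual_ev(calibration, chan, energies_kev):
    """Largest absolute misfit of the calibration at the reference lines, in eV."""
    return 1000.0 * float(np.max(np.abs(calibration(chan) - energies_kev)))

linear = fit_calibration(channels, LINE_ENERGIES_KEV, 1)
quadratic = fit_calibration(channels, LINE_ENERGIES_KEV, 2)

lin_res = max_residual_ev(linear, channels, LINE_ENERGIES_KEV)
quad_res = max_residual_ev(quadratic, channels, LINE_ENERGIES_KEV)
print(f"linear: {lin_res:.2f} eV, quadratic: {quad_res:.2e} eV")
```

For this synthetic detector, the linear calibration leaves centroid errors of a few eV while the quadratic one reproduces the lines essentially exactly. A residual pattern of this kind in real data would suggest that a linear bin structure is inadequate.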

CONCLUSIONS AND FUTURE WORK
The overall conclusion of this work is that MOZAIK-SEM is a promising method for performing inverse analysis of X-ray spectra generated within a SEM, but additional work is needed to refine and validate this methodology. The methodology implements a two-tiered strategy: an outer iteration, which requires a full MC calculation, and an inner iteration, which refines the elemental estimates quickly. The inner iteration uses a modified Levenberg–Marquardt iteration to solve a linearized problem that splits the inverse analysis into two pieces, the photopeaks and the background. This approach has the benefit of reducing the number of MC simulations required, although the background portion of the spectrum and the matrix effects on the spectrum are only updated after each outer iteration.
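The two-tiered structure can be sketched on a toy problem. This is not the MOZAIK-SEM algorithm or its modified Levenberg–Marquardt scheme: the function `toy_forward` is a cheap, hypothetical stand-in for a penORNL run, the peak shapes, background shape, and coefficients `MU` and `ZETA` are invented, and the clipping and renormalization of the weight fractions after each outer iteration is an assumption made here for simplicity.

```python
import numpy as np

x = np.arange(200.0)                                   # energy-bin index
CENTRES = np.array([40.0, 100.0, 160.0])               # hypothetical photopeak positions
P0 = 1000.0 * np.exp(-0.5 * ((x[:, None] - CENTRES[None, :]) / 3.0) ** 2)
B0 = 5.0 * np.exp(-x / 100.0)                          # hypothetical background shape
MU = np.array([0.1, 0.3, 0.2])                         # toy matrix-effect coefficients
ZETA = np.array([0.5, 1.0, 1.5])                       # toy background coefficients

def toy_forward(w):
    """Stand-in for the expensive MC run: per-element peak responses and background."""
    peaks = P0 / (1.0 + w @ MU)            # composition-dependent "matrix effect"
    background = B0 * (1.0 + w @ ZETA)     # composition-dependent background
    return peaks, background

def inner_lm(y, peaks, background, w, lam=1e-2, max_iter=30, tol=1e-12):
    """Damped least squares for w with peaks and background held fixed."""
    target = y - background
    A = peaks.T @ peaks
    cost = np.sum((target - peaks @ w) ** 2)
    for _ in range(max_iter):
        rhs = peaks.T @ (target - peaks @ w)
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), rhs)
        if np.linalg.norm(step) < tol:
            break
        trial = w + step
        trial_cost = np.sum((target - peaks @ trial) ** 2)
        if trial_cost < cost:              # accept the step and relax the damping
            w, cost, lam = trial, trial_cost, lam / 10.0
        else:                              # reject the step and increase the damping
            lam *= 10.0
    return w

def estimate(y, w0, n_outer=10):
    w = np.asarray(w0, dtype=float)
    for _ in range(n_outer):
        peaks, background = toy_forward(w)       # outer iteration: one "MC" evaluation
        w = inner_lm(y, peaks, background, w)    # inner iteration: cheap refinement
        w = np.clip(w, 0.0, None)
        w /= w.sum()                             # assumed constraint: fractions sum to 1
    return w

w_true = np.array([0.6, 0.3, 0.1])
peaks_true, bg_true = toy_forward(w_true)
measured = peaks_true @ w_true + bg_true         # noise-free synthetic "measurement"
w_est = estimate(measured, np.full(3, 1.0 / 3.0))
print(np.round(w_est, 6))
```

The sketch mirrors the trade-off described above: the expensive forward model is evaluated once per outer iteration, and the background and matrix-effect terms stay frozen during the inner refinement.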
Verification simulations with MOZAIK-SEM revealed that this methodology can successfully identify the weight fractions of low-Z elements such as oxygen and fluorine. However, like other quantitative analysis methods, this method can suffer when X-ray line energies for multiple elements are very close together. A problem unique to this method is distinguishing small photopeaks due to trace elements from MC statistical noise in the X-ray background; the problem occurs when the background MC uncertainty is of the same order of magnitude as the weight fraction of a trace element. Simulating more particle histories during the inverse analysis helps reduce this issue, but it increases the computational burden, either in execution time or in the number of processors required. Finally, the validation simulation with MOZAIK-SEM was less successful than the verification simulations, at least in part because the conditions of the experimental measurement were vaguely defined. A number of prominent differences exist between the spectrum measured by NIST and the spectrum simulated by penORNL. In the end, no source of these differences can be identified with certainty, so it is impossible to state whether they are due to the measurements, the MOZAIK-SEM methodology, the penORNL physics, or the interaction data it uses.
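The cost of suppressing MC noise can be estimated from the standard result that the statistical uncertainty of an MC tally scales as the inverse square root of the number of histories. The sketch below uses that scaling together with a simple k-sigma detectability test; the threshold k = 3 is a common convention assumed here, not a criterion taken from MOZAIK-SEM, and the function names are hypothetical.

```python
import math

def is_detectable(peak_signal, background_sigma, k=3.0):
    """Treat a peak as real only if it exceeds k standard deviations of the background."""
    return peak_signal > k * background_sigma

def histories_needed(n_current, reduction_factor):
    """Histories required to shrink the MC uncertainty by reduction_factor (sigma ~ 1/sqrt(N))."""
    return math.ceil(n_current * reduction_factor ** 2)

# A trace peak of relative height 0.005 against a background uncertainty of 0.01
print(is_detectable(0.005, 0.01))          # hidden in the noise
# Reducing the uncertainty fivefold (to 0.002) would require 25 times the histories
print(histories_needed(10**8, 5))
print(is_detectable(0.005, 0.002, k=2.0))  # detectable at a looser 2-sigma threshold
```

The quadratic growth in histories is why resolving trace elements at the 0.01 weight-fraction level quickly becomes expensive.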
For the MOZAIK-SEM methodology to be improved, two issues need to be addressed. First, a better method of handling X-ray line energies that are very close together should be investigated. The method as it exists now works fairly well, but there is clearly room for improvement. Second, and more importantly, additional validation simulations should be performed as part of a much closer collaboration between the experimentalist and the modeler. For the validation effort to be successful, future measurements should include (1) an energy calibration of the SEM-EDS detector, (2) an energy calibration of the SEM electron source, and (3) greater detail on the geometry, including the sample, the detector, and the internal configuration of the SEM sample chamber.