Database study of turbulent electron temperature fluctuation measurements at ASDEX Upgrade

In this work, an automated method for the analysis of data from the correlation electron cyclotron emission (CECE) diagnostic is applied to discharges in the ASDEX Upgrade (AUG) tokamak. This recently developed, automated method provides an efficient means of accurately analysing large quantities of experimental turbulence data, enabling the development of the largest database of CECE measurements of tokamak plasmas to date. The turbulence database provides the opportunity to search for large-scale trends in experimental data to improve our understanding of transport-relevant plasma turbulence. The results of physics-based investigations utilizing this turbulence database will be reported on separately from this work.


Introduction
Turbulence is known to play a major role in the transport of energy and particles out of tokamak fusion reactors [1], lowering the associated energy and particle confinement times and reducing the ability of tokamaks to produce net energy. Accurate models of turbulence and transport are thus vital to the design of fusion reactors that will be used for energy production [2].
The correlation electron cyclotron emission (CECE) diagnostic installed on the ASDEX Upgrade (AUG) tokamak detects second harmonic, X-mode, electron cyclotron emission in order to measure broadband, long-wavelength (kθρs ≲ 0.3) turbulent electron temperature fluctuations perpendicular to the mean magnetic field (Te⊥/Te) [3], yielding insight into turbulence-driven transport. These temperature fluctuations are among the key parameters driving the turbulent electrostatic electron heat flux in tokamak plasmas [4]. Since the original implementation of the CECE diagnostic on W7-AS [5], follow-on studies on devices around the world have reported important, transport-relevant experimental trends in turbulent electron temperature fluctuations using CECE data (see, e.g., refs. [6][7][8][9]). Additionally, data from the CECE diagnostic has served, and continues to serve, as a key component in the validation of advanced gyrokinetic and gyrofluid models of plasma turbulence [10][11].
Up to this point, however, experimental turbulence studies based on CECE data have generally focused on traditional shot-to-shot comparisons using discharges with similar experimental conditions. Studies of CECE data containing multiple discharges have generally been limited to about a dozen or fewer discharges (see, e.g., a relatively large study in ref. [10]). Indeed, to the best of the authors' knowledge, a large-scale experimental turbulence database based on CECE measurements of tokamak plasmas has not been reported prior to this work.
* Corresponding author: cyoo@mit.edu
Experimental databases provide the opportunity to search for large-scale trends not necessarily accessible in traditional shot-by-shot comparisons. Studies based on large-scale databases as detailed in, for example, refs. [12] and [13] have led to the development of important experimental scalings that have guided the design and operation of tokamaks. In the study of turbulence in tokamak plasmas specifically, probes have been used to build a multi-machine database of measurements in the edge of tokamak plasmas [14], while reflectometry signals taken during ten years' worth of operation of the Tore Supra tokamak were analysed in a large database study of core frequency spectra [15]. The current work reports on the development of a large-scale turbulence database using the CECE diagnostic installed on AUG.
The development of this turbulence database has benefited from the combination of multiple factors. An ongoing, active collaboration between scientists at the MIT Plasma Science and Fusion Center and the Max Planck Institute for Plasma Physics has enabled measurements using the CECE diagnostic at AUG for multiple years. As a passive, non-perturbative system, the CECE diagnostic at AUG can operate on a nearly continuous basis. Interpretation of CECE data as turbulent electron temperature fluctuations is relatively straightforward under certain conditions that include long steady-state periods, high optical depth, measurement localizations far from cut-off, and a nearly perpendicular line of sight between the plasma and the diagnostic optics. Since its installation in April 2021, an additional 24-channel CECE system has operated in a separate toroidal sector in tandem with the 24-channel CECE system originally detailed in ref. [3], doubling the measurement capacity of the diagnostic. Access to years' worth of data from a large suite of diagnostics at AUG and advanced analysis systems such as the Integrated Data Analysis (IDA) multi-diagnostic analysis system [16] provides valuable information that is paired with the CECE data in the turbulence database. Altogether, these contributing factors have enabled the development of a large-scale experimental turbulence database that is being applied towards the search for large-scale trends to help improve our understanding of plasma turbulence and transport.

Automated method for the analysis of CECE measurements
In order to create a database of experimental CECE measurements, an automated method was developed for the analysis of the diagnostic's data. For each discharge analysed, the automated method:
1. determines the optimal time windows for CECE analysis,
2. verifies that the plasma conditions during the selected time intervals are well-suited for the interpretation of CECE measurements as turbulent electron temperature fluctuation amplitudes perpendicular to the mean magnetic field (Te⊥/Te),
3. verifies the quality of the raw voltage data measured by the diagnostic,
4. filters the auto-power, cross-power, and coherence spectra calculated from the raw data,
5. calculates Te⊥/Te based on the coherence spectra,
6. checks for the presence of coherent modes that can amplify the magnitude of the coherence spectra beyond the level attributed solely to drift-wave-type turbulence, and
7. processes the Te⊥/Te values from the CECE diagnostic into the turbulence database along with local and global plasma and engineering parameter values from a host of additional diagnostics available in the AUG shot-file system.

Determination of analysis time windows
The determination of time windows for the analysis of CECE measurements by the automated method is based on two often competing requirements. On the one hand, maximizing these time intervals leads to better statistics, as CECE data is time-averaged to distinguish turbulence from thermal noise [5]. On the other hand, the goal of the database study is to assess trends in Te⊥/Te across a wide range of parameter space, and the accuracy of these trends is improved when the database parameters are in steady-state, as this minimizes the statistical uncertainty in the parameter values. The long-duration, steady-state conditions well-suited for CECE measurements and database studies cannot be taken for granted across experiments at AUG, as many discharges feature experimental conditions that evolve continuously. Figure 1 shows the time series of the three parameters used in the steady-state analysis algorithm: the plasma current (Ip) [MA], electron temperature (Te) [keV], and electron density (ne) [10^19 m^-3], with Te and ne profile data provided by IDA and defined as the local values of these parameters at each CECE measurement location. These parameters are particularly important to analyse in steady-state as they serve as the basis for multiple important physical parameters including the localization of the CECE measurements, optical depth, cutoff frequencies, collisionality, and gradients in Te and ne. Here steady-state is defined as the state in which the mean values of Te and ne remain within designated limits, as a percentage of their initial values during the particular time window, at all CECE measurement locations during the plasma current flat-top period. While one could account for additional parameters (such as the gradients that drive turbulence) in the steady-state algorithm, doing so generally decreases the probability that the algorithm will detect steady-state intervals during a given discharge. Additionally, while Ip, Te, and ne data are nearly always available in AUG shot-files for use in the steady-state algorithm for a given discharge, certain diagnostic data (such as Ti data) are not.
The mean value of each of the parameters is tracked in 10 ms time-averaged increments. A minimum time interval constraint of 0.5 s is implemented to achieve CECE sensitivity limits on the order of 0.2-0.3% (as per equation 7 in ref. [3]). The steady-state algorithm utilizes an iteratively increasing steady-state threshold, from a starting point of 2% up to a pre-defined maximum of 10%. This prioritizes finding the steadiest intervals when they are available while still enabling the algorithm to obtain relatively steady-state periods during even highly dynamic discharges, as shown by the three detected steady-state intervals highlighted in blue in Figure 1.
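The window search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 2% step between thresholds, and the choice to stop at the first threshold that yields any window are assumptions made for illustration. Each trace is a list of 10 ms time-averaged values (for example Te or ne at one CECE measurement location) within the current flat-top.

```python
def find_steady_windows(traces, tol, min_bins):
    """Return (start, stop) bin index pairs in which every trace stays within
    a fractional tolerance `tol` of its own value at the start of the window."""
    n_bins = len(traces[0])
    windows = []
    start = 0
    while start < n_bins:
        stop = start + 1
        while stop < n_bins and all(
            abs(tr[stop] - tr[start]) <= tol * abs(tr[start]) for tr in traces
        ):
            stop += 1
        if stop - start >= min_bins:
            windows.append((start, stop))  # stop is exclusive
            start = stop
        else:
            start += 1
    return windows


def steady_state_search(traces, bin_s=0.010, min_s=0.5,
                        tolerances=(0.02, 0.04, 0.06, 0.08, 0.10)):
    """Try the tightest tolerance first and relax it only if nothing is found.
    The intermediate tolerance steps are an assumption, not taken from the text."""
    min_bins = round(min_s / bin_s)
    for tol in tolerances:
        windows = find_steady_windows(traces, tol, min_bins)
        if windows:
            return tol, windows
    return None, []


# Synthetic example: Te is flat for 0.6 s and then ramps; ne is constant.
te = [1.0] * 60 + [1.0 + 0.05 * (k + 1) for k in range(40)]
ne = [3.0] * 100
tol_used, found = steady_state_search([te, ne])
```

Trying the tightest tolerance first reflects the stated priority of finding the steadiest intervals; relaxing it only on failure lets dynamic discharges still contribute windows.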

Verifying plasma conditions
Sufficient optical depth is important for ensuring that fluctuations in the radiation temperature measured by the CECE diagnostic can be interpreted as electron temperature fluctuations and not fluctuations in electron density [5]. As shown in Figure 2, the analysis process calculates the time-averaged optical depth over the range of CECE measurements (shown in the highlighted region) during the selected time windows. Optical depth values of 2 (red) and 4 (green) mark typically cited lower limits as to what is considered sufficient optical depth for CECE measurements [17].
Fig. 2. The optical depth (τ) at each location in the plasma measured by the CECE diagnostic (highlighted region) is tracked by the analysis method. Higher optical depth minimizes the impact of density fluctuations on CECE measurements [5,17]. The red and green lines denote optical depth levels of 2 and 4, respectively.
The analysis also identifies CECE channels that are measuring in regions of the plasma that are cut off (i.e., regions where the right-hand cutoff frequency is greater than the second harmonic electron cyclotron frequency) [18], in which case the ECE signals at the intended measurement locations do not reach the diagnostic's focusing optics and detectors. Additionally, the analysis identifies CECE channels that have corresponding second harmonic electron cyclotron frequencies that are less than 105.5% of the respective right-hand cutoff frequencies. This 105.5% frequency threshold is more conservative as a limit than the right-hand cutoff frequency and indicates the level below which refraction has a non-negligible effect on ECE measurements. This limit corresponds to an equivalent threshold, determined based on ray tracing studies, which requires that the local electron density be less than 85% of the cutoff density in order for refraction to have a negligible effect on ECE measurements [19][20][21]. CECE channels whose second harmonic electron cyclotron frequencies are greater than this limit, as shown in Figure 3, are included for further analysis while those channels with lower frequencies are removed from the analysis process.
Fig. 3. CECE measurement locations (highlighted region) are not cut off if the second harmonic electron cyclotron frequency (solid green curve) is above the right-hand cutoff frequency (solid red curve). Additionally, the effects of refraction on data processed by the automated analysis are minimized by identifying and removing from further analysis any CECE channels that have second harmonic electron cyclotron frequencies below a 105.5% multiple of the right-hand cutoff frequency (dashed red curve) [19][20][21].
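The equivalence between the 105.5% frequency margin and the 85% density limit can be checked numerically. The sketch below assumes the standard cold-plasma, non-relativistic expression for the right-hand cutoff, f_R = (f_ce + sqrt(f_ce^2 + 4 f_pe^2)) / 2; the function names and the three-way status labels are hypothetical and chosen only for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge [C]
E_MASS = 9.1093837015e-31    # electron rest mass [kg]
EPS0 = 8.8541878128e-12      # vacuum permittivity [F/m]


def f_ce(b_tesla):
    """Electron cyclotron frequency [Hz] (non-relativistic)."""
    return E_CHARGE * b_tesla / (2 * math.pi * E_MASS)


def f_pe(ne_m3):
    """Electron plasma frequency [Hz]."""
    return math.sqrt(ne_m3 * E_CHARGE**2 / (EPS0 * E_MASS)) / (2 * math.pi)


def f_rh_cutoff(b_tesla, ne_m3):
    """Cold-plasma right-hand cutoff frequency [Hz]."""
    fce = f_ce(b_tesla)
    return 0.5 * (fce + math.sqrt(fce**2 + 4 * f_pe(ne_m3)**2))


def cutoff_density(b_tesla):
    """Density [m^-3] at which f_R equals 2 f_ce, i.e. f_pe^2 = 2 f_ce^2."""
    omega_ce = 2 * math.pi * f_ce(b_tesla)
    return 2 * omega_ce**2 * EPS0 * E_MASS / E_CHARGE**2


def channel_status(b_tesla, ne_m3, margin=1.055):
    """Classify a channel by comparing 2 f_ce with the right-hand cutoff."""
    f2 = 2 * f_ce(b_tesla)
    fr = f_rh_cutoff(b_tesla, ne_m3)
    if f2 <= fr:
        return "cutoff"
    if f2 < margin * fr:
        return "refraction"   # above cutoff but inside the 105.5% margin
    return "ok"


b = 2.5                                   # illustrative field strength [T]
nc = cutoff_density(b)
ratio_at_85_percent = 2 * f_ce(b) / f_rh_cutoff(b, 0.85 * nc)
```

Under these assumptions the ratio of 2 f_ce to f_R at 85% of the cutoff density is independent of the magnetic field, since both frequencies scale with f_ce once the density is expressed as a fraction of the cutoff density.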

Verifying the quality of the raw CECE data
The raw voltage signals detected by each CECE channel are checked for a finite response to the presence of the plasma as well as for saturation. Saturation, illustrated by the red curve in Figure 4, prevents the accurate processing of the time-series voltage data into auto-power and cross-power spectral densities and coherence frequency spectra using Fourier analysis. Any channels that do not pass these checks are removed from the analysis. Additionally, the mean voltage level of each CECE channel is calculated during each analysis time window in order to determine whether the voltages are satisfactorily within the linear response region of the CECE detectors.
Fig. 4. The red curve shows a channel whose signal has saturated at the maximum voltage level of the data acquisition system. The blue curve would pass the checks of the analysis.
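A minimal sketch of the three raw-data checks described above is given below. All thresholds and argument names (`v_max`, `linear_range`, `sat_fraction`, `min_response`) are hypothetical placeholders; the actual limits used for the AUG detectors and data acquisition system are not stated here.

```python
from statistics import mean


def check_raw_channel(volts, baseline, v_max, linear_range,
                      sat_fraction=0.01, min_response=0.05):
    """Return the outcome of each raw-signal check for one channel.

    volts        -- samples recorded during the analysis window
    baseline     -- samples recorded without plasma
    v_max        -- full-scale voltage of the data acquisition system
    linear_range -- (low, high) mean-voltage limits of the linear detector regime
    """
    saturated = sum(1 for v in volts if abs(v) >= v_max) / len(volts)
    level = mean(volts)
    return {
        # finite response: the mean level moves away from the no-plasma baseline
        "responds": abs(level - mean(baseline)) >= min_response,
        # saturation: too many samples sit at the acquisition rail
        "unsaturated": saturated <= sat_fraction,
        # linearity: the mean voltage lies inside the linear response region
        "linear": linear_range[0] <= level <= linear_range[1],
    }


def passes(checks):
    """A channel is kept only if every check succeeds."""
    return all(checks.values())


no_plasma = [0.0, 0.01, -0.01, 0.0]
good = check_raw_channel([1.0, 1.1, 0.9, 1.0], no_plasma, 5.0, (0.2, 3.0))
railed = check_raw_channel([5.0, 5.0, 5.0, 5.0], no_plasma, 5.0, (0.2, 3.0))
```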

Filtering CECE spectra
A frequency-based filter is applied to the auto-power, cross-power, and coherence frequency spectra in order to remove large, narrowband peaks. These peaks, shown in red in Figure 5 on top of the broadband spectrum attributed to drift-wave-type turbulence, have been identified as being due to electronics noise since they have been observed in the spectra even when no plasma is present. While the peaks appear at similar frequencies across CECE channels and experiments, there is some variation of the peak frequencies measured by any individual diagnostic channel during even the same plasma discharge.
As a result of the temporal variation in the frequencies of the peaks and in order to account for the possibility of narrowband peaks arising in the data at entirely different frequencies, a threshold-based, moving-average filter is applied to identify and remove the narrowband peaks for each channel during each analysis time window. Thresholds based on the mean and standard deviation of each moving frequency increment are used to distinguish the peaks from the underlying broadband spectra. To reduce the potential for over-filtered spectra to introduce an additional source of error into the results of the CECE analysis, any spectra in which more than 10% of the signal bandwidth has been filtered are removed from further analysis.
Fig. 5. Narrowband peaks (red) attributed to electronics noise are identified and removed from the CECE auto-power and cross-power spectral density and coherence frequency spectra using a threshold-based, moving-average filter.
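One possible form of such a filter is sketched below, assuming the input spectrum already spans only the signal bandwidth. The window half-width, the 4-sigma threshold, the exclusion of the centre bin from its own statistics, and the replacement of flagged bins by the local mean are illustrative choices, not details taken from the analysis itself.

```python
from statistics import mean, pstdev


def filter_narrowband_peaks(spectrum, half_width=5, n_sigma=4.0,
                            max_fraction=0.10):
    """Flag bins that exceed mean + n_sigma * std of their neighbours and
    replace them with the mean of the unflagged neighbours.

    Returns (cleaned spectrum, flagged indices, keep flag). The keep flag is
    False when more than `max_fraction` of the bins had to be filtered.
    """
    n = len(spectrum)

    def window(i):
        return range(max(0, i - half_width), min(n, i + half_width + 1))

    flagged = []
    for i in range(n):
        # Exclude the bin itself so that a peak does not inflate its own threshold.
        neighbours = [spectrum[j] for j in window(i) if j != i]
        if spectrum[i] > mean(neighbours) + n_sigma * pstdev(neighbours):
            flagged.append(i)

    bad = set(flagged)
    cleaned = list(spectrum)
    for i in flagged:
        good = [spectrum[j] for j in window(i) if j not in bad]
        if good:
            cleaned[i] = mean(good)

    keep = len(flagged) <= max_fraction * n
    return cleaned, flagged, keep


# Synthetic broadband spectrum with one narrowband spike at bin 40.
spectrum = [1.0 + 0.01 * ((7 * k) % 5) for k in range(100)]
spectrum[40] = 10.0
cleaned, flagged, keep = filter_narrowband_peaks(spectrum)
```

Because the statistics are recomputed for every moving increment, the filter does not rely on the peaks appearing at fixed frequencies, which matches the motivation given above.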

Checking for coherent modes
In order to identify coherent modes due, for example, to MHD activity that might be present within the coherence spectra, the analysis process calculates the spectrogram of data from a Mirnov coil measuring the radial magnetic field. As shown in Figure 6, the spectrograms can be used to identify possible contributions to the coherence spectra that are not due to drift-wave-type turbulence at any given time during a plasma discharge. Any coherence spectra whose signal bandwidths overlap with the frequencies of any such modes identified in the magnetics are flagged, although the spectra are not removed from the analysis process in order to enable the possibility of studying the effect of such modes on CECE measurements.
Fig. 6. A spectrogram of radial magnetic field measurements from a Mirnov coil is used to identify coherent modes (as shown by the narrow yellow trace) that can cause the CECE coherence spectra to be greater in magnitude than they would be solely as a result of drift-wave-type turbulence. Coherence spectra whose signal bandwidths overlap with the frequencies of the coherent modes identified by the magnetics are flagged.
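Once mode frequencies have been extracted from the magnetics, the flagging step reduces to an interval-overlap test. The sketch below assumes each mode is summarized as a (low, high) frequency band in kHz; how those bands are obtained from the spectrogram is not covered, and the function names are hypothetical.

```python
def bands_overlap(band_a, band_b):
    """True if two closed frequency intervals (low, high) share any frequency."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


def flag_coherent_modes(signal_band_khz, mode_bands_khz):
    """Return the mode bands that fall inside the CECE signal bandwidth.

    An empty list means the coherence spectrum is not flagged. Flagged spectra
    are kept in the analysis; the flag only records the overlap.
    """
    return [m for m in mode_bands_khz if bands_overlap(signal_band_khz, m)]


# Illustrative numbers only: a 10-90 kHz signal band and two detected modes.
hits = flag_coherent_modes((10.0, 90.0), [(4.0, 6.0), (48.0, 52.0)])
```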

Calculating Te⊥/Te
As per equation 1 in ref. [3], calculating Te⊥/Te involves the integration of the coherence spectra over the signal bandwidth and the subtraction of a background coherence level. In order to determine the frequency limits for integration, a threshold is applied based on the standard deviation of moving frequency increments within the ECE-ECE phase angle spectra for each frequency-adjacent pair of CECE channels. As shown in Figure 7, the dip in the magnitude of the standard deviation of the ECE-ECE phase angle tracks well with the region of frequency space in which the magnitude of the complex coherence is large, enabling an accurate identification of the signal bandwidth. A fixed frequency width of 20 kHz above the coherence integration region is used to select the background subtraction region.
Fig. 7. The standard deviation of the ECE-ECE phase angle spectra between frequency-adjacent channels is used to determine the frequency limits (vertical, yellow dotted lines) over which the CECE channels should be integrated during the calculation of Te⊥/Te.
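The bandwidth selection can be sketched as follows. The moving-window width, the 0.5 rad threshold, and the choice of the longest contiguous low-scatter region are assumptions made for illustration, and phase wrapping is ignored. Only the limits are computed here; the integration itself follows equation 1 in ref. [3] and is not reproduced.

```python
from statistics import pstdev


def moving_std(values, width):
    """Standard deviation of each run of `width` consecutive values."""
    return [pstdev(values[i:i + width]) for i in range(len(values) - width + 1)]


def integration_limits(freqs_khz, phase_rad, width=5, max_std=0.5,
                       bg_width_khz=20.0):
    """Locate the longest region where the phase scatter stays below `max_std`.

    Returns ((f_low, f_high), (bg_low, bg_high)) in kHz, or None if no window
    is quiet enough. The background band sits directly above the signal band.
    """
    stds = moving_std(phase_rad, width)
    best_start, best_len, run_start = 0, 0, None
    for i, s in enumerate(stds + [float("inf")]):  # sentinel closes a final run
        if s < max_std:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_len == 0:
        return None
    f_low = freqs_khz[best_start]
    f_high = freqs_khz[best_start + best_len - 1 + width - 1]
    return (f_low, f_high), (f_high, f_high + bg_width_khz)


# Synthetic phase spectrum: well-defined phase from 20 to 79 kHz, scatter elsewhere.
freqs = [float(f) for f in range(200)]
phase = [0.1 if 20 <= f < 80 else (2.0 if f % 2 else -2.0) for f in range(200)]
limits = integration_limits(freqs, phase)
```

A well-defined cross-phase indicates correlated signal between the two channels, which is why low phase scatter is a reasonable proxy for the region of large coherence.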

Processing diagnostic data into database
Upon completing the calculation of Te⊥/Te values for a given analysis interval and discharge, the processed data is written to the turbulence database. In addition, the time-averaged values of numerous local plasma parameters, evaluated at the same radial positions as the Te⊥/Te measurements, and the time-averaged values of multiple global plasma and engineering parameters, all of which are accessed through the AUG shot-file system, are written to the turbulence database.
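As an illustration of how one analysis interval might be stored, the sketch below writes time-averaged parameters and their spreads to an in-memory SQLite table. The schema, column names, parameter labels, and the use of the sample standard deviation as the statistical uncertainty are all assumptions; the actual storage format of the turbulence database is not described here, and the numbers are placeholders rather than measured values.

```python
import sqlite3
from statistics import mean, stdev

SCHEMA = """
CREATE TABLE IF NOT EXISTS turbulence (
    shot INTEGER, t_start REAL, t_stop REAL, rho_tor REAL,
    parameter TEXT, value REAL, uncertainty REAL
)"""


def write_parameter(conn, shot, window, rho_tor, name, samples):
    """Time-average `samples` over the analysis window and store one row."""
    value, spread = mean(samples), stdev(samples)
    conn.execute(
        "INSERT INTO turbulence VALUES (?, ?, ?, ?, ?, ?, ?)",
        (shot, window[0], window[1], rho_tor, name, value, spread),
    )
    return value, spread


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Placeholder shot number, window, radius, and samples.
window = (2.0, 2.5)
write_parameter(conn, 1, window, 0.6, "Te_keV", [1.50, 1.52, 1.48, 1.50])
write_parameter(conn, 1, window, 0.6, "ne_1e19", [4.0, 4.1, 3.9, 4.0])
rows = conn.execute(
    "SELECT parameter, value, uncertainty FROM turbulence ORDER BY parameter"
).fetchall()
```

A long-format table (one row per parameter) is used here so that parameters can be added later without changing the schema; a wide table with one column per parameter would serve equally well.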

Database parameters
At the present time, over 30 parameters, including Te⊥/Te and local and global plasma and engineering parameters, are written to the database along with their statistical uncertainties for each analysis interval. The inclusion of multiple tens of parameters in the turbulence database provides the opportunity to search for large-scale trends in Te⊥/Te across a wide range of experimental operating conditions. As shown in Table 1, these parameters are relevant to:
1. turbulence drive and damping,
2. ion and electron kinetics,
3. the magnetic field, safety factor, and rotation,
4. auxiliary heating power and radiated power,
5. stored energy, current, and confinement,
6. plasma geometry, and
7. fuel composition.
In Table 1, peak corresponds to the peak value of each coherence spectrum while FWHM refers to the full-width at half-maximum of a Gaussian fit to each coherence spectrum. ∇Te, ∇Ti, and ∇ne represent the gradients in electron temperature, ion temperature, and electron density, respectively. νeff is the effective collisionality as defined in ref. [22]. Btot is the total magnetic field, qs is the safety factor, and vrot is the plasma rotation velocity. Ptotal is the total auxiliary heating power, Pohmic is the ohmic heating power, PNBI is the neutral beam injection power, PECRH is the Electron Cyclotron Resonance Heating power, PICRH is the Ion Cyclotron Resonance Heating power, and Prad is the radiated power. Wmhd is the MHD stored energy, βN is the normalized beta, Ip is the plasma current, and τE is the energy confinement time. R is the major radius, a is the minor radius, ε is the inverse aspect ratio, κ is the elongation, δupper is the upper triangularity, and δlower is the lower triangularity. A is the fuel mass number and Zeff is the plasma effective atomic number.

Overview of database contents
The first iteration of the database was developed by running the automated analysis method on 700 discharges measured by the CECE system installed on AUG Sector 9 from May 2020 to January 2021. Among the 700 discharges (AUG discharge numbers 37,700-38,399), 166 contained data that passed the checks of the automated analysis. Through the analysis of the 166 discharges, 4,286 datapoints were written to the database. Of this number, 3,936 datapoints were located in the core (defined here as corresponding to ρtor < 0.8, where ρtor is the square root of the normalized toroidal magnetic flux) of the AUG plasmas.

Conclusion
The current work reports on the development of a large-scale turbulence database using the CECE diagnostic installed on AUG. A recently developed, automated method provides an efficient means of accurately analysing large quantities of experimental turbulence data from the CECE diagnostic. The data processed by the automated method is stored along with local and global plasma and engineering parameters, evaluated during the same time intervals and at the same radial locations as the CECE measurements. The application of this method to data from discharges at AUG has enabled the development of the largest turbulence database of CECE measurements of tokamak plasmas to date. The turbulence database provides the opportunity to search for large-scale trends in experimental data to help improve our understanding of transport-relevant plasma turbulence. Work is in progress using the database to study the effects of collisionality and gradients on the saturated amplitude of turbulence measured by the CECE diagnostic. The database could also be applied towards large-scale validation studies of advanced computational models of plasma turbulence. The results of physics-based investigations utilizing this turbulence database will be reported on separately from this work.