Deep neural network techniques in the calibration of space-charge distortion fluctuations for the ALICE TPC

The Time Projection Chamber (TPC) of the ALICE experiment at the CERN LHC was upgraded for Run 3 and Run 4. Readout chambers based on Gas Electron Multiplier (GEM) technology and a new readout scheme allow continuous data taking at the highest interaction rates expected in Pb-Pb collisions. Due to the absence of a gating grid system, a significant amount of ions created in the multiplication region is expected to enter the TPC drift volume and distort the uniform electric field that guides the electrons to the readout pads. Analytical calculations were considered to correct for space-charge distortion fluctuations but they proved to be too slow for the calibration and reconstruction workflow in Run 3. In this paper, we discuss a novel strategy developed by the ALICE Collaboration to perform distortion-fluctuation corrections with machine learning and convolutional neural network techniques. The results of preliminary studies are shown and the prospects for further development and optimization are also discussed.


Introduction
The ALICE TPC is a large cylindrical gaseous detector for the reconstruction of particle tracks and for the identification of charged particles, measuring their energy loss via ionization when they traverse the detector volume. It is located within the central barrel of ALICE, which is surrounded by a solenoid magnet generating a magnetic field of 0.5 T along the beam axis, between the Inner Tracking System (ITS) and the Transition Radiation Detector (TRD). More details about the TPC can be found in [1][2][3]. The upgrade of the TPC detector represents one of the most critical elements of a wide upgrade program that will allow ALICE to exploit the full Pb-Pb luminosity delivered by the LHC in Run 3 and Run 4 [2,3]. Thanks to the new readout chambers featuring Gas Electron Multiplier (GEM) technology, new readout electronics and a newly developed calibration and reconstruction software [4], the TPC will be capable of continuous readout of Pb-Pb data at interaction rates up to 50 kHz. This continuous mode of operation is only possible because of the absence of a triggered gating grid. As a consequence, a significant amount of positive space charge coming from the amplification in the GEM readout (ion backflow) will be present in the active TPC volume, which leads to distortions of the uniform drift field. Ionization electrons are deflected from their nominal drift path and, consequently, the measurement of the space points is biased. While ions created during the primary ionization process also contribute to the space-charge density, the dominant contribution comes from the ion backflow. For each electron produced in the process of gas ionization initiated by a charged particle traversing the TPC, the number of ions drifting back into the active volume of the detector is defined by ε, which is the product of the gas gain and the ion backflow. Both of these parameters are determined by the voltage configuration of the readout chambers and are measured experimentally [2,3]; ε is expected to be below 20.

Space-charge density and distortions
The presence of space charge inside the active volume of the TPC leads to distortions of the nominal drift path of ionization electrons along all three spatial dimensions (r, rϕ and z). The integrals of the distortions over the full drift paths are called space-charge distortions, providing the total 3-dimensional bias in the measurement of space points.

Average space-charge density and distortions
The absolute space-charge density in the TPC strongly depends on the number of ion pile-up events $N^{\mathrm{ion}}_{\mathrm{pile-up}}$, which is the product of the interaction rate and the ion drift time. The ion drift time of the nominal gas mixture Ne-CO$_2$-N$_2$ (90-10-5) is expected to be around 200 ms [5]. At 50 kHz of Pb-Pb collisions, ions from 10000 events will contribute on average to the space-charge density at any given time. An estimate of the average space-charge distortions in radial direction, dr, is shown as a function of the radius and z in the left plot of Fig. 1 for ε ≈ 10 averaged over the full readout area. As the radial electric field distortions are perpendicular to the nominal magnetic field in z-direction, the ionization electrons are also distorted along the azimuthal direction (drϕ) due to the E × B-effect [2]. In addition, there are distortions along the drift direction (dz) due to the modifications of the nominal drift field and of the ideal drift path of the ionization electrons.
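The pile-up estimate quoted above is the product of the interaction rate and the ion drift time. A minimal sketch of this arithmetic follows; the function name is illustrative and not part of any ALICE software.

```python
def ion_pileup_events(interaction_rate_hz: float, ion_drift_time_s: float) -> float:
    """Average number of events whose ions are inside the drift volume at once."""
    return interaction_rate_hz * ion_drift_time_s


# Values quoted in the text: 50 kHz Pb-Pb collisions, about 200 ms ion drift time.
n_pileup = ion_pileup_events(50e3, 0.2)
print(round(n_pileup))
```

The same function shows how the pile-up, and with it the absolute space-charge density, scales linearly with the interaction rate.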

Space-charge density fluctuations
The space-charge density is determined by the ionization deposited in the TPC volume by primary and secondary tracks from the events piling up within one ion drift time. Fluctuations of the event and track properties manifest themselves as local space-charge density fluctuations in space and time. The relative space-charge density fluctuations $\sigma_{\mathrm{SC}}/\mu_{\mathrm{SC}}$ are composed of several contributions which are summarized by

$$\frac{\sigma_{\mathrm{SC}}}{\mu_{\mathrm{SC}}} = \frac{1}{\sqrt{N^{\mathrm{ion}}_{\mathrm{pile-up}}}} \sqrt{1 + \left(\frac{\sigma_{N_{\mathrm{mult}}}}{\mu_{N_{\mathrm{mult}}}}\right)^{2} + \frac{1}{F\,\mu_{\mathrm{tot}}(r)} \left(1 + \left(\frac{\sigma_{Q_{\mathrm{track,tot}}}}{\mu_{Q_{\mathrm{track,tot}}}}(r)\right)^{2}\right)}, \qquad (1)$$

where $\sigma_{N_{\mathrm{mult}}}/\mu_{N_{\mathrm{mult}}}$ is the relative RMS of the distribution of the track multiplicity, $\mu_{N_{\mathrm{mult}}}$ is the average track multiplicity per event and $\left(\sigma_{Q_{\mathrm{track,tot}}}/\mu_{Q_{\mathrm{track,tot}}}(r)\right)^{2}$ is the relative variation of the ionization of single tracks depending on the radius r. These track-based quantities are composed of contributions from primary (prim) and secondary (sec) tracks. $F\,\mu_{\mathrm{tot}}(r) = F_{\mathrm{prim}}(r) \cdot \mu_{N_{\mathrm{mult,prim}}} + F_{\mathrm{sec}}(r) \cdot \mu_{N_{\mathrm{mult,sec}}}$ quantifies the number of tracks contributing to the fluctuations for a given volume fraction F, i.e. F = 1 for the full TPC volume and F < 1 for a fraction of the volume. The relative space-charge density fluctuations estimated by equation 1 and by MC simulations are shown as a function of the number of ion pile-up events in the right plot of Fig. 1. The plot illustrates the fluctuations integrated over the TPC volume as well as the radial dependence of the fluctuations within only a fraction of the volume. The latter are generally larger as fewer tracks contribute. Relative fluctuations of the order of 2 % are expected in the case of 10000 ion pile-up events (≈50 kHz interaction rate). The resulting space-charge distortion fluctuations are of the order of several millimeters. In order to preserve the performance of the TPC, they need to be corrected down to its intrinsic resolution, which is of the order of 200 µm, and further studies in [2] demonstrate that the update interval of the correction needs to be of the order of 10 ms.
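To make the scaling with the number of pile-up events concrete, the sketch below evaluates the relative fluctuation under the assumption that equation 1 has the form 1/sqrt(N_pile-up) · sqrt(1 + (σ_Nmult/µ_Nmult)² + (1 + (σ_Q/µ_Q)²)/(F µ_tot)). The function name and all numerical inputs other than the 10000 pile-up events are placeholders chosen for illustration; they are not measured ALICE values.

```python
import math


def relative_sc_fluctuation(n_pileup, rel_rms_mult, f_mu_tot, rel_rms_q):
    """Relative space-charge density fluctuation sigma_SC / mu_SC.

    n_pileup     -- number of ion pile-up events
    rel_rms_mult -- sigma/mu of the track multiplicity distribution
    f_mu_tot     -- F * mu_tot(r), tracks contributing in the volume fraction F
    rel_rms_q    -- sigma/mu of the single-track ionization at radius r
    """
    per_event = 1.0 + rel_rms_mult**2 + (1.0 + rel_rms_q**2) / f_mu_tot
    return math.sqrt(per_event / n_pileup)


# Placeholder inputs (assumptions, not values from the text).
full_volume = relative_sc_fluctuation(10000, rel_rms_mult=1.4, f_mu_tot=600.0, rel_rms_q=1.5)
small_fraction = relative_sc_fluctuation(10000, rel_rms_mult=1.4, f_mu_tot=6.0, rel_rms_q=1.5)
print(f"full volume: {full_volume:.4f}, 1% of the tracks: {small_fraction:.4f}")
```

In this form the fluctuations fall as one over the square root of the number of pile-up events, and they grow when fewer tracks contribute (smaller F µ_tot), which matches the qualitative behaviour described for the right plot of Fig. 1.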

Integrated digital currents as an estimator for the space-charge density
The TPC integrated digital currents (IDCs) provide an estimate of the instantaneous space-charge density, taking local variations of ε across the readout area and the ion drift time into account. The IDCs make it possible to follow the space-charge density fluctuations with high time granularity. The signals collected at the pads are integrated over 1 ms by the readout, which delivers a 2-dimensional (r and rϕ) IDC map for each millisecond. The data are aggregated and continuously stored so that the full history of the IDCs is available at all times. The 3D IDCs of the past ion drift time contain the full information about the space-charge density fluctuations (equation 1) relevant at any given point in time. The 1D IDCs are obtained by averaging the currents over the 2-dimensional IDC map for each ms, providing information about the fluctuations of the number of events and the fluctuations of the track multiplicity.
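The relation between the 2D maps, the 3D IDCs and the 1D IDCs can be sketched as a rolling buffer. This is a toy illustration only: the class and function names are hypothetical, the maps are tiny nested lists, and the default buffer length of 200 simply assumes a 200 ms ion drift time at one map per millisecond.

```python
from collections import deque


def one_d_idc(idc_map):
    """Average a 2D (r, rphi) IDC map into a single 1D IDC value."""
    values = [v for row in idc_map for v in row]
    return sum(values) / len(values)


class IdcHistory:
    """Rolling buffer holding the 2D IDC maps of the past ion drift time."""

    def __init__(self, n_intervals=200):
        # Older maps are dropped automatically once the buffer is full.
        self.maps = deque(maxlen=n_intervals)

    def add(self, idc_map):
        self.maps.append(idc_map)

    def idc_3d(self):
        """Time-ordered stack of 2D maps (time, r, rphi)."""
        return list(self.maps)

    def idc_1d(self):
        """One averaged current per 1 ms interval."""
        return [one_d_idc(m) for m in self.maps]


history = IdcHistory(n_intervals=3)
for i in range(4):
    history.add([[float(i), float(i)], [float(i), float(i)]])
print(history.idc_1d())
```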

Calibration strategy for space-charge distortions and distortion fluctuations
The ALICE reconstruction strategy for Run 3 and beyond foresees two main stages, the synchronous and the asynchronous stage [6]. During the synchronous stage, a fast online reconstruction of the data coming from the detector is performed, including tracking based on a first calibration of the relevant parameters. The data are compressed and permanently stored for further processing during the asynchronous reconstruction, applying the final calibration in order to reach the required data quality.
A calibration of the space-charge distortions and distortion fluctuations in r, rϕ and z-direction is performed during both the synchronous and the asynchronous reconstruction. At both stages, the correction is applied in two steps. First, the distortions averaged over longer time intervals (O(1 min)) are corrected, using correction maps obtained from simulation or measured in data as described in [7]. Residual distortion fluctuations remain after the average correction and are corrected in a second step [2], relying on the measurement of the fluctuations of the IDCs. The distortions and corrections for any given space-charge density can be calculated analytically by solving the Poisson equation and the Langevin equation [8]. Numerical algorithms using modern programming and computing techniques are available to perform these calculations on time scales of O(1 s) to O(1 min), but they take too long to be applicable for the reconstruction of 10 ms time frames. Therefore, ML algorithms and convolutional neural networks will be used to predict the corrections for the space-charge distortion fluctuations, taking the IDC fluctuations as primary input for the density fluctuations. Local variations of ε and the ion drift time also have a significant impact on the space-charge density and they need to be calibrated. A data-driven calibration, using the numerical derivative of the corrections with respect to the IDCs which can be measured with high precision in the relevant time intervals of O(1 min), is foreseen as an integral part of the distortion-fluctuation correction.
Two types of corrections are foreseen for the distortion fluctuations. A 1D distortion-fluctuation correction accounts for the effects generated by the fluctuations of the number of events and of the track multiplicity (see equation 1). For this correction, a Boosted Decision Tree (BDT) algorithm or a simple dense neural network will be trained to predict the correction for the distortion fluctuations based on the 1D IDCs. It is applied both during the synchronous and asynchronous reconstruction and is expected to be sufficient for pp collisions where the space-charge distortion fluctuations are much smaller. In Pb-Pb collisions, the local distortion fluctuations due to the fluctuations of the track topology and deposited charge are also significant. A 3D distortion-fluctuation correction is performed during the asynchronous reconstruction to correct also the local fluctuations, using a CNN and the 3D IDC fluctuations.

3D distortion-fluctuation correction with a convolutional neural network
Deep neural networks [9] are expected to provide an effective alternative to time-consuming analytical calculations for performing the 3D distortion-fluctuation correction. In this preliminary study, U-Net [10] convolutional neural networks are used as supervised regression algorithms to predict the distortion fluctuations along the radial direction (dr), using the average space-charge density and the space-charge density fluctuations at fixed points in r, ϕ and z as input. Similar studies are ongoing to predict distortion fluctuations along rϕ and z (see footnote a). The distortion fluctuations calculated by numerical algorithms solving the analytical equations are used as expected output in the training procedure.
The study is performed using a dedicated software package [11] developed in Python 3.6, which makes use of several packages such as NumPy and pandas. The U-Net is built, trained and validated using Keras [12] and TensorFlow [13].

CNN architecture
The U-Net, evolved from the fully convolutional network, was first designed and applied to process biomedical images. Its basic U-shaped architecture is shown in Fig. 2. It consists of a contracting path, realized by successive convolutional and max pooling layers, and an expansion path, using convolutional and up-sampling layers. The outputs of the convolutions on the contracting path are concatenated with the corresponding up-sampling output channels in order to propagate the local information to the expanding convolutional layers.
As in the standard implementation in [10], we use two consecutive 3 × 3 × 3 convolutions, followed by a rectified linear activation function, at each contracting and expanding step. On the contraction path, the convolutions are followed by 2 × 2 × 2 max pooling. The first block of convolutions has four feature channels and this number is doubled at each step. The expansion path is symmetrical, applying 2 × 2 × 2 up-sampling at each step followed by the consecutive convolutions. The final layer is a 1 × 1 × 1 convolution that reduces the output to a single channel and the same shape as the network input.
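The layer arithmetic described above can be traced without any deep-learning framework. The sketch below follows the grid shape and the number of feature channels through the contracting and expanding paths. The network depth of 3 and the 96 × 16 × 16 example grid are assumptions made for illustration, since neither is stated here, and the convolutions are assumed to preserve the spatial shape.

```python
def unet_shapes(grid, depth=3, first_channels=4):
    """Trace (shape, channels) through a 3D U-Net with 2x2x2 pooling.

    Returns the contracting levels, the bottleneck, and the expanding levels.
    Each expanding entry carries a flag telling whether the up-sampled shape
    matches the skip connection it is concatenated with.
    """
    contracting = []
    shape, channels = tuple(grid), first_channels
    for _ in range(depth):
        contracting.append((shape, channels))
        shape = tuple(n // 2 for n in shape)  # 2x2x2 max pooling
        channels *= 2                         # channels double at each step
    bottleneck = (shape, channels)

    expanding = []
    for skip_shape, skip_channels in reversed(contracting):
        shape = tuple(2 * n for n in shape)   # 2x2x2 up-sampling
        channels = skip_channels              # convolutions halve the channels
        expanding.append((shape, channels, shape == skip_shape))
    return contracting, bottleneck, expanding


for grid in [(96, 16, 16), (90, 17, 17)]:
    _, bottleneck, expanding = unet_shapes(grid)
    print(grid, "bottleneck:", bottleneck, "skip shapes match:", [ok for _, _, ok in expanding])
```

For grid sizes that are not divisible by 2 at every level, such as 17 points along r and z, pooling followed by up-sampling does not return the original shape, so some padding or cropping is needed before concatenation. How the actual software package handles this is not specified in the text.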

Training dataset
Each training sample is represented by an array of space-charge density fluctuations and an array of the average space-charge density evaluated on a regular 3-dimensional grid in r, ϕ and z. The density maps are derived from already available data used for previous performance studies in [2]. For these studies, different grid granularities are considered. A typical grid is composed of 90 points along ϕ, 17 points along r and 17 points along z (indicated as 90 × 17 × 17 in this manuscript). Alternative grid configurations like 180 × 33 × 33 or 180 × 65 × 65 are also considered for the study.
To limit the CPU time and the disk-space resources needed to generate and store the training datasets, an augmentation procedure is adopted. In this approach, the training samples are constructed starting from a set of 1000 random simulated space-charge density maps. Each training example is finally obtained as the combination of a random and an average space-charge density map, resulting in a total of 27000 samples for the training and validation studies. The fluctuations of the space-charge density and of the distortions are calculated by subtracting the random map from the average map.

Footnote a: It is important to stress again that we refer to this calibration procedure as a 3D correction because it uses as an input for the prediction average space-charge densities and density fluctuations on a discrete 3D grid of r, ϕ and z. The correction always addresses the distortion fluctuations along all three directions r, rϕ and z, although only the prediction for the correction along the r-direction is given in this preliminary study.
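As a minimal sketch of how one training example could be assembled from a random and an average map, the snippet below computes the fluctuation as "average minus random" on flattened grids. The function names and dictionary keys are hypothetical, and the two-point maps are toy values standing in for full grids such as 90 × 17 × 17.

```python
def fluctuation_map(average_map, random_map):
    """Element-wise (average - random) on two flattened grids of equal size."""
    if len(average_map) != len(random_map):
        raise ValueError("maps must be defined on the same grid")
    return [a - r for a, r in zip(average_map, random_map)]


def make_training_example(avg_density, rnd_density, avg_distortion, rnd_distortion):
    """Network inputs (mean density, density fluctuation) and regression target."""
    return {
        "mean_density": list(avg_density),
        "density_fluctuation": fluctuation_map(avg_density, rnd_density),
        "distortion_fluctuation": fluctuation_map(avg_distortion, rnd_distortion),
    }


example = make_training_example(
    avg_density=[10.0, 20.0], rnd_density=[9.5, 20.5],
    avg_distortion=[1.0, 2.0], rnd_distortion=[1.25, 1.5],
)
print(example["density_fluctuation"], example["distortion_fluctuation"])
```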
As the distribution of space-charge density fluctuations is usually Gaussian around zero, the training data will be dominated by samples with small fluctuations, which limits the ability of the network to learn and predict scenarios with relatively large fluctuations. Therefore, the space-charge densities of the training samples are randomly biased towards generally lower or higher values to make the distribution of fluctuations more uniform. The final performance of the trained models is evaluated with the unbiased data.
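The effect of such a bias can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the bias is a single global scale factor per sample drawn uniformly around one; the actual biasing procedure, the width of the Gaussian and the size of the bias are not specified in the text and are placeholders here.

```python
import random
import statistics


def biased_scale_factors(n_samples, max_bias, rng):
    """One global density scale factor per sample, uniform in [1 - b, 1 + b]."""
    return [rng.uniform(1.0 - max_bias, 1.0 + max_bias) for _ in range(n_samples)]


rng = random.Random(0)
n_samples = 1000

# Toy unbiased relative fluctuations f = (average - random) / average.
unbiased = [rng.gauss(0.0, 0.02) for _ in range(n_samples)]

# Scaling the random density by s changes random/average from (1 - f) to s * (1 - f).
scales = biased_scale_factors(n_samples, max_bias=0.10, rng=rng)
biased = [1.0 - s * (1.0 - f) for s, f in zip(scales, unbiased)]

print(f"spread unbiased: {statistics.pstdev(unbiased):.3f}")
print(f"spread biased:   {statistics.pstdev(biased):.3f}")
```

With these placeholder settings the biased sample covers a clearly wider range of fluctuations, so large fluctuations are no longer rare in the training data, while the evaluation can still be done on the unbiased list.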

Preliminary results
Detailed studies were performed (see footnote b) to maximize the prediction accuracy of the trained model by considering different network parameters, numbers of events used for training and grid granularities. These different factors are tightly connected and cannot be factorized. We used validation tools commonly applied in the field of machine learning, like TensorBoard [14], to benchmark the network performance and to identify signs of undertraining or overtraining. Furthermore, the quality of the predictions is assessed more differentially using TPC-specific observables, e.g. selecting only a part of the TPC volume. For the sake of clarity, for most of the results presented in this document, only one setting is changed at a time, e.g. the numbers of events for the training are varied while using the same grid granularity and network configurations.
The first rough indication of the statistics needed to reach convergence was obtained from the root mean squared error (RMSE) of the predictions during the training. The results obtained with a 90 × 17 × 17 and 180 × 33 × 33 grid, presented in Fig. 3, indicate that the RMSE still decreases when going from $N^{\mathrm{ev}}_{\mathrm{training}}$ = 10000 to 18000 for both grid granularities. All curves reach proper convergence at about $n_{\mathrm{epochs}}$ = 12 at values between 200 µm and 400 µm. Future studies will indicate if a substantial gain is obtained when using larger training data samples.

Footnote b: The training and validation were carried out on a local server equipped with an NVIDIA Tesla V100 GPU.
The evidence that the RMSE during the training seems to asymptotically converge towards 200 µm for large values of $N^{\mathrm{ev}}_{\mathrm{training}}$ represents an encouraging result, but it does not yet guarantee that these models can successfully be applied in the calibration. Further studies are needed to assure the required precision of the corrections in the whole TPC volume. The regions close to the boundaries which define the TPC volume, e.g. at small and large r, are of particular interest as, in general, the largest distortion fluctuations are observed there. Furthermore, the effect of a given space-charge density fluctuation on the distortion fluctuations strongly depends on the distance to the TPC boundaries. This dependence is asymmetric while the U-Net assumes symmetric boundary conditions. In order to address the quality of the corrections in the presence of large distortion fluctuations, the average difference between the predicted and the expected distortion fluctuations is plotted as a function of the expected value of the distortion fluctuations for networks trained with different $N^{\mathrm{ev}}_{\mathrm{training}}$ (Fig. 4, left panel). While all the networks provide satisfactory results within uncertainties for space points with small distortion fluctuations, significant [...] depend on the position inside the TPC volume and on the absolute space-charge density. The U-Net algorithm assumes certain symmetries of the problem, including symmetric boundary conditions and translational invariance, both of which are broken in the case of the space-charge distortion fluctuations. In order to study by how much the performance of the network is affected, the results are analyzed differentially in the TPC phase space, using also the RootInteractive framework [15]. The right plot of Fig. 4 shows the mean value and the RMSE, averaged over all ϕ, of the difference between the predicted and true distortion fluctuations as a function of the TPC radius in a selected region in z. A small interval of the relative space-charge density fluctuations integrated over the full TPC volume, $\sum_{r,\varphi,z} (\langle\rho_{\mathrm{SC}}\rangle - \rho_{\mathrm{SC}})/\langle\rho_{\mathrm{SC}}\rangle$, is chosen. This observable represents the mean space-charge density fluctuations and it also constrains the range of the mean distortion fluctuations in a given region. Models for 90 × 17 × 17 and 180 × 33 × 33 grids, trained with 10000 and 18000 samples, are shown. The deviation of the mean value increases towards the innermost and outermost radii for all models, reaching almost 600 µm at the inner radial boundary of the TPC (r = 83.5 cm) for the high grid granularity. The RMSE gradually increases towards the inner radial boundary, implying a decrease of the prediction power. In contrast to the model using a 90 × 17 × 17 grid, the accuracy of the models using the high grid granularity still improves when increasing $N^{\mathrm{ev}}_{\mathrm{training}}$ to 18000, while the relatively large offset of the mean value of the prediction is comparable.

Figure 4. Left: The mean µ (data points) and the standard deviation $\sigma_{\mathrm{std}}$ (error bands) of the difference between the predicted and the expected distortion fluctuations $dr_{\mathrm{pred}} - dr_{\mathrm{true}}$ as a function of the true distortion fluctuations $dr_{\mathrm{true}}$. The data show the results of models with a granularity of $n_{\varphi} \times n_{r} \times n_{z}$ = 90 × 17 × 17. The different colors represent models trained with different numbers of events. Right: The mean value and the root mean squared error (RMSE) of the difference between the predicted and the expected distortion fluctuations $dr_{\mathrm{pred}} - dr_{\mathrm{true}}$ as a function of the radius r. It is shown for 0 < z < 5 cm and it is averaged over ϕ. A small interval of the integrated relative space-charge density fluctuations $\sum_{r,\varphi,z} (\langle\rho_{\mathrm{SC}}\rangle - \rho_{\mathrm{SC}})/\langle\rho_{\mathrm{SC}}\rangle$ at around 1σ of the distribution is selected. The different colors show data for different grid sizes and $N^{\mathrm{ev}}_{\mathrm{training}}$.
The dependence of the results on the distance to the TPC boundaries suggests that the asymmetric boundary conditions of the distortion fluctuations pose a challenge for the U-Net training.
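The differential observables used above, the mean and the RMSE of the difference between predicted and true distortion fluctuations as a function of the radius, can be computed as in the following sketch. The function name and the toy numbers are illustrative only; in practice the inputs would be the grid points selected in z and averaged over ϕ.

```python
import math
from collections import defaultdict


def residual_stats_by_radius(radii, dr_pred, dr_true):
    """Return {radius: (mean, rmse)} of dr_pred - dr_true per radial grid point."""
    groups = defaultdict(list)
    for r, pred, true in zip(radii, dr_pred, dr_true):
        groups[r].append(pred - true)

    stats = {}
    for r, residuals in sorted(groups.items()):
        mean = sum(residuals) / len(residuals)
        rmse = math.sqrt(sum(x * x for x in residuals) / len(residuals))
        stats[r] = (mean, rmse)
    return stats


# Toy input: two grid points at each of two radii (values are placeholders).
stats = residual_stats_by_radius(
    radii=[83.5, 83.5, 110.0, 110.0],
    dr_pred=[3.0, 5.0, 1.0, 1.0],
    dr_true=[0.0, 1.0, 1.0, 1.0],
)
for r, (mean, rmse) in stats.items():
    print(f"r = {r} cm: mean = {mean:.3f}, RMSE = {rmse:.3f}")
```

A non-zero mean signals a systematic offset of the prediction at that radius, while the RMSE also includes the point-to-point spread, which is why both quantities are shown side by side.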
A unit test is performed in order to evaluate the response of the network to local density fluctuations within a small region of the TPC. A space-charge density fluctuation in the form of a line charge is generated at a radius close to 110 cm and at TPC sector 9 (see footnote c). The charge is constant along all z and it is added on top of the nominal average space-charge density. The prediction by a model with a grid granularity of 90 × 17 × 17, trained with 10000 events, is qualitatively compared to a numerical calculation of the same scenario in Fig. 5. The scale and the characteristic shape along the ϕ-direction predicted by the model are fully dominated by a global radial dependence. At this stage of the development, the model is unable to satisfactorily reproduce the local fluctuation in a quantitative way. This observation implies that the network learns the long-range global properties of the problem, including the boundary and translational asymmetries, rather than the short-range local fluctuations. It becomes evident that either the distance to the boundary needs to be provided by slightly modifying the design of the network, or the effect of global dependencies needs to be reduced by applying the 1D distortion-fluctuation correction beforehand.

Conclusions
The space-charge distortion fluctuations in the ALICE TPC in LHC Run 3 and Run 4 need to be calibrated regularly in intervals of about 10 ms. In order to comply with the tight computing requirements imposed by the very large heavy-ion statistics to be collected, the ALICE collaboration plans to adopt ML and CNN algorithms to perform a fast calibration of the distortion fluctuations. In this document, we present preliminary results to display the potential of the U-Net convolutional neural network to predict the distortion fluctuations using space-charge density fluctuations and an average space-charge density as inputs. The performance of models trained with different numbers of training samples is compared, using two different grid granularities. A properly trained model is qualitatively able to reproduce the distortion fluctuations obtained from numerical calculations. However, a detailed quantitative differential analysis reveals difficulties of the network to learn the properties of distortion fluctuations originating from global and from local effects at the same time, while also dealing with the intrinsic asymmetries of the problem. Therefore, additional information about the distance to the boundary or a pre-filter for the global dependencies of the distortion fluctuations is required in order to use the U-Net for the correction of local fluctuations.

Footnote c: The TPC readout is divided into 18 sectors which cover the full azimuth. Therefore, the TPC sector is another measure for the ϕ-position.
Further studies will be performed to improve the prediction performance and to benchmark it with more realistic data samples. A 1D distortion-fluctuation correction will be implemented to be applied before the 3D distortion-fluctuation correction with the goal of removing the global distortion fluctuations. These depend to leading order only on the z-position in the TPC. For the 1D distortion-fluctuation correction, Boosted Decision Trees or simple dense networks are expected to perform sufficiently well. Then, the 3D distortion-fluctuation correction with the U-Net will be used to correct only for the effects from local density fluctuations. Data-driven procedures for the calibration of local ε-variations and the ion drift time will be developed and integrated into the space-charge distortion-fluctuation correction.