Particle identification with an electromagnetic calorimeter using a Convolutional Neural Network

The LHCb Electromagnetic Calorimeter (ECAL) measures the energy that particles deposit as they travel through its sensors. However, with the current granularity it is not possible to exploit the shape of the shower produced when a particle interacts with the ECAL, information that could be sufficient to determine which particle is being detected. To find out whether such a classification would be possible in future runs of the LHC, simulated data are generated with Geant4 for SPACAL, an updated version of the current calorimeter with better resolution. Convolutional Neural Networks are trained so that the algorithm learns the shower shapes and energy deposits produced by each kind of particle. The results demonstrate that the finer granularity of the upgraded ECAL allows over 95% precision in some classifications, such as photons against neutrons.


Introduction
The Large Hadron Collider (LHC) is the world's largest and most powerful particle accelerator [1]. It accelerates protons to speeds approaching the speed of light and makes them collide with each other at specific locations. The Large Hadron Collider beauty (LHCb) experiment is one of the high-energy physics experiments located at the LHC, and one of its purposes is to investigate CP violation, which could explain why the universe is made mainly of matter [2].
Convolutional Neural Networks (CNN), in deep learning, are a type of artificial neural network that, through supervised learning, can learn to capture features in images and effectively classify them [3]. The objective of the present work is to verify whether, by improving the resolution of the current Electromagnetic Calorimeter, Convolutional Neural Networks can classify the different types of particles produced in the proton-proton collisions at LHCb.
The LHCb detector consists of several layers that gather data to reconstruct which particles traverse it at a given moment. One of them is the Electromagnetic Calorimeter (ECAL) [4], where all kinds of detectable particles (electrons, muons, photons, protons, kaons, pions, etc.) deposit some energy. While charged particles are also detected in other subsystems, the ECAL is the only one that detects all visible particles, both neutral and charged.
This document consists of a first theoretical part that explains the related research that has been carried out. Afterwards, there is a section that details the data used for the subsequent experimental part. Then, there is a detailed explanation of the implemented models and the results obtained. Finally, there is a brief discussion and the presentation of the conclusions.

State of the Art
A typical high energy physics detector has several layers, as shown in Fig. 1, where each type of particle leaves a different signature. In the first layer, the tracking system, charged particles such as electrons, muons, protons, kaons and pions can be detected. On the other hand, in the Electromagnetic Calorimeter, which is the layer that lies immediately behind, all types of particles can be detected. However, electrons and photons are the only particles that deposit all their energy, and do not penetrate into deeper layers. Muons deposit little energy in the calorimeters, while the rest of the particles develop a shower. Hadrons, such as protons, kaons, pions and neutrons, deposit all their energy in the penultimate layer, which is the hadronic calorimeter. In the last layer, only muons can be detected.
Particle identification algorithms exploit information from several subdetectors. In the particular case of neutral particles, such as photons and neutral pions, only calorimeter information can be used for their identification. Current neutral particle identification tools in LHCb are based on shallow neural networks [6]. As the ECAL detects every particle, each of which leaves different shower shapes and intensities depending on its nature, some research has tried to classify the particles with information from this layer alone. One of the algorithms used is the Convolutional Neural Network, but the limited resolution of the current LHCb calorimeter makes these results very inaccurate. With three different regions, whose square cells have side lengths of 4 cm, 6 cm and 12 cm, even the best resolution is not enough to apply a CNN [7].
This document studies the performance of CNNs for image classification with an upgraded LHCb calorimeter for Run 5 of the LHC. In particular, the study has been done with a simulation of a SPACAL module [8], a new technology that increases the image definition with smaller cells (one cell every 15 mm) and with separate front and rear sections, the front being 4 cm long and the rear 10 cm. Note that the final design of the upgraded calorimeter is still under discussion and that the small granularity used here would be implemented only in the innermost region of the ECAL, with three other regions of increasing granularity.

Data Sample
The datasets used in this project are generated with Geant4, a toolkit for simulating the passage of particles through matter [9], applied to the SPACAL prototype. In total, 4,000 samples are generated for each type of charged particle and 8,000 for each neutral particle. The incident angle and position of the particle with respect to the central cell are varied randomly within the typical values of particles produced in proton-proton collisions at LHCb. Only a 7x7 matrix of cells is used in the simulation. This number of simulated samples allows, among other things, the balanced combinations shown in Table 1.
The data for each record consist of two 7x7 pixel images, where each pixel captures the intensity of the energy deposited. The first image is the representation of the front sensor and the second image is the one at the rear side; the neural network interprets them as a single image with two channels. Images may contain noise, so a threshold of 1 MeV was applied to the energy deposited in each cell. Apart from this, no additional cleaning has been necessary.
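The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming the simulation provides the front and rear 7x7 energy maps in MeV as arrays; the function name and field layout are assumptions, not the study's actual code.

```python
import numpy as np
import torch

def to_two_channel_image(front, rear, threshold_mev=1.0):
    """Stack the front and rear 7x7 energy maps into one 2-channel tensor,
    zeroing cells below the 1 MeV noise threshold described in the text."""
    img = np.stack([front, rear]).astype(np.float32)  # shape (2, 7, 7)
    img[img < threshold_mev] = 0.0                    # suppress noise cells
    return torch.from_numpy(img)

# Example with random energy deposits
front = np.random.uniform(0, 500, (7, 7))
rear = np.random.uniform(0, 500, (7, 7))
x = to_two_channel_image(front, rear)
print(x.shape)  # torch.Size([2, 7, 7])
```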

Tools
Python has been the main programming language. As a development environment, Google Colab has been used: an environment with a simple and user-friendly interface that, when running on remote servers, also allows the selection of different processors (CPU, GPU and TPU). The main Python library used has been PyTorch, a replacement for NumPy optimized for the use of GPUs and the development of neural networks [10].
Additionally, MLFlow, an open source platform for the machine learning lifecycle, has been tested to save configurations and results of every model tested.

Development
Convolutional Neural Networks have been shown to be successful in image classification. However, it is necessary to select the appropriate configuration and hyperparameters to obtain good results. The number of fully connected layers has ranged from one to four. In addition, the following activation functions have been tested: ReLU, Leaky ReLU, Softmax, LogSoftmax, Sigmoid and LogSigmoid. The worst results have been found with Sigmoid and LogSigmoid outputs, while the best results had in common a configuration of three fully connected layers.

IV. Number of convolutions:
Neural nets without convolutions, with just fully connected layers, have also been tested. However, the best results have been found in configurations with one or two convolutions (2x2 kernels) and max pooling (with different paddings and strides).
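One configuration from the explored family can be sketched as below: two 2x2 convolutions with max pooling followed by three fully connected layers and a LogSoftmax output. The channel counts and hidden-layer sizes here are illustrative assumptions, not the exact values used in the study.

```python
import torch
import torch.nn as nn

class SpacalCNN(nn.Module):
    """Sketch of a two-convolution, three-fully-connected-layer classifier
    for the 2-channel 7x7 SPACAL images."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=2), nn.ReLU(),   # (2,7,7) -> (16,6,6)
            nn.MaxPool2d(2, stride=1),                    # -> (16,5,5)
            nn.Conv2d(16, 32, kernel_size=2), nn.ReLU(),  # -> (32,4,4)
            nn.MaxPool2d(2, stride=2),                    # -> (32,2,2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 2 * 2, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, num_classes),
            nn.LogSoftmax(dim=1),  # pairs with NLLLoss during training
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SpacalCNN(num_classes=2)
out = model(torch.randn(8, 2, 7, 7))  # batch of 8 two-channel images
print(out.shape)  # torch.Size([8, 2])
```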

V. Optimizers:
The first optimizer used was Adam. Afterwards, the best values were found with AdaMax, a variant of Adam based on the infinity norm, which improves stability [11]. Finally, Stochastic Gradient Descent was also tested, with unsuccessful results.

VI. Loss functions:
The loss functions used to quantify the difference between predicted and true values have been Negative Log Likelihood Loss and Cross Entropy Loss, with similar success.
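The optimizer and loss choices above can be combined in a single training step as sketched here. The tiny linear model and the learning rate are placeholders for illustration; only the AdaMax/NLLLoss pairing reflects the text.

```python
import torch
import torch.nn as nn

# Placeholder model: flatten the 2x7x7 image and classify into 2 classes.
model = nn.Sequential(nn.Flatten(), nn.Linear(2 * 7 * 7, 2), nn.LogSoftmax(dim=1))
optimizer = torch.optim.Adamax(model.parameters(), lr=1e-3)  # AdaMax optimizer
criterion = nn.NLLLoss()  # expects log-probabilities, hence LogSoftmax above

x = torch.randn(16, 2, 7, 7)    # a batch of two-channel images
y = torch.randint(0, 2, (16,))  # particle-type labels

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())
```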

VII. Batch sizes and epochs:
The number of epochs has ranged between 10 and 3,000 (Fig. 3), depending on the batch size. In an attempt to reduce the learning time, the preference has been small batch sizes combined with larger numbers of epochs [12]. Finally, 80% of the data was set aside to perform a 5-fold cross-validation, while the remaining 20% was used for testing.
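The data protocol described above can be sketched as follows: 20% held out for the final test, 5-fold cross-validation on the remaining 80%. The arrays and the metric are dummies; only the split proportions follow the text.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Dummy dataset matching the photon-vs-neutron sample size (8,000 events)
X = np.random.rand(8000, 2, 7, 7)
y = np.random.randint(0, 2, 8000)

# Hold out 20% for the final test set
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 5-fold cross-validation on the remaining 80% (training itself omitted)
accuracies = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X_dev):
    accuracies.append(len(val_idx) / len(X_dev))  # placeholder metric

print(len(X_test), len(accuracies))  # 1600 5
```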

Results
The following subsections show the best results achieved with different nets for each particle classification with SPACAL.

Neutral particles
The simplest classification has involved two types of particles: photons and neutrons. Photons deposit all their energy in the electromagnetic calorimeter, while neutrons deposit just a part of it. However, both of them produce a shower. After training the model, the test set (1,600 samples) shows an accuracy greater than 95% (see Table 2 and Fig. 4). In fact, cross-validation showed a mean accuracy of 95.5% with a standard deviation of 0.3%.

Three particles
This case classifies three types of particles, each pair of which shares at most one feature: electrons have in common with pions the shower generated in the ECAL, while pions have in common with muons that neither deposits all its energy in this layer. Finally, electrons and muons share neither characteristic: neither the shower nor the energy deposited.
Results showed that the energy deposited in the ECAL and the shower generation are different enough among these three particles to make a fairly precise classification (Fig. 5). However, pions had a high false positive rate and muons a high false negative rate (Table 3).
Cross-validation showed an average accuracy of 79.2% with a standard deviation of 1.1%.

Particles without charge differentiation
This subgroup contains all the particles without charge differentiation. Consequently, there are different particles that share behaviours in the ECAL: energy deposit and/or shower generation (Table 4 and Fig. 6). Average accuracy during cross-validation was found to be 39.1% with a standard deviation of 0.6%. Merging photons and electrons into a single category, because the electromagnetic showers they produce are indistinguishable, the precision and recall were found to be 0.776 and 0.935, respectively.
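The merged-category evaluation above amounts to collapsing the photon and electron labels into one "electromagnetic" class before computing precision and recall, as sketched here. The label encoding is an assumption for illustration.

```python
import numpy as np

# Hypothetical integer labels
GAMMA, ELECTRON, OTHER = 0, 1, 2

def merge_em(labels):
    """Collapse photon and electron labels into a single EM class (0)."""
    labels = np.asarray(labels)
    return np.where((labels == GAMMA) | (labels == ELECTRON), 0, 1)

# Toy truth and predictions: photon/electron confusions no longer count as errors
y_true = merge_em([GAMMA, ELECTRON, OTHER, GAMMA, OTHER])
y_pred = merge_em([ELECTRON, GAMMA, GAMMA, GAMMA, OTHER])

tp = np.sum((y_pred == 0) & (y_true == 0))   # true positives for the EM class
precision = tp / np.sum(y_pred == 0)
recall = tp / np.sum(y_true == 0)
print(precision, recall)  # 0.75 1.0
```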

All particles
The pattern in this subsection is exactly the same as the one seen in the previous subsection: muons form an independent group, electrons and photons belong to another independent group, and the remaining hadrons belong to a third group. In this case, the aim is also to differentiate between charges, but this is unsuccessful and, physically, impossible without information from the tracking system. As a result, the overall precision decreases, but at a general level there is a classification into three groups of particles (Table 5 and Fig. 7), as seen before:
• The first group includes e−, e+ and γ. These electromagnetic particles deposit all their energy and produce wider showers.
• The second group is composed of µ+ and µ−. The energy deposited is lower and there is no shower.
• Hadrons (n, p, p̄, π+, π−, K+ and K−) belong to the last group. This group shows lower energy, particularly in the front cells, so they might be confused with muons. However, hadrons usually start their showers later, so they can deposit larger amounts of energy in the rear cells.
Cross-validation showed an average accuracy of 24.1% with a standard deviation of 1.2%.

Algorithm comparison
CNNs have demonstrated good precision in particle classification at SPACAL resolution. However, this resolution allows other algorithms to perform well too. Table 6 shows the best results obtained by a boosted decision tree (XGBoost [13]) after careful tuning. Parameters tested include, but are not limited to:
• Number of estimators: from tens to hundreds of trees.
• Learning rate: values ranging from 0.01 to 0.1.
• Maximum depth: values from units to a few tens, to ensure good precision while avoiding overfitting.
• Subsample: different sampling values from 0 to 1 to prevent overfitting.
Finally, the precision obtained is about 10% below the CNN's. The same boosted decision tree was applied after image centering (i.e. reducing the 7x7 matrix to 5x5 so that the centre contains the maximum energy) and the results improved significantly.
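The centering step can be sketched as below: crop the 7x7 map to 5x5 so that the maximum-energy cell sits at the centre. Near the borders, the window is clamped to stay inside the image, which is an assumption about how edge cases were handled.

```python
import numpy as np

def center_crop(image, size=5):
    """Crop `image` to `size`x`size` around its maximum-energy cell,
    clamping the window so it stays inside the original matrix."""
    r, c = np.unravel_index(np.argmax(image), image.shape)
    half = size // 2
    r0 = min(max(r - half, 0), image.shape[0] - size)
    c0 = min(max(c - half, 0), image.shape[1] - size)
    return image[r0:r0 + size, c0:c0 + size]

img = np.zeros((7, 7))
img[3, 4] = 100.0  # hottest cell
crop = center_crop(img)
print(crop.shape, np.argmax(crop))  # (5, 5) 12 -> centre of the 5x5 window
```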

Conclusions
After having used the SPACAL simulation, it can be verified that CNNs are algorithms that perform well with the future upgrade of the ECAL (Run 5), even though they do not work for the