Track Fitting for the Belle II Experiment

The Belle II experiment started taking data in 2018, studying e+e− collisions at the KEK facility in Tsukuba (Japan), in a center-of-mass energy range covering the Bottomonium states. The tracking system combines hit measurements from the vertex detector, made of pixel detectors and double-sided silicon strip detectors, and from a central drift chamber, all inside a solenoid providing a 1.5 T magnetic field. Once the pattern recognition routines have identified the track candidates, the hit measurements are fitted taking into account the different kinds of information provided by the different detectors, the energy loss in the materials and the inhomogeneity of the magnetic field. Track fitting is performed by the generic track-fitting software GENFIT, which includes a Kalman filter improved by a deterministic annealing filter in order to reject outlier hits, i.e. hits wrongly associated to the track by the pattern recognition. Several mass hypotheses are used in the fit, in order to achieve the best track parameter estimation for each particle kind. This article presents the design of the track fitting in the Belle II software, showing results in terms of track parameter estimation as well as computing performance.

∗e-mail: stefano.spataro@to.infn.it


The Belle II experiment
The Belle II experiment [1] aims to study e+e− collisions at the SuperKEKB accelerator in Tsukuba, Japan, with an unprecedented design luminosity of 8 × 10³⁵ cm⁻² s⁻¹. The physics program is focused on precise determinations of fundamental physical constants and the search for physics beyond the Standard Model. To achieve these goals a very high precision tracking system is mandatory, in order to obtain particle momenta and decay vertices with high accuracy. The Belle II experiment features three different tracking detectors. The vertex detector (VXD) is the innermost part and surrounds the interaction region; it consists of two sub-detectors: a Pixel Vertex Detector (PXD), which includes two layers of pixelated sensors based on Depleted P-channel Field Effect Transistor (DEPFET) technology, and a double-sided Silicon strip Vertex Detector (SVD), with four layers of silicon strip sensors. The central tracking detector is a large volume Central Drift Chamber (CDC), filled with a He-C₂H₆ gas mixture and made of 14,336 sense wires organised into 56 layers, which surrounds the VXD. The tracking region is enclosed within a solenoid producing a 1.5 T magnetic field, which bends the charged particle trajectories and thus allows for the measurement of their momenta. The information coming from the different systems must be merged together, and the positions provided by the detector hits must be fitted in order to extract the track parameters of each particle. This paper focuses on the strategies adopted by Belle II for track fitting, showing preliminary results from the first collisions, which were recorded in 2018 with a partial setup. Figure 1 shows the design of the Belle II tracking software.
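The link between the bending in the solenoid field and the measured momentum can be made concrete with the standard relation p_t [GeV/c] ≈ 0.3 · B [T] · R [m] for a unit-charge particle. The following is a minimal sketch, assuming a perfectly uniform 1.5 T field (the real field is not homogeneous); the function names are illustrative and not part of the Belle II software.

```python
# Sketch: transverse momentum versus radius of curvature in a uniform field.
# Assumption: uniform solenoid field; names are illustrative only.

C_FACTOR = 0.299792458  # GeV/c per (tesla * metre) for a unit-charge particle


def transverse_momentum(radius_m, b_field_tesla=1.5, charge=1):
    """Return p_t in GeV/c for a track with the given curvature radius."""
    return C_FACTOR * abs(charge) * b_field_tesla * radius_m


def curvature_radius(pt_gev, b_field_tesla=1.5, charge=1):
    """Return the curvature radius in metres for a given p_t in GeV/c."""
    return pt_gev / (C_FACTOR * abs(charge) * b_field_tesla)


if __name__ == "__main__":
    for pt in (0.1, 0.3, 1.0):
        print(f"p_t = {pt:.1f} GeV/c -> R = {curvature_radius(pt):.3f} m")
```

Low momentum tracks have a small radius of curvature, which is why they can curl inside the tracking volume.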
The momentum values and the vertex coordinates of each charged particle are reconstructed by merging the information from all the tracking detectors of the spectrometer into so-called track candidates (pattern recognition), which must afterwards be fitted to extract the track parameters (track fitting). The goal of pattern recognition is to identify the detector hits belonging to the same particle. In Belle II it is performed with a modular design involving different sequential steps, optimised to deal with the different components of the tracking system. At first, tracklets are found separately for the CDC and for the SVD. In the CDC, since a large number of hits is produced by machine-induced background, a boosted decision tree classifier first removes noisy hits and cleans the hit sample; the surviving hits are used by two algorithms, one based on a Legendre track finder, the other on a Cellular Automaton, and the track candidates they find are then merged together. In the SVD, the detector is subdivided into sectors and a Monte Carlo mapping is performed in order to identify the possible paths that particles follow between different detector layers; a Cellular Automaton then finds the combinations of hits which could belong to a track candidate. A Combinatorial Kalman Filter (CKF) makes it possible to merge CDC and SVD tracklets, as well as to attach SVD hits to the CDC candidates. Thanks to the modular design, the CKF can be run after the SVD track finding, recovering SVD hits which are not associated to any SVD candidate, or before the SVD track finding, attaching SVD hits to CDC candidates so that the SVD track finder has to deal with a cleaner hit sample and therefore with fewer possible combinations; the second approach is the current standard in the code, since it has demonstrated a higher efficiency. Finally, the CKF algorithm is also used to search for PXD hits to be associated to the track candidates. More detailed information is provided in [2]. Once the track candidates are found, the last step in the reconstruction consists in fitting the tracks.

Track Fitting
Due to the characteristics of the Belle II detector, fitting the track candidates is not straightforward, and dedicated techniques are needed to achieve the performance required by the experiment in terms of momentum resolution and vertexing. Indeed, the tracking system is composed of different sub-detectors which provide completely different kinds of hit information:
• the PXD is a pixel detector which provides a two-dimensional position (XY) along the detector planes;
• the SVD is made of strips, which provide the position perpendicular to the strip direction;
• the CDC consists of wires providing the drift time information, without telling on which side of the wire the particle passed (left-right ambiguity).
When the track parameters are propagated to each detector plane, different calculations are required for the extrapolation and for the covariance matrix of the parameters needed by the fit. For a better estimation of the parameters, it is also important to consider that the magnetic field is not perfectly homogeneous in the tracking region, as shown in Fig. 2, and to include in the fit the effects of multiple scattering and energy loss, which depend on the particle kind. All these problems are handled by GENFIT2 [3,4], a generic toolkit for track reconstruction: an open source, modular C++ track-fitting framework which provides routines based on a Kalman Filter algorithm. Originally developed inside the software of the PANDA [5] experiment (GENFIT [6]), it has since been improved and is nowadays used by several HEP experiments (Belle II, PANDA, GEM-TPC@FOPI [7], SHiP [8]). The toolkit takes as input the track parameters obtained from the preliminary helix fit of the pattern recognition algorithms, together with the hit measurements contributing to the track candidate. Several fitting algorithms are available; in Belle II we use the one based on the Deterministic Annealing Filter (DAF), which is able to decrease the weight of hits far from the trajectory, thus removing wrongly matched hits (outliers). The extrapolations between different detector planes are performed by a Runge-Kutta algorithm, based on the GEANT3 [9] code, which uses exactly the same detector geometry as the simulation and realistic magnetic field maps coming from measurements. At the end, the tracks with refitted parameters are stored and passed to the next steps of the reconstruction and analysis.
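The way an annealing filter down-weights outliers can be illustrated with a toy example. The sketch below is not the GENFIT2 implementation: it assumes a one-dimensional constant state, a single hit per layer, the commonly used weight w = 1 / (1 + exp((χ² − χ²_cut) / 2T)), and an arbitrary annealing scheme of temperatures (100, 10, 1). All names and numerical values are hypothetical.

```python
import math


def daf_weight(chi2, chi2_cut, temperature):
    """Assignment weight of a hit; equals 0.5 when chi2 == chi2_cut."""
    x = (chi2 - chi2_cut) / (2.0 * temperature)
    if x > 700.0:  # avoid overflow in exp; the weight is numerically zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def weighted_update(x_pred, var_pred, meas, var_meas, weight):
    """1D Kalman update where the hit variance is inflated by 1/weight."""
    if weight <= 0.0:
        return x_pred, var_pred  # hit effectively removed from the fit
    gain = var_pred / (var_pred + var_meas / weight)
    return x_pred + gain * (meas - x_pred), (1.0 - gain) * var_pred


def anneal(x0, var0, hits, var_meas, chi2_cut=9.0,
           temperatures=(100.0, 10.0, 1.0)):
    """Toy annealing loop: refit with current weights, then update weights."""
    weights = [1.0] * len(hits)
    x, var = x0, var0
    for temperature in temperatures:
        x, var = x0, var0  # restart the filter from the seed
        for meas, w in zip(hits, weights):
            x, var = weighted_update(x, var, meas, var_meas, w)
        weights = [daf_weight((m - x) ** 2 / var_meas, chi2_cut, temperature)
                   for m in hits]
    return x, var, weights


if __name__ == "__main__":
    # Three compatible hits and one outlier, resolution sigma = 0.1
    x, var, weights = anneal(0.0, 100.0, [0.1, -0.2, 0.05, 5.0], 0.01)
    print(x, weights)
```

In this toy, the outlier at 5.0 ends with a weight compatible with zero while the three good hits keep weights close to one, which is the behaviour exploited to reject wrongly associated hits.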
Since the fit depends also on the particle kind, we fit the same track candidate with different particle hypotheses, as explained in the next section.

Multi-hypothesis Track Fitting
While crossing the detector, particles lose part of their energy through interactions with the material, and this loss depends on the particle momentum as well as on the particle kind. This means that using a wrong particle hypothesis in the track fit can cause a systematic bias in the estimated track parameters. The effect is negligible for high energy particles, but at low energies (for momentum values below 1 GeV/c) the deviation between the correct and the wrong hypothesis becomes more important: it grows as the particle momentum decreases and as the mass difference between the correct and the wrong particle hypotheses increases. Moreover, electrons additionally undergo bremsstrahlung and require special handling (still under development for Belle II). Fig. 3 shows the track fitting results from a simulation of single particle events with kaons generated at fixed polar angle (θ = 60°) and at fixed momentum. Tracks are fitted with four particle hypotheses (µ, π, K, p), and the relative residuals of the transverse momentum are plotted together for comparison. At high momentum (p_t = 1 GeV/c) all the residual distributions overlap, since the track fitting results are the same; only the proton hypothesis shows a slight overestimation of the reconstructed momentum, but at the per mille level, which can safely be neglected. The situation is different at low momentum (p_t = 0.3 GeV/c), where the residual is well centred at zero only for the correct hypothesis: when a lighter particle is assumed in the fit, the reconstructed momentum is largely underestimated (notice that the pion and muon hypotheses provide almost the same results, since their mass difference is small), while when a heavier particle is assumed the momentum is largely overestimated (outside the plot range on the right). In Fig. 4 the mean residuals for kaon tracks are plotted versus the momentum of the generated particle, under different hypotheses.
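The dependence of the energy loss on the particle kind at fixed momentum can be illustrated with the Bethe formula. The sketch below uses a simplified form (no density or shell corrections) with silicon as an assumed reference material; it is only meant to show the trend and does not reproduce the energy-loss model used by GENFIT2. Function and constant names are illustrative.

```python
import math

M_E = 0.51099895e-3  # electron mass in GeV/c^2
K_COEFF = 0.307075   # MeV mol^-1 cm^2
MASSES = {"mu": 0.1056584, "pi": 0.1395704, "K": 0.493677, "p": 0.9382721}


def bethe_dedx(p_gev, mass_gev, z_over_a=14 / 28.0855,
               mean_excitation_ev=173.0):
    """Mean stopping power in MeV cm^2/g for a unit-charge particle.

    Simplified Bethe formula; the default material constants are those of
    silicon and are an assumption made for illustration.
    """
    bg2 = (p_gev / mass_gev) ** 2           # (beta * gamma)^2
    beta2 = bg2 / (1.0 + bg2)
    gamma = math.sqrt(1.0 + bg2)
    me_mev = M_E * 1e3
    ratio = M_E / mass_gev
    # Maximum energy transfer to a free electron in one collision (MeV)
    t_max = 2.0 * me_mev * bg2 / (1.0 + 2.0 * gamma * ratio + ratio ** 2)
    i_mev = mean_excitation_ev * 1e-6
    log_term = 0.5 * math.log(2.0 * me_mev * bg2 * t_max / i_mev ** 2)
    return K_COEFF * z_over_a * (log_term - beta2) / beta2


if __name__ == "__main__":
    for p in (0.3, 1.0):
        row = ", ".join(f"{name}: {bethe_dedx(p, m):.2f}"
                        for name, m in MASSES.items())
        print(f"p = {p} GeV/c -> dE/dx [MeV cm^2/g] {row}")
```

At 0.3 GeV/c the heavier hypotheses correspond to a much larger energy loss than the pion hypothesis, so a fit assuming the wrong mass applies the wrong momentum correction along the trajectory.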
Figure: Belle II simulations: comparison between mean momentum residuals obtained with different particle hypotheses in the track fitting as a function of particle momentum, for kaons at fixed polar angle (the 4% bias in the kaon fit at 100 MeV/c is inside the large resolution at this low momentum value).

Since a large number of particles in the physics channels of interest for the experiment have momentum values below 1 GeV/c, it is evident that using the proper particle hypothesis is mandatory to achieve high performance. Since the particle kind can be identified only after the correlation with the particle identification detectors, at the tracking level we need to run a separate track fit for each hypothesis, so that the correct track parameters can be used at the analysis level once the particle is properly identified. Running the track fit multiple times has a computing cost, in terms of CPU time and also of disk space, since the track parameters need to be stored on disk. In order to evaluate the impact of running several hypotheses instead of only one, simulation tests were done on a generic cocktail sample, comparing the computing performance with only one particle hypothesis (π), with 3 hypotheses (πKp) and with 4 hypotheses (πKpd). The results are summarised in Table 1.
The track fitting time, as well as the data size, scales almost linearly with the number of particle hypotheses. The ratios with respect to the single-hypothesis fit are slightly smaller than the number of hypotheses used, in particular for the disk space. This is understood: when the hypothesis is very different from the correct one, the extrapolated track parameters lie far from the measurements, so that the hits are rejected by the DAF or the fit itself fails; in the end, less computing time and disk space are required than for performing a full fit and storing a complete track.
In Belle II, the adopted strategy is not to run the fit for every particle species which could be present in the data (namely 6 hypotheses, eµπKpd), since this would increase the required computing resources too much. We decided to run three hypotheses in the fit, as follows:
• the π hypothesis is used for electron, muon and pion candidates: the difference in energy loss between pions and muons is negligible, while electrons require additional corrections due to bremsstrahlung, and as a first estimate the pion hypothesis provides good results, albeit with large residual tails, as expected;
• the K hypothesis is used for kaon candidates;
• the p hypothesis is used for both proton and deuteron candidates: deuterons are rare in the data and needed only for dedicated analyses, so we use the proton hypothesis as a starting fit and plan to refit them in a second stage, once the identification is available.
Looking at Fig. 4, it is clear that for momentum values larger than 1 GeV/c the difference between the hypotheses is negligible; indeed, as a further optimisation, we plan to choose the number (and kind) of hypotheses in the fit as a function of momentum and, if possible, also to use partial particle identification information (such as dE/dx) to reduce the number of fits.
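The assignment of fit hypotheses to particle candidates described above can be written as a simple lookup. The following sketch is purely illustrative: the names are hypothetical, and the momentum-dependent selection only mimics the planned optimisation with an assumed 1 GeV/c threshold, which is not an implemented Belle II setting.

```python
# Candidate species -> mass hypothesis used in the fit (as described in the text)
FIT_HYPOTHESIS = {
    "e": "pi",   # bremsstrahlung corrections still under development
    "mu": "pi",  # energy-loss difference with respect to pions is negligible
    "pi": "pi",
    "K": "K",
    "p": "p",
    "d": "p",    # deuterons to be refitted later, once identified
}


def hypotheses_to_fit(candidates=("e", "mu", "pi", "K", "p", "d")):
    """Return the ordered, de-duplicated list of hypotheses to run."""
    return list(dict.fromkeys(FIT_HYPOTHESIS[c] for c in candidates))


def hypotheses_for_momentum(p_gev, threshold_gev=1.0):
    """Hypothetical momentum-dependent choice (assumed threshold).

    Above the threshold the hypotheses give nearly identical results,
    so a single pion fit is run in this sketch.
    """
    return ["pi"] if p_gev > threshold_gev else ["pi", "K", "p"]


if __name__ == "__main__":
    print(hypotheses_to_fit())           # three fits instead of six
    print(hypotheses_for_momentum(0.3))
    print(hypotheses_for_momentum(2.0))
```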

Preliminary results from experimental data
In April 2018, the first e+e− collisions were recorded by the Belle II detector with an incomplete setup (only 1/10 of the vertex detector was installed), and ∼500 pb⁻¹ of collision data were collected for commissioning (so-called Phase 2), in preparation for the full physics program starting in March 2019 (Phase 3). With this data sample, taken with a non-optimised beam profile and a preliminary calibration, it was possible to test the effectiveness of the track fitting. Fig. 5 shows the distribution of the reconstructed momentum for all the particles, as obtained from the preliminary fit of the pattern recognition algorithms (helix fit) and from the fits with different particle hypotheses. In these data the dominant Bhabha scattering e+e− → e+e− was not suppressed by the trigger, for testing purposes; indeed, the two high energy peaks corresponding to the scattered beam particles are evident, together with the low momentum particles. At high momentum the different fit hypotheses provide the same distributions, as expected, and the two Bhabha peaks are narrower than in the helix fit, showing the improvement in resolution coming from the fit. At low momenta, where the differences in energy loss are larger, the distributions from the different fits have different shapes. One of the characteristics of the Deterministic Annealing Filter is its capability to remove from the fit hits wrongly associated by the pattern recognition. In Phase 2 the beam background was high, inducing a large number of hits in the detector not coming from collisions. Fig. 6 shows the percentage of removed hits as a function of the seed momentum from the pattern recognition, for the pion and proton fits, in logarithmic scale.
At high momentum, where we expect mostly muons and electrons and where the energy loss does not depend on the hypothesis, a horizontal band shows that on average 10% of the hits are removed, mostly background hits (as expected). The low momentum range (below 1 GeV/c) is populated by more particle species, and using the wrong hypothesis can lead the fit to fail; indeed, in the case of a wrong hypothesis, the DAF tends to remove a large number of hits in an attempt to improve the χ², as shown in the plot. This is evident in the proton fit plot, where an excess of entries with a very high percentage of removed hits (above 90%) is present in the region of momentum below 0.5 GeV/c, while the same region is almost empty in the pion fit plot. The track of a very low momentum particle (mainly a pion), when fitted under the incorrect proton hypothesis, is expected to lose a lot of energy and thus to curl inside the detector, while in reality it is able to cross the outer layers as well; this results in a huge number of removed hits, until only the minimum number of hits needed for a helix fit is left. This tells us that the information on the hits removed by the DAF could be used in the future to help in the particle identification task.
Finally, Fig. 7 presents two examples of resonance reconstruction using the results from track fitting, i.e. K⁰_S → π+π− and Λ⁰ → pπ−, showing the good performance of the invariant mass reconstruction, even at this preliminary stage of the experiment.
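The invariant mass of a two-track candidate follows from the fitted momenta and the assumed daughter masses, M² = (E₁ + E₂)² − |p₁ + p₂|². The sketch below is a minimal illustration with hypothetical function names; in practice the momenta should be taken from the fit performed with the matching hypothesis (for example the proton fit for the proton daughter of a Λ⁰ candidate).

```python
import math

# Approximate particle masses in GeV/c^2
MASS = {"pi": 0.1395704, "K": 0.493677, "p": 0.9382721}


def energy(momentum, mass):
    """Energy in GeV from a momentum 3-vector (GeV/c) and a mass hypothesis."""
    px, py, pz = momentum
    return math.sqrt(px * px + py * py + pz * pz + mass * mass)


def invariant_mass(mom1, mass1, mom2, mass2):
    """Invariant mass in GeV/c^2 of a two-track system."""
    e_tot = energy(mom1, mass1) + energy(mom2, mass2)
    px, py, pz = (a + b for a, b in zip(mom1, mom2))
    mass_sq = e_tot * e_tot - (px * px + py * py + pz * pz)
    return math.sqrt(max(mass_sq, 0.0))  # guard against rounding below zero


if __name__ == "__main__":
    # Two back-to-back pions with |p| = 0.206 GeV/c, close to a K0_S at rest
    m = invariant_mass((0.206, 0.0, 0.0), MASS["pi"],
                       (-0.206, 0.0, 0.0), MASS["pi"])
    print(f"m(pi pi) = {m:.4f} GeV/c^2")
```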

Conclusion
In Belle II, global track fitting is performed by means of the GENFIT2 software. The track fitting procedure takes into account a realistic magnetic field, the different kinds of detector hits, and the energy loss of different particles: tracks are fitted with three mass hypotheses (π, K, p), a compromise between performance and computing resource consumption. A momentum-dependent choice of the mass hypotheses in the fit can reduce CPU time and disk usage, in particular for high momentum particles, where the differences in the estimated track parameters under different fit hypotheses are negligible. The Deterministic Annealing Filter algorithm removes outliers and down-weights distant hits, which makes it possible to detect wrong mass hypotheses and could thus contribute to particle identification. In 2018, data from the first collisions were acquired and the complete tracking software was tested, providing good results in particle reconstruction even in a non-optimal environment, and validating the tracking design for high level physics. At present, studies are ongoing to improve the electron reconstruction by including bremsstrahlung corrections, and to optimise the computing resources used for fitting.