Event reconstruction and particle identification in the ALICE experiment at the LHC

In these proceedings we give an overview of the methods for track and vertex reconstruction as well as particle identification performed with the ALICE experiment at the LHC. Because of the very high particle multiplicities and the softness of the particle momentum spectra observed in Pb–Pb collisions, maintaining the efficiency of traditional reconstruction algorithms becomes challenging. Therefore, ALICE has implemented a few ad-hoc algorithmic extensions allowing for significant improvements in the reconstruction quality. The strengths and weaknesses of these extensions are discussed here as well.


Introduction
A Large Ion Collider Experiment (ALICE) [1,2] at CERN is a general-purpose heavy-ion experiment designed to study the physics of strongly interacting matter and the Quark-Gluon Plasma in nucleus-nucleus collisions at the LHC. As a means of varying the energy density, the ALICE Collaboration will study pA collisions and collisions of lower-mass ions. These data, together with the proton-proton data, provide a reference for studying collisions of heavy-ion systems. The pp data allow for a number of genuine pp physics studies as well.
The ALICE detector consists of a central part (see Fig. 1), which measures hadrons, electrons and photons event by event, and of a forward spectrometer dedicated to muon measurements. There are also several smaller detectors for global event characterization and triggering that are located at forward angles. In these proceedings, we will focus on the central part, which covers polar angles from 45° to 135° over the full azimuth, and is embedded in the large L3 solenoidal magnet. In particular, we will discuss the methods and performance of the reconstruction and particle identification (PID) done with the silicon Inner Tracking System (ITS), the Time-Projection Chamber (TPC), and the Time-Of-Flight (TOF) detector. These detectors have been optimized for a charged-particle density of dN_ch/dy = 4000, and their performance has been checked in detailed simulations up to dN_ch/dy = 8000. Under these conditions, maintaining the efficiency of traditional reconstruction algorithms becomes challenging. Therefore, to achieve its physics goals, ALICE had to develop a few ad-hoc algorithmic extensions to the conventional reconstruction methods. We will discuss some of them in this paper; more information can be found in Chapter 5 of the ALICE Physics Performance Report [2].

ALICE event reconstruction
The offline event reconstruction is performed by software that converts the raw data into the Event Summary Data (ESD). The raw data are files containing encoded quantities such as particle hit positions, time stamps and measured ionization, typically stored in units of pad or wire number, analogue-to-digital converter counts, etc. No physics analysis is possible using these data directly. It is the information kept in the ESD files that allows for physics analysis. This information includes fitted particle momenta, coordinates of interaction and decay vertices, and normalized PID probabilities, all represented in physics units like GeV/c and cm. Under average Pb-Pb data taking conditions, it takes about 10^4 times longer to reconstruct an event than to record it. This is compensated by the number of CPUs running the reconstruction software and processing different events in parallel. Nevertheless, developing precise, efficient and yet fast reconstruction programs remains a clear challenge.

General reconstruction strategy
The reconstruction in the tracking detectors begins with charge cluster finding. The coordinates of the crossing points (space points) between tracks and detector sensitive elements (pad rows in the TPC, and silicon sensors in the ITS) are calculated as the centers of gravity of the clusters. The errors on the space point positions are parameterized as functions of the cluster size and of the deposited charge.
In the TPC, these errors are further corrected during the tracking, using the crossing angles between tracks and pad rows. The space points reconstructed at the two innermost Silicon Pixel Detector (SPD) layers of the ITS are then used for the reconstruction of the primary vertex.
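As a rough illustration of the center-of-gravity calculation mentioned above, the following sketch computes a charge-weighted mean position for a single cluster. The function name `cluster_centroid` and the (position, charge) input layout are assumptions made for this example only; they do not reflect the actual ALICE data format or cluster finder.

```python
def cluster_centroid(digits):
    """Charge-weighted mean position of one cluster.

    digits: list of (position, charge) pairs, e.g. pad index and ADC counts.
    (Hypothetical input layout chosen for this sketch.)
    """
    total_charge = sum(charge for _, charge in digits)
    if total_charge <= 0:
        raise ValueError("cluster carries no charge")
    return sum(pos * charge for pos, charge in digits) / total_charge


# Three neighbouring pads with a symmetric charge profile:
# the center of gravity coincides with the middle pad.
print(cluster_centroid([(4, 10), (5, 30), (6, 10)]))
```

In a real detector the resulting pad coordinate would still have to be converted to cm and assigned an error, which the text above parameterizes as a function of cluster size and deposited charge.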
Track reconstruction in the ALICE ITS, TPC and Transition Radiation Detector (TRD) is based on the Kalman filter approach [3]. The initial approximations for the track parameters (the "seeds") for primary particles are constructed using the primary vertex and pairs of space points taken at two outer TPC pad rows separated by a few pad rows. The seeds for the secondary tracks are created without using the primary vertex, since such a constraint would unnecessarily reduce the strange particle decay finding efficiency. The additional space points used for these seeds are then searched for along the straight line segment connecting the pairs of points taken at those two outer TPC pad rows.
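To make the seeding idea for primary tracks concrete, here is a minimal sketch that estimates the transverse momentum from three points in the transverse plane (the vertex and two space points). It assumes a uniform field along the beam axis, a singly charged particle, and the standard relation p_T [GeV/c] ≈ 0.2998 · B [T] · R [m]. The function names and the three-point circle fit are illustrative choices, not the actual ALICE seeding code.

```python
import math


def circumradius(p1, p2, p3):
    """Radius of the circle through three (x, y) points; inf if collinear."""
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Cross product = twice the signed area of the triangle p1, p2, p3.
    cross = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p2[1] - p1[1]) * (p3[0] - p1[0])
    if cross == 0:
        return math.inf  # straight line: infinite radius (very stiff track)
    return a * b * c / (2.0 * abs(cross))


def seed_pt(vertex, point_a, point_b, b_field_tesla=0.5):
    """Seed estimate of p_T in GeV/c; coordinates in metres, unit charge assumed."""
    return 0.299792458 * b_field_tesla * circumradius(vertex, point_a, point_b)


# Vertex at the origin and two points lying on a circle of radius 2 m.
print(seed_pt((0.0, 0.0), (2.0, 2.0), (4.0, 0.0)))
```

Such an estimate only provides a starting point; the Kalman filter then refines the parameters as further space points are attached.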

EPJ Web of Conferences
Once the track seeds are created, they are sorted according to the estimate of their transverse momentum (p_T). Then they are extended from one pad row to another in the TPC and from one layer to another in the ITS towards the primary vertex. Every time a space point is found within a prolongation path defined by the current estimate of the covariance matrix, the track parameters and the covariance matrix are updated using the Kalman filter. For each tracking step, the estimates of the track parameters and the covariance matrix are also corrected for the mean energy loss and Coulomb multiple scattering in the traversed material. The decision on the particle mass to be used for these corrections is based on the dE/dx information given by the TPC, when available. If the information is missing or not conclusive, a pion mass is assumed. Only five particle hypotheses are considered: electrons, muons, pions, kaons and protons.
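The update logic described above can be sketched for a single scalar track parameter. This is a deliberately reduced toy: the real track state has five parameters and a full covariance matrix, and the mean energy-loss correction is not modelled here. The function name, the default χ² limit of 9 and the treatment of multiple scattering as a simple additive variance are assumptions of this example.

```python
def kalman_step(x, var_x, meas, var_meas, var_noise=0.0, chi2_max=9.0):
    """One scalar Kalman step: add process noise, gate on chi2, then update.

    Returns (new_x, new_var, chi2, accepted).
    """
    var_pred = var_x + var_noise            # multiple scattering inflates the prediction error
    residual = meas - x
    var_resid = var_pred + var_meas         # variance of the residual
    chi2 = residual * residual / var_resid
    if chi2 > chi2_max:                     # space point outside the prolongation window
        return x, var_pred, chi2, False
    gain = var_pred / var_resid             # Kalman gain
    return x + gain * residual, (1.0 - gain) * var_pred, chi2, True


# A nearby space point is accepted and pulls the estimate towards it;
# a distant one fails the chi2 gate and leaves the estimate unchanged.
print(kalman_step(0.0, 1.0, 1.0, 1.0))
print(kalman_step(0.0, 1.0, 10.0, 1.0))
```

The gate is what defines the "prolongation path" in this toy: its width grows with the predicted variance, so poorly constrained tracks search a wider window.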
Once all the tracks are followed down to the Distance of Closest Approach (DCA) to the collision vertex, they are then propagated outwards, through the ITS, TPC and TRD. During this tracking phase, the track length and five time-of-flight hypotheses per track (corresponding to the electron, muon, pion, kaon and proton masses) are calculated. This information is later used for the TOF PID procedure. When possible, tracks are matched with the hits reconstructed in the TOF detector and other ALICE detectors residing outside the TRD in the radial direction.
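The five time-of-flight hypotheses can be illustrated with the relation t = L / (βc), where 1/β = sqrt(1 + m²/p²). The sketch below assumes a constant momentum along the whole track length, whereas an actual tracker would accumulate the time step by step while accounting for energy loss. The function name and the example values (1 GeV/c, 370 cm) are illustrative assumptions.

```python
import math

C_CM_PER_NS = 29.9792458  # speed of light in cm/ns

# Particle masses in GeV/c^2 (rounded values).
MASSES = {"e": 0.000511, "mu": 0.10566, "pi": 0.13957, "K": 0.49368, "p": 0.93827}


def expected_times(momentum, length_cm):
    """Expected time of flight in ns for each mass hypothesis.

    momentum in GeV/c, assumed constant along the path (simplification).
    """
    times = {}
    for name, mass in MASSES.items():
        inv_beta = math.sqrt(1.0 + (mass / momentum) ** 2)  # 1/beta = E/p
        times[name] = length_cm / C_CM_PER_NS * inv_beta
    return times


for name, t in expected_times(momentum=1.0, length_cm=370.0).items():
    print(f"{name}: {t:.3f} ns")
```

Comparing the measured arrival time with these five expectations is the basis of the TOF PID discussed later in the text.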
Finally, the track parameters for all found tracks are re-calculated back to the DCA to the primary vertex, applying the Kalman filter to the space points already attached. The primary vertex is fitted once again, now using reconstructed tracks and the information about the average position and spread of the beam-beam interaction region estimated for this run.

Track reconstruction in the ITS
The track prolongation from the TPC to the ITS is difficult, because the distance between the inner wall of the TPC and the outer layer of the ITS is rather large (∼ 0.5 m) and the track density inside the ITS is so high that there are always many ITS clusters found within the prolongation "window" defined by the multiple scattering in the material. The same often happens between the ITS layers as well. All this leads to a non-negligible probability of wrong cluster-track associations if just the criterion of minimal χ² is applied. Therefore, we have implemented the following extensions to the Kalman filter track-finding procedure.
For each event, we do two reconstruction passes over the set of clusters in the ITS: first with a "primary vertex constraint" (see below) and then without this constraint. In both cases, we consider all the hits within the predicted window that have a χ² below a given limit, and not only the one with the minimal χ². For each such hit and for each track from the TPC, we build a "tree" of all possible prolongations in the ITS. After each ITS layer is passed, the branches of this tree are sorted according to the overall χ² and only a limited number of the best branches are propagated further down to the primary vertex. Finally, we choose the most probable track candidate (i.e. the path along the tree), taking into account the quality of the whole path (based on the sum of the χ² values at the layers, the total number of assigned clusters and a few other criteria).
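The tree of prolongations with a limited number of surviving branches is, in essence, a beam search. The toy below works in one dimension, uses the last attached cluster as the prediction for the next layer, and ranks branches by the summed χ² only; the real procedure uses the full Kalman prediction and additional quality criteria. All names and default values are assumptions of this sketch.

```python
def follow_branches(start, layers, sigma=1.0, chi2_max=9.0, max_branches=3):
    """Follow candidate prolongations layer by layer, keeping the best branches.

    layers: list of lists of 1D cluster positions, ordered from the outermost
    layer inwards. Returns (summed chi2, chosen clusters) of the best branch.
    """
    branches = [(0.0, start, [])]  # (summed chi2, current position, chosen clusters)
    for clusters in layers:
        extended = []
        for chi2_sum, pos, path in branches:
            for cluster in clusters:
                chi2 = ((cluster - pos) / sigma) ** 2
                if chi2 < chi2_max:  # inside the prolongation window
                    extended.append((chi2_sum + chi2, cluster, path + [cluster]))
        if extended:  # if no cluster fits, the layer is skipped for all branches
            extended.sort(key=lambda branch: branch[0])
            branches = extended[:max_branches]
    best = branches[0]
    return best[0], best[2]


layers = [[0.5, -1.0], [-1.1, 2.5]]
print(follow_branches(0.0, layers, max_branches=1))  # greedy: minimal chi2 at each layer
print(follow_branches(0.0, layers, max_branches=3))  # tree: best overall path
```

In this example the greedy choice attaches the closest cluster on the first layer and ends up with a worse overall path, while keeping several branches recovers the combination with the smaller total χ².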
Because most tracks are expected to be primary, the first reconstruction pass is done applying an ad-hoc "primary vertex constraint". When going over the clusters within the "window", we take into account not only the positions of the clusters and the track intersection point with the layer, but also the direction towards the primary vertex. Technically, this is done by complementing the hit position {y, z} by the angles {φ, λ} that define the direction to the primary vertex and are calculated using the current value of the track curvature. The elements of the covariance matrix of this extended measurement vector that correspond to the two angles are evaluated considering the material which this track would cross on its remaining way to the primary vertex. The subsequent χ² evaluation for the track parameters thus becomes 4-dimensional (instead of 2-dimensional, as in the case of the traditional Kalman filter tracking). Detailed Monte-Carlo studies performed with the ALICE offline simulation and reconstruction framework AliRoot [1] show that the outlined ad-hoc "vertex constraint" significantly reduces the probability of wrong cluster assignment, and so the quality of the reconstructed tracks improves. Unfortunately, the procedure is not free of flaws. It uses the same information about the primary vertex position several times (even if with different "weights"). Thus, the resulting covariance matrix of the track parameters becomes underestimated. This is overcome by an additional subsequent refitting step that does not use any information about the vertex, but this requires additional computing time.
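For orientation, the 4-dimensional χ² can be written in the standard Kalman-filter form shown below. The notation is an assumption of this note and is not taken from the text: m is the extended measurement vector, x the predicted track state with covariance C, H the projection of the state onto the measured quantities, and V the covariance matrix of the extended measurement.

```latex
\chi^2 = \Delta^{T}\,\bigl(V + H\,C\,H^{T}\bigr)^{-1}\,\Delta ,
\qquad
\Delta = m - H x ,
\qquad
m = (y,\; z,\; \varphi,\; \lambda)^{T}
```

In the traditional case m reduces to (y, z), so the residual Δ has two components instead of four.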
In the future, we would like to incorporate the vertex constraint into the Kalman-filter track finding in a mathematically stricter and computationally faster way. A possible solution can probably be found by introducing the primary vertex information as some form of Bayesian prior.

Performance of the track and vertex reconstruction
The amount of material traversed by particles in the direction perpendicular to the beam is about 11% of a radiation length, including the beam pipe, the ITS and the TPC together with their services and support. In fact, this is the lowest material budget among all the LHC experiments. The track reconstruction efficiency at low momenta is limited by absorption in this material, track bending in the magnetic field and particle decays. It rises from about 45% at 0.15 GeV/c to about 75% around p_T = 1 GeV/c. At higher momenta, the efficiency saturates at about 90%, a value constrained by the dead zones between TPC sectors.
The momentum resolution of the ITS and TPC working together is typically 1% for momenta of 1 GeV/c and deteriorates to about 20% at 100 GeV/c (see Fig. 2, left). This resolution is defined by the strength of the magnetic field (0.5 T) and by the achieved level of the TPC calibration and the alignment for both the TPC and ITS. This level is continuously being improved. The resolution of the track transverse impact parameter (the minimal distance between a track and the primary vertex in the transverse plane) depends on the precision of the track and primary vertex reconstruction. The positional precision of the track reconstruction strongly depends on the quality of the ITS (SPD, in particular) alignment. Good impact parameter resolution is important for the reconstruction of strange particle decays and is crucial for charm detection. As illustrated by Fig. 2 (right), a typical impact parameter resolution is about 50 μm at p_T ∼ 1 GeV/c and about 10 μm asymptotically at high momenta.

An example result of cascade decay (Ξ⁻ baryon) reconstruction is shown in Fig. 3 (left). The particle mass peak is clearly seen over the combinatorial background, allowing for a very good Ξ⁻ signal extraction.

Charged particle identification
The ALICE experiment is able to identify particles with momenta from 0.1 GeV/c up to a few tens of GeV/c (statistically, on the relativistic rise of dE/dx in the TPC). This can be achieved by combining several detecting systems that are efficient in some narrower and complementary momentum subranges. The situation is complicated by the large amount of data to be processed. Thus, the particle identification (PID) procedure should satisfy the following requirements:
1. It should be as automatic as possible.
2. It should be able to combine PID signals of different nature (e.g. dE/dx and time-of-flight measurements).
3. When several detectors contribute to the PID, the procedure must profit from this situation by providing an improved PID.
4. When some of the detectors cannot separate the particle species, the signals from these detectors must not affect the combined PID.
5. It should take into account the fact that, due to different event and track selection, the PID decisions depend on the kind of analysis, and so they cannot fully be made at the reconstruction level.

Two approaches to the particle identification
There are two complementary PID strategies in ALICE. With the first one, the decision on a particle mass is made by requiring a raw PID signal to be not too far from the theoretically expected value (the so-called "nσ cuts" method). The second one is based on the Bayesian approach.
The advantages of the first approach are the following. It is more intuitive, does not require estimating the Bayesian a priori probabilities (priors), and guarantees definite and constant-over-momentum PID efficiencies (in particular, in the case of Gaussian probability density functions). However, this method does not provide any information about the rate of false PID decisions (PID contamination), does not maximize the signal/background ratio, and requires having the raw PID signals "at hand" (which makes the analysis data files larger).
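A minimal sketch of the "nσ cuts" idea follows, assuming a Gaussian detector response with a known expected signal and resolution for the tested mass hypothesis. The function names and the example numbers are illustrative. The last function shows why the efficiency is fixed by the cut alone in the Gaussian case: it equals erf(n/√2) regardless of momentum.

```python
import math


def n_sigma(signal, expected, sigma):
    """Deviation of the measured PID signal from the expectation, in units of sigma."""
    return (signal - expected) / sigma


def passes_cut(signal, expected, sigma, n=3.0):
    """Accept the mass hypothesis if the signal lies within n sigma of the expectation."""
    return abs(n_sigma(signal, expected, sigma)) < n


def gaussian_efficiency(n):
    """Fraction of true particles kept by a symmetric n-sigma cut (Gaussian response)."""
    return math.erf(n / math.sqrt(2.0))


print(passes_cut(signal=52.0, expected=50.0, sigma=1.0))  # 2 sigma away: accepted
print(passes_cut(signal=54.0, expected=50.0, sigma=1.0))  # 4 sigma away: rejected
print(round(gaussian_efficiency(3.0), 4))
```

Note that nothing in this calculation says how many particles of other species also pass the cut, which is the missing contamination information mentioned above.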
The Bayesian approach is free of those disadvantages. In addition, this approach explicitly separates the part of the procedure that can be pre-calculated "once and forever" at the event reconstruction step (the detector response conditional probabilities) from the part that depends on event and track selections (the priors), which can be accomplished at the analysis level only.
The Bayesian PID method described here is similar to that in Ref. [4]. Let r(s|i) be the conditional probability density function to observe a PID signal s in some detector if a particle of type i (i = e, μ, π, K, p, ...) is detected. The probability to be a particle of type i if the signal s is observed, w(i|s), depends not only on r(s|i), but also on how often this type of particle is registered in the considered experiment (the a priori probabilities C_i to find a particle of type i in the detector). The corresponding relation is given by Bayes's formula:

$$ w(i|s) = \frac{r(s|i)\,C_i}{\sum_{k = e, \mu, \pi, \ldots} r(s|k)\,C_k} \qquad (1) $$

If C_i and r(s|i) are not strongly correlated, we can rely on the following approximation:
• The functions r(s|i) reflect only properties of the detector (detector response functions) and do not depend on other external conditions like event and track selections.
• On the contrary, the quantities C_i (relative abundances of particles of type i) do not depend on the detector properties, but do reflect the external conditions, selections etc.
In the case of several detectors, the signal s is replaced by a vector s of the PID measurements in these detectors. The response function r(s|i) becomes a combined response function R(s|i) of the whole system of involved detectors (in the simplest case, this is the product of the single-detector PID response functions). The PID procedure is then done in steps:
• First, the detector response functions are obtained (theoretically, or in beam tests). This can be done before the reconstruction even starts, as a part of detector calibration.
• Second, for each track, a value R(s|i) is calculated using the PID signals measured for this track. This is done during the event reconstruction.
• Third, the relative abundances of particle species C_i are estimated for the subset of events and tracks selected for a specific physics analysis. For obtaining better results, the particle concentrations C_i can be considered as functions of momentum.
• Finally, for each track within the selected subset, the array of probabilities w(i|s) is calculated using Eq. (1). This step, as well as the previous one, can be done only during the physics analysis of the data.
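The steps above can be sketched as follows, assuming the simplest case named in the text, in which the combined response is the product of the single-detector responses. The function names, the dictionary layout and all numerical values are made up for illustration; they are not the ALICE implementation or measured responses.

```python
import math


def combine(responses_per_detector):
    """Combined response R(s|i): product of single-detector responses r(s|i)."""
    species = responses_per_detector[0].keys()
    return {i: math.prod(r[i] for r in responses_per_detector) for i in species}


def bayes_weights(combined, priors):
    """Posterior weights w(i|s) from the combined response and the priors C_i."""
    norm = sum(combined[i] * priors[i] for i in combined)
    if norm <= 0.0:
        raise ValueError("all hypotheses have zero probability")
    return {i: combined[i] * priors[i] / norm for i in combined}


# Illustrative numbers: the TPC separates the species, the TOF response is flat.
tpc = {"pi": 0.6, "K": 0.3, "p": 0.1}
tof_flat = {"pi": 0.2, "K": 0.2, "p": 0.2}
priors = {"pi": 0.8, "K": 0.1, "p": 0.1}

print(bayes_weights(combine([tpc]), priors))
print(bayes_weights(combine([tpc, tof_flat]), priors))
```

Both calls give the same weights: a detector whose response is identical for all species cancels in the normalization, which is how requirement 4 (non-separating detectors must not affect the combined PID) is met automatically.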
Doing the particle identification in this way, we naturally satisfy all the requirements mentioned at the beginning of this section. However, there are two problems that we are still working on.

High-p T PID limits and track mismatching
Since the results of such a PID procedure explicitly depend on the choice of the a priori probabilities C_i (and, in fact, such a dependence is unavoidable in any approach), the question of the stability of the results with respect to the choice of C_i becomes important. At lower momenta, there is always some momentum region where the single-detector response functions for different particle types do not significantly overlap for at least one of the detectors, and so the stability is guaranteed. The more detectors enter the combined PID procedure, the wider the PID momentum range becomes and the more stable the results are. However, as the momentum goes up, all the detectors lose their particle separation power and, more and more, the PID decision is given by the bare priors C_i, which cannot be estimated independently in this case. The question is whether we can somehow quantify the contribution from the priors to the final PID weights, so that when, at certain high momenta, it starts dominating over the contribution from the detector response functions, we do not try to identify particles any more. We note that the question of defining the high-p_T PID limits is not specific to the Bayesian approach. The "nσ cuts" method does not define these limits in any way either.
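The limiting case can be made explicit with Bayes's formula. If, at high momentum, the response no longer depends on the particle type, it cancels between numerator and denominator and the weights reduce to the normalized priors. This is a short worked note using the symbols defined in the text, not a proposed solution to the problem described above.

```latex
r(s|i) = r(s) \quad \text{for all } i
\qquad\Longrightarrow\qquad
w(i|s) = \frac{r(s)\,C_i}{\sum_k r(s)\,C_k} = \frac{C_i}{\sum_k C_k}
```

In this limit the measurement s carries no information, and any "identification" merely reproduces the assumed abundances.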
The second problem, also quite common, is of a different nature. Formula (1) fundamentally assumes that all the components of the vector s are the results of PID measurements done for the same particle. In other words, the procedure of assigning clusters to tracks has to be ideal, which is not the case in reality. For example, in spite of the fact that the separation of the particle species by the time of flight strongly improves with the momentum going down, the actual situation with the PID becomes worse, especially for particles below 0.5 GeV/c. This is because the low-momentum particles decay or suffer from scattering and absorption in material, and so their tracks have a higher probability to pick up a wrong cluster in the TOF detector. This mismatching effect is not taken into account by formula (1), and so the combined PID result becomes biased at low momenta.
The effect of mismatching can be corrected by excluding from the vector s the components which deviate too much from a reasonable expectation. This is possible, for example, in the case of the ALICE TOF detector, because we calculate the expected time of flight during the track finding in the ITS, TPC and TRD. However, in a general case, we may not know what that "reasonable expectation" is. Also, applying sharp cuts in an otherwise smooth procedure (1) may cause additional difficulties with finding the best values for the cuts. Thus, a better solution for the problem of dealing with the track mismatching is needed and is still to be found.
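A minimal sketch of such an exclusion for the TOF component is given below. It assumes that the expected times for the mass hypotheses are available per track and that a single time resolution applies to all of them. The function names, the dictionary layout with a "TOF" key, the 3σ default and the example times are assumptions of this sketch.

```python
def tof_is_compatible(measured_ns, expected_ns, sigma_ns, n_max=3.0):
    """True if the measured time agrees with at least one mass hypothesis."""
    return any(abs(measured_ns - t) < n_max * sigma_ns for t in expected_ns.values())


def usable_signals(signals, expected_ns, sigma_ns, n_max=3.0):
    """Drop the TOF entry from the PID signal vector if it matches no hypothesis."""
    if "TOF" in signals and not tof_is_compatible(signals["TOF"], expected_ns, sigma_ns, n_max):
        return {det: s for det, s in signals.items() if det != "TOF"}
    return dict(signals)


# Illustrative expected times (ns) for three hypotheses and a 0.1 ns resolution.
expected = {"pi": 12.46, "K": 13.76, "p": 16.93}
print(usable_signals({"TPC": 55.0, "TOF": 12.50}, expected, sigma_ns=0.1))  # TOF kept
print(usable_signals({"TPC": 55.0, "TOF": 20.00}, expected, sigma_ns=0.1))  # TOF dropped
```

The hard threshold `n_max` is exactly the kind of sharp cut whose tuning the text identifies as a difficulty.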

PID performance
The dE/dx resolution of the TPC is estimated to be about 5% for tracks with 159 clusters [5], which is better than the design value [2]. When averaged over all reconstructed tracks, this resolution is about 6.5%. This resolution allows for a very clean separation between the bands corresponding to different particle species (see Fig. 4, left).
A correlation between the particle velocity β measured with the TOF detector and the particle momentum is shown in Fig. 4 (right). The PID bands are well separated over a quite large momentum range. This separation is defined by the intrinsic time resolution of the TOF detector and, to an even greater extent, by the uncertainty of the collision time. During data taking runs, the online TOF resolution is ∼ 180 ps, out of which ∼ 140 ps comes from the jitter in the absolute time of the collisions. In pp events with particles within the acceptance of a special T0 detector, this contribution is smaller than 40 ps, and for events with at least 3 tracks reaching the TOF detector, it is reduced by the TOF offline reconstruction to ∼ 80 ps. In Pb-Pb collisions, with many particles registered in both T0 and TOF, the collision-time uncertainty becomes negligible. With the help of the offline TOF calibration, the intrinsic time resolution of this detector can be made better than 90 ps. The track-matching efficiency with TPC tracks (which includes geometry, decays and interaction with material) is on average 60% for protons and pions and reaches 65% above p_T = 1 GeV/c. For kaons it remains slightly lower [6]. Above p_T = 0.5 GeV/c, the TOF PID has an efficiency larger than 60% with a very small contamination.
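As a rough cross-check of the online numbers quoted above, one can assume that the collision-time jitter and the remaining contributions are independent and add in quadrature. This assumption, and the symbol names, are not stated in the text, and the inputs are themselves approximate.

```latex
\sigma_{\mathrm{online}}^2 \approx \sigma_{\mathrm{rest}}^2 + \sigma_{t_0}^2
\qquad\Longrightarrow\qquad
\sigma_{\mathrm{rest}} \approx \sqrt{180^2 - 140^2}\ \mathrm{ps}
= \sqrt{12800}\ \mathrm{ps} \approx 113\ \mathrm{ps}
```

Under this assumption, the part of the online resolution not caused by the collision-time jitter is of the order of 110 ps, before the offline calibration is applied.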
An example of an interesting physics result obtained using the combined TPC and TOF particle identification is presented in Fig. 3 (right): a few anti-⁴He candidates produced in Pb-Pb collisions at 2.76 TeV have unambiguously been identified.

Conclusions
The main track reconstruction algorithm in ALICE is the Kalman filter (optionally used also for vertex fitting). For the reconstruction in the ALICE ITS, this method has been extended by following a whole "tree" of possible track prolongations from the TPC, and applying an ad-hoc "vertex constraint" with a subsequent track refit without this constraint. The reconstruction efficiency is limited only by particle decays and interactions with the detector material (at low p_T) or by the detector acceptance (at high p_T). The typical momentum resolution is about 1% at p_T ∼ 1 GeV/c and about 20% at p_T ∼ 100 GeV/c. The typical track impact parameter resolution is about 50 μm at p_T ∼ 1 GeV/c and about 10 μm asymptotically at high momenta.
There are two complementary PID approaches in ALICE: Bayesian (more precise) and "nσ cuts" (simpler). Both work quite well; however, the high-p_T limits need to be consolidated and the treatment of track mismatching can be improved. The average dE/dx resolution in the TPC is about 6.5% (5% for tracks having the maximal number of assigned clusters). The offline calibration of the TOF detector allows for reaching an intrinsic time resolution better than 90 ps for this detector.

Figure 1. Schematic layout of the ALICE detector.

Figure 2. Left: Transverse momentum resolution for the combined tracking system of the TPC and ITS. Right: Transverse impact parameter resolution for the combined tracking system of the TPC and ITS.

Figure 3. Physics performance in Pb-Pb collisions at 2.76 TeV. Left: Example of an invariant-mass distribution for reconstructed Ξ⁻ baryons. Right: Example of anti-⁴He candidates identified using combined TOF and TPC PID information.


Figure 4. Left: Correlation between the dE/dx in the TPC and particle momentum. The bands corresponding to different particle species are clearly separated. Right: Correlation between particle velocity β measured by the TOF detector and particle momentum. The bands corresponding to different particle species are clearly separated as well.