An Object Condensation Pipeline for Charged Particle Tracking at the High Luminosity LHC

Recent work has demonstrated that graph neural networks (GNNs) trained for charged particle tracking can match the performance of traditional algorithms while improving scalability in preparation for the High-Luminosity LHC (HL-LHC). Most approaches are based on the edge classification (EC) paradigm, wherein tracker hits are connected by edges, and a GNN is trained to prune edges, resulting in a collection of connected components representing tracks. These connected components are usually collected by a clustering algorithm, and the resulting hit clusters are passed to downstream modules that may assess track quality or fit track parameters. In this work, we consider an alternative approach based on object condensation (OC), a multi-objective learning framework designed to cluster points belonging to an arbitrary number of objects, in this context tracks, and to regress the properties of each object. We demonstrate that OC shows very promising results when applied to the pixel detector of the TrackML dataset and can, in some cases, recover tracks that are not reconstructable when relying on the output of an EC alone. The results have been obtained with a modular and extensible open-source implementation that allows us to efficiently train and evaluate the performance of various OC architectures and related approaches.


Introduction
The exploration of tracking algorithms based on graph neural networks (GNNs) is motivated by the poor computational scaling of combinatorial Kalman filter (CKF) algorithms with pileup [1]. In recent years, many approaches have been developed around the edge classification paradigm (see e.g. Ref. [2]), in which GNNs are designed to predict whether or not edges drawn between tracker hits represent physical trajectories. These architectures have been shown to deliver excellent physics performance and, importantly, scalability with respect to pileup [3]. In the majority of these approaches, tracks are rendered from edge-weighted graphs directly, either by a graph walk algorithm, spatial clustering, or simply collecting connected components.
This work instead explores a learned track rendering stage based on object condensation (OC), a multi-loss training scheme we use to cluster hits belonging to the same track in a learned clustering space. Employing only a very lightweight edge-classifying network (without any message passing), we show that the OC approach is able to deliver excellent performance when applied to simulated high-pileup TrackML pixel detector events. We also show that the algorithm is able to reconstruct tracks with missing edges, which is not possible when relying on the output of an edge classifier alone. Therefore, this algorithm could also be used at the end of an EC pipeline, leveraging both node and edge embeddings to resolve ambiguities that may exist at the output of the edge classifier.

EPJ Web of Conferences 295, 09004 (2024), https://doi.org/10.1051/epjconf/202429509004 (CHEP 2023)

Dataset and input features
This study is performed using the TrackML dataset [4, 5], which simulates the worst-case HL-LHC pileup conditions (⟨µ⟩ = 200) in a generic tracking detector geometry. Our studies are limited to the innermost pixel detector layers, comprising 4 barrel layers and 7 layers in each endcap. In each pixel tracker event, we embed hits as graph nodes with 14 features, including:
• The cylindrical coordinates r, ϕ, z of the hits, where the z axis corresponds to the beam line of the colliding proton-proton pairs; the corresponding pseudorapidity η and the conformal tracking coordinates u and v [6] are also included.
• The hit's charge fraction (sum of charge in the cluster divided by the number of activated channels), as well as variables describing the shape and orientation of the cluster introduced in Ref. [7]. For the latter, we closely follow the implementation in Ref. [2].

Tracking metrics
We study several tracking definitions to evaluate the performance of our pipeline. They are defined with respect to target populations of tracks; for example, we commonly report the tracking efficiency on reconstructable particles, i.e., particles that produce at least three hits and have |η| < 4.0. The efficiencies are defined with respect to various matching criteria between reconstructed tracks and truth tracks. For a given reconstructed track c, we define the majority particle π_c as the particle with the largest number of hits within c (choosing at random if this condition applies to multiple particles). We write #c for the number of hits in c and #π_c for the total number of hits of π_c anywhere in the event. We define the majority fraction f_c^in as the number of hits of π_c within c divided by #c, and the majority outside fraction f_c^out as the number of hits of π_c outside of c divided by #π_c.
• Perfect match efficiency (ϵ_perfect): the number of reconstructed tracks with #c > 3, π_c reconstructable, f_c^in = 1, and f_c^out = 0, normalized to the number of reconstructable particles.
• LHC-style match efficiency (ϵ_LHC): the number of reconstructed tracks with #c > 3, π_c reconstructable, and f_c^in > 75%, normalized to the number of clusters with reconstructable π_c. Note that duplicates, wherein multiple reconstructed tracks match one particle, are possible with this definition.
• Double majority match efficiency (ϵ_DM): the number of reconstructed tracks with #c > 3, π_c reconstructable, f_c^in > 50%, and f_c^out < 50%, normalized to the number of reconstructable particles. This definition produces unique cluster-track assignments.
We also define the fake rate based on the double majority metric as the number of reconstructed tracks with #c > 3 and π_c reconstructable that do not satisfy the double majority criterion, normalized to the number of clusters with reconstructable π_c.
As we are mostly interested in high-p_T tracks, we also consider these metrics with an additional p_T > 0.9 GeV threshold applied to particles and majority particles in their definitions. The corresponding metrics are denoted ϵ_DM^(pT>0.9), ϵ_perfect^(pT>0.9), and ϵ_LHC^(pT>0.9).
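The double majority matching above can be made concrete with a short sketch. The function below is a hypothetical helper (names are ours, not the paper's code) that applies the f_c^in > 50% and f_c^out < 50% criteria, assuming hits carry a reconstructed-track label and a truth particle label:

```python
import numpy as np

def double_majority_match(cluster_labels, particle_labels, min_hits=4):
    """Sketch of the double-majority criterion (hypothetical helper).

    cluster_labels[i]  : reconstructed track id of hit i (-1 = unclustered)
    particle_labels[i] : truth particle id of hit i (0 = noise)
    Returns the set of matched (cluster, particle) pairs.
    """
    matches = set()
    for c in np.unique(cluster_labels):
        if c < 0:
            continue
        in_c = particle_labels[cluster_labels == c]
        if len(in_c) < min_hits:            # #c > 3 requirement
            continue
        pids, counts = np.unique(in_c[in_c > 0], return_counts=True)
        if len(pids) == 0:                  # cluster made of noise only
            continue
        pi = pids[np.argmax(counts)]        # majority particle pi_c
        inside = counts.max()
        f_in = inside / len(in_c)           # majority fraction
        total = (particle_labels == pi).sum()
        f_out = (total - inside) / total    # majority outside fraction
        if f_in > 0.5 and f_out < 0.5:
            matches.add((int(c), int(pi)))
    return matches
```

Efficiency then follows by dividing the number of matches by the number of reconstructable particles.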

Graph construction
The initial graph is constructed by connecting hits on different detector layers that satisfy a series of geometric constraints and pass a classifier threshold.

Edges based on geometric constraints
This procedure is nearly identical to the geometric graph construction procedure described in Ref. [8], but without applying any cuts based on truth information. Edges between nodes i and j are selected based on the following geometric variables: the longitudinal intercept z_0 = z_i − r_i ∆z/∆r, the slope ϕ_slope = ∆ϕ/∆r, and the distance ∆R = sqrt(∆η² + ∆ϕ²). We require candidate edges to satisfy |z_0| < 197.4 mm, ϕ_slope < 0.001825/mm, and ∆R < 1.797. The cutoff points were optimized to maximize the fraction of reconstructable track edges appearing in the graph, while simultaneously minimizing the number of unphysical edges constructed. In contrast to Ref. [8], no barrel intersection cut is applied.
Note that the performance of the graph construction can be translated to an approximate upper bound on the performance of the downstream pipeline. Assuming that the pipeline can build tracks exactly out of those hits that are connected, that is, assuming perfect edge classification followed by identifying tracks as connected components of the resulting edge subgraph, we obtain ϵ_DM^(pT>0.9) ≤ 97.4% and ϵ_perfect^(pT>0.9) ≤ 84.0%.
We provide four initial edge features based on the coordinates of the two hits involved: ∆r, ∆ϕ, ∆z, and ∆R. The resulting graphs are denoted G = (X, R_a, I), where X = (x_i)_{i=1,...,N} ∈ R^{N×14} are the node features, I ∈ N^{2×N_edges} is the list of edges in coordinate format, and R_a = (e_ij)_{(i,j)∈I} with e_ij ∈ R^4 are the edge features. We also define truth labels l_i ∈ {0, 1, ..., N_t} (where N_t is the number of particles in the graph) indicating that hit i is noise (l_i = 0) or belongs to track t (l_i = t, 1 ≤ t ≤ N_t); in this work, we do not consider shared hits between tracks. The truth label y_ij indicates whether an edge connects two non-noise hits of the same particle (l_i = l_j > 0, (i, j) ∈ I). The geometric cuts alone achieve a purity of N_true^built / N_total^built = 4.5% at 2.8 × 10^6 edges per graph.
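A minimal sketch of the per-edge geometric selection follows; the variable definitions follow the common construction of Ref. [8], and the function name and hit tuple layout are our own illustration:

```python
import numpy as np

def passes_geometric_cuts(hit_i, hit_j,
                          z0_max=197.4, phi_slope_max=0.001825, dR_max=1.797):
    """hit = (r [mm], phi [rad], z [mm], eta).
    Returns True if the candidate edge i -> j survives the cuts."""
    r_i, phi_i, z_i, eta_i = hit_i
    r_j, phi_j, z_j, eta_j = hit_j
    dr = r_j - r_i
    if dr == 0:
        return False
    dphi = (phi_j - phi_i + np.pi) % (2 * np.pi) - np.pi  # wrap to (-pi, pi]
    dz = z_j - z_i
    z0 = z_i - r_i * dz / dr          # longitudinal intercept at r = 0
    phi_slope = dphi / dr
    dR = np.hypot(eta_j - eta_i, dphi)
    return (abs(z0) < z0_max
            and abs(phi_slope) < phi_slope_max
            and dR < dR_max)
```

In practice this selection is vectorized over all candidate layer pairs rather than applied per edge.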

Edge filtering
We then apply a lightweight edge classifier to reduce the number of false edges. For this, we train a fully connected neural network (FCNN) that takes node and edge features as inputs. The node and edge features described in section 2 and subsection 4.1 are concatenated, z_ij^(0) = [x_i, x_j, e_ij], (i, j) ∈ I, and embedded into a 256-dimensional space by a fully connected layer, z_ij^(1) = W^(1) z_ij^(0), with learnable weights W^(1) ∈ R^{256×(14+14+4)}. We then apply a fully connected network of five hidden layers of width 256 with ReLU activations and residual connections that interpolate between each layer's input and output with a weight of β = 0.4, yielding z_ij^(ℓ+1) for ℓ = 1, ..., 5, (i, j) ∈ I.
To obtain an edge weight, we apply the logistic activation function σ: w_ij = σ(W^(7) ReLU(z_ij^(6))) ∈ (0, 1), where W^(7) ∈ R^{1×256}. This output is trained with a binary cross entropy loss to classify whether an edge connects two hits of the same particle. As we are more interested in tracks with a high value of p_T, we exclude true edges connecting hits of particles with p_T < 0.9 GeV from the loss, i.e.,

ℓ_EF(y, w) = −(1/N_edges) Σ_{(i,j)∈I} [ δ_ij^(pT>0.9) y_ij log w_ij + (1 − y_ij) log(1 − w_ij) ],   (1)

where δ_ij^(pT>0.9) is 1 if p_T^{l_i} > 0.9 GeV and 0 otherwise, and p_T^{l_i} denotes the p_T of the particle belonging to hit i. The classifier achieves a ROC AUC of 93.3% when evaluated on all tracks and 99.8% when evaluated on tracks of interest. Here, tracks of interest refers to tracks with p_T > 0.9 GeV that satisfy the additional constraints described in section 3. To find an appropriate threshold, we calculate the upper bounds on ϵ_DM^(pT>0.9) for the subgraphs obtained by removing all edges with w_ij < w_thld. Based on Figure 1a, we set w_thld = 0.03, resulting in TPR = 48.3%, TPR (tracks of interest) = 98.5%, and FPR = 1.1%. The approximate upper bounds for this threshold are ϵ_DM^(pT>0.9) ≤ 97.7% and ϵ_perfect^(pT>0.9) ≤ 92.1%. The purity of the graphs is 68% at 89 × 10^3 edges per graph.
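The p_T-masked binary cross entropy can be sketched as a standalone function (a hypothetical illustration; the actual implementation lives in the open-source package [14]):

```python
import numpy as np

def edge_filter_loss(w, y, pt_edge, pt_min=0.9, eps=1e-12):
    """Sketch of the pT-masked BCE.

    w: predicted edge weights in (0, 1); y: edge truth labels {0, 1};
    pt_edge: pT of the particle a true edge belongs to (ignored for fakes).
    True edges of particles below pt_min do not contribute to the loss."""
    w = np.clip(w, eps, 1 - eps)
    delta = (pt_edge > pt_min).astype(float)  # delta^(pT>0.9)
    pos = delta * y * np.log(w)               # attract only high-pT true edges
    neg = (1 - y) * np.log(1 - w)             # penalize all fake edges
    return -np.mean(pos + neg)
```

Note that a true edge of a low-p_T particle contributes neither term, so the classifier is free to assign it a low weight, which explains the low overall TPR at the chosen threshold.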
We can also establish a "lower bound" on the performance of the pipeline by reconstructing tracks directly from this stage. For this, we identify tracks with connected components of the aforementioned subgraphs (though with a stricter value of w_thld) and calculate the efficiencies. A scan over w_thld is shown in Figure 1b; it shows a maximum of ϵ_DM^(pT>0.9) at w_thld = 0.31 with ϵ_DM^(pT>0.9) = 78.9%, ϵ_LHC^(pT>0.9) = 77.1%, and ϵ_perfect^(pT>0.9) = 41.1%. However, it should be noted that more elaborate probabilistic schemes to build tracks based on edge scores might slightly surpass these numbers.

Object condensation
Our architecture extends traditional edge classification pipelines with an additional step based on object condensation (OC) [9], a set of truth definitions and loss functions designed to cluster hits belonging to the same object and regress the properties of the reconstructed objects. OC has been extensively validated in applications to calorimetry [9-11], but its application to tracking has to date been relatively unexplored.

Loss functions
For each hit, the OC network predicts a condensation strength β_i ∈ R and clustering coordinates c_i ∈ R^{d_c}. During training, the highest-β_i hit in each track is dubbed the track's condensation point; the goal of OC is to cluster hits around their track's condensation point in the learned clustering coordinate space. The condensation strength of a track t is that of its condensation point, i.e., β^(t) = max_{i | l_i = t} β_i. The condensation strength predicted for each hit is used to calculate an unphysical "charge" defined by q_i = arctanh²(β_i) + q_min (here, q_min is treated as a hyperparameter). The charge corresponding to a track's condensation point is denoted q^(t), located at position c^(t). During training, the condensation points for each track are used to define attractive and repulsive losses designed to produce well-separated clusters of hits belonging to the same track in the c_i coordinates:

L_V = (1/N) Σ_{i=1}^{N} q_i Σ_{t=1}^{N_t} [ δ_(l_i = t) V_att(i, t) + s_rep (1 − δ_(l_i = t)) V_rep(i, t) ].

Here, δ_(l_i = t) is 1 when the node's track label is t and 0 otherwise, and s_rep is a hyperparameter. The potential functions are a quadratic attractive loss and a repulsive hinge loss:

V_att(i, t) = δ_i^(pT>0.9) q^(t) ||c_i − c^(t)||²,    V_rep(i, t) = q^(t) max(0, 1 − ||c_i − c^(t)||),

where δ_i^(pT>0.9) (defined as in Equation 1) excludes hits from noise or low-p_T particles from the attraction. An additional loss term L_β is designed to encourage a unique condensation point for each track and suppress the condensation strengths of noise hits:

L_β = (1/N_t) Σ_{t=1}^{N_t} (1 − β^(t)) + s_B (1/N_noise) Σ_{i | l_i = 0} β_i.

Here, s_B is a hyperparameter that controls the strength of noise suppression. All loss terms are finally combined as L = L_V + s_β L_β. For the results in this paper, we choose s_rep = 0.6, s_β = 0.004, q_min = 0.34, and s_B = 0.09. To reduce the memory footprint of the loss functions, the graphs are split into 32 sectors during training.
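The condensation losses can be sketched in a few lines of numpy (following the OC formulation of Ref. [9]; the function and its signature are our own illustration, not the package's API):

```python
import numpy as np

def condensation_losses(beta, coords, labels, pt,
                        q_min=0.34, s_rep=0.6, s_b=0.09, pt_min=0.9):
    """beta: (N,) condensation strengths in (0, 1); coords: (N, d) clustering
    coordinates; labels: (N,) track labels with 0 = noise; pt: (N,) particle pT.
    Returns (L_V, L_beta)."""
    q = np.arctanh(beta) ** 2 + q_min          # unphysical "charge"
    tracks = [t for t in np.unique(labels) if t > 0]
    l_v, l_beta = 0.0, 0.0
    for t in tracks:
        mask = labels == t
        alpha = np.flatnonzero(mask)[np.argmax(beta[mask])]   # condensation point
        dist = np.linalg.norm(coords - coords[alpha], axis=1)
        attract = q[alpha] * dist ** 2                        # quadratic attraction
        repulse = q[alpha] * np.maximum(0.0, 1.0 - dist)      # hinge repulsion
        keep = mask & (pt > pt_min)            # delta^(pT>0.9) on the attraction
        l_v += np.sum(q * (keep * attract + s_rep * (~mask) * repulse))
        l_beta += 1.0 - beta[alpha]            # encourage one condensation point
    n_noise = int(np.sum(labels == 0))
    l_v /= len(beta)
    l_beta = (l_beta / max(len(tracks), 1)
              + s_b * np.sum(beta[labels == 0]) / max(n_noise, 1))
    return l_v, l_beta
```

In practice the same computation is written with gathers/scatters so it stays differentiable and avoids the Python loop over tracks.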

Model
The GNN that does the heavy lifting of this tracking pipeline is built from interaction network layers [12] with residual connections in the node updates. Node and edge features are first encoded as x_i^(1) = W_node^enc x_i and e_ij^(1) = W_edge^enc e_ij, (i, j) ∈ I. Each interaction network layer then updates the edge features as e_ij^(k+1) = Φ^(k)(x_i^(k), x_j^(k), e_ij^(k)) and the node features with a residual connection, x_i^(k+1) = β x_i^(k) + (1 − β) Ψ^(k)(x_i^(k), Σ_j e_ij^(k+1)). Here, Φ and Ψ are FCNNs with ReLU activations, one hidden layer, and a layer width of 192; β has been chosen to be 0.2. Finally, the outputs are decoded as c_i = W_c^dec ReLU(x_i^(6)) (clustering coordinates) and β_i = σ(W_β^dec ReLU(x_i^(6))) (condensation strengths), where σ is the logistic function. The total number of parameters of this model is 1.9 × 10^6.
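A single interaction-network layer with the residual node update can be sketched schematically (Φ and Ψ are reduced to one linear layer plus ReLU for brevity, and the concatenation order is illustrative; see the package [14] for the actual architecture):

```python
import numpy as np

def interaction_layer(x, e, edge_index, W_phi, W_psi, beta=0.2):
    """x: (N, d) node features; e: (E, d_e) edge features; edge_index: (2, E).
    Returns updated (node, edge) features."""
    src, dst = edge_index
    # edge update: Phi([x_i, x_j, e_ij])
    e_new = np.maximum(0.0, np.concatenate([x[src], x[dst], e], axis=1) @ W_phi)
    # aggregate incoming messages per receiving node
    agg = np.zeros((x.shape[0], e_new.shape[1]))
    np.add.at(agg, dst, e_new)
    # node update with residual interpolation (beta = 0.2)
    x_new = beta * x + (1 - beta) * np.maximum(
        0.0, np.concatenate([x, agg], axis=1) @ W_psi)
    return x_new, e_new
```

Stacking five such layers on top of the encoders and reading the decoders off the final node embedding x^(6) reproduces the structure described above.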

Postprocessing and results
Hit clusters produced in the OC clustering space must be rendered by a downstream algorithm. For this, we use Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [13], an iterative clustering algorithm with two parameters: ϵ, defining the size of the neighborhood of a point that is considered when merging clusters, and k, the minimum number of points within a neighborhood for a point to be considered a core point. For this application, k = 1 is optimal; maximizing ϵ_DM^(pT>0.9) over ϵ yields ϵ = 0.279 (see Figure 2). With this, we obtain ϵ_DM^(pT>0.9) = 95%, ϵ_LHC^(pT>0.9) = 97%, ϵ_perfect^(pT>0.9) = 80%, and a fake rate f^(pT>0.9) = 1.7%. All metrics are presented vs p_T and vs η in Figure 4.
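With k = 1, every point is a core point, so DBSCAN reduces to finding the connected components of the ϵ-neighborhood graph. A minimal self-contained sketch of this special case (in practice one would simply call scikit-learn's DBSCAN):

```python
import numpy as np

def dbscan_k1(coords, eps):
    """DBSCAN with min_samples = 1: connected components of the
    eps-neighborhood graph. coords: (N, d) points in the clustering space.
    Returns an (N,) array of cluster labels."""
    n = len(coords)
    labels = -np.ones(n, dtype=int)
    current = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        labels[i] = current
        stack = [i]
        while stack:                       # flood fill over eps-neighbors
            j = stack.pop()
            d = np.linalg.norm(coords - coords[j], axis=1)
            for nb in np.flatnonzero((d <= eps) & (labels < 0)):
                labels[nb] = current
                stack.append(nb)
        current += 1
    return labels
```

This quadratic-time version is only for illustration; production implementations use spatial indexing to find neighbors.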
In a side study, we have also tested the ability of the OC network to reconstruct tracks with missing edges after graph construction or edge filtering. For this, all edges between the barrel and the right endcap were removed after graph construction, limiting the upper bound on ϵ_perfect^(pT>0.9) for a perfect EC to almost zero for tracks with 2 < η < 3. However, as OC uses edges only as a means to exchange information, it is not subject to this upper bound. Indeed, the OC pipeline achieves ϵ_perfect^(pT>0.9) = 60% in this region. This is shown in Figure 3.

Summary
This paper presents the first GNN-based charged particle tracking pipeline that uses the OC approach to reconstruct tracks in the worst-case pileup conditions expected at the HL-LHC.
Our pipeline shows excellent performance with respect to several metrics when applied to the pixel detector of the TrackML dataset. We also demonstrate that the OC approach can join partial tracks that are not connected by any of the edges used for message passing, allowing it to outperform algorithms that rely solely on the output of an EC in certain scenarios. This suggests that the use of OC at the output stage of EC-based pipelines may lead to a boost in performance. Future applications of OC may also allow for the regression of track parameters, for example the transverse momentum, as part of an architecture capable of rendering tracks and preliminary fits in one shot. The incorporation of track physics may well lead to a more robust model.
All results were produced with the open-source project [14] that implements various OC tracking architectures in a modular and extensible Python package.

Figure 1 :
Figure 1: Performance of the edge classifier applied during edge construction. The left plot shows the true positive and false positive rates together with approximate upper bounds on the recoverable performance assuming a perfect EC. The right plot shows the achievable tracking performance when applying a cut on the edge classifier and identifying tracks with connected components of the resulting subgraph. Note that ϵ_DM^(pT>0.9) is not strictly monotonic because it is normalized to the number of reconstructed tracks rather than particles.

Figure 2 :
Figure 2: Optimizing the DBSCAN hyperparameters to obtain the maximum performance. The dashed lines are the upper and lower bounds after the EF step for reference (see Figure 1).

Figure 3 :
Figure 3: An OC pipeline outperforms a perfect EC when edges between the barrel and the right endcap have been removed.

Figure 4 :
Figure 4: Tracking performance in bins of p T and η.

Figure 5 :
Figure 5: Comparing the performance of the OC pipeline with the upper and lower bounds introduced in this paper.