A cluster-finding algorithm for free-streaming data

• Consider a position-sensitive device, for instance a silicon strip sensor.
• Within one event (collision), it is traversed by a number of particles (tracks). Depending on its inclination, each track activates one or more adjacent strips: a cluster.
• The first step of data reconstruction is to find these clusters in the raw data and to determine the cluster position, which constitutes the measurement of the intersection point of the particle trajectory with the detector.


Introduction
Position-sensitive detectors reconstruct the intersection point of a particle from the charge distribution generated by the particle during its passage through the active detector material. The charge is collected at a readout surface segmented in one dimension (strips) or two dimensions (pixels, pads), the granularity of the segmentation being driven by considerations like resolution in coordinate space and occupancy. In general, several readout segments (channels) are activated by a single particle, allowing the coordinate to be determined by an analysis of the charge measurements in those channels.
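As an illustration of such an analysis, the cluster coordinate can be computed as the charge-weighted mean (centre of gravity) of the strip positions. This is a common choice for strip detectors; the exact position-finding method used in CBM is not specified here, so the following is only a sketch with illustrative names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: reconstruct the hit coordinate of a cluster as the charge-weighted
// mean (centre of gravity) of the contributing strip indices, scaled by the
// strip pitch. Names and signature are illustrative, not the CBM code.
double clusterPosition(const std::vector<double>& charges, double firstStrip,
                       double pitch) {
  double sumQ  = 0.0;  // total cluster charge
  double sumQx = 0.0;  // charge-weighted sum of strip indices
  for (std::size_t i = 0; i < charges.size(); ++i) {
    sumQ  += charges[i];
    sumQx += charges[i] * (firstStrip + static_cast<double>(i));
  }
  // Fall back to the first strip if no charge was recorded.
  return pitch * (sumQ > 0.0 ? sumQx / sumQ : firstStrip);
}
```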
The first step in the reconstruction of the hit coordinate is thus the identification of a cluster of active channels associated with one particle crossing the detector. In a conventionally triggered readout system, this is a rather trivial problem, since the trigger defines a set of measurements corresponding to a single event (or a limited number of events), in which clusters can be searched for. Within a trigger, all measurements are considered simultaneous. The data set in which to search for neighbouring active channels is thus well defined. The task can usually be achieved by a single loop over all detector channels; the complexity of the problem is thus of the order of the number of channels.
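The single-loop procedure for the triggered case can be sketched as follows (the function name and interface are illustrative, not taken from the paper): all measurements of one event are filled into a per-channel activity array, and one pass over the channels groups adjacent active ones into clusters.

```cpp
#include <utility>
#include <vector>

// Sketch of cluster finding in a triggered readout: a single pass over all
// detector channels, grouping runs of adjacent active channels into clusters.
// Complexity is linear in the number of channels.
std::vector<std::pair<int, int>> findClusters1D(const std::vector<bool>& active) {
  std::vector<std::pair<int, int>> clusters;  // (first channel, last channel)
  int start = -1;                             // -1: no cluster currently open
  for (int ch = 0; ch < static_cast<int>(active.size()); ++ch) {
    if (active[ch]) {
      if (start < 0) start = ch;              // open a new cluster
    } else if (start >= 0) {
      clusters.emplace_back(start, ch - 1);   // close the current cluster
      start = -1;
    }
  }
  if (start >= 0)                             // cluster reaching the last channel
    clusters.emplace_back(start, static_cast<int>(active.size()) - 1);
  return clusters;
}
```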
In untriggered, free-streaming readout schemes, the situation becomes more involved. The events are not separated by triggers, but all data are continuously recorded. Thus, each measurement carries a time stamp, and the association of measurements to collision events has to be established by the reconstruction software.

The CBM experiment

CBM [1] is a fixed-target heavy-ion experiment currently under construction, which will take data at the FAIR accelerator facility [2] in Darmstadt, Germany. It comprises a number of detector systems for measuring nuclear collisions in the beam momentum range 3.5A-12A GeV/c. The Silicon Tracking System (STS) of CBM [3] will reconstruct particle trajectories inside a dipole magnetic field. It is made of double-sided silicon strip sensors.
A focus of the CBM experiment is the ability to measure at very high interaction rates (up to 10 MHz) in order to open access to extremely rare probes. Real-time data reduction is a prerequisite for such interaction rates. The trigger topologies, however, are complex and cannot be realised in hardware trigger logic. CBM will therefore not employ any hardware trigger, but will have a free-streaming readout in which all data are pushed to a computer farm, where the trigger signatures are evaluated in software. This requires partial event reconstruction in real-time; the performance of the corresponding reconstruction algorithms is therefore a critical issue.

The input data situation for the cluster-finding problem is illustrated in Fig. 1, showing the frequency distribution of raw data ("digis") from the STS. Events can be seen as spikes in the distribution, but at high interaction rates they may occur in close temporal proximity and are then not easily separable by the time measurements alone. For transport through the data acquisition, the data will be packed into "time-slices": containers typically comprising the data from several thousand collision events. Figure 2 illustrates the algorithmic problem of cluster finding for free-streaming data from the STS. The problem is now two-dimensional, the time measurement being the second coordinate besides the address (channel number). The additional condition for several measurements to belong to one cluster, besides being in adjacent channels, is that their time measurements coincide within a precision set by the detector time resolution.

The algorithm
The simplest approach extends the one-dimensional procedure by a double loop over all measurements, searching for a cluster partner for each of them. However, since the number of measurements in a time-slice is very large, this approach is prohibitive in terms of computational speed, its complexity being quadratic in the number of measurements. The second obvious approach is to discretise the time axis and then perform a cluster-finding procedure on a two-dimensional grid. Because of the size of the input data (the length of the time-slice, corresponding to several thousand events), this would require the sub-division of the time-slice into smaller intervals, which may cause clusters to be artificially split between two sub-intervals.

Taking into account the problems of these obvious approaches, a different cluster-finding procedure was chosen, which does not rely on pre-sorting of the data into grids and a subsequent cluster search, but on a continuous treatment of the data. The basic considerations are:
• Two measurements are defined to belong to a cluster if they are in adjacent channels (strips) and their time difference is smaller than a threshold defined by the time resolution of the detector: Δt_thr = 3√2 σ_t, with σ_t = 5 ns.
• Since the data are time-ordered, a cluster can be considered complete once a new measurement arrives in one of its channels or in one of the neighbouring channels without belonging to the cluster, i.e., without being compatible in time.
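The first condition can be written down directly as a predicate on two measurements. The struct and function names below are illustrative; the threshold is the one defined above, Δt_thr = 3√2 σ_t with σ_t = 5 ns.

```cpp
#include <cmath>
#include <cstdlib>

// A raw measurement ("digi"): readout channel and time stamp.
struct Digi {
  int channel;
  double time;  // ns
};

// Two measurements belong to the same cluster if they are in adjacent
// channels and their time difference is below the threshold
// dt_thr = 3 * sqrt(2) * sigma_t, with sigma_t = 5 ns (about 21.2 ns).
bool belongToSameCluster(const Digi& a, const Digi& b) {
  const double sigmaT = 5.0;                           // ns, time resolution
  const double dtThr  = 3.0 * std::sqrt(2.0) * sigmaT;
  const bool adjacent = std::abs(a.channel - b.channel) == 1;
  return adjacent && std::fabs(a.time - b.time) < dtThr;
}
```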
EPJ Web of Conferences 214, 01008 (2019) https://doi.org/10.1051/epjconf/201921401008 CHEP 2018

The algorithm is based on bookkeeping of the state of each readout channel in the system. The number of channels is about 1.8 million, distributed over about 900 sensors. Each sensor can be treated separately. The bookkeeping keeps track of the last registered measurement in each channel. It is implemented as two std::vector objects, the sizes of which are predefined to equal the number of channels in order to avoid memory allocation after initialisation. The measurements are treated one by one as delivered by the data acquisition. For each new measurement, the states of the respective detector channel and of its two neighbours are inspected. The following cases can be distinguished (see Fig. 3):
(a) The channel and its neighbours are empty (Fig. 3, top). The new measurement is added to the status.
(b) The channel or one of its neighbours is active, but the time difference is larger than the threshold (Fig. 3, centre). A cluster object is created from the active channel, and the corresponding channels are cleared. The new measurement is added to the status.
(c) The time difference is smaller than the threshold (Fig. 3, bottom). The new measurement is added to the status.
Cluster creation from an active channel means that the left and right ends of the cluster are determined by iteratively checking the status of the respective neighbours until an empty channel is found. After the last measurement in the time-slice has been processed, the status buffer still contains data; from all these remaining data, clusters are created. After this final step, the status buffer is empty.
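The bookkeeping scheme described above can be sketched for a single sensor as follows. This is a simplified illustration, not the actual CBM implementation: all class and method names are invented, the per-channel status is reduced to an activity flag plus the last measurement time, and clusters are represented as lists of channel numbers.

```cpp
#include <cmath>
#include <vector>

// A raw measurement ("digi"): readout channel and time stamp.
struct Digi {
  int channel;
  double time;  // ns
};

// Sketch of the bookkeeping cluster finder for one sensor. The status buffer
// stores the last registered measurement per channel; a new measurement that
// is incompatible in time with an active channel (itself or a neighbour)
// triggers cluster creation from that channel.
class ClusterFinder {
 public:
  explicit ClusterFinder(int nChannels)
      : active_(nChannels, false), lastTime_(nChannels, 0.0) {}

  // Process one measurement; digis must arrive time-ordered.
  void processDigi(const Digi& d) {
    for (int ch = d.channel - 1; ch <= d.channel + 1; ++ch) {
      if (ch < 0 || ch >= static_cast<int>(active_.size())) continue;
      if (active_[ch] && std::fabs(d.time - lastTime_[ch]) > kDtThr)
        createCluster(ch);  // incompatible in time: close the old cluster
    }
    active_[d.channel]   = true;  // store the new measurement in the status
    lastTime_[d.channel] = d.time;
  }

  // After the last measurement of the time-slice, flush the status buffer.
  void finish() {
    for (int ch = 0; ch < static_cast<int>(active_.size()); ++ch)
      if (active_[ch]) createCluster(ch);
  }

  const std::vector<std::vector<int>>& clusters() const { return clusters_; }

 private:
  // Walk left and right from an active channel until an empty channel is
  // found; collect the run as a cluster and clear the contributing channels.
  void createCluster(int ch) {
    int lo = ch;
    while (lo > 0 && active_[lo - 1]) --lo;
    int hi = ch;
    while (hi + 1 < static_cast<int>(active_.size()) && active_[hi + 1]) ++hi;
    std::vector<int> cluster;
    for (int c = lo; c <= hi; ++c) {
      cluster.push_back(c);
      active_[c] = false;
    }
    clusters_.push_back(cluster);
  }

  // dt_thr = 3 * sqrt(2) * sigma_t, sigma_t = 5 ns (about 21.2 ns).
  static constexpr double kDtThr = 3.0 * 1.4142135623730951 * 5.0;

  std::vector<bool> active_;
  std::vector<double> lastTime_;
  std::vector<std::vector<int>> clusters_;
};
```

Note that, as in the text, no loop over detector channels is performed per measurement; only the channel itself and its two neighbours are inspected, so the cost per digi is constant.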

Features and performance
The working principle of the algorithm assumes that the data arrive time-ordered, which will be guaranteed by the data acquisition software during time-slice building. In contrast, the clusters are not delivered time-ordered, since the time at which a cluster is created depends on the (random) occurrence of a next measurement in a channel contributing to the cluster or in a neighbouring one. If later reconstruction steps (e.g., track finding) require time-sorted input, a sorting step will have to be introduced in between.

The performance of the algorithm was assessed by applying it to simulated data from a full detector-response simulation of typical events (minimum-bias Au + Au collisions at p = 10A GeV/c) within the CBM software environment [4] using the FairRoot framework [5]. On average, one event contains about 5,500 measurements and 1,700 clusters. Since no loops over detector channels are involved, the execution time of the algorithm does not depend on the number of channels. From the construction of the algorithm (processing one measurement after the other), it can be expected that the execution time scales with the number of