Online Data Reduction for the Belle II Experiment using DATCON

The new Belle II experiment at the asymmetric $e^+ e^-$ accelerator SuperKEKB at KEK in Japan is designed to deliver a peak luminosity of $8\times10^{35}\,\text{cm}^{-2}\text{s}^{-1}$. To perform high-precision track reconstruction, e.g. for measurements of time-dependent CP-violating decays and secondary vertices, the Belle II detector is equipped with a highly segmented pixel detector (PXD). The high instantaneous luminosity and short bunch crossing times result in a large stream of data in the PXD, which needs to be significantly reduced for offline storage. The data reduction is performed using an FPGA-based Data Acquisition Tracking and Concentrator Online Node (DATCON), which uses information from the Belle II silicon strip vertex detector (SVD) surrounding the PXD to carry out online track reconstruction, extrapolation to the PXD, and Region of Interest (ROI) determination on the PXD. The data stream is reduced by a factor of ten with an ROI finding efficiency of > 90% for PXD hits inside the ROI down to 50 MeV in $p_\text{T}$ of the stable particles. We will present the current status of the implementation of the track reconstruction using Hough transformations, and the results obtained for simulated $\Upsilon(4S) \rightarrow B\bar{B}$ events.


Motivation for the Belle II Experiment
Since the Belle experiment stopped taking data in 2010, its successor Belle II has been under development. The Belle II experiment will be located at the KEK laboratory in Japan, at the only interaction point of the SuperKEKB collider, which is the successor of the KEKB machine. SuperKEKB will collide asymmetric energy beams (7 GeV electrons and 4 GeV positrons) with a design instantaneous luminosity of $8\times10^{35}\,\text{cm}^{-2}\text{s}^{-1}$, which is 40 times the instantaneous luminosity of KEKB. This produces higher data rates and thus higher background rates with which the new Belle II detector has to cope. Due to the limited bandwidth of the readout electronics, and to minimise the amount of mass storage required, online data reduction is essential. This article describes the working principle of one of the two deployed online data reduction systems, the Data Acquisition Tracking and Concentrator Online Node (DATCON).

The Belle II Vertex Detector
A sketch of the Belle II detector is shown in Figure 1. The innermost part of the detector is the silicon-based vertex detector system (VXD), which surrounds the beryllium beam pipe. The other sub-detectors are: the central drift chamber, the solenoid magnet producing a magnetic field of 1.5 T, detectors for particle identification, the electromagnetic calorimeter, and an instrumented flux return that can detect $K_L$ and muons. The VXD consists of two components: a PiXel Detector (PXD) based on DEPFET [1] technology and a Silicon Vertex Detector (SVD) based on double-sided silicon strip sensors. The placement of both components is shown in Figure 2. Both PXD and SVD are very thin detectors with only 0.2% and 0.6% of a radiation length per layer, respectively. The PXD consists of 2 layers at radii of 14 and 22 mm from the interaction point, with 40 modules in total. A module contains 250 × 768 pixels of two types: for layer 1, pixels with a pitch of 50 × 55 µm² (256 pixels in the central region) and 50 × 70 µm² (512 pixels in the forward and backward regions); for layer 2, pixels with a pitch of 50 × 65 µm² (256 pixels in the central region) and 50 × 70 µm² (512 pixels in the forward and backward regions). The information in the 8 million pixels is read out at a data rate of 256 Gb/s, corresponding to 90% of the data rate of the complete Belle II detector without data reduction. The SVD consists of 4 layers at radii ranging between 39 and 135 mm. In total the SVD contains 172 sensors, with 768 × 768 strips having pitches of 50 µm and 160 µm (layer 3) and 768 × 512 strips having pitches of 75 µm and 240 µm (layers 4 to 6), respectively. This results in about 240,000 strips in total [2].
e-mail: wessel@physik.uni-bonn.de
In a $B\bar{B}$ event coming from the decay of the Υ(4S) resonance there are on average 10 tracks in the acceptance region of the VXD, which covers the full angle of 2π in the azimuthal direction (ϕ) and the region 17° < θ < 150° in polar angle. In addition to hits from particle tracks originating from $B\bar{B}$ events, background processes produce additional hits that are recorded by the VXD. Such background hits come, for instance, from two-photon QED processes, Touschek scattering, Coulomb scattering of beam particles with residual gas in the beam pipe, radiative Bhabha scattering, and synchrotron radiation. Based on detailed simulations of the backgrounds, beam-induced backgrounds increase the occupancy to up to 1% in the PXD and up to 0.25% in the SVD. Only a small fraction of the hits in the VXD belong to tracks originating from the decay of $B\bar{B}$ pairs. The online data reduction of the Belle II experiment is designed in such a way that only the PXD hits of interest for physics analysis are forwarded to permanent storage. To implement this, the hits of other sub-detectors such as the SVD are used to perform online track finding. The reconstructed tracks are then used to define Regions of Interest (ROI) on the PXD, and only the subset of pixels inside an ROI is permanently stored. For the maximum tolerable occupancy of 3%, a data reduction factor of about 10 for PXD hit information is required.
Figure 3. A simplified illustration of the Belle II data acquisition system with the two data reduction systems: DATCON receives hits recorded by the SVD and performs an online track reconstruction to define ROIs on the PXD. The HLT receives hit information from the SVD, the drift chamber, and the muon system as well as information from the PID detectors to define ROIs on the PXD. Both HLT and DATCON run independently of each other. The ROIs of both systems are sent to the ONSEN system, which performs the overall PXD data reduction by applying the ROIs to the PXD data.
Figure 3 shows a simplified sketch of the Belle II Data Acquisition (DAQ) system. The data of the SVD, the drift chamber and the sub-detectors for particle identification (PID), calorimetry and muon detection are sent to the High Level Trigger (HLT) [3]. The data from the SVD are additionally sent to DATCON, which performs online track reconstruction, extrapolation to the PXD, and calculation of ROIs in the PXD. The ROIs are then sent to the Online Selector Node (ONSEN) [4]. Besides DATCON, the HLT is the second system that calculates ROIs in the PXD. For this task the HLT can use data not only from the SVD, but also from the drift chamber and the other sub-detectors. In addition, the HLT provides the trigger signal for the complete detector and pipelines the data of the sub-detectors, except the PXD, to storage. The PXD data are sent only to ONSEN, which merges the ROIs of HLT and DATCON and performs the overall PXD data reduction by rejecting hits outside the ROIs. The HLT uses a computing farm with 6400 cores in total and runs sophisticated track finding and fitting algorithms. These HLT algorithms will also be used in the later offline track reconstruction. DATCON, on the other hand, runs a fast FPGA-based track reconstruction.
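The selection step performed by ONSEN can be sketched in a few lines of Python. This is only an illustration of the logic described above (merge the ROIs of both systems, keep hits inside any ROI), not the ONSEN implementation: the `ROI` record, the `(sensor, u, v)` hit layout, and the function name `select_hits` are assumptions made for this example.

```python
from typing import NamedTuple


class ROI(NamedTuple):
    """Hypothetical rectangular ROI on one PXD sensor (inclusive pixel bounds)."""
    sensor: int
    u_min: int
    u_max: int
    v_min: int
    v_max: int

    def contains(self, sensor, u, v):
        """True if pixel (u, v) on the given sensor lies inside this ROI."""
        return (sensor == self.sensor
                and self.u_min <= u <= self.u_max
                and self.v_min <= v <= self.v_max)


def select_hits(pxd_hits, hlt_rois, datcon_rois):
    """Keep PXD hits that lie in at least one ROI from either system."""
    rois = list(hlt_rois) + list(datcon_rois)  # merging is a plain union here
    return [hit for hit in pxd_hits if any(roi.contains(*hit) for roi in rois)]


# Toy input: hits are (sensor, u, v) tuples; all values are made up.
hits = [(1, 10, 10), (1, 200, 700), (2, 50, 50), (1, 100, 400)]
hlt = [ROI(1, 0, 79, 0, 119)]
datcon = [ROI(1, 60, 139, 340, 459), ROI(2, 0, 10, 0, 10)]
kept = select_hits(hits, hlt, datcon)
```

Treating the merge as a plain union means that a hit is kept as soon as either system covers it, which matches the statement that HLT and DATCON run independently of each other.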

Data Reduction Concept
A track in a magnetic field, parametrised by a helix, is characterised by five parameters at the point of closest approach to the beamline. This is illustrated in Figure 4. The parameters are: the initial azimuthal angle $\phi_0$, the distance of closest approach $d_0$, the radius of the track $r_{T,0}$, the initial polar angle $\theta_0$, and the initial z-coordinate at the point of closest approach $z_0$.
A schematic overview of the data reduction procedure and the data flow inside DATCON is shown in Figure 5: the ROIs are calculated by using SVD hits to reconstruct tracks and extrapolating the trajectories to the PXD. For the purpose of track finding, the tracks are assumed to be circular in the x-y-plane, with an additional ghost hit at the origin of the coordinate system where the beam spot is located. The algorithm assumes that the trajectories can be approximated by straight lines in the r-z-plane with $z_0 \approx 0$. The hits are transformed using Hough [5] and Hesse transformations:

$$ s = r_\text{hit} \cos\theta + z_\text{hit} \sin\theta, $$

where $\rho = 1/r_T$ denotes the track curvature, $r_\text{hit}$ is the distance of the hit from the z-axis, $z_\text{hit}$ denotes the global z coordinate of the hit, and $x'$, $y'$ are the conformally transformed values of x, y of the hit. The conformal transformation is only valid for $d_0 = 0$, which is a good approximation for B-meson decays as their decay products (except the charged particles $e^\pm$, $\mu^\pm$, $\pi^\pm$, and $p/\bar{p}$) only have a short lifetime and decay in close proximity to the origin of the z-axis. Note that the conformal transformation is needed because the Hough transformation is better suited to straight lines. Using the conformal and Hough transformations, a helix trajectory is mapped onto a straight line. Hits on this straight line correspond to intersecting lines in the Hough parameter space. Thus the task of finding tracks in real space is equivalent to finding intersections of lines in Hough space. Once the intersections are found, equations 1 and 2 allow for a straightforward computation of the angle and the track curvature $\rho = 1/r_T$. With this information, the intersections of all tracks with the PXD detector planes are calculated. This reduces the task of finding ROIs to finding the intersection of a circle with a straight line in two dimensions in the x-y-plane and to finding the intersection of two straight lines in the r-z-plane.
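The x-y part of this procedure can be illustrated with a small Python sketch. It is not the DATCON firmware algorithm; it assumes the common conformal map $x' = x/(x^2+y^2)$, $y' = y/(x^2+y^2)$, whose normalisation may differ from the one used in DATCON. Under that assumption a circle of radius $r_T$ through the origin, with its centre in direction $\phi$ as seen from the origin, satisfies $1/r_T = 2\,(x'\cos\phi + y'\sin\phi)$, so each hit traces a sinusoid in the $(\phi, \rho)$ plane and hits of one track meet in a single cell. Note that $\phi$ in this sketch is the azimuth of the circle centre, which differs from the track direction at the origin by 90°. The bin counts, `rho_max`, and the intermediate layer radii are illustrative values, and the ghost hit at the origin is implicit in the conformal map.

```python
import math


def conformal(x, y):
    """Assumed conformal map: (x, y) -> (x, y) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return x / r2, y / r2


def hits_on_circle(r_track, phi_c, layer_radii):
    """Toy hits on a circle through the origin whose centre lies in direction phi_c."""
    hits = []
    for r in layer_radii:
        # A point at distance r from the origin obeys r = 2 r_T cos(alpha - phi_c).
        alpha = phi_c - math.acos(r / (2.0 * r_track))
        hits.append((r * math.cos(alpha), r * math.sin(alpha)))
    return hits


def hough_peak(hits, n_phi=360, n_rho=200, rho_max=0.05):
    """Vote in a discretised (phi, rho) space; return the centre of the fullest cell."""
    votes = {}
    for x, y in hits:
        xc, yc = conformal(x, y)
        for i in range(n_phi):
            phi = 2.0 * math.pi * (i + 0.5) / n_phi
            rho = 2.0 * (xc * math.cos(phi) + yc * math.sin(phi))
            if 0.0 < rho < rho_max:  # keep one sign of curvature only
                cell = (i, int(rho / rho_max * n_rho))
                votes[cell] = votes.get(cell, 0) + 1
    (i, j), count = max(votes.items(), key=lambda item: item[1])
    return 2.0 * math.pi * (i + 0.5) / n_phi, rho_max * (j + 0.5) / n_rho, count


# Only the innermost (39 mm) and outermost (135 mm) radii are taken from the text.
LAYER_RADII_MM = (39.0, 80.0, 104.0, 135.0)
phi_true = math.radians(57.5)
rho_true = 0.010125  # curvature in 1/mm, i.e. r_T of roughly 99 mm
hits = hits_on_circle(1.0 / rho_true, phi_true, LAYER_RADII_MM)
phi_found, rho_found, n_votes = hough_peak(hits)
```

The finite cell size limits the angular resolution of such a scheme, which is the same kind of discretisation effect that produces the sharp edges in the residual distributions discussed in the results.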
From these intersections, Most Probable Hits (MPH) are defined and a fixed-size ROI of 80 × 120 pixels is created around each MPH. Studies to tune the ROI size on the estimated particle momenta are currently being performed. Finally, the identified ROIs are transmitted to ONSEN.
Figure 5. […] These coordinates are Hough-transformed (4.) to reconstruct possible tracks, yielding two 2D tracks with information on ϕ and r, and θ and s, respectively. The information is combined to obtain 3D tracks, which are extrapolated to the PXD. The positions of the most probable hits are calculated from the intersection of the extrapolated tracks with the PXD layers (5.). Finally the ROIs on the two PXD layers are calculated (6.).
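Building the fixed-size ROI from an MPH amounts to a small piece of index arithmetic, sketched here in Python. The function name `fixed_size_roi`, the inclusive-bounds convention, and the choice to truncate ROIs at the sensor edge are assumptions for illustration; the text only fixes the ROI size (80 × 120 pixels in u × v) and the module size (250 × 768 pixels).

```python
def fixed_size_roi(u_mph, v_mph, size_u=80, size_v=120, n_u=250, n_v=768):
    """Return inclusive pixel bounds (u_min, u_max, v_min, v_max) around an MPH.

    The ROI is centred on the MPH and truncated at the sensor edges
    (an assumption of this sketch).
    """
    u_c = int(round(u_mph))
    v_c = int(round(v_mph))
    u_min = max(0, u_c - size_u // 2)
    u_max = min(n_u - 1, u_c + size_u // 2 - 1)
    v_min = max(0, v_c - size_v // 2)
    v_max = min(n_v - 1, v_c + size_v // 2 - 1)
    return u_min, u_max, v_min, v_max


central = fixed_size_roi(125, 384)  # MPH in the middle of the sensor
edge = fixed_size_roi(10, 760)      # MPH close to a sensor corner
```

For an MPH well inside the sensor the ROI spans the full 80 × 120 pixels; near an edge it shrinks, which slightly lowers the data volume but leaves the covered physical region unchanged.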

Preliminary Results
To develop and test the algorithm to be implemented on an FPGA, a C++ and Python based implementation of the algorithm is used, running inside the Belle II Analysis and Software Framework (BASF2) [6]. The BASF2 framework allows for the simulation of the detector response to particles traversing the Belle II detector and includes the full decay chain of $\Upsilon(4S) \rightarrow B\bar{B} \rightarrow$ stable particles (i.e. $e^\pm$, $\mu^\pm$, $\pi^\pm$, $p/\bar{p}$, and $K_L$). This allows one to assess the performance of a given algorithm using simulated decays and variable background conditions. Preliminary DATCON performance results obtained from 100,000 simulated $B\bar{B}$ events are shown in the following. Figure 6 shows the performance of the track reconstruction for the azimuthal and the polar angle: 92% and 93% of all reconstructed tracks are within 1° of the true track in ϕ and θ, respectively.
In Figure 7, the track reconstruction efficiency as a function of the transverse momentum, $p_T$, is shown: the overall track reconstruction efficiency is 96% over the complete momentum range expected for the decay products of the Υ(4S) resonance. The reduced efficiency for low-$p_T$ tracks is primarily due to the fact that they only traverse one or two SVD layers, whereas the DATCON algorithm requires at least three hits in different layers to identify a track in order to reduce combinatorics.
Figure 6. Angular residuals ∆ of the reconstructed tracks, defined as the difference between reconstructed and true angles, for ϕ (blue) and θ (red). In 92% (ϕ) and 93% (θ) of the cases, respectively, the reconstructed values show a deviation of less than 1° from the true track. The sharp edges at ±0.35° and ±0.7° in both distributions are caused by the discrete Hough space.
Figure 8. Residuals of the extrapolation to the PXD (measured in pixels) in local sensor coordinates u and v. The residual is defined as the difference between the local coordinates of the extrapolated hit and the local coordinates of the true simulated Monte-Carlo hit. The left side shows the residuals on a logarithmic scale and the ROI of 80 × 120 pixels around the MPH as a red box. The right side shows a 3D illustration of the residuals. With ROIs of size 80 × 120 pixels, the ROI finding efficiency is 94% with a data reduction factor of 15. To obtain a higher ROI finding efficiency, it is possible to further increase the ROI size while still reducing the data by a factor of about 10.
Figure 9. Efficiency of the ROI finding as a function of track $p_T$. Particles with a transverse momentum below 80 MeV cannot be found during tracking, and thus no extrapolation is possible for these tracks. For tracks with $p_T$ > 100 MeV, the ROI finding efficiency is above 90%; the average over the complete $p_T$-range is larger than 94%, with the data reduction factor being above 10.
Figure 8 shows the residuals of the extrapolated hits, defined as the difference between the local position of the extrapolated hit and the local position of the true MC hit, measured in pixels in the local coordinates u and v. Over 90% of the extrapolated hits lie within $\Delta R = \sqrt{u^2 + v^2}$ of 50 pixels of the true hit position, and 94% of all MPHs in the PXD are located such that the ROIs of size 80 × 120 pixels (u × v) calculated around the MPHs include the corresponding true PXD hits. As can be seen in Figure 8, the residual distribution is slightly wider in the v-direction than in the u-direction, which is why the ROI size is chosen to be larger in the v-direction. Finally, Figure 9 shows the ROI-finding efficiency as a function of track $p_T$. The average ROI-finding efficiency over the whole $p_T$-range is larger than 94%, and for tracks with $p_T$ > 100 MeV the algorithm is nearly 100% efficient. The median data reduction factor is about 15, leaving room to further optimise the ROI size and other aspects of the algorithm.
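How the two figures of merit follow from the residuals can be sketched as below. The definitions are assumptions for illustration: the efficiency is taken as the fraction of (∆u, ∆v) residuals that fall inside an ROI centred on the MPH, and the reduction factor as the ratio of all PXD hits to the hits kept inside ROIs. The input numbers are toy values, not simulation results.

```python
def roi_efficiency(residuals, size_u=80, size_v=120):
    """Fraction of (du, dv) residuals, in pixels, covered by an ROI centred on the MPH."""
    if not residuals:
        raise ValueError("need at least one residual")
    inside = sum(1 for du, dv in residuals
                 if abs(du) <= size_u / 2 and abs(dv) <= size_v / 2)
    return inside / len(residuals)


def reduction_factor(n_hits_total, n_hits_in_rois):
    """Assumed definition: all PXD hits divided by the hits kept inside ROIs."""
    return n_hits_total / n_hits_in_rois


# Toy residuals in pixels: two inside the 80 x 120 ROI, two outside.
toy_residuals = [(3, -10), (45, 0), (-20, 59), (0, 70)]
efficiency = roi_efficiency(toy_residuals)
factor = reduction_factor(1500, 100)
```

Enlarging `size_u` and `size_v` raises the efficiency but lowers the reduction factor, which is the trade-off behind the ROI size tuning mentioned above.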

Conclusions and Outlook
In this article the performance of the DATCON system was presented: DATCON performs an FPGA-based online track reconstruction to define ROIs on the Belle II Pixel Detector using reconstructed hits from the Silicon Vertex Detector. This reduces the amount of data that needs to be stored offline. The currently achieved median data reduction factor is 15. The track reconstruction efficiency that can be achieved with $\Upsilon(4S) \rightarrow B\bar{B}$ events is about 96% over the full $p_T$ range, as determined by simulation. The efficiency of finding a true hit within a DATCON ROI is 94%. Future improvements in performance could be obtained by tuning the ROI size using the $p_T$ of the reconstructed tracks or by improving the clustering of the Hough space.

Acknowledgements
This work is supported by the German Federal Ministry of Education and Research (BMBF).