FELIX: the new detector interface for the ATLAS experiment

During the next major shutdown from 2019 to 2020, the ATLAS experiment at the LHC at CERN will adopt the Front-End Link eXchange (FELIX) system as the interface between the data acquisition, detector control and TTC (Timing, Trigger and Control) systems and new or updated trigger and detector front-end electronics. FELIX will route data between custom serial links from front-end ASICs and FPGAs and data collection and processing components via a commodity switched network. Links may aggregate many slower links or be a single high-bandwidth link. FELIX will also forward the LHC bunch-crossing clock, fixed-latency trigger accepts and resets received from the TTC system to front-end electronics. The FELIX system uses commodity server technology in combination with FPGA-based PCIe I/O cards. The FELIX servers will run a software routing platform serving data to network clients. Commodity servers connected to FELIX systems via the same network will run the new Software Readout Driver (SW ROD) infrastructure for event fragment building and buffering, with support for detector or trigger specific data processing, and will serve the data upon request to the ATLAS High-Level Trigger for Event Building and Selection. These proceedings cover the design and status of FELIX and the SW ROD.


Introduction
The ATLAS detector [1] is one of the two general purpose detectors located at the Large Hadron Collider (LHC) at CERN. The LHC collides bunches of particles at a rate of 40 MHz. Since ATLAS is capable of permanently storing data at a rate of 1.5 kHz, it uses a two-level trigger system to identify which events to retain. The hardware-based Level-1 (L1) trigger reduces the 40 MHz collision rate to a maximum of 100 kHz. The L1 trigger makes its decision on the basis of coarse sums of calorimeter energy deposits and of coarse transverse momentum measurements of muon candidates to identify events with interesting signatures. The software-based High-Level Trigger (HLT) further reduces the L1 rate to 1.5 kHz. The HLT decides which events to retain by using higher granularity calorimeter and muon detector information, together with inner detector tracking information.
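The rate reductions quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three rates given in the text; the constant and function names are illustrative.

```python
# Rates quoted in the text, in Hz.
BUNCH_CROSSING_RATE_HZ = 40e6   # LHC bunch collision rate
L1_MAX_ACCEPT_RATE_HZ = 100e3   # maximum Level-1 output rate
HLT_OUTPUT_RATE_HZ = 1.5e3      # rate of events kept for permanent storage


def reduction_factor(rate_in_hz, rate_out_hz):
    """Return how many input events there are per retained event."""
    return rate_in_hz / rate_out_hz


l1_reduction = reduction_factor(BUNCH_CROSSING_RATE_HZ, L1_MAX_ACCEPT_RATE_HZ)
hlt_reduction = reduction_factor(L1_MAX_ACCEPT_RATE_HZ, HLT_OUTPUT_RATE_HZ)
total_reduction = l1_reduction * hlt_reduction

print(f"L1 keeps at most 1 in {l1_reduction:.0f} bunch crossings")
print(f"HLT keeps about 1 in {hlt_reduction:.1f} L1-accepted events")
print(f"Overall: about 1 in {total_reduction:.0f} bunch crossings is stored")
```

At the maximum L1 rate, the L1 trigger therefore provides a reduction factor of 400, and the HLT a further factor of roughly 67.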

Planned Upgrades
The ATLAS detector will be upgraded during the second 2-year Long Shutdown (LS2) starting in 2019 [2]. An upgrade to the Liquid Argon (LAr) electromagnetic calorimeter electronics will allow for higher granularity energy measurements to be used by the L1 trigger [3]. The small wheels equipped with muon chambers, installed in the forward direction, will be replaced by New Small Wheels (NSWs) [4]. The NSW consists of eight layers of Micromegas detectors (MM) and small-strip Thin Gap Chambers (sTGC), which will provide more accurate and higher resolution tracking in the forward regions of the ATLAS detector. The new muon chambers will also be able to handle the high hit rates associated with the high LHC luminosities that are expected after the Long Shutdown 3 (LS3) upgrades in 2024-2026 [5]. In addition to the NSW upgrades, additional resistive plate muon chambers, called BIS78, will be installed in the transition region between the ATLAS barrel and endcap. BIS78 will allow the L1 trigger to more efficiently reject signatures that could fake muons in this region. The calorimeter electronics of the L1 trigger system will also be upgraded to utilize the improved electromagnetic and muon measurements provided by the LS2 upgrades, resulting in improved selectivity and background rejection.
The new electromagnetic calorimeter electronics, the NSW muon detectors, BIS78, and L1 calorimeter trigger electronics will require a new infrastructure for reading out data and sending control information to the electronics. The current readout system is schematically shown in Figure 1. Once events are flagged as interesting by the L1 trigger, they are routed to custom Readout Drivers (RODs). The RODs process and check the data before sending it to the Readout System (ROS), where it is temporarily stored. The ROS consists of server PCs equipped with custom PCIe I/O expansion cards that receive the data from the RODs. The HLT, also consisting of server PCs, initially retrieves the data it needs for a trigger decision from the ROS via a commodity Ethernet network connecting both systems. After the HLT decides which events to retain for permanent storage, the full event information is retrieved, reconstructed and stored. Software running on other commodity servers configures, controls and monitors the detector electronics, and controls infrastructure through the Detector Control System (DCS). The current system contains many customized components and communication protocols (custom links, TTC [6], S-LINK [7], CANBUS), which reduce the flexibility with which information can be communicated to and from the detector.
The new approach for the readout and control of the upgraded systems, shown schematically in Figure 2, allows for a reduction of the number of different types of custom electronics and protocols. The functionality that is currently provided by dedicated electronics and firmware will be moved to software running on commodity servers. This is made possible by advancements in computer technology and by the radiation-hard GigaBit Transceiver (GBT) link technology developed by CERN [8]. The GBT link will be used to transmit the Timing, Trigger and Control (TTC) information to the front-end electronics. It will also be used to configure and control the front-end electronics through slow control. The protocol is bidirectional, enabling data to be passed from the front-end electronics to the data collection system and back to the front-end electronics. The standard GBT protocol multiplexes relatively slow links (80, 160 or 320 Mb/s per link), which makes it possible to transfer different independent data streams via the same physical link (raw transfer speed: 4.8 Gb/s). The slow links are known as E-links.
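To make the multiplexing concrete, the following sketch converts the quoted link speeds into bits per frame. It assumes that one GBT frame is transmitted per 40 MHz bunch crossing and that 80 bits of each frame are available for user data; neither figure is stated in the text above, and both are marked as assumptions in the code. All names are illustrative.

```python
BUNCH_CROSSING_HZ = 40_000_000       # assumption: one GBT frame per bunch crossing
GBT_LINE_RATE_BPS = 4_800_000_000    # raw transfer speed quoted in the text
ELINK_RATES_BPS = (80_000_000, 160_000_000, 320_000_000)  # E-link speeds from the text
ASSUMED_USER_BITS_PER_FRAME = 80     # assumption: user-data field of a standard GBT frame


def bits_per_frame(rate_bps, frame_rate_hz=BUNCH_CROSSING_HZ):
    """Return the number of bits a link of the given speed carries in one frame period."""
    if rate_bps % frame_rate_hz != 0:
        raise ValueError("rate is not an integer number of bits per frame")
    return rate_bps // frame_rate_hz


def max_elinks(elink_rate_bps, user_bits=ASSUMED_USER_BITS_PER_FRAME):
    """Return how many E-links of one speed fit into the assumed user-data field."""
    return user_bits // bits_per_frame(elink_rate_bps)


print("bits per GBT frame:", bits_per_frame(GBT_LINE_RATE_BPS))
for rate in ELINK_RATES_BPS:
    print(f"{rate // 1_000_000} Mb/s E-link: {bits_per_frame(rate)} bits per frame, "
          f"up to {max_elinks(rate)} per link")
```

Under these assumptions, a single physical link carries 120 bits per frame and can host up to 40, 20 or 10 E-links at 80, 160 or 320 Mb/s respectively. The remaining bits of the frame are not available to E-links in this model.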
The Front-End Link eXchange (FELIX) will use the GBT protocol over standard optical links and transceivers to route data and commands (including TTC) between electronics on the ATLAS detector and commodity servers. FELIX will also use the GBT protocol to route control data (including TTC) to non-radiation-hard electronics installed in an underground service area beside the ATLAS detector (off-detector electronics). For transfers from off-detector electronics towards FELIX, a protocol called FULL mode will be used; it consists of simple 8B/10B encoding with a raw transfer speed of 9.6 Gb/s. In contrast to the pre-upgrade system, the functionality of the RODs and of the ROS will be implemented by software running on commodity servers. The computers containing the ROD/ROS functionality are referred to as Software RODs (SW RODs). The detector-specific data processing, which previously occurred on custom RODs, will be implemented in customizable Data Handler software applications on the SW RODs. The HLT, the system used for control, configuration and monitoring, and the DCS will continue running on commodity servers after the upgrade and will communicate via a commodity Ethernet network. FELIX is planned to be detector agnostic and will act as a router between the custom links and a general commodity network switch. Due to this flexibility, the FELIX system is planned to be used to read out all the ATLAS subdetectors after the LS3 upgrades. The largest LS3 upgrade project is the installation of a new tracker, called the Inner Tracker (ITk), whose links will be read out by FELIX. The ITk will consist of silicon strip and pixel trackers and will cover a larger pseudorapidity range than the current inner detector. ATLAS will also upgrade the readout electronics of the other subdetectors, and all the readout and a large fraction of the control will be done via FELIX. The legacy and new readouts will coexist after LS2 and will be replaced by a FELIX-only readout after LS3.
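Because 8B/10B encoding transmits 10 line bits for every 8 data bits, the usable bandwidth of a FULL mode link is lower than its raw transfer speed. The short calculation below illustrates this; it ignores any further overhead from control symbols or idle characters, and the function name is illustrative.

```python
FULL_MODE_RAW_GBPS = 9.6   # raw FULL mode transfer speed quoted in the text


def payload_after_8b10b(raw_rate):
    """Return the data rate left after 8B/10B encoding (8 data bits per 10 line bits)."""
    return raw_rate * 8 / 10


full_mode_payload_gbps = payload_after_8b10b(FULL_MODE_RAW_GBPS)
print(f"FULL mode payload bandwidth: {full_mode_payload_gbps:.2f} Gb/s")
```

The same 80% factor applies to any 8B/10B encoded stream, so the value obtained here is an upper bound on the event data rate of one FULL mode link.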

Protocols
For the LS2 upgrades, the optical link connected to the front-end electronics is the Versatile link [1]. Links from FELIX towards the detector or trigger systems, called downlinks, use the standard GBT protocol. The E-link speed (80, 160 or 320 Mb/s) and the protocol used on each E-link are configurable. Each E-link can be configured to forward to the detector electronics either TTC signals or control data that FELIX receives via its dedicated inputs. Alternatively, E-links can be configured to pass on information sent to FELIX via the network, with a choice of 8B/10B, HDLC or no encoding.
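The configuration options for a downlink E-link can be summarised as a small data structure. This is a minimal sketch, not the FELIX configuration format: all class, field and value names are hypothetical. The rule that an encoding can only be chosen for network-sourced E-links is an assumption made for this sketch, based on the text mentioning the encoding choice only for that case.

```python
from dataclasses import dataclass
from enum import Enum

ALLOWED_ELINK_RATES_MBPS = (80, 160, 320)   # E-link speeds listed in the text


class Encoding(Enum):
    NONE = "none"
    ENC_8B10B = "8b10b"
    HDLC = "hdlc"


class DownlinkSource(Enum):
    TTC = "ttc"                          # TTC signals
    DEDICATED_INPUT = "dedicated_input"  # control data from a dedicated FELIX input
    NETWORK = "network"                  # information sent to FELIX via the network


@dataclass(frozen=True)
class DownlinkElinkConfig:
    """Hypothetical description of one downlink E-link."""
    elink_id: int
    rate_mbps: int
    source: DownlinkSource
    encoding: Encoding = Encoding.NONE

    def __post_init__(self):
        if self.rate_mbps not in ALLOWED_ELINK_RATES_MBPS:
            raise ValueError(f"unsupported E-link rate: {self.rate_mbps} Mb/s")
        # Assumption of this sketch: only network-sourced E-links select an encoding.
        if self.source is not DownlinkSource.NETWORK and self.encoding is not Encoding.NONE:
            raise ValueError("encoding is only selectable for network-sourced E-links")


ttc_elink = DownlinkElinkConfig(elink_id=0, rate_mbps=80, source=DownlinkSource.TTC)
control_elink = DownlinkElinkConfig(elink_id=1, rate_mbps=160,
                                    source=DownlinkSource.NETWORK, encoding=Encoding.HDLC)
print(ttc_elink)
print(control_elink)
```

Validating the configuration when the object is created means that an unsupported speed is rejected before it could reach the hardware.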
Optical links from the detector to FELIX, called uplinks, use the GBT protocol. For each E-link it is possible to configure whether 8B/10B, HDLC or no decoding is applied to the incoming data. For 8B/10B encoded event data transferred across E-links or across FULL mode links, control symbols indicate event fragment boundaries. Flow control is used for FULL mode links to indicate when the front-end system is temporarily overloaded. It is implemented with the help of control symbols.
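The use of control symbols as fragment boundaries can be illustrated with a small parser. In this sketch an already decoded 8B/10B stream is modelled as (is_control, value) pairs. The choice of K28.1 as start marker, K28.6 as end marker and K28.5 as idle symbol is an assumption made for illustration; the text does not specify which symbols FELIX uses.

```python
# Byte values of three standard 8B/10B control characters.
K28_1 = 0x3C
K28_5 = 0xBC
K28_6 = 0xDC

# Assumed roles, for illustration only.
START_OF_FRAGMENT = K28_1
END_OF_FRAGMENT = K28_6
IDLE = K28_5


def split_fragments(symbols):
    """Split a decoded symbol stream into event fragments.

    `symbols` is an iterable of (is_control, value) pairs. Data bytes outside
    a start/end pair are ignored, as are idle and other control symbols.
    """
    fragments = []
    current = None   # None means no fragment is open
    for is_control, value in symbols:
        if is_control:
            if value == START_OF_FRAGMENT:
                current = bytearray()
            elif value == END_OF_FRAGMENT and current is not None:
                fragments.append(bytes(current))
                current = None
        elif current is not None:
            current.append(value)
    return fragments


stream = [
    (True, IDLE),
    (True, START_OF_FRAGMENT), (False, 0x01), (False, 0x02), (True, END_OF_FRAGMENT),
    (True, IDLE), (False, 0x09),
    (True, START_OF_FRAGMENT), (False, 0x03), (True, END_OF_FRAGMENT),
]
fragments = split_fragments(stream)
print(fragments)
```

Because control symbols are distinct from all data bytes, the payload needs no escaping and fragments of arbitrary length can be delimited in-band.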
For LS3, the LS2 FELIX design will be expanded to meet additional requirements such as support of the upgraded TTC system, higher complexity of configuration of control signals, vertical and/or horizontal aggregation of data across links and handling a higher link density at rates up to 1 MHz. The uplink protocols supported for the LS3 system will include the lpGBT protocol and Aurora [4], in addition to the FULL mode and standard GBT protocols. For the downlinks towards the ITk detector, E-links of the lpGBT protocol will transfer the strip and pixel-specific coded data for control and trigger signaling.

Hardware
The FELIX system will consist of FPGA PCIe cards and Network Interface Cards (NICs) placed inside server PCs. The baseline choice of the FPGA card (Figure 3), known as the FLX-712, has a 16-lane PCIe Gen3 interface, a Xilinx Kintex UltraScale XCKU115 FPGA, 8 miniPODs interfacing to 48 bi-directional optical links, a TTC interface and a LEMO connector for a "BUSY" signal output. The server PCs will run Linux and have a chassis that can be mounted in racks located in the underground cavern beside the ATLAS detector. Either one or two FLX-712 cards will be installed per PC, each connecting to 12, 24 or 48 links, depending on the type of front-end electronics for which the cards form the interface. For testing and development, lower priced, commercially available Xilinx VC709 boards are often used instead of the FLX-712. These boards are equipped with a separate small mezzanine, called the TTCfx, which interfaces to the TTC system. A VC709 interfaces to four bi-directional links and has an 8-lane PCIe Gen3 interface.
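As a rough guide to the capacity of the two PCIe interfaces mentioned above, the sketch below estimates their theoretical bandwidth. It relies on two PCIe Gen3 figures that are not given in the text: 8 GT/s per lane and 128b/130b line encoding. It ignores packet and protocol overhead, so the values are upper bounds rather than achievable throughput. The helper for counting links per PC uses only the options listed in the text. All names are illustrative.

```python
PCIE_GEN3_GT_PER_S_PER_LANE = 8.0          # PCIe Gen3 signalling rate (not from the text)
PCIE_GEN3_ENCODING_EFFICIENCY = 128 / 130  # PCIe Gen3 128b/130b encoding (not from the text)

CARDS_PER_PC_OPTIONS = (1, 2)              # from the text
LINKS_PER_CARD_OPTIONS = (12, 24, 48)      # from the text


def pcie_gen3_bandwidth_gbps(lanes):
    """Return the theoretical one-directional Gen3 bandwidth in Gb/s, before protocol overhead."""
    return lanes * PCIE_GEN3_GT_PER_S_PER_LANE * PCIE_GEN3_ENCODING_EFFICIENCY


def links_per_pc(cards, links_per_card):
    """Return the number of optical links served by one FELIX PC."""
    if cards not in CARDS_PER_PC_OPTIONS or links_per_card not in LINKS_PER_CARD_OPTIONS:
        raise ValueError("configuration not among the options listed in the text")
    return cards * links_per_card


print(f"FLX-712 (x16): {pcie_gen3_bandwidth_gbps(16):.1f} Gb/s upper bound")
print(f"VC709 (x8):    {pcie_gen3_bandwidth_gbps(8):.1f} Gb/s upper bound")
print("links per PC:", sorted({links_per_pc(c, l)
                               for c in CARDS_PER_PC_OPTIONS
                               for l in LINKS_PER_CARD_OPTIONS}))
```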
The baseline choice for LS2 FELIX PCs has yet to be determined. The current options are a 2U high single-CPU server with an Intel Xeon E5-1660V4 (8 core, 3.2 GHz) CPU and 32 GB of memory, or alternatively with an Intel Xeon Gold 5118 (12 core, 2.3 GHz) CPU and 48 GB of memory. The PC includes a Network Interface Card (NIC). The current baseline choice for the software RODs is a 1U high server with two Intel Xeon Gold 5118 CPUs, 2 x 48 GB of memory, a dual-port Mellanox ConnectX-5 100 GbE interface and a dual-port 40 GbE interface for connecting to the HLT.
The numbers of FELIX links, cards and PCs for the LS2 system are shown in Table 1. In Table 1, the L1 calorimeter trigger upgrades needed for reading out the upgraded on-detector LAr calorimeter electronics are referred to as "calorimeter off-detector trigger electronics". The L1 trigger electronics upgrades needed for reading out the NSW are included under the "New Small Wheel muon detectors" section of Table 1.

Software
FelixCore is a multi-threaded application running on the server PCs of the FELIX system that controls the FPGA cards via PCIe registers. These are accessible via memory-mapped I/O, set up with the help of a dedicated driver. During transfer of data to the SW RODs, the data is buffered in circular buffers on the FELIX PC. A second driver allocates the contiguous memory needed for the circular buffers, to and from which data is transferred under DMA control.
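The circular buffering described above can be sketched in a few lines. This is a pure-Python illustration of the data structure only: the write method stands in for the DMA engine filling the buffer, the read method stands in for the software draining it, and none of the names correspond to the actual FELIX drivers or to FelixCore.

```python
class CircularBuffer:
    """Fixed-size byte ring with one producer and one consumer."""

    def __init__(self, capacity):
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._written = 0   # total bytes written so far (only ever increases)
        self._read = 0      # total bytes read so far (only ever increases)

    def available(self):
        """Return the number of unread bytes."""
        return self._written - self._read

    def free_space(self):
        """Return the number of bytes that can be written without overwriting unread data."""
        return self._capacity - self.available()

    def write(self, data):
        """Producer side (stand-in for DMA): append data, wrapping at the end of the buffer."""
        if len(data) > self.free_space():
            raise BufferError("write would overwrite unread data")
        start = self._written % self._capacity
        first = min(len(data), self._capacity - start)
        self._buf[start:start + first] = data[:first]
        self._buf[0:len(data) - first] = data[first:]
        self._written += len(data)

    def read(self, max_bytes):
        """Consumer side: return up to max_bytes of the oldest unread data."""
        count = min(max_bytes, self.available())
        start = self._read % self._capacity
        first = min(count, self._capacity - start)
        out = bytes(self._buf[start:start + first]) + bytes(self._buf[0:count - first])
        self._read += count
        return out


ring = CircularBuffer(8)
ring.write(b"abcdef")
head = ring.read(4)
ring.write(b"ghijk")      # wraps around the end of the 8-byte buffer
tail = ring.read(7)
print(head, tail)
```

Keeping the read and write positions as ever-increasing counters makes it straightforward to tell a full buffer from an empty one. A real implementation would also need a way for the consumer to tell the producer how far it has read.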
FelixCore reconstructs complete chunks of the data written to the circular buffers and forwards these to any destination that is registered as receiving data with FelixCore. The data are forwarded via the network, using NetIO, a network protocol-agnostic software layer. FelixCore also forwards data, received from the network by means of NetIO, to the FPGAs.
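The forwarding of chunks to registered destinations follows a publish/subscribe pattern, which the sketch below illustrates. It does not use or reproduce the NetIO API: the class and method names are hypothetical, destinations are modelled as plain Python callables rather than network endpoints, and keying subscriptions by an E-link identifier is an assumption of this sketch.

```python
from collections import defaultdict


class ChunkRouter:
    """Hypothetical in-process stand-in for forwarding chunks to registered receivers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, elink_id, deliver):
        """Register a callable deliver(elink_id, chunk) as a destination for one E-link."""
        self._subscribers[elink_id].append(deliver)

    def publish(self, elink_id, chunk):
        """Forward a complete chunk to every registered destination; return how many received it."""
        receivers = self._subscribers.get(elink_id, [])
        for deliver in receivers:
            deliver(elink_id, chunk)
        return len(receivers)


received = []
router = ChunkRouter()
router.subscribe(5, lambda elink_id, chunk: received.append((elink_id, chunk)))

delivered = router.publish(5, b"\x01\x02")   # one destination registered for E-link 5
dropped = router.publish(6, b"\x03")         # nobody registered for E-link 6
print(received, delivered, dropped)
```

In this model the sender does not need to know who consumes the data, which matches the description of FelixCore forwarding to any registered destination.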

Summary
The FELIX system is a detector interface for readout and control, encapsulating functionalities common to all ATLAS subdetectors. FELIX will be initially deployed for the LS2 upgrades, which include new LAr calorimeter electronics, the New Small Wheel, additional BIS78 muon chambers, and upgraded L1 trigger calorimeter electronics. The FELIX system will be expanded to meet post-LS3 requirements and act as the interface with all subdetectors, including the new Inner Tracker (ITk).