Software Defined Networking in KM3NeT

The networking infrastructure of the KM3NeT detector, implemented with both White Rabbit and standard high-performance switches, is presented in its peculiar asymmetric and hybrid layout. It is one of the first use cases of Software Defined Networking in Astroparticle Physics experiments. Thanks to this innovative technology, dangerous network loops are prevented and a complete deterministic control of the handled data flows is obtained.

1 The KM3NeT networking layout

KM3NeT is a distributed neutrino observatory in abyssal sites of the Mediterranean Sea [1]. Each installation will be based on a grid of thousands of Digital Optical Modules (DOMs), interconnected to shore via an electro-optical seafloor infrastructure extending over distances of O(100) km. The DOMs are organised in vertical structures, the Detection Units, each one provided with a Base-module for controlling the power supply and optical amplification of the attached devices. Exploiting a custom FPGA-based White Rabbit [2] kernel with Ethernet connectivity, the DOMs and Base-modules are submarine nodes of the global Layer 2 optical networking infrastructure [3]. The network segments of the KM3NeT data acquisition (DAQ) system are represented in the scheme sketched in Figure 1, together with the relevant DAQ elements [4–6]. We focus here on the RAW LAN, which is the Layer 2 segment interconnecting the off-shore detector to the on-shore Control Unit (CU) and Trigger and Data Acquisition System (TriDAS) servers. As anticipated, it has to comply with the two following important constraints:

• Asymmetry of the connections: the asymmetry originates directly from the so-called optical broadcast architecture adopted for the optical infrastructure, which aims at best exploiting the number of optical fibers contained in the many-km-long electro-optical cable (EOC) connecting the shore stations with the detectors.
All the timing and slow-control information addressed downstream from the shore station is embedded into a single data stream. This stream is split at every branching step of the downstream connections, up to every endpoint off-shore. Although all DOMs and Base-modules receive the same information, only the actual destination device processes it.

• Hybrid switching layout: the asymmetry described above violates the requirements of the WR protocol. A customisation of the White Rabbit Switch (WRS) is then necessary to optimise and adapt the usual master-slave relation between the WRSs and their endpoints.

∗e-mail: tommaso.chiarusi@bo.infn.it ∗∗e-mail: emidio.giorgio@lns.infn.it

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/). EPJ Web of Conferences 207, 06009 (2019) https://doi.org/10.1051/epjconf/201920706009 VLVnT-2018

Figure 1. Sketch of the two KM3NeT network segments for the raw and managed data flows (RAW and MGD LANs, respectively), with the basic elements off-shore, on-shore and in the remote locations. For each network segment, the displayed layers are intended to help the reader differentiate the data flows.

The
WRS-Broadcast and the WRS-Level1 share the mastership over the off-shore endpoints. This architecture was developed by the Seven Solutions company [7]. The Base-modules are kept in a fully closed WRS loop, while the DOMs do not reply with any WR-PTP packets and send the Fast Acquisition data directly to standard switches.
The switching infrastructure is composed of the following elements (see Figure 2 for reference):

• the White Rabbit switch fabric: a number of WRS-Broadcast and WRS-Level1 switches, dimensioned to cope with the subdivision of the off-shore detectors into convenient sectors;

• the Standard Switch fabric: a number of DOM Front End Switch (DFES) elements, dimensioned to host the total number of DOMs; one Star Center Switch Fabric (SCSF) element, which interconnects the TriDAS resources and forwards to them the optical and acoustic data flows; one Slow Control and Base Data (SCBD) element, which is responsible for connecting the standard switch fabric with the White Rabbit one.
The sketched connection topology is the simplest and most effective one. Experience demonstrated that, to preserve the stability of the WR infrastructure and consequently guarantee a reliable time-synchronisation mechanism, it is mandatory to minimise the traffic passing through the WRS-Broadcast. This is achieved with Software Defined Networking (SDN), a frontier technology supported by modern high-level switches. With SDN, it is possible to define the most convenient relation between a particular data flow and the relevant switch ports, simply via Layer 2 routing. A group of ports of a switch driven via SDN is called an SDN instance.

2 The Software Defined Networking implementation
The core of the SDN implementation is the OpenFlow protocol, which gives access to the forwarding plane of a network switch. OpenFlow allows the user to define the needed forwarding/routing rules, also called flows, and to implement them in the SDN switch. In KM3NeT, the SDN flows are all Layer-2 based, i.e. they rely only on the source and destination MAC addresses written in the packets and/or on the source/destination switch ports. If a packet does not match any rule, it is dropped. OpenFlow v1.3 was chosen in order to exploit the MAC-masking feature for handling multiple MAC addresses within one single rule. OpenFlow v1.3 also has the advantage that the injected flows are made persistent in the memory of the SDN switch. Not all switches can implement the OpenFlow protocol. In KM3NeT, the DELL S-Series switches (S6000, S4048-ON and S3124F) were selected because they allow one to fully exploit the OpenFlow features while keeping the possibility to configure the speed and auto-negotiation of the in-band interfaces. This is important for correctly interfacing the DELL switches with the WR devices.
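To illustrate the matching logic that such Layer-2 flows implement, the following sketch (plain Python; all MAC addresses, ports and priorities are hypothetical, not the actual KM3NeT rules) mimics an OpenFlow v1.3-style table with MAC masking and a drop on table miss:

```python
# Sketch of OpenFlow v1.3-style Layer-2 matching with MAC masks.
# Rule values are illustrative, not the KM3NeT production flows.

def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer."""
    return int(mac.replace(":", ""), 16)

class Flow:
    def __init__(self, priority, out_port, src=None, src_mask=None,
                 dst=None, dst_mask=None, in_port=None):
        self.priority = priority
        self.out_port = out_port          # None models an OpenFlow "drop"
        self.src, self.src_mask = src, src_mask
        self.dst, self.dst_mask = dst, dst_mask
        self.in_port = in_port

    def matches(self, src, dst, in_port):
        if self.in_port is not None and in_port != self.in_port:
            return False
        if self.src is not None:
            mask = mac_to_int(self.src_mask or "ff:ff:ff:ff:ff:ff")
            if mac_to_int(src) & mask != mac_to_int(self.src) & mask:
                return False
        if self.dst is not None:
            mask = mac_to_int(self.dst_mask or "ff:ff:ff:ff:ff:ff")
            if mac_to_int(dst) & mask != mac_to_int(self.dst) & mask:
                return False
        return True

def forward(table, src, dst, in_port):
    """Highest-priority matching flow wins; no match means drop (table miss)."""
    for flow in sorted(table, key=lambda f: -f.priority):
        if flow.matches(src, dst, in_port):
            return flow.out_port
    return None  # dropped

# One masked rule covers every off-shore node sharing the 08:00:30 prefix.
table = [Flow(priority=100, out_port=5,
              src="08:00:30:00:00:00", src_mask="ff:ff:ff:00:00:00")]

assert forward(table, "08:00:30:aa:bb:cc", "02:00:00:00:00:01", 1) == 5
assert forward(table, "0c:c4:7a:12:34:56", "02:00:00:00:00:01", 1) is None
```

The masked match is what keeps the rule set small: a single flow entry stands for an arbitrary number of endpoint MAC addresses.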
To handle all the data flows involved in the KM3NeT data acquisition, only a restricted number of SDN rules is required for both the SCSF and SCBD instances (see Table 1). It is important to note that the number of rules does not depend on the size of the detector, i.e. on the number of DOMs and Base-modules, thanks to the fact that the MAC addresses of the off-shore nodes always have the 08:00:30 prefix. Finally, failover procedures can require additional rules. The most effective one implements drop actions, in order to temporarily remove unwanted data flows.
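The failover mechanism based on drop actions can be sketched as follows (illustrative Python; the MAC addresses, priorities and actions are invented for the example): injecting a higher-priority drop flow temporarily overrides a forwarding rule, and removing it restores the original data flow.

```python
# Illustrative failover: a higher-priority "drop" flow masks a forwarding flow.
# MACs, priorities and actions are hypothetical, not KM3NeT production values.

def pick(flows, dst):
    """Return the action of the highest-priority flow matching dst ('drop' on miss)."""
    matching = [f for f in flows if f["dst"] == dst or f["dst"] == "*"]
    return max(matching, key=lambda f: f["priority"])["action"] if matching else "drop"

flows = [{"dst": "08:00:30:aa:bb:cc", "priority": 100, "action": "output:3"}]
assert pick(flows, "08:00:30:aa:bb:cc") == "output:3"

# Failover: inject a temporary drop rule at higher priority...
flows.append({"dst": "08:00:30:aa:bb:cc", "priority": 200, "action": "drop"})
assert pick(flows, "08:00:30:aa:bb:cc") == "drop"

# ...then delete it to restore the original data flow.
flows = [f for f in flows if f["priority"] != 200]
assert pick(flows, "08:00:30:aa:bb:cc") == "output:3"
```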
All the rules are handled by a Controller service, which is an additional new element in the network. Any SDN instance continuously polls the Controller (over a TCP connection) in order to update the set of active rules.

Figure 2.

The rules management is done by the Controller via
a dedicated software called OpenDaylight (ODL) [8], which is an OpenFlow implementation with consistent support from a wide range of hardware vendors. The version currently used in KM3NeT is Nitrogen (0.7.3). ODL allows SDN flow management through a REST-API interface. This interface has been exploited in KM3NeT not only for high-level management of the rules, but also to implement auxiliary customised services that access the Controller via RESTCONF and produce statistics on the network packets matching the various rules. This is relevant for monitoring the SDN system and possibly intervening, even automatically, with failover strategies.
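For illustration, a Layer-2 flow of the kind discussed above could be pushed to ODL through its RESTCONF interface roughly as below. This is a minimal sketch assuming the general shape of ODL's flow-node-inventory model; the controller URL, node identifier, ports and priorities are placeholders, not a verified KM3NeT configuration.

```python
import json

# Sketch of a RESTCONF flow-programming payload for OpenDaylight.
# All identifiers and values are illustrative placeholders.
ODL = "http://controller:8181/restconf/config"
node, table, flow_id = "openflow:1", 0, "42"
url = f"{ODL}/opendaylight-inventory:nodes/node/{node}/table/{table}/flow/{flow_id}"

payload = {
    "flow-node-inventory:flow": [{
        "id": flow_id,
        "table_id": table,
        "priority": 100,
        "match": {
            "ethernet-match": {
                # One masked match covers all 08:00:30-prefixed off-shore nodes.
                "ethernet-source": {"address": "08:00:30:00:00:00",
                                    "mask": "ff:ff:ff:00:00:00"}
            }
        },
        "instructions": {"instruction": [{
            "order": 0,
            "apply-actions": {"action": [{
                "order": 0,
                "output-action": {"output-node-connector": "5"}
            }]}
        }]}
    }]
}

body = json.dumps(payload)
# The rule would then be installed with an authenticated HTTP PUT of `body`
# to `url` (Content-Type: application/json), e.g. via urllib or requests.
```

A custom monitoring service would use the same interface in the opposite direction, reading the per-flow packet counters exposed by the controller.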