| Issue | EPJ Web Conf., Volume 337, 2025: 27th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2024) | |
|---|---|---|
| Article Number | 01184 | |
| Number of page(s) | 8 | |
| DOI | https://doi.org/10.1051/epjconf/202533701184 | |
| Published online | 07 October 2025 | |
FEROCE: Front-End RDMA Over Converged Ethernet, a lightweight RoCE endpoint
1 National Institute for Nuclear Physics, Padova Division, Padova, Italy
2 Department of Physics and Astronomy, Padova University, Padova, Italy
3 CERN, Geneva, Switzerland
4 Department of Industrial Engineering, Padova University, Padova, Italy
5 Department of Information Engineering, Padova University, Padova, Italy
* e-mail: gabriele.bortolato@phd.unipd.it
In a data acquisition (DAQ) system, a large fraction of CPU resources is spent on networking rather than on data processing. Common network stacks typically move data through several copies, each of which is an expensive operation. Thus, when the CPU handles networking, the main drawbacks are reduced throughput and increased latency, caused by the overhead added to the data transmission process. Zero-copy networking can be achieved by adding a Remote Direct Memory Access (RDMA) layer to the network stack and offloading the stack handling to dedicated hardware. Given the ever-growing demand for larger bandwidth in big data systems, many works point towards implementing network stacks on custom hardware. FPGAs are the natural target for reducing time to market and keeping a low entry barrier. This work explores the implementation of RDMA directly on the front-end electronics, which frees part of the computing farm’s CPU resources. RDMA over Converged Ethernet (RoCE) is the industry-standard Ethernet-based RDMA solution with a multi-vendor ecosystem, making it the natural choice. This work focuses on the hardware implementation of a stripped-down version of RoCEv2 that implements only the transmitter part of the protocol, enabling its deployment in small FPGAs such as the rad-hard parts used in the detector front-end. Preliminary results on resource usage, latency and throughput will be shown.
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

