Issue | EPJ Web of Conf., Volume 295, 2024: 26th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2023) | |
---|---|---|
Article Number | 01023 | |
Number of page(s) | 8 | |
Section | Data and Metadata Organization, Management and Access | |
DOI | https://doi.org/10.1051/epjconf/202429501023 | |
Published online | 06 May 2024 |
EPN2EOS Data Transfer System
1 CERN, Esplanade des Particules 1, 1211 Geneva 23, Switzerland
2 Politehnica University of Bucharest, Splaiul Independenței no. 313, sector 6, Bucharest, Romania
* e-mail: asuiu@cern.ch
** e-mail: costin.grigoras@cern.ch
*** e-mail: sergiu.weisz@upb.ro
**** e-mail: latchezar.betev@cern.ch
ALICE is one of the four large experiments at the CERN LHC designed to study the structure and origins of matter in collisions of heavy ions and protons at ultra-relativistic energies. To collect, store, and process the experimental data, ALICE uses hundreds of thousands of CPU cores and more than 400 PB of different types of storage resources.
During LHC Run 3, which started in 2022, ALICE is running with an upgraded detector and an entirely new data acquisition (DAQ) system capable of collecting 100 times more events than the previous setup. One of the key elements of the new DAQ is the Event Processing Nodes (EPN) farm, which currently comprises 250 servers, each equipped with 8 AMD MI50 GPU accelerators. The role of the EPN cluster is to compress the detector data in real time. During heavy-ion data taking, the experiment collects about 900 GB/s from the sensors; this is compressed down to 100 GB/s and then written to a 130 PB persistent disk buffer for further processing. The EPNs process the data streams from the detector, called Time Frames, each of 10 ms duration, independently of one another and write the output, called Compressed Time Frames (CTFs), to their local disks. The CTFs must be transferred to the disk buffer and removed from the EPNs as soon as possible so that data collection from the experiment can continue. The data transfer is performed by the new EPN2EOS system, which was introduced in the ALICE experiment in Run 3. EPN2EOS is highly optimized to perform the copy operations in parallel with the EPN data compression algorithms and has extensive monitoring and alerting capabilities to support the ALICE experiment operators. The service has been in production since November 2021. This paper presents the architecture, implementation, and analysis of its first years of utilization.
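To put the quoted rates in perspective, the following is a minimal back-of-envelope sketch in Python that derives a few figures from the numbers in the abstract. It is an illustration only, not part of EPN2EOS. The names (`summarize`, `GB_PER_PB`, and the constants) are hypothetical, and the calculation assumes decimal units (1 PB = 10^6 GB), a perfectly even load across all 250 EPN servers, and a sustained 100 GB/s output into an initially empty buffer, none of which is stated in the text.

```python
# Back-of-envelope figures derived from the numbers quoted in the abstract.
# Assumptions (not from the paper): decimal units (1 PB = 1e6 GB), evenly
# balanced load across EPN servers, sustained peak rate, empty buffer at start.

RAW_RATE_GB_S = 900.0    # detector output during heavy-ion data taking
CTF_RATE_GB_S = 100.0    # compressed output written to the disk buffer
BUFFER_PB = 130.0        # persistent disk buffer capacity
EPN_SERVERS = 250        # servers in the EPN farm
GPUS_PER_SERVER = 8      # GPU accelerators per server

GB_PER_PB = 1e6          # assumed decimal conversion factor
SECONDS_PER_DAY = 86400


def summarize():
    """Return derived quantities as a dictionary."""
    fill_seconds = BUFFER_PB * GB_PER_PB / CTF_RATE_GB_S
    return {
        # how much smaller the compressed stream is than the raw one
        "compression_ratio": RAW_RATE_GB_S / CTF_RATE_GB_S,
        # average compressed output each server must ship off its local disk
        "per_epn_gb_s": CTF_RATE_GB_S / EPN_SERVERS,
        "total_gpus": EPN_SERVERS * GPUS_PER_SERVER,
        # time to fill the buffer if nothing is ever removed from it
        "fill_seconds": fill_seconds,
        "fill_days": fill_seconds / SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:g}")
```

Under these assumptions, each EPN server has to move about 0.4 GB/s of CTFs off its local disk on average, and the buffer could absorb roughly 15 days of continuous peak-rate data taking.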
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.