EPJ Web Conf.
Volume 214, 2019
23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018)
Number of page(s): 8
Section: T3 - Distributed computing
Published online: 17 September 2019
ATLAS utilisation of the Czech national HPC center
Institute of Physics of the CAS,
Na Slovance 1999/2,
2 Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Brehova 7, Prague, 115 19, Czech Republic
The Czech national HPC center IT4Innovations, located in Ostrava, provides two HPC systems, Anselm and Salomon. The Salomon HPC has been among the hundred most powerful supercomputers on Earth since its commissioning in 2015. Both clusters were tested for use by the ATLAS experiment for running simulation jobs. Several thousand core hours were allocated to the project for tests, but the main aim is to use free resources that sit idle while waiting for large parallel jobs of other users. Multiple strategies for ATLAS job execution were tested on the Salomon and Anselm HPCs. The solution described herein is based on the ATLAS experience with other HPC sites. An ARC Compute Element (ARC-CE) installed at the grid site in Prague is used for job submission to Salomon. The ATLAS production system submits jobs to the ARC-CE via the ARC Control Tower (aCT). The ARC-CE processes job requirements from aCT and creates a script for the batch system, which is then executed via ssh. Sshfs is used to share scripts and input files between the site and the HPC cluster. The software used to run jobs is rsynced from the site's CVMFS installation to the HPC's scratch space every day to ensure availability of recent software. Using this setup, the opportunistic capacity of the Salomon HPC was exploited.
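The two transfer steps named in the abstract (the daily rsync of software to scratch space and the execution of a batch script via ssh) can be illustrated with a minimal Python sketch. It only builds the command lines and does not run them. The login host, all paths, and the `qsub` submit tool are hypothetical placeholders chosen for illustration and are not taken from the paper; the actual ARC-CE configuration may differ substantially.

```python
import shlex

# Hypothetical placeholders, NOT taken from the paper.
HPC_LOGIN = "user@hpc-login.example.org"
CVMFS_SOURCE = "/cvmfs/atlas.cern.ch/repo/sw/"
HPC_SCRATCH_SW = "/scratch/project/atlas/sw/"


def build_rsync_command(source, login, dest):
    """Return the argv for a one-way sync of a software tree to HPC scratch.

    -a preserves permissions, times and symlinks; --delete removes files on
    the destination that no longer exist in the source; -e ssh selects ssh
    as the transport.
    """
    return ["rsync", "-a", "--delete", "-e", "ssh", source, f"{login}:{dest}"]


def build_remote_submit_command(login, script_path, submit_tool="qsub"):
    """Return the argv that runs a batch submit tool on the login node via ssh.

    The submit tool name is an assumption; the script path is shell-quoted
    because ssh passes the remote command through a shell.
    """
    remote = f"{shlex.quote(submit_tool)} {shlex.quote(script_path)}"
    return ["ssh", login, remote]
```

A daily scheduler entry at the grid site could pass the list returned by `build_rsync_command(CVMFS_SOURCE, HPC_LOGIN, HPC_SCRATCH_SW)` to `subprocess.run`, and the job submission path could do the same with `build_remote_submit_command`. Returning argument lists rather than shell strings avoids local quoting problems.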
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.