Evaluating Performance Portability with the CMS Heterogeneous Pixel Reconstruction code

Nikolaos Andriotis; Andrea Bocci; Eric Cano; Laura Cappelli; Tony Di Pilato; Luca Ferragina; Gabrielle Hugo; Matti J. Kortelainen; Martin Kwok; Juan Jose Olivera Loyola; Felice Pantaleo; Aurora Perego; Wahid Redjeb; Mark Dewing; Julien Esseiva

doi:10.1051/epjconf/202429511008

Open Access

Issue		EPJ Web of Conf. Volume 295, 2024, 26th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2023)


Article Number		11008
Number of page(s)		8
Section		Heterogeneous Computing and Accelerators
DOI		https://doi.org/10.1051/epjconf/202429511008
Published online		06 May 2024

EPJ Web of Conferences 295, 11008 (2024)

Nikolaos Andriotis¹, Andrea Bocci², Eric Cano², Laura Cappelli³, Tony Di Pilato⁴,⁵, Luca Ferragina⁶, Gabrielle Hugo², Matti J. Kortelainen⁷,*, Martin Kwok⁷, Juan Jose Olivera Loyola⁸, Felice Pantaleo², Aurora Perego⁹, Wahid Redjeb²,¹⁰, Mark Dewing¹¹ and Julien Esseiva¹², on behalf of the CMS Collaboration

¹ Barcelona Supercomputing Center, Spain
² CERN, Geneva, Switzerland
³ INFN Bologna, Italy
⁴ Center for Advanced Systems Understanding (CASUS), Görlitz, Germany
⁵ University of Geneva, Switzerland
⁶ University of Bologna, Italy
⁷ Fermi National Accelerator Laboratory, Batavia, IL, USA
⁸ Institute of Technology and Higher Studies of Monterrey, Mexico
⁹ University of Milano Bicocca, Italy
¹⁰ RWTH Aachen University, Germany
¹¹ Argonne National Laboratory, Lemont, IL, USA
¹² Lawrence Berkeley National Laboratory, Berkeley, CA, USA

* Corresponding author

Abstract

In recent years, the landscape of tools for expressing parallel algorithms portably across compute accelerators has continued to evolve significantly. Many technologies now provide portability between CPUs, GPUs from several vendors, and in some cases even FPGAs. These technologies include C++ libraries such as Alpaka and Kokkos, compiler directives such as OpenMP, the SYCL open specification, which can be implemented as a library or in a compiler, and standard C++, where the compiler is solely responsible for the offloading. Given this developing landscape, users have to choose the technology that best fits their applications and constraints. For example, experience so far with heterogeneous reconstruction algorithms in the CMS experiment suggests that the full application contains a large number of relatively short computational kernels and memory transfer operations. In this work we use a stand-alone version of the CMS heterogeneous pixel reconstruction code as a realistic use case of HEP reconstruction software that is capable of leveraging GPUs effectively. We summarize the experience of porting this code base from CUDA to Alpaka, Kokkos, SYCL, std::par, and OpenMP offloading. We compare the event processing throughput achieved by each version on NVIDIA and AMD GPUs as well as on a CPU, and compare those results with what a native version of the code achieves on each platform.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
