Issue | EPJ Web Conf., Volume 302, 2024, Joint International Conference on Supercomputing in Nuclear Applications + Monte Carlo (SNA + MC 2024)
---|---
Article Number | 04001
Number of page(s) | 13
Section | Monte-Carlo Transport Codes: Algorithms, HPC & GPU
DOI | https://doi.org/10.1051/epjconf/202430204001
Published online | 15 October 2024
Study on the Particle Sorting Performance for Reactor Monte Carlo Neutron Transport on Apple Unified Memory GPUs
New Compute Laboratory, No. 89, Jianguo Road, Beijing, China
* e-mail: changyuan_liu@163.com
In the simulation of nuclear reactor physics with the Monte Carlo neutron transport method on GPUs, the sorting of particles plays a significant role in computational performance. Traditionally, CPUs and GPUs are separate devices connected by links with low data transfer rates and high data transfer latency. Emerging computing chips tend to integrate CPUs and GPUs; one example is Apple silicon with unified memory. Such unified memory chips open the door to new strategies of collaboration between CPUs and GPUs for Monte Carlo neutron transport. Sorting particles on the CPU while transporting them on the GPU is one such strategy; on traditional devices with separate CPUs and GPUs it has suffered from high CPU-GPU data transfer latency. The finding is that, for the Apple M2 Max and M3 Max chips, sorting on the CPU leads to better performance per unit power than sorting on the GPU for the ExaSMR whole core benchmark problems and the HTR-10 high temperature gas reactor fuel pebble problem. The partially sorted order of the particles has been identified as a contributor to the higher performance of CPU sorting compared with GPU sorting. The in-house code using both CPU and GPU achieves 7.6 times the power efficiency of OpenMC on CPU (M3 Max) for the ExaSMR whole core benchmark with depleted fuel, and 130 times (M3 Max) for the HTR-10 fuel pebble benchmark with depleted fuel.
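The role of a partially sorted particle order can be illustrated with a minimal Python sketch. It is not the in-house code described above, and it runs entirely on the CPU. The sort key (material id, then energy), the function names, and the 1% perturbation model are all assumptions made for illustration. Python's built-in adaptive sort stands in for whatever CPU sort the author uses.

```python
import random


def sort_particles(material_ids, energies):
    """Return particle indices ordered by (material id, energy).

    The key choice is hypothetical; it only illustrates grouping
    particles that would touch similar cross-section data.
    """
    return sorted(range(len(material_ids)),
                  key=lambda i: (material_ids[i], energies[i]))


def make_partially_sorted(n, fraction, rng):
    """Build n sorted integer keys, then overwrite a fraction at random.

    This is an assumed stand-in for a particle bank in which only some
    particles changed their sort key since the previous sort.
    """
    keys = list(range(n))
    for _ in range(int(n * fraction)):
        keys[rng.randrange(n)] = rng.randrange(n)
    return keys


def count_comparisons(keys):
    """Count the comparisons Python's built-in adaptive sort performs."""
    counter = [0]

    class Key:
        __slots__ = ("v",)

        def __init__(self, v):
            self.v = v

        def __lt__(self, other):
            counter[0] += 1
            return self.v < other.v

    sorted(Key(k) for k in keys)
    return counter[0]
```

Calling `count_comparisons` on the output of `make_partially_sorted` and on a fully random key list of the same length shows that an adaptive comparison sort does less work on the nearly ordered input. Whether this is the mechanism behind the reported CPU advantage is not something the sketch establishes.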
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.