EPJ Web Conf.
Volume 245, 2020
24th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2019)
Number of page(s): 7
Section: 4 - Data Organisation, Management and Access
Published online: 16 November 2020
CMS data access and usage studies at PIC Tier-1 and CIEMAT Tier-2
1 Port d’Informació Científica (PIC), Barcelona, Spain
2 Centro de Investigaciones Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
3 Institut de Física d’Altes Energies (IFAE), Barcelona, Spain
4 Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
* e-mail: firstname.lastname@example.org
The current computing models of the LHC experiments indicate that the HL-LHC era (2026+) will require much larger resource increases than technology evolution at a constant budget can deliver. Since the worldwide budget for computing is not expected to increase, many research activities have emerged to improve the performance of the LHC processing software applications and to propose more efficient resource deployment scenarios and data management techniques, which might reduce this expected increase in resources. The massively increasing amounts of data to be processed pose enormous challenges for HEP storage systems, networks and data distribution to end-users. These challenges are particularly important in scenarios in which the LHC data would be distributed from a small number of centers holding the experiment’s data. Enabling data locality relative to computing tasks via local caches at sites seems a very promising approach to hide transfer latencies while reducing the deployed storage space and the overall number of replicas. However, its effectiveness depends strongly on the workflow I/O characteristics and the network available across sites. An assessment of how the experiments access and use the storage services deployed at WLCG sites is therefore crucial to evaluate and simulate the benefits of several of the new proposals emerging within WLCG/HSF. This report reviews studies of storage access and usage, and of data access and popularity, for the CMS workflows executed at the Spanish Tier-1 (PIC) and Tier-2 (CIEMAT) sites supporting CMS activities, based on local and experiment monitoring data spanning more than one year. These results are relevant for simulating data caches for end-user analysis data and for identifying potential areas for storage savings.
© The Authors, published by EDP Sciences, 2020
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.