Issue | EPJ Web of Conf., Volume 295, 2024, 26th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2023)
---|---
Article Number | 03016
Number of page(s) | 6
Section | Offline Computing
DOI | https://doi.org/10.1051/epjconf/202429503016
Published online | 06 May 2024
Framework for custom event sample augmentations for ATLAS analysis data
1 Argonne National Laboratory (US)
2 Simon Fraser University (CA)
3 University of Oslo (NO)
4 Max-Planck-Institut für Physik (DE)
5 Brookhaven National Laboratory (US)
6 Iowa State University (US)
* e-mail: gemmeren@anl.gov
For HEP event processing, data is typically stored in column-wise synchronized containers, most prominently ROOT’s TTree, which has been used for several decades and by now stores over 1 exabyte of data. These containers can combine the row-wise association capabilities needed by most HEP event processing frameworks (e.g. Athena for ATLAS) with column-wise storage, which typically results in better compression and more efficient support for many analysis use cases. One disadvantage is that these containers (TTree in the HEP use case) must contain the same attributes for each entry/row (representing an event), which can make extending the list of attributes very costly in storage, even if the new attributes are required only for a small subsample of events. Since its initial design, the ATLAS software framework has featured a powerful navigational infrastructure that allows custom data extensions for subsamples of events to be stored in separate but synchronized containers. This allows event augmentations to be added to ATLAS standard data products (such as DAOD-PHYS or PHYSLITE) without duplicating those core data products and while limiting their size increase. For this functionality, the framework does not rely on any associations made by the I/O technology (i.e. ROOT); however, it supports TTree friends and builds the associated index to allow for analysis outside of the ATLAS framework. A prototype based on the Long-Lived Particle search has been implemented, and preliminary results obtained with it will be presented. At this point, augmented data are stored within the same file as the core data. Storing them in separate files will be investigated in the future, as this could provide more flexibility: for example, certain sites may want only a subset of the available augmentations, or augmentations could be archived to tape once their analysis is complete.
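The idea of a sparse augmentation container synchronized with a core container through an index can be illustrated with a minimal, pure-Python sketch. It is neither the ATLAS navigational infrastructure nor ROOT code: the class names, the column names (`n_jets`, `dv_mass`) and the choice of (run number, event number) as the key are assumptions made only for illustration. The dictionary built here plays roughly the role of the index that lets a friend container be matched to the core events.

```python
from dataclasses import dataclass


@dataclass
class CoreEvents:
    """Core sample stored column-wise: one value per event in every column."""
    run_number: list
    event_number: list
    n_jets: list


@dataclass
class Augmentation:
    """Extension stored column-wise, with rows only for the selected events."""
    run_number: list
    event_number: list
    dv_mass: list  # hypothetical extra attribute, e.g. a displaced-vertex mass

    def build_index(self):
        # Map (run, event) -> row number inside this sparse container.
        keys = zip(self.run_number, self.event_number)
        return {key: row for row, key in enumerate(keys)}


def join(core, aug):
    """Yield one record per core event, attaching the augmentation if present."""
    index = aug.build_index()
    for row, key in enumerate(zip(core.run_number, core.event_number)):
        aug_row = index.get(key)  # None when the event was not augmented
        yield {
            "run": key[0],
            "event": key[1],
            "n_jets": core.n_jets[row],
            "dv_mass": None if aug_row is None else aug.dv_mass[aug_row],
        }


# Five core events, of which only two carry the extra attribute.
core = CoreEvents(
    run_number=[1, 1, 1, 1, 1],
    event_number=[10, 11, 12, 13, 14],
    n_jets=[2, 4, 3, 5, 1],
)
aug = Augmentation(run_number=[1, 1], event_number=[11, 13], dv_mass=[12.5, 48.0])

joined = list(join(core, aug))

# Values stored for the new attribute: padded into the core vs. kept sparse.
padded_values = len(core.event_number)
sparse_values = len(aug.dv_mass)
```

In this toy case the new attribute is stored for 2 events instead of 5; the sparse container additionally carries its own key columns, so the saving only becomes relevant when the augmented subsample is small compared to the core sample and the added attributes are large.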
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.