Issue | EPJ Web of Conf., Volume 295, 2024: 26th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2023)
---|---
Article Number | 01032
Number of page(s) | 7
Section | Data and Metadata Organization, Management and Access
DOI | https://doi.org/10.1051/epjconf/202429501032
Published online | 06 May 2024
General purpose data streaming platform for log analysis, anomaly detection and security protection
INFN-CNAF, viale Berti Pichat 6/2, 40127 Bologna, Italy
* Corresponding author: enrico.fattibene@cnaf.infn.it
INFN-CNAF is one of the Worldwide LHC Computing Grid (WLCG) Tier-1 data centres. It provides computing, networking and storage resources to a wide variety of scientific collaborations, not only the four LHC (Large Hadron Collider) experiments. The INFN-CNAF data centre will move to a new location next year. At the same time, the requirements from our experiments and users are becoming increasingly challenging, and new scientific communities have started or will soon start exploiting our resources. We are currently re-engineering several services, in particular our monitoring infrastructure, to improve day-to-day operations and to cope with the increasing complexity of the use cases and the future expansion of the centre.
This scenario led us to implement a data streaming infrastructure designed to enable log analysis, anomaly detection, threat hunting, integrity monitoring and incident response. The platform has been organised to manage different kinds of data coming from heterogeneous sources, to support multi-tenancy and to be scalable. Moreover, we will be able to provide an on-demand, end-to-end data streaming application to users and communities that request such a facility.
The infrastructure is based on the Apache Kafka platform, which provides streaming of events at large scale, with authentication and authorization configured at the topic level to ensure data isolation and protection. Data can be consumed by different applications, such as those devoted to log analysis, which can index large amounts of data and apply appropriate access policies for inspecting and visualising information.
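To make the idea of topic-level authorization more concrete, the following is a minimal sketch of how per-topic access rules can isolate tenants. It is a toy in-memory model written for illustration only: it does not talk to a Kafka broker, and it is not the configuration used at INFN-CNAF. The principal names, topic names and the `Acl` and `is_allowed` helpers are all hypothetical. The model loosely mirrors the shape of Kafka ACLs (a principal, an operation, and a literal or prefixed topic pattern) and assumes a deny-by-default policy.

```python
from dataclasses import dataclass

# Toy in-memory model of topic-level access control. It does not contact a
# Kafka broker; all principal and topic names below are hypothetical.


@dataclass(frozen=True)
class Acl:
    principal: str          # e.g. "User:tenant-a"
    operation: str          # "READ" or "WRITE"
    topic_pattern: str      # exact topic name, or a prefix if prefixed=True
    prefixed: bool = False  # True: matches every topic starting with the pattern


def is_allowed(acls, principal, operation, topic):
    """Deny by default; allow only if one rule matches principal, operation and topic."""
    for acl in acls:
        if acl.principal != principal or acl.operation != operation:
            continue
        if acl.prefixed and topic.startswith(acl.topic_pattern):
            return True
        if not acl.prefixed and topic == acl.topic_pattern:
            return True
    return False


# Assumed example policy: tenant A may read and write any topic under its own
# prefix, while a log-indexing consumer may only read one specific topic.
ACLS = [
    Acl("User:tenant-a", "WRITE", "tenant-a.", prefixed=True),
    Acl("User:tenant-a", "READ", "tenant-a.", prefixed=True),
    Acl("User:log-indexer", "READ", "tenant-a.syslog"),
]
```

Under this assumed policy, a call such as `is_allowed(ACLS, "User:tenant-b", "READ", "tenant-a.syslog")` is rejected because no rule names that principal, which is the kind of isolation between tenants that topic-level rules are meant to provide.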
In this contribution we present and motivate the technological choices behind the infrastructure, describe its components, and outline the use cases that can be addressed with this platform.
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.