EPJ Web Conf.
Volume 214, 201923rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018)
|Number of page(s)||8|
|Section||T8 - Networks & facilities|
|Published online||17 September 2019|
Integrated automation for configuration management and operations in the ATLAS online computing farm
Budker Institute of Nuclear Physics,
2 University of Johannesburg, South Africa
3 Istituto Nazionale di Fisica Nucleare Sezione di Bologna, Italy
4 Aristotle University of Thessaloniki, Greece
5 CERN, Switzerland
6 INFN Gruppo Collegato di Udine and Università di Udine, Italy
7 University of California Irvine, United States of America
8 University of Washington, United States of America
* e-mail: firstname.lastname@example.org
Published online: 17 September 2019
The online farm of the ATLAS experiment at the LHC, consisting of nearly 4000 PCs with various characteristics, provides configuration and control of the detector and performs the collection, processing, selection, and conveyance of event data from the front-end electronics to mass storage. Different aspects of the farm management are already accessible via several tools. The status and health of each node are monitored by a system based on Icinga 2 and Ganglia. PuppetDB gathers centrally all the status information from Puppet, the configuration management tool used to ensure configuration consistency of every node. The in-house Configuration Database (ConfDB) controls DHCP and PXE, while also integrating external information sources. In these proceedings we present our roadmap for integrating these and other data sources and systems, and building a higher level of abstraction on top of this foundation. An automation and orchestration tool will be able to use these systems and replace lengthy manual procedures, some of which also require interactions with other systems and teams, e.g. for the repair of a faulty node. Finally, an inventory and tracking system will complement the available data sources, keep track of node history, and improve the evaluation of long-term lifecycle management and purchase strategies.
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.