Exploiting CRIC to Streamline the Configuration Management of GlideinWMS Factories for CMS Support

GlideinWMS is a workload management and provisioning system that allows sharing computing resources distributed over independent sites. Based on the requests made by GlideinWMS Frontends, a dynamically sized pool of resources is created by GlideinWMS pilot Factories via pilot job submission to resource sites' computing elements. More than 400 computing elements (CEs) are currently serving more than 10 virtual organizations through GlideinWMS, with CMS being the biggest user with 230 CEs. The complex configurations of the parameters defining resource requests, as submitted to those CEs, have historically been managed by manually editing a set of different XML files. New possibilities arise with CMS adopting the Computing Resource Information Catalogue (CRIC), an information system that collects, aggregates, stores, and exposes, among other things, computing resource data coming from various data providers. The talk will describe the challenges faced when CMS started to use CRIC to automatically generate the GlideinWMS Factory configurations. The architecture of the prototype, and the ancillary tools developed to ease this transition, will be discussed. Finally, future plans and milestones will be outlined.

The GlideinWMS pool

CRIC
Gather and allow access to information about physical and CMS logical computing resources:
• Core CRIC with info from, for example, GOCDB/OIM (CEs and their queues)
• CMS CRIC with experiment-specific info on how we use resources
• CMS uses these CEs through the computing units (Sites)
• Each compute unit has multiple compute resources (GlideinWMS entries)

APIs are available to retrieve this information.
Leverage CRIC to automate Factory configuration generation.

Generating configs from CRIC
• Since the environment is complex, we want to design a system that gives Factory operations some flexibility to take into account all use cases:
  ○ Get the relevant info from CRIC,
  ○ but then generate XML configurations in the Factory itself.
• Gives Factory ops some extra control:
  ○ In case we need to overwrite some values, or add specific ones
  ○ Allows use of plug-in modules to take care of non-WLCG use cases
• Developed scripts that save this information from both CMS and core CRIC:
  ○ Information saved in a set of YAML files
    ■ Different files for different types of information (e.g., Grid-owned info vs. site-specific)
  ○ Another script merges this information and produces the final XML configuration.
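The merge step described above can be illustrated with a minimal Python sketch. It is not the actual Factory tooling: the entry name, attribute names, and the simplified XML layout are hypothetical, and plain dictionaries stand in for the contents of the YAML files (in practice they might be loaded with a YAML parser). It only shows the idea of layering operator-supplied values on top of CRIC-derived ones before emitting XML.

```python
import copy
import xml.etree.ElementTree as ET


def deep_merge(base, override):
    """Return a new dict in which values from `override` win.

    Nested dictionaries are merged recursively; `base` is left untouched.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def entries_to_xml(entries):
    """Serialize {entry_name: {attr: value}} into a simplified XML string.

    The element layout is an illustrative assumption, not the real
    GlideinWMS Factory schema.
    """
    root = ET.Element("entries")
    for name in sorted(entries):
        entry = ET.SubElement(root, "entry", name=name)
        for key in sorted(entries[name]):
            ET.SubElement(entry, "attr", name=key, value=str(entries[name][key]))
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-ins for the per-type YAML files.
cric_info = {
    "T2_XX_Example_ce01": {"gatekeeper": "ce01.example.org", "max_walltime": 2880, "cores": 8}
}
site_overrides = {"T2_XX_Example_ce01": {"max_walltime": 1440}}     # overwrite a value
operator_additions = {"T2_XX_Example_ce01": {"extra_flag": "true"}}  # add a specific one

# Later layers take precedence over earlier ones.
merged = deep_merge(deep_merge(cric_info, site_overrides), operator_additions)
xml_text = entries_to_xml(merged)
print(xml_text)
```

Keeping each source of information in its own layer means a fresh CRIC download can replace the first layer without touching the values Factory operators maintain by hand.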
