Validation of Physics Models of Geant4 Versions 10.4.p03, 10.6.p02 and 10.7.p01 using Data from the CMS Experiment

CMS tuned its simulation program and chose a specific physics model of Geant4 by comparing the simulation results with dedicated test beam experiments. Test beam data provide measurements of the energy response of the calorimeter as well as its resolution for well-identified charged hadrons over a large energy range. CMS continues to validate the physics models using the test beam data as well as collision data from the Large Hadron Collider. Isolated charged particles are measured simultaneously in the tracker and in the calorimeters. These events are selected using dedicated triggers and are used to measure the response in the calorimeter. Different versions of Geant4 (10.2.p02, 10.4.p03, 10.6.p02) have been used by CMS for its Monte Carlo production, and a new version (10.7.p01) has now been chosen for future productions. A suitable physics list (a collection of physics models) is chosen by balancing performance against accuracy. A detailed comparison between data and Geant4 predictions is presented in this paper.


Introduction
The CMS experiment [1] at the Large Hadron Collider uses the Geant4 toolkit [2][3][4]. Both the simulation software and the Geant4 code have been evolving since the CMS simulation code was put into operation. In our previous reports [5][6][7], the full simulation scheme for Run 2 was described. Several variants of Geant4 version 10 have been used for Run 2. Initially, Geant4 10.0.p02 was adopted and a new multi-threaded mode was made operational [7]. The CPU performance of the CMS simulation was substantially improved compared to the Run 1 simulation. CMS switched to Geant4 version 10.2.p02 for its 2017 Monte Carlo (MC) production. CMS used Geant4 version 10.4.p03 for its 2018 MC production and 10.6.p02 for its 2020 production. CMS is getting ready for the MC production for LHC Run 3, which is scheduled to start in 2022, and has chosen to use Geant4 version 10.7.p01. There has been an active effort within CMS to evaluate the quality of the simulation whenever either the application software or the version of Geant4 changes.
Two sources of data are used for validating the software. Data exist from a test beam campaign during 2006 in which a prototype of the CMS hadron calorimeter (HCAL) and one super-module of the electromagnetic calorimeter (ECAL) were exposed to hadron beams of different types and energies. Results from this test beam have been published [8] and were used in earlier tuning of the CMS simulation program [9][10]. As a second source, collision data from the CMS experiment are used. Here, low luminosity runs with zero bias or minimum bias triggers from the 2016B run period are analysed. A similar analysis was done earlier in CMS [11].
These validation results consolidate the recent adoption of the new Geant4 versions and the physics list used for Monte Carlo production.

CMS Simulation Software
The CMS simulation software (a component of the overall CMS software, CMSSW) started with the physics list [12] QGSP_FTFP_BERT_EML during the early Run 2 operation of the LHC. Since its 2017 MC production with Geant4 version 10.2.p02, CMS has used the physics list FTFP_BERT_EMM. The hadronic physics model is specified by the component FTFP_BERT. This is the recommended physics list provided by the Geant4 collaboration. In this physics list, the transition between the FTFP and the Bertini cascade models is modified in different Geant4 versions. Up to Geant4 version 10.4.p03, the Bertini cascade model is valid for all particles for energies below 12 GeV and the FTFP model is valid above 3 GeV. In Geant4 versions 10.6.p02 and 10.7.p01, the same transition energies are used for pions, but for other particles the Bertini cascade model is assumed to be valid only for energies below 6 GeV. The electromagnetic physics model is specified by the term EML or EMM. EML uses a simplified multiple scattering model in all parts of the detector. The EMM physics list uses the default Geant4 multiple scattering model in the regions of the sampling calorimeter. This choice balances performance against accuracy of the simulation. The effect of saturation on scintillation light emission is handled by applying Birks' law [13] to the lead tungstate crystals as well as to the plastic scintillators. After the Geant4 release 10.6, Geant4 experts showed that the traditional value of the Birks saturation coefficient was inconsistent with the Geant4 simulation [14] and requires retuning for each specific experiment. In CMS, different parameters are used for the ECAL crystals and the HCAL scintillators. Analysis of the test beam data and of isolated hadrons from Run 2 led to the conclusion that the saturation coefficient for the plastic scintillator should be increased from 0.0052 to 0.006. The saturation coefficients for the ECAL crystals are kept unchanged.
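To give a feeling for the size of the change in the scintillator saturation coefficient, the following minimal sketch evaluates the textbook first-order form of Birks' law, in which the light yield is suppressed by a factor 1 / (1 + kB · dE/dx). This is an illustration only: the function name `birks_factor` is hypothetical, the units of the coefficient (taken here as g/(MeV cm^2)) and the dE/dx values are assumptions not stated in the text, and the parametrisation actually used in the CMS simulation may contain additional terms.

```python
def birks_factor(dedx, kb):
    """First-order Birks suppression factor 1 / (1 + kB * dE/dx).

    dedx: specific energy loss (assumed units: MeV cm^2 / g)
    kb:   Birks saturation coefficient (assumed units: g / (MeV cm^2))
    """
    return 1.0 / (1.0 + kb * dedx)


KB_OLD = 0.0052  # previous plastic scintillator value quoted in the text
KB_NEW = 0.006   # retuned plastic scintillator value quoted in the text

if __name__ == "__main__":
    # Hypothetical dE/dx values: roughly minimum-ionising up to heavily ionising.
    for dedx in (2.0, 20.0, 200.0):
        old = birks_factor(dedx, KB_OLD)
        new = birks_factor(dedx, KB_NEW)
        print(f"dE/dx = {dedx:6.1f}: old = {old:.4f}, new = {new:.4f}, "
              f"new/old = {new / old:.4f}")
```

Under these assumptions, a larger coefficient reduces the light yield most strongly for heavily ionising particles, while minimum-ionising particles are almost unaffected.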

Validation versus test beam data
Dedicated measurements were carried out with prototypes of the CMS calorimeter in the test beam facility of CERN [8]. Two production wedges of the barrel hadron calorimeter (HB), one prototype module of the endcap (HE), and eighteen trays of the outer hadron calorimeter (HO) were exposed to hadron beams in the H2 beam line of the Super Proton Synchrotron (SPS). The HCAL was preceded by a super-module of the barrel electromagnetic calorimeter (EB). The platform holding the modules could be moved along the phi and eta directions allowing the beam to be directed onto any tower of the calorimeter. Monochromatic secondary and tertiary beams were used having a momentum between 2 and 350 GeV. Auxiliary beam counters were used to select pure beam interactions.
Both electromagnetic and hadron calorimeters were calibrated using 50 GeV electron beams. The energy response (the ratio of the measured calorimetric energy to the beam momentum), resolution, and shower profiles were measured for different beam momenta. The test beam data analysis allows selection of high-purity samples of pions, kaons, protons and antiprotons. The same method of hit generation is used in the test beam simulation as in the full CMS simulation. The digitisation chain of CMSSW is not used; instead, the hit energy is smeared using a Gaussian distribution with fixed widths for the ECAL (0.362 GeV) and the HCAL (0.64 GeV) obtained from the test beam analysis. Hits are collected in a 7x7 crystal matrix of the ECAL and in a 3x3 tower matrix of the HCAL. Hits in the outer hadron calorimeter layer are excluded in both simulation and data analysis. In addition, a time cut corresponding to the test beam data acquisition is applied. The visible energy response is computed using the following simple relation: $E_{\mathrm{vis}} = E_{\mathrm{ECAL}} \cdot f_{\mathrm{ECAL}} + E_{\mathrm{HCAL}} \cdot f_{\mathrm{HCAL}}$, where $f_{\mathrm{ECAL}} = 1.01$ and $f_{\mathrm{HCAL}} = 105.0$. These factors are determined using the same procedure as in the analysis of the test beam data. The results of the simulation with Geant4 10.4.p03, 10.6.p02 and 10.7.p01 are shown in the form of the mean energy response as a function of beam momentum in Fig. 1 (for pions), Fig. 2 (for protons and antiprotons), and Fig. 3 (for kaons). The level of agreement between data and MC is given as chi-square per data point in Table 1. All three Geant4 versions provide compatible agreement for the mean energy deposition for pions and protons within the statistical accuracy of a few percent. The level of agreement is not good for kaons. The energy response for pions and kaons is similar in the data, but not in the Monte Carlo. The predictions of Geant4 versions 10.6.p02 and 10.7.p01 show some improvement for kaons, some deterioration for positive pions, and acceptable agreement for negative pions, protons and antiprotons.
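The visible energy relation and the chi-square per data point quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code: the function names are hypothetical, the Gaussian smearing and the time cut are omitted, and the chi-square is assumed to be the conventional sum of squared, uncertainty-normalised differences divided by the number of points, which the text does not spell out.

```python
F_ECAL = 1.01   # ECAL scale factor quoted in the text
F_HCAL = 105.0  # HCAL scale factor quoted in the text


def visible_energy(e_ecal, e_hcal):
    """E_vis = E_ECAL * f_ECAL + E_HCAL * f_HCAL (energies in GeV)."""
    return e_ecal * F_ECAL + e_hcal * F_HCAL


def mean_response(events, beam_momentum):
    """Mean of E_vis / p_beam over a list of (e_ecal, e_hcal) pairs."""
    total = sum(visible_energy(e_ecal, e_hcal) for e_ecal, e_hcal in events)
    return total / (len(events) * beam_momentum)


def chi2_per_point(data, mc, sigma):
    """Assumed definition: mean of ((data - mc) / sigma)^2 over all points."""
    terms = [((d - m) / s) ** 2 for d, m, s in zip(data, mc, sigma)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical summed hit energies (GeV) for two simulated events.
    events = [(10.0, 0.20), (12.0, 0.18)]
    print("mean response:", mean_response(events, beam_momentum=30.0))
    # Hypothetical response points with uncertainties.
    print("chi2/point:", chi2_per_point([0.80, 0.85], [0.82, 0.84], [0.02, 0.02]))
```

A single function per quantity keeps the scale factors in one place, so a retuned factor would propagate to both the response and the data-to-simulation comparison.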

The energy resolution is also compared for negative pions and protons (Fig. 4). Data and simulation agree within statistical uncertainties at momenta above 8 GeV/c. At low momentum, the simulation underestimates the energy resolution, which could be due to the simplified simulation of the detector response.

Validation versus Run 2 data
The analysis method [15] for comparing data and Monte Carlo simulation of calorimeters is applied to the three sets of simulations using Geant4 versions 10.4.p03, 10.6.p02 and 10.7.p01. The ratio of the calorimeter energy measurement to the track momentum for isolated charged hadrons is studied. For that, well-measured charged tracks reaching the ECAL surface are selected by imposing isolation criteria based on the NxN matrix analysis on the calorimeter surface. The final cuts are the following:
• No extra tracks in the isolation region.
• Energy cut of 2 GeV for neutral isolation.
• No additional good primary vertex in the event, in order to avoid pileup (PU) effects.
Calorimeter energy is measured in an NxN matrix of calorimeter cells around the impact point of the track. Two versions of the NxN matrix are defined:
• 7x7 or 11x11 matrix for the ECAL.
• 3x3 or 5x5 matrix for the HCAL.
Two low luminosity data sets from the 2016B runs are used for this analysis. Results of the comparisons are shown as a function of particle momentum (Figs. 6-7) for three different eta regions (Fig. 5) and three versions of Geant4: 10.4.p03 (used during 2018), 10.6.p02 (used during 2020) and 10.7.p01 (to be used in the future).
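The NxN matrix energy measurement described above can be illustrated with a short sketch. It assumes a flat rectangular grid of cell energies with no wrap-around in phi and no calibration corrections, which is a simplification of the real detector geometry; the names `matrix_energy` and `response_ratio` are hypothetical and do not correspond to CMSSW functions.

```python
def matrix_energy(grid, i0, j0, n):
    """Sum cell energies in an n x n window centred on cell (i0, j0).

    grid: list of rows of cell energies (GeV); n must be odd.
    Cells falling outside the grid are ignored (assumption: no wrap-around).
    """
    if n % 2 == 0:
        raise ValueError("matrix size n must be odd")
    half = n // 2
    total = 0.0
    for i in range(i0 - half, i0 + half + 1):
        for j in range(j0 - half, j0 + half + 1):
            if 0 <= i < len(grid) and 0 <= j < len(grid[i]):
                total += grid[i][j]
    return total


def response_ratio(ecal, hcal, ecal_cell, hcal_cell, p_track, n_ecal=7, n_hcal=3):
    """Ratio of the summed ECAL + HCAL matrix energy to the track momentum."""
    energy = (matrix_energy(ecal, ecal_cell[0], ecal_cell[1], n_ecal)
              + matrix_energy(hcal, hcal_cell[0], hcal_cell[1], n_hcal))
    return energy / p_track


if __name__ == "__main__":
    # Hypothetical uniform grids, for illustration only.
    ecal = [[0.1] * 15 for _ in range(15)]
    hcal = [[0.5] * 7 for _ in range(7)]
    print(response_ratio(ecal, hcal, (7, 7), (3, 3), p_track=10.0))
```

The defaults of 7x7 for the ECAL and 3x3 for the HCAL correspond to the smaller of the two matrix choices listed above; passing `n_ecal=11` and `n_hcal=5` selects the larger one.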
The level of disagreement between data and MC is estimated from the deviation of the ratio (Data/MC) from 1. The mean level of disagreement is 2.1% in the barrel, 5.0% in the endcap and 3.6% in the transition region for Geant4 version 10.4.p03. The corresponding numbers are 2.1%, 2.3%, 2.1% for Geant4 version 10.6.p02, and 2.6%, 1.9%, 1.0% for Geant4 version 10.7.p01. This level is computed as the average absolute deviation over all momenta excluding the 1 GeV/c point, which has a large uncertainty. Geant4 version