Search for a Standard Model Higgs boson decaying to b quarks and produced in association with Z/W bosons with the CMS detector

A search for the standard model Higgs boson is performed in a data sample corresponding to an integrated luminosity of 1.1 fb$^{-1}$, recorded by the CMS detector in proton-proton collisions at the LHC with a 7 TeV center-of-mass energy. The following modes are studied: W($\mu \nu$)H, W(e$\nu$)H, Z($\mu \mu$)H, Z(ee)H and Z($\nu \nu$)H, with the Higgs boson decaying to $b\bar{b}$ pairs. 95% C.L. upper limits on the VH production cross section are derived for a Higgs mass between 110 and 135 GeV. The expected (observed) upper limit at 115 GeV is found to be 5.7 (8.3) times the standard model expectation.


Introduction
The search for the Higgs boson [1] is currently one of the most important undertakings of experimental particle physics.
At the LHC the main Higgs production mechanism is direct production through gluon fusion, with a cross section of $\sim 17 \times 10^{3}$ fb for a Higgs mass $m_H = 120$ GeV [2]. However, in this production mode, the detection of the $H \to b\bar{b}$ decay is rendered nearly impossible by the overwhelming QCD dijet production. The same holds true for the next most copious production mode, vector-boson fusion, with a cross section of ∼1300 fb. Instead, we consider processes in which the Higgs boson is produced in association with a vector boson, which have cross sections of ∼660 fb and ∼360 fb for WH and ZH, respectively. Although the resulting sensitivity in the $H \to b\bar{b}$ decay is lower than in other final states, such as H → γγ and H → ττ, it is essential to search for the Higgs boson in these modes, because observing the $H \to b\bar{b}$ decay is key to determining the nature of this particle, if and when it is discovered.
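To put these cross sections in perspective, the short Python sketch below converts them into numbers of produced events for the 1.1 fb$^{-1}$ dataset, using N = σ × L. It is illustrative arithmetic only: the names `CROSS_SECTIONS_FB` and `produced_events` are introduced here, and no branching fraction, acceptance or selection efficiency is applied.

```python
# Illustrative arithmetic only: produced events N = sigma * integrated luminosity.
# Cross sections are the approximate values quoted in the text (m_H = 120 GeV).
CROSS_SECTIONS_FB = {
    "gluon fusion": 17e3,
    "vector-boson fusion": 1300.0,
    "WH": 660.0,
    "ZH": 360.0,
}
INTEGRATED_LUMINOSITY_FB_INV = 1.1  # fb^-1, dataset size quoted in the text


def produced_events(cross_section_fb, luminosity_fb_inv=INTEGRATED_LUMINOSITY_FB_INV):
    """Produced events before branching fractions, acceptance and efficiency."""
    return cross_section_fb * luminosity_fb_inv


if __name__ == "__main__":
    for mode, sigma in CROSS_SECTIONS_FB.items():
        print(f"{mode:>20s}: {produced_events(sigma):8.0f} events produced")
```

Under these assumptions, only a few hundred WH and ZH events are produced in the whole dataset before any decay mode or selection is required, which is why background rejection dominates the analysis design.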
We summarize a search for the standard model Higgs boson in the pp → VH production mode with the CMS detector. The analysis is performed in a data sample corresponding to an integrated luminosity of 1.1 fb$^{-1}$, collected by the CMS experiment at a 7 TeV center-of-mass energy. The following final states are included: W(µν)H, W(eν)H, Z(µµ)H, Z(ee)H and Z(νν)H, all with the Higgs boson decaying to $b\bar{b}$ pairs. Backgrounds arise from the production of W and Z bosons in association with jets (all flavors), single-top (ST) and pair-produced top quarks, and dibosons (VV). Simulated samples of all backgrounds are used to guide the analysis optimization and to provide an initial evaluation of their contributions in the search region. For the main backgrounds, high-purity control regions are used to estimate their contribution in the signal region. An optimization of the event selection that depends on the Higgs mass is performed, and 95% C.L. upper limits on the pp → VH production cross section are obtained for Higgs masses between 110 and 135 GeV. These limits are based on the observed event count and background estimate in signal regions defined either in the invariant mass distribution of $H \to b\bar{b}$ candidates ("M(jj) or cut-and-count analysis") or in the output discriminant of a boosted decision tree algorithm ("BDT analysis") [5]. The latter enhances the statistical power of the analysis by making full use of the correlations between discriminating variables in signal and background events.
Contact e-mail: michele.de.gruttola@cern.ch
For lack of space, only the tables and plots for the 115 GeV mass hypothesis are presented here; the final limit plots cover the full mass range of the search.

Event selection
Candidate W → ℓν decays are identified by requiring the presence of a single isolated lepton and additional missing transverse energy (MET). Muons (electrons) are required to have a $p_T$ above 20 (30) GeV. Candidate Z → ℓℓ decays are reconstructed by combining isolated, opposite-charge pairs of electrons or muons and requiring the dilepton invariant mass to satisfy 75 < $m_{\ell\ell}$ < 105 GeV. For Z candidates the electron $p_T$ threshold is lowered to 20 GeV. The identification of Z → νν decays requires MET > 160 GeV (the high threshold is dictated by the trigger).
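The vector-boson requirements above can be sketched in Python as follows. This is a minimal illustration, not the analysis code: leptons are represented as hypothetical `(pt, eta, phi, charge)` tuples and treated as massless, and the isolation criteria and the MET requirement for W candidates are omitted because the text does not quote them.

```python
import math

# Thresholds quoted in the text, in GeV.
MUON_PT_MIN = 20.0
ELECTRON_PT_MIN_W = 30.0
ELECTRON_PT_MIN_Z = 20.0
Z_MASS_WINDOW = (75.0, 105.0)
MET_MIN_ZNN = 160.0


def dilepton_mass(lep1, lep2):
    """Invariant mass of two leptons, treated as massless (assumption)."""
    pt1, eta1, phi1 = lep1[:3]
    pt2, eta2, phi2 = lep2[:3]
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def passes_w_lepton(flavour, pt):
    """Lepton pT requirement for W -> l nu candidates ('mu' or 'e')."""
    if flavour not in ("mu", "e"):
        raise ValueError("flavour must be 'mu' or 'e'")
    threshold = MUON_PT_MIN if flavour == "mu" else ELECTRON_PT_MIN_W
    return pt > threshold


def passes_z_ll(flavour, lep1, lep2):
    """Opposite-charge, same-flavour pair inside the Z mass window."""
    if flavour not in ("mu", "e"):
        raise ValueError("flavour must be 'mu' or 'e'")
    threshold = MUON_PT_MIN if flavour == "mu" else ELECTRON_PT_MIN_Z
    if lep1[3] * lep2[3] >= 0:  # require opposite charges
        return False
    if lep1[0] <= threshold or lep2[0] <= threshold:
        return False
    low, high = Z_MASS_WINDOW
    return low < dilepton_mass(lep1, lep2) < high


def passes_z_nunu(met):
    """MET requirement for Z -> nu nu candidates."""
    return met > MET_MIN_ZNN
```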
The $H \to b\bar{b}$ decay is reconstructed by requiring two central (|η| < 2.5), b-tagged jets above a minimum $p_T$ threshold. If more than two such jets are found in the event, the pair with the highest sum of the b-tag outputs is chosen (except in the WH analyses, where the $t\bar{t}$ background is larger and the pair of jets with the highest total $p_T$ is chosen). These choices are found to yield higher efficiency and better rejection of wrong combinations in signal events than simply selecting the two highest-$p_T$ jets in the event. After b-tagging, the fraction of $H \to b\bar{b}$ candidates that contain the two b-jets from the Higgs decay is near unity.
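The pair-selection rule can be illustrated with the Python sketch below. Several details are assumptions made for illustration: jets are plain dictionaries with `pt`, `eta`, `phi` and `btag` keys, the b-tag requirement is taken to be applied upstream, "total $p_T$" is interpreted as the $p_T$ of the vector sum of the two jets, and `pt_min=20.0` is a placeholder because the text does not quote the jet threshold.

```python
import math
from itertools import combinations

ETA_MAX = 2.5  # central-jet requirement quoted in the text


def dijet_pt(jet_a, jet_b):
    """Transverse momentum of the vector sum of two jets."""
    px = jet_a["pt"] * math.cos(jet_a["phi"]) + jet_b["pt"] * math.cos(jet_b["phi"])
    py = jet_a["pt"] * math.sin(jet_a["phi"]) + jet_b["pt"] * math.sin(jet_b["phi"])
    return math.hypot(px, py)


def select_higgs_candidate(jets, channel, pt_min=20.0):
    """Return the jet pair used as the H -> bb candidate, or None.

    pt_min is a placeholder value (assumption); the text gives no number.
    Jets are assumed to have passed the b-tag requirement already.
    """
    central = [j for j in jets if abs(j["eta"]) < ETA_MAX and j["pt"] > pt_min]
    pairs = list(combinations(central, 2))
    if not pairs:
        return None
    if channel == "WH":
        # WH analyses: pair with the highest dijet pT.
        return max(pairs, key=lambda pair: dijet_pt(pair[0], pair[1]))
    # Other channels: pair with the highest sum of b-tag outputs.
    return max(pairs, key=lambda pair: pair[0]["btag"] + pair[1]["btag"])
```

With three candidate jets the two rules can pick different pairs, which is the point of the channel-dependent choice described in the text.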

arXiv:1201.4611v1 [hep-ex] 22 Jan 2012
After b-tagging, the background from V+jets and dibosons is reduced significantly and becomes dominated by the sub-processes in which the two jets originate from real b quarks. Events with additional jets ($N_{aj}$) or additional leptons ($N_{al}$) are rejected to further reduce backgrounds from $t\bar{t}$ and WZ.
The topology of VH production is such that the W/Z and the Higgs boson recoil away from each other with significant $p_T$. Cuts on the azimuthal opening angle between the vector boson and the reconstructed momentum of the Higgs candidate, ∆φ(V, H), and on the $p_T$ of the V boson and of the b-tagged dijet pair achieve significant rejection for most background processes and improve the analysis reach.
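A minimal Python sketch of the back-to-back topology requirement is given below. The helper names and the idea of passing the thresholds as arguments are assumptions for illustration; the text does not quote the cut values, so none are hard-coded.

```python
import math


def delta_phi(phi_a, phi_b):
    """Azimuthal separation folded into the range [0, pi]."""
    dphi = math.fmod(abs(phi_a - phi_b), 2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_topology(phi_v, phi_h, pt_v, pt_h, dphi_min, pt_v_min, pt_h_min):
    """Require a boosted V and H candidate that recoil against each other.

    All thresholds are supplied by the caller (hypothetical values).
    """
    return (
        delta_phi(phi_v, phi_h) > dphi_min
        and pt_v > pt_v_min
        and pt_h > pt_h_min
    )
```

Folding the angle into [0, π] matters because azimuthal angles wrap around: two objects at φ = 3.0 and φ = −3.0 are nearly collinear, not back to back.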
For the Z → νν channel, QCD backgrounds are further reduced by a factor of ∼ 30 when requiring that the MET does not originate from mismeasured jets.
The training of the BDT is done with simulated samples for signal and background that pass a looser event selection than that of the M(jj) analyses. Several input variables were chosen by iterative optimization. These include the dijet invariant mass and transverse momentum, M(jj) and $p_T(jj)$; the V transverse momentum, $p_T(V)$; the b-tag value for each of the two jets; the azimuthal angle between the V and the dijet system, ∆φ(V, H); and the pseudorapidity separation between the two jets, ∆η(J1, J2). The BDT analysis was expected to improve the sensitivity with respect to the M(jj) analysis by about 10% in every channel.
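As an illustration of how the dijet input variables are built, the Python sketch below computes M(jj), $p_T(jj)$ and ∆η(J1, J2) from two jets given as `(pt, eta, phi, mass)` tuples. The function names and the tuple layout are assumptions made here; the BDT training itself is not shown.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pt, eta, phi, mass) to Cartesian (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet_features(jet1, jet2):
    """Dijet variables of the kind listed as BDT inputs in the text."""
    e1, px1, py1, pz1 = four_vector(*jet1)
    e2, px2, py2, pz2 = four_vector(*jet2)
    energy, px, py, pz = e1 + e2, px1 + px2, py1 + py2, pz1 + pz2
    m2 = energy * energy - px * px - py * py - pz * pz
    return {
        "m_jj": math.sqrt(max(m2, 0.0)),   # dijet invariant mass
        "pt_jj": math.hypot(px, py),       # dijet transverse momentum
        "deta_jj": abs(jet1[1] - jet2[1]), # pseudorapidity separation
    }
```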

Control regions
Appropriate control regions that are orthogonal to the signal region are identified in data and used to adjust the Monte Carlo estimates for the most important background processes: W + jets and Z + jets (with light- and heavy-flavor jets), $t\bar{t}$, and QCD multijet and heavy-quark production. Different control regions are defined for each of the search channels by changing the event selection in a way that enriches the content of each specific background. In all cases, control regions with purities ranging from about 20% to nearly 100% have been found. Discrepancies between the expected and observed yields in these control regions are used to obtain a scale factor by which the estimates from the simulation are adjusted. The background from these sources in the signal region is then estimated from the adjusted simulation samples, taking into account the associated systematic uncertainty. The precise construction of all the control regions is involved and outside the scope of this summary. The procedures applied include, for example: reversing the b-tagging requirements to enhance W + jets and Z + jets with light-flavor jets; enforcing a tighter b-tag requirement and requiring extra jets to enhance $t\bar{t}$; and requiring low "boost" to enhance V + $b\bar{b}$ over $t\bar{t}$. Table 1 lists the control regions and the corresponding purities and scale factors obtained.
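The text does not give the formula used for the scale factors, so the Python sketch below shows one common way to define them, under stated assumptions: the control region is dominated by a single target process, all other processes are taken from simulation unchanged, and only the Poisson uncertainty on the data count is propagated. The function names are introduced here for illustration.

```python
import math


def purity(n_target_mc, n_other_mc):
    """Fraction of the simulated control-region yield from the target process."""
    return n_target_mc / (n_target_mc + n_other_mc)


def scale_factor(n_data, n_target_mc, n_other_mc):
    """Data/simulation scale factor for the target process in a control region.

    Simplified: other processes are subtracted at their simulated yield and
    only the statistical uncertainty on the data count is propagated.
    """
    if n_target_mc <= 0:
        raise ValueError("target process must have a positive simulated yield")
    sf = (n_data - n_other_mc) / n_target_mc
    sf_err = math.sqrt(n_data) / n_target_mc
    return sf, sf_err


if __name__ == "__main__":
    # Invented yields, for illustration only.
    sf, err = scale_factor(n_data=110, n_target_mc=80.0, n_other_mc=20.0)
    print(f"purity = {purity(80.0, 20.0):.2f}, SF = {sf:.3f} +/- {err:.3f}")
```

The signal-region estimate for the target process is then its simulated yield multiplied by this factor. When several backgrounds contribute comparably to each control region, a simultaneous determination of all scale factors would be more appropriate than this one-at-a-time subtraction.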
The Z(νν)H channel is unique among the five modes analyzed in that it does not include charged leptons. An important check is to compare the observed pfMET distribution with the distribution predicted by simulation. To accomplish this, muons are removed from the Z → µµ + jets data sample, so that the Z boson recoil mimics the pfMET of a Z → νν decay. Reasonably pure samples of $t\bar{t}$ and W + jets events can be obtained by requiring at least one additional isolated lepton in the event, and then either requiring (for $t\bar{t}$) or vetoing (for W + jets) b-jets. Table 2 lists the control regions and the corresponding purities and scale factors obtained. The QCD background in the signal region is also estimated from data, using control regions at high and low values of two uncorrelated variables with significant discriminating power against QCD events. One is the azimuthal angle between the missing energy vector and the closest jet, ∆φ(pfMET, J), and the other is the sum of the CSV (combined secondary vertex) b-tag values of the two b-tagged jets. The signal region is at high values of both discriminants, while QCD populates regions with low values of either. The method predicts a negligible contamination from this background.
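An estimate built from high and low regions of two uncorrelated variables is commonly called an ABCD method; the text does not name it or give the formula, so the Python sketch below is an assumption about the general technique rather than the exact procedure used. Region A is the signal region (both variables high), and B, C and D are the three control regions.

```python
def abcd_estimate(n_b, n_c, n_d):
    """Predicted QCD yield in signal region A from three control regions.

    Regions (x = delta-phi(pfMET, J), y = sum of b-tag values):
      A: x high, y high (signal region)    B: x high, y low
      C: x low,  y high                    D: x low,  y low
    If x and y are uncorrelated for QCD, N_A / N_B = N_C / N_D.
    Counts should be QCD-only, i.e. after subtracting other processes.
    """
    if n_d <= 0:
        raise ValueError("region D must contain events")
    return n_b * n_c / n_d


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(abcd_estimate(n_b=10, n_c=6, n_d=120))
```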

Systematics
The following systematic uncertainties on the expected signal and background yields affect the upper limit. The values listed are an approximation of what is actually used in the limit calculation.
The total uncertainty on the signal prediction is taken to be 26% and 28% for ZH and WH production, respectively. Background uncertainties range from 12% to 20% depending on mode and mass point.
Experimental sources of systematic uncertainty are the b-tag efficiency (∼10%), the jet energy resolution (∼10%) and scale (∼1%), the machine luminosity (∼4.5%), and the trigger efficiency (∼2%). The signal cross section is affected by uncertainties in the electroweak corrections, which for a boost of ∼150 GeV are 5% for ZH and 10% for WH, and in the QCD corrections, where a comparison of NNLO and NLO calculations gives an estimated uncertainty of 10% for both ZH and WH.

Results
The final predicted numbers of events in the signal regions of the BDT and M(jj) analyses are determined with a mix of data-driven estimates based on the control regions and expectations from simulation. We summarize the final signal and background estimates in both sets of analyses, including the systematic uncertainties summarized in the previous section, and the expected and observed upper limits using 1.1 fb$^{-1}$ of integrated luminosity. We report in Tables 3 and 4 and Figures 1 and 2 the results for a single mass point, 115 GeV, while the final limit plots include mass points from 110 to 135 GeV.
Table 3. Predicted backgrounds and signal yields with total uncertainty, and the observed number of events, for the 115 GeV mass point in the five channels of the M(jj) analysis. The sliding windows on M(jj) for the 115 GeV mass-point search are also reported.

Upper Limits
Preliminary 95% C.L. upper limits on the Higgs production cross section in the VH mode with $H \to b\bar{b}$ were obtained from both the BDT and M(jj) analyses for a dataset corresponding to an integrated luminosity of 1.1 fb$^{-1}$. For the expected and observed limits, and the 1σ and 2σ bands, the CLs method currently recommended by the LHC Higgs Combination Group was employed [6]. The results of the five BDT analyses are combined to produce limits on Higgs production in the $b\bar{b}$ channel for assumed masses between 110 and 135 GeV. The identical procedure was applied to the results of the M(jj) analysis. Table 5 summarizes the resulting expected and observed 95% C.L. upper limits on the cross section, relative to the standard model cross section, for each of the mass points for the BDT and M(jj) analyses. The results are displayed separately in Fig. 3. The primary result is the one from the BDT analysis.
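To illustrate the idea behind a CLs upper limit, the Python sketch below treats a single-bin counting experiment with no systematic uncertainties. This is a deliberate simplification and not the full procedure of [6], which combines channels and includes nuisance parameters; the function names are introduced here. In this simplified case CLs is the ratio of the Poisson probabilities of observing at most `n_obs` events under the signal-plus-background and background-only hypotheses, and the limit is the signal strength at which CLs falls to 0.05.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, background, signal):
    """CLs = CL(s+b) / CL(b) for a single-bin counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def upper_limit_mu(n_obs, background, signal_sm, cl=0.95, tol=1e-6):
    """Signal strength (in units of the SM yield) at which CLs = 1 - cl."""
    if signal_sm <= 0:
        raise ValueError("signal_sm must be positive")
    alpha = 1.0 - cl
    lo, hi = 0.0, 1.0
    while cls(n_obs, background, hi * signal_sm) > alpha:
        hi *= 2.0  # expand until the upper edge is excluded
    while hi - lo > tol:  # bisection: CLs decreases monotonically with mu
        mid = 0.5 * (lo + hi)
        if cls(n_obs, background, mid * signal_sm) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A limit expressed this way is directly comparable in meaning, though not in rigor, to the quoted "times the standard model expectation": a value of `upper_limit_mu` above 1 means the standard model signal is not yet excluded.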