Searches for the Higgs-like boson decaying into bottom quarks in the WH channel

The most important finding of the LHC so far was the discovery of the Higgs-like boson at 125 GeV in 2012. We present the most recent results of the search for the Higgs-like boson decaying into bottom quarks when produced in association with a W boson. Only events in which the leptonically decaying W boson and the Higgs boson possess large transverse momenta are selected. The full proton-proton collision data set recorded by the CMS detector in 2011 and 2012 at 7 and 8 TeV, respectively, corresponding to an integrated luminosity of 24 fb⁻¹, is used for the search.


Introduction
In 2012 the ATLAS and CMS collaborations reported the discovery of a new boson with a mass of about 125 GeV [1,2]. The discovery was mainly driven by the decay channels into photon, W boson or Z boson pairs. So far, all properties of the new particle are in good agreement with those predicted for the standard model Higgs boson. If the new particle is the Higgs boson, it decays predominantly into a bottom quark pair. The observation of this decay channel could establish the interaction of the Higgs boson with fermions and offers a direct test of its coupling to quarks. However, due to the huge multijet background, a Higgs boson signal can hardly be extracted in the inclusive di-b-quark final state. Requiring in addition a specific production mechanism for the Higgs boson significantly improves the ratio of expected signal to background events. The presented analysis focusses on the VH channel, where the Higgs boson is produced in association with a leptonically decaying W or Z boson. Triggering on the additional leptons (or missing energy) in the final state suppresses a significant amount of background, making the search possible.

Analysis strategy
In this note the search for the process H → bb is presented for the three channels W(eν)H, W(µν)H and W(τν)H using proton-proton collision data recorded by the CMS detector [3]. Figure 1 shows a representative Feynman diagram of the process. For the W(µν)H and W(eν)H channels, data at center-of-mass energies of 7 TeV and 8 TeV are analyzed, corresponding to integrated luminosities of 5.0 fb⁻¹ and 19.0 fb⁻¹, respectively. For the W(τν)H final state, only 8 TeV data with an integrated luminosity of 18.3 fb⁻¹ are included in the search. In the following sections selected features of the analysis are highlighted. For a more detailed description see [4].

Event selection
For every event the dijet system with the highest transverse momentum is chosen to form the Higgs boson candidate. One major challenge of this analysis is to handle the copious backgrounds, namely W boson production in association with additional jets, and top quark pair, single top, diboson and multijet production. To reduce the background event rates in the signal region, several selection criteria are applied. We select only events in which the Higgs and the W boson travel back-to-back in the transverse plane and possess high transverse momenta (p_T(H) and p_T(W) > 100 GeV). In addition, an isolated lepton is required. To ensure that the two jets of the Higgs boson candidate (Higgs jets) most likely stem from b quarks, the multivariate b-tagging algorithm "Combined Secondary Vertex" (CSV) is used. For the electron and muon channels three analysis bins are chosen, corresponding to different ranges of the W boson's transverse momentum (low, intermediate and high p_T(W) bin). Because fewer data events remain after the selection in the W(τν)H channel, only one category is used there. The detailed selection criteria for the different bins can be found in Table 1.

Table 2. Data/MC scale factors in 8 TeV data in the different analysis bins including the statistical uncertainty from the fit and a systematic uncertainty accounting for possible data/MC shape differences in the discriminating variables.
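The candidate choice and the boosted, back-to-back requirement of the event selection can be illustrated with a minimal sketch. Only the 100 GeV threshold is taken from the text; the function names, the (p_T, φ) jet representation and the Δφ threshold of 3.0 are assumptions made for illustration, and the lepton isolation and b-tagging requirements are omitted.

```python
import math
from itertools import combinations


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def dijet_pt_phi(jet_a, jet_b):
    """Vector sum of two jets in the transverse plane; jets are (pt, phi) tuples."""
    px = jet_a[0] * math.cos(jet_a[1]) + jet_b[0] * math.cos(jet_b[1])
    py = jet_a[0] * math.sin(jet_a[1]) + jet_b[0] * math.sin(jet_b[1])
    return math.hypot(px, py), math.atan2(py, px)


def select_higgs_candidate(jets):
    """Return the jet pair whose vector-summed transverse momentum is largest."""
    return max(combinations(jets, 2), key=lambda pair: dijet_pt_phi(*pair)[0])


def passes_boost_selection(jets, w_pt, w_phi, pt_min=100.0, dphi_min=3.0):
    """Require boosted H and W candidates that are back-to-back.

    pt_min (GeV) follows the text; dphi_min is a hypothetical value.
    """
    if len(jets) < 2:
        return False
    h_pt, h_phi = dijet_pt_phi(*select_higgs_candidate(jets))
    return h_pt > pt_min and w_pt > pt_min and delta_phi(h_phi, w_phi) > dphi_min
```

In this sketch an event with jets of 80 GeV and 60 GeV close in azimuth, recoiling against a 130 GeV W boson candidate, passes, while the same event with a 90 GeV W boson candidate does not.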

Background control regions
By introducing slightly different cuts, control regions (CR) enriched in the main background processes are defined orthogonally to the previously defined signal region. A tt-enhanced control region is obtained by requiring additional jets in the event. Inverting the b-tagging requirements for the Higgs jets yields the W+light-enhanced CR. Further, the W+heavy control region is obtained by applying a mass veto in the range of the Higgs boson mass. The three control regions are employed to correct the simulated yield estimates for several of the main backgrounds, namely top quark pair production and W boson production in association with zero, one and two b jets. This is achieved by a simultaneous fit to data in discriminating variables, performed separately for each analysis bin. For the tt and W+light control regions the invariant dijet mass is used, and for the W+heavy CR the CSV output of the Higgs jet with the second-highest transverse momentum. The resulting scale factors can be found in Table 2. For W+1b the fitted estimate is about two times larger than the prediction from simulation. This discrepancy is also found by other studies and arises from mismodeling in the generator parton shower when a gluon splits into a bottom quark pair. All other scale factors are compatible with unity within their uncertainties. The control regions are also used to validate variables important in the analysis. Overall, good agreement between data and simulation is found.
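The idea behind a data/MC scale factor can be illustrated with a strongly simplified sketch. It is not the simultaneous multi-background fit used in the analysis: it assumes a single control region, a single free scale factor for one background template, all other contributions held fixed, and a least-squares fit with the per-bin variance approximated by the observed count. The function name and arguments are hypothetical.

```python
def fit_scale_factor(data, target_mc, other_mc):
    """Least-squares estimate of one scale factor s for a binned template.

    Minimizes sum_i (d_i - s * m_i - b_i)^2 / d_i over s, where d_i are the
    observed counts, m_i the simulated yields of the background being scaled
    and b_i the fixed sum of all other simulated contributions.
    Bins without data are skipped because their variance estimate is zero.
    """
    numerator = 0.0
    denominator = 0.0
    for d, m, b in zip(data, target_mc, other_mc):
        if d <= 0:
            continue
        numerator += (d - b) * m / d
        denominator += m * m / d
    if denominator == 0.0:
        raise ValueError("no usable bins to constrain the scale factor")
    return numerator / denominator
```

A fitted value close to one means that the simulation already describes the yield in the control region, while a value near two corresponds to the situation reported for W+1b.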

Jet energy regression
Reconstructed jets stemming from b quarks usually have a worse energy resolution than light-flavor and gluon-induced jets. This is because in 20% of all cases neutrinos are present in the B hadron decay, leading to missing energy within the jet. A regression technique is used to compute correction factors for individual b jets. We train a dedicated Boosted Decision Tree (BDT) algorithm on the true transverse momenta of b quarks in simulated signal events using jet properties, b-tag and soft lepton information as input variables. Applying the resulting regression weights not only corrects the transverse momenta and improves the resolution of the b jets, it also improves the resolution of the reconstructed dijet mass, as shown in Figure 2(a), and hence the search sensitivity.
All input variables of the regression are validated in the control regions. Additionally, the regression performance is confirmed in the invariant mass distribution of hadronically decaying top quarks in a tt sample, as shown in Figure 2(b).
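How per-jet corrections propagate to the dijet mass can be made explicit with a short sketch. It assumes that the regression output is applied as a multiplicative factor on the jet transverse momentum and that the jets can be treated as massless; both are simplifications made here for illustration, and the function names are hypothetical.

```python
import math


def dijet_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two jets in the massless approximation:
    m^2 = 2 * pt1 * pt2 * (cosh(eta1 - eta2) - cos(phi1 - phi2))."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def corrected_dijet_mass(jet1, jet2, corr1, corr2):
    """Dijet mass after scaling each jet pT by its (assumed) correction factor.

    Jets are (pt, eta, phi) tuples; the directions are left unchanged.
    """
    pt1, eta1, phi1 = jet1
    pt2, eta2, phi2 = jet2
    return dijet_mass(pt1 * corr1, eta1, phi1, pt2 * corr2, eta2, phi2)
```

Under these assumptions the corrected mass equals the uncorrected one multiplied by the square root of the product of the two correction factors, so jet-by-jet corrections that restore the energy carried away by neutrinos shift each event towards the true resonance mass and narrow the peak.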

Signal extraction
To discriminate signal from background events a multivariate analysis is performed. For all channels, boosted decision trees are trained against all occurring backgrounds, using as inputs a large set of kinematic variables of the Higgs and W boson, b-tag information of the two Higgs jets and features of the additional jets in the events. For the W(τν)H channel the resulting BDT distribution is used as the final discriminant. For the electron and muon channels additional BDTs are trained to separate signal events individually from tt, W+light and diboson events. By successively applying cuts on these outputs the events are categorized into four regions. For all events the nominal BDT output is transferred to the corresponding category, leading to the final discriminants. In Figure 3 the final discriminant is shown for the high p_T bin in the W(µν)H channel. One can clearly identify the four categories, where the three left ones are tt-like, W+jets-like and diboson-like regions and the rightmost one is signal enhanced. This categorization of events helps to increase the search sensitivity. In addition to the multivariate approach, a simpler cut-based analysis is performed using the dijet mass spectrum. This analysis has a lower search sensitivity than the BDT analysis and serves as a cross check. The combined invariant mass distribution for the WH channel is shown in Figure 4. In the following only results from the BDT analyses are presented.
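The construction of the four-region final discriminant can be sketched as follows. The order of the regions follows the text; the cut values, the convention that a low score of a dedicated BDT marks an event as background-like, the assumed output range of [-1, 1] and the mapping onto the interval [0, 4) are assumptions made for illustration, and the function name is hypothetical.

```python
def final_discriminant(nominal, bdt_tt, bdt_wlight, bdt_vv, cuts=(0.0, 0.0, 0.0)):
    """Map an event onto [0, 4): the integer part is the category,
    the fractional part is the nominal BDT output rescaled from [-1, 1].

    Categories: 0 = tt-like, 1 = W+jets-like, 2 = diboson-like,
    3 = signal enhanced. The cut values are hypothetical.
    """
    if bdt_tt < cuts[0]:
        category = 0
    elif bdt_wlight < cuts[1]:
        category = 1
    elif bdt_vv < cuts[2]:
        category = 2
    else:
        category = 3
    # Rescale the nominal output to [0, 1) so that categories do not overlap.
    fraction = min(max((nominal + 1.0) / 2.0, 0.0), 1.0 - 1e-9)
    return category + fraction
```

Because the cuts are applied successively, every event falls into exactly one region, and the shape of the nominal BDT output is preserved within each region.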

Results
The final results are calculated using a simultaneous fit to the final discriminants in each channel. In Figure 5 the resulting upper limits for several mass hypotheses are presented for the combined W(eν)H, W(µν)H and W(τν)H channels. The plot shows an excess between one and two standard deviations over the whole mass range. For a Higgs boson mass of 125 GeV we report a local significance in the WH channel of 1.0σ, where 1.0σ was expected if a Higgs boson was present. The fitted signal strength is µ = 1.1. In Figure 6 the signal strength for the W(ℓν)H process is shown together with the Z(νν)H and Z(ℓℓ)H channels [4]. We find good compatibility between the signal strengths of all channels. For the combined VH channel a local significance of 2.1σ is found for a 125 GeV Higgs boson, with a fitted signal strength of µ = 1.0 ± 0.5 [4].