Combined search for the standard model Higgs boson with the D0 experiment

e-mail: emilien.chapon@cern.ch

We present searches for the standard model Higgs boson using up to 9.7 fb⁻¹ of pp̄ collisions collected with the D0 detector at Fermilab. The analyses cover a series of distinct final states and are sensitive to Higgs boson masses (M_H) ranging from 90 GeV to 200 GeV. These analyses are combined and allow us to exclude a standard model Higgs boson at 95% confidence level (C.L.) in the ranges 90 GeV < M_H < 101 GeV and 157 GeV < M_H < 178 GeV, with an expected exclusion in the range 155 GeV < M_H < 175 GeV. An excess of data of about two Gaussian standard deviations is seen in the range 120 GeV < M_H < 145 GeV, consistent with the observation of a boson with a mass of 125 GeV at the Large Hadron Collider and with the evidence for a particle decaying to bb̄ at the Tevatron.

The simplest way to give mass to the W and Z electroweak vector bosons is via the well-established SU(2) × U(1) electroweak symmetry breaking mechanism. In the standard model (SM), one introduces a single elementary scalar field, a doublet of SU(2), which acquires a nonzero vacuum expectation value. After accounting for the longitudinal polarizations, i.e., for the masses of the electroweak vector bosons, one degree of freedom remains, which manifests itself as a single scalar particle, the Higgs boson. Its mass is a free parameter of the model but is constrained by direct searches at LEP [1] and the Tevatron [2]. Its properties are also consistent with the evidence for a new particle decaying to bb̄ at the Tevatron [3], and with the discovery of a new boson with a mass M_H ≈ 125 GeV at the LHC [4, 5].
The phenomenology of the SM Higgs boson is very rich: because it couples to all massive particles at tree level, it can be produced and can decay in a number of different ways. At the Tevatron, the dominant production modes are gluon fusion (GGF, gg → H), associated production with a vector boson (VH, qq′ → WH, ZH) and vector boson fusion (VBF, qq′ → qq′H). The dominant decay modes are H → bb̄ and H → W⁺W⁻, but some analyses are also sensitive to H → γγ, H → ZZ or H → τ⁺τ⁻. In addition, W and Z bosons and τ leptons further decay to particles long-lived enough to be detected (and/or to neutrinos), leading to a large range of final states to look for.
All analyses searching for a SM Higgs boson follow a similar scheme. They start with a first set of selection criteria (triggers, lepton quality and transverse momentum p_T, etc.) kept as loose as possible in order to maximize the acceptance. A number of multivariate analysis techniques are then used (e.g. Boosted Decision Trees, BDTs). These techniques combine several variables that discriminate between signal and background into a single discriminant, making the most of the correlations between the input variables. They can be trained against specific backgrounds, e.g. the tt̄ or multijet background.
Most of the time, the analyses are split into subcategories (depending, for instance, on the lepton flavor or on the number of jets in the event). This makes it possible to take advantage of the different signal and background compositions in each category, as well as the different signal-to-background ratios, and to better constrain the systematic uncertainties associated with specific background processes. The background modeling is constrained in control regions, and can further be checked with a diboson cross-section measurement (in the H → bb̄ and H → W⁺W⁻ analyses), as shown in Fig. 1 for the WZ + ZZ production cross-section measurement. Finally, a final discriminant is used to look for an excess in the data compared to the background-only expectation.
The analyses presented here use up to 9.7 fb⁻¹ of pp̄ collisions collected with the D0 detector at Fermilab. We will briefly describe the following analyses: WH → ℓνbb̄, ZH → ℓ⁺ℓ⁻bb̄, ZH → νν̄bb̄ and H → W⁺W⁻ → ℓνℓν. The following analyses also enter the D0 combination [6]: H + X → W⁺W⁻ → μ±τ_h∓ + ≤ 1 jet, H → W⁺W⁻ → ℓνqq̄, VH → eeμ/μμe + X, VH → eμ + X, VH → ℓνqq̄qq̄, VH → τ_hτ_hμ + X, H + X → ℓτ_h jj, and H → γγ.

H → bb̄
The Higgs boson predominantly decays to a pair of b quarks for M_H < 135 GeV. However, it is not possible to use the GGF production mode to look for the Higgs boson in this decay mode, because of the overwhelming multijet background at a hadron collider like the Tevatron, so the analyses focus on the VH production mode. It is crucial for these analyses to be built on an excellent b-tagging algorithm, to distinguish jets originating from a b quark (b-jets) from those originating from a light quark. Such an algorithm exploits the fact that B hadrons have a relatively long lifetime and travel in the detector before they decay. Variables gauging these characteristics are combined using a multivariate technique to maximize the tagging efficiency and minimize the fake rate. By adjusting the minimum requirement on the output of the b-tagger, a range of signal efficiencies and purities is achieved.
The WH → ℓ±νbb̄ analysis [7] is divided into categories depending on the lepton flavor (electron or muon) and on the number and quality of b-tagged jets. It also uses a BDT trained against the multijet background, whose output is used as an input to the final discriminants (one for each category of events).
The advantage of the ZH → ℓ⁺ℓ⁻bb̄ analysis [8] is that it provides a fully reconstructed final state, without any missing energy from neutrinos, as opposed to other Higgs analyses. This makes it possible to constrain the kinematics of the events using a kinematic fit. The first step in this analysis is to select a Z → ℓ⁺ℓ⁻ candidate with 70 GeV < M_ℓℓ < 110 GeV. A dedicated BDT is trained against the tt̄ background, which allows the analysis to be split into tt̄-depleted and tt̄-enriched regions. Finally, a final discriminant is trained for each event category.
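As an illustration of the Z → ℓ⁺ℓ⁻ candidate selection, the sketch below computes the invariant mass of two leptons from their transverse momenta, pseudorapidities and azimuthal angles, and applies the 70 GeV < M_ℓℓ < 110 GeV window quoted above. The function names, the massless-lepton approximation and the example kinematics are assumptions made for illustration; this is not the D0 analysis code.

```python
import math


def dilepton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass (GeV) of two leptons treated as massless.

    For massless particles, M^2 = 2 pT1 pT2 (cosh(d_eta) - cos(d_phi)).
    """
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def is_z_candidate(mass, low=70.0, high=110.0):
    """Mass window from the text: 70 GeV < M_ll < 110 GeV."""
    return low < mass < high


# Hypothetical event: two central, back-to-back leptons with pT = 45 GeV each.
m_ll = dilepton_mass(45.0, 0.0, 0.0, 45.0, 0.0, math.pi)
print(f"M_ll = {m_ll:.1f} GeV, Z candidate: {is_z_candidate(m_ll)}")
```

In this hypothetical configuration the pair reconstructs to 90 GeV and passes the window; a pair with a much lower mass would be rejected at this first step.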
The main challenge in the ZH → νν̄bb̄ analysis [9] is to model and reject the dominant multijet background, which is done by combining several tools. First, a cut is applied on a variable called E_T^miss significance, which gauges whether the missing transverse energy (E_T^miss) is likely to have been mismeasured in the event. To further reject the multijet background, a dedicated BDT is trained against this background and a requirement is placed on its output. Finally, the outputs L_b of the b-tagging discriminant for the two jets are summed to obtain the variable L_bb = L_b(jet₁) + L_b(jet₂). Events with a low L_bb are rejected, and the remaining events are categorized into medium and high L_bb regions. A final discriminant is trained in each of these two event categories.
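The L_bb categorization can be sketched as follows. The two threshold values and the example b-tagger outputs are hypothetical placeholders, since the text does not quote the actual boundaries or the range of L_b; only the structure (sum, reject low values, split the rest in two) follows the description above.

```python
# Hypothetical thresholds: the real analysis values are not given in the text.
REJECT_BELOW = 0.3
HIGH_ABOVE = 1.0


def l_bb(lb_jet1, lb_jet2):
    """Sum of the b-tagging discriminant outputs of the two jets."""
    return lb_jet1 + lb_jet2


def categorize(lbb, reject_below=REJECT_BELOW, high_above=HIGH_ABOVE):
    """Return 'rejected', 'medium' or 'high' for a given L_bb value."""
    if lbb < reject_below:
        return "rejected"
    if lbb >= high_above:
        return "high"
    return "medium"


# Hypothetical (L_b(jet1), L_b(jet2)) pairs for three events.
for lb1, lb2 in [(0.05, 0.10), (0.40, 0.30), (0.80, 0.70)]:
    value = l_bb(lb1, lb2)
    print(f"L_bb = {value:.2f} -> {categorize(value)}")
```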

H → W⁺W⁻ → ℓνℓν
The H → W⁺W⁻ → ℓνℓν analysis [11] is the most sensitive channel for M_H > 135 GeV, but it also has good sensitivity at lower masses. Even if the branching ratio of W⁺W⁻ → ℓνℓν is relatively low (∼ 6.4%) compared to the semi- or fully hadronic decay modes of a W boson pair, the dileptonic decay mode benefits from a clean experimental signature, with two high-p_T leptons (e⁺e⁻, μ⁺μ⁻ or e±μ∓) and large E_T^miss. In order to maximize the acceptance, no explicit cut on E_T^miss is placed. Instead, a BDT is trained in the e⁺e⁻ and μ⁺μ⁻ channels to reject the Z/γ* background, while the e±μ∓ channel places cuts on E_T^miss-related variables.
Then, the analysis is split into 0-jet, 1-jet and ≥ 2-jet multiplicity bins. Events are further categorized into WW-depleted and WW-enriched regions, using a BDT in the 0- and 1-jet bins (e⁺e⁻, μ⁺μ⁻) or using the lepton quality in the 0-jet bin (e±μ∓).
Finally, a final BDT is trained in each subsample against the remaining backgrounds. Angular variables such as the opening angle R(ℓℓ) = √(Δη² + Δφ²) between the two leptons are of primary importance, because we look for a standard model Higgs boson, which has spin 0. The final discriminant is used to look for any data excess in a signal-like region and to set limits on Higgs boson production.
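The opening angle between the two leptons can be computed as in the short sketch below. Wrapping Δφ into the interval (−π, π] before squaring is standard practice but is an assumption here, as the text does not spell it out; the function names and the example values are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def opening_angle(eta1, phi1, eta2, phi2):
    """R(ll) = sqrt(d_eta^2 + d_phi^2) between two leptons."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two hypothetical leptons close in phi across the 0 / 2*pi boundary.
r_ll = opening_angle(0.0, 0.1, 0.0, 2.0 * math.pi - 0.1)
print(f"R(ll) = {r_ll:.3f}")
```

Without the wrapping step, the two leptons in this example would appear almost back to back in φ instead of nearly collinear, which would distort exactly the kind of angular information the final BDT relies on.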

D0 combined limits on standard model Higgs boson production
The modified frequentist CL_s method [12] is used to gauge the compatibility of the data with the background-only hypothesis (B) or the signal-plus-background hypothesis (S+B) and to set limits on SM Higgs boson production, where the test statistic is a log-likelihood ratio (LLR) for the (B) and (S+B) hypotheses. The degrading effect of systematic uncertainties is reduced by fitting individual background contributions to the data by maximizing a profile likelihood function for the (B) and (S+B) hypotheses separately, appropriately taking into account all correlations between the systematic uncertainties [13].
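To make the CL_s construction concrete, the sketch below applies it to a single-bin counting experiment. This is a deliberate simplification and an assumption of this example: the analyses described here use binned final discriminants and profile the systematic uncertainties, none of which is reproduced. The event counts are hypothetical, and the convention CL_s = CL_{s+b}/CL_b, with each term defined as the probability of an outcome at least as background-like as the observed one, is the one assumed.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # P(N = 0)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k    # P(N = k) from P(N = k - 1)
        total += term
    return total


def llr(n, s, b):
    """LLR = -2 ln[L(S+B) / L(B)] for one Poisson counting bin."""
    return 2.0 * (s - n * math.log(1.0 + s / b))


def cls_counting(n_obs, s, b):
    """CL_s for one bin, ignoring systematic uncertainties.

    The LLR decreases monotonically with n, so outcomes at least as
    background-like as the data (LLR >= LLR_obs) are those with n <= n_obs.
    """
    cl_sb = poisson_cdf(n_obs, s + b)
    cl_b = poisson_cdf(n_obs, b)
    return cl_sb / cl_b


# Hypothetical numbers: 3 events observed, 3 expected from background,
# 6 expected from the signal hypothesis under test.
cls = cls_counting(n_obs=3, s=6.0, b=3.0)
print(f"LLR_obs = {llr(3, 6.0, 3.0):.2f}, CL_s = {cls:.4f}")
print("excluded at 95% C.L." if cls < 0.05 else "not excluded at 95% C.L.")
```

Dividing by CL_b protects against excluding a signal to which the search has little sensitivity when the data fluctuate below the background expectation.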
In order to achieve the best possible sensitivity to Higgs boson production, the results of the individual analyses are combined. This makes it possible to take advantage of the specific strengths of each analysis: for instance, the H → bb̄ channels are sensitive to the coupling of the Higgs boson to fermions and are more sensitive at low M_H, while the H → W⁺W⁻ channels are sensitive to the coupling of the Higgs boson to vector bosons and are more sensitive at high M_H. The value of the combined LLR as a function of the hypothetical Higgs boson mass M_H is shown in Fig. 4.
When combining all available analyses, the D0 collaboration is able to exclude a SM Higgs boson with a mass in the ranges 90 GeV < M_H < 101 GeV and 157 GeV < M_H < 178 GeV at 95% confidence level (C.L.) [6], while the expectation is to exclude the range 155 GeV < M_H < 175 GeV (see Fig. 4, right). A data excess over expectation of up to 2 standard deviations (s.d.) is however seen in the range 120 GeV < M_H < 145 GeV, as can be seen from the background-only p-value as a function of M_H (see Fig. 4). As a comparison, the combination with the CDF Tevatron experiment [2] excludes the ranges 90 GeV < M_H < 109 GeV and 149 GeV < M_H < 182 GeV, and there is a data excess corresponding to a local significance of 3.0 s.d. for M_H = 125 GeV. This excess is compatible with the observation of a new boson by the ATLAS and CMS collaborations [4, 5].
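Excesses are quoted here in Gaussian standard deviations, while Fig. 4 displays a background-only p-value. The conversion between the two can be sketched as below, assuming the one-sided convention that is customary in high-energy physics; the text does not state the convention explicitly, and the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance_to_p_value(z):
    """One-sided Gaussian tail probability for an excess of z s.d."""
    return 1.0 - _STD_NORMAL.cdf(z)


def p_value_to_significance(p):
    """Number of s.d. corresponding to a one-sided p-value."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


for z in (2.0, 3.0):
    print(f"{z:.1f} s.d. -> p = {significance_to_p_value(z):.5f}")
```

Under this convention, 2 s.d. corresponds to a p-value of about 2.3% and 3 s.d. to about 0.13%.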
We want to study this excess in more detail. First, we compute separate limits combining only the analyses sensitive to H → bb̄ or to H → W⁺W⁻ (see Fig. 5), and we find that both the H → bb̄ and the H → W⁺W⁻ combined limits show a broad excess of up to 1.5 s.d. We also compute the best-fit signal strength, (σ_H × B)/(σ_H × B)_SM, for a hypothetical Higgs boson with a mass M_H = 125 GeV, as discovered at the LHC (see Fig. 5), for the combination as well as for the different Higgs boson decay channels considered. We find that each of the four main Higgs boson decay modes, H → bb̄, H → W⁺W⁻, H → τ⁺τ⁻ and H → γγ, contributes to the observed excess.
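The idea of a best-fit signal strength can be illustrated with a minimal binned Poisson likelihood, in which the expected yield in each bin is μ·s + b and μ = (σ_H × B)/(σ_H × B)_SM. This sketch is an assumption-laden toy: it uses a simple grid scan, ignores all systematic uncertainties, and the yields are invented so that the fit should return a value close to μ = 1.5.

```python
import math


def nll(mu, data, signal, background):
    """Negative log-likelihood (up to a constant) for binned Poisson counts."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def best_fit_mu(data, signal, background, lo=0.0, hi=5.0, steps=5000):
    """Grid scan of mu in [lo, hi]; returns the value minimizing the NLL."""
    best_mu, best_nll = lo, float("inf")
    for i in range(steps + 1):
        mu = lo + (hi - lo) * i / steps
        value = nll(mu, data, signal, background)
        if value < best_nll:
            best_mu, best_nll = mu, value
    return best_mu


# Hypothetical yields: SM signal and background per bin, and pseudo-data
# constructed as background + 1.5 x signal.
signal = [2.0, 5.0, 3.0]
background = [50.0, 30.0, 10.0]
data = [53.0, 37.5, 14.5]
mu_hat = best_fit_mu(data, signal, background)
print(f"best-fit signal strength = {mu_hat:.3f}")
```

A value of μ compatible with 1 indicates agreement with the SM prediction, while μ compatible with 0 indicates agreement with the background-only hypothesis.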

Conclusion
We have reported selected searches for the SM Higgs boson from the D0 collaboration, focusing on the H → bb̄ and H → W⁺W⁻ decay channels. These analyses are complementary, both because they probe different Higgs boson mass ranges and because they are sensitive to different classes of couplings of the Higgs boson. We have also reported the results of the combined search for a Higgs boson by the D0 collaboration (which includes further analyses, in particular in the H → τ⁺τ⁻ and H → γγ channels), which excludes a SM Higgs boson with a mass in the ranges 90 GeV < M_H < 101 GeV and 157 GeV < M_H < 178 GeV at 95% C.L. (expected: 155 GeV < M_H < 175 GeV). An excess of data of up to 2 s.d. over the background expectation is seen in the range 120 GeV < M_H < 145 GeV, compatible with the evidence for a new particle decaying to bb̄ from the Tevatron and the discovery of a new boson from the LHC. The analyses combined here also provide inputs to the overall Tevatron combination, which shows evidence for a SM-like Higgs boson [2].

Figure 1. Left: Background-subtracted distribution of the reconstructed dijet mass m_jj for the WH → ℓνbb̄, ZH → νν̄bb̄, and ZH → ℓ⁺ℓ⁻bb̄ searches, used in the measurement of the WZ + ZZ production cross section. Middle: Distribution of one of the BDTs used to discriminate signal from the Z/γ* background in the H → W⁺W⁻ → e⁺νe⁻ν analysis. Right: Output of one of the final discriminants used in the WH → ℓνbb̄ analysis to look for a data excess over the background-only expectation and to set limits on SM Higgs boson production.

Figure 2. Invariant mass of the two jets in the ZH → ℓ⁺ℓ⁻bb̄ analysis, before the b-tagging requirement.

Figure 3. Invariant mass of the two leptons in the H → W⁺W⁻ → ℓνℓν analysis, before rejection of the Z/γ* background.

Figure 4. Left: Observed and expected LLR as a function of the Higgs boson mass. Middle: Distribution of CL_b (background-only p-value) as a function of the Higgs boson mass. Right: 95% C.L. limits on SM Higgs boson production. On each of the three plots, the observed value in data is shown as a plain black line, the expectation from background only is shown as a dotted black line, and the expectation from a standard model Higgs boson with M_H = 125 GeV is shown as a dashed blue line.

Figure 5.