Searches for diboson production at the Tevatron in final states containing heavy-flavor jets

Recent searches performed by the CDF and D0 collaborations at the Tevatron for diboson production in final states containing heavy-flavor jets are reported. The searches for WZ and ZZ can be regarded as the ultimate benchmark for the corresponding searches for a low-mass Higgs boson in the WH and ZH final states. Using the exact same techniques as for those Higgs boson searches, the D0 collaboration measured a cross section for WZ/ZZ production of 1.13 ± 0.36 times its expectation in the standard model, with a diboson signal significance of 3.3 standard deviations (2.9 expected).


Introduction
Diboson production in pp̄ (or pp) collisions at high energy is a topic of interest in its own right. Since new physics could manifest itself differently in different final states, it is important that corresponding analyses be performed. At the Tevatron, diboson production was observed first in fully leptonic and later in semileptonic final states [1]. With the large integrated luminosity now available, together with improved analysis techniques, it has become possible to study diboson production in final states containing heavy-flavor (HF) jets. An additional point of interest is that such final states are encountered almost identically for a low-mass Higgs boson produced in association with a vector boson. It is this aspect of diboson production at the Tevatron that is the focus of this presentation.
e-mail: grivaz@lal.in2p3.fr
The searches for a low-mass Higgs boson in pp̄ collisions have already been reviewed at this conference [2]. Therefore, only a brief summary of the methodology is given here. The relevant production modes are W(→ ℓν)H, Z(→ ℓℓ)H, and Z(→ νν)H, with ℓ an electron or a muon and H → bb. After a trigger selection based on isolated leptons and/or missing transverse energy (as expected from neutrinos in the final state), simple kinematic and lepton-identification criteria are applied to reduce the initial sample to a manageable level without significant signal loss. The bulk of the remaining background, which originates from multijet (MJ) production, is removed either by further kinematic criteria or, more frequently in recent searches, by a multivariate analysis technique (MVA). The sample is then enriched in HF jets through the application of a "b-tagging" algorithm, and is often divided into sub-samples according to the number of b-tagged jets and to the level of b purity of those tagged jets. Final discriminants are constructed for each of these sub-samples, taking advantage of the kinematic differences between the signal and the remaining backgrounds, most importantly (W/Z)bb, which is irreducible. A variety of MVAs are used, chosen based on some optimization of the search sensitivity but also on the analyzers' expertise: Artificial Neural Networks (NN), Support Vector Machines (SVM), Boosted Decision Trees (BDT), and Random Forests (RF). The final discriminants are subjected to a statistical analysis based on the Log-Likelihood Ratio (LLR) between the background-only and the signal+background hypotheses, with marginalization over all nuisance parameter priors.
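The LLR test statistic mentioned above can be illustrated with a minimal sketch. It assumes independent Poisson-distributed bins and ignores nuisance parameters altogether, so it is a simplification and not the collaborations' actual procedure; the function name `binned_llr` and the three-bin inputs are hypothetical.

```python
import math


def binned_llr(observed, signal, background):
    """LLR = -2 ln[L(s+b) / L(b)] for independent Poisson bins.

    Simplified illustration: no nuisance parameters, no marginalization.
    For Poisson bins the ratio reduces to 2 * sum(s_i - n_i * ln(1 + s_i / b_i)).
    """
    llr = 0.0
    for n, s, b in zip(observed, signal, background):
        if b <= 0.0:
            raise ValueError("each bin needs a positive background expectation")
        llr += 2.0 * (s - n * math.log1p(s / b))
    return llr


# Hypothetical three-bin final discriminant (illustrative numbers only).
background = [100.0, 40.0, 10.0]
signal = [1.0, 2.0, 3.0]

llr_bkg_like = binned_llr([100, 40, 10], signal, background)  # data match b
llr_sig_like = binned_llr([101, 42, 13], signal, background)  # data match s+b
```

With this sign convention, signal-like data give a negative LLR and background-like data a positive one, which is why the observed value is compared with the expected distributions under both hypotheses.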
This whole sequence of analysis procedures, designed to reach sensitivity to a tiny Higgs boson signal in the presence of large backgrounds, and therefore apparently quite involved, would gain considerably in reliability from being validated through the observation of a similar, but known signal. The search for diboson production with Z → bb in the final state fulfills these requirements. Consider a Higgs boson with a mass of 115 GeV. In the final states of interest, the production cross sections in pp̄ collisions at 1.96 TeV are:
- 27 fb for WH → ℓνbb with ℓ = e or µ,
- 5 fb for ZH → ℓℓbb with ℓℓ = ee or µµ,
- 15 fb for ZH → ννbb,
for a total of 46 fb. Replacing H → bb by Z → bb, the corresponding cross sections are:
- 105 fb for WZ → ℓνbb with ℓ = e or µ,
- 24 fb for ZZ → ℓℓbb with ℓℓ = ee or µµ,
- 73 fb for ZZ → ννbb,
for a total of 202 fb. These cross sections for diboson production are seen to be about 4.5 times larger than the corresponding ones for a 115 GeV mass Higgs boson. It should however be kept in mind that the dijet mass resolution of the CDF and D0 detectors is not sufficient to separate the W and Z dijet mass peaks, so that WW → ℓνcs is a significant resonant background. Furthermore, the nonresonant (W/Z)bb and (W/Z)cc backgrounds and their related systematic uncertainties are substantially larger than for a Higgs boson with a mass 25 GeV above that of the Z boson. On the other hand, there is relatively more signal contribution from Z → cc than from H → cc. Altogether, the observation of (W/Z)(Z → bb), using the same techniques as in the searches for a low-mass Higgs boson, can be considered as the ultimate benchmark for those searches at the Tevatron.
A total of eight conference notes or publications relevant to diboson production at the Tevatron with HF jets in the final state were made available for this conference by the CDF and D0 collaborations. The analyses use integrated luminosities ranging from 4.3 to 8.4 fb⁻¹.
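The comparison of cross sections quoted above can be checked with a few lines of arithmetic. This is only a sketch using the rounded values given in the text; the dictionary keys and variable names are illustrative. Because the inputs are rounded, the sum of the three Higgs channels (47 fb) differs by 1 fb from the quoted total of 46 fb.

```python
# Rounded cross sections in fb, as quoted in the text (m_H = 115 GeV).
higgs_fb = {"lnubb": 27.0, "llbb": 5.0, "nunubb": 15.0}
diboson_fb = {"lnubb": 105.0, "llbb": 24.0, "nunubb": 73.0}

higgs_total_quoted = 46.0                        # total stated in the text
higgs_sum_of_rounded = sum(higgs_fb.values())    # sum of the rounded inputs
diboson_total = sum(diboson_fb.values())

# Overall ratio, using the quoted Higgs total.
ratio_quoted = diboson_total / higgs_total_quoted

# Channel-by-channel ratios of diboson to Higgs cross sections.
per_channel = {key: diboson_fb[key] / higgs_fb[key] for key in higgs_fb}

for key, value in per_channel.items():
    print(f"{key}: diboson/Higgs = {value:.2f}")
print(f"total: {diboson_total:.0f} fb / {higgs_total_quoted:.0f} fb = {ratio_quoted:.2f}")
```

With these rounded inputs the overall ratio comes out near 4.4 and the per-channel ratios span roughly 3.9 to 4.9, in line with the "about 4.5" stated above.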
For reference, the cross sections used by the Tevatron Higgs Combination Working Group for inclusive diboson production are: 11.34 pb for WW, 3.22 pb for WZ, and 1.20 pb for ZZ; they were obtained with MCFM [3] at next-to-leading order.

W(W/Z) in ℓν+HF
The CDF collaboration performed a search for (W → ℓν)(W/Z) production, where the (W/Z) decays to HF jets [4]. This analysis uses an integrated luminosity of 7.5 fb⁻¹. The basic selection criteria are: a lepton (e or µ) with p_T > 20 GeV, missing E_T (MET) > 20 GeV, and exactly two jets with p_T > 20 GeV and |η| < 2. The bulk of the MJ background is rejected by an SVM, and the remaining MJ and W+jets normalizations are obtained from a template fit to the MET distribution. At least one jet is required to be b-tagged by a secondary vertex algorithm. The final discriminants used are the dijet mass in the 1-tag and 2-tag channels, of which an example is shown in Fig. 1. A signal cross section of 1.1 +0.3/−0.4 times its standard model (SM) expectation is obtained, holding the WW/WZ production ratio to its SM value. The significance of the diboson signal is 3.0 standard deviations (s.d.) from the background-only hypothesis (3.0 expected). In this analysis, however, most of the sensitivity comes from WW production in the 1-tag channel, with W → cs.
The D0 collaboration performed a search for this same final state [5], using an integrated luminosity of 4.3 fb⁻¹. The basic selection criteria are: an electron (muon) with p_T > 20 (15) GeV, MET > 20 GeV, and at least two jets with p_T > 20 GeV and |η| < 2.5. To reject most of the MJ background, a so-called "triangle cut" is applied: [...] chosen to define 0-, 1-, and 2-tag samples, where the two jets with largest p_T (leading jets) are considered for b tagging. But the actual values of the tightest operating point passed by each jet are used among the 15 inputs of the final discriminant, a random forest. The RF outputs in the three b-tag channels are shown in Fig. 2, from which a W(W/Z) production cross section of 1.2 ± 0.2 times its SM expectation is obtained, holding the WW/WZ production ratio to its SM value. The signal significance is 8.0 s.d. (6.0 expected). More relevant for the purpose of this presentation is the result obtained if the WW production cross section is constrained to its SM value within its uncertainty of 7%: a cross section of 1.3 ± 0.6 times its SM expectation for WZ production, and a significance of 2.2 s.d. (1.2 expected) for the WZ signal.

Z(W/Z) in ℓℓ+HF
The CDF collaboration performed a search for (Z → ℓℓ)(W/Z) production, where the (W/Z) decays to HF jets [6]. This analysis uses an integrated luminosity of 6.6 fb⁻¹. The basic selection criteria are: an ee or µµ lepton pair with p_T(ℓ) > 20 GeV, 76 < m_ℓℓ < 106 GeV, and at least two jets with p_T > 20 GeV and |η| < 2. The originality of this analysis is that it uses, after b-tagging to separate heavy and light flavor jets, a new quark-gluon discriminant to split the light-flavor sample into a quark-rich and a gluon-rich sample. The final discriminant is the dijet mass in each of the three sub-samples so defined. The analysis however does not yet have enough sensitivity for the observation of a diboson signal, and only a 95% C.L. upper limit of 1.3 times the SM expectation for Z(W/Z) production has been set (2.3 expected).

(W/Z)Z in MET+HF
The CDF collaboration performed a search for (W → ℓν / Z → νν)Z production, where the Z decays to HF jets [7]. This analysis uses an integrated luminosity of 5.2 fb⁻¹. The basic selection criteria are: MET > 50 GeV, supplemented with a requirement on the missing E_T significance, at most one lepton (e or µ), and at least two jets with p_T > 20 GeV and |η| < 2. The bulk of the MJ background is rejected by the requirement that the azimuthal angle between the missing E_T and any jet be larger than 0.4 radians. The shape of the remaining MJ background is taken from events in which the azimuthal angle between the missing E_T and the missing p_T, the former calculated from calorimeter information and the latter from charged particle tracks, is larger than one radian. The remaining sample is divided into two sub-samples, one with zero or one b-tagged jet, and the other with two b-tagged jets. The final discriminant is the dijet mass in each of those two sub-samples, as shown in Fig. 3. The results are obtained with the WW production cross section fixed to its SM value, within its uncertainty, and with the WZ/ZZ ratio fixed to its SM value. The normalization of the (W/Z)+jets background is allowed to float independently in the two sub-samples. A signal cross section of 1.1 +0.7/−0.6 times its standard model expectation is obtained. The significance of the diboson signal is 1.9 s.d. (1.7 expected). Because this analysis accepts events with zero or one lepton, both WZ and ZZ production contribute to the sensitivity.
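The two azimuthal-angle requirements described above can be sketched as follows. This is a minimal illustration, not the CDF code: the function names are hypothetical, angles are assumed to be in radians, and only the two thresholds quoted in the text (0.4 and 1 radian) are taken from the source.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation between two directions, wrapped into [0, pi]."""
    dphi = math.fmod(abs(phi1 - phi2), 2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_met_jet_separation(met_phi, jet_phis, min_dphi=0.4):
    """True if the missing E_T is separated from every jet by more than min_dphi."""
    return all(delta_phi(met_phi, phi) > min_dphi for phi in jet_phis)


def in_mj_shape_sample(calo_met_phi, track_mpt_phi, min_dphi=1.0):
    """True if calorimeter missing E_T and track-based missing p_T are misaligned.

    Events of this kind are the ones from which the MJ shape is taken.
    """
    return delta_phi(calo_met_phi, track_mpt_phi) > min_dphi


# Hypothetical event: missing E_T at phi = 0, two jets at phi = 1.0 and 2.5.
keep_event = passes_met_jet_separation(0.0, [1.0, 2.5])
```

The wrapping step matters because two directions at φ = 3.0 and φ = −3.0 are only about 0.28 radians apart, even though their raw difference is 6.0.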

The ultimate benchmark
The following mutually exclusive analyses by the D0 collaboration are exact copies of the corresponding searches for a low-mass Higgs boson. The only changes are the MVA trainings, in which the signal is now WZ/ZZ instead of WH/ZH, while WW remains a background.
In each of the analyses, fits of the nuisance parameters were performed to the final discriminant outputs in all sub-channels, with systematic uncertainties correlated across signal and backgrounds as appropriate. The main sources of uncertainty are: the ratio of heavy to light flavor production in (W/Z)+jets; the various object reconstruction and identification efficiencies; the jet energy calibration and resolution; the b-tagging efficiency and the rate of wrongly tagged light-flavor jets. Three kinds of fits to the data were performed: in one of them, the signal rate was also fitted, and the diboson production cross section was therefore measured; in the other two, the signal rate was set either to zero (background-only hypothesis) or to its SM value (signal+background hypothesis), from which the observed LLR was deduced. These fits were repeated on pseudo-experiments, and the fraction of background-only pseudo-experiments yielding a cross section value at least as large as the one observed (p-value) was used to assess the signal significance, next translated into Gaussian standard deviations. The consistency of the result with the standard model was determined in a similar way, using signal+background pseudo-experiments. Unless otherwise specified, the ratio of the WZ and ZZ production cross sections was held fixed to its SM value.
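The translation of a pseudo-experiment p-value into Gaussian standard deviations can be sketched as follows. This is a toy illustration under stated assumptions: a one-sided Gaussian convention is assumed, the function names are hypothetical, and the pseudo-experiments are simple unit-Gaussian draws, not the output of the fits described above.

```python
import random
from statistics import NormalDist


def p_value(pseudo_results, observed):
    """Fraction of pseudo-experiments with a result at least as large as observed."""
    n_above = sum(1 for x in pseudo_results if x >= observed)
    return n_above / len(pseudo_results)


def significance(p):
    """One-sided Gaussian equivalent of a p-value, in standard deviations."""
    if not 0.0 < p < 1.0:
        # p = 0 means too few pseudo-experiments to resolve the tail.
        raise ValueError("p-value must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(p)


# Toy background-only pseudo-experiments: unit Gaussian (an assumption made
# purely for illustration), compared with a hypothetical observation at 2.0.
rng = random.Random(12345)
toys = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
p = p_value(toys, 2.0)
z = significance(p)
print(f"p-value = {p:.4f}, significance = {z:.2f} s.d.")
```

The number of pseudo-experiments limits the smallest p-value that can be resolved, so a significance of several standard deviations requires a correspondingly large ensemble.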

WZ in ℓν+HF
The D0 collaboration performed this search in a data sample corresponding to an integrated luminosity of 7.5 fb⁻¹ [8]. An isolated lepton (e or µ), missing E_T, and two or three jets are required. The MJ background is rejected by a "triangle cut" in the muon channel and using a BDT in the electron channel. The sample is split into a 1-tag and a 2-tag sub-sample, based on the loosest b-tagging operating point as in the D0 analysis reported in Sec. 2.1. As in that same analysis, the remaining b-tagging information is used in the final discriminant, a BDT with 14 inputs. The BDT output in the 2-tag channel is shown in Fig. 4 (top). A signal cross section of 1.6 ± 0.8 times its standard model expectation is obtained, with a significance of 2.2 s.d. (1.4 expected). The observed LLR is compared to expected distributions in the background-only and signal+background hypotheses in Fig. 4 (bottom). The sensitivity of this search is dominated by WZ production.

ZZ in ℓℓ+HF
The D0 collaboration performed this search in a data sample corresponding to an integrated luminosity of 7.5 fb⁻¹ [9]. An electron or muon pair in a Z-mass window, and two or three jets are required. The sample is split into a 1-tag and a 2-tag sub-sample: in the 2-tag sample, two jets are b-tagged, one tightly and one loosely; in the 1-tag sample, one of the jets is tightly b-tagged, with no other jet passing the loose b-tag requirement. To take advantage of the absence of missing E_T in the signal, a kinematic fit is performed, which significantly improves the dijet mass resolution. The final discriminant is an RF with 19 inputs, the output of which is shown for the 2-tag channel in Fig. 5 (top). A signal cross section of 0.1 ± 0.6 times its standard model expectation is obtained. The significance is only 0.1 s.d. (1.5 expected), not inconsistent with the signal+background hypothesis. The observed LLR is compared to expected distributions in the background-only and signal+background hypotheses in Fig. 5 (bottom). The sensitivity of this search is dominated by ZZ production.

(Z/W)Z in MET+HF
The D0 collaboration performed this search in a data sample corresponding to an integrated luminosity of 8.4 fb⁻¹ [10]. The selection requires a large MET with a large significance, and two jets not back-to-back in the plane transverse to the beam direction; it rejects events with an electron or a muon satisfying the criteria of the WZ search reported in Sec. 3.1. The bulk of the MJ background is next rejected using a BDT. The sample is split into a 1-tag and a 2-tag sub-sample, based on the loosest b-tagging operating point as in the D0 analysis reported in Sec. 2.1. As in that same analysis, the remaining b-tagging information is used in the final discriminant, a BDT with 32 (!) inputs. The BDT output in the 2-tag channel is shown in Fig. 6 (top). A signal cross section of 1.5 ± 0.5 times its standard model expectation is obtained, with a significance of 2.8 s.d. (1.9 expected). The observed LLR is compared to expected distributions in the background-only and signal+background hypotheses in Fig. 6 (bottom). The sensitivity of this search is shared by (Z → νν)Z and by (W → ℓν)Z production, where the lepton from the W decay falls outside of the acceptance or fails the identification criteria.

Combination
The D0 searches for WZ/ZZ reported in the previous subsections were combined using the exact same techniques as for the combination of the searches for the Higgs boson. Since the binning of the final discriminant outputs is not identical in the various sub-channels, these outputs were re-cast into common bins of signal-to-background ratio. The result of the fit from which the signal cross section is measured is shown in Fig. 7. The result is 1.13 ± 0.36 times the SM cross section. This result can be compared with the expectation from pseudo-experiments drawn in the background-only and signal+background hypotheses in Fig. 8. From the former, a signal significance of 3.3 s.d. is deduced (2.9 expected), while the latter shows consistency with the signal+background hypothesis within 0.3 s.d. The observed LLR is compared to expected distributions in the two hypotheses in Fig. 9. The results of the fit of the diboson production cross section to the combined final discriminant can also be used to plot other quantities, such as the dijet mass, as shown in Fig. 10.
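The re-casting of discriminant outputs into common bins of signal-to-background ratio can be sketched as follows. This is one plausible reading of the procedure, not the D0 implementation: bins from all sub-channels are assumed to be ordered by log10(s/b) and summed into shared intervals. The function name, bin edges, and all yields are hypothetical.

```python
import bisect
import math


def recast_by_s_over_b(channels, edges):
    """Merge bins from several sub-channels into common bins of log10(s/b).

    channels: list of sub-channels, each a list of (signal, background, data) bins.
    edges:    increasing bin edges in log10(s/b); out-of-range bins are clamped
              into the first or last common bin.
    Returns a list of (signal, background, data) tuples, one per common bin.
    """
    n_bins = len(edges) - 1
    merged = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for bins in channels:
        for s, b, n in bins:
            if b <= 0.0:
                raise ValueError("background expectation must be positive")
            # Bins without expected signal are sent to the lowest s/b bin.
            x = math.log10(s / b) if s > 0.0 else -math.inf
            idx = bisect.bisect_right(edges, x) - 1
            idx = min(max(idx, 0), n_bins - 1)
            merged[idx][0] += s
            merged[idx][1] += b
            merged[idx][2] += n
    return [tuple(row) for row in merged]


# Two hypothetical sub-channels with different native binnings.
channels = [
    [(0.5, 100.0, 98), (2.0, 25.0, 25)],
    [(0.1, 50.0, 47), (1.5, 5.0, 8), (3.0, 6.0, 7)],
]
merged = recast_by_s_over_b(channels, edges=[-3.0, -2.0, -1.0, 0.0])
```

Ordering by s/b rather than by the raw discriminant value lets outputs from different MVAs, with different shapes and binnings, be displayed and compared on a single axis while preserving the total signal, background, and data yields.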
A fit was also performed in which the WZ and ZZ cross sections were left uncorrelated. The results are, relative to the SM cross sections, 1.8 ± 0.5 for WZ and 0.4 ± 1.1 for ZZ. These results are correlated as shown in Fig. 11, where it can be seen that the SM expectation lies within the 68% C.L. contour of the data. The deviation from the SM is as expected, given the results of the individual channels reported in the previous subsections.
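As a rough cross-check of how the individual channel results relate to the combined value, the three D0 measurements quoted in the previous subsections can be averaged with inverse-variance weights. This is a deliberately naive sketch: it treats the channels as independent Gaussian measurements and ignores the correlated systematic uncertainties and nuisance-parameter fits of the actual combination. The function name is hypothetical.

```python
import math


def weighted_average(measurements):
    """Inverse-variance weighted mean and uncertainty of (value, error) pairs.

    Assumes independent Gaussian measurements (no correlations).
    """
    weights = [1.0 / (err * err) for _, err in measurements]
    total = sum(weights)
    mean = sum(w * val for w, (val, _) in zip(weights, measurements)) / total
    return mean, 1.0 / math.sqrt(total)


# Cross sections relative to the SM, as quoted in the text for the D0 searches.
d0_channels = [
    (1.6, 0.8),  # lnu+HF
    (0.1, 0.6),  # ll+HF
    (1.5, 0.5),  # MET+HF
]
mean, err = weighted_average(d0_channels)
print(f"naive average: {mean:.2f} +/- {err:.2f} times the SM expectation")
```

This naive average comes out at about 1.05 ± 0.35, in the neighborhood of the 1.13 ± 0.36 obtained from the full combination; exact agreement is not expected given the simplifications.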

A frequently asked question
The question of the flavor composition of the final diboson samples was raised at this conference. The answer is given here for the D0 search in the MET+HF final state [10].
In the 1-tag channel:
- bb: 16%
- cc: 19%
- cs: 23%
- other: 41%
In the 2-tag channel:
- bb: 60%
- cc: 22%
- cs: 8%
- other: 9%
It should however be kept in mind that these values correspond to a definition of the 1-tag and 2-tag channels according to the loosest of the twelve b-tagging operating points used at D0, and that the remaining b-tagging information is used as input to the final discriminant. The weight of the bb component is therefore substantially larger than the above numbers would suggest.

Summary
- The CDF collaboration observed WW/WZ production in the ℓν+HF channel with a 3.0 standard deviation (s.d.) significance [4].
- In that same channel, the D0 collaboration reached a significance of 2.2 s.d. for WZ production alone [5].
- The CDF collaboration obtained a significance of 1.9 s.d. for WZ/ZZ production in the missing E_T (MET)+HF channel, with zero or one lepton accepted [7].
The D0 collaboration recycled their searches for a low-mass Higgs boson in data samples corresponding to integrated luminosities of 7.5 to 8.4 fb⁻¹, using WZ and ZZ as a signal instead of WH and ZH. Those three searches were combined to reach a significance of 3.3 s.d. (2.9 expected), thereby establishing evidence for diboson production in final states containing heavy-flavor jets [11]. The production cross section was measured to be 1.13 ± 0.36 times its standard model expectation. These analyses have provided a direct validation of the procedures and techniques used in the searches for a low-mass Higgs boson at the Tevatron.