B-tagging in CMS

The identification of b jets is crucial for studying and characterizing various channels, such as top quark events, and many new physics scenarios. CMS defines several b-tagging techniques that exploit the long lifetime, high mass, and large momentum fraction of the b hadron produced in a b-quark jet. Efficient algorithms have been developed based on the measurement of the b-hadron secondary vertex or on tracks with a large impact parameter. Data collected in pp collisions at 7 TeV in 2011 are used to estimate both the b-tagging efficiency and the mistag rate from light-flavor jets.


Introduction
The b-tagging algorithms in CMS mainly rely on the long lifetime, high mass, and large momentum fraction of b hadrons produced in b-quark jets, as well as on the presence of soft leptons from semileptonic b decays [1]. Because of the high instantaneous luminosity during the 2011 data taking, the number of collisions taking place in the same bunch crossing (pileup events) was of the order of 5 to 11 on average. Pileup increases the track multiplicity in the events, as shown in Fig. 1. For this reason, a dedicated track selection was applied to remove tracks originating from pileup [2]. As Fig. 2 shows, the number of tracks passing the selection cuts has a smaller pileup dependence.

The b-tagging observables
The b-tagging algorithms and their study are based on the measurement of three main variables: the impact parameter significance of the tracks, the position of the secondary vertex, and the transverse momentum of the muon relative to the jet direction. A brief description of these variables follows. The impact parameter (IP) is defined as the distance between the track and the primary interaction vertex (PV) at the point of closest approach. The IP is positive (negative) if the track is produced downstream (upstream) of the PV along the jet direction (Fig. 3). The IP is calculated in three dimensions thanks to the good x-y-z resolution provided by the pixel detector. An important feature of the IP is that it is Lorentz invariant; because of the b-hadron lifetime, the typical IP scale is set by cτ ∼ 480 µm. In practice, the impact parameter significance IP/σ(IP) is used in order to take resolution effects into account. Because of the long lifetime of b hadrons, the IP significance of tracks from b jets is expected to be mainly positive, while for light jets it is almost symmetric around zero (Fig. 4).

e-mail: Cristina.Ferro@cern.ch
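The sign convention and the significance described above can be illustrated with a minimal sketch. It assumes a straight-line track approximation (real tracks are helices in the magnetic field), and the function and argument names are hypothetical, not taken from the CMS software. The sign is taken from the scalar product of the vector pointing from the PV to the point of closest approach with the jet direction.

```python
import math

def signed_impact_parameter(track_point, track_dir, pv, jet_dir):
    """Signed 3D impact parameter of a straight-line track w.r.t. the PV.

    All arguments are (x, y, z) tuples; this is an illustrative sketch only.
    """
    # Parameter t of the point on the track line that is closest to the PV.
    d2 = sum(c * c for c in track_dir)
    t = sum((v - p) * d for v, p, d in zip(pv, track_point, track_dir)) / d2
    # Point of closest approach and the vector from the PV to it.
    poca = [p + t * d for p, d in zip(track_point, track_dir)]
    ip_vec = [a - v for a, v in zip(poca, pv)]
    ip = math.sqrt(sum(c * c for c in ip_vec))
    # Positive if the closest-approach point lies downstream along the jet.
    along_jet = sum(a * j for a, j in zip(ip_vec, jet_dir))
    return ip if along_jet >= 0 else -ip

def ip_significance(signed_ip, sigma_ip):
    """Impact parameter significance IP/sigma(IP)."""
    return signed_ip / sigma_ip
```

For a jet along the x axis and a PV at the origin, a track whose closest-approach point lies at positive x gets a positive IP, and one at negative x gets a negative IP.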

The secondary vertex
Thanks to the high resolution of the CMS tracking system, it is possible to directly reconstruct the secondary vertex (SV), the point where the b hadron decays (Fig. 3). The vertex reconstruction is performed using the adaptive vertex fitter. The resulting list of vertices is then subject to a cleaning procedure that rejects SV candidates sharing 65% or more of their tracks with the PV.
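The cleaning step can be sketched as follows. This is an illustration under simple assumptions: each vertex is represented only by a collection of hashable track identifiers, and the helper names are hypothetical. The adaptive vertex fit itself is not modelled.

```python
def shared_track_fraction(sv_tracks, pv_tracks):
    """Fraction of the SV candidate's tracks that are also attached to the PV."""
    sv = set(sv_tracks)
    if not sv:
        return 0.0
    return len(sv & set(pv_tracks)) / len(sv)

def clean_sv_candidates(sv_candidates, pv_tracks, max_shared=0.65):
    """Keep only SV candidates sharing less than `max_shared` of their tracks with the PV."""
    return [
        sv for sv in sv_candidates
        if shared_track_fraction(sv, pv_tracks) < max_shared
    ]
```

A candidate with two of its three tracks also attached to the PV (about 67%) would be rejected, while one sharing a single track out of three would be kept.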

The transverse momentum of the muon
Semileptonic decays of b hadrons give rise to b jets that contain a muon, with a branching ratio of about 11%, or 20% when b→c→l cascade decays are included. For this reason, reconstructed muons inside a jet are used to study the performance of the lifetime-based tagging algorithms. Muons are seeded in the CMS muon chambers and then linked to tracks found in the tracking system to form global muons. The CMS muon system measures muons with high acceptance, resolution, and efficiency.
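The transverse momentum of the muon relative to the jet direction is the component of the muon momentum perpendicular to the jet axis. A minimal sketch, assuming both momenta are given as Cartesian (px, py, pz) tuples and leaving open how the jet axis is defined (for example, whether the muon is included in it); the function name is hypothetical.

```python
import math

def pt_rel(p_muon, p_jet):
    """Muon momentum component transverse to the jet axis: |p_mu x p_jet| / |p_jet|."""
    mx, my, mz = p_muon
    jx, jy, jz = p_jet
    # Cross product of the muon and jet momenta.
    cx = my * jz - mz * jy
    cy = mz * jx - mx * jz
    cz = mx * jy - my * jx
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    jet_norm = math.sqrt(jx * jx + jy * jy + jz * jz)
    return cross_norm / jet_norm
```

A muon collinear with the jet axis has zero relative transverse momentum. Muons from b-hadron decays tend to have larger values because of the high b-hadron mass.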

B-tagging algorithms
Several b-tagging algorithms are used in CMS [1], [2]. The output of each algorithm is a discriminator value on which the user can cut to select different regions in the efficiency versus purity phase space. These discriminators are presented in Fig. 4.
-The track counting algorithm identifies a b jet if there are at least N tracks with an impact parameter significance above a given threshold. The tracks are ordered in decreasing IP/σ(IP), and the discriminator is the impact parameter significance of the Nth track. For high b-jet efficiency, the IP/σ(IP) of the second track is used (TCHE); to select b jets with high purity, the third track is the better choice (TCHP).
-The Jet Probability algorithm relies on the IP/σ(IP) measurement of all tracks in a jet. The negative tail of the IP/σ(IP) distribution is used to extract the probability density function (PDF) for tracks not coming from b or c jets. Integrating this PDF gives the probability for a track to originate from the PV. Combining the track probabilities then assigns to the jet a probability to come from the PV. [...]
-The combined secondary vertex algorithm includes this information and provides discrimination even when no secondary vertices are found. The mass of the reconstructed charged particles at the secondary vertex is used to measure the purity of the b-tagged sample.
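The two track-based discriminators can be sketched in a few lines. The track counting part follows the description above directly. The text does not spell out how the track probabilities are combined, so the Jet Probability part uses a commonly quoted form, P_jet = Π · Σ_{j=0}^{N-1} (−ln Π)^j / j! with Π the product of the N track probabilities, which is an assumption here. The function names and the `floor` guard are likewise illustrative.

```python
import math

def track_counting(ip_significances, n):
    """IP/sigma(IP) of the n-th track, with tracks ordered by decreasing significance."""
    ordered = sorted(ip_significances, reverse=True)
    # A jet with fewer than n tracks gets no discriminator value.
    return ordered[n - 1] if len(ordered) >= n else None

def tche(ip_significances):
    """High-efficiency variant: significance of the second track."""
    return track_counting(ip_significances, 2)

def tchp(ip_significances):
    """High-purity variant: significance of the third track."""
    return track_counting(ip_significances, 3)

def jet_probability(track_probs, floor=1e-3):
    """Combine per-track PV probabilities into a jet-level probability.

    `floor` is an illustrative guard against log(0), not a value from the text.
    """
    probs = [max(p, floor) for p in track_probs]
    if not probs:
        return None
    pi = math.prod(probs)
    log_term = -math.log(pi)
    # Assumed combination: pi * sum_{j=0}^{N-1} (-ln pi)^j / j!
    series = sum(log_term ** j / math.factorial(j) for j in range(len(probs)))
    return pi * series
```

With a single track the combined probability reduces to the track probability itself. Jets containing several tracks that are each unlikely to come from the PV end up with a small combined probability.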

Performance of the taggers
Varying the cut on the discriminator yields different tagger efficiencies. We establish standard operating points, loose (L), medium (M), and tight (T), defined as the discriminator values at which the misidentification probability for udsg jets is estimated from MC to be 10%, 1%, or 0.1%, respectively, for a jet transverse momentum of about 80 GeV. Fig. 5 shows the performance of the different taggers, and Fig. 6 the effect of pileup on the performance of the TCHE tagger. Thanks to the track selection, the performance of the taggers is not compromised by pileup events.
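The definition of the operating points can be made concrete with a small sketch: given the discriminator values of simulated light-flavor jets, find the cut at which the requested fraction of them would be tagged. This is a toy illustration with hypothetical names. It ignores the jet transverse momentum dependence mentioned above and treats a jet as tagged when its discriminator is strictly above the cut.

```python
import math

def tag_rate(discriminators, cut):
    """Fraction of jets with a discriminator value strictly above the cut."""
    return sum(1 for d in discriminators if d > cut) / len(discriminators)

def operating_point(light_discriminators, target_rate):
    """Cut for which at most `target_rate` of the light jets are tagged."""
    desc = sorted(light_discriminators, reverse=True)
    # Number of light jets allowed to pass; the epsilon absorbs float rounding.
    k = math.floor(target_rate * len(desc) + 1e-9)
    if k >= len(desc):
        return -math.inf
    # Cutting strictly above the (k+1)-th largest value lets at most k jets pass.
    return desc[k]
```

Applying `tag_rate` with the same cut to a sample of simulated b jets would give the corresponding b-tagging efficiency of that operating point.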

Physics results
Many algorithms are already applied at trigger level [3]. At trigger level, b-quark candidates can be selected if they have at least one or two tracks with a 3D impact parameter significance above a given threshold. The motivation for applying b-tagging in the trigger is to reduce the trigger rates while keeping the signal efficiency high.
The typical rate reduction is a factor of 5-10. The main 2011 physics results obtained thanks to the b-tagging algorithms are listed in the following: