New results on the search for $B^0_s \to \mu^+\mu^-$ from LHCb

A search for the rare decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ is performed with the LHCb experiment using 1.1 fb$^{-1}$ of data collected at $\sqrt{s} = 8$ TeV and 1.0 fb$^{-1}$ of data collected at $\sqrt{s} = 7$ TeV. An excess of $B^0_s \to \mu^+\mu^-$ candidates with respect to the background expectation is observed with a statistical significance of 3.5 standard deviations. A branching fraction of $\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.2^{+1.5}_{-1.2}) \times 10^{-9}$ is measured with an unbinned maximum likelihood fit. The measured branching fraction is in agreement with the expectation from the Standard Model. The observed number of $B^0 \to \mu^+\mu^-$ candidates is consistent with the background expectation, and an upper limit on the branching fraction of $\mathcal{B}(B^0 \to \mu^+\mu^-) < 9.4 \times 10^{-10}$ is obtained at 95% CL.


Introduction
One of the most important goals of the LHCb experiment at the LHC is to search for phenomena that cannot be explained by the Standard Model (SM) of particle physics. Precise measurements of the branching fractions of the two Flavour Changing Neutral Current (FCNC) decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ are among the most promising of these searches. Both decays are strongly suppressed by loop and helicity factors, making the SM branching fractions small [1]:

$$\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.23 \pm 0.27) \times 10^{-9}, \tag{1}$$

$$\mathcal{B}(B^0 \to \mu^+\mu^-) = (0.11 \pm 0.01) \times 10^{-9}. \tag{2}$$

These theoretical predictions are the CP-averaged branching fractions. As pointed out in Ref. [2], the finite width difference of the $B^0_s$ system needs to be considered. The time integrated branching fraction is evaluated to be [...], for which the SM prediction (Eq. 1) and the LHCb measurement of the width difference $\Delta\Gamma_s$ [3] are used. This is the expected value to be compared with an experimental measurement. Enhancements of the branching fractions of these decays are predicted in a variety of extensions of the Standard Model; an overview is given in Ref. [4]. In one popular example, the Minimal Supersymmetric Standard Model (MSSM), the enhancement is proportional to $\tan^6\beta$, where $\tan\beta$ is the ratio of the vacuum expectation values of the two Higgs fields. For large values of $\tan\beta$, this search is among the most sensitive probes for physics beyond the SM that can be performed at collider experiments. A review of the experimental status of the searches for $B^0_{s,d} \to \mu^+\mu^-$ can be found in Ref. [5].
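The size of the width-difference correction can be illustrated numerically. The sketch below assumes the commonly used relation $\mathcal{B}_{\rm time\text{-}integrated} = \mathcal{B}_{\rm CP\text{-}avg} \, (1 + A_{\Delta\Gamma}\, y_s)/(1 - y_s^2)$ with $y_s = \Delta\Gamma_s/(2\Gamma_s)$. Neither the explicit formula, nor the SM value $A_{\Delta\Gamma} = +1$, nor the input $y_s \approx 0.088$ is stated in this text; all three are assumptions made for illustration.

```python
def time_integrated_bf(bf_cp_avg, y_s, a_delta_gamma=1.0):
    """Scale a CP-averaged branching fraction to its time-integrated value.

    Assumed relation (not given in the text):
        B_ti = B_cp * (1 + A_DeltaGamma * y_s) / (1 - y_s**2)
    """
    return bf_cp_avg * (1.0 + a_delta_gamma * y_s) / (1.0 - y_s ** 2)


BF_SM_CP_AVG = 3.23e-9  # central value of Eq. 1
Y_S = 0.088             # assumed value of DeltaGamma_s / (2 Gamma_s)

bf_time_integrated = time_integrated_bf(BF_SM_CP_AVG, Y_S)
print(f"time-integrated SM expectation: {bf_time_integrated:.2e}")
```

With these assumed inputs the correction raises the central value by roughly 10%, which is small compared with the uncertainty of the measurement reported here but not negligible for future precision measurements.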
a e-mail: albrecht@cern.ch

Dataset and analysis strategy
The measurements [6] presented here use data recorded by the LHCb experiment: 1.0 fb$^{-1}$ recorded in 2011 at a center-of-mass energy of $\sqrt{s} = 7$ TeV, combined with 1.1 fb$^{-1}$ recorded in 2012 at $\sqrt{s} = 8$ TeV. The first part of the dataset has already been analyzed [7] and was used to set the lowest published limits on the decay rates of both $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$. Two main improvements have been implemented over the previous analysis: the use of particle identification to select the $B^0_{(s)} \to h^+h^-$ decays that are used to calibrate the geometrical and kinematic variables, and a refined estimate of the exclusive backgrounds. The updated estimate of the exclusive backgrounds is also applied to the 2011 data and the results are reevaluated. The results obtained with the combined 2011 and 2012 datasets supersede those of Ref. [7]. Candidate $B^0_{s,d} \to \mu^+\mu^-$ events are required to be selected by a hardware trigger and a subsequent software trigger [8], predominantly by single-muon and dimuon lines. The first step of the final analysis is a simple selection, which removes the dominant part of the background and keeps about 60% of the reconstructed signal events. A second selection step, based on a Boosted Decision Tree (BDT), rejects 80% of the remaining background while retaining 92% of the signal. More details on the selections are given in Ref. [6].
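As a small worked example of the two-step selection, the quoted numbers can be chained together. The sketch assumes that the BDT efficiencies are quoted relative to events that already passed the first selection, so that the step efficiencies simply multiply; the helper name `chain_efficiency` is illustrative only.

```python
def chain_efficiency(*step_efficiencies):
    """Multiply per-step efficiencies, each quoted relative to the previous step."""
    total = 1.0
    for eff in step_efficiencies:
        total *= eff
    return total


# Signal: about 60% kept by the first selection, 92% of those kept by the BDT step.
signal_efficiency = chain_efficiency(0.60, 0.92)

# Background: 80% of what survives the first selection is rejected by the BDT step.
background_fraction_after_bdt = 1.0 - 0.80

print(f"overall signal efficiency on reconstructed events: {signal_efficiency:.3f}")
print(f"background fraction surviving the BDT step: {background_fraction_after_bdt:.2f}")
```

Under this assumption, a little over half of the reconstructed signal events enter the final analysis.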

Signal discrimination
Each event is then given a probability to be signal or background in a two-dimensional probability space defined by the dimuon invariant mass and a multivariate discriminant likelihood. This likelihood combines kinematic and topological variables of the $B^0_{(s)}$ decay using a BDT. The BDT is defined and trained on simulated events for both signal and background. The signal BDT shape is then calibrated using decays of the type $B^0_{(s)} \to h^+h^-$, where $h^\pm$ represents a $K^\pm$ or $\pi^\pm$. These decays have an identical topology to the signal. The calibrated BDT signal and background shapes are shown in Fig. 1. The BDT output is designed to be flat for the signal, whereas the background distribution falls over four orders of magnitude.
The invariant mass line shape of the signal events is described by a Crystal Ball function [9]. The mass resolution is calibrated with a combination of two methods: an interpolation between the $J/\psi$, $\psi(2S)$, $\Upsilon(1S)$, $\Upsilon(2S)$ and $\Upsilon(3S)$ decays to two muons, and a direct measurement on exclusive $B^0_{(s)} \to h^+h^-$ samples. The results are $\sigma_{B^0_s} = 25.0 \pm 0.4$ MeV/$c^2$ and $\sigma_{B^0} = 24.6 \pm 0.4$ MeV/$c^2$. The transition point of the radiative tail is obtained from simulated $B^0_s \to \mu^+\mu^-$ events smeared to reproduce the mass resolution measured in the data.
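For readers unfamiliar with the Crystal Ball shape, a minimal sketch is given here: a Gaussian core joined continuously to a power-law tail on the low-mass side, which models radiative losses. Only the resolution of 25.0 MeV/$c^2$ is taken from the text; the peak position, the transition point `ALPHA` and the tail exponent `N_TAIL` are hypothetical values chosen for illustration, and the function is left unnormalised.

```python
import math


def crystal_ball(mass, mean, sigma, alpha, n):
    """Unnormalised Crystal Ball shape with a low-mass power-law tail (alpha > 0)."""
    t = (mass - mean) / sigma
    if t > -alpha:
        # Gaussian core
        return math.exp(-0.5 * t * t)
    # Power-law tail, with coefficients fixed by continuity of value and slope at t = -alpha
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


MEAN = 5366.0   # MeV/c^2, assumed approximate B0s mass
SIGMA = 25.0    # MeV/c^2, resolution quoted in the text
ALPHA = 2.0     # hypothetical transition point, in units of sigma
N_TAIL = 1.0    # hypothetical tail exponent

for m in (MEAN - 100.0, MEAN - 50.0, MEAN, MEAN + 50.0):
    print(m, crystal_ball(m, MEAN, SIGMA, ALPHA, N_TAIL))
```

The tail makes the shape asymmetric: four standard deviations below the peak the function is far larger than a pure Gaussian would be, while the high-mass side stays Gaussian.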
The background shapes are calibrated simultaneously in the mass and the BDT using the invariant mass sidebands. This procedure ensures that even though the BDT is defined using simulated events, the result will not be biased by discrepancies between data and simulation.

Binning
The binning of the BDT and invariant mass distributions is optimized using simulation, to maximize the separation between the median of the test statistic distribution expected for background and SM $B^0_s \to \mu^+\mu^-$ signal, and that expected for background only. The chosen number and size of the bins are a compromise between maximizing the number of bins and the necessity to have enough $B^0_{(s)} \to h^+h^-$ events to calibrate the $B^0_s \to \mu^+\mu^-$ BDT and enough background in the mass sidebands.

Normalization
The number of expected signal events is evaluated by normalizing with channels of known branching fraction.
Two independent channels are used: $B^- \to J/\psi K^-$ and $B^0 \to K^+\pi^-$. The first decay has similar trigger and muon identification efficiency to the signal but a different number of particles in the final state, while the second channel has the same two-body topology but is selected with a hadronic trigger. The event selection for these channels is specifically designed to be as close as possible to the signal selection. The normalization for $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ is then given as

$$\mathcal{B}(B^0_{s,d} \to \mu^+\mu^-) = \mathcal{B}_{\rm norm} \times \frac{\epsilon_{\rm norm}}{\epsilon_{\rm sig}} \times \frac{f_{\rm norm}}{f_{d(s)}} \times \frac{N_{B^0_{s,d} \to \mu^+\mu^-}}{N_{\rm norm}},$$

where $f_{d(s)}$ and $f_{\rm norm}$ are the probabilities that a $b$ quark fragments into a $B^0_{(s)}$ and into the hadron involved in the given normalization mode, respectively. The recently updated value $f_s/f_d = 0.256 \pm 0.020$ [10] is used. $\mathcal{B}_{\rm norm}$ indicates the branching fraction and $N_{\rm norm}$ the number of signal events in the normalization channel obtained from a fit to the invariant mass distribution. The efficiency $\epsilon_{\rm sig(norm)}$ for the signal (normalization channel) is the product of the reconstruction efficiency of all the final state particles of the decay including the geometric acceptance of the detector, the selection efficiency for reconstructed events, and the trigger efficiency for reconstructed and selected events. $N_{B^0_{s,d} \to \mu^+\mu^-}$ is the number of observed signal events. The ratios of reconstruction and selection efficiencies are estimated from the simulation, while the ratios of trigger efficiencies on selected events are determined from data.
The fit to the two normalization channels is shown in Fig. 2. The observed numbers of $B^- \to J/\psi K^-$ and $B^0 \to K^+\pi^-$ candidates are $424\,222 \pm 1\,452$ and $14\,579 \pm 1\,110$, respectively. The two normalization factors are in agreement within the uncertainties and their weighted average, taking correlations into account, is [...] for $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ candidates inside a signal window of $\pm 60$ MeV/$c^2$ around the mass central value. These normalization factors are used for the limit computation. The normalization factors in the full mass range, used in the fit for the branching fraction, are 10% lower.
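The way a normalization channel converts an observed signal yield into a branching fraction can be sketched as follows. The code assumes the usual form $\mathcal{B} = \alpha \times N_{\rm sig}$ with $\alpha = \mathcal{B}_{\rm norm}\,(\epsilon_{\rm norm}/\epsilon_{\rm sig})\,(f_{\rm norm}/f_{\rm sig})/N_{\rm norm}$. Only the $B^- \to J/\psi K^-$ yield and $f_s/f_d$ are taken from the text; the normalization branching fraction, the efficiency ratio, the assumption $f_u = f_d$ and the signal yield are hypothetical placeholders. The simple inverse-variance average shown ignores the correlations that the analysis accounts for.

```python
import math


def normalisation_factor(bf_norm, eff_ratio, frag_ratio, n_norm):
    """alpha = B_norm * (eps_norm / eps_sig) * (f_norm / f_sig) / N_norm."""
    return bf_norm * eff_ratio * frag_ratio / n_norm


def weighted_average(values, errors):
    """Inverse-variance weighted mean of uncorrelated inputs, with its uncertainty."""
    weights = [1.0 / (e * e) for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mean, math.sqrt(1.0 / sum(weights))


N_NORM = 424222         # B- -> J/psi K- candidates quoted in the text
FS_OVER_FD = 0.256      # quoted in the text
BF_NORM = 6.0e-5        # hypothetical product branching fraction of the normalisation mode
EFF_RATIO = 0.5         # hypothetical eps_norm / eps_sig
FRAG_RATIO = 1.0 / FS_OVER_FD   # f_u / f_s, assuming f_u = f_d

alpha_bs = normalisation_factor(BF_NORM, EFF_RATIO, FRAG_RATIO, N_NORM)
n_signal = 10.0         # hypothetical fitted signal yield
print(f"alpha = {alpha_bs:.3e}, branching fraction = {alpha_bs * n_signal:.3e}")
```

The factor $\alpha$ is the branching fraction that corresponds to a single observed signal event, which is why it is often called the single-event sensitivity.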

Background characterization
Partially reconstructed decays of beauty mesons or baryons can pollute the low mass sidebands. The dominant modes are: [...]. In some of these modes kaons, pions and protons are misidentified as muons. The contributions of these decays to the $B^0_{s,d} \to \mu^+\mu^-$ analysis are estimated from Monte Carlo simulated samples by folding the $K \to \mu$, $\pi \to \mu$ and $p \to \mu$ fake rates extracted from $D^0 \to K^+\pi^-$ and $\Lambda \to p\pi^-$ data samples into the spectrum of simulated events. The fractional yields in BDT bins and the parameters that describe the mass lineshape are used as nuisance parameters in the unbinned maximum likelihood fit used to determine the branching fraction.
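Folding fake rates into simulated events amounts to weighting each simulated candidate by the probability that its hadrons are all misidentified as muons. The sketch below illustrates the idea under the assumption that the fake rates are tabulated in bins of momentum; the binning and all numerical values are hypothetical and are not taken from the analysis.

```python
import bisect

# Hypothetical momentum bin edges in GeV/c: <10, 10-20, 20-40, >40
P_EDGES = [10.0, 20.0, 40.0]

# Hypothetical hadron -> muon misidentification probabilities per momentum bin
FAKE_RATE = {
    "K": [0.020, 0.012, 0.008, 0.005],
    "pi": [0.015, 0.010, 0.007, 0.004],
    "p": [0.005, 0.003, 0.002, 0.001],
}


def misid_weight(hadrons):
    """Probability that every hadron in the candidate is misidentified as a muon.

    `hadrons` is a list of (species, momentum) pairs for the simulated candidate.
    """
    weight = 1.0
    for species, momentum in hadrons:
        weight *= FAKE_RATE[species][bisect.bisect_right(P_EDGES, momentum)]
    return weight


# Two simulated candidates: a doubly misidentified K pi pair and a single misidentified pion
simulated = [
    [("K", 15.0), ("pi", 50.0)],
    [("pi", 8.0)],
]
weights = [misid_weight(candidate) for candidate in simulated]
print(weights, sum(weights))
```

Summing such weights over the simulated sample, and scaling to the expected number of produced decays, would give the expected yield of each exclusive background; filling the weighted events into mass and BDT histograms would give its shape.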

Results
The observed pattern of events in the 15 BDT bins (8 in the 2011 data and 7 in the 2012 data) is shown in Fig. 3 for $B^0_s \to \mu^+\mu^-$ (top) and $B^0 \to \mu^+\mu^-$ (bottom) together with the fit for the branching fraction, which includes components for the exclusive backgrounds discussed in Sec. 4.
The number of expected combinatorial background events in the $B^0$ and $B^0_s$ search windows is determined from a simultaneous unbinned likelihood fit to the mass projections in the BDT bins. The same fit is then performed on the full mass range to extract the $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ branching fractions.
In this fit the parameters that describe the mass distributions of the exclusive backgrounds, their fractional yield in each BDT bin and their overall yields are constrained to vary within $\pm 1\sigma$ with respect to the expected values. The combinatorial background is parameterized with an exponential function with a slope and a normalization which are free parameters of the fit. The $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ signal yields are free parameters of the fit. Their fractional yields in BDT bins are constrained to the BDT fractions calibrated with the $B^0_{(s)} \to h^+h^-$ sample, and the parameters of the Crystal Ball functions that describe the mass lineshape are constrained to vary within $\pm 1\sigma$ with respect to the expected values.
The systematic uncertainties in the exclusive background and signal predictions in each bin are computed by fluctuating the mass parameters, the BDT fractional yields and the normalization factors along the Gaussian distributions defined by their associated uncertainties. The systematic uncertainty on the estimated number of combinatorial background events in the search windows is computed by fluctuating the number of events measured in the sidebands according to a Poisson distribution, and by varying the value of the exponent according to its uncertainty.
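The extrapolation of an exponential combinatorial background from the mass sidebands into a search window can be sketched in a few lines. This illustrates only the scaling step, not the unbinned fit itself; the sideband ranges, the slope and the sideband count are hypothetical, and the window is taken as $\pm 60$ MeV/$c^2$ around an assumed $B^0_s$ mass of 5366 MeV/$c^2$.

```python
import math


def exp_integral(slope, lo, hi):
    """Integral of exp(-slope * m) over [lo, hi]; reduces to the width for zero slope."""
    if slope == 0.0:
        return hi - lo
    return (math.exp(-slope * lo) - math.exp(-slope * hi)) / slope


def expected_in_window(n_sidebands, slope, sidebands, window):
    """Scale the sideband count by the ratio of exponential integrals."""
    sideband_area = sum(exp_integral(slope, lo, hi) for lo, hi in sidebands)
    return n_sidebands * exp_integral(slope, *window) / sideband_area


SIDEBANDS = [(4900.0, 5000.0), (5430.0, 6000.0)]  # MeV/c^2, hypothetical
WINDOW = (5306.0, 5426.0)                         # MeV/c^2, assumed 5366 +/- 60
SLOPE = 1.0e-3                                    # per MeV/c^2, hypothetical
N_SIDEBANDS = 67                                  # hypothetical sideband count

print(expected_in_window(N_SIDEBANDS, SLOPE, SIDEBANDS, WINDOW))
```

Repeating the calculation with the sideband count drawn from a Poisson distribution and the slope varied within its uncertainty would propagate the two sources of systematic uncertainty described in the text.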
The compatibility of the observed distribution of events with a given branching fraction hypothesis is computed using the $\mathrm{CL}_s$ method [11,12]. The pattern observed for $B^0 \to \mu^+\mu^-$ decays is compatible with the background-only hypothesis. The $\mathrm{CL}_s$ curve is shown in Fig. 4 (top). The observed $\mathrm{CL}_b$ value at $\mathrm{CL}_{s+b} = 0.5$ is 89%.
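The $\mathrm{CL}_s$ construction, $\mathrm{CL}_s = \mathrm{CL}_{s+b}/\mathrm{CL}_b$, can be illustrated with a single-bin counting experiment. This is a deliberately simplified sketch: the analysis combines many mass and BDT bins through a test statistic, whereas the code below uses the observed count itself and takes $\mathrm{CL}_{s+b}$ and $\mathrm{CL}_b$ as Poisson probabilities of observing at most that count. The signal and background expectations are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, signal, background):
    """CLs = CL_{s+b} / CL_b for a single-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


# Hypothetical example: 2 events observed on an expected background of 1.5
for hypothesised_signal in (1.0, 3.0, 5.0, 7.0):
    print(hypothesised_signal, cls(2, hypothesised_signal, 1.5))
```

A signal hypothesis is excluded at 95% CL when $\mathrm{CL}_s$ falls below 0.05. Scanning the hypothesised signal yield, or equivalently the branching fraction, until this threshold is crossed gives the upper limit.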
An excess of $B^0_s \to \mu^+\mu^-$ candidates is observed; the $\mathrm{CL}_s$ curve used to evaluate its significance is shown in Fig. 4 (bottom). The probability that background processes can produce the observed number of $B^0_s \to \mu^+\mu^-$ candidates or more is $5 \times 10^{-4}$, corresponding to a statistical significance of about 3.5 standard deviations. The value of the $B^0_s \to \mu^+\mu^-$ branching fraction extracted from the fit is $\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.2^{+1.5}_{-1.2}) \times 10^{-9}$, in good agreement with the SM prediction.
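The conversion between a background-only p-value and a number of Gaussian standard deviations depends on whether a one-sided or a two-sided convention is used, and the text does not state which one applies. The sketch below evaluates both for the rounded p-value of $5 \times 10^{-4}$ using only the Python standard library.

```python
from statistics import NormalDist


def significance(p_value, two_sided=False):
    """Number of Gaussian standard deviations corresponding to a p-value."""
    tail = p_value / 2.0 if two_sided else p_value
    return NormalDist().inv_cdf(1.0 - tail)


P_VALUE = 5e-4  # rounded value quoted in the text

z_one_sided = significance(P_VALUE)
z_two_sided = significance(P_VALUE, two_sided=True)
print(f"one-sided: {z_one_sided:.2f} sigma, two-sided: {z_two_sided:.2f} sigma")
```

For this p-value the one-sided conversion gives about 3.3 standard deviations and the two-sided conversion about 3.5, so the quoted significance is numerically closer to the two-sided convention.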

Conclusions
A search for the rare decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ has been performed with 1.1 fb$^{-1}$ of data collected at $\sqrt{s} = 8$ TeV and 1.0 fb$^{-1}$ of data collected at $\sqrt{s} = 7$ TeV. The data in the $B^0$ search window are consistent with the background expectations and an upper limit of $\mathcal{B}(B^0 \to \mu^+\mu^-) < 9.4 \times 10^{-10}$ is obtained at 95% CL. This is the most stringent published limit on this decay rate. The data in the $B^0_s$ search window show an excess of events with respect to the background expectation with a statistical significance of $3.5\sigma$. A branching fraction of $\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.2^{+1.5}_{-1.2}) \times 10^{-9}$ is measured. This is the first evidence of the $B^0_s \to \mu^+\mu^-$ decay.
The next step is a precision measurement of the decay rate of $B^0_s \to \mu^+\mu^-$, followed by first a limit on and then a measurement of the ratio of the decay rates of $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$. This ratio allows a stringent test of the hypothesis of minimal flavor violation and a good discrimination between various extensions of the Standard Model.
The precise measurement of $\mathcal{B}(B^0_{s,d} \to \mu^+\mu^-)$ provides information complementary to the searches performed at high-$p_T$ experiments.

capri proceedings
Precise measurements of the branching fractions of the two FCNC decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ are among the most promising measurements for a possible discovery of physics beyond the SM. These decays are strongly suppressed by loop and helicity factors, making the SM branching fractions small [? ]: [...] Taking the finite width difference of the $B^0_s$ system into account, the time integrated branching fraction is evaluated to be [? ] [...]. Enhancements of the decay rates of these decays are predicted in a variety of New Physics models; a summary is given in Ref. [? ]. For example, in the minimal supersymmetric Standard Model (MSSM), the enhancement is proportional to $\tan^6\beta$, where $\tan\beta$ is the ratio of the vacuum expectation values of the two Higgs fields. For large values of $\tan\beta$, this search is among the most sensitive probes for physics beyond the SM that can be performed at collider experiments. A review of the experimental status of the searches for [...]. The measurements presented here use 1 fb$^{-1}$ of data recorded by the LHCb experiment in 2011. Assuming the SM branching ratio, about 12 (1.3) $B^0_s$ ($B^0$) decays are expected to be triggered, reconstructed and selected in the analyzed dataset.
The first step of the analysis is a simple selection, which removes the dominant part of the background and keeps about 60% of the reconstructed signal events. As a second step, a preselection based on a Boosted Decision Tree (BDT) rejects 80% of the remaining background while retaining 92% of the signal.
Each event is then given a probability to be signal or background in a two-dimensional probability space defined by the dimuon invariant mass and a multivariate discriminant likelihood. This likelihood combines kinematic and topological variables of the $B^0_{(s)}$ decay using a BDT. The BDT is defined and trained on simulated events for both signal and background. The signal BDT shape is then calibrated using decays of the type $B^0_{(s)} \to h^+h^-$, where $h^\pm$ represents a $K^\pm$ or $\pi^\pm$. These decays have an identical topology to the signal. The invariant mass resolution is calibrated with an interpolation between the $J/\psi$, $\psi(2S)$, $\Upsilon(1S)$, $\Upsilon(2S)$ and $\Upsilon(3S)$ decays to two muons. The background shapes are calibrated simultaneously in the mass and the BDT using the invariant mass sidebands. This procedure ensures that even though the BDT is defined using simulated events, the result will not be biased by discrepancies between data and simulation.
The number of expected signal events is evaluated by normalizing with channels of known branching fraction. Three independent channels are used: $B^+ \to J/\psi K^+$, $B^0_s \to J/\psi\phi$ and $B^0 \to K^+\pi^-$. The first two decays have similar trigger and muon identification efficiency to the signal but a different number of particles in the final state, while the third channel has the same two-body topology but is selected with a hadronic trigger. The event selection for these channels is specifically designed to be as close as possible to the signal selection. The ratios of reconstruction and selection efficiencies are estimated from the simulation, while the ratios of trigger efficiencies on selected events are determined from data. The observed pattern of events in the high BDT range is shown in Fig. 1 for $B^0_s \to \mu^+\mu^-$ (top) and $B^0 \to \mu^+\mu^-$ (bottom). A moderate excess over the background expectations is seen in the $B^0_s$ channel. This excess is consistent with the SM prediction. No excess is seen in the $B^0$ channel.
The compatibility of the observed distribution of events with a given branching fraction hypothesis is computed using the $\mathrm{CL}_s$ method [? ?]. The measured upper limit for the branching ratio is [...] at 95% confidence level (CL), which is only 20% above the SM prediction given in Eq. 1. This puts tight constraints on various extensions of the Standard Model, especially on supersymmetric models at high values of $\tan\beta$.
The CMS and LHCb collaborations have excellent prospects to observe the decay $B^0_s \to \mu^+\mu^-$ with the dataset collected in 2012. This observation, and the precision measurement of $\mathcal{B}(B^0_s \to \mu^+\mu^-)$ in the coming years, will allow strong constraints to be put on the scalar sector of any extension of the Standard Model. The next step will be to limit and then measure the ratio of the decay rates of $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$.
[...] of candidates observed in the data, and compute the expected number of signal and background events.
The systematic uncertainties in the background and signal predictions in each bin are computed by fluctuating the mass and BDT shapes and the normalization factors along the Gaussian distributions defined by their associated uncertainties. The inclusion of the systematic uncertainties increases the $B^0 \to \mu^+\mu^-$ and $B^0_s \to \mu^+\mu^-$ upper limits by less than about 5%.
The results for $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ decays, integrated over all mass bins in the corresponding signal region, are summarized in Table I. The distribution of the invariant mass for BDT > 0.5 is shown in Fig. 1. The compatibility of the observed distribution of events with that expected for a given branching fraction hypothesis is computed using the $\mathrm{CL}_s$ method [15]. The method provides $\mathrm{CL}_{s+b}$, a measure of the compatibility of the observed distribution with the signal plus background hypothesis, $\mathrm{CL}_b$, a measure of the compatibility with the background-only hypothesis, and $\mathrm{CL}_s = \mathrm{CL}_{s+b}/\mathrm{CL}_b$. [...] provided by high-$p_T$ experiments. The interplay between both channels allows the SUSY parameter space to be optimally constrained.


Figure 1. BDT distribution for the 2012 dataset, for the signal (black squares) and combinatorial background (blue open points). Values normalized to the bin size.

Figure 2. Invariant mass distribution of $B^- \to J/\psi K^-$ (top) and $B^0 \to K^+\pi^-$ (bottom) candidates. The fit to data is superimposed in blue while the blue dotted line is the combinatorial background. In the fit to the $B^0 \to K^+\pi^-$ data, the $B^0 \to K^+\pi^-$ component (red line), the $B^0_s \to K^+\pi^-$ component (green dashed line), and the partially reconstructed background (black dotted line) are also shown.

Figure 4. $\mathrm{CL}_s$ as a function of the assumed $\mathcal{B}$ for $B^0 \to \mu^+\mu^-$ (top) and $B^0_s \to \mu^+\mu^-$ (bottom) decays for the combined 2011+2012 dataset. The long dashed gray curves are the medians of the expected $\mathrm{CL}_s$ distributions if background and SM signal were observed. The yellow area covers the $1\sigma$ area around the median. The solid red curves are the observed $\mathrm{CL}_s$. For $B^0_s \to \mu^+\mu^-$ (bottom), the long dashed gray curve in the green area is the expected $\mathrm{CL}_s$ distribution if background only were observed, with its $1\sigma$ interval.

Figure 3. Simultaneous fit of the invariant mass distribution in the 8 BDT bins of 2011 (top) and 7 BDT bins of 2012 data (bottom). The fit result is superimposed in blue; the individual components are given in the legend.

Figure 1. Distribution of selected signal candidates. Events observed in LHCb in the $B^0_s$ channel (top) and the $B^0$ channel (bottom) for BDT > 0.5 and expectation for, from top, SM signal (grey), combinatorial background (light grey), $B^0_{(s)} \to h^+h^-$ background (black) and cross-feed between both modes (dark grey). The hatched area depicts the uncertainty on the total background expectation. Figure reproduced from Ref. [? ].
[...] which are the world's best upper exclusion limits on the branching fraction of this decay. A combination [? ] of this measurement with the ATLAS and CMS upper exclusion [...] $\mathcal{B}(B^0_s \to \mu^+\mu^-)_{\rm LHC} < 4.2 \times 10^{-9}$ and [...] (4)
