Measurement of the charge asymmetry in top quark pair production with the CMS experiment

We present a measurement of the charge asymmetry in top quark pair production using an integrated luminosity of 36 pb−1 collected during 2010 with the CMS detector. Top pair candidates with a signature of a charged lepton and at least four jets are selected. In order to measure the charge asymmetry in proton-proton collisions at a centre-of-mass energy of 7 TeV, the difference of absolute pseudo rapidities of top quark and anti-top quark |ηt| − |ηt̄| is used. The measured asymmetry AC = 0.060± 0.134(stat.)± 0.026(syst.) is consistent with the asymmetry predicted by the Standard Model of 0.0130(11). The result is still dominated by the statistical uncertainty but it shows that CMS has the ability to measure this quantity.


Introduction
The top quark, discovered in 1995 by both Tevatron experiments CDF [1] and D0 [2], is the only known fermion with a mass of the order of the electroweak symmetry breaking (EWSB) scale. As such, it plays a special role in many beyond Standard Model (BSM) theories of EWSB. The production of top quark pairs might be influenced by BSM effects: in some BSM theories top quark pairs can be produced by the exchange of unknown heavy particles. Candidates for such exchange particles are axigluons [3], heavy Z' particles [4] or colored Kaluza-Klein excitations of the gluon [5][6][7][8]. A standard approach to search for such new particles is to reconstruct the invariant mass spectrum of the top anti-top quark pair, M_tt̄, in which some of these particles might show up as resonance peaks. However, wide resonances will probably be hidden in the M_tt̄ spectrum.
The measurement of the charge asymmetry in top quark pair production provides a possibility to search for unknown top quark production mechanisms which would be invisible in the invariant mass spectrum. Charge asymmetric effects can only occur in asymmetric initial states in top quark pair production. Different vectorial and axial couplings of new resonances to top and anti-top quarks, or interference with Standard Model (SM) production processes, lead to an emission of the top quark preferably either in the direction of the incoming quark or in the direction of the anti-quark in the case of quark-antiquark annihilation in the initial state. This charge asymmetry leads to different rapidity distributions for top and anti-top quarks. In proton-antiproton collisions the charge asymmetry manifests itself as a forward-backward asymmetry A_FB, accessible for example through the variable y_t − y_t̄. In recent measurements [9][10][11][12] performed at the Tevatron by the CDF and D0 collaborations in proton-antiproton collisions, a charge asymmetry in the production of top quarks has been observed which deviates from the SM prediction (A_FB ∼ 5%) by about 2σ. In the region of high invariant mass, M_tt̄ ≥ 450 GeV/c², the CDF collaboration finds an asymmetry which is 3.4σ above the SM prediction [9]. This might be an indication for the existence of BSM tt̄ production mechanisms [13,14].
Since proton-proton collisions are forward-backward symmetric, the Tevatron method to access the charge asymmetry cannot be directly applied to the pp-collider experiments at the LHC. In this paper a new method to determine the charge asymmetry in pp-collisions has been developed, and for the first time the charge asymmetry in top quark pair production has been measured at the LHC. Here, quarks in the initial state will mainly be valence quarks, whereas anti-quarks will always be sea quarks. Their different average momentum fractions will be transferred to different top and anti-top quark momenta in the case of asymmetric hard parton production mechanisms. Due to the symmetric proton-proton initial state at the LHC, the pseudo rapidity distributions of the top quarks are symmetric around 0, but the charge asymmetry manifests itself as η distributions with different widths for top and anti-top quarks. To measure this effect we use |η_t| − |η_t̄| and compute the charge asymmetry A_C, defined as:

\[ A_C = \frac{N^{+} - N^{-}}{N^{+} + N^{-}} , \tag{1} \]

where N⁺ and N⁻ are the numbers of events with positive or negative values of |η_t| − |η_t̄|, respectively. In this definition we choose pseudo rapidities because this quantity is based on angular information only, yielding a good resolution. Due to the large contribution of top pair events produced via gluon-gluon fusion, which is a charge symmetric initial state, the charge asymmetry in pp-collisions (√s = 7 TeV) is predicted to be tiny in the SM: A_C^SM = 0.0130(11) [5,6,15]. Deviations would indicate BSM top pair production mechanisms. For example, the existence of an axigluon with a mass above 1 TeV/c² would yield A_C − A_C^SM of about −0.02 to −0.03 [5,6,15]. Non-flavour-universal couplings of the axigluon would lead to a sign flip of A_C − A_C^SM, and other choices of coupling parameters could lead to asymmetries of up to 0.2 to 0.3 in specific M_tt̄ regions.
The measurement of A_C in tt̄ events is performed in the lepton+jets decay channel. This channel allows an optimal reconstruction of the top quark four-momenta, which is a crucial prerequisite for the A_C measurement. To allow for comparisons with any theory prediction we apply a regularised unfolding technique to correct the reconstructed |η_t| − |η_t̄| spectrum.
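As a minimal sketch of how A_C and a statistical uncertainty follow from the two event counts N⁺ and N⁻, the snippet below evaluates the counting definition of the asymmetry. The binomial form of the uncertainty, the function name `charge_asymmetry` and the event counts in the example are assumptions made for illustration and are not taken from the analysis.

```python
import math


def charge_asymmetry(n_plus, n_minus):
    """Return (A_C, sigma) from counts of events with positive and
    negative values of |eta_t| - |eta_tbar|.

    Assumption: the uncertainty uses the binomial approximation
    sigma = sqrt((1 - A_C^2) / N), valid for unweighted, uncorrelated events.
    """
    total = n_plus + n_minus
    if total <= 0:
        raise ValueError("need at least one event")
    asym = (n_plus - n_minus) / total
    sigma = math.sqrt((1.0 - asym * asym) / total)
    return asym, sigma


# Hypothetical split of 851 events, chosen only to illustrate the arithmetic.
a_c, err = charge_asymmetry(433, 418)
print(f"A_C = {a_c:.3f} +/- {err:.3f}")
```

For a small asymmetry the uncertainty is close to 1/√N, so a sample of roughly 850 events gives a statistical precision of about 0.034 before any background subtraction or unfolding.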

Data and simulation
Within this analysis the full dataset of proton-proton collisions at a centre-of-mass energy of 7 TeV taken with the CMS detector [16] during the year 2010 is analysed. The amount of data corresponds to an integrated luminosity of 36 pb⁻¹.
Top pair events are generated with the tree-level matrix element generator MADGRAPH [17] using PYTHIA [18] for the parton showering. Spin correlation in the top decays is taken into account, and higher order tree-level gluon and quark production is described via the matrix element for up to three extra jets beyond the top pair system. For the description of the main SM backgrounds to top quark pair production the same combination of MADGRAPH and PYTHIA is used. W and Z boson production is simulated in association with jets (abbreviated as W+jets and Z+jets in the following). The radiation of up to four jets is simulated with the matrix element. The electroweak production of single top quarks is also simulated using the MADGRAPH generator.
All event generations are performed at a centre-of-mass energy of 7 TeV and make use of the CTEQ6L PDF parameterisation [19]. All generated events are fully simulated and reconstructed with the CMS simulation and reconstruction software. To correct for an observed difference in jet energy resolution, all jet momenta are scaled such that the resolution matches the one in data [20].

Event selection
We select tt̄ candidate events in which one W boson stemming from a top quark decay subsequently decays leptonically into a muon or electron and the corresponding neutrino, and the other W boson decays hadronically. The branching fraction of each of these muon+jets and electron+jets channels is about 15%.
The event selection follows the selection of tt̄ candidates applied for the measurement of the tt̄ production cross section [21] and exploits the expected event topology of the lepton+jets decay channel, which consists of four jets and one charged lepton in the final state. All events are required to be triggered by either a single-electron or a single-muon trigger, with p_T thresholds that vary with the instantaneous luminosity of the respective run range. Each event has to contain one well reconstructed primary vertex in the central part of the collision region. Exactly one isolated electron or muon originating from the primary vertex is required. Electrons are reconstructed from combined inner tracking and calorimeter information; muons are reconstructed from a combined track fit to hits in the inner tracker and in the outer muon chambers. Muon candidates must have a transverse momentum p_T > 20 GeV/c and a pseudo rapidity |η| < 2.1. Electrons are required to have E_T > 30 GeV and |η| < 2.5. In addition, all selected events must contain at least four jets. Jets are reconstructed with the anti-k_T algorithm [22] with a radius parameter of R = 0.5, clustering particle candidates reconstructed with the particle flow algorithm [23]. Relative and absolute jet energy corrections are applied to each jet [24]. Only jets with p_T > 30 GeV/c and |η| < 2.4 are selected. The missing transverse energy, E_T^miss, is also computed with the particle flow algorithm.
The final number of selected events in the CMS data sample is 428 in the electron channel and 423 in the muon channel. The number of expected events for signal and background processes is taken from [21] and is summarised in table 1. Since the selected QCD background corresponds to an outer tail in the phase space of QCD multi-jet production and is difficult to simulate with Monte Carlo generators, the QCD background is modelled using data-driven templates. For the QCD modelling in the electron channel, the selected electron has to fail at least two of three quality criteria concerning isolation, track distance from the beam spot, and the cuts on different electromagnetic shower variables. To model QCD events in the muon channel, we select data events with muons which are less well isolated than muons in the standard selection. Details on the QCD modelling are also given in [21].

Reconstruction of the tt final state
The measurement of the tt̄ charge asymmetry is based on the full reconstruction of the top quarks in the lepton+jets decay channel from the four-momenta of their decay products. In each event there exist numerous hypotheses for the reconstruction of the tt̄ pair, arising from the assignment of jets to the top quark decay products. The four-momentum of the neutrino is derived from E_T^miss. To calculate the z-component of the neutrino momentum, a quadratic constraint from the W boson decay kinematics is used, assuming that the W boson mass equals the pole mass of 80.4 GeV/c². In general this leads to two solutions for the neutrino momentum. If the solutions of the equation are complex, only the real part is taken into account. Adding the resulting four-momentum of the neutrino and the four-momentum of the charged lepton gives the four-momentum of the leptonically decaying W boson. In order to obtain all hypotheses for the semileptonically decaying top quark, we consider all combinations of the four-momentum of one of the selected jets and the four-momentum of the W boson. The hadronically decaying W boson is then reconstructed by combining the four-momenta of two of the selected jets not assigned to the semileptonically decaying top quark. Adding the four-momenta of this W boson and of one of the remaining jets results in the hadronically decaying top quark. For simulated events it is possible to determine the hypothesis which is closest to the true event. This best possible hypothesis is defined as the hypothesis for which the deviation of the directions of the reconstructed top quark and W boson momentum vectors from those of the generated particles is minimal. Since this is not possible for measured data, we determine for each hypothesis a quantity Ψ which gives a quantitative estimate of how well the hypothesis matches the tt̄ pair assumption, and we choose the hypothesis with the smallest value of Ψ.
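The quadratic W mass constraint on the neutrino p_z can be sketched as follows. This is a minimal illustration assuming a massless charged lepton and neutrino and taking the missing transverse energy components as the neutrino transverse momentum; the function name `neutrino_pz` and its argument layout are hypothetical and do not refer to the CMS software.

```python
import math

M_W = 80.4  # W boson mass in GeV/c^2, as quoted in the text


def neutrino_pz(lep, met_x, met_y, m_w=M_W):
    """Solve the W mass constraint for the neutrino p_z.

    lep is (px, py, pz) of the charged lepton, treated as massless and
    required to have non-zero transverse momentum.
    met_x, met_y are the missing transverse energy components, used as
    the neutrino transverse momentum.
    Returns two real solutions, or a single value (the real part) when
    the solutions of the quadratic equation are complex.
    """
    px, py, pz = lep
    pt2 = px * px + py * py
    e2 = pt2 + pz * pz  # lepton energy squared for a massless lepton
    met2 = met_x * met_x + met_y * met_y
    mu = 0.5 * m_w * m_w + px * met_x + py * met_y
    real = mu * pz / pt2
    disc = real * real - (e2 * met2 - mu * mu) / pt2
    if disc < 0.0:
        return [real]
    root = math.sqrt(disc)
    return [real - root, real + root]
```

A negative discriminant occurs when the transverse mass of the lepton and E_T^miss system exceeds the assumed W boson mass, which can happen through resolution effects; in that case only one hypothesis for the neutrino momentum remains.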
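We define Ψ as:

\[ \Psi = \chi^2 \cdot P_b(x_{q_1})\, P_b(x_{q_2})\, \bigl(1 - P_b(x_{b,\mathrm{had}})\bigr)\, \bigl(1 - P_b(x_{b,\mathrm{lep}})\bigr) \tag{2} \]

Herein, the quantity χ² is defined as

\[ \chi^2 = \frac{\bigl(m^{\mathrm{rec}}_{t,\mathrm{lep}} - m^{\mathrm{b.p.}}_{t,\mathrm{lep}}\bigr)^2}{\sigma^2_{m_{t,\mathrm{lep}}}} + \frac{\bigl(m^{\mathrm{rec}}_{t,\mathrm{had}} - m^{\mathrm{b.p.}}_{t,\mathrm{had}}\bigr)^2}{\sigma^2_{m_{t,\mathrm{had}}}} + \frac{\bigl(m^{\mathrm{rec}}_{W,\mathrm{had}} - m^{\mathrm{b.p.}}_{W,\mathrm{had}}\bigr)^2}{\sigma^2_{m_{W,\mathrm{had}}}} \tag{3} \]

where m^rec_W,had is the reconstructed mass of the hadronically decaying W boson, and m^rec_t,lep and m^rec_t,had are the reconstructed masses of the semileptonically decaying top quark and of the hadronically decaying top quark, respectively. The parameters m^b.p. and σ (for example σ_mW,had = 11.9 GeV/c²) are the means and widths obtained from Gauss fits to the respective mass distributions in the best possible hypothesis of the tt̄ Monte Carlo sample.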
The function P_b(x) is the probability of a jet with a certain b-tagger output x to be assigned to one of the two b quarks in the best possible hypothesis. For the b-tagging, a track counting algorithm is applied [25]. This b-tagging algorithm takes the second largest impact parameter of all tracks associated to a secondary vertex as discriminant. The variables x_b,had, x_b,lep, x_q1, and x_q2 are the output values of this tagger for the jets assigned to the two b quarks and to the two light quarks from the hadronically decaying W boson, respectively. The introduction of the b-tagging information into the Ψ variable improves the reconstruction of the top quark momenta. On the simulated tt̄ sample, the selected hypothesis with the smallest Ψ is identical to the best possible hypothesis in 33% of all events. Using the χ² alone to select a hypothesis, we find the best possible hypothesis in 24% of all events.
The selected hypothesis in data is compared to the prediction in figure 1, where signal and background samples are scaled to the cross section values found in the Standard Model tt̄ cross section measurement [21].
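The combinatorics of the jet assignment described in this section can be sketched as follows. This is an illustration only: the helper names `jet_assignments` and `best_hypothesis` are hypothetical, jets are represented by their indices, and the score function passed to the selection step stands for whatever per-hypothesis quantity (such as Ψ or χ²) is to be minimised.

```python
from itertools import combinations


def jet_assignments(n_jets):
    """Yield (b_lep, b_had, (q1, q2)) index tuples for n_jets selected jets.

    One jet is assigned to the semileptonically decaying top quark, one to
    the hadronically decaying top quark, and an unordered pair of the
    remaining jets forms the hadronically decaying W boson.
    """
    jets = range(n_jets)
    for b_lep in jets:
        for b_had in jets:
            if b_had == b_lep:
                continue
            rest = [j for j in jets if j not in (b_lep, b_had)]
            for q1, q2 in combinations(rest, 2):
                yield b_lep, b_had, (q1, q2)


def best_hypothesis(hypotheses, score):
    """Return the hypothesis with the smallest score."""
    return min(hypotheses, key=score)


print(len(list(jet_assignments(4))))  # number of jet assignments for four jets
```

With exactly four selected jets there are 12 distinct jet assignments; combined with up to two neutrino solutions this gives up to 24 hypotheses per event, and the number grows quickly with additional jets.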

W+jets background study
The background in the selected data sample is dominated by W+jets events. The asymmetry between the rates of positively and negatively charged W bosons could have an impact on the reconstructed tt̄ charge asymmetry in W+jets events, especially if this W⁺/W⁻ rate asymmetry depends on the reconstructed top quark rapidities. Therefore, a validation of the reconstruction is performed in a sideband dataset which is enriched in W+jets events. For this sample the standard lepton cuts are applied, but instead of requiring at least four jets, events with one or two jets are selected. To increase the purity of W+jets events and to reduce the amount of QCD multi-jet events in this sample, a cut on the transverse mass of the reconstructed W boson of M_T > 50 GeV/c² is applied. From Monte Carlo simulations, the expected purity of W+jets events in the selected data sample is 87% in the electron channel and 96% in the muon channel.
In this sample we cannot reconstruct |η_t| − |η_t̄|, since at least four jets are required for the reconstruction of this quantity. However, a four-momentum which mimics that of the leptonically decaying top quark can be reconstructed. The leptonically decaying W boson is reconstructed in the standard way from the charged lepton momentum and the missing transverse energy. Paired with one jet, it mimics the reconstruction of a leptonically decaying top quark in real tt̄ events reconstructed in the inclusive four-jet bin. Among the possible jets and neutrino p_z solutions, the combination which forms an invariant mass closest to the top quark mass is selected. Having reconstructed a pseudo-top vector in a W+jets dominated data sample, we are able to compare the shape of η_t^pseudo in events with a positively charged lepton to the shape of η_t̄^pseudo in events with a negatively charged lepton. In this way we can evaluate potential differences between W⁺ and W⁻ events. If the η_t^pseudo distribution in W⁺ events differed from the η_t̄^pseudo distribution in W⁻ events, this could produce an asymmetry in |η_t| − |η_t̄| in the reconstruction of W+jets background events in the finally selected signal region. The ratio of W⁺ to W⁻ events as a function of η_t^pseudo is shown in figure 2. The ratio distributions in data and Monte Carlo simulation agree. This supports the conclusion that the differences between W⁺ and W⁻ events are correctly modelled by the Monte Carlo simulation.
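The transverse mass requirement of the sideband selection can be sketched as follows. The formula is the standard definition of the transverse mass for a massless lepton and neutrino, which is assumed here to be the one used; the function names are hypothetical.

```python
import math


def w_transverse_mass(lep_px, lep_py, met_x, met_y):
    """Transverse mass of the charged lepton and missing-energy system.

    Uses M_T^2 = 2 * pT_lep * MET * (1 - cos(dphi)), written with dot
    products so that no explicit angle is needed.
    """
    lep_pt = math.hypot(lep_px, lep_py)
    met = math.hypot(met_x, met_y)
    mt2 = 2.0 * (lep_pt * met - (lep_px * met_x + lep_py * met_y))
    return math.sqrt(max(mt2, 0.0))  # guard against tiny negative rounding


def passes_sideband_cut(lep_px, lep_py, met_x, met_y, threshold=50.0):
    """Apply the M_T > 50 GeV/c^2 requirement quoted in the text."""
    return w_transverse_mass(lep_px, lep_py, met_x, met_y) > threshold
```

Events in which the lepton and the missing transverse energy are nearly collinear, as is typical for QCD multi-jet events with mismeasured jets, obtain a small M_T and are removed by the cut.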

Measurement of the charge asymmetry
The top quark four-momenta reconstructed with the Ψ method allow the calculation of a reconstructed value of |η_t| − |η_t̄| in each event. The distribution of this variable is shown in figure 3. From this distribution we obtain an uncorrected charge asymmetry, as defined in equation 1, of A_C^rec = 0.028 ± 0.048 (stat.) in the electron channel and A_C^rec = 0.007 ± 0.049 (stat.) in the muon channel. Combining both channels we find A_C^rec = 0.018 ± 0.034 (stat.). For the background subtraction the result of the Standard Model cross section measurement [21] is used. In the cross section analysis the statistical uncertainties are provided for the individual backgrounds (W+jets, Z+jets, single top, QCD in the muon channel, and QCD in the electron channel), including the full covariance matrix. Systematic uncertainties on the backgrounds are treated individually later. From the diagonalised covariance matrix, orthogonal templates are constructed according to the eigenvectors of the covariance matrix. The contributions from these uncorrelated background templates are then subtracted from the measured spectrum using Gaussian error propagation. After the background subtraction we find a charge asymmetry of A_C^bkg.subtr. = 0.035 ± 0.070 (stat.).
The correction for smearing and selection effects distorting the spectrum of |η_t| − |η_t̄| is performed using a generalised unfolding technique [26]. Smearing and efficiency effects define a matrix A which translates the true spectrum of |η_t| − |η_t̄|, denoted as x, into the measured distribution y:

\[ \vec{y} = A\, \vec{x} \tag{4} \]

Here, the entries of the vectors x and y correspond to the bin contents of the histograms of the true and measured spectrum. The matrix A is determined from tt̄ events generated with the MADGRAPH generator. The true spectrum of |η_t| − |η_t̄| is then determined by constructing a generalised inverse matrix A# which transforms the measured spectrum back into the true variable. The inversion process is modified by introducing a Phillips-Tikhonov regularisation method [27,28] which smooths the unfolded spectrum to avoid unphysical fluctuations.
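The idea of a regularised inversion can be illustrated with a deliberately simplified sketch. It minimises |A x − y|² + τ |L x|² with L the discrete second-derivative operator, which is one common form of Phillips-Tikhonov regularisation. It is not the algorithm of [26]: the unweighted least-squares term, the choice of L, the regularisation strength `tau` and all function names are assumptions for illustration.

```python
def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting.

    Assumes m is square and non-singular.
    """
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def unfold(a, y, tau):
    """Return x minimising |A x - y|^2 + tau * |L x|^2."""
    n_meas, n_true = len(a), len(a[0])
    # Normal equations: (A^T A + tau L^T L) x = A^T y
    lhs = [[sum(a[k][i] * a[k][j] for k in range(n_meas))
            for j in range(n_true)] for i in range(n_true)]
    rhs = [sum(a[k][i] * y[k] for k in range(n_meas)) for i in range(n_true)]
    # Row r of L holds the second-difference stencil (1, -2, 1)
    for r in range(n_true - 2):
        stencil = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in stencil.items():
            for j, cj in stencil.items():
                lhs[i][j] += tau * ci * cj
    return solve(lhs, rhs)
```

With `tau = 0` this reduces to an ordinary least-squares inversion, which amplifies statistical fluctuations of y; increasing `tau` penalises curvature in the unfolded spectrum at the price of a possible bias, which is why ensemble tests of the kind described in this section are needed.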
The result of the unfolding process is a differential cross section measurement in |η_t| − |η_t̄| for top quark pair production. For the application of the unfolding to data, the overall selection efficiency taken from the MADGRAPH Monte Carlo simulation is corrected for the measured selection efficiencies in the electron and muon channels, respectively. Only differential selection efficiencies in |η_t| − |η_t̄| are taken from the simulation. The unfolded spectrum can directly be compared with theory predictions. In figure 4 the unfolded spectrum of |η_t| − |η_t̄| is shown. The shown theory curve is the prediction from the MADGRAPH Monte Carlo event generator; the error bars correspond to the statistical uncertainties only, which are given by the square root of the diagonal elements of the covariance matrix. The full covariance matrix, taking correlations between the bins of the measured spectrum into account, is also provided in figure 4.

Figure 4: Distribution of the unfolded |η_t| − |η_t̄| spectrum (a). The shown theory curves are the prediction from the MADGRAPH generator and a NLO computation from [5,6,15]. (b) shows the full covariance matrix for the measured spectrum.
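Propagating a full bin-to-bin covariance matrix to an asymmetry can be sketched as follows. The snippet assumes first-order (Gaussian) error propagation, with the gradient of the asymmetry with respect to bin content x_i given by (s_i − A)/N, where s_i = ±1 is the sign of |η_t| − |η_t̄| in bin i and N is the sum of all bins. The function name and the data layout are hypothetical.

```python
import math


def asymmetry_from_spectrum(bins, cov, signs):
    """Return (A, sigma_A) for a binned spectrum with covariance matrix cov.

    signs[i] is +1 for bins with |eta_t| - |eta_tbar| > 0 and -1 otherwise.
    """
    total = sum(bins)
    asym = sum(s * x for s, x in zip(signs, bins)) / total
    # Gradient of A with respect to each bin content
    grad = [(s - asym) / total for s in signs]
    n = len(bins)
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(n) for j in range(n))
    return asym, math.sqrt(var)
```

For a diagonal covariance matrix with Poisson variances this reduces to the familiar binomial result √((1 − A²)/N); the off-diagonal elements introduced by an unfolding change the uncertainty and therefore have to be included.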
From the unfolded |η_t| − |η_t̄| spectrum the tt̄ charge asymmetry A_C can be obtained via equation 1. The statistical uncertainty σ_A on A_C is calculated with the covariance matrix as input for a Gaussian error propagation. With this approach we measure a charge asymmetry of

\[ A_C = 0.060 \pm 0.134\,(\mathrm{stat.}) \tag{5} \]

The performance of the unfolding algorithm is tested in sets of pseudo experiments. In each pseudo experiment random distributions for tt̄ signal and background processes are generated from the Monte Carlo templates. In these pseudo experiments, the number of background events is varied according to the statistical uncertainties, taking correlations between the different backgrounds into account. For the tt̄ signal fraction the measured cross section of 171.7 pb is used [21]. The process of creating pseudo data is repeated many times and the unfolding method is applied to each pseudo data sample. On average over all pseudo experiments, the true values of all bins of the |η_t| − |η_t̄| spectrum are reconstructed correctly by the unfolding. The expected statistical uncertainties per bin, obtained from the spread of the unfolded results in the pseudo experiments, are 26-27% in the inner and 36% in the outer bins of the unfolded spectrum. The charge asymmetry A_C and its statistical uncertainty are also correctly found in the unfolding of the pseudo data sets. On average, an asymmetry close to 0 is found, as expected from the simulation.

To check whether a sizeable asymmetry in a data sample is reconstructed correctly when the standard Monte Carlo sample is used as input for the unfolding, we perform ensemble tests with pseudo experiments in which an asymmetry is present. To generate samples with different charge asymmetries we re-weight the events from the standard tt̄ Monte Carlo sample such that these samples show asymmetries in |η_t| − |η_t̄| in the range between approximately −20% and +20%.
The re-weighted samples are unfolded using the standard unfolding procedure, and the average unfolded asymmetry in a set of pseudo experiments is compared to the generated asymmetry. We see a linear dependence between the true and the unfolded asymmetry without biases.
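One simple way to build samples with a chosen asymmetry can be sketched as follows. The text does not specify the re-weighting function, so the sign-based weight used here (1 + k for positive and 1 − k for negative |η_t| − |η_t̄|) is purely an assumption for illustration, as are the function names.

```python
def reweight(delta_abs_eta, k):
    """Event weight: 1 + k for positive, 1 - k for non-positive values
    of |eta_t| - |eta_tbar| (hypothetical re-weighting scheme)."""
    return 1.0 + k if delta_abs_eta > 0 else 1.0 - k


def weighted_asymmetry(values, k):
    """Asymmetry of a sample after applying reweight() to every event."""
    w_plus = sum(reweight(v, k) for v in values if v > 0)
    w_minus = sum(reweight(v, k) for v in values if v <= 0)
    return (w_plus - w_minus) / (w_plus + w_minus)
```

For a sample with original asymmetry A₀ this scheme yields (A₀ + k)/(1 + k·A₀), so a symmetric sample acquires an asymmetry equal to k; scanning k between −0.2 and +0.2 covers the range of asymmetries quoted above.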
The measurement of the charge asymmetry A_C might be affected by several sources of systematic uncertainty. In principle, only systematic uncertainties influencing the direction of the reconstructed top quark momenta can change the value of the reconstructed charge asymmetry. The overall selection efficiency and acceptance do not change the measured asymmetry. For each source of systematic uncertainty we draw pseudo experiments from systematically shifted samples and perform the unfolding with the smearing and efficiency matrix taken from the standard MADGRAPH sample. The largest systematic uncertainties arise from η-dependent variations of the jet energy scale and of the lepton selection efficiency. The variation of the parton distribution functions (PDF) is also found to change the reconstructed charge asymmetry. Smaller systematic uncertainties arise from the uncertainty on the jet energy resolution, from the variation of the Q² scale and of the matching threshold in the Monte Carlo samples used, from uncertainties on the b-tagging efficiency, and from the variation of initial and final state radiation (ISR/FSR) in the tt̄ signal Monte Carlo sample. We also account for uncertainties on the ratio of positively and negatively charged lepton candidates in the QCD sideband model. The use of an alternative Monte Carlo generator (MC@NLO) has no significant influence. The impact of all systematic uncertainties on the charge asymmetry is summarised in table 2. The overall systematic uncertainty, found by adding the individual contributions in quadrature, is ±0.026 after symmetrising positive and negative shifts. Some systematic uncertainties are partially determined by the available data statistics. The main systematic uncertainties, on the lepton selection efficiency and the jet energy scale, will therefore be reduced in measurements performed with a larger amount of data.
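The combination of individual sources in quadrature can be sketched as follows. The symmetrisation convention (mean of the absolute positive and negative shifts) is an assumption, since the text does not state which convention is used, and the example shifts are hypothetical numbers, not the entries of table 2.

```python
import math


def symmetrise(shift_up, shift_down):
    """Assumed convention: mean of the absolute up and down shifts."""
    return 0.5 * (abs(shift_up) + abs(shift_down))


def total_systematic(shifts):
    """Add the symmetrised shifts of independent sources in quadrature.

    shifts maps a source name to a (shift_up, shift_down) pair on A_C.
    """
    return math.sqrt(sum(symmetrise(up, down) ** 2
                         for up, down in shifts.values()))


# Hypothetical shifts on A_C, for illustration only.
example = {
    "jet energy scale": (0.003, -0.005),
    "lepton efficiency": (0.002, -0.004),
}
print(round(total_systematic(example), 4))
```

Adding in quadrature treats the sources as uncorrelated, so the total is dominated by the largest few contributions.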

Conclusion
In this article a method to measure the tt̄ charge asymmetry A_C in proton-proton collisions with the CMS detector has been developed. To access A_C we have performed a differential cross section measurement of the |η_t| − |η_t̄| spectrum in the lepton+jets decay channel. The spectrum of |η_t| − |η_t̄| is corrected for selection and reconstruction inefficiencies using an unfolding technique. From the unfolded spectrum the charge asymmetry in tt̄ production is derived. Its value is measured to be

\[ A_C = 0.060 \pm 0.134\,(\mathrm{stat.}) \pm 0.026\,(\mathrm{syst.}) , \tag{6} \]

which is consistent with the Standard Model prediction. This is the first measurement of the charge asymmetry in top quark pair production performed in proton-proton collisions. The result is still dominated by the statistical uncertainty, but our analysis shows that CMS has the ability to measure this quantity and to add complementary information to the charge asymmetry measurements performed at the Tevatron. Applying the presented analysis to a larger data set corresponding to an integrated luminosity of 1 fb⁻¹, the measurement of the top quark charge asymmetry can reach the same sensitivity as the Tevatron results. With a larger data set it will also be possible to perform this measurement explicitly in high M_tt̄ regions, where the influence of new resonances in tt̄ production is expected to be more significant.

Appendix
In this appendix, additional plots supporting the details of the present measurements are listed.