Review of physics results using jet substructure techniques in LHC Run1

The possible existence of phenomena beyond the standard model of particle physics at the TeV scale implies the production of highly boosted heavy objects, whose decay products tend to be collimated and can be reconstructed as jets with large radius parameter. Jet substructure analysis techniques have been developed and successfully applied to LHC Run1 data to perform cross section measurements in the high transverse momentum regime and searches for new physics phenomena. An overview of selected physics results obtained by ATLAS and CMS using jet substructure techniques is presented.


Introduction
An impressive number of results has been obtained by the experiments at the Large Hadron Collider (LHC) in the last few years. These results confirm the validity of the Standard Model (SM) of particle physics at an unprecedented energy scale, and complete the SM picture with the observation of its last missing piece, the Higgs boson. However, some experimental evidence, such as the existence of dark matter or the neutrino masses, cannot be explained in the context of the SM, which therefore needs to be extended (see for example [1]). Several models of beyond-the-SM (BSM) physics have been proposed, and searches for BSM physics at the LHC have now reached the TeV scale. At this energy scale massive SM particles (the vector bosons W and Z, the top quark t, and the Higgs boson H) are produced with a large Lorentz boost. Their decay products tend to be highly collimated and, in final states with several jets such as the hadronic decays W/Z/H → q q̄ or t → b q q̄ (footnote 1; Fig. 1), can partially overlap in the reconstruction phase. In this regime, event reconstruction techniques relying on objects that are well separated in the detector become inefficient.
The decay products of hadronically decaying boosted objects can be contained in a single jet with a large radius parameter R and a large transverse momentum p_T. The Cambridge-Aachen (C/A), k_T, and anti-k_T jet reconstruction algorithms ([2] and references therein) are used by ATLAS and CMS. Jet substructure analysis techniques have been developed with two aims: to mitigate the effect of pile-up on the large-R jet (grooming techniques, such as trimming [3], pruning [4], and filtering [5]), and to discriminate high-p_T jets originating from the decay of boosted massive objects from those originating from light quarks q or gluons g (jet substructure observables and taggers [6]). All these techniques are an active field of research on both the experimental and theoretical sides (see for example [7]) and are in continuous development.
Footnote a: e-mail: matteo.negrini@bo.infn.it
Footnote 1: Charge conjugate modes are implicitly included throughout the text.
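The three jet algorithms named above belong to the same family of sequential recombination algorithms, differing only in the power p of the transverse momentum that enters the distance measure (p = 1 for k_T, p = 0 for C/A, p = -1 for anti-k_T). The following is a minimal, unoptimized sketch of inclusive clustering, assuming four-vector (E-scheme) recombination and distances d_ij = min(p_Ti^2p, p_Tj^2p) ΔR_ij²/R² and d_iB = p_Ti^2p. The function and field names are illustrative and do not correspond to the software used by ATLAS or CMS.

```python
import math

def pseudojet(px, py, pz, e):
    """Build a pseudo-jet record from a four-momentum (GeV)."""
    return {"px": px, "py": py, "pz": pz, "e": e,
            "pt": math.hypot(px, py),
            "y": 0.5 * math.log((e + pz) / (e - pz)),  # rapidity
            "phi": math.atan2(py, px)}

def from_pt_eta_phi(pt, eta, phi):
    """Massless particle from (pT, eta, phi); illustrative helper."""
    return pseudojet(pt * math.cos(phi), pt * math.sin(phi),
                     pt * math.sinh(eta), pt * math.cosh(eta))

def delta_r2(a, b):
    """Squared distance in the (rapidity, phi) plane."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a["y"] - b["y"]) ** 2 + dphi ** 2

def cluster(particles, radius, p):
    """Inclusive sequential recombination: p=1 kT, p=0 C/A, p=-1 anti-kT."""
    pseudo = list(particles)
    jets = []
    while pseudo:
        best = None  # (distance, i, j); j is None for a beam distance
        for i, a in enumerate(pseudo):
            d_ib = a["pt"] ** (2 * p)
            if best is None or d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(pseudo)):
                b = pseudo[j]
                d_ij = (min(a["pt"] ** (2 * p), b["pt"] ** (2 * p))
                        * delta_r2(a, b) / radius ** 2)
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            # Smallest distance is to the beam: promote to a final jet.
            jets.append(pseudo.pop(i))
        else:
            # Merge the closest pair by summing four-momenta (j > i).
            b = pseudo.pop(j)
            a = pseudo.pop(i)
            pseudo.append(pseudojet(a["px"] + b["px"], a["py"] + b["py"],
                                    a["pz"] + b["pz"], a["e"] + b["e"]))
    return sorted(jets, key=lambda jet: jet["pt"], reverse=True)

particles = [from_pt_eta_phi(100.0, 0.0, 0.0),
             from_pt_eta_phi(50.0, 0.1, 0.1),
             from_pt_eta_phi(80.0, 0.0, 3.0)]
anti_kt_jets = cluster(particles, radius=0.4, p=-1)
```

In this toy input the two nearby particles are merged into one jet and the third forms a jet of its own, for all three values of p. Production codes use far faster implementations of the same distance measure.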
Jet substructure variables are reconstructed starting from the jet constituents (topological clusters, tracks or sub-jets) and can be computed with or without the application of grooming techniques. Commonly used examples are: the jet mass $m_{\mathrm{jet}}$, defined as the invariant mass of the jet constituents; the splitting scale (footnote 2) $\sqrt{d_{12}} = \min(p_{T1}, p_{T2})\,\Delta R_{12}$, corresponding to the k_T distance between the two proto-jets (indicated with the indices 1 and 2) in the last step of the jet clustering [8]; the momentum balance $\sqrt{y_f} = \min(p_{T1}, p_{T2})\,\Delta R_{12}/m_{\mathrm{jet}}$, which is the ratio between the splitting scale and the jet mass; the mass drop $\mu_{12} = \max(m_1, m_2)/m_{\mathrm{jet}}$, which is the fraction of the jet mass carried by the most massive proto-jet; the N-subjettiness $\tau_N$, quantifying to what degree the jet substructure resembles that of a jet with N or fewer sub-jets [9]; and the ratios $\tau_{ij} = \tau_i/\tau_j$.
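As an illustration of the definitions above, the following sketch computes the jet mass, the splitting scale, the momentum balance and the mass drop from the four-momenta of the two proto-jets of the last clustering step. It is a minimal example under stated assumptions: four-momenta are (px, py, pz, E) tuples in GeV, ΔR is built from the pseudorapidity as in the text's coordinate footnote (clustering codes commonly use the rapidity instead), and all function names are hypothetical. N-subjettiness is not covered.

```python
import math

def pt(v):
    return math.hypot(v[0], v[1])

def mass(v):
    m2 = v[3] ** 2 - v[0] ** 2 - v[1] ** 2 - v[2] ** 2
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero

def eta(v):
    # Equivalent to -ln tan(theta/2); undefined for momenta along the beam.
    p = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return 0.5 * math.log((p + v[2]) / (p - v[2]))

def delta_r(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta(a) - eta(b), dphi)

def substructure(proto1, proto2):
    """Variables built from the two proto-jets of the last clustering step."""
    jet = tuple(x + y for x, y in zip(proto1, proto2))
    m_jet = mass(jet)
    sqrt_d12 = min(pt(proto1), pt(proto2)) * delta_r(proto1, proto2)
    return {
        "m_jet": m_jet,
        "sqrt_d12": sqrt_d12,
        "sqrt_yf": sqrt_d12 / m_jet,
        "mu12": max(mass(proto1), mass(proto2)) / m_jet,
    }

# Two massless proto-jets of 100 GeV each, separated by 0.8 in phi.
p1 = (100.0 * math.cos(0.4), 100.0 * math.sin(0.4), 0.0, 100.0)
p2 = (100.0 * math.cos(0.4), -100.0 * math.sin(0.4), 0.0, 100.0)
variables = substructure(p1, p2)
```

For this symmetric two-prong configuration the splitting scale is of the same order as the jet mass and the mass drop is close to zero, which is the pattern expected for a two-body decay rather than for a typical q/g jet.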
Taggers are more sophisticated algorithms that test the compatibility of a jet with specific scenarios of interest [10]. Several techniques have been proposed to tag boosted top quarks, vector bosons, and Higgs bosons; the performance of some of them has been studied and compared by ATLAS [11,12] and CMS [13,14].
In the following, a selection of physics results obtained by the ATLAS and CMS experiments using jet substructure techniques on LHC Run1 pp collision data is discussed.
Footnote 2: ATLAS and CMS use a right-handed coordinate system with its origin at the nominal interaction point in the center of the detector, the x-axis pointing to the center of the LHC ring, the y-axis pointing upwards, and the z-axis directed along the beam axis. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle θ as $\eta = -\ln\tan(\theta/2)$. The spatial separation between two objects is defined as $\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2}$.
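The pseudorapidity and ΔR definitions of the coordinate footnote can be written as a few lines of code. This is a minimal sketch with illustrative function names; the only ingredient added to the formulas in the text is the wrapping of Δφ into the interval (-π, π], which is needed because φ is a periodic angle.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def delta_r(eta1, phi1, eta2, phi2):
    """Separation Delta R = sqrt(Delta phi^2 + Delta eta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

# Two objects on either side of the phi = +-pi boundary are close in Delta R.
separation = delta_r(0.0, 3.0, 0.0, -3.0)
```

Without the wrapping, the two objects in the usage line would appear to be separated by 6 units in φ instead of 2π - 6.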

Measurement of boosted vector boson production
The fiducial cross section for the production of high-p_T vector bosons V (where V is either a W or a Z) is measured by ATLAS by reconstructing the hadronic decay V → q q̄ in a single anti-k_T jet with R = 0.6, p_T > 320 GeV and |η| < 1.9 [15]. Signal is discriminated from background by exploiting jet substructure topological variables computed in the jet rest frame: V → q q̄ jets tend to assume a "back-to-back" topology, while jets from the multi-jet QCD background are more isotropically distributed. The jet sphericity, aplanarity and thrust minor are computed using the jet constituents in the centre-of-mass frame of the jet, and combined in a likelihood discriminator. This technique allows the observation of a V → q q̄ peak in the jet mass distribution over a background mainly due to multi-jet QCD events and t t̄ production. The multi-jet QCD background presents a shoulder structure, originating from the choice of the jet reconstruction radius, the p_T and likelihood requirements, and the internal kinematic structure of the selected jets. This structure is well reproduced in the multi-jet QCD MC simulation, obtained with PYTHIA [16] with different tunings and with POWHEG [17] interfaced with both HERWIG++ [18] and PYTHIA for the parton shower and hadronization. The cross section for events with a hadronically decaying W or Z in the fiducial region p_T > 320 GeV and |η| < 1.9 is measured to be σ_W+Z = 8.5 ± 1.7 pb [15], in agreement with the W/Z+jets production cross section computed at next-to-leading order.
The selected event sample, before the likelihood discriminator selection, is enriched in hadronically decaying W/Z bosons at high p_T and is used to test the performance of grooming techniques. Several jet grooming techniques are applied to all jets selected for the cross section measurement, using the default ATLAS grooming parameter settings, that is, without attempting a specific optimization for this analysis. Grooming removes some of the constituents from the original jet; the likelihood discriminator is recomputed after grooming and a cut is applied to obtain the same background rejection as in the nominal analysis. The number of selected jets is reduced by approximately 50% with respect to the ungroomed case for pruning and trimming, and by 15% for area subtraction, and the multi-jet QCD background shows the same shoulder structure observed in the ungroomed nominal analysis. The invariant mass distribution of groomed jets shows a significantly reduced sensitivity to varying pile-up conditions. The statistical significance of the W + Z signal remains approximately constant before and after grooming.

VV and qV resonance searches
Several models of BSM physics predict the existence of massive resonances decaying to high-p_T vector bosons, such as Randall-Sundrum or Kaluza-Klein gravitons (G_RS, G_KK → WW, ZZ), a heavy partner of the W boson (W′ → WZ), or excited quark resonances (q* → qV). For large masses m_G, m_W′ or m_q*, the V → q q̄ decay products merge into a single large-R jet, and the final state turns into a di-jet topology. The discrimination of V jets from q/g jets is obtained by ATLAS [12] and CMS [14] using jet substructure techniques.
A V-jet tagging technique based on the jet mass computed after the pruning procedure and on the N-subjettiness ratio τ_21 (Fig. 2) is applied by CMS in a search for massive resonances in the di-jet final state [19]. Monte Carlo (MC) generators are used to simulate the signals of interest: HERWIG++ is used for Randall-Sundrum graviton production, JHUGEN [20] interfaced to PYTHIA for Kaluza-Klein gravitons, and PYTHIA for W′ and excited quark resonance production.
The di-jet channel allows VV resonances to be reconstructed when both vector bosons in the final state decay hadronically. At least two jets with p_T > 30 GeV and |η| < 2.5 must be present, with a pseudorapidity separation |Δη| < 1.3 to reduce the multi-jet background. A tagging algorithm based on pruning is applied to the two highest-p_T jets, reconstructed using the C/A R = 0.8 algorithm. Events with two tagged jets are used to test G and W′ production models, while events with a single tagged jet are used to search for q*. High- and low-purity W/Z samples are selected by applying different cuts on τ_21 (τ_21 < 0.5 and 0.5 < τ_21 < 0.75, respectively). Events with two tagged jets are required to have at least one high-purity tag, and the event is classified according to the purity of the second jet.
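The event classification described above can be sketched as a small decision function. The τ_21 boundaries and the |Δη| requirement are the ones quoted in the text; the pruned-mass window is a hypothetical placeholder, since the text does not quote the values used, and the `Jet` record, the category labels and the treatment of single-tag events are illustrative assumptions rather than the CMS implementation.

```python
from dataclasses import dataclass

# Hypothetical pruned-mass window in GeV; not quoted in the text.
MASS_WINDOW = (70.0, 100.0)
TAU21_HIGH_PURITY = 0.5    # tau21 < 0.5: high purity (from the text)
TAU21_LOW_PURITY = 0.75    # 0.5 < tau21 < 0.75: low purity (from the text)
MAX_DELTA_ETA = 1.3        # |Delta eta| requirement (from the text)

@dataclass
class Jet:
    pt: float
    eta: float
    pruned_mass: float
    tau21: float

def v_tag(jet):
    """Return 'HP', 'LP' or None for a single jet."""
    low, high = MASS_WINDOW
    if not (low < jet.pruned_mass < high):
        return None
    if jet.tau21 < TAU21_HIGH_PURITY:
        return "HP"
    if jet.tau21 < TAU21_LOW_PURITY:
        return "LP"
    return None

def classify_dijet_event(jets):
    """Classify an event from the tags of its two highest-pT jets."""
    if len(jets) < 2:
        return None
    j1, j2 = sorted(jets, key=lambda j: j.pt, reverse=True)[:2]
    if abs(j1.eta - j2.eta) >= MAX_DELTA_ETA:
        return None
    tags = [v_tag(j1), v_tag(j2)]
    n_tagged = sum(tag is not None for tag in tags)
    if n_tagged == 1:
        return "single-tag"  # used for the q* search
    if n_tagged == 2 and "HP" in tags:
        # At least one HP tag; the second jet decides the category.
        return "double-tag HP" if tags.count("HP") == 2 else "double-tag LP"
    return None

event = [Jet(600.0, 0.2, 85.0, 0.3), Jet(550.0, -0.5, 80.0, 0.6)]
category = classify_dijet_event(event)
```

Events with two low-purity tags are rejected in this sketch, following the requirement of at least one high-purity tag.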
Since a large hadronic background of QCD multi-jet events is present at hadron colliders, the presence of a leptonic decay in the event (W → l ν_l or Z → l⁺l⁻, where l = e, μ) can be effectively exploited to reduce the combinatorial background. The same jet tagging technique used in the di-jet channel analysis is applied by CMS for the selection of the hadronic V candidate in the lepton+jets channel, in which one of the two vector bosons decays leptonically and the other hadronically. Using 20 fb⁻¹ of pp collision data at √s = 8 TeV, the sensitivity to G production in this channel is still limited, so exclusion limits are obtained by interpreting the results with a model-independent approach, for states with varying mass and width [21].
ZV resonances have been searched for by ATLAS in the lepton+jets final state using 20 fb⁻¹ of pp collision data at √s = 8 TeV [22]. In this analysis the event selection is optimized in three different kinematic regions: low-p_T resolved, high-p_T resolved, and high-p_T merged. The high-p_T merged regime is the one with the highest sensitivity in the high mass region (see Fig. 3). The decay products of boosted vector bosons are reconstructed in a single C/A R = 1.2 jet, with p_T > 100 GeV and within the fiducial region |η| < 1.2. Tagging is performed by a splitting and filtering algorithm optimized for the detection of highly boosted boson decays. The results are interpreted on the basis of the W′ and G_KK models, implemented in PYTHIA and CALCHEP [23], respectively.
XLIV International Symposium on Multiparticle Dynamics (ISMD 2014)
Using LHC Run1 data, no evidence of resonant production above the SM background has been observed; limits on the masses of resonant objects, extracted using boosted object reconstruction techniques, are summarized in Tab. 1.
Table 1. Mass limits at 95% C.L. on resonant VV and qV production at the LHC, obtained using boosted object reconstruction techniques on Run1 data.

W′ → tb searches
Effective models describing the W′ coupling to fermions can be assumed in order to interpret experimental searches. W′ → t b̄ searches are complementary to W′ → l ν ones and can potentially test different BSM physics scenarios, with the advantage that, when the top quark decays hadronically, the final state can be fully reconstructed, making it possible to search for a peak in the m_tb spectrum. Some BSM theories predict larger couplings of new physics to the third generation than to the first or second, and from the experimental point of view the presence of b quarks in the final state provides a useful handle for the rejection of the leading background contribution, represented by multi-jet QCD events. These considerations justify the choice of the t b̄ final state for W′ searches. At large m_tb the top quark is produced with a large Lorentz boost and its decay products can be reconstructed in a single large-R jet, so that jet substructure analysis can be applied to improve the experimental sensitivity.
The analysis performed at ATLAS reconstructs boosted hadronic top candidates using anti-k_T R = 1.0 jets. Top jet candidates are discriminated from q/g-initiated jets through a cut-based selection on jet substructure variables: the splitting scale √d_12 and the two N-subjettiness ratios τ_32 and τ_21 [24].
At CMS, jets are reconstructed using the C/A R = 0.8 algorithm. The original jet is decomposed into sub-jets and, when at least three sub-jets are found, it is tagged as a top candidate based on the jet mass, on a minimum pairwise mass of the three leading sub-jets larger than 50 GeV (compatible with the W boson mass), and on τ_32 [25].
Both analyses adopt data-driven techniques to estimate the QCD multi-jet background contribution in the final sample of selected events, by means of control regions obtained by inverting the b-tagging or jet substructure requirements. The sub-leading t t̄ background contribution is estimated with MC simulations. Observations are compatible with the SM background; the derived limits on the mass of right- and left-handed W′ states are summarized in Tab. 2.
Top quarks originating from the decay of new particles with masses in the TeV region are naturally produced at high p_T, and boosted object reconstruction techniques are mandatory to increase the signal efficiency and the overall experimental sensitivity to possible new physics effects.
The lepton+jets channel is employed to reconstruct events where one of the two top quarks decays leptonically (t → b l ν_l, where l = e, μ) and the other hadronically. The presence of an isolated lepton provides a handle for the rejection of the huge hadronic background present at the LHC. These events are also characterized by sizable missing transverse energy (from the escaping neutrino) and by two jets originating from b quarks. The background remaining after a typical lepton+jets event selection is dominated by SM t t̄ production, which can be estimated through MC simulation; minor contributions come from W/Z+jets, multi-jet QCD, diboson, and single top production.
At high p_T the decay products of hadronically decaying top quarks can be collected in a single large-R jet, typically with a large spatial separation from the leptonic top candidate, as required by a back-to-back event topology. Resolved reconstruction techniques are best suited to the t t̄ mass region up to the TeV scale; above it, boosted object reconstruction techniques improve the experimental sensitivity and therefore yield more stringent limits on the production of BSM resonances.
Hadronically decaying top quarks in the boosted regime are reconstructed by ATLAS using anti-k_T R = 1.0 jets after the application of a trimming procedure [26]. Large-R jets originating from hadronic top decays at high p_T (p_T > 300 GeV) are selected through requirements on the jet mass (m_jet > 100 GeV) and on the splitting scale (√d_12 > 40 GeV). Spatial separation of the large-R jet from the leptonic top candidate and b-tagging are also used to improve the background rejection. Fig. 4
hadronically. This final state has a larger cross section than the lepton+jets one, but the huge multi-jet QCD production cross section at the LHC naturally produces a large contamination in the final sample of selected events, and SM t t̄ production becomes a sub-leading background contribution. More sophisticated top tagging techniques are therefore necessary to improve the background rejection and the sensitivity of the analysis in this channel.
Boosted top quark candidates are identified by CMS in C/A R = 0.8 jets by applying a top tagging technique based on the total jet mass (140 < m_jet < 250 GeV) and on the identification of three sub-jets, with a minimum invariant mass of sub-jet pairs compatible with the W mass (m_pair > 50 GeV) [28]. Events are selected if two jets with a back-to-back topology are tagged.
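The cut-based top tag described above can be sketched as follows. The mass window and the m_pair threshold are the values quoted in the text. Everything else is an assumption made for illustration: sub-jets are given as (px, py, pz, E) tuples in GeV, the jet mass is approximated by the invariant mass of all sub-jets, the pairwise masses are computed among the three highest-p_T sub-jets, and the function names are hypothetical. The sub-jet decomposition itself is not shown.

```python
import math
from itertools import combinations

def inv_mass(vectors):
    """Invariant mass of the sum of (px, py, pz, E) four-momenta."""
    px = sum(v[0] for v in vectors)
    py = sum(v[1] for v in vectors)
    pz = sum(v[2] for v in vectors)
    e = sum(v[3] for v in vectors)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def is_top_tagged(subjets):
    """Apply the jet mass and minimum pairwise mass requirements."""
    if len(subjets) < 3:
        return False
    m_jet = inv_mass(subjets)
    leading = sorted(subjets, key=lambda v: math.hypot(v[0], v[1]),
                     reverse=True)[:3]
    m_pair_min = min(inv_mass(pair) for pair in combinations(leading, 2))
    return 140.0 < m_jet < 250.0 and m_pair_min > 50.0

def three_prong(energy):
    """Three massless sub-jets at 120 degrees in the x-y plane (toy input)."""
    return [(energy * math.cos(a), energy * math.sin(a), 0.0, energy)
            for a in (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)]

tagged = is_top_tagged(three_prong(57.5))  # total mass 172.5 GeV
```

The toy configuration is not boosted, but invariant masses do not depend on the boost, so it is sufficient to exercise the two mass requirements.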
The multi-jet background is estimated by means of data-driven techniques, using a control region containing events with at least two high-p_T jets (p_T > 400 GeV). A first background-enriched sample is selected by requiring that one of the two leading jets (randomly chosen) satisfies the m_jet requirement but has m_pair < 30 GeV. The mis-tag rate is determined on this sample by applying the top tagging algorithm to the other jet. A second sample is defined by requiring that one of the two leading jets (randomly chosen) satisfies the full top tagging requirements. The multi-jet contribution is determined by applying the mis-tag rate to the second jet of these events.
ATLAS performed a similar analysis for the full-hadronic channel, in which two different tagging algorithms, the HEPTopTagger [29] and the Top Template Tagger [30], have been compared [31].
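The two-step procedure for the CMS multi-jet estimate can be illustrated with a short sketch: a mis-tag rate is measured on probe jets in the background-enriched sample, and each single-tagged event is then weighted by the mis-tag rate of its second jet. The binning of the rate in jet p_T and the bin edges are assumptions made for this example (the text does not specify how the rate is parametrized), and the function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical pT bin edges in GeV; the last bin is open-ended.
PT_EDGES = [400.0, 500.0, 600.0, 800.0]

def pt_bin(pt):
    """Index of the pT bin containing pt (values below 400 go to bin 0)."""
    return max(bisect_right(PT_EDGES, pt) - 1, 0)

def measure_mistag_rate(probe_jets):
    """probe_jets: (pt, is_tagged) pairs for the probe jet of each event
    in the background-enriched (anti-tagged) sample."""
    n_all = [0] * len(PT_EDGES)
    n_tagged = [0] * len(PT_EDGES)
    for pt, is_tagged in probe_jets:
        index = pt_bin(pt)
        n_all[index] += 1
        n_tagged[index] += bool(is_tagged)
    return [t / a if a else 0.0 for t, a in zip(n_tagged, n_all)]

def predict_multijet(second_jet_pts, rates):
    """Expected multi-jet yield: each single-tagged event is weighted by
    the mis-tag rate evaluated for its second (untested) jet."""
    return sum(rates[pt_bin(pt)] for pt in second_jet_pts)

probes = [(450.0, True), (450.0, False), (450.0, False), (450.0, False),
          (700.0, True), (700.0, False)]
rates = measure_mistag_rate(probes)
expected_background = predict_multijet([420.0, 480.0, 650.0], rates)
```

With these toy numbers the rate is 0.25 in the first bin and 0.5 in the third, so the three single-tagged events contribute 0.25 + 0.25 + 0.5 expected multi-jet events.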
Since no excess with respect to the estimated backgrounds has been observed by ATLAS and CMS, limits on leptophobic Z′ and g_KK production are determined, as summarized in Tab. 3. The interplay of resolved and boosted reconstruction techniques in the extraction of t t̄ resonance mass limits is summarized in Fig. 5 [32], which clearly shows the leading role of boosted object reconstruction techniques for the experimental sensitivity in the high t t̄ mass region.
Table 3. Mass limits at 95% C.L. on narrow Z′ and g_KK production at the LHC, obtained using boosted object reconstruction techniques on Run1 data. For the ATLAS full-hadronic analysis the upper limit of the exclusion region is reported; some lower mass regions remain in principle not excluded by this analysis.
Figure 5. Comparison of the expected t t̄ resonance mass limits for the resolved, the boosted lepton+jets, and the boosted full-hadronic selections for (a) narrow Z′ and (b) g_KK [32].
channel [33]. The full-hadronic decay mode is characterized by the presence of b quarks in the final state, originating from both the top quark and the Higgs boson (branching ratios BR(t → b q q̄) = 0.66 and BR(H → b b̄) = 0.56). Since a hypothetical T can have a very large mass, the t and H could be produced with a large Lorentz boost and their decay products contained in two large-R jets, reconstructed using the C/A R = 1.5 algorithm. Events with at least two such jets are selected, and jet substructure techniques are applied to separate possible resonant tH production from the huge multi-jet QCD background inevitably present in the analysis of full-hadronic channels.
Two tagging techniques are applied: the HEPTopTagger is used to identify hadronic top candidates, and a Higgs tagger based on the b-tagging of sub-jets is used to identify Higgs candidates. Two additional variables are used to further improve the signal significance after the event selection: the scalar sum H_T of the transverse momenta of the selected objects and the invariant mass m_H of the two b-tagged sub-jets, which are combined in a likelihood discriminator.
Signal T T̄ samples are simulated using MADGRAPH [34] interfaced to PYTHIA. Since the t and H tagging algorithms are uncorrelated, the leading background contribution, from QCD multi-jet events, can be determined directly from data with a matrix method, by inverting the tagging selections. The signal region contains all events in which both the t and the H jet candidates are tagged, and three sideband regions are defined by inverting either one or both of the tagger selections. From the ratios of the numbers of events in these regions it is possible to extract the number of QCD multi-jet events in the signal region. The technique has been checked to yield consistent results when applied to simulated events.
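When two selections are uncorrelated for the background, the background yield in the region passing both can be predicted from the three sidebands as N_B N_C / N_D, where B and C are the regions with only one of the two tags and D is the region with neither. This is often called the ABCD method. The sketch below shows the arithmetic with a simple Poisson error propagation; the region labels, the function name and the event counts are illustrative, and corrections such as the subtraction of other backgrounds from the sidebands are not included.

```python
import math

def abcd_estimate(n_b, n_c, n_d):
    """Predict the background in the signal region (both tags passed).

    n_b: events with the t tag passed and the H tag inverted
    n_c: events with the t tag inverted and the H tag passed
    n_d: events with both tags inverted
    Returns (estimate, statistical uncertainty), assuming the two tags
    are uncorrelated and the counts are Poisson distributed.
    """
    if min(n_b, n_c, n_d) <= 0:
        raise ValueError("all sideband counts must be positive")
    estimate = n_b * n_c / n_d
    relative_uncertainty = math.sqrt(1.0 / n_b + 1.0 / n_c + 1.0 / n_d)
    return estimate, estimate * relative_uncertainty

# Illustrative sideband counts, not taken from the analysis.
n_qcd, n_qcd_unc = abcd_estimate(200, 150, 3000)
```

The closure check mentioned in the text corresponds to running the same calculation on simulated multi-jet events and comparing the prediction with the simulated yield in the signal region.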
After adding the t t̄ contribution to the background, derived from simulation, no excess is observed. The existence of T states in the mass region m_T < 747 GeV is excluded (95% CL) assuming a branching ratio BR(T → tH) = 100%. Lower m_T values are still allowed by this analysis for smaller values of BR(T → tH); other analyses tailored to different final states will help in completing the picture.

Conclusion
Techniques for jet substructure analysis make it possible to reconstruct hadronic decays of massive objects in the boosted regime. These techniques have been successfully applied in several measurements and BSM physics searches by ATLAS and CMS on LHC Run1 data. Jet substructure analysis techniques yield results that are complementary to the ones obtained with resolved reconstruction techniques, and their higher efficiency in the high-p_T regime usually makes it possible to push further the exclusion limits on the masses of BSM objects.
Jet substructure techniques are undergoing very active development, with a fruitful interplay between theorists and experimentalists. They have been extensively tested, validated and applied to LHC Run1 data analyses. Although no evidence of BSM physics has emerged, they have been shown to provide accurate descriptions of the SM backgrounds. These techniques will be crucial for the LHC Run2 physics program [35], especially in the environment of increased energy and pile-up that will represent a new experimental challenge.

Figure 1. Schematic representation of the reconstruction of a hadronically decaying top quark in different kinematic regimes: (a) low-p_T top quarks can be reconstructed by combining three jets that are spatially separated in the detector; (b) the decay products of high-p_T top quarks tend to be collimated and the resulting jets may partially overlap in the reconstruction phase; (c) the decay products of high-p_T top quarks can be contained in a single large-R jet and discriminated from jets originating from high-p_T light quarks and gluons through the analysis of the jet substructure.

Figure 2. Distributions of (a) m_jet after pruning and (b) τ_21 in data (before tagging) and in G, W′, and QCD multi-jet MC [19].

Figure 3. Product of signal acceptance and efficiency for the process G → ZZ → e⁺e⁻ q q̄. The contributions of the three event selections adopted in ATLAS are shown: low-p_T resolved (red line), high-p_T resolved (blue line), and high-p_T merged (green line) [22].

Figure 4. Data/MC comparison, including background contributions, for some quantities related to the large-R jet corresponding to the hadronic top candidate [26]: (a) m_jet, (b) √d_12, (c) p_T. Plots are for the μ+jets channel.

Table 2. Mass limits at 95% C.L. on right- and left-handed W′ → t b̄ production at the LHC, obtained using boosted object reconstruction techniques on Run1 data.
Some extensions of the SM predict new particles with large couplings to the top quark, so the observation of resonant t t̄ production at the LHC would be a clear indication of new physics. In particular, massive resonances like a leptophobic Z′ or Kaluza-Klein gluons g_KK could be observed in the t t̄ final state.
show the data/MC comparison for m_jet, √d_12 and p_T, where a generally good agreement is observed for all quantities. The results of a t t̄ resonance search in the lepton+jets channel by CMS are reported in [27], and summarized in Tab. 3 together with the ones by ATLAS.
t t̄ resonances are also searched for in the full-hadronic final state, corresponding to events with both t t̄ decaying
EPJ Web of Conferences 09001-p.4