Parton Distribution Functions and the role of forward region data

In the era of searches for new discoveries at the LHC, it is crucial to achieve a higher level of precision in understanding the proton structure, which will allow unambiguous interpretation of the high-energy, high-luminosity data ahead. The knowledge of the proton's constituents comes mainly from the deep inelastic scattering data at HERA, complemented by measurements from fixed-target experiments, the Tevatron, and now increasingly precise data from the LHC. A road-map of the most recent developments from past and present experiments sensitive to the proton constituents is presented here, with an emphasis on recent results using LHC data from the forward region.


Motivation
Particle physics is in a most exciting phase, in which many crucial questions that point to deficiencies of the Standard Model (SM) need to be answered. These questions can be addressed by the long-term plan of the LHC, which runs in different stages to achieve ever higher energies and luminosity. This poses new challenges for the experimental set-up and for the understanding of the quark-gluon dynamics of the incoming beam protons in the search for new particles or new interaction effects. The first goal is to facilitate the discovery of new physics by comparing measurements with expectations, driven either by the current SM hypothesis or by various beyond-SM scenarios. Precision is crucial to control the dominant uncertainties in order to observe and later analyse possible deviations from expectations. On the experimental side, measurements are now available at sub-percent precision (e.g. W, Z at 7 TeV, Z p_T at 8 TeV), while the last decade has seen enormous progress in achieving percent-level precision in the theory calculations. Many of these are now available up to next-to-next-to-leading order (NNLO) in the truncated perturbative series in the strong coupling. In the following, an overview is presented of the progress on the current limiting factor on precision, which arises from the limited knowledge of the proton's constituents.

Figure 1. A schematic illustration of a PDF extraction mechanism as used by the xFitter [4] platform is shown on the left hand side. The kinematic coverage in x versus scale of experimental data [10] sensitive to PDFs is shown on the right hand side.

determine them, e.g. the valence quarks peak at high x, while the gluon and sea quarks dominate at low x.
The PDFs cannot be calculated directly in QCD; however, their evolution with the scale is predicted by perturbative QCD via evolution equations [1,2]. The extraction of PDFs relies on the interplay between the precision of the data and that of the theory. It invokes the factorisation concept, in which the hadronic cross section is viewed as a convolution of calculable partonic cross sections (the short-distance process) with a non-calculable part (the long-distance process).
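For reference, the factorisation and evolution statements above are commonly written as follows. This is the standard collinear form at leading power, quoted as an assumption about the notation the text has in mind: the symbols f_i (PDF of parton flavour i), \hat{\sigma}_{ij} (partonic cross section), P_{ij} (splitting functions), and \mu_F, \mu_R (factorisation and renormalisation scales) do not appear in the original.

```latex
% Collinear factorisation of a hadronic cross section (leading power)
\sigma_{pp \to X} = \sum_{i,j} \int_0^1 dx_1 \int_0^1 dx_2 \;
    f_i(x_1,\mu_F^2)\, f_j(x_2,\mu_F^2)\;
    \hat{\sigma}_{ij \to X}(x_1, x_2, \mu_R^2, \mu_F^2)

% Scale evolution of the PDFs (DGLAP form)
\frac{\partial f_i(x,\mu^2)}{\partial \ln \mu^2}
    = \sum_j \int_x^1 \frac{dz}{z}\,
      P_{ij}\big(z, \alpha_s(\mu^2)\big)\, f_j\!\left(\frac{x}{z}, \mu^2\right)
```

The first line separates the calculable short-distance part \hat{\sigma}_{ij} from the non-calculable long-distance part f_i; the second line is what allows PDFs parametrised at one scale to be compared with data taken at another.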

Methodology in Extracting the PDFs
The extraction of PDFs involves the following general steps: (1) parametrise the PDFs at a starting scale; (2) evolve them to the scale corresponding to each data point; (3) calculate the cross section; (4) compare the calculated cross section with the data via a χ² function; (5) minimise χ² with respect to the PDF parameters. There are many available choices for each of these steps, and a more detailed schematic of the PDF extraction is shown in figure 1. For step (1), multiple functional forms can be used to parametrise the PDFs. For step (2), the PDFs at the starting scale can be evolved with different methodologies, using collinear or k_T-ordered evolution. For step (3), there are various theoretical schemes to account for the heavy-quark masses; since the computations are lengthy, various fast-grid techniques are also available to speed them up. For step (4), the different sources of experimental uncertainty can be treated with either the covariance-matrix or the nuisance-parameter approach. Finally, for step (5), there are different ways to minimise χ² and extract the fit parameters, such as data-driven regularisation methods or the MINUIT package [3].
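The five steps can be illustrated with a deliberately small toy fit. The sketch below is not the xFitter procedure: the functional form x f(x) = A x^B (1-x)^C with C held fixed, the identity "evolution", the use of x f(x) itself as the "cross section", the pseudo-data, and the golden-section minimiser (used here in place of MINUIT) are all assumptions chosen so that the example runs with the Python standard library alone.

```python
import math

C_FIXED = 3.0  # assumed high-x exponent, held fixed to leave a 2-parameter toy fit

def shape(x, b):
    """Step 1: shape of the assumed parametrisation x f(x) = A x^B (1-x)^C."""
    return x**b * (1.0 - x)**C_FIXED

def evolve(value, q2):
    """Step 2 placeholder: a real fit solves the evolution equations here.
    The toy data sit at the starting scale, so nothing is changed."""
    return value

def predict(x, q2, a, b):
    """Step 3 placeholder: the 'cross section' is taken to be x f(x, Q^2) itself."""
    return evolve(a * shape(x, b), q2)

def chi2(data, a, b):
    """Step 4: chi^2 with uncorrelated uncertainties only."""
    return sum(((d - predict(x, q2, a, b)) / s) ** 2 for x, q2, d, s in data)

def best_norm(data, b):
    """chi^2 is quadratic in A for this toy model, so A is solved analytically at fixed B."""
    num = sum(d * shape(x, b) / s**2 for x, q2, d, s in data)
    den = sum(shape(x, b) ** 2 / s**2 for x, q2, d, s in data)
    return num / den

def fit(data, lo=-1.0, hi=1.0, tol=1e-8):
    """Step 5: golden-section search in B, with A profiled out at every trial B."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    profile = lambda b: chi2(data, best_norm(data, b), b)
    while hi - lo > tol:
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if profile(c) < profile(d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    b = 0.5 * (lo + hi)
    return best_norm(data, b), b

# Noise-free pseudo-data (x, Q^2, value, uncertainty) generated with A = 2.0, B = -0.2
# and an assumed 5% uncertainty on every point.
A_TRUE, B_TRUE = 2.0, -0.2
xs = [1e-3, 3e-3, 1e-2, 3e-2, 0.1, 0.3, 0.6]
pseudo_data = [(x, 10.0, A_TRUE * shape(x, B_TRUE), 0.05 * A_TRUE * shape(x, B_TRUE))
               for x in xs]

a_fit, b_fit = fit(pseudo_data)
print(f"A = {a_fit:.4f}, B = {b_fit:.4f}, chi2 = {chi2(pseudo_data, a_fit, b_fit):.2e}")
```

Because the pseudo-data are generated from the model without noise, the fit should recover the input parameters. In a realistic fit, step 2 requires a numerical solution of the evolution equations, step 3 a full cross-section calculation, and step 4 a treatment of correlated uncertainties, so none of the shortcuts taken here carry over.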
Several open-source codes are currently available, starting with xFitter [4], which pioneered the open-source sharing of QCD-fit code, followed by codes from various groups: APFEL [5], ALPOS [6], OPENQCDRAD [7,8] and QCDNUM [9], to mention only the main ones in use.
On the data side, there has been a persistent experimental effort over the last 40 years by both fixed-target and collider experiments around the world to provide accurate data and constraints. The kinematic coverage is illustrated in figure 1. Deep inelastic scattering is the cleanest probe of the proton structure, and HERA has provided its final word on PDF extraction in the form of HERAPDF2.0 [11]. Complementary information is provided by Drell-Yan processes at collider experiments, from the Tevatron and from the abundant data of the LHC.

Current PDF Groups
The extraction of PDFs is subject to many choices, which has led to the formation of different analysis groups extracting global PDFs. The main differences are: the selection of data; the treatment of data (corrections, uncertainties); the theory calculations used for each process (e.g. formalism, automation, assumptions); the parametrisation of the PDFs and the fit methodology; and the treatment of uncertainties (from data to theory). The most current and active PDF sets to date are CT14 [12], MMHT14 [13], NNPDF3.0 [14], HERAPDF2.0 [11], CJ15 [15], ABM12 [16] and JR14 [17], as well as dedicated PDFs produced in studies carried out by the LHC experiments, xFitter and PROSA [19]. Figure 2 summarises the known differences among these PDF groups.
Most of these differences have been addressed by benchmark exercises that assess the true differences between the methodologies of the groups [20,21]. The uncertainty of these benchmarked PDFs (based on general-mass variable-flavour-number schemes) is below 10% in the bulk x region of 10⁻³ to 10⁻¹; outside this region, however, the uncertainties grow considerably, as shown in figure 3 for the gluon distribution. With the increased mass ranges accessible at the LHC for discovery searches, the PDF uncertainties increase considerably (especially for gluon-initiated processes), as illustrated in figure 3. This strongly motivates the need to reduce the uncertainties at high mass.
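How a quoted PDF uncertainty is obtained depends on the representation a group uses. The sketch below shows two common prescriptions under stated assumptions: the sample standard deviation over Monte Carlo replicas, and the symmetric formula for Hessian eigenvector pairs. The function names and all numerical values are illustrative only and are not taken from any of the PDF sets listed above.

```python
import math
import statistics

def replica_uncertainty(values):
    """Monte Carlo representation: central value and spread over replicas."""
    return statistics.mean(values), statistics.stdev(values)

def hessian_symmetric(plus, minus):
    """Hessian representation: symmetric uncertainty from eigenvector pairs,
    delta = 0.5 * sqrt(sum_k (F_k^+ - F_k^-)^2)."""
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))

# Toy values of a PDF (or a cross section) at one fixed x and scale.
toy_replicas = [1.00, 1.10, 0.90, 1.05, 0.95]
central, sigma = replica_uncertainty(toy_replicas)
relative = sigma / central

# Toy values for three eigenvector pairs (plus and minus variations).
toy_plus = [1.03, 1.00, 1.04]
toy_minus = [0.97, 1.00, 0.96]
delta = hessian_symmetric(toy_plus, toy_minus)

print(f"replicas: {central:.3f} +- {sigma:.3f} ({100 * relative:.1f}%)")
print(f"hessian:  +- {delta:.3f}")
```

With real sets the same quantities would be evaluated at every x and scale of interest, which is how uncertainty bands such as the one in figure 3 are built.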
There are also nuclear PDF groups, such as nCTEQ [22], HKN [23], EPS [24] and DSSZ [25]. The analysis of nuclear data to extract nuclear PDFs relies, however, on isospin symmetry and on the assumption that bound-proton PDFs obey the same evolution equations and sum rules as free-proton PDFs. The nCTEQ group has recently analysed W and Z production in proton-lead collisions at the LHC, which has the power to provide an extra constraint on the light u and d quarks [18].

Impact of the LHC measurements
The LHC provides an extended kinematic range in x through its three experiments ATLAS, CMS and LHCb. The LHCb measurements offer an interesting complementarity, since its forward detector design gives access to the low x region. It has been shown [19] that access to the low x region reduces the uncertainties on the gluon and sea quark distributions, as illustrated in figure 4. Moreover, the low x kinematics of LHCb is relevant for neutrino physics: the main background for astrophysical neutrinos is generally the flux of neutrinos from decays of charm mesons produced in cosmic-ray collisions in the atmosphere. Heavy quark production data from the LHC could therefore validate calculations of this prompt neutrino flux.
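The link between a forward detector and low x can be made explicit with leading-order kinematics, in which a state of mass M produced at rapidity y in collisions at centre-of-mass energy √s probes x₁,₂ = (M/√s) e^{±y}. The sketch below evaluates this relation; the function name, the choice of Z boson production, the Z mass value of 91.19 GeV, and the 7 TeV beam energy are assumptions made for illustration.

```python
import math

def parton_x(mass_gev, rapidity, sqrt_s_gev):
    """Leading-order momentum fractions (x1, x2) of the two incoming partons."""
    tau = mass_gev / sqrt_s_gev
    return tau * math.exp(rapidity), tau * math.exp(-rapidity)

M_Z = 91.19      # GeV, assumed Z boson mass
SQRT_S = 7000.0  # GeV, assumed centre-of-mass energy

for y in (0.0, 2.0, 4.0):
    x1, x2 = parton_x(M_Z, y, SQRT_S)
    print(f"y = {y:.1f}: x1 = {x1:.2e}, x2 = {x2:.2e}")
```

At central rapidity both partons carry x of order 10⁻², whereas at y = 4 one parton is at large x and the other at x of a few times 10⁻⁴; lighter final states such as charm and beauty hadrons reach lower x still.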
The LHC data can provide not only PDF discrimination, by confronting theory with data, but also PDF improvement, by using specific processes aimed at constraining PDFs. For the gluon, inclusive jet, dijet and trijet measurements are sensitive to the medium to large x region, and top pair production is an interesting observable at large x. The transverse momentum distribution of the Z boson is particularly interesting, as it can be measured very precisely (below the percent level). For the quarks, the W and Z rapidity spectra are the main channels providing constraints at medium x. The low and high mass Drell-Yan distributions are also interesting measurements. With

Latest developments for PDFs
Fixed-order calculations do not always describe the data. A classic example is the Z transverse momentum distribution [26]. The predictions suffer from the lack of a consistent formalism applicable from small to large scales, and there are efforts to merge fixed-order perturbative calculations with parton showering or soft-gluon resummation approaches.
Transverse momentum dependent (k_T dependent) PDFs, which can be viewed as adding extra dimensions to the PDFs, are introduced for a proper simulation of parton showers [27]. This requires a generalisation of QCD factorisation with an explicit dependence on transverse momentum and polarisation. These distributions obey evolution equations that generalise the ordinary renormalisation group equations of collinear PDFs. Another approach, currently under development, is to extract PDFs using resummed rather than pure fixed-order calculations [28].
At the LHC, QED effects start to make important contributions because of the access to high-scale processes. It therefore becomes relevant to implement a combined QCD and QED evolution in order to extract a complete set of PDFs, including the photon PDF. Active development is ongoing in all of the PDF groups.

Summary
The PDFs are very important, as they still limit our knowledge of cross sections, whether SM or beyond SM. Enormous progress has been achieved in pushing towards percent-level precision in both theory and experimental measurements. On the theory side, there has been a boost from the availability of state-of-the-art NNLO calculations, the inclusion of electroweak effects, and the release of special-case PDFs based on resummed calculations, parton showering and intrinsic charm. This has been complemented by the development of advanced statistical methods (Monte Carlo replicas, reweighting, profiling). On the data side, there is a pressing demand for precision measurements to constrain PDFs, and measurements of clean processes are expected to have a decisive impact on PDFs and on flavour separation. The precise measurements should be presented with de-convoluted correlation information to help cross-calibrate systematic uncertainties. The next big goal of the LHC is to search for hints of new physics beyond the Standard Model, and this can only be achieved if all our free parameters are better controlled.