Opportunities and challenges in neutron crystallography

Neutron and X-ray crystallography are complementary to each other. While X-ray scattering is directly proportional to the number of electrons of an atom, neutrons interact with the atomic nuclei themselves. Neutron crystallography therefore provides an excellent alternative for determining the positions of hydrogens in a biological molecule. In particular, since highly polarized hydrogen atoms (H+) do not have electrons, they cannot be observed by X-rays. Neutron crystallography has its own limitations, mainly due to the inherently low flux of neutron sources and, as a consequence, the need for much larger crystals and for different data collection and analysis strategies. These technical challenges can however be overcome to yield crucial structural insights about protonation states in enzyme catalysis, ligand recognition, as well as the presence of unusual hydrogen bonds in proteins.


Introduction
Although X-ray crystallography has become the workhorse of structural biology, neutron crystallography has several advantages to offer in the structural analysis of biological molecules.
X-rays and neutrons interact differently with matter in general, and with biological macromolecules in particular. The two crystallographic approaches are therefore complementary to each other [1]. While X-ray scattering is directly proportional to the number of electrons of an atom, neutrons interact with the atomic nuclei themselves. In this respect, hydrogen atoms, which represent ~50% of the atoms in proteins and DNA, are hardly visible using X-ray crystallography, while they can be observed in nuclear density maps derived from neutron diffraction data even at moderate resolution (2.5 Å and higher). In fact, less than 5% of the models deposited in the Protein Data Bank (PDB) were obtained from crystals diffracting X-rays to (sub-)atomic resolution, and even in the structures with resolution better than 0.5 Å, a third of the expected hydrogens could still not be experimentally identified [2]. The positions of the more mobile or labile hydrogen atoms, which are typically the most biologically relevant, had to be inferred. Neutron crystallography therefore provides an excellent alternative for determining the positions of hydrogen atoms in a molecular structure. It is even able to identify highly polarized H atoms or protons (H+), which cannot be observed by X-rays as they do not have any electrons.
This neutron specificity arises from the coherent scattering lengths of the two stable isotopes of hydrogen (1H and 2H, the latter also called deuterium, D) being of similar magnitude to those of the other atoms that compose a biological macromolecule (Figure 1). Deuterium is mentioned here as it has chemical properties similar to those of hydrogen, as well as physical properties more favourable for neutron diffraction experiments. H/D isotopic replacement within the crystal (where hydrogen atoms are exchanged for deuterium atoms) is mandatory for experimental success.
Visualization of hydrogen and deuterium atoms can give important insights into the chemistry of the biological system. For example, knowledge of the protonation states of residues within the catalytic site of an enzyme is crucial to understanding the catalytic mechanism involved [3][4][5][6][7][8]. Positioning hydrogen atoms unravels the exact hydrogen-bond network that usually governs ligand or drug binding to biomacromolecules [9][10][11][12][13][14]. The hydration pattern also plays a significant role in the thermodynamics of ligand binding [15]. Water molecule orientations can be directly inferred from neutron diffraction data. Indeed, waters are often present in protein active sites and form an integral component of the catalytic reaction [16,17].
A significant advantage of neutrons is that, since their scattering is non-destructive, data collection can be performed at room temperature. Obtaining a structure free of damage is of great interest for biological systems that are highly sensitive to X-ray radiation damage [18]. For example, redox centres of metalloproteins can be damaged by X-rays even at temperatures as low as 100 K [19,20]. Neutron diffraction has also been used to identify the chemical nature of catalytic intermediates of heme peroxidases [7,8], while dozens of X-ray datasets could not converge onto a unique structural model due to radiation damage of the active site.
While neutron crystallography is a unique technique, it has its own limitations: the small number of available instruments (compared to X-rays), the inherently low flux of neutron sources (orders of magnitude lower than that of an X-ray rotating anode) and, as a consequence, the need for much larger crystals. As of December 2019, only 161 neutron structures were available in the PDB. However, progress in sample preparation (perdeuteration and apparatus for growing large crystals) and in instrumentation has moved the field forward. More than half of all neutron models were deposited within the last 5 years, and most, if not all, now provide important answers to specific biochemical questions, which will ultimately lead to the need to rewrite biochemistry textbooks [6,7].
In this chapter, we will describe the challenges that neutron crystallography needs to face, the instrumentation and the different crystallographic methods used at various sources (reactor versus spallation), the data reduction step and model refinement, and finish with detailed examples of biological questions which can be addressed using neutron crystallography.

Figure 1. Scattering power difference between X-rays (blue circles) and neutrons (yellow and green circles). Please note that the surface of each circle corresponds to the total, i.e. coherent (which contributes to the signal) and incoherent (which contributes to the background), scattering length. While the coherent scattering lengths are in the same range for all elements found in biological molecules, the much larger surface for hydrogen comes from its large incoherent cross section.

Neutron crystallography challenges
The basic principles of neutron diffraction are similar to those of X-ray diffraction. The repeating motif of the crystal, called the unit cell, yields strong constructive interference in certain directions, resulting in a Bragg diffraction spot. In other directions, interference is destructive and no signal is measured. Each Bragg diffraction spot can be assigned a particular (hkl) index (Miller index) indicative of the diffracting plane. The measured intensity of a given hkl plane is determined by the corresponding structure factor F(hkl), which is expressed as:

$$F(hkl) = \sum_j b_{c,j} \, T_j \, \exp[2\pi i (h x_j + k y_j + l z_j)] \qquad (1)$$

The summation is performed over all the atoms within the unit cell. Here, $b_{c,j}$ and $T_j$ represent the coherent scattering length and the temperature factor of atom j, located at fractional coordinates $(x_j, y_j, z_j)$ [21]. In the case of X-rays, the formula is equivalent, with the form factor $f_j$ in place of the coherent scattering length. The goal of a diffraction experiment is to position the atoms in the unit cell. This is done through the computation of the nuclear density, which is the Fourier transform of the structure factor F(hkl):

$$\rho(xyz) = \frac{1}{V_{cell}} \sum_{hkl} F(hkl) \, \exp[-2\pi i (h x + k y + l z)] \qquad (2)$$

where $\rho(xyz)$ is the nuclear density at position (x,y,z) in the crystal unit cell and $V_{cell}$ its volume. The structure factor is a complex number, described by its amplitude $|F(hkl)|$, which measures the magnitude of the diffracted wave, and by the phase $\alpha(hkl)$ of the diffracted wave:

$$F(hkl) = |F(hkl)| \, \exp[i \alpha(hkl)] \qquad (3)$$

The intensity measured in a neutron diffraction experiment is proportional to the square of the structure factor amplitude, while the phase $\alpha$ remains unknown from such an experiment. This so-called "phase problem" will be discussed later on. For now, we focus on the measured intensity and its related experimental parameters. The Bragg diffraction intensity of a spot with indices hkl is, in a neutron diffraction experiment:

$$I(hkl) = \phi(\lambda) \, \frac{\lambda^4}{2 \sin^2\theta} \, \frac{V_{crystal}}{V_{cell}^2} \, |F(hkl)|^2 \qquad (4)$$

where $\phi(\lambda)$ is the differential neutron flux at the sample, $|F(hkl)|$ is the structure factor amplitude of the reflection with scattering angle $2\theta$, and $V_{crystal}$ and $V_{cell}$ are the crystal and unit cell volumes, respectively [22].
Importantly, the intensity of each spot does not directly contain the phase associated with the spot.
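The structure factor sum described above can be illustrated with a minimal Python sketch. The two-atom cell and its fractional coordinates are hypothetical, and the temperature factors are set to 1 for simplicity; the coherent scattering lengths (in fm) are rounded tabulated values.

```python
import cmath
import math

# Coherent neutron scattering lengths in femtometres (rounded tabulated values).
B_COH = {"H": -3.74, "D": 6.67, "C": 6.65, "N": 9.36, "O": 5.80}

def structure_factor(hkl, atoms):
    """Sum b_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)) over all atoms.

    atoms: list of (element, (x, y, z)) with fractional coordinates.
    Temperature factors are ignored (assumed equal to 1) in this sketch.
    """
    h, k, l = hkl
    total = 0j
    for element, (x, y, z) in atoms:
        angle = 2.0 * math.pi * (h * x + k * y + l * z)
        total += B_COH[element] * cmath.exp(1j * angle)
    return total

# Hypothetical toy cell: one carbon at the origin, one deuterium at (0.5, 0, 0).
toy_cell = [("C", (0.0, 0.0, 0.0)), ("D", (0.5, 0.0, 0.0))]

for hkl in [(1, 0, 0), (2, 0, 0)]:
    F = structure_factor(hkl, toy_cell)
    # Only |F|^2 is accessible from the measured intensity; the phase is lost.
    print(hkl, "amplitude:", round(abs(F), 2), "intensity ~", round(abs(F) ** 2, 2))
```

In this toy cell the two atoms scatter almost exactly out of phase for (100) and in phase for (200), which is why the first reflection is nearly extinct while the second is strong.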
While equation (4) shows a real gain in intensity when the wavelength λ is increased, it should be kept in mind that the highest resolution attainable with an incident beam of wavelength λ is λ/2, and that typical covalent bond lengths are shorter than 2 Å. Longer wavelengths also generate a larger blind zone [23], and would therefore require multi-axis goniometry [24]. Because air scattering and absorption are not as problematic with neutrons as they are with X-rays, longer wavelengths can be used for neutron diffraction experiments (from 2 Å up to 10 Å). However, high-resolution Bragg spots are backscattered for wavelengths above 2 Å, and a specific detector geometry needs to be envisaged, with maximal detector coverage to record as many Bragg spot intensities as possible within a single exposure.
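The λ/2 resolution limit and the backscattering of high-resolution spots both follow from Bragg's law, λ = 2d sin θ. The short Python sketch below makes this explicit; the wavelengths and d-spacings are illustrative values only.

```python
import math

def d_min(wavelength, two_theta_max_deg=180.0):
    """Smallest d-spacing (Å) reachable up to a maximum scattering angle 2θ."""
    theta = math.radians(two_theta_max_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))

def scattering_angle_deg(wavelength, d_spacing):
    """Scattering angle 2θ (degrees) from Bragg's law, λ = 2 d sin θ."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d_spacing)))

for wavelength in (1.0, 3.0):
    print(
        f"lambda = {wavelength} Å: best possible resolution {d_min(wavelength):.2f} Å, "
        f"a 2.0 Å reflection appears at 2θ = {scattering_angle_deg(wavelength, 2.0):.1f}°"
    )
```

With a 3 Å beam, a 2 Å reflection is scattered at more than 90° from the incident beam, which is why a detector surrounding the sample is needed.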
By rewriting equation (4), the diffraction intensity from a crystal can be seen as being proportional to:

$$I \propto I_0 \, A \, |F(hkl)|^2 \, \frac{V_{crystal}}{V_{cell}^2} \qquad (5)$$

where $I_0$ is the incident intensity, $A$ is the detector area subtended by the crystal and, as previously, $V_{crystal}$ is the volume of the crystal within the beam and $V_{cell}$ is the volume of the unit cell. The diffracted beam intensity is therefore inversely proportional to the square of the unit cell volume, meaning that the bigger the unit cell, the larger the crystal needs to be. The other values can however be optimized experimentally: by increasing the neutron flux, by increasing the detector area and by using larger crystals.
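As a rough illustration of this scaling, the sketch below estimates how much larger a crystal must be to give the same diffracted intensity when the unit cell volume increases. It assumes that the flux, the detector area and |F|² are all unchanged, which is a simplification; the numerical values are hypothetical.

```python
def relative_intensity(flux, detector_area, crystal_volume, cell_volume, f_squared=1.0):
    """Quantity proportional to the diffracted intensity (arbitrary units)."""
    return flux * detector_area * f_squared * crystal_volume / cell_volume ** 2

def crystal_volume_for_same_intensity(ref_crystal_volume, ref_cell_volume, new_cell_volume):
    """Crystal volume needed to keep the intensity constant for a new unit cell."""
    return ref_crystal_volume * (new_cell_volume / ref_cell_volume) ** 2

# Hypothetical case: a 0.1 mm^3 crystal is sufficient for a 1.0e5 Å^3 unit cell.
needed = crystal_volume_for_same_intensity(0.1, 1.0e5, 2.0e5)
print(f"Doubling the unit cell volume requires about {needed:.1f} mm^3 of crystal")
```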

Neutron instrumentation
The recent development and increased performance of neutron protein crystallography were made possible by improvements in instrumentation, which drastically reduced the overall data collection time and the crystal volume needed for successful neutron diffraction experiments (see Table 1). At reactor neutron sources, both monochromatic and quasi-Laue techniques have been developed, with a preferential use of image plates to detect neutrons. At spallation sources, the Laue time-of-flight method takes advantage of the pulsed nature of the neutron beam.

Weak neutron flux
Nuclear reactors provide a continuous flux of neutrons, which are produced by nuclear fission of 235U. Each fission event produces 3 neutrons on average and releases about 180 MeV, mostly in the form of kinetic energy. It is also possible to produce neutrons via the process of spallation. Accelerated particles (generally protons) are bombarded onto a heavy-metal target and, if the incident particle energy is above a given threshold (typically 5-15 MeV), neutrons are ejected from the heavy-metal nuclei (which are transmuted into another chemical element). Both methods produce neutrons of high kinetic energy, which need to be slowed down to reach wavelengths useful for diffraction experiments. Neutrons are therefore moderated by repeated collisions with a hydrogen-rich liquid (H2O, H2 or CH4). The resultant thermal flux produced by research facilities (steady-state or pulsed sources) is about 10^15 n cm^-2 s^-1, which is several orders of magnitude weaker than available laboratory X-ray sources (Figure 2). Combined with the fact that neutrons diffract very weakly, measurement times therefore range from days to weeks at neutron sources, compared to a few minutes or even seconds at modern synchrotrons.
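The need for moderation can be made concrete with the de Broglie relation, λ = h / sqrt(2 m E). The following Python sketch converts a neutron kinetic energy into a wavelength; the two energies used (2 MeV for a fast neutron, 25 meV for a thermal neutron) are illustrative assumptions rather than values taken from a specific source.

```python
import math

PLANCK = 6.62607015e-34            # Planck constant, J s
NEUTRON_MASS = 1.67492749804e-27   # neutron mass, kg
ELECTRON_VOLT = 1.602176634e-19    # 1 eV in joules

def neutron_wavelength_angstrom(energy_ev):
    """de Broglie wavelength (Å) of a non-relativistic neutron of given kinetic energy (eV)."""
    energy_joule = energy_ev * ELECTRON_VOLT
    momentum = math.sqrt(2.0 * NEUTRON_MASS * energy_joule)
    return PLANCK / momentum * 1e10

print(f"fast neutron (2 MeV):     {neutron_wavelength_angstrom(2e6):.1e} Å")
print(f"thermal neutron (25 meV): {neutron_wavelength_angstrom(0.025):.2f} Å")
```

A fast neutron has a wavelength far shorter than interatomic distances, whereas a moderated (thermal) neutron has a wavelength of the order of an ångström, suitable for diffraction.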
Though the neutron flux is limited, other technical aspects can be optimized for neutron diffraction experiments.

Diffractometers at reactor sources
The continuous neutron flux of reactor sources is an advantage for protein crystallography, as the measured diffracted intensity is directly proportional to the flux (see equation (4)).
At these sources, only the desired neutron wavelengths are selected from the neutron spectrum produced by the reactor (called the white beam) by narrow (monochromatic diffractometers) or wide (quasi-Laue diffractometers) bandpass wavelength filters. Monochromators are single crystals, such as pyrolytic graphite, which diffract a single neutron wavelength from the incident white beam. Wide-bandpass filters are designed to select a wavelength band (~30% for LADI at ILL, for example, using Ni-Ti supermirrors), which is used for the quasi-Laue method. In this case, a large number of reciprocal lattice points are in diffraction condition within a single crystal exposure (see Figure 3). This method also provides large gains in flux relative to monochromatic methods, and reduces background scattering and reflection overlap compared to a Laue method (which uses the full white beam). In such an experimental setup, the crystal remains stationary during the exposure, and is rotated by several degrees between two exposures (5-7 degrees for a 30% wavelength filter). In monochromatic neutron techniques, the wavelength bandpass is only a few percent, and either the crystal is rotated during the exposure (as with X-rays) or it remains stationary. The angular rotation between two exposures is much smaller (1 degree or less). The vast majority of neutron protein diffractometers at reactor sources (Table 1) use image plates as neutron detection systems. These are X-ray image plates which have been doped with a neutron converter (usually gadolinium) mixed with photo-stimulated luminescence materials [32,33].

Importantly, at the usual wavelengths used (2.5 Å and above), diffracted neutrons are backscattered. Consequently, contrary to the planar detectors used in X-ray crystallography, a cylindrical shape has been adopted for neutron detectors. Such a geometry provides high coverage of reciprocal space (> 2π steradian), so that a large number of Bragg reflections can be recorded simultaneously. Although quasi-Laue techniques provide many advantages, reflections start to overlap as the crystal unit cell dimensions increase, because reciprocal lattice points move closer to each other, thereby limiting the unit cell dimensions accessible to these diffractometers.

Figure 3. Ewald sphere construction for monochromatic and quasi-Laue diffraction. The Ewald construction is a geometric representation of the diffraction conditions. For neutrons of wavelength λ, a sphere of radius 1/λ is centered on the crystal. Each point of the reciprocal lattice has three-dimensional coordinates denoted by indices (hkl). The origin of the lattice (0,0,0) is placed at the intersection of the Ewald sphere with the incident beam. In order for diffraction conditions to be satisfied, the individual reciprocal lattice point has to be on the Ewald sphere. Therefore, by rotating the crystal, and consequently the reciprocal lattice, different diffraction conditions appear. In quasi-Laue diffraction, the crystal is exposed to neutrons of different wavelengths. The Ewald sphere is consequently broadened out, and more diffraction conditions are satisfied at one crystal orientation.
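The effect of the wavelength bandpass on the number of stimulated reflections can be sketched numerically from the Ewald construction. For a stationary crystal, a reciprocal lattice vector g lies on the Ewald sphere for exactly one wavelength, λ = -2 (g · s0) / |g|², where s0 is the unit vector along the incident beam. The Python sketch below counts the lattice points whose stimulating wavelength falls within a given band. The cubic 50 Å cell, the beam direction, the 2.5 Å resolution limit and the two wavelength bands are hypothetical values chosen for illustration.

```python
def stimulating_wavelength(hkl, cell_edge, beam=(1.0, 0.0, 0.0)):
    """Wavelength (Å) at which reciprocal lattice point hkl lies on the Ewald sphere.

    Assumes a cubic cell of edge `cell_edge` (Å) aligned with the laboratory axes
    and a unit vector `beam` along the incident beam. Returns None if the point
    can never diffract in this crystal orientation.
    """
    g = [index / cell_edge for index in hkl]        # reciprocal lattice vector (1/Å)
    g_squared = sum(c * c for c in g)
    g_dot_beam = sum(c * b for c, b in zip(g, beam))
    if g_squared == 0.0 or g_dot_beam >= 0.0:
        return None
    # |beam/λ + g| = 1/λ  =>  λ = -2 (g · beam) / |g|^2
    return -2.0 * g_dot_beam / g_squared

def count_stimulated(cell_edge, d_min, lambda_min, lambda_max):
    """Count lattice points with d >= d_min stimulated within a wavelength band."""
    h_max = int(cell_edge / d_min)
    count = 0
    for h in range(-h_max, h_max + 1):
        for k in range(-h_max, h_max + 1):
            for l in range(-h_max, h_max + 1):
                if (h * h + k * k + l * l) * d_min ** 2 > cell_edge ** 2:
                    continue  # beyond the resolution limit
                wavelength = stimulating_wavelength((h, k, l), cell_edge)
                if wavelength is not None and lambda_min <= wavelength <= lambda_max:
                    count += 1
    return count

quasi_laue = count_stimulated(50.0, 2.5, 3.0, 4.0)        # wide band
narrow_band = count_stimulated(50.0, 2.5, 3.48, 3.52)     # ~1% band
print("reflections per still exposure:", quasi_laue, "(wide) vs", narrow_band, "(narrow)")
```

The same count also hints at the overlap problem: with a larger cell edge, many more points fall within the band for a single orientation.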
Furthermore, DALI, a novel quasi-Laue instrument with a much narrower wavelength bandwidth (~10% at 3.8 Å), will be installed in early 2020 at the ILL. The use of a velocity selector to select the usable wavelength band from the white beam should also increase the flux 2.5-fold at the sample position.

Diffractometers at spallation sources
At spallation neutron sources, large position-sensitive detectors (PSDs) allow wavelength-resolved Laue patterns to be collected using all the available neutrons. The diffractometers are specifically designed to use the full white beam. Thanks to the pulsed nature of the neutron beam, not only is the position of a diffracted neutron recorded on the detector, but also the time at which the neutron hits the detector. The addition of this extra dimension allows the full range of available wavelengths to be used [34]. The Time of Flight (TOF) Laue method therefore has all of the advantages of quasi-Laue methods at reactor sources, yet does not suffer from overlapped reflections or from background scattering accumulated over the wavelength range (the background is also spread over the time bins).
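The wavelength of each detected neutron follows from its time of flight t over a flight path L, via λ = h t / (m L). A minimal Python sketch is given below; the 30 m flight path and 15 ms arrival time are hypothetical values and do not describe a particular instrument.

```python
PLANCK = 6.62607015e-34            # Planck constant, J s
NEUTRON_MASS = 1.67492749804e-27   # neutron mass, kg

def tof_to_wavelength_angstrom(time_of_flight_s, flight_path_m):
    """Neutron wavelength (Å) from its time of flight (s) over a flight path (m)."""
    velocity = flight_path_m / time_of_flight_s        # m/s
    return PLANCK / (NEUTRON_MASS * velocity) * 1e10

# Hypothetical example: a neutron detected 15 ms after the pulse, 30 m from the moderator.
print(f"{tof_to_wavelength_angstrom(15e-3, 30.0):.3f} Å")
```

Slower (longer-wavelength) neutrons arrive later, so reflections stimulated by different wavelengths at the same detector position are separated in time.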
Two TOF Laue diffractometers dedicated to macromolecular crystallography are currently available: MaNDi [35] at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL) and iBIX [30] at the Japan Proton Accelerator Research Complex (J-PARC). Two new TOF Laue diffractometers should be built within the next few years: NMX at the European Spallation Source (ESS) and Ewald at the second target station of SNS at ORNL [36]. Specific detectors have been built to record both the position and the arrival time of a neutron on the detector. These position-sensitive detectors (PSDs) have a rather small detection area, and an array of them is positioned around the sample in a spherical arrangement.
Although the TOF Laue technique seems to present only advantages, the pulsed structure of the neutron beam decreases the integrated flux reaching the sample. With the construction of the next-generation spallation source ESS, whose integrated neutron flux should be similar to that available at the ILL, this limitation might no longer hold.

Large crystal growth
A single crystal of the biological sample is essential for crystallography. In order for crystallization to happen, the biological sample must be pure (>95%) and homogeneous. The final step of the sample purification is usually performed via size exclusion chromatography in order to remove aggregates (since these would prevent crystallization) and to isolate a single monodisperse species. The biological sample also has to be sufficiently concentrated (typically >10mg/ml for a protein).
Crystal growth can be understood as a controlled precipitation, as molecules associate with each other not randomly as in an aggregate, but in a repeating 3-dimensional motif. Crystallization is driven by variations in precipitant, ionic conditions and pH. In order to identify the optimum crystallization conditions, multi-factorial crystallization screens with over 1000 conditions per biological molecule are typically trialed (see [37]). Salts and PEGs of different lengths are good crystallization agents. To understand the crystallization process, a phase diagram is often drawn (Figure 4). On the x-axis is the crystallization agent concentration, while the protein concentration is on the y-axis. The phase diagram is divided into three regions. In the undersaturated region (at low sample concentration), the protein remains soluble. Vapor diffusion techniques (by sitting or hanging drop) will increase the protein and precipitant concentrations to reach the supersaturated region (at higher sample concentration), in which nucleation occurs and crystals appear. In order for larger crystals to grow, upon the onset of nucleation, the soluble protein concentration should drop into the metastable phase, where nucleation cannot occur, but where the crystals can continue to grow. Since nucleation does not occur in the metastable region, it is also possible to use already grown crystals. These are transferred to a solution in which the protein is already in the metastable phase and will only contribute to crystal growth (termed macro-seeding). Other approaches exist, such as the batch method, where the components of the system (protein and precipitant) are mixed to directly reach the supersaturation region [38].

Figure 4. Crystallization phase diagram. A phase diagram comprises different zones, which either need to be reached (nucleation zone) or avoided (precipitation zone). A typical crystallization starts in the undersaturated region (red dot). As water evaporates from the crystallization drop, both crystallization agent and protein concentrations increase, until reaching the nucleation zone. At this point, protein molecules arrange themselves in an ordered manner that will lead to a crystal. From there, the protein concentration starts to decrease as the crystal grows. The system reaches equilibrium when the protein concentration is at the limit between the metastable zone and the undersaturated region. The crystal has then reached its maximal size.
Niimura and Bau suggest that a detailed crystallization phase diagram should be established [39], but this approach remains time- and sample-consuming. An alternative is to use instruments designed to finely control the different parameters of the crystallization experiment, allowing more or less precipitant to diffuse to the protein, as well as providing control of the temperature [40].
Once the crystallization conditions have been found and optimized for X-ray structure determination (which requires only crystals a few tens of microns in size), a second step of optimization is required to grow crystals suitable for neutron diffraction. It is important to emphasize that crystal size is a significant hurdle in neutron crystallography. Put bluntly, bigger crystals are necessary. Several tricks are used to optimize crystal growth. In particular, if crystallization happens too fast, the drop may fill up with a shower of small crystals. In order to slow down crystallization, larger drops can be tried (as they will take longer to equilibrate). Another option is to increase the temperature until the crystals dissolve, and then reduce the temperature just enough that only a few crystal nucleation events occur, so that few (and hopefully larger) crystals grow. Small-molecule additives can also be incorporated into the previously identified crystallization buffers used to grow the smaller crystals analyzed by X-ray diffraction. Note also that a large crystal can look like a single crystal at the macroscopic scale while actually consisting of several slightly misoriented crystals at the nanoscopic scale. Each of these crystals then contributes to the diffraction data simultaneously, yielding a diffraction pattern of poor quality.
The crystal size needed for a neutron diffraction experiment depends not only on the biological system itself and its unit-cell parameters, but also on the performance of the neutron diffractometer (linked to the available neutron flux). Hydrogenated crystals should be in the mm^3 range, while one order of magnitude in volume can be gained with a fully perdeuterated crystal. Furthermore, the crystallization buffer surrounding the crystal at the time of data collection needs to be deuterated. Two possibilities exist: i) the crystal is directly grown in a deuterated buffer, or ii) the crystallization solution is hydrogenated and the crystal is subsequently transferred into a deuterated buffer (this can be done directly in the quartz capillaries).
For a given project to be considered for the limited available neutron beam time, crystal structures are systematically solved with X-rays first. This not only provides a starting model on which to append hydrogens later on, but is also the first proof of the diffracting quality of a given protein crystal. For example, some large crystals do not diffract at all; others have unit cell sizes that create challenges for current neutron diffractometer designs (vide infra). Users will also have to demonstrate the capacity of their crystal to diffract in a neutron beam. Indeed, X-ray beams are getting smaller and smaller, now being typically in the 50-100 µm range (if not smaller), and therefore only illuminate a relatively small volume of a large crystal grown for a neutron diffraction experiment. X-rays thus only reveal the diffracting quality of a crystal at this length scale, not throughout the whole crystal. Neutron beams are much wider (in the mm range), the goal being to illuminate the complete volume of the protein crystal. In this situation, the crystal needs to be ordered at the nanoscopic scale over its entire volume, and unpleasant surprises can arise when a crystal that diffracts X-rays well is illuminated with a neutron beam. This is an important point to keep in mind when requesting beamtime for a neutron diffraction experiment.

H/D exchange and perdeuteration
For a successful neutron diffraction experiment, the isotopic replacement of H by D is not just highly encouraged, but is mandatory. It provides two major advantages. First, as the incoherent cross section of 1H is extremely large (80.27 barns) and hydrogen atoms represent ~50% of all the atoms in a protein, their incoherent scattering signal contributes strongly to the background, and consequently reduces the quality of the recorded diffracted intensities. Deuterium has a 40-fold smaller incoherent cross section (2.05 barns, Figure 1) than hydrogen, and therefore H/D exchange significantly improves the signal-to-noise ratio. Second, deuterium has a positive coherent scattering length, about twice as large (in amplitude) as that of hydrogen, which is negative. Therefore, deuterium is more easily observed in nuclear density maps of moderate resolution (~2.5 Å). The negative neutron scattering length of hydrogen, which is opposite in sign to that of any neighbouring carbon (for example in an aliphatic group), leads to a partial cancellation of the hydrogen and carbon densities, making nuclear density maps difficult to interpret at these moderate resolutions. Near-atomic resolution (~1.5 Å) is needed for the hydrogens to be visualized in their negative nuclear densities, separated from the positive carbon densities. Finally, the significant solvent content of protein and DNA crystals (50% on average) makes substitution of H2O by D2O essential.
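The cancellation effect and the background gain can be illustrated with a few lines of Python. The sketch sums the coherent scattering lengths of a methylene group as if its atoms were unresolved, which is only a crude approximation of what happens at moderate resolution; the scattering lengths (in fm) are rounded tabulated values, and the incoherent cross sections are those quoted above.

```python
# Coherent scattering lengths in femtometres (rounded tabulated values).
B_COH_FM = {"H": -3.74, "D": 6.67, "C": 6.65}
# Incoherent cross sections in barns, as quoted in the text.
SIGMA_INC_BARN = {"H": 80.27, "D": 2.05}

def group_scattering_length(composition):
    """Net coherent scattering length (fm) of a group of unresolved atoms.

    composition: dict mapping element symbol to number of atoms.
    """
    return sum(B_COH_FM[element] * count for element, count in composition.items())

ch2 = group_scattering_length({"C": 1, "H": 2})
cd2 = group_scattering_length({"C": 1, "D": 2})
background_ratio = SIGMA_INC_BARN["H"] / SIGMA_INC_BARN["D"]

print(f"CH2: {ch2:+.2f} fm, CD2: {cd2:+.2f} fm")
print(f"incoherent cross section ratio H/D: {background_ratio:.1f}")
```

The hydrogenated methylene group almost vanishes from the map, whereas its deuterated counterpart scatters about three times more strongly than a lone carbon atom.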
Two solutions are available regarding this H/D exchange: either only the hydrogens at exchangeable sites are swapped with deuterium, or the sample itself is fully deuterated (at the time of production by the expression system). With the first option, only a quarter of all hydrogen atoms composing the biomolecule are exchangeable (plus of course all solvent molecules). The other hydrogens, frequently belonging to methyl and methylene groups, are not. The exchangeable hydrogens can be exchanged either through vapour H/D exchange, when the crystal is mounted in a capillary prior to data collection, or by soaking the crystals in heavy water (D2O) buffers (although there is a risk that the crystals are damaged during the transfer). Crystals can also be grown directly from deuterated buffers.
The alternative is to incorporate deuterium atoms at the time of protein synthesis. As deuterium has a relatively small incoherent cross section, crystallographic data from a fully deuterated protein exhibit up to a 10-fold increase in the signal-to-noise ratio of the Bragg peaks in comparison to an H/D-exchanged crystal. As a consequence, a crystal ten times smaller in volume can be used.
Large-scale neutron facilities now have their own deuteration labs (D-Lab at ILL, Biodeuteration Laboratory at ORNL, ISIS Deuteration Facility, DEMAX at ESS and MLF at J-PARC), which can help users produce the deuterated version of their macromolecule of interest. Additionally, it is worth noting that these facilities produce deuterated lipids, sugars, and small molecules.

Cryo-crystallography
While the non-damaging nature of neutrons permits data collection at room temperature, neutron protein diffractometers have been equipped with the possibility to collect data from frozen crystals. Data collection at cryogenic temperatures offers several advantages, including a decrease in atomic displacement parameters (ADPs, vide infra) and improvements in nuclear density definition [41]. The efficiency of the cylindrical image plate detector also increases at lower temperature.
Importantly, temperature-sensitive crystals can be cryo-cooled prior to neutron data collection. Some tricks can be used to trap enzymatic intermediates at room temperature, usually via the design of catalytically inactive variants of the biological molecule of interest, or the use of substrate analogs that block the catalytic reaction at a given step. The most interesting, but most challenging, approach is to trap the actual enzymatic intermediates in order to identify their true chemical nature. Such a strategy was successfully applied to heme peroxidases, which allowed the determination of the so-called compound I [7] and compound II [8] within this enzyme family, unraveling chemical details that define a new catalytic pathway.
It is important to note that although radiation damage is not fully prevented, its progression is significantly slowed down by freezing the crystal (by up to two orders of magnitude for X-rays [42,43]). Yet, the structure of the biological molecule in the crystal can be perturbed during the flash-freezing procedure [44]. The usually necessary addition of cryo-protectants to the crystal may also have an effect on the biological molecule's structure.

Data processing and model refinement
This section will describe the specificities of neutron crystallography from the processing of diffraction patterns to the refinement of molecular models. The goal here is not to describe every step in full detail, as these are very similar to X-ray crystallography (already well documented), but to focus on how crystallographic data treatment pertains to neutron crystallography.

Data reduction
For any crystallography experiment, processing of all recorded diffraction patterns yields, for each measured point of the reciprocal lattice (of Miller index hkl), an averaged intensity (<I>) and its corresponding error (σ(I)), which are then converted into an averaged structure factor amplitude (<F>) and its corresponding error (σ(F)). Apart from appropriate scaling, the structure factor amplitudes are the square roots of the measured intensities. Such data reduction requires a precise description of both the crystalline system (unit cell parameters and symmetry) and the instrument (detector geometry, crystal-detector distance, wavelength spectrum, etc.). Depending on the neutron source and the diffraction technique used for a given neutron protein diffractometer, different processing packages have been developed.
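The conversion from intensities to amplitudes can be sketched as follows. This is a deliberately naive version, assuming a strictly positive intensity and first-order error propagation, σ(F) ≈ σ(I) / (2F); data reduction programs treat weak or negative intensities more carefully, and the function name is hypothetical.

```python
import math

def intensity_to_amplitude(intensity, sigma_intensity):
    """Convert (I, sigma(I)) into (F, sigma(F)) for a strictly positive intensity.

    Uses F = sqrt(I) and first-order error propagation, sigma(F) = sigma(I) / (2 F).
    Scaling is assumed to have been applied to the intensity beforehand.
    """
    if intensity <= 0.0:
        raise ValueError("this naive conversion only handles positive intensities")
    amplitude = math.sqrt(intensity)
    sigma_amplitude = sigma_intensity / (2.0 * amplitude)
    return amplitude, sigma_amplitude

print(intensity_to_amplitude(100.0, 10.0))
```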
For monochromatic instruments, traditional software developed for X-ray crystallography can be used directly or with slight modifications. For example, the Biodiff instrument readily uses the hkl2000 software [45], while XDS [46] is being adapted for the D19 diffractometer of ILL. Quasi-Laue instruments generally utilize the LAUEGEN suite, which was originally developed to reduce X-ray Laue diffraction data [47]. A particularity of quasi-Laue neutron diffraction data is their propensity to overlap. Dealing with such overlaps is non-trivial, as the data reduction software needs to extract a single set of (<I>, σ(I)) for each of two or more overlapped reflections. Another level of complexity arises from the multitude of wavelengths which can stimulate the same reciprocal lattice point at different crystal orientations. The recorded diffracted intensities have to be normalized with respect to the incident stimulating wavelength. Two factors come into play here: i) the diffracted intensity is proportional to λ^4 (see Eq. (4) above) and ii) the incident neutron flux depends on the wavelength. This wavelength normalization procedure and the scaling of all reflections are performed with the software LSCALE [48]. In the quasi-Laue method, a proper estimation of the individual intensities of overlapped reflections is extremely difficult. In this last step of data reduction, multiple intensity measurements of the same hkl index, from different diffraction patterns or from symmetry-equivalent reflections, are compared; if they deviate too much, the software is unable to provide an accurate estimate of the intensity, and those reflections are discarded from the data. This is a limitation of the quasi-Laue technique. The overall completeness is generally ~80%, and ~50% in the highest resolution shell, due to the number of overlapped reflections increasing with resolution.
Because of these overlaps, the unit-cell dimensions are limited to ~150Å for quasi-Laue techniques with a wavelength spread of about 30%. With this in mind, the new protein diffractometer of the ILL has been designed with a wavelength spread of 10% to extend the capabilities of neutron protein crystallography at ILL.
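As a rough illustration of the wavelength normalization described above, the following Python sketch divides a measured quasi-Laue intensity by the two wavelength-dependent factors named in the text, λ⁴ and the incident flux. The function names and the flat flux curve are hypothetical, all other correction factors are ignored, and this is not meant to reproduce the procedure implemented in LSCALE.

```python
def normalize_intensity(intensity, wavelength, flux):
    """Remove the wavelength dependence from one measured intensity.

    intensity  -- integrated intensity of the reflection
    wavelength -- wavelength (in Å) that excited the reflection
    flux       -- callable returning the relative incident flux at a wavelength
                  (hypothetical; in practice it must be determined for the instrument)
    """
    weight = wavelength ** 4 * flux(wavelength)
    if weight <= 0:
        raise ValueError("wavelength and flux must be positive")
    return intensity / weight


def flat_flux(wavelength):
    """Placeholder flux curve: identical flux at all wavelengths (assumption)."""
    return 1.0


# The same reflection recorded at 3 Å and at 4 Å: the raw intensities differ
# by a factor (4/3)**4, but the normalized values agree.
raw_3 = 2.0 * 3.0 ** 4
raw_4 = 2.0 * 4.0 ** 4
norm_3 = normalize_intensity(raw_3, 3.0, flat_flux)
norm_4 = normalize_intensity(raw_4, 4.0, flat_flux)
```

Only after such a normalization can measurements of the same hkl index taken at different wavelengths be compared and merged.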
For TOF Laue diffractometers at spallation sources, specific data reduction software has been developed to cope with the particular spatial arrangement of their detector arrays. At MaNDi, 45 detectors are positioned spherically around the sample to maximize the reciprocal space coverage. Software development is still ongoing, with new procedures to integrate peaks from these diffractometers via neural networks for MaNDi [49], or the addition of profile fitting at iBIX [50].
The resolution limit of the diffraction dataset is chosen on the basis of several data quality indicators [51]. The Rmerge, and subsequently the Rpim, have conventionally provided an estimate of the precision of the individual measurements. The difference between the two is that the latter is weighted to take into account the redundancy of the data. The Rmerge rises with increasing data multiplicity, even though data measured multiple times should yield more accurate averaged intensities; it therefore does not provide accurate insight into data quality.
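The difference between the two indicators can be made concrete with a minimal Python sketch. It uses the commonly quoted definitions, Rmerge = Σ_hkl Σ_i |I_i − <I>| / Σ_hkl Σ_i I_i, and Rpim with each reflection's numerator weighted by √(1/(n−1)), where n is its multiplicity. The function name and the input layout (a dictionary of intensity lists per hkl) are assumptions made for illustration, and reflections measured only once are skipped.

```python
from math import sqrt


def merging_r_factors(observations):
    """Return (Rmerge, Rpim) for a dict mapping hkl -> list of intensities."""
    num_merge = 0.0
    num_pim = 0.0
    denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single measurement carries no information on spread
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_pim += sqrt(1.0 / (n - 1)) * abs_dev  # multiplicity weighting
        denom += sum(intensities)
    if denom == 0:
        raise ValueError("no reflection was measured more than once")
    return num_merge / denom, num_pim / denom


# Hypothetical measurements of two reflections.
example = {
    (1, 0, 0): [100.0, 110.0, 90.0, 100.0],
    (0, 1, 0): [50.0, 60.0],
}
r_merge, r_pim = merging_r_factors(example)
```

In this example the reflection measured four times is down-weighted in Rpim, so Rpim comes out lower than Rmerge for the same data.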
The I/σ ratio provides another indication of which data are reasonable to include in the dataset. However, the previously strict I/σ ~2 criterion, which defined the highest resolution, has been relaxed so as to include as much data as possible in the dataset (I/σ >1), in part because detectors are much more accurate nowadays. A more rigorous statistical approach is to use the correlation coefficient between random half datasets (CC1/2). Typically, CC1/2 is ~1.0 at low resolution and drops toward 0 with increasing resolution, as the signal-to-noise ratio decreases. Resolution cut-offs at CC1/2 values between 0.2 and 0.5 have been found to be acceptable.
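A minimal Python sketch of the CC1/2 calculation is given below: the measurements of each reflection are randomly split into two halves, each half is averaged, and the Pearson correlation coefficient between the two sets of averages is computed. The function names, the input layout and the fixed random seed are assumptions for illustration; data processing packages compute this statistic per resolution shell and may differ in detail.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two reflections are needed")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def cc_half(observations, seed=0):
    """CC1/2 for a dict mapping hkl -> list of intensity measurements."""
    rng = random.Random(seed)
    half_a, half_b = [], []
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # cannot be split into two halves
        shuffled = list(intensities)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)
```

Applying `cc_half` separately to the reflections of each resolution shell gives the curve from which a cut-off is chosen.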

The phase problem
While intensities are measured in diffraction experiments, the phases needed to compute the electron and nuclear densities, in X-ray and neutron crystallography respectively, have to be retrieved. Phases can be obtained experimentally (a good introduction to the problem and its solutions can be found in [52] and references therein) or retrieved from a previously obtained homologous model [53] (homologous here means that the protein scaffold is similar). While experimental phasing in protein crystallography is routinely performed only with X-rays, a proof of concept with neutrons has been performed at ILL [54]. For neutron crystallography, experimenters generally already have a model from X-rays, which can be used to obtain the phases. Hydrogen or deuterium atoms are then added to the model in their riding positions, based only on the stereochemistry, and are further refined using the neutron diffraction data.

The molecular model; the pdb format
The aim of crystallography is to produce a 3-dimensional atomic model of the repeating asymmetric unit of the biological crystal. This asymmetric unit may contain one or more molecules.
The Protein Data Bank archive (PDB, http://www.rcsb.org) serves as a world-wide repository of information about the structures of biological macromolecules, such as proteins and nucleic acids. It holds high-resolution structures determined by X-ray and neutron crystallography, electron microscopy and NMR. The repository also includes pseudo-atomic models derived from lower resolution structural techniques. Importantly, prior to publication, coordinates and associated experimental information (including diffraction structure factors and NMR constraints) must now be deposited in the PDB. The entries are validated and annotated following a common set of criteria and are accessible to all [55].
The PDB file is in a plain-text format, which not only contains the atomic coordinates but also holds a variety of extra information regarding the biomolecule itself, and the crystal if the model has been derived from a crystallography experiment. The first section of the PDB file contains a series of remarks, giving information about the protein and ligands, the quality of the model and the crystal characteristics. The individual atoms in the model are subsequently defined by four variables: their position in 3-dimensional space (x,y,z) and the temperature factor (or B-factor). The latter value provides an estimate of the uncertainty in the atom's position (at high resolution, the B-factor can be separated out into 3 dimensions too).
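To show how these variables appear in practice, here is a minimal Python sketch that reads one atom line using the fixed column positions of the legacy PDB format (coordinates in columns 31-54, occupancy in 55-60, B-factor in 61-66). The sample record is made up for illustration and is assembled piece by piece so that the columns are explicit; `parse_atom_record` is a hypothetical helper, not part of any crystallographic package.

```python
# Illustrative ATOM record, built from its fixed-width fields.
SAMPLE_LINE = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # columns 7-11: atom serial number
    " "           # column 12: blank
    " N  "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "MET"         # columns 18-20: residue name
    " "           # column 21: blank
    "A"           # column 22: chain identifier
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and blanks
    "  27.340"    # columns 31-38: x (Å)
    "  24.430"    # columns 39-46: y (Å)
    "   2.614"    # columns 47-54: z (Å)
    "  1.00"      # columns 55-60: occupancy
    "  9.67"      # columns 61-66: B-factor (Å^2)
    "          "  # columns 67-76: blank
    " N"          # columns 77-78: element symbol
)


def parse_atom_record(line):
    """Extract the main fields of an ATOM/HETATM line into a dictionary."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


atom = parse_atom_record(SAMPLE_LINE)
```

The occupancy field read here is the value that is set to zero for atoms placed without support from the diffraction data.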
The accuracy of the model is limited by the quality of the experimental data used to build it. For example, at low resolution, multiple chemically correct models can be built that fit the available experimental data. Atoms that are not accounted for by the diffraction data are typically omitted from the model, or placed in chemically acceptable positions but with zero occupancy. In X-ray structures, because of limited resolution, the model often does not list hydrogen positions, while in neutron structures, observed hydrogen and deuterium atoms are included.
The deposited structure is the best model that fits the diffraction data and the chemistry. A priori biological knowledge may also have provided a guiding hand in model building. The precision with which atoms can be positioned in the model depends on the regularity of the crystal [56], and a well-ordered crystal will likely yield atomic positions with better precision.
A structure determined to 1-Angstrom resolution allows individual atoms to be directly positioned in the structural model. At resolutions better than 3-Angstrom, the shape of the amino acids can be clearly seen, but as the resolution gets worse, the chemistry of the model plays an increasingly important role in limiting the positions that the atoms can have in the structure. In particular, the ability to distinguish between oxygens, nitrogens and carbons is compromised, which raises issues with the orientation of Asparagine, Glutamine and Histidine side chains. When diffraction is worse than 4-Angstrom resolution, secondary structure elements become difficult to identify. An interesting series of movies has been produced by James Holton (https://bl831.als.lbl.gov/~jamesh/movies/), which illustrates the influence of various dataset parameters on the resulting density maps, in which the crystallographer has to build the molecular model (here done with X-rays, hence electron density).

Model refinement and validation
In crystallography, model refinement is a computationally intensive procedure which aims at minimizing the differences between the observed structure factors (derived from the experimental intensities) and the calculated ones (computed from the atomistic pdb model). This minimization is done using different strategies, such as the refinement of atom positions in space (x, y and z) and of the atomic displacement parameter (ADP or B factor), which is linked to the thermal agitation of an atom. Several statistical tools (including the Rfactor and Rfree) have been introduced as metrics to follow the progress of the refinement and, as in any multi-parametric problem, to make sure that over-fitting does not occur. For more information regarding these concepts, Kleywegt and Jones [57] is an interesting and complete reference.
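The two metrics can be sketched in a few lines of Python, using the usual definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. Rfree is the same quantity computed on a small set of reflections that is kept out of the refinement. The function names and the flag layout are assumptions for illustration, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor between observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by their free flag and return (Rwork, Rfree)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free


# Hypothetical amplitudes; the last reflection belongs to the free set.
f_obs = [100.0, 50.0, 20.0, 80.0]
f_calc = [90.0, 55.0, 20.0, 85.0]
free_flags = [False, False, False, True]
r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
```

An Rfree that keeps rising while Rwork decreases during refinement is the usual sign of over-fitting.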
At moderate resolution (2.3-2.5Å), each atom in a structure is typically described by four parameters (x, y, z and B factor). Compared to X-rays, neutron refinement raises the level of complexity. The addition of H/D atoms in the refinement procedure roughly doubles the number of atoms in the structure, and consequently the number of parameters to refine, for an identical number of experimental data (Bragg peaks). When refining a neutron structure, the data-to-parameter ratio is therefore low (even below 1) and the problem is underdetermined. However, chemical restraints can be used to increase the observation-to-parameter ratio, which makes neutron refinement possible on its own. An alternative strategy is to refine the same molecular model against both neutron and X-ray experimental data. Phenix offers such strategies [58], SHELXL2013 [59] has been modified to easily perform neutron refinement, and a neutron version of Refmac [60] is under development.
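The effect of adding H/D atoms on the data-to-parameter ratio can be illustrated with a short Python calculation. The reflection and atom counts below are purely hypothetical and only serve to show how doubling the number of atoms halves the ratio.

```python
PARAMS_PER_ATOM = 4  # x, y, z and an isotropic B factor


def data_to_parameter_ratio(n_reflections, n_atoms, params_per_atom=PARAMS_PER_ATOM):
    """Number of unique reflections divided by the number of refined parameters."""
    return n_reflections / (n_atoms * params_per_atom)


# Hypothetical protein: 1500 non-hydrogen atoms and 9000 unique reflections.
n_reflections = 9000
n_heavy_atoms = 1500

# Without hydrogens: 6000 parameters.
xray_ratio = data_to_parameter_ratio(n_reflections, n_heavy_atoms)         # 1.5

# With H/D atoms included, the atom count roughly doubles: 12000 parameters.
neutron_ratio = data_to_parameter_ratio(n_reflections, 2 * n_heavy_atoms)  # 0.75
```

With fewer observations than parameters, stereochemical restraints or a joint refinement against X-ray data are needed to make the problem tractable.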

Neutron crystallography serves the chemistry of biological macromolecules
The structural details derived from the neutron scattering of the H/D atoms of the macromolecule of interest are highly valuable for understanding various aspects of its biological function [61,62]. Since all biological reactions occur in an aqueous environment, hydrogen bonds are ubiquitous in biology. Not only do they hold together the basic structure of proteins (from secondary to tertiary structure), but they also mediate solvent interaction as well as ligand binding, recognition and transformation. Many biological processes require the transfer of a hydrogen atom from the protein to the substrate or vice versa. Hydrogen bonds are directional and formed between an electronegative donor atom, which shares its covalently bound H atom, and an electronegative acceptor atom. The corresponding energy of such a bond ranges between 2 and 10 kcal·mol⁻¹, depending on the geometry between the donor, the acceptor and the shared H atom (distances and angles) [63]. This relatively low energy allows, for example, H bonds to be formed and broken for substrate recognition and product release, respectively, over a large range of temperatures and solvent conditions (pH in particular). Importantly, although the position of most hydrogens bound to carbons can be accurately predicted, this is not the case for hydrogens attached to hydrogen-bond donor atoms such as oxygen, sulfur and even nitrogen. Unlike what has been observed in small molecules, very few truly collinear hydrogen bonds are observed in proteins, owing to the steric constraints imposed by atom packing. Of course, the presence of a given hydrogen bond can depend on the protonation state of charged amino acids, and identifying these states using neutron crystallography is generally essential to decipher the biological function.
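Once the H/D position is known from the nuclear density, the geometric descriptors mentioned above (distances and angle) follow directly from the coordinates. The Python sketch below computes the donor-acceptor distance, the hydrogen-acceptor distance and the donor-H···acceptor angle. The function names and the coordinates are hypothetical, and no distance or angle cut-offs are implied.

```python
from math import acos, degrees, sqrt


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) tuples, in Å."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def hbond_geometry(donor, hydrogen, acceptor):
    """Return (donor-acceptor distance, H-acceptor distance, D-H...A angle in degrees)."""
    d_donor_acceptor = distance(donor, acceptor)
    d_h_acceptor = distance(hydrogen, acceptor)
    # Angle at the hydrogen between the H->donor and H->acceptor vectors.
    to_donor = [d - h for d, h in zip(donor, hydrogen)]
    to_acceptor = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(u * v for u, v in zip(to_donor, to_acceptor))
    norm = distance(donor, hydrogen) * d_h_acceptor
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return d_donor_acceptor, d_h_acceptor, degrees(acos(cosine))


# Hypothetical collinear arrangement: donor, H and acceptor on one axis.
linear = hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.8, 0.0, 0.0))

# Hypothetical bent arrangement, closer to what steric packing may impose.
bent = hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.8, 0.0))
```

A collinear bond gives an angle of 180°, and the deviation from this value is one way to quantify the non-ideal geometries found in proteins.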
Having access to the hydrogen bond network in proteins permits the visualization of ligand/inhibitor binding modes or of specific protonation states of key catalytic residues, the clear identification of the hydration pattern of the catalytic site, as well as the observation of intermediate steps along the catalytic reaction. All these aspects are illustrated in this section with concrete examples of recent neutron diffraction studies.

Canonical and unusual hydrogen bonds
While neutron protein crystallography is uniquely able to identify the positions of hydrogen atoms at near-atomic resolution and their involvement in H-bonds, the technique has also helped to discover some atypical ones. One of them is the low-barrier hydrogen bond (LBHB), first hypothesized in the early 1990s [64], in which the hydrogen atom is literally shared between the donor and acceptor atoms (i.e. at equal distance from both). Such an arrangement is possible if the pKa values of the donor and acceptor atoms are matched. This unusual hydrogen bond has been repeatedly observed in neutron protein structures, such as those of citrate synthase [65], serine protease [66], ketosteroid isomerase [67] and the fluorescent protein PYP [68][69][70]. It has also been discovered in the catalytic triad of the aminoglycoside N3-acetyltransferase VIa (AAC-VIa), with a putative role in enzyme catalysis. In a follow-up study, structural proof was obtained of the catalytic potential of such an LBHB. Indeed, in the neutron crystal structures of the protein bound to its two most efficient substrates (i.e. gentamicin and sisomicin, in the absence of cofactor to prevent the reaction from occurring), an LBHB is observed within the catalytic triad (Figure 5a). In contrast, a canonical hydrogen bond is observed with one of its least preferred substrates (kanamycin B, Figure 5b). This study provided the first structural evidence of the importance of an LBHB in catalytic efficiency, and brought to a close a long-lived debate on the potential role of these atypical hydrogen bonds.

Ligand or inhibitor recognition
H-bonds are key contributors to ligand or inhibitor recognition. Most molecular models of such complexes are obtained with X-ray crystallography, from which the hydrogen bond network can only be inferred. Neutron crystallography remains a unique technique to unravel the precise hydrogen bond network that holds the small molecule in the targeted binding site. In recent years, several neutron crystal structures of proteins bound to clinical drugs have been reported.
Farnesyl pyrophosphate synthase is a molecular target for the treatment of osteoporosis [71], and its structure bound to the clinical drug risedronate has recently been solved by neutron crystallography [72]. This structure reveals the binding mode of risedronate and therefore provides insights for a rational improvement of the drug.
The aspartyl protease of HIV-1 is essential for the virus life cycle, and non-infectious viral particles are produced if this protease does not function properly. Antiviral inhibitors of the protease are already marketed for AIDS treatment. The neutron structure of HIV-1 protease was solved bound to the clinical drug amprenavir and revealed a hydrogen bond network very different from the one inferred from the X-ray crystal structure [10] (Figure 6). Again, these structural details are valuable for the design of more efficient inhibitors, especially given the rapid emergence of drug resistance in HIV, linked to its protease. Subsequently, the structure of an HIV-1 protease triple mutant resistant to amprenavir was, surprisingly, obtained in complex with the drug. Intriguingly, the room temperature neutron structure exhibited differences with the cryo-cooled X-ray structure of the same mutant. These data clearly demonstrated that the mutations did not alter the binding mode of amprenavir to the protein, but that other effects were at play to create drug resistance, possibly global protein dynamics and active site conformational flexibility. Another example of a protein bound to clinical drugs and studied by neutron crystallography is human carbonic anhydrase, bound to either acetazolamide [11] or methazolamide [12]. This enzyme is inhibited in the therapeutic treatment of glaucoma and epilepsy.

Protonation states
The charge state of a single side chain can play a significant role in the function of a protein. The associated pKa depends on the side chain's local environment and, in an enzyme's active site, it is finely tuned to match the protonation state required for the catalytic reaction to proceed. Neutron crystallographic studies have been essential in identifying active site residues with unusual pKa values and therefore unusual protonation states. For example, the catalytic function of the HIV-1 aspartyl protease was studied at various pH values. This homodimeric protease bears a pair of aspartic acid residues involved in peptide bond cleavage. Each of these two aspartates belongs to one monomer, and the active site lies at the dimer interface (Figure 6a). The most interesting feature is that the chemical nature of these two residues, supposedly both negatively charged and within interacting distance of each other, is different. In the neutron structure bound to the clinical drug amprenavir (pH=6.0), the aspartic residue from monomer A is neutral (with a D atom covalently linked to Oδ1) while the other one is negatively charged (Figure 7a). Interestingly, the same protein in complex with the drug darunavir reveals that the charge is on the aspartic acid of monomer A instead. The other aspartic residue shares a deuterium atom with the drug via a low-barrier hydrogen bond (Figure 7b). The positions of the oxygen atoms in the carboxylate groups of these two aspartic residues are equivalent in the two structures, which shows that deducing proton positions and hydrogen bonding from distances between oxygens is risky. Such subtle changes in the protonation states of catalytic residues are only accessible with neutron crystallography. Protonation states of key residues in the chromophore pocket of fluorescent proteins have also been assessed with neutron crystallography, for example in the chromoprotein Dathail [73] and the photoactive yellow protein (PYP) [70].
The protonation state of the chromophore itself has been observed, as well as those of surrounding residues, with an unexpected neutral arginine in the PYP chromophore pocket.

Solvent network
Heavy water (D2O) has a characteristic boomerang-shaped nuclear density, as its three atoms have positive and comparable coherent neutron scattering lengths. However, this typical shape is only observed in structures of relatively high resolution (1.5Å and higher). A combination of X-ray (to locate the oxygen atom) and neutron (to position the deuterium atoms) diffraction makes it possible to orient water molecules at more moderate resolution (between 2.0 and 2.5Å, depending on the data quality). For example, in the structure of crambin, a small protein of 46 residues, neutron diffraction data from an H/D exchanged crystal have been obtained at 1.1Å resolution [74]. The electron and nuclear density maps of five water molecules of this structure are shown in Figure 8 and reveal the level of detail which can be obtained with neutron crystallography regarding the solvent network. Water networks are also important for proton transfer, as observed in the human carbonic anhydrase II (HCA II) enzyme. The proton transfer is the final step of the catalytic reaction, which involves a water molecule that is part of a well-ordered 8Å thick water network spanning from the catalytic site to the bulk solvent [16,17]. In some cases, prior to ligand association, the ligand-binding pocket has a specific hydration pattern, which needs to be accommodated upon ligand binding [15]. High resolution X-ray and neutron structures of trypsin were obtained in its free form and in complex with two inhibitors.
Without ligand, the binding pocket is filled with highly mobile water molecules (Figure 9a), characterized by an incomplete H-bond network which does not fully solvate the catalytic residue Asp 189. Upon ligand binding, water molecules are displaced and Asp 189 is engaged in two hydrogen bonds (Figure 9b). A precise understanding of the water configuration in the ligand-free state gives clues about the forces which drive ligand binding, the imperfect hydration of Asp 189 being one of them.
Fig. 9. Solvation of uncomplexed trypsin (a) and of trypsin bound to N-amidinopiperidine (b). Protein residues are depicted as light brown sticks. X-ray (orange mesh) and neutron 2mFo-DFc (grey surface) maps are contoured at 1.5 and 1.8σ. The neutron omit mFo-DFc map is represented in green (4σ) and was obtained after removal of the deuterium atoms from all depicted water molecules as well as the deuterium atom of the inhibitor in (b). The water reservoir made up by waters W4-W7 is highlighted in blue. Reproduced from [15].

Catalytic intermediates
Finally, neutrons provide the unique capability to decipher enzymatic mechanisms when catalytic intermediates are trapped. Several approaches are available, such as substrate analogues which stop the reaction at a given stage [6] or the trapping of intermediates by flash-freezing [7,8]. Recently, a study reported two enzymatic intermediates within a single crystal. The two monomers contained in the asymmetric unit were trapped in two consecutive steps of the pathway. In the first one, the reaction proceeded up to the stage at which the substrate analog prevents it from advancing further, while in the second one, the catalytic loop, involved in crystal contacts, could not close onto the substrate analog and the active site, and the reaction was therefore stopped one step before that of the other monomer. This study was performed on an aspartate aminotransferase, a pyridoxal 5′-phosphate (PLP) dependent enzyme. PLP, derived from pyridoxine, is a ubiquitous cofactor in biology, needed for more than 140 different biochemical transformations. The neutron structure revealed clear protonation states of the catalytic pocket in the two complexes (Figure 10). A consequence of this fine understanding of the mechanism at the atomistic level is the re-thinking of the catalytic mechanism of such an important and widespread class of enzymes. While these studies are the most challenging for neutron crystallography, they are usually the most useful and informative.