The Second EWGRD Round Robin: Inter Comparison of Gamma Spectrometry Measurements on Activation Dosimeters


Abstract. Following an initial Round Robin inter-comparison of gamma spectrometry measurements reported in 2014, this paper presents results from the first part of a second Round Robin inter-comparison commissioned by the European Working Group on Reactor Dosimetry in 2018. Measurements were performed by thirteen European organisations on a set of irradiated neutron activation detectors representative of those commonly used by the Reactor Dosimetry community to measure neutron fluence using gamma spectrometry methods. The radionuclides measured were 110m Ag, 58 Co, 60 Co, 54 Mn, and 46 Sc. The purpose of the exercise was to demonstrate the level of consistency between participating organisations in blind tests of measurements. The samples used were disks of iron, nickel, titanium, and two standard alloys of aluminium, one of 1% cobalt and the other 1% silver. They were irradiated in the MARIA reactor operated by National Centre for Nuclear Research, Poland. Participants provided their results to an independent referee who collated and compared the data. The results are presented in an anonymised form together with discussion and conclusions which may be drawn from the exercise. Good agreement was obtained with standard deviations for individual measurements between 2.0% ( 54 Mn) and 5.6% ( 60 Co). Overall, the results obtained from the latter Round Robin show less consistency than those from the first Round Robin, which is attributed to the participation of a wider pool of organisations.

Introduction
Experimentally determined neutron activation reaction rates continue to play an integral role in reactor dosimetry assessments, being routinely used to support, adjust and validate the multiplicity of calculational methods and assessments. It is therefore essential that such measurements are determined without systematic bias and with well-characterised uncertainties. Because these measurements rely upon gamma radiation spectrometry methods to determine neutron activation reaction rates in situ, it is important for the reactor dosimetry community to establish the capabilities of the analysts and the facilities used. Benchmarks and blind tests are usually proposed for this purpose.
In 2014, the European Working Group on Reactor Dosimetry (EWGRD) reported the results of a Round Robin exercise [1] to compare neutron activation measurements performed by eight European facilities on five commonly used radionuclides: 110m Ag, 60 Co, 54 Mn, 46 Sc and 94 Nb. Although the results reported were favourable, EWGRD members noted that the exercise would have been improved with a greater number of participants and the inclusion of other commonly used radionuclides, in particular 93m Nb.
In response, the EWGRD commissioned a second Round Robin inter-comparison involving fifteen facilities from eleven European countries, this time including 93m Nb in addition to the following commonly used gamma emitting nuclides: 54 Mn, 58 Co, 46 Sc, 60 Co and 110m Ag. However, due to the unique challenges of measuring 93m Nb, the Second EWGRD Round Robin was organised as two separate exercises, one for gamma-ray spectrometry of neutron activation products and the other for 93m Nb. This paper describes the inter-comparison of gamma spectrometry measurements. The 93m Nb inter-comparison is described in [2].
Thirteen organisations took part in the inter-comparison of gamma spectrometry measurements as follows:
• Budapest University of Technology and Economics, Hungary
• CEA Cadarache, France

Preparation and Irradiation of Samples
Samples (in the form of activation foils) for the inter-comparison of activation measurements were prepared by SCK-CEN at their Mol facility. They included 25 cylindrical foils, five each of: iron (typically 35 mg); nickel (typically 7 mg); titanium (typically 52 mg); 1% silver aluminium alloy (typically 5 mg) and 1% cobalt aluminium alloy (typically 2 mg).
The irradiations took place on 9th January 2018 in the MARIA reactor operated by the National Centre for Nuclear Research (NCBJ), Poland. In total 25 samples were irradiated, comprising five samples of each of the five materials. The irradiations were in a mixed thermal / fast spectrum at a nominal flux of 1.2×10¹⁴ cm⁻² s⁻¹ and had the aim of achieving c. 50 kBq on each sample.

Protocol
Due to the challenges of radioactive consignment across national boundaries and the time it would take each laboratory to receive, measure and dispatch the samples, the approach taken for distribution differed from the first EWGRD Round Robin [1]. Instead of each participant measuring the same set of samples and forwarding them to the next within a single chain, batches of five samples were circulated within subgroups of participants, ensuring each participant measured one sample of each type. To act as a control against variation between samples of the same material, one organisation (referred to as Org. 0) measured all of the samples prior to dispatching them to the other participants. One sample of each material was retained by Org. 0 as a precaution against loss.
On completion of its measurements, each participant was asked to forward its responses to an independent EWGRD member from a non-participating organisation, who collated the data and acted as referee. The data were treated anonymously, and the participants identified as Org. 0 to Org. 12, inclusive.
Participants were requested to collate sample mass measurements and specific activities, together with corresponding uncertainties, within a pre-prepared template prescribing the units of measurement, the number formats required and a reference time to which measured activities were to be decay-corrected (noon, 7th March 2018). Uncertainties (at one standard deviation, 1σ) were requested as the combination of Type A uncertainties, those that can be treated using statistical methods, and Type B uncertainties, those that cannot.
Participants were also asked to provide:
• Sample mass measurements and uncertainties;
• Sample activities, and specific activities (activity divided by mass);
• A brief description of measurement technique(s) and equipment, including software;
• Any corrections applied or measures taken (e.g. dead time, self-absorption);
• Calibration, including any standard sources;
• Nuclear data: half-life values and photon emission yields assumed in the analysis;
• A short description of the treatment of uncertainties.
Table 1 provides the properties and identification (ID) number of each sample, as well as the reported mass measurements and their associated uncertainties.
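The decay correction to the common reference time and the combination of Type A and Type B uncertainties requested in the template can be sketched as follows. This is a minimal illustration rather than the procedure used by any participant: the function names, the example measurement time and activity, and the 58 Co half-life of 70.86 days are assumptions introduced here, and the quadrature combination assumes that the two uncertainty components are independent.

```python
import math
from datetime import datetime

# Reference time prescribed in the template: noon, 7th March 2018.
REFERENCE_TIME = datetime(2018, 3, 7, 12, 0)


def decay_correct(activity, measured_at, half_life_days, reference=REFERENCE_TIME):
    """Return the activity at the reference time, given the activity at measured_at."""
    elapsed_days = (measured_at - reference).total_seconds() / 86400.0
    # Activity falls as exp(-ln2 * t / T_half); undo the decay since the reference time.
    return activity * math.exp(math.log(2.0) * elapsed_days / half_life_days)


def combine_uncertainties(type_a, type_b):
    """Combine independent Type A and Type B relative uncertainties in quadrature."""
    return math.hypot(type_a, type_b)


# Hypothetical example: a 58 Co activity measured 40 days after the reference time.
measured = datetime(2018, 4, 16, 12, 0)
a_ref = decay_correct(4.2e4, measured, half_life_days=70.86)  # assumed half-life
u_rel = combine_uncertainties(0.012, 0.016)                   # assumed components
print(f"{a_ref:.3e} Bq at reference time, relative uncertainty {u_rel:.1%}")
```

A measurement made after the reference time is scaled up, since the sample has decayed in the interval; a measurement made before it would be scaled down by the same expression.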

Sample Mass Measurements
The mass data provided in Table 1 include the mean and standard deviation of the measurements carried out on each sample, as well as the maximum and minimum values obtained for each sample expressed as deviations from the mean in multiples of standard deviation. Measurements deviating ±2 standard deviations or more from the mean of each sample are shown in grey shade.
The table shows that mass measurements for each of the iron samples have produced consistent results within each subset of participants. The picture for the other dosimeter materials is less consistent. Poorer agreement is seen for samples 1547 and 1548 (nickel), 1553 (titanium), 1557 and 1558 (Al-Co) and 1562 (Al-Ag). All of these samples were measured by one of two subsets of organisations: 0, 7, 8, 10, 11 (1547, 1557, 1562) and 0, 2, 3, 4 (1548, 1553, 1558). Of these, Org. 2 produced the maximum value for each of the five samples it measured, and Org. 3 produced the minimum value for each of the five samples it measured. The results suggest the presence of systematic bias in some of the mass measurements, the effects of which will have been exacerbated by the relatively small masses of some of these samples.
Table 1. Mass measurements 1 to 5 correspond respectively to the organisations identified in the third column. The two columns on the right express deviations from the mean as multiples of standard deviation for highest and lowest measurements.

Comparison of Specific Activity Measurements
Table 2 provides the measurement results for each participant and each nuclide, together with measurement uncertainties (σmeas) and discrepancies (Δ) from the mean for each nuclide expressed as multiples of both σmeas and the standard deviation of measurements obtained for each nuclide (σpop). Cell shading identifies the maximum and minimum values obtained for each radionuclide. For Org. 0, the specific activities tabulated are the means of the measurements taken, excluding retained samples, and the corresponding σmeas values are the means of individual measurement uncertainties corresponding to the data averaged.
Table 2 (extract). Specific activity (Bq g⁻¹) for Org. 0 to Org. 12, in that order.
54 Mn: 6.59E+05 6.16E+05 6.48E+05 6.51E+05 6.57E+05 6.49E+05 6.57E+05 6.58E+05 6.44E+05 6.60E+05 6.76E+05 6.50E+05 6.58E+05
58 Co: 6.22E+06 5.93E+06 5.88E+06 6.44E+06 6.28E+06 6.14E+06 6.26E+06 6.14E+06 6.00E+06 6.27E+06 6.74E+06 6.20E+06 6.12E+06
In calculating the mean and standard deviation (σpop) for each radionuclide, Org. 0 data were treated as a single point represented by its mean (excluding retained samples). This was done to prevent Org. 0 having a dominant influence on the analysis.
Specific activity measurements for each of the sample materials are also shown in Figures 1 to 5. Each figure plots the specific activity measured by each participant against the sample number. Error bars show the measurement uncertainty provided by the participant at one standard deviation. The horizontal lines plotted show the mean and mean ±multiples of the population standard deviation (σpop) for each radionuclide. Note that the sample numbers plotted on the horizontal axis are artificially separated to aid identification. To determine whether any of the data should be considered as outliers, statistical tests have been performed using Grubbs' test [3].
For 54 Mn, the mean activity obtained was 6.53×10⁵ Bq g⁻¹ with a standard deviation (σpop) of 2.1%. Figure 1 shows good agreement between all participants with all but three inside ±σpop of the mean. One value (Org. 1) is 2.7σpop below the mean and can be considered an outlier at the 5% significance level according to [3]. Taking into account the measurement uncertainties (plotted as error bars), all of the measurements appear consistent with the mean, and even the outlier appears reasonable. Org. 0's results display scatter comparable with both its measurement uncertainties and σpop but, other than that, suggest no significant variation between samples.
For 58 Co, the mean activity obtained was 6.20×10⁶ Bq g⁻¹ with a standard deviation (σpop) of 3.6%. Figure 2 shows a more scattered distribution with four measurements (Orgs. 1, 2, 3) outside ±σpop and one at 2.4σpop above the mean (Org. 10). Again, taking into account measurement uncertainties, all measurements appear reconciled with the mean with the exception of Org. 10, which appears particularly discrepant and may be considered an outlier at the 5% significance level according to [3]. With lower measurement uncertainties than the other participants, Org. 0's results sit close to the mean value, suggesting no sample-to-sample variation any greater than ±½σpop.
The mean activity obtained for 46 Sc was 4.97×10⁵ Bq g⁻¹ with a standard deviation (σpop) of 3.4%. Figure 3 shows that all of the data are within ±σpop of the mean with the exception of Orgs. 1, 2 and 8, which are between σpop and 2σpop below the mean. Org. 3 is 2.4σpop above the mean and may be considered an outlier [3], though only just. Taking into account measurement uncertainties, all of the data appear reasonable compared to the mean except for Org. 3, the potential outlier, sitting 8% above the mean but with σmeas of only 1.5%. Again, Org. 0 has noticeably lower measurement uncertainties than the other participants, with no sample-to-sample variation greater than about ½σpop. It is nevertheless interesting that for this nuclide, Org. 0 appears to be consistently greater than the mean by 2% on average and that its value for sample 1551 sits just over σpop above the mean.
For 60 Co, the mean specific activity obtained was 2.30×10⁷ Bq g⁻¹ with a standard deviation (σpop) of 6.4%. By some margin, this is the highest standard deviation of all five nuclides measured in the exercise. Figure 4 shows that, whilst nine of the participants' values are all close to the mean within ±½σpop, four values are outside ±σpop. Org. 10 is the most significant, sitting 16% or 2.5σpop above the mean, and is considered an outlier according to [3]. At ±1.5%, Org. 10's σmeas appears to have been greatly underestimated. Again, Org. 0's measurements are provided with small measurement uncertainties and suggest no significant sample-to-sample variation, although this time they appear to be systematically 2% below the mean.
For 110m Ag, the mean activity obtained was 7.24×10⁶ Bq g⁻¹ with a standard deviation (σpop) of 3.8%. Figure 5 shows that, with the exception of Org. 3, all data lie within ±σpop of the mean. Lying 2.6σpop below the mean, Org. 3 is considered an outlier according to [3]. Measurements by Org. 0 suggest no discernible sample-to-sample variation, although it lies, on average, 2.0% above the mean.
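The summary statistics quoted in this section can be checked with a short calculation. The sketch below uses the thirteen specific activities tabulated for Org. 0 to Org. 12 whose mean reproduces the 54 Mn value quoted above, and computes the mean, the sample standard deviation and the Grubbs statistic (the largest absolute deviation from the mean divided by the standard deviation). The critical value of 2.331 (one-sided test, 5% significance, thirteen points) and the use of the n − 1 standard deviation are assumptions made here; the paper does not state which variant of the test in [3] was applied.

```python
import statistics

# Specific activities for Org. 0 .. Org. 12, in units of 1e5 Bq/g.
mn54 = [6.59, 6.16, 6.48, 6.51, 6.57, 6.49, 6.57,
        6.58, 6.44, 6.60, 6.76, 6.50, 6.58]

# Assumed critical value: one-sided Grubbs' test, 5% significance, N = 13.
G_CRIT_N13 = 2.331


def grubbs_statistic(values):
    """Return (index, G, mean, sd) for the point furthest from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample (n - 1) standard deviation
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return idx, abs(values[idx] - mean) / sd, mean, sd


org, g, mean, sd = grubbs_statistic(mn54)
print(f"mean = {mean:.3f}e5 Bq/g, sigma_pop = {100 * sd / mean:.1f}%")
print(f"Org. {org}: G = {g:.2f}, outlier under the assumed criterion: {g > G_CRIT_N13}")
```

Under these assumptions the calculation reproduces the 2.1% standard deviation and identifies Org. 1 as the most extreme point, at about 2.7 standard deviations from the mean.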

Analysis and Discussion
Distribution of Discrepancies
Figure 6 presents histograms showing the distributions of measurement discrepancies relative to the mean for each nuclide expressed as multiples of σmeas (histogram on the left) and σpop (histogram on the right). The data are presented so that the distributions are evident for both individual radionuclides and the comparison as a whole. In addition, statistical testing has been carried out to test the distributions obtained for normality.
Compared to the standard deviations obtained for each nuclide, σpop, the overall picture is that the majority of measurements, 50 out of 65 (77%) are within ±σpop of their respective mean values and on top of this, an additional 10 (15%) lie within ±2σpop. On this basis, the distribution appears normal, and despite the presence of 5 values at or close to ±3σpop, the overall distribution successfully meets the criterion for 95% confidence in a chi-squared test for normality. The distributions of discrepancies for individual nuclides show a similar pattern of behaviour with all of them testing positive for normality with better than 95% confidence.
When compared to measurement uncertainties, σmeas, the overall distribution is considerably more scattered and clearly not normally distributed, with two measurements less than -3σmeas and five greater than +3σmeas. This is reflected in all of the individual nuclide data with the exception of 54 Mn, which produced no pronounced outliers and appears normally distributed, as supported by a chi-squared result. The apparent normal distribution of discrepancies relative to the standard deviation for each nuclide provides compelling evidence that the participants are producing consistent measurements. However, the contrast between the distribution of discrepancies relative to σmeas and the distribution relative to σpop suggests that, for some participants at least, measurement uncertainties have been under-estimated, having failed to account for all of the factors leading to measurement errors, especially those of a systematic nature.
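The counts quoted above for discrepancies relative to σpop can be compared with the expectation for a normal distribution using a coarse chi-squared calculation. The sketch below groups the 65 discrepancies into three bands (within ±1σ, between ±1σ and ±2σ, and beyond ±2σ). This binning, the two degrees of freedom and the 5% critical value of 5.991 are assumptions made for illustration, since the paper does not describe the binning used in its own test. With an expected count below five in the outermost band, the result is indicative only.

```python
import math


def normal_band_probabilities():
    """Probabilities of |z| < 1, 1 <= |z| < 2 and |z| >= 2 for a standard normal."""
    p1 = math.erf(1.0 / math.sqrt(2.0))
    p2 = math.erf(2.0 / math.sqrt(2.0))
    return [p1, p2 - p1, 1.0 - p2]


# Counts quoted in the text: 50 within 1 sigma, 10 more within 2 sigma, 5 beyond.
observed = [50, 10, 5]
total = sum(observed)
expected = [total * p for p in normal_band_probabilities()]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumed criterion: chi-squared critical value for 2 degrees of freedom at 5%.
CHI2_CRIT_2DF_5PCT = 5.991
print(f"expected counts: {[round(e, 1) for e in expected]}")
print(f"chi2 = {chi2:.2f}, consistent with normal: {chi2 < CHI2_CRIT_2DF_5PCT}")
```

With this coarse binning the statistic falls below the assumed critical value, in line with the conclusion that the distribution relative to σpop is compatible with a normal distribution.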

Compensation for Uncertainties in Mass Measurements
From the description of specific activity measurements in Section 5, it is apparent that some participants deviated systematically from the mean for a number of radionuclide measurements. For example, Orgs. 1, 2, 8 were amongst the lowest in at least three of the measurements, and Org 3 was high for 58 Co, 46 Sc and 60 Co. It is also evident from Table 1 that Orgs. 1, 2 and 3 tend to produce more of the extreme mass values in their cohort (subset of participants) than might be expected.
An obvious source of uncertainty is the measurement of sample mass. Because the masses of some of the samples are very small, most notably the nickel samples and the Al-Co and Al-Ag alloys, it is possible that for some participants, the mass of these samples was near the limit of what could be measured accurately with the equipment available. In such cases, the discrepancies between specific activity measurements could have arisen from the relatively straightforward measurement of mass rather than the considerably more technical challenge of gamma-radiation spectrometry.
To remove as far as possible the impact of mass measurement uncertainties from the comparison of specific activity measurements, each specific activity measurement has been adjusted by multiplying it by the factor $m_{i,\mathrm{Org}} / \langle m_i \rangle$, where the numerator is the mass of sample i as measured by organisation Org and the denominator is the mean of all mass measurements of sample i. The comparison of specific activities was then repeated using the specific activity data adjusted in this way.
Note: this approach was taken despite the fact that participants were asked to provide sample activity data (Bq) in addition to specific activities (Bq g⁻¹). A comparison of the activity data supplied was not performed because the sample activities would have been influenced by the variation of masses between samples of the same type.
Uncertainties on the mass-adjusted specific activities were also estimated by subtracting in quadrature the relative uncertainty on mass measurement from the relative uncertainty on the submitted specific activity. This makes the reasonable assumption that uncertainties on each participant's measurements of activity and mass are uncorrelated.

Comparison of Mass-Adjusted Specific Activity Measurements
Figures 7 and 8 show the impact of adjusting the mass data on the 58 Co and 60 Co specific activity measurements, respectively. For 58 Co (Figure 7), the adjustments have produced a reduction in standard deviation (σpop) from 3.6% to 2.1% with no change in the mean. The adjustment has moved the measurement of Org. 10 from an outlier at 2.4σpop to being well within 2σpop of the mean. The value of Org. 1's measurement has not changed significantly, but now falls just within -2σpop from the mean because of the reduction in σpop. Compared to the unadjusted data shown in Figure 2, the discrepancies evident in the adjusted data appear well represented by the revised measurement uncertainties.
In Figure 8, a similar picture emerges for the adjusted 60 Co measurements. Here there has been a marked reduction in σpop from 6.4% to 2.5% and a reduction in the range of data (maximum minus minimum) from 24% to 9%, as reflected in the revised range of the vertical axis compared with Figure 4. There has also been a 0.6% reduction in the mean. Analysis of the adjusted data in accordance with [3] shows that the Org. 10 value can no longer be considered an outlier.
Similar data for the other radionuclides are not shown because mass-adjustment produced little or no significant change. For 46 Sc, the adjustments produced a reduction in σpop from 3.4% to 3.3%, and for 110m Ag a reduction from 3.8% to 3.7%. For 54 Mn, σpop remained unchanged at 2.1%.
It is important to note that the mass-adjusted data do not necessarily represent an improvement in every case. To illustrate this, Org. 0's 60 Co measurements have no significant sample-to-sample variation (as outlined in Section 5), but the adjustment creates artificial sample-to-sample variation, especially between samples 1556 and 1557. Such variation is an artifice created by the differences in individual organisation's mass measurements influencing the sample average.
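The mass adjustment and the associated uncertainty treatment described earlier can be sketched as follows. Because a submitted specific activity is the activity divided by the participant's own mass measurement, multiplying by that mass and dividing by the mean of all mass measurements of the sample is equivalent to recomputing the specific activity with the consensus mass. All numerical values and the function name below are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def mass_adjust(spec_act, rel_u_spec_act, mass_org, rel_u_mass, all_masses):
    """Adjust a specific activity to the mean sample mass.

    spec_act       submitted specific activity (Bq/g)
    rel_u_spec_act relative uncertainty on the submitted specific activity
    mass_org       mass of the sample as measured by this organisation
    rel_u_mass     relative uncertainty on that mass measurement
    all_masses     every mass measurement of the same sample (same units as mass_org)
    """
    mean_mass = sum(all_masses) / len(all_masses)
    adjusted = spec_act * mass_org / mean_mass
    # Remove the mass contribution in quadrature (assumes uncorrelated components).
    variance = rel_u_spec_act ** 2 - rel_u_mass ** 2
    if variance < 0.0:
        raise ValueError("mass uncertainty exceeds the specific activity uncertainty")
    return adjusted, math.sqrt(variance)


# Hypothetical Al-Co sample weighed by four organisations (masses in mg).
masses = [2.0, 2.1, 2.2, 1.9]
adjusted, rel_u = mass_adjust(2.50e7, 0.025, mass_org=2.2, rel_u_mass=0.015,
                              all_masses=masses)
print(f"adjusted specific activity = {adjusted:.3e} Bq/g, relative uncertainty = {rel_u:.1%}")
```

In this example the organisation reported a mass above the mean, so its submitted specific activity was low and the adjustment raises it. The same mechanism explains the caveat above: if the mean mass itself is pulled by one discrepant measurement, the adjustment introduces apparent variation into results that were originally consistent.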
The purpose of the mass adjustment is to determine the sensitivity of the inter-comparison to changing sample mass data to consensus values. Doing so provides an indication of the mutual consistency of the absolute activity measurements carried out by the participants on each batch of samples, with greatly reduced dependence on individual sample mass measurements. It is emphasised that comparison of the as-submitted specific activities, and the conclusions drawn from it, forms the primary purpose of this work. Comparison of mass-adjusted data serves only to aid understanding of how deficiencies in individual mass measurements may have affected the results obtained.

Impact of Mass Adjustment on the Distribution of Discrepancies
In comparison with Figure 6, the left-hand histogram of Figure 9 shows that, compared to measurement uncertainties (σmeas), the distribution of discrepancies amongst the adjusted data has lost three of the significant outliers at ±6σmeas, but the single value at -10σmeas remains (Org. 3, 110m Ag). Inspection of the data shows that although mass adjustment brought this point closer to the mean, there was a commensurate reduction in σmeas. Other than that, the distribution appears considerably narrower. This is confirmed by a closer inspection of the data showing that 58 out of the 65 (89%) values lie within ±3σmeas, suggesting that the distribution is still a little too scattered to be considered normal. Chi-squared analysis of the distributions for individual nuclides suggests that data for 54 Mn are normally distributed, but not for the other four radionuclides.
Compared to the standard deviations of the adjusted data, σpop (right-hand histogram), chi-squared tests provide high confidence that the discrepancies remain normally distributed for each radionuclide as well as the data as a whole, i.e. unchanged from the as-submitted data, despite the reductions in σpop.

Figure 9. Discrepancies of mass-adjusted specific activities from mean relative to measurement uncertainties, σmeas (to the left) and relative to standard deviation, σpop (to the right).

From these observations, it is concluded that the participants' estimates of the uncertainty on absolute activities are reasonable overall. In addition, the most significant outlying values obtained in the original comparison are a result of difficulties in measuring mass values, especially of the nickel and Al-Co samples. It remains a matter of debate as to why the 110m Ag results did not show a similar improvement, given that the masses of the Al-Ag samples were also comparably low.

Normalised Results
Table 3 presents normalised discrepancies, $(A_{i,\mathrm{org}} - \langle A_i \rangle) / \langle A_i \rangle$, where $A_{i,\mathrm{org}}$ is the specific activity of radionuclide i measured by organisation org and $\langle A_i \rangle$ is the mean of all measurements of radionuclide i. Because Org. 0 measured every sample, an exception is made such that the value of $A_{i,\mathrm{org}}$ for Org. 0 is the mean of the four samples measured by other organisations (i.e. excluding retained samples). Normalised in this way, the data permit the identification of systematic trends of under- or over-prediction by any participating organisation.
The table presents the normalised discrepancies for both the as-submitted data and the mass-adjusted data. Averages across all five radionuclides are presented for each participant, as well as the RMS (root mean square) of all the normalised values for each radionuclide and for the set as a whole. Comparing the RMS values for the as-submitted values with the mass-adjusted values provides a further way to quantify the impact of the mass adjustment.
From the data in Table 3, it can be seen that:
• As already noted, across the thirteen organisations, the averaged level of agreement is very good. The most discrepant as-submitted participants are Org. 10 (+6.5%), Org. 2 (-4.1%), Org. 1 (-3.4%) and Org. 8 (-3.3%);
• With mass-adjustment, the results for Org. 10, Org. 8 and Org. 2 improve markedly to +3.0%, -2.1% and -1.5% respectively, although they, together with Org. 1 (-3.0%), remain the four most discrepant participants;
• Changes in RMS values (as-submitted to mass-adjusted) are only significant for 58 Co and 60 Co, quantitatively confirming the observations made in Section 6.3 that these are the only two sets of measurements affected by inaccuracies in mass measurements.
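The normalisation and RMS summary described above can be illustrated with a short calculation, assuming the normalised discrepancy is the relative deviation of each value from the mean for that radionuclide. The sketch uses only the 54 Mn and 58 Co as-submitted specific activities for Org. 0 to Org. 12, so the per-organisation averages it produces differ from the five-nuclide averages of Table 3. The variable and function names are introduced here for illustration.

```python
import math

# As-submitted specific activities for Org. 0 .. Org. 12.
data = {
    "54Mn": [6.59, 6.16, 6.48, 6.51, 6.57, 6.49, 6.57,
             6.58, 6.44, 6.60, 6.76, 6.50, 6.58],   # units of 1e5 Bq/g
    "58Co": [6.22, 5.93, 5.88, 6.44, 6.28, 6.14, 6.26,
             6.14, 6.00, 6.27, 6.74, 6.20, 6.12],   # units of 1e6 Bq/g
}


def normalised_discrepancies(values):
    """Relative deviation of each value from the mean of all values."""
    mean = sum(values) / len(values)
    return [(v - mean) / mean for v in values]


def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


norm = {nuclide: normalised_discrepancies(v) for nuclide, v in data.items()}
rms_by_nuclide = {nuclide: rms(d) for nuclide, d in norm.items()}

n_orgs = len(data["54Mn"])
org_average = [sum(norm[n][org] for n in norm) / len(norm) for org in range(n_orgs)]
most_discrepant = max(range(n_orgs), key=lambda o: abs(org_average[o]))

for nuclide, value in rms_by_nuclide.items():
    print(f"{nuclide}: RMS normalised discrepancy = {value:.1%}")
print(f"Most discrepant on these two nuclides: Org. {most_discrepant} "
      f"({org_average[most_discrepant]:+.1%})")
```

Averaging the signed discrepancies per organisation highlights systematic offsets, whereas the RMS per radionuclide measures overall scatter regardless of sign.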
The first EWGRD Round Robin (RR1) intercomparison [1] was reported to the 15th International Symposium on Reactor Dosimetry in 2014 and described measurements carried out in 2012, some six years before the measurements described in this paper. Although there were important differences in approach between the two comparisons, it is informative to compare the outcomes of both. For this purpose, standard deviations have been chosen as a simple high-level measure of the mutual consistency between different organisations' measurements.
For each radionuclide, Table 4 gives the mass, the total activity (specific activity × mass) and the standard deviation obtained from the measurements performed on each sample (RR1), or batch of nominally equivalent samples (2nd Round Robin, RR2). Although each intercomparison considered measurements of five different radionuclides, only the four radionuclides given in the table were common to both.
Table 4, note (1): Column 'RR1 Orgs' shows the standard deviations obtained from the subset of organisations that participated in RR1. Values in parentheses are from mass-adjusted specific activities. Where only one value is provided, mass-adjustment had no effect.
As the table shows, in RR1 two samples of each radionuclide were measured, in most cases with notably different masses and activities. For RR2, the table shows the average mass of the five samples provided for each nuclide, the average sample activity for each batch, and the standard deviations obtained from the specific activity measurements performed for each batch, both for the as-submitted data and the mass-adjusted data. The second column from the right shows the standard deviation of the measurements carried out on all five samples in each batch by Org. 0, i.e. an indication of the variability within the control measurements.
With the exception of the more massive 54 Mn sample measured in RR1, the original (i.e. unadjusted) standard deviations obtained in RR2 are all greater than those obtained in RR1. Even the reduced standard deviations resulting from the use of mass-adjusted data are higher than those obtained in RR1. This is an interesting result because it suggests poorer levels of consistency amongst the European dosimetry community in RR2, and it warrants some discussion.
As seen in the current work, measurements of mass can affect the accuracy of specific activity calculations, especially when samples are small and/or less sensitive balances are used. However, it can be seen from Table 4 that, whilst they vary, the masses measured in RR1 are comparable with those measured in RR2. Similarly, Table 4 shows that the activities of measured radionuclides on corresponding RR1 and RR2 samples are also comparable within an order of magnitude, suggesting that high or low activity levels are not the cause. It therefore seems very unlikely that the properties of the samples are the reason for the differences seen.
An important difference in approach between RR2 and RR1 was the measurement of batches of nominally equivalent samples by subsets of participants rather than all participants measuring all samples, as in RR1. Whilst proving expedient, especially for an inter-comparison with many participants, the RR2 approach creates the risk that sample-to-sample variability might influence the submitted measurements. The use of Org. 0 as a control for variability provided a means of assessing this. In Table 4, the second column from the right shows that for all except 54 Mn, sample-to-sample variability as indicated by the standard deviation in the Org. 0 results is less than 1% and is not sufficient to reconcile RR1 and RR2 results. Whilst the sample-to-sample variability indicated for 54 Mn would have an impact for this radionuclide, this is the one radionuclide where the two Round Robins appear more consistent.
Having discounted the explanations above, a final one is that the participants in the two inter-comparisons were different. Certainly, RR2 had more participants: thirteen compared to eight in RR1, of whom only six participated in RR2. However, that would only affect the consistency of the inter-comparisons if the performance of the original RR1 participants were more consistent than that of the RR2 participants who had not taken part in RR1. To assess this, the final column in the table shows the standard deviation of the specific activities of RR2 measurements calculated for the subset of six organisations participating in both Round Robins. In cells where two values are shown, the value in parentheses corresponds to the mass-adjusted data. Where only one value is provided, as-submitted and mass-adjusted values are the same.
For all except 54 Mn, the RR1 participants produce notably lower standard deviations, comparable with those seen in RR1 [1], albeit that for 60 Co it is the mass-adjusted data rather than the as-submitted data that are consistent. For 54 Mn, the RR1 participants do not perform any better but, as already noted, the two Round Robins compare reasonably well for this radionuclide in any case.
In conclusion, therefore, it appears that the original RR1 participants performed more consistently within RR2 than those who had not taken part in RR1. Whilst speculation on the reasons for this is beyond the scope of this paper, it is possible that, having already participated in a Round Robin six years before, the greater experience of the original participants has translated into more consistent performance.

Lessons Learned and Ideas for Improvement
Although Round Robin inter-comparisons are well established as a means of underwriting and validating tools and techniques in Reactor Dosimetry, there were aspects to this exercise that were novel and it is worth briefly discussing these and ideas for improving similar exercises carried out in future.
The measurement of batches of nominally equivalent samples by subsets of participants was a departure from the protocol followed in the original EWGRD Round Robin. This proved advantageous because it allowed more organisations to participate without incurring time delays associated with measurements in series, which would have risked significant levels of decay for the shorter-lived radionuclides. Whilst within-batch (sample-to-sample) variability risks increasing the scatter between measurements, the use of a single organisation to measure each sample provides an effective control if that organisation performs measurements with high precision. As seen in this work, the control measurements indicated modest variability, with 1.4% as the highest standard deviation in any of the five batches.
The standard of submissions received by the referee was high, with only a handful of typographical errors identified. This observation represents a significant improvement on RR1 where a number of errors were detected in the original submission. In the main part, this improvement can be attributed to the increased vigilance of the participants having learned lessons from RR1 [1]. However, the use of a document template clearly specifying the requirements of each participant is judged to have played a role in the improved performance.
Compared to RR1, where every participant measured the mass of each sample, the protocol adopted in RR2 has the disadvantage that relatively few (between three and five, excluding the retained samples) measurements of mass were performed for any one sample. Consequently, it has been difficult to make statistical inferences on the consistency of mass measurements. This is unfortunate because one of the most important observations from RR2 is that discrepancies in mass measurement have resulted in a number of outlying specific activity measurements. With greater confidence in the mass of each sample, it would have been possible to shift the focus away from sample mass measurements and more onto the activity measurements, i.e. the primary objective of the inter-comparison. To reduce the adverse impact of discrepant mass measurements on future inter-comparisons, the following could be adopted:
• Benchmark the masses in advance by obtaining independent high-precision measurements;
• Avoid small sample masses so that mass measurement uncertainties are minimised.
It is important too that participants provide absolute activity measurements together with their associated uncertainties. In the event that data have to be manipulated or adjusted, as done here, it is helpful to be able to separate laboratory-identified contributions such as mass.

Conclusions
The European Working Group on Reactor Dosimetry has undertaken a second Round Robin to compare gamma spectrometry measurements performed by thirteen European organisations on a set of irradiated dosimetry samples representative of those commonly used by the Reactor Dosimetry community to measure neutron fluence. The radionuclides measured were 110m Ag, 58 Co, 60 Co, 54 Mn, and 46 Sc.
The results presented anonymously in this paper show that:
• The thirteen organisations have provided consistent measurements of specific activity for each of the radionuclides, with standard deviations varying between 2.1% ( 54 Mn) and 6.4% ( 60 Co);
• Compared to the measurement uncertainties provided by the participants, 71% of the measurements are within ±2 multiples of measurement uncertainty;
• Samples weighing c. 5 mg or less have resulted in larger than expected discrepancies in the results of some of the participants, a number of which have fed through as outliers in the comparison of specific activity data;
• Adjustment of (all) specific activity data to make them consistent with the average of the measured masses for each sample produced a significant improvement in overall agreement. The range of standard deviations obtained was reduced, ranging from 2.1% ( 54 Mn and 58 Co) to 3.7% ( 46 Sc). Correspondingly, 90% of the mass-adjusted measurements lie within ±2 multiples of measurement uncertainty;
• Compared to the original European Working Group on Reactor Dosimetry Round Robin, the results of the latest inter-comparison show poorer consistency between organisations. However, analysis shows that for the subset of participants that took part in both Round Robins, similar levels of consistency were obtained.
Finally, it is concluded that the organisations participating in this inter-comparison have produced a consistent set of activity measurements with representative measurement uncertainties. However, the measurement of small masses could present a significant challenge with the potential to undermine the specific activity values which depend upon them.