Modeling Photomultiplier Gain and Regenerating Pulse Height Data for Application Development

Systems that adopt organic scintillation detector arrays often require a calibration process prior to the intended measurement campaign to correct for significant performance variances between detectors within the array. These differences exist because of low tolerances associated with photo-multiplier tube technology and environmental influences. Differences in detector response can be corrected for by adjusting the supplied photo-multiplier tube voltage to control its gain and the effect that this has on the pulse height spectra from a gamma-only calibration source with a defined photo-peak. Automated methods that analyze these spectra and adjust the photomultiplier tube bias accordingly are emerging for hardware that integrate acquisition electronics and high voltage control. However, development of such algorithms require access to the hardware, multiple detectors and calibration source for prolonged periods, all with associated constraints and risks. In this work, we report on a software function and related models developed to rescale and regenerate pulse height data acquired from a single scintillation detector. Such a function could be used to generate significant and varied pulse height data that can be used to integration-test algorithms that are capable of automatically response matching multiple detectors using pulse height spectra analysis. Furthermore, a function of this sort removes the dependence on multiple detectors, digital analyzers and calibration source. Results show a good match between the real and regenerated pulse height data. The function has also been used successfully to develop auto-calibration algorithms.

detector response functions with a fixed high-voltage supply.
Response matching scintillation detectors by aligning the pulse height spectra (PHS) obtained from a gamma source with a well-defined photo-peak, such as cesium-137, by means of adjusting the bias applied to the PMT is common practice [12].Recent advances in a real-time embedded control system for fast-neutron scintillation detectors include a procedure that automatically performs detector calibration using this principle [13].
Prior works have modeled and synthesized pulses from organic scintillation detectors in response to neutrons and gammas [14], but do not provide a simulation for pulse height distributions derived from these detectors.Others have more comprehensively simulated response functions and pulse shape discrimination (PSD) for organic scintillation detectors with Geant4 [15], demonstrating excellent agreement, among others, between experimental and simulated energy spectra for a detector exposed to 0.662 MeV gammas from Cs-137.The ability to simulate a detector of varying complexity and configuration within a single Monte Carlo code is impressive but indeed complex and does not encompass features derived from the processing electronics such as pre-amplification, pulse shaping and digitization.
This paper reports on a method for modeling and regenerating pulse height data for testing algorithms for automatically calibrating scintillation detectors.It is proposed that such regenerated data can shorten development by facilitating system testing when access to multiple detectors, digital acquisition electronics or calibration sources are limited.
Pulse height data acquired from an EJ-309 liquid scintillator (supplied by Scionix, The Netherlands) operated at several different high voltage biases and exposed to a Cs-137 source are presented using a multi-channel analyzer (MCA).The MCA channel containing the mode counts in the Comptonedge region is used as a measure of the gain of the detector's PMT and is compared to the supply bias.The relationship between the mode counts in the Compton-edge region on the PHS and the related supply bias is modeled.Different detector PMTs with varying gain parameters have also been modeled.These models have been verified against real data and have been used to regenerate pulse height data that have assisted in the development and validation of several different autocalibration algorithms.

II. EXPERIMENTAL SETUP
Pulse height data were obtained using a Hybrid Instruments Ltd.Mixed Field Analyzer (model MFAx4.3,serial # 4322) in MCA mode [16].The MFAx4.3 samples the analogue input signal at 500 MSa/s with 12-bit amplitude resolution.The analogue-to-digital converter (ADC) is operated in bipolar mode, hence zero-baseline is represented as 2047 and the maximum sample amplitude as 4095.The variable gain amplifier (VGA) of the MFAx4.3 was configured with the gain set to low (10 dB) and the attenuation ladder set to -12 dB, providing a net gain of 6.84 dB.The trigger threshold was operated in greater-than mode and set to 100, therefore only capturing pulse height data exceeding 2147 (derived from 2047 zero-baseline + 100 trigger threshold).
Pulses were baseline corrected in real-time by the MFAx4.3.After baseline correction, 0 V is represented as 200 by the 12-bit ADC and the maximum 430 mV signal amplitude (governed by the VGA settings described above) is represented as 2247.
With the MFAx4.3operating in MCA mode the pulse sample with the largest amplitude, i.e. the pulse height is sent to the host computer via an Ethernet connection and written to a TXT-file.A single Ethernet packet contains 100 pulse heights to increase data transfer efficiency.
Data from an EJ-309 detector (serial # SBM842) exposed to a Cs-137 source were recorded for PMT biases ranging from −1800 V to −1600 V in 50 V increments.The source had an activity of 333 kBq and was positioned on-axis and 50 mm from the front face of the detector.For each PMT bias 100 000 pulse heights were recorded and saved to the TXT-file.These data were imported into MATLAB®, reformatted to represent the pulse heights obtained from each PMT bias and saved as a MAT-file for convenience.

III. IMPLEMENTATION
The developed MATLAB® function utilizes two models to regenerate pulse height data by rescaling real data.The first model replicates the variations in response between different detector-PMTs of the same model equation (2), with predefined values used to model ten different detector-PMTs.The second, models the detector's response with relation to the pulse height spectra (PHS), i.e. the location of the Compton-edge on a Cs-137 spectrum, for different applied PMT biases (5).
The function takes three inputs.The first allows the user to specify the magnitude of the negative high voltage bias, 'hv', applied to the detector's PMT.'hv' is in positive integer format ranging from 0 to 2000.The second input allows the user to select one of the ten predefined detectors, 'dnum', that correspond to a scalar value D, and is in integer format ranging from 1 to 10.The final user input is used to specify how many pulse height data are generated, 'qty'.'qty' is in positive integer format.A forced delay has been implemented that is directly proportional to 'qty', this replicates real data collection, i.e. more data improves statistics but takes longer to collect.The delay can be bypassed to speed up development time.
The function outputs pulse heights ranging from 0 to 2047 at roughly the quantity specified by 'qty'.The quantity of pulse heights produced will be slightly different than that specified by 'qty' because of a normally distributed pseudorandom error applied to the regenerated data.
Regenerated pulse height data are produced by the function using the follows steps.1) Calculate the mode pulse height in the Compton-edge region, M, using (2) for the specified high voltage, 'hv', and detector, 'dnum'.Calculate the bin width, W, necessary to scale the data to the mode pulse height, M, using (5).
2) Read-in 100 000 prerecorded pulse heights and subtract the minimum recorded value to shift the spectrum left to the zero-origin.The data acquired with the EJ-309 detector operated at −1750 V were used in this instance.
3) Partition the data into histogram bins according to the calculated bin width, W. Replace any zero-count bins with the mean count of the two neighbouring bins.4) Filter the bin counts with a 200-bin moving average filter.5) Resize the number of bins to 2047 by either padding with zeros if less than 2047 or by deleting the right-most bins if greater than 2047.6) Normalize the data; bin counts are divided by the total number of counts.7) Scale the normalized bin counts to the specified number of pulse heights determined by 'qty' and then apply a normally distributed pseudorandom error.8) Generate pulse heights in the calculated quantities and randomize the order of pulse amplitudes.9) Delay release of data directly proportional to 'qty'.

IV. RESULTS AND DISCUSSION
The first model used by the function represents the variations in response between different detector-PMTs of the same type.The model was produced by plotting the PHS acquired from the EJ-309 detector operated at −1800 V, −1750 V, −1700 V, −1650 V and −1600 V whilst exposed to the 333 kBq Cs-137 source, as shown by Fig. 1.The bin counts of the spectra were smoothed using a 100-bin moving average filter.
The mode pulse height in the Compton-edge region of the spectra shown in Fig. 1 were plotted against the applied PMT bias to produce Fig. 2. Using a curve fitting tool these data were fit with an exponential trend (1) which was generalized (2) to serve as a model for the relationship between the mode pulse height, M, and the magnitude of the applied PMT bias, v.  = 0.5424 • e 0.0045• (1) Where D is a scalar used to represent different detectors.Ten detector models were predefined with D values ranging from 0.52 to 0.80.
EPJ Web of Conferences 170, 07001 (2018) https://doi.org/10.1051/epjconf/201817007001ANIMMA 2017 Fig. 1.Pulse height spectra (PHS) acquired from the EJ-309 detector operated at −1800 V, −1750 V, −1700 V, −1650 V and −1600 V whilst exposed to the 333 kBq Cs-137 source.The bin counts of the spectra are filtered using a 100bin moving average filter.Fig. 2. Nonlinear relationship between mode pulse height, M, in the Compton-edge region and the applied PMT bias, v.These data were derived from pulse heights acquired from an EJ-309 liquid scintillator exposed to a Cs-137 source and operated at biases ranging from −1800 V to −1600 V in 50 V increments.About 100 000 pulse heights were recorded for each bias.Fig. 3 was produced by first loading 100 000 prerecorded pulse heights into MATLAB®.These data were acquired from the EJ-309 detector operated at −1750 V whilst exposed to the Cs-137 source.Any events that exceeded a pulse height of 2245 were removed and then the minimum recorded pulse height was subtracted from all pulse height in the data set to shift the spectrum left to the zero-origin.These data were then partitioned into bins and the count in each bin determined using the MATLAB® histcounts function.The number of bins were specified by the scalar, nbins, expressed by script (3), where W is the bin width.

𝑛𝑛𝑛𝑛𝑛𝑛𝑛𝑛𝑛𝑛 = (0 ∶ 𝑊𝑊 ∶ 2046);
(3) Following this partitioning process any bins with zero counts were replaced with the average count of the two neighboring bins.Zero-count bins are more prevalent when the histogram is stretched i.e. when W is less than 1.The bincounts were then filtered with a 200-bin moving average filter.The number of bins were then resized to 2047 by either padding with zero-count bins (if less than 2047 bins) or by deleting the right-most bins (if greater than 2047 bins).When W is greater than 1 the number of bins produced is less than 2047 and when W is less than 1 the number of bins produced is greater than 2047.The mode pulse height, M, was then recorded for specified bin widths, W, ranging from 0.6 to 1.7 in 0.1 increments (Fig. 3).Using a curve fitting tool a power trend was realized (4).
Rearranged to make W the subject produces the model used to determine the bin width, W, required to scale the mode pulse height of the real data to the specified pulse height, M (5).
Fig. 3. Nonlinear relationship between mode pulse height, M, in the Compton-edge region and the histogram bin width, W.These data were derived using different bin widths (0.6-1.7) during the histogram partitioning process and observing the scaling effect this had on the mode pulse height.
Fig. 4 shows the unfiltered PHS employed by the scaling function.These data were acquired from the EJ-309 detector operated at −1750 V whilst exposed to the 333 kBq Cs-137 source.For comparison, Fig. 5 shows the PHS for three modeled detectors produced by the function described in section III and using the real pulse height data shown in Fig. 4.An analysis of variance (ANOVA) was used to compare the real pulse height data (Fig. 4) and the regenerated pulse height data for Detector 2 (Fig. 5).The F ratio (1.71) is smaller than the F crit value (3.84) indicating no statistically significant difference and a good resemblance between the real pulse height data and the regenerated pulse height data for the modeled detectors.The effects of the applied normally distributed pseudorandom error are of interest with respect to the likeness to the real data and the challenges that statistical uncertainties present for auto-calibration algorithms.For instance, in this work a moving average filter was used to increase the validity of the measured mode pulse height given the continuum nature of the spectra, however alternative methods could be adopted.For the data shown in Fig. 4 and Fig. 5, both the real detector and the modeled detectors were operated at −1750 V.The D value, see (2), for detector 1 was 0.778, detector 2 was 0.654 and detector 3 was 0.542.Fig. 5. Pulse height spectra (PHS) produced for three modeled detectors using the pulse height scaling function described in section III and the data shown in Fig. 4.Each spectrum contains approximately 100 000 pulse heights and each detector was operated at −1750 V.

V. CONCLUSION
These scaled and regenerated pulse height data have proved an acceptable means for validating new auto-calibration algorithms.The reported method has shortened development time for techniques utilizing spectra analysis and has removed the dependence on and access to multiple detectors, digital acquisition electronics and calibration source with associated risks and constraints.
Moreover, an undergraduate Software Engineering project utilized the described function with a brief to design an autocalibration algorithm for aligning the mode pulse height to a specified pulse height for several model detectors by determining the detector specific PMT bias.Several different auto-calibration algorithms were conceived and extensively validated without any need for the students to access physical detectors, digital analyzers and calibration source.
Further studies are focusing on methods that are capable of automatically determining PSD parameters, including discrimination thresholds, to enable reliable and repeatable neutron/gamma discrimination.Eventually, advances of this sort will enable digital analyzers that have embedded highvoltage supplies that can optimize their entire configuration automatically to suit a range of scintillation detectors.Clearly, developments of this sort will require diverse data sets of the entire digital waveform acquired from a range of scintillation detectors and mixed-field sources or models that can adequately generate equivalent synthetic data.

Fig. 4 .
Fig.4.Unfiltered pulse height spectra (PHS) utilized by the pulse height scaling function.These data were acquired from the EJ-309 detector operated at −1750 V whilst exposed to the 333 kBq Cs-137 source.