NIKA2 mapping and cross-instrument SED extraction of extended sources with Scanamorphos

The steps taken to tailor to NIKA2 observations the Scanamorphos algorithm (initially developed to subtract low-frequency noise from Herschel on-the-fly observations) are described, focussing on the consequences of the different instrument architecture and observation strategy. The method, making the most extensive use of the redundancy built in the multi-scan coverage with large arrays of a given region of the sky, is applicable to extended sources, while the pipeline is so far optimized for compact sources. An example of application is given. A related tool to build consistent broadband SEDs from 60 microns to 2 mm, combining Herschel and NIKA2 data, has also been developed. Its main task is to process the data least affected by low-frequency noise and coverage limitations (i.e. the Herschel data) through the same transfer function as the NIKA2 data, simulating the same scan geometry and applying the same noise and atmospheric signal as extracted from the 1 mm and 2 mm data.


Introduction
Infrared and millimeter arrays made of hundreds to thousands of detectors, such as NIKA2, allow very efficient on-the-fly mapping of large fields of view, but call for special data processing techniques to remove both the atmospheric foreground and the instrumental lowfrequency noise, hereafter generically called drifts, or low-frequency noise, while preserving extended astrophysical sources. We here present a tool designed to extract all the drift components from the observations by exploiting the redundancy as thoroughly as possible, taking advantage of the fact that within the nominal field of view, each position on the sky is sampled by many different detectors at many different times, and that the atmosphere varies significantly between successive scans of the same source. The underlying assumptions are the same as for the version of the tool initially developed for PACS and SPIRE onboard Herschel, despite the obvious differences in the physical nature of the drifts and in the instrument architecture: (1) the astrophysical signal is invariant while the drifts are supposed to vary on all timescales comprised between about 10 sampling periods and the map width crossing time ; (2) the probability distribution of the uncorrelated part of the drifts is symmetric about zero, but not necessarily Gaussian.
The effort to adapt the algorithm to NIKA2 observations is still underway, and for the moment applies to total-intensity observations only. It is part of a larger endeavor to apply it to various ground-based and balloon-borne instruments, all affected by dominant low-frequency noise: the ArTéMiS 350-450 µm camera installed at APEX [1] and the PILOT experiment in polarization at 240 µm, operating at altitudes of 35 − 40 km [2].

Prerequisites
A thorough description of the instrument, the observing modes and the whole calibration procedure is given by [3] and [4]. The raw data have to be pre-processed with the pipeline to accomplish these tasks: detector identification, opacity correction, flux calibration, computation of spatial coordinates, and rejection of the few detectors with overlapping frequency tones. Then they are reformatted, with basic information attached, by dedicated scripts.

Relative gains for extended emission
All multiplicative effects must be corrected before the low-frequency noise can be subtracted, since an incorrect calibration will induce errors in the signal of the same source seen by different detectors, that will be ascribed to drifts and therefore compromise flux preservation. The flux calibration is done for point sources from special products called beammaps, allowing to obtain a well-sampled map of a primary calibrator for each individual detector [4], but this procedure does not allow to capture variations in the ratio between the extended part and the core of the beam from detector to detector. Hence, to calibrate the relative responses to extended emission, an additional step is necessary. A simple way of doing it is to use the responses to the atmosphere, since it is very bright, can be considered uniform across the array size (which was verified), and is available in all on-the-fly observations, allowing to check the stability of the responses during a whole campaign. In practice, we use all scans longer than 5 min to iteratively estimate the atmospheric signal and fit affine functions of this signal for each valid detector. Since the gains are much better constrained when the atmosphere fluctuates a lot with time, the global gains of the campaign are computed by weighing the gains of each scan by the dynamic range of the atmosphere brightness. At 1mm, the atmospheric signal is determined from array 3, that is much better behaved than array 1. Consequently, the global relative response of array 1 is not unity (unlike for arrays 2 and 3, by construction), but on the order of 1.15 . Although in principle these near-field responses are not exactly equivalent to the far-field responses that are truly needed, it is better to take them into account: their application improves the results obtained on bright extended sources at 1mm.

Algorithm and step-by-step processing
The drifts can be decomposed according to their physical origin: the atmospheric emission, strongly correlated between all detectors ; the electronic noise, strongly correlated between detectors belonging to the same electronic box or the same sub-box [5] ; and the uncorrelated flicker noise. However, to avoid having to determine the correlation coefficients for all detectors, it is much simpler from an algorithmic point of view to adopt a different decomposition: the average drift for the whole array ; the average drift for each electronic box or sub-box ; and the complement, i.e. the individual drifts. No relationship is assumed between the atmospheric emission at 1mm and 2mm, and the processing in one band is totally independent from the other band. Nevertheless, the average drift (heavily dominated by the atmosphere) extracted from the data is consistent with the same function in both bands (with normalizations that are constant under similar observing conditions).
The corrections using the redundancy (i.e. numbers 2 to 4 below) make use of a coarse spatial grid, where the size of a coarse pixel is equal to the stability length (typically between 0.5 and 1 times the FWHM), corresponding to the minimum timescale t c below which the drifts will not be corrected. This stability length must be large enough to allow computing simple statistics from the samples of each detector taken during such a time interval. The requirement is that the minimum number of samples per crossing is 7. These statistics allow rejecting data having fluctuations in significant excess of the white noise, either because of glitches or because of contamination by a compact source. For more details than is practical to give here, the reader is referred to [6].
The recorded signal is modeled as: R(t, k i ) = S (p) + D aver (t) + D indiv (t, k i ) + HF(t, k i ) as a function of time t, detectors k i and coarse pixels p, where S is the sky signal, HF is the high-frequency noise (white noise and glitches), and D designates the drifts.
Since the noise has more power at the longest temporal and spatial scales, for each noise component the longest timescales have to be subtracted first. Here are the main steps, in the order in which they are performed: (1) Baseline subtraction, on timescales comprised between the map crossing time and the scan duration: Baselines consist of linear fits of the signal averaged over the whole array or over electronic boxes, as a function of time, and are designed to be robust with respect to outlying samples. The correction is iterative, and a source mask automatically built from the current map is updated at each iteration and transferred to the time domain. The criteria for the mask are based on thresholds in units of the error map and of the standard deviation in the map, as well as the need to avoid masking too large a fraction of the map or masking predominantly the edges.
(2) Average drift subtraction on short timescales: Within each coarse pixel, differences between pairs of detector crossings are computed and coadded according to their times: ∆(t 1 , t 2 ) = R(t 1 , k i ) − R(t 2 , k j ) = D aver (t 1 ) − D aver (t 2 ) + D indiv (t 1 , k i ) − D indiv (t 2 , k j ). After coaddition over all coarse pixels and detectors, given enough redundancy, the differences of individual drifts vanish, and one obtains differences between the average drift at two different times. These differences are stored in a matrix, that is scanned several times to iteratively build the time series of the average drift, assuming a null median. Since the solution is not  unique, there is a spurious excess drift, of the same periodicity as the geometry of the observations, that is removed by another instance of the baseline subtraction, optionally including piecewise baselines for very large fields (but the same function for each leg of a scan).
(3) Average drift per electronic box or sub-box, on successively smaller timescales: This component is subtracted after computing the individual drifts (as in step 4), and then their weighted average for each box or sub-box separately. (4) Individual drifts, on successively smaller timescales: Within each coarse pixel, they are computed as the difference between each detector crossing and the weighted average of all crossings: ..N (D aver (t j ) + D indiv (t j , k j )) which reduces to D indiv (t, k i ) provided there is enough redundancy, given that the average drift has been subtracted beforehand. For each timescale longer than t c (corresponding to the stability length), the correction is binned in time, and steps 3 and 4 are performed successively and followed by a new instance of the baseline subtraction. When working at timescale t c , only step 4 is done.
Other functionalities include the detection and masking of bad detectors, glitches, and instabilities (detuning) that are short enough to be treated as glitches, such as those frequently occurring during the March 2016 campaign, and the detection of scans where a large fraction of the detectors are not properly tuned for a large fraction of the time.
An illustration of the main steps at 1 mm is shown in Fig. 1 to 5, for an extended bright source observed with only two scans at 30 arcsec s −1 , NGC 7538 . The separation between scan legs was relatively large, 30 arcsec, which might be the reason for the residual graininess of the sky. Note that the intermediate maps created during the processing are projected on a grid oriented along the scan directions. The median duration of a scan leg is 11.18 s, and the timescales for the subtraction of the individual drifts are 3.11 s, 1.04 s and 0.35 s.

Observing strategy requirements
It is important to realize that the quality of the processing depends crucially on the way the observations are performed. The most delicate step, i.e. most sensitive to the observing strategy, is probably the baseline subtraction, and particularly the construction of the source mask. In order to optimize the separation between low-frequency noise and signal, the following guidelines, stemming from close inspection of commissioning data, should be applied: (1) The trajectories of the detectors have to vary as much as possible from one scan to the next, ensuring that the time series of both the astrophysical signal and the atmosphere do not repeat themselves. In practice, it is convenient to use two orthogonal scan directions (or, in case the source geometry prevents it, two directions making between them an angle greater than 30 degrees and as large as possible), and to keep alternating between them. (2) If the source to be mapped is elongated, it is best to avoid scanning along its major axis. (3) The scanning direction has to make a substantial angle with respect to the long axis of the electronic boxes (fixed in Nasmyth coordinates), to ensure a better determination of the average drift per electronic box, because in that case different positions on the sky are sampled at successive times for each box. (4) To ensure robust baselines, all detectors should go slightly beyond the source on each side of the scan, i.e. the scan length should be equal to the source size + twice the array diameter along the scan direction. (5) To also allow sufficient coverage and redundancy in the region of interest, the map width should be significantly larger than the source size in the scan transverse direction. The width to be chosen is a compromise between coverage uniformity and observing time. (6) For low-surface brightness sources, it is useless to keep accumulating data when the effective opacity (at the elevation of the source) becomes greater than 0.4 . For bright sources, effective opacities as high as 0.7 can be tolerated. (7) During the observations, a vigilant eye should be kept on the tuning of the detectors. Samples taken out of resonance are in practice unusable.

To conclude
Applications range from planets and secondary calibrators to star-forming molecular clouds and much fainter nearby galaxies (see the slides associated with these proceedings).
The machinery has been developed to simulate and process data from other instruments as if they had been acquired with the same geometry and the same noise as in NIKA2 data, i.e. applying the same transfer function. This will allow to combine Herschel and NIKA2 data of the same extended targets in order to obtain homogeneous spectral energy distributions. For lack of space, its description is postponed to another document.
The code will be made public after being vetted by the NIKA2 core team and will then be maintained and revised throughout the instrument operation.