Monitoring of dispersed smoke-plume layers by determining locations of the data-point clusters

A modified technique for processing signals recorded by a zenith-directed lidar operating in a smoke-polluted atmosphere is discussed. The technique is based on simple transformations of the lidar backscatter signal and on determining the spatial location of the data-point clusters. It allows more reliable detection of the location of dispersed smoke layering. Examples of typical results obtained with lidar in a smoke-polluted atmosphere are presented.


INTRODUCTION
The worldwide increase in the number of wildfires is considered a result of global warming. In turn, wildfires further influence climate change. Therefore, monitoring wildfires and predicting their possible distribution is an imperative task of atmospheric science. However, determining the exact location of a smoke plume is a significant challenge, especially when dispersed smoke layers and plumes are monitored in areas where the dispersion process creates an extended transition zone between regions of polluted and clear air.
Lidar is a good tool for monitoring smoke-layer behavior. In principle, it can easily distinguish regions with high levels of backscatter from regions of clear atmosphere and detect the boundary between different atmospheric layers. When the boundaries of aerosol formations are determined with lidar, some relative characteristics of the backscatter signal from the areas of interest are generally used [1][2][3]. In particular, with the gradient method, the smoke-plume boundary is located at the range where the derivative of the square-range-corrected lidar signal either is maximum or decreases from the maximum down to a fixed, user-defined level [1]. Unfortunately, there is no way to establish an optimal value for this level that would be acceptable for different atmospheric situations. Other techniques, including the wavelet technique [2], also require the selection of specific parameters, which can be a significant issue.
The situation becomes especially challenging when one must determine the upper height of an area of increased backscattering created by dispersed smoke. The boundary of such layering is difficult to define because the conventional criteria generally assume a sharp boundary between the clear sky and the area of increased backscattering. A data-processing method that allows determining the upper height of a region of increased backscattering with poorly defined boundaries is considered in [3].
In our current study, the method in [3] is modified and combined with the cluster principle, which allows more reliable determination of the locations of dispersed smoke boundaries. This methodology can be used with both one-directional and scanning lidar; however, below the method is considered only for profiling the atmosphere with a zenith-directed lidar.

METHODOLOGY
In the study [3], the recorded lidar signal, $P_\Sigma(r)$, which is the sum of the variable backscatter signal, $P(r)$, at the range $r$ and the range-independent offset, $B$, is transformed into the auxiliary function $Y(x)$. For the zenith-directed lidar, the latter is defined as a function of the independent variable $x$ in the form
$$Y(x) = P_\Sigma(h)\,x = [P(h) + B]\,x, \qquad (1)$$
where $x = h^2$ and $h$ is the height. The sliding derivative of this function, $dY/dx$, is calculated, and the intercept point of each local slope fit of the function with the vertical axis is found as the difference of two terms,
$$Y_0(x) = Y(x) - \frac{dY}{dx}\,x. \qquad (2)$$
By using the intercept function instead of the derivative of the square-range-corrected backscatter signal, the determination of the systematic offset, $B$, can be avoided: the offset term $Bx$ in Eq. (1) is linear in $x$ and therefore cancels in Eq. (2). The retrieval technique is based on the normalized intercept function, which is defined as [...]. For determining the upper boundary of the dispersed smoke plume of interest, the absolute value of this difference is calculated and then transformed into a function of height, [...] within the total height interval from $h_{\min}$ to $h_{\max}$ is found, and used for calculating the data points of interest, [...]. Here the independent variable $\chi$ can be chosen in the [...].
As in [3], consecutive values of $\chi$ with the fixed step $\Delta\chi = 0.05$ are used in this study; that is, $\chi_{\min} = 0$, $\chi_1 = 0.05$, $\chi_2 = 0.1$, etc. For each discrete $\chi$, the corresponding height of a smoke layer, $h_{\max}(\chi)$, was determined as the maximum height where $Y_{0,\chi}(h)$ was nonzero.
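The intercept calculation described above can be illustrated with a minimal Python sketch. It is not the processing code used in this study: the function name `intercept_function`, the window size `half_window`, and the synthetic signal are assumptions made only for illustration. The sketch forms the product of the recorded signal and x = h^2, estimates the sliding derivative with a local least-squares slope, and takes the intercept as the difference of two terms. The last lines check numerically that a constant offset added to the signal does not change the intercept.

```python
import numpy as np

def intercept_function(h, p_sum, half_window=5):
    """Intercept of the local linear fit of Y(x) = p_sum * x, with x = h**2.

    half_window is a hypothetical choice of the sliding-fit half width
    (in samples); the source does not specify the window used.
    """
    h = np.asarray(h, dtype=float)
    x = h ** 2
    y = np.asarray(p_sum, dtype=float) * x
    y0 = np.full(len(x), np.nan)
    for i in range(half_window, len(x) - half_window):
        xs = x[i - half_window:i + half_window + 1]
        ys = y[i - half_window:i + half_window + 1]
        dx = xs - xs.mean()
        # Least-squares slope of the local fit: the sliding derivative dY/dx.
        slope = np.sum(dx * (ys - ys.mean())) / np.sum(dx ** 2)
        # Intercept with the vertical axis as the difference of two terms.
        y0[i] = y[i] - slope * x[i]
    return y0

# Synthetic example (assumed signal shape, arbitrary units).
h = np.arange(100.0, 5000.0, 10.0)
p = 1.0e6 * np.exp(-h / 1000.0) / h ** 2
y0_without_offset = intercept_function(h, p)
y0_with_offset = intercept_function(h, p + 0.01)  # constant offset B = 0.01
valid = ~np.isnan(y0_without_offset)
print(np.max(np.abs(y0_without_offset[valid] - y0_with_offset[valid])))
```

The printed difference stays at the level of floating-point rounding, which reflects why the offset does not need to be determined separately when the intercept is used.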
The main challenge is the selection of the optimal level, $\chi_{opt}$, and the corresponding height of interest, $h_{\max}(\chi_{opt})$. In the previous study [3], such a selection was based on the calculation of the differences between the adjacent heights $h_{\max}(\chi_{k-1})$, $h_{\max}(\chi_k)$, and $h_{\max}(\chi_{k+1})$. The value of $\chi_{opt}$ was determined using the following two conditions. First, the difference $h_{\max}(\chi_{opt} - \Delta\chi) - h_{\max}(\chi_{opt})$ should be maximal. Second, the next consecutive increase of $\chi$, that is, the selection of $\chi$ equal to $\chi_{opt} + \Delta\chi$, then $\chi_{opt} + 2\Delta\chi$, etc., should result in an invariable or, at most, slowly decreasing height $h_{\max}(\chi)$.
As mentioned, the method in [3] works well when the boundary between the layer with increased backscattering and clear air is well defined. The situation becomes much more challenging when lidar data are noisy and are obtained in smoke-polluted layers where the dispersion processes create an extended transition zone between the regions of polluted and clear air.
In such situations, the modified data-processing technique, based on the analysis of the cluster locations of the data points, $h_{\max}(\chi)$, and their temporal changes, proves effective.
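The two conditions above can be sketched in Python as follows. This is one possible reading of the criteria, not the implementation of [3]: the function name `select_chi_opt` and the parameters `n_check` (how many subsequent steps are tested) and `max_drop` (what counts as a "slowly decreasing" height, in meters) are hypothetical and would need to be tuned for real data.

```python
import numpy as np

def select_chi_opt(chi, h_max, n_check=2, max_drop=50.0):
    """Pick chi_opt from the dependence h_max(chi).

    Condition 1: the drop h_max(chi_{k-1}) - h_max(chi_k) is the largest.
    Condition 2: over the next n_check steps the height stays constant or
    decreases by no more than max_drop per step (both values are assumed).
    Returns (chi_opt, h_max(chi_opt)), or None if no level qualifies.
    """
    chi = np.asarray(chi, dtype=float)
    h_max = np.asarray(h_max, dtype=float)
    best_k, best_drop = None, 0.0
    for k in range(1, len(chi)):
        drop = h_max[k - 1] - h_max[k]
        following = np.diff(h_max[k:k + n_check + 1])
        stable = np.all((following <= 0.0) & (following >= -max_drop))
        if stable and drop > best_drop:
            best_k, best_drop = k, drop
    if best_k is None:
        return None
    return chi[best_k], h_max[best_k]

# Illustrative values only: far-end noise points near 5000 m, then a layer.
chi = np.arange(7) * 0.05
h_max = [5000.0, 4990.0, 4980.0, 2500.0, 2490.0, 2480.0, 2470.0]
print(select_chi_opt(chi, h_max))
```

When the dependence contains no distinct drop, as in strongly dispersed and noisy cases, the function returns `None`, which corresponds to the situation where the criteria of [3] do not give a reliable height.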

ANALYSIS OF EXPERIMENTAL DATA
To clarify the modified data-processing technique, let us consider Figs. 1 and 2, which show a real square-range-corrected lidar signal and the corresponding dependence of $h_{\max}(\chi)$ on $\chi$, respectively. Here and below we consider data obtained from lidar backscatter signals measured close to the zenith direction.
The square-range-corrected lidar signal shown in Fig. 1 is not extremely noisy, so gradient methods for determining the height of smoke layering could yield acceptable results for heights up to ~3000 m. However, at larger heights it could be difficult to discriminate the useful spikes caused by an actual change in the backscatter signal from the spikes caused by random noise. Using the criteria given in [3] and analyzing the corresponding dependence of $h_{\max}(\chi)$ on $\chi$ in Fig. 2, one can overcome this issue, but only for heights up to ~3000 m. In particular, one can easily distinguish two distinct layers, whose upper heights are located at ~700–800 m and ~2500 m. However, the origin of the data point at $h \approx 3700$ m at $\chi = 0.15$ is unclear.
Moreover, as mentioned, dependencies of $h_{\max}(\chi)$ on $\chi$ that obey the criteria in [3] are generally not obtained when the smoke plume is strongly dispersed and the backscatter signals from that area are too noisy. Such a typical case is shown in Figs. 3 and 4. Unlike Fig. 2, in which one can easily discriminate two separate layering heights, the same criteria do not allow reliable discrimination of smoke-layer heights in Fig. 4. In particular, it is unclear whether points 1 and 2, which correspond to $\chi = 0.4$ and $\chi = 0.45$, originate from noise or from the backscatter signal. Note that the eight points located in the vicinity of the maximum height, $h_{\max} \approx 5000$ m, originate from the increased noise level at the far end of the lidar range. This noise is more intense than the noise in Fig. 2, where the noise data points are obtained within the range of $\chi$ from 0 to 0.15, whereas in Fig. 4 this range extends from 0 to 0.35.
Thus, in the general case, the data points in the dependencies of $h_{\max}(\chi)$ on $\chi$ can originate from three sources: (1) data points that originate from random noise at the far end of the signal; these data points are located at or close to the maximum height, $h_{\max}$; (2) data points that originate from the change in the backscatter signal at the upper boundary of the smoke area; and (3) data points whose source is unclear; they can originate from either the first or the second source.
To solve the problem of discriminating between the backscatter and noise data points, the auxiliary cluster principle can be used. To clarify this principle, let us first consider the dependencies of $h_{\max}(\chi)$ on $\chi$ obtained for three consecutive time periods, at local times 15:30, 15:31, and 15:34 (Fig. 5). In all these curves, the boundary of the increased-backscatter layer within the height interval ~1000–1300 m is clearly seen. As in the previous figures, the far-end data create noise data points located close to the maximum height, $h_{\max} = 5000$ m.
The data point at the height of 2000 m can be seen only at 15:31; therefore, this point also most likely originated from random noise. The origin of the three data points close to ~4000 m is unclear and cannot be clarified without additional analysis of the data points obtained before 15:30 and after 15:34. This principle of separating the sporadic data points from the systematically appearing ones in the two-dimensional dependence of the data points $h_{\max}(\chi)$ on time, and locating clusters of the latter, is the basis of the cluster method.
To illustrate this method, Fig. 6 shows the set of data points $h_{\max}(\chi)$ versus time for the extended period from 15:00 to 16:15 local time. One can discriminate three separate layers in this figure: (1) the lower layer, whose height decreases with time from 1000 m to ~600 m; (2) the layer that during the period 15:32–15:44 is located at heights of ~1200–1400 m and then descends to ~700 m; and (3) the upper layer, which appears at 15:30 at a height of ~2000 m and then rises. After 16:30 and until 16:40 (the termination of our measurements), this layer was reliably detected at a height of ~2500 m. The scattered data points at heights above 3000 m originated from random signal noise.
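The separation of sporadic and systematically appearing data points described above can be sketched in Python. The text does not specify a particular clustering algorithm, so the rule below is only an assumption: a data point is treated as systematic if points at a similar height appear in at least `min_profiles` other profiles recorded close in time. The function name `flag_systematic` and the tolerances `dt`, `dh`, and `min_profiles` are hypothetical, and the heights in the example are illustrative values loosely following the description of Fig. 5, not measured data.

```python
import numpy as np

def flag_systematic(times, heights, dt=4.0, dh=150.0, min_profiles=2):
    """Flag data points h_max(chi) that reappear in neighboring profiles.

    times   : profile time of each data point, in minutes
    heights : height of each data point, in meters
    dt, dh  : assumed time and height tolerances for "the same" layer
    min_profiles : assumed number of other profiles that must contain
                   a point at a similar height
    """
    times = np.asarray(times, dtype=float)
    heights = np.asarray(heights, dtype=float)
    keep = np.zeros(len(times), dtype=bool)
    for i in range(len(times)):
        near = (
            (times != times[i])                        # other profiles only
            & (np.abs(times - times[i]) <= dt)         # close in time
            & (np.abs(heights - heights[i]) <= dh)     # close in height
        )
        # Count distinct neighboring profiles, not individual points.
        keep[i] = len(np.unique(times[near])) >= min_profiles
    return keep

# Minutes after 15:00: profiles at 15:30, 15:31, and 15:34.
times = [30.0, 31.0, 31.0, 34.0]
heights = [1100.0, 1150.0, 2000.0, 1200.0]
print(flag_systematic(times, heights).tolist())
```

In this example the three points near 1100–1200 m support each other and are kept, whereas the single point at 2000 m, seen only at 15:31, is flagged as sporadic. Points at or close to the maximum height recur in every profile and would pass this test, so they would have to be excluded beforehand as far-end noise, as discussed in the text.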

SUMMARY
A modified lidar data-processing methodology is presented that allows more reliable determination of the upper heights of significantly dispersed smoke layering in the presence of increased signal noise.