Image reconstruction algorithms for the microwave holographic vision system with reliable gap detection at theoretical limits

We present a reliable image reconstruction algorithm suitable for a microwave holographic vision system with several sensors coupled to a spin-diode based microwave detector and a single emission source. The objective is to detect, by reconstructing the spatial microwave scattering density of the scene, the presence and the nature of road obstacles impeding driving in the near-vehicle zone. The idea of holographic visualization is to reconstruct the spatial microwave scattering density of an object by detecting the amplitude and phase of the reflected signal with a lattice of sensors. We discuss versions of the algorithm and determine and analyse its resolution limits at various distances and with different numbers of sensors for a one-dimensional test problem: detecting two walls (or posts) separated by a gap at a fixed distance. The maximal interval between sensors needed for a reliable reconstruction approximately equals the Fresnel zone width. We show that the maximal resolution achieved by our algorithm with an appropriate number of sensors was about 40% of the Fresnel zone width for wall detection and about 30% of the zone width for gap detection.


Introduction
The growing interest in automated and connected vehicle technology has prompted the need for efficient traffic and obstacle detection techniques in a large variety of settings. For inner-city mobility, as well as for vehicle parking, the critical issues are control of obstacles and object movement in the near-vehicle zone, about 0.3 to 4.5 meters from the detection point. That would allow for speed control and adjustment in dense environments, and for vehicle-to-vehicle and pedestrian automatic emergency braking (with reaction time < 0.1 s) at speeds below 30 km/h, mitigating crashes in most inner-city environments. A promising technology direction in this area is the development of microwave-based detection techniques. Microwaves can penetrate various media, such as stone and water, so the technique works in all weather conditions and can sense distant or inaccessible objects.
Recently, an alternative approach to close-distance imaging was suggested [1], based on digital microwave holography and using high-sensitivity spin diodes [2] as wave sensors. It could provide compact and cost-effective devices for dynamic detection of surrounding objects in different driving scenarios and weather conditions. It recovers the full 3D density of the objects, not just the surface; different densities indicate different materials: stone, flesh, metal, wood, etc.
At short distances this technology combines the advantages of radar (operation in all weather conditions, low cost, the ability to see behind obstacles) and of LIDAR (3D scene reconstruction), adding an extra capacity for object type detection based on accurate determination of the object material. In this article, we discuss the potential of this technique for accurate obstacle recognition in the near-vehicle zone from the image reconstruction and algorithmic point of view. We show that it has the capacity to serve as a component in an integrated realistic automotive object recognition system.

Image resolution and object detection
Digital holography techniques were initially proposed in the 1960s [3,4]. They typically use an array of sensors comparable in size to the object studied and a wide parallel imaging beam. In our application, however, there is no practical way to provide a parallel imaging beam comparable in size to the imaging field. We need to rely on just one or a few compact radiation sources, each emitting within a certain angle. That has serious consequences for the achievable spatial resolution. Indeed, looking at the Fresnel zone size at the object edges, we would not normally expect spatial resolution much better than half the zone width. It remains unclear, though, whether even that resolution is reliably achievable given the limited number of sensors available for wave detection. By the Kotelnikov-Whittaker-Nyquist-Shannon sampling theorem, the maximal interval between sensors needed for a reliable reconstruction approximately equals the Fresnel zone width. The same consideration implies that more frequent sampling would not actually increase the amount of useful information available. With that in mind, we would be looking at positioning no more than a few dozen sensors at the vehicle's front. We will need effective parameter regularization techniques to reduce the dimensionality of the problem and still obtain a meaningful object detection outcome.
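The sampling-theorem argument above can be sketched numerically. A minimal illustration, using the first Fresnel zone width r2 = sqrt(2D+1) (in wavelength units, as defined in the resolution analysis of this article); the particular distance and aperture values are only examples:

```python
import math

def fresnel_zone_width(distance_wavelengths: float) -> float:
    """First Fresnel zone width r2 = sqrt(2*D + 1), in wavelength
    units, at object distance D (also in wavelengths)."""
    return math.sqrt(2.0 * distance_wavelengths + 1.0)

# Maximal useful sensor interval is about one Fresnel zone width.
D = 50.0                        # example object distance, wavelengths
max_spacing = fresnel_zone_width(D)   # about 10 wavelengths here

# For the 64-wavelength aperture used later in the article, this
# suggests a minimal sensor count of:
n_sensors = math.ceil(64.0 / max_spacing)
```

Denser sampling than this would, per the argument above, add no new information; sparser sampling risks an unreliable reconstruction.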

The density reconstruction setting
We consider a task with one monochromatic radiation emitter and several sensors. Given are the vector P of K sensor signals (both phase and amplitude) and the positions of both the emitter and the sensors. Fig. 1 gives an overall scheme of the corresponding system. The article [1] gave the following expression for the scattered signal P in terms of the function f(r) expressing the object density at a point r within the imaging area and the emission intensity g(r) in the direction of r:

P(p) = ∫ g(r) f(r) exp(iωR/c) dr.

Here p is the position of a sensor, ω is the signal frequency, c is the speed of light, and R is the length of the two-segment broken line from the position of the emitter to the point r, and then to p. That expression, however, takes no account of the signal absorption within the object. Considering the absorption intensity to be equal to the scattering density, the total absorption coefficient is given by exp(-∮ f(x)dx), where the integral in the exponent is taken along the same two-segment broken line of length R from the emitter to r and then to p. We obtain a non-linear integral equation for the density function f(r):

P(p) = ∫ g(r) f(r) exp(iωR/c) exp(-∮ f(x)dx) dr.

What makes this equation non-linear is the absorption term. However, given a density function, we may compute the term, and fixing it makes the equation linear. The linear equation can then be readily solved, and its solution provides a correction to the previously given density. We may use that solution in turn to obtain a newer correction, and so on.
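The fixed-point iteration just described can be sketched on a toy discretized model: freeze the absorption term at the current density estimate, solve the resulting linear system, and repeat. The matrix dimensions, random stand-in geometry, and the simple cumulative-attenuation model of the absorption factor below are all illustrative assumptions, not the article's actual forward model:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 8, 16                           # sensors, density pixels
A0 = rng.uniform(0.5, 1.5, size=(K, N))  # stand-in geometry matrix
f_true = rng.uniform(0.0, 0.1, size=N)   # hidden test density

def absorption(f):
    # crude stand-in for exp(-∮ f dx): cumulative attenuation per pixel
    return np.exp(-np.cumsum(f))

def forward_matrix(f):
    # the matrix of the linearized equation, with absorption frozen at f
    return A0 * absorption(f)[None, :]

P = forward_matrix(f_true) @ f_true    # simulated sensor signals

f = np.zeros(N)                        # start with an empty scene
for _ in range(20):
    A = forward_matrix(f)              # freeze the absorption term
    f_new, *_ = np.linalg.lstsq(A, P, rcond=None)  # solve linear step
    if np.linalg.norm(f_new - f) < 1e-9:
        break                          # fixed point reached
    f = f_new
```

Because the absorption factor depends only weakly on the density, the iteration settles after a handful of linear solves in this toy setting.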
The iterative matrix inversion task could still be too costly to perform in real time. The solution, however, may lie in ignoring the absorption term altogether. Indeed, ignoring the term, we would still get an accurate scattering picture of the front parts of the obstacles, the most critical ones to detect, with the image of the distant parts slightly distorted by scattering absorption. If we ignore the term, the task of inverting the matrix is still there, but the matrix no longer depends on the density function f(r), so its inverse can simply be precomputed. Moreover, all one needs to obtain the result is a basis of density functions corresponding to each individual sensor: the result is simply the sum of these functions multiplied by each sensor's signal, i.e. a map from the signal space to the density space. That gives us an easy and effective algorithm for object density reconstruction, suitable for practical purposes.
To obtain a linear resolution procedure, one can represent the regularization preferences by a quadratic weight functional. The solution procedure (Tikhonov regularization [5]) is to find a solution with minimal weight, or, more generally, a solution that minimizes the sum of a penalty for deviation from the exact solution and a regularization weight penalty. The weight is defined by a matrix Г and is given by ‖Гf‖², where f is a proposed solution, ‖·‖ is the standard (Euclidean or Hermitian) norm, and Г is a matrix mapping the vector f into a norm space. The total penalty then takes the form ‖Af-P‖² + ‖Гf‖², where P is the vector of sensor signals and A is the matrix of the linearized integral equation. The regularized solution is given by

f = (AᵀA + ГᵀГ)⁻¹AᵀP,

where Aᵀ and Гᵀ are the Hermitian conjugates of A and Г. So the result is a multiplication of the vector P by a rectangular matrix, which maps the K-dimensional vector P into the N-dimensional vector f, or, in other words, a collection of K basis functions fᵢ, each corresponding to its own sensor. Note that this matrix depends on the sensor configuration only, so it may be precomputed. To express a preference for solution smoothness (in the Fourier basis), one defines a diagonal matrix Г with elements increasing with the harmonic number. We found in numerical experiments that choosing the weight proportional to the harmonic number gives about the best performance.
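A minimal numerical sketch of this regularized reconstruction, assuming the solution is represented in the Fourier basis with the weight proportional to the harmonic number. The matrix A here is a random stand-in for the linearized integral-equation matrix, which in practice follows from the actual emitter/sensor geometry:

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 8, 64                      # sensors, solution coefficients
A = rng.normal(size=(K, N)) + 1j * rng.normal(size=(K, N))

# Diagonal regularization matrix: weight proportional to harmonic number
Gamma = np.diag(np.arange(1, N + 1, dtype=float))

# Precompute the reconstruction map B = (A^H A + Г^H Г)^{-1} A^H;
# it depends on the sensor configuration only.
AH = A.conj().T                   # Hermitian conjugate of A
B = np.linalg.solve(AH @ A + Gamma.T @ Gamma, AH)   # N x K

# Each column of B is the basis density function of one sensor, and
# per-frame reconstruction is a single matrix-vector product.
P = rng.normal(size=K) + 1j * rng.normal(size=K)    # sensor signals
f = B @ P
```

Since Г is positive definite, the normal-equations matrix is always invertible, and the per-frame cost is just the O(N·K) product B @ P.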

The 1D object shape extraction
We performed numerical experiments corresponding to the task of imaging and detecting a 1D object structure (a slot: two walls separated by a gap) at a given distance. For that task, we used a particular object detection algorithm. Assume that the object density can only take the two values 0 and 1 (all objects are of the same density, i.e. of the same material), so we only need to distinguish between the two cases. The way to do it is to set a density threshold and assume zero density for all reconstructed density values below the threshold and density one for all values above it. We take a point lattice with a period equal to the wavelength, take all the reconstructed values at these points, and put the threshold in the middle of the largest interval between the values (at the point of "lowest value density", corresponding to the point between the high and low density values).
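The threshold-selection rule just described can be sketched directly: sort the sampled density values and place the threshold at the midpoint of the largest gap between consecutive values. The sample values below are illustrative:

```python
import numpy as np

def density_threshold(values):
    """Threshold at the middle of the largest interval between the
    sorted reconstructed density values."""
    v = np.sort(np.asarray(values, dtype=float))
    gaps = np.diff(v)                 # intervals between sorted values
    i = int(np.argmax(gaps))          # index of the largest interval
    return 0.5 * (v[i] + v[i + 1])    # its midpoint

# Example: a low-density cluster near 0 and a high-density one near 1
samples = [0.05, 0.12, 0.08, 0.91, 0.97, 1.02, 0.15, 0.88]
t = density_threshold(samples)
binary = [1 if x >= t else 0 for x in samples]   # reconstructed shape
```

Values below the threshold are assigned density 0 and values above it density 1, yielding the binary object shape.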
Applying the chosen threshold to the reconstructed density, we then get back a reconstructed object shape. A series of experiments detecting the 1D slots (two walls divided by a gap) at a given distance was performed to determine the spatial resolution limits of the algorithm described above. Spatial resolution refers to the ability of the imaging to differentiate two objects; in our case, the objects are the slot walls and their closeness is the width of the gap. We determine the gap resolution limits depending on the object distance and the configuration of walls, for several configurations of sensors. The imaging field size was fixed at 64 wavelengths (for the realistic case of spin diodes set to a 10 GHz frequency, that would be about 2 m), and the size of the sensor lattice was fixed to be the same (roughly corresponding to the width of a car). We assumed the emitter to be monochromatic, wide-beam, and positioned at the centre of the sensor field, with the sensors positioned on a lattice with approximately equal spacing (perturbed by small random noise). The configurations tested had 4, 8, 16, 32 and 64 sensors.
The object distances varied from 10 to 500 wavelengths (in physical units, from 0.3 to 15 meters). Two types of slots, positioned at the centre of the imaging field, were tested: one with walls of equal width, another with one wall twice as wide as the other. We did not add any noise to the modelled sensor signals, assuming them exact. Fig. 2 illustrates how the algorithm worked in a particular case (32 sensors, object distance of 50 wavelengths): a reconstructed 1D density for a given slot shape, two unequal walls separated by a gap. The X-axis is the object coordinate measured in wavelength units (x=D/λ). The Y-axis is the object density (the original object density, equal to 0 or 1, is in red; the reconstructed density is in blue). The thin horizontal line shows the density threshold obtained.

Fig. 2. The object shape obtained by applying the density threshold.

Resolution limits obtained
We summarize the results of the detection experiments in terms of the minimal object dimensions required for reliable slot detection. It is of interest to quantify the resolution quality by introducing a quality parameter, the ratio κ = L/r₂, where L is the given minimal critical dimension (i.e. the minimal gap or wall size) and r₂ = √(2D+1) is the first Fresnel zone width (from the zone centre to the second ring boundary of the zone plate) at the distance D. One may note that the maximal possible resolution at the first Fresnel zone, ∆l = 1.22∆r₂, equals about 0.36r₂; hence 0.36 is the best quality parameter value we may obtain. The graph of the resolution ratio values obtained for unequal walls shows that for 16 or more sensors the resolution closely approaches the theoretical limit. Given below are the tables of the minimal wall and gap widths available for reliable recognition, by object distance and number of sensors (for both the unequal-walls and equal-walls configurations). For the equal-walls slots, detection is quite reliable (the detection failure rate is about 1% to 3%) at distances of up to 15 meters. The failure rate is measured as the percentage of configurations for which gap detection fails, among all tested configurations within the obtained minimal wall and gap dimension limits. For slots with unequal walls, detection is quite reliable (a low single-digit failure percentage) at distances up to 5 m, which is enough for near-vehicle detection. One may also note an approximately six-fold (for the wall) and 10-12-fold (for the gap) gain in detection distance compared to the worst-case scenario of resolution equal to the first Fresnel zone width. For example, a gap of 18 cm could be detected at a distance of 6 m instead of the worst-case 52 cm (which would render the method unusable).
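The quality parameter defined above is straightforward to compute; a minimal sketch, with the dimension and distance values chosen only as an illustration:

```python
import math

def quality(L, D):
    """Resolution quality kappa = L / r2, where L is the minimal
    critical dimension and r2 = sqrt(2*D + 1) is the first Fresnel
    zone width at distance D (both in wavelength units)."""
    return L / math.sqrt(2.0 * D + 1.0)

# e.g. a gap of 4 wavelengths resolved at D = 50 wavelengths gives
# kappa close to the 0.36 theoretical best:
kappa = quality(4.0, 50.0)
```

Smaller κ means better resolution relative to the Fresnel zone width; values near 0.36 indicate performance at the theoretical limit.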

Discussion
To reach level 4 of truly autonomous vehicles, the SAE International standard J3016 (Levels of Driving Automation) puts emphasis on automotive vision systems. Self-driving vehicles currently developed by Alphabet, Uber, or Toyota rely heavily on LIDAR. LIDAR can scan more than 100 meters in all directions around the car, generating a precise 3D map of the car's surroundings. The main issues with LIDAR in the automotive industry are that it generates a large amount of data and is still too expensive for mass-market implementation. Much cheaper solutions, in terms of both price and computation, are those based on radars and cameras. The cheapest and most available sensors are cameras, which are the best for classification and texture interpretation. Cameras, however, produce massive amounts of data, making processing a computationally intense and algorithmically complex job. Radars are computationally much lighter than cameras and LIDAR; they work in all weather conditions and can even use reflection to see behind obstacles, though they are much less accurate than LIDAR. Radar is a proven technology that is becoming increasingly efficient for the autonomous car, and increasingly complementary to cameras as a "dynamic duo" able to "cross-validate" the reconstructed scenario. The microwave holographic technology, with its ability to detect object material and volume (e.g. helping to tell a bicycle apart from a motorcycle), could become a useful additional component of this emerging synergy of collaborative, cross-validating automotive obstacle detection and classification. A wide variety of signal processing approaches is possible with the given algorithm, defined by the choice of functional space and regularization metric, allowing for adaptive treatment of sensor data based on the output of the other parts of the collaborative object recognition system.

Conclusions
We present, by combining well-known building blocks (generalized Fourier-basis deconvolution, a type of Tikhonov regularization in Hermitian space, and a threshold-based object detection technique), a version of a microwave holographic imaging and obstacle detection algorithm for an environment with a single coherent emitter and a few detection sensors. The version is computationally efficient, requiring just O(N·K) operations per image frame, where K is the number of detected signals (sensors) and N is the number of image elements (pixels). The detection distances and spatial resolution achieved (better than 20 cm at distances up to 4.5 meters) are sufficient for near-vehicle object detection purposes. An important direction for future research is distance measurement precision with object detection algorithms (the experiments presented in this article concentrated on a width measurement task only). Numerical experiments on 2D and 3D shape detection, systematic treatment of signal noise effects, and testing with actual physical objects are all subjects for future work.
We thank the Russian Science Foundation, project number 16-19-00181, for supporting this work.