Application of Turchin ’ s method of statistical regularization

During analysis of experimental data, one usually needs to restore a signal after it has been convoluted with some kind of apparatus function. According to Hadamard’s definition this problem is ill-posed and requires regularization to provide sensible results. In this article we describe an implementation of the Turchin’s method of statistical regularization based on the Bayesian approach to the regularization strategy.


Problem
Consider the usual situation: detector measures a signal (ϕ(x)) and produces observed data ( f (y)) via convolution with its own apparatus function (K(x, y)).The resulting observed data f (y) contains a random noise factor both from initial statistical uncertainty of ϕ(x) and additional noise from measurement procedure.In order to reconstruct initial signal, one needs to solve Fredholm equation of 1-st kind: but this equation is ill-possed: a small error in the measurement of f (y) leads to big instability of ϕ(x).

Solution
Solving such ill-possed problems requires additional operation called regularization.It means we need to introduce additional information to problem well-posed one.The idea of statistical regularization [1,2] is to look on the problem from the point of view of Bayessian statistics approach: unknown statistical value ϕ(x) could be reconstructed using measured value f (y), the model (K(x, y)) and some prior information about ϕ(x) behavior.Main features of statistical regularization: • Based on Bayesian approach and decision theory (choice theory).
• It's able to accommodate various forms of prior information, such as smoothness, constraints on boundary conditions, non-negativity, etc.
While the idea of the regularization was proposed a long time ago, it was not published in the journal with sufficient visibility and therefore left unknown to the broad scientific auditory.The aim of this work is both to reconstruct the initial work (create a modern implementation for described algorithms), and study its performance under different conditions.
Since original work is very hard to find and was published only in Russian language, we will present the short description of the method.

Description of statistical regularization's method 2.1 Strategy
Consider eq. 1 in operator form: According to statistical decision theory ( [3]), one can define a reconstruction strategy φ = Ŝ [ f ], which uses a prior information.Good strategy minimize an impact of prior information (which could be wrong).That could be achieved by introducing the loss-function: For this loss-function: Strategy depend on a prior information P(ϕ): Errors of solution: Let us consider what a prior information can we use?

A prior information
We expect that ϕ is relatively smooth and doesn't contain wild oscillations which, for example, we usually produce in use naive least squares.Common way of measuring function's smoothness is ϕ, Ωϕ , where At the same time we want to minimize information content of prior.Reasonable choice is ensemble of functions with average smoothness ω which maximizes Shannon entropy (or is equivalent to minimize that value of Shannon's information, which we add to solution).Which means we need to maximize:

H[P(ϕ)] = − ln P(ϕ)P(ϕ)dϕ
Subject to constraints: The resulting prior probability density for the Gaussian random process: where α = α(ω) -parameter of smoothness.In result, we get α-depend solution Ŝ α [ f ] and now we have problem of picking α.In rare cases we can select α manually using known empirical smoothness, but we proposal another way.Information about its value almost never available so we could consider it random variable as well.This yields us hierarchical Bayes model P( f, ϕ, α) = P( f |ϕ)P(ϕ, α)P(α).
Since we don't have any information about α we chose hyperprior P(α) to be flat.Now we have several options to pursue.In general case, we can use fully Bayesian approach and compute solution to marginalize by α and to use eq.3: or, to marginalize by ϕ and to integrate by P(α| f ): using MCMC for sampling ϕ and α (for Gaussian errors it could be reduced to numerical integral).In particular case we can simplify calculations by using following approximations.Also if distribution P(α| f ) is narrow, we can pick α corresponding to most probable α 0 and use P(ϕ|α 0 ) as prior.This is called empirical Bayes.For our problem it produces almost identical results.Derivation above assumed that Ω if positive definite.It possible to accommodate singular Ω.Then det in (5) should be replaced with pseudodeterminant.Positive definiteness is desirable since regularization is guaranteed to work only for non-singular Ω and singular or even near-singular Ω may and likely will cause numerical problems.
Also note that Ω we use is singular since ϕ = kx + b will have smoothness equal to zero.One way fix it is to fix value of ϕ to 0 and boundaries or to put priors on values on boundaries [4].In that case αΩ in (5) should be replaced with i α i Ω i where different Ω i corresponds to smoothness, boundary values etc.Also it's possible to add any other prior information about function.For example it possible to add non-negativity:

Discretization
In order to find numerical solution, one need to go from continuous functions in eq. 2 to discrete matrix form: where T n (x) -some basis in function space.For example, cubic spline, Fourier series and Legendre polynomials.

Solution for Gaussian distributed noise
The most common experimental case of noise distribution is normal (or Gaussian) distribution: Using the most probable α, one can get the best solution in a simple form.
For definition α, we use a posterior probability, which is calculated using Bayes' rule: where norming factor C doesn't depend on α.If we want to get more accurate solution, we should average solution over this posterior probability as it is describe in integrals 7.
3 Application The basic concepts of the described method (alongside some improvements) were implemented in a new prototype program written in the Python language.Then the program was used to reconstruct the differential cross-sections of electron scattering on hydrogen isotopes measured in Troitsk nu-mass experiment.In this experiment gaseous hydrogen (as well as deuterium and tritium) was irradiated by electrons with energies of about 20 keV from electron gun.Troitsk nu-mass spectrometer on the other side of gas volume registers the integral electron spectrum (like at fig 1a) with relative resolution of about 10 −5 .In previous works ( [5] and [6]) the differential cross-section of scattering electrons on H2 molecules was reconstructed using fit procedure against 5 or 6 parameters shape.That analysis had three major flaws: • The parametric shape for fit does not have sufficient physics basis and therefore result is subjective in respect to the parametrization.
• The chosen parametrization has too many parameters and produces very tight correlations between these parameters, making the fit results instable.
• There is not clean way to determine uncertainties for any given point of reconstructed function.
Turchin's regularization allows to avoid all these problems and produces clean model-independent result with uncertainties.The result of regularization and previously published fit are presented at Fig. 1b.The regularization reconstruction is in good agreement with fitted curve.Also there are two very small peaks picture shows two additional peaks at 25 and 30 eV.The peak on 25 eV is known to be produced by double scattering events.Peak on 30 eV is previously unknown.It could be either statistical anomaly (errors are quite large in this point), or possibly point to existence of additional dissociative scattering.

Conclusion
Turchin's statistical regularization is a very powerful tool to solve ill-posed inversed problems.It provides a flexible way to introduce almost any kind of prior knowledge into reconstruction problem.Also it opens a new way to solve some problems which previously were solved only by direct approach.In this work, we created an implementation of Turchin's regularization using modern language and successfully applied it to the Troitsk nu-mass data for electron scattering on hydrogen isotopes.The work was preceded by a lot of testing of the algorithm with different artificial functions and discretization options.The results of those tests are beyond the scope of this article.
This work is supported by the Ministry of Education and Science of the Russian Federation under the contract No. 3.3008.2017/PP.