Scale-space theory is a framework for multi-scale signal representation developed by the computer vision, image processing and signal processing communities, with complementary motivations from physics and biological vision. It is a formal theory for handling image structures at different scales by representing an image as a one-parameter family of smoothed images, the scale-space representation, parametrized by the size of the smoothing kernel used for suppressing fine-scale structures.

The uniqueness claimed in the arguments based on scale invariance has been criticized, and alternative self-similar scale-space kernels have been proposed. The Gaussian kernel is, however, a unique choice according to the scale-space axiomatics based on causality.

Automatic scale selection can be based on local maxima (or minima) over scales of scale-normalized derivatives

:<math>L_{\xi^m \eta^n}(x, y; t) = t^{(m+n) \gamma/2} L_{x^m y^n}(x, y; t)</math>

where <math>\gamma \in [0,1]</math> is a parameter related to the dimensionality of the image feature. This algebraic expression for scale-normalized Gaussian derivative operators originates from the introduction of <math>\gamma</math>-normalized derivatives according to

:<math>\partial_{\xi} = t^{\gamma/2} \partial_x\quad</math> and <math>\quad\partial_{\eta} = t^{\gamma/2} \partial_y.</math>

It can be theoretically shown that a scale selection module working according to this principle will satisfy the following scale covariance property: if for a certain type of image feature a local maximum is assumed in a certain image at a certain scale <math>t_0</math>, then under a rescaling of the image by a scale factor <math>s</math> the local maximum over scales in the rescaled image will be transformed to the scale level <math>s^2 t_0</math>.
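The scale covariance property above can be checked numerically on a toy case. The following is a minimal sketch, not a reference implementation: it assumes a one-dimensional Gaussian blob as the input signal, a sampled and renormalized Gaussian as the smoothing kernel, a central difference as the second-derivative approximation, and <math>\gamma = 1</math>. All function names are hypothetical. For this particular signal the scale-normalized second derivative at the blob centre has its extremum at <math>t = 2 t_0</math> analytically, so the selected scale should land near that value on the discrete scale grid, and near <math>s^2</math> times that value for the rescaled blob.

```python
import math

def gaussian_kernel(t, radius_factor=5.0):
    """Sampled, renormalized Gaussian with variance t on integer offsets."""
    radius = int(math.ceil(radius_factor * math.sqrt(t)))
    weights = [math.exp(-k * k / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return radius, [w / total for w in weights]

def smoothed_value(f, x, t):
    """L(x; t): the signal f smoothed at scale t, evaluated at one point x."""
    radius, weights = gaussian_kernel(t)
    offsets = range(-radius, radius + 1)
    return sum(w * f(x - k) for k, w in zip(offsets, weights))

def normalized_second_derivative(f, x, t, gamma=1.0):
    """t^gamma * L_xx(x; t), with L_xx taken as a central difference."""
    l_xx = (smoothed_value(f, x - 1, t)
            - 2.0 * smoothed_value(f, x, t)
            + smoothed_value(f, x + 1, t))
    return t ** gamma * l_xx

def select_scale(f, x, scales, gamma=1.0):
    """Scale at which the magnitude of the normalized response is largest."""
    return max(scales,
               key=lambda t: abs(normalized_second_derivative(f, x, t, gamma)))

def blob(t0):
    """Unnormalized 1-D Gaussian blob with variance t0, centred at x = 0."""
    return lambda x: math.exp(-x * x / (2.0 * t0))

# Geometric grid of scale levels from t = 4 to t = 512 (an assumed choice).
scales = [2.0 ** (k / 8.0) for k in range(16, 73)]

t0 = 16.0   # variance of the original blob
s = 2.0     # spatial rescaling factor; the rescaled blob has variance s^2 * t0

t_original = select_scale(blob(t0), 0, scales)
t_rescaled = select_scale(blob(s * s * t0), 0, scales)

print("selected scale, original blob:", t_original)
print("selected scale, rescaled blob:", t_rescaled)
print("ratio of selected scales:", t_rescaled / t_original)
```

The ratio of the two selected scales is the quantity of interest. It is only resolved up to the spacing of the scale grid, so a finer grid gives a closer match to <math>s^2</math>.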

Extensions of linear scale-space theory concern the formulation of non-linear scale-space concepts more committed to specific purposes. In addition to variabilities over scale, which original scale-space theory was designed to handle, this generalized scale-space theory

Regarding biological hearing, there are receptive field profiles in the inferior colliculus and the primary auditory cortex that can be well modelled by spectro-temporal receptive fields, which in turn can be well modelled by Gaussian derivatives over logarithmic frequencies and windowed Fourier transforms over time, with the window functions being temporal scale-space kernels.

Deep learning and scale space

In the area of classical computer vision, scale-space theory has established itself as a theoretical framework for early vision, with Gaussian derivatives constituting a canonical model for the first layer of receptive fields. With the introduction of deep learning, there has also been work on using Gaussian derivatives or Gaussian kernels as a general basis for receptive fields in deep networks. Using the transformation properties of the Gaussian derivatives and Gaussian kernels under scaling transformations, it is in this way possible to obtain scale covariance/equivariance and scale invariance of the deep network, so that image structures at different scales are handled in a theoretically well-founded manner. Specifically, using the notions of scale covariance/equivariance and scale invariance, it is possible to make deep networks operate robustly at scales not spanned by the training data, thus enabling scale generalization.

The time-causal limit kernel possesses similar properties in a time-causal situation (non-creation of new structures towards increasing scale and temporal scale covariance) as the Gaussian kernel obeys in the non-causal case. It corresponds to convolution with an infinite number of truncated exponential kernels coupled in cascade, with specifically chosen time constants to obtain temporal scale covariance. For discrete data, this kernel can often be numerically well approximated by a small set of first-order recursive filters coupled in cascade.
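A cascade of first-order recursive filters of the kind mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation. It assumes that the temporal scale levels are distributed geometrically as <math>\tau_k = c^{2(k-K)} \tau_{max}</math>, and that each stage is the recursive update <math>f_{out}(n) = f_{out}(n-1) + (f_{in}(n) - f_{out}(n-1))/(1 + \mu_k)</math>. The impulse response of one stage is a geometric sequence with variance <math>\mu_k^2 + \mu_k</math>, so each time constant is solved from the scale increment it has to contribute. The function names and the default values <math>c = 2</math> and <math>K = 4</math> are hypothetical.

```python
import math

def time_constants(tau_max, c=2.0, levels=4):
    """Time constants mu_k for a cascade whose total variance is tau_max.

    Assumes geometrically distributed scale levels
    tau_k = c^(2(k - levels)) * tau_max for k = 1..levels.
    """
    taus = [tau_max * c ** (2 * (k - levels)) for k in range(1, levels + 1)]
    increments = [taus[0]] + [b - a for a, b in zip(taus, taus[1:])]
    # One stage adds variance mu^2 + mu; solve mu^2 + mu = increment.
    return [(math.sqrt(1.0 + 4.0 * d) - 1.0) / 2.0 for d in increments]

def recursive_filter(signal, mu):
    """First-order recursive filter; uses only present and past samples."""
    out, previous = [], 0.0
    for value in signal:
        previous = previous + (value - previous) / (1.0 + mu)
        out.append(previous)
    return out

def cascade(signal, mus):
    """Apply the first-order recursive filters one after another."""
    for mu in mus:
        signal = recursive_filter(signal, mu)
    return signal

# Impulse response of a four-stage cascade with total temporal variance 16.
mus = time_constants(16.0)
impulse = [1.0] + [0.0] * 399
response = cascade(impulse, mus)

total = sum(response)
mean = sum(n * v for n, v in enumerate(response)) / total
variance = sum((n - mean) ** 2 * v for n, v in enumerate(response)) / total

print("time constants:", mus)
print("sum of impulse response:", total)
print("temporal mean (delay):", mean)
print("temporal variance:", variance)
```

The impulse response is non-negative, sums to one, and has a temporal variance equal to the requested <math>\tau_{max}</math>. The temporal mean shows the delay that a time-causal filter of this kind necessarily introduces.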

Implementation issues

When implementing scale-space smoothing in practice, several approaches can be taken: continuous or discrete Gaussian smoothing, implementation in the Fourier domain, pyramids based on binomial filters that approximate the Gaussian, or recursive filters. More details are given in a separate article on scale space implementation.
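As an illustration of the binomial-filter approach, the sketch below builds a smoothing kernel by repeated convolution with the three-tap binomial filter (1/4, 1/2, 1/4). Each pass adds a variance of 1/2, so <math>n</math> passes give a kernel with variance <math>n/2</math> that approaches the shape of a Gaussian as <math>n</math> grows. The helper names are hypothetical, and a practical implementation would normally filter the signal pass by pass instead of forming the combined kernel explicitly.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def binomial_kernel(passes):
    """Kernel equivalent to `passes` applications of (1/4, 1/2, 1/4)."""
    kernel = [1.0]
    for _ in range(passes):
        kernel = convolve(kernel, [0.25, 0.5, 0.25])
    return kernel

def kernel_variance(kernel):
    """Variance of a symmetric, normalized kernel about its centre tap."""
    centre = (len(kernel) - 1) / 2.0
    return sum(w * (i - centre) ** 2 for i, w in enumerate(kernel))

kernel = binomial_kernel(8)   # eight passes, variance 8 / 2 = 4
print("number of taps:", len(kernel))
print("sum of weights:", sum(kernel))
print("variance:", kernel_variance(kernel))
```

Since the filter coefficients are powers of 1/2, the computation is exact in floating point for moderate numbers of passes, which makes the variance easy to verify.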

See also

  • Difference of Gaussians
  • Gaussian function
  • Mipmapping

Further reading

  • Lindeberg, Tony: Scale-space: A framework for handling image structures at multiple scales, Proc. CERN School of Computing, 96(8): 27-38, 1996.
  • Romeny, Bart ter Haar: Introduction to Scale-Space Theory: Multiscale Geometric Image Analysis, Tutorial VBC '96, Hamburg, Germany, Fourth International Conference on Visualization in Biomedical Computing.
  • Lindeberg, Tony: "Scale-space theory" In: Encyclopedia of Mathematics, (Michiel Hazewinkel, ed) Kluwer, 1997.
  • Web archive backup: Lecture on scale-space at the University of Massachusetts (pdf)
  • Powers of ten interactive Java tutorial at Molecular Expressions website
  • pyscsp: Scale-Space Toolbox for Python at GitHub and PyPI
  • pytempscsp: Temporal Scale-Space Toolbox for Python at GitHub and PyPI
  • Peak detection in 1D data using a scale-space approach BSD-licensed MATLAB code