Solomonoff's theory of inductive inference

Solomonoff's theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data and that the environment being observed is generated by an unknown algorithm. This is also called a theory of induction. Due to its basis in the dynamical (state-space model) character of Algorithmic Information Theory, it encompasses statistical as well as dynamical information criteria for model selection. It was introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory.

Solomonoff proved that this induction is incomputable (or more precisely, lower semi-computable), but noted that "this incomputability is of a very benign kind", and that it "in no way inhibits its use for practical prediction" (as it can be approximated from below more accurately with more computational resources). by assigning larger prior credences to theories that require a shorter algorithmic description.

Origin

Philosophical

The theory is based in philosophical foundations, and was founded by Ray Solomonoff around 1960. It is a mathematically formalized combination of Occam's razor All computable theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Marcus Hutter's universal artificial intelligence builds upon this to calculate the expected value of an action.

Principle

Solomonoff's induction has been argued to be the computational formalization of pure Bayesianism.

Another direction of inductive inference is based on E. Mark Gold's model of learning in the limit from 1967 and has developed since then more and more models of learning. The general scenario is the following: Given a class S of computable functions, is there a learner (that is, recursive functional) which for any input of the form (f(0),f(1),...,f(n)) outputs a hypothesis (an index e with respect to a previously agreed on acceptable numbering of all computable functions; the indexed function may be required consistent with the given values of f). A learner M learns a function f if almost all its hypotheses are the same index e, which generates the function f; M learns S if M learns every f in S. Basic results are that all recursively enumerable classes of functions are learnable while the class REC of all computable functions is not learnable.

Many related models have been considered and also the learning of classes of recursively enumerable sets from positive data is a topic studied from Gold's pioneering paper in 1967 onwards. A far reaching extension of the Gold’s approach is developed by Schmidhuber's theory of generalized Kolmogorov complexities, which are kinds of super-recursive algorithms.

References

Sources

Burgin, M. (2005), Super-recursive Algorithms, Monographs in computer science, Springer.
Burgin, M., "How We Know What Technology Can Do", Communications of the ACM, v. 44, No. 11, 2001, pp. 82–88.
Burgin, M.; Eberbach, E., "Universality for Turing Machines, Inductive Turing Machines and Evolutionary Algorithms", Fundamenta Informaticae, v. 91, No. 1, 2009, 53–77.
Burgin, M.; Eberbach, E., "On Foundations of Evolutionary Computation: An Evolutionary Automata Approach", in Handbook of Research on Artificial Immune Systems and Natural Computing: Applying Complex Adaptive Technologies (Hongwei Mo, Ed.), IGI Global, Hershey, Pennsylvania, 2009, 342–360.
Burgin, M.; Eberbach, E., "Evolutionary Automata: Expressiveness and Convergence of Evolutionary Computation", Computer Journal, v. 55, No. 9, 2012, pp. 1023–1029.
Burgin, M.; Klinger, A. Experience, Generations, and Limits in Machine Learning, Theoretical Computer Science, v. 317, No. 1/3, 2004, pp. 71–91
Davis, Martin (2006) "The Church–Turing Thesis: Consensus and opposition]". Proceedings, Computability in Europe 2006. Lecture Notes in Computer Science, 3988 pp. 125–132.
Gasarch, W.; Smith, C. H. (1997) "A survey of inductive inference with an emphasis on queries". Complexity, logic, and recursion theory, Lecture Notes in Pure and Appl. Math., 187, Dekker, New York, pp. 225–260.
Hay, Nick. "Universal Semimeasures: An Introduction," CDMTCS Research Report Series, University of Auckland, Feb. 2007.
Jain, Sanjay; Osherson, Daniel; Royer, James; Sharma, Arun, Systems that Learn: An Introduction to Learning Theory (second edition), MIT Press, 1999.
.
Li Ming; Vitanyi, Paul, An Introduction to Kolmogorov Complexity and Its Applications, 2nd Edition, Springer Verlag, 1997.
Osherson, Daniel; Stob, Michael; Weinstein, Scott, Systems That Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists, MIT Press, 1986.

External links

Algorithmic probability – Scholarpedia