The raven paradox, also known as Hempel's paradox, Hempel's ravens or, rarely, the paradox of indoor ornithology, is a paradox arising from the question of what constitutes evidence for the truth of a statement. Observing objects that are neither black nor ravens may formally increase the likelihood that all ravens are black even though, intuitively, these observations are unrelated.

This problem was proposed by the logician Carl Gustav Hempel in the 1940s to illustrate a contradiction between inductive logic and intuition.

Paradox

Hempel describes the paradox in terms of the hypothesis:

: (1) All ravens are black. In the form of an implication, this can be expressed as: If something is a raven, then it is black.

Via contraposition, this statement is equivalent to:

: (2) If something is not black, then it is not a raven.

In all circumstances where (2) is true, (1) is also true; likewise, in all circumstances where (2) is false (that is, in any imagined world containing something that is a raven yet is not black), (1) is also false.
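The equivalence of (1) and (2) can be checked mechanically. The following is a minimal sketch that treats "is a raven" and "is black" as plain booleans for a single object and compares the two material conditionals over all four cases. The function names are illustrative and do not come from Hempel.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def statement_1(is_raven: bool, is_black: bool) -> bool:
    # (1) If something is a raven, then it is black.
    return implies(is_raven, is_black)


def statement_2(is_raven: bool, is_black: bool) -> bool:
    # (2) If something is not black, then it is not a raven.
    return implies(not is_black, not is_raven)


# One row per combination of (is_raven, is_black).
rows = [
    (is_raven, is_black, statement_1(is_raven, is_black), statement_2(is_raven, is_black))
    for is_raven, is_black in product([True, False], repeat=2)
]
equivalent = all(s1 == s2 for _, _, s1, s2 in rows)

for row in rows:
    print(row)
print("equivalent:", equivalent)
```

The only row in which either statement is false is the non-black raven, which is the single case that would refute both.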

Given a general statement such as all ravens are black, a form of the same statement that refers to a specific observable instance of the general class would typically be considered to constitute evidence for that general statement. For example,

: (3) My pet raven is black.

is evidence supporting the hypothesis that all ravens are black.

The paradox arises when this same process is applied to statement (2). On sighting a green apple, one can observe:

: (4) This green apple is not black, and it is not a raven.

By the same reasoning, this statement is evidence that (2) if something is not black then it is not a raven. But since (as above) this statement is logically equivalent to (1) all ravens are black, it follows that the sight of a green apple is evidence supporting the notion that all ravens are black. This conclusion seems paradoxical because it implies that information has been gained about ravens by looking at an apple.

Proposed resolutions

Nicod's criterion says that only observations of ravens should affect one's view as to whether all ravens are black. Observing more instances of black ravens should support the view, observing white or coloured ravens should contradict it, and observations of non-ravens should not have any influence.

Hempel's equivalence condition states that when a proposition, X, provides evidence in favor of another proposition Y, then X also provides evidence in favor of any proposition that is logically equivalent to Y.

The paradox shows that Nicod's criterion and Hempel's equivalence condition are not mutually consistent. A resolution to the paradox must reject at least one of:

  1. negative instances having no influence (!PC, the denial of the paradoxical conclusion, PC),
  2. equivalence condition (EC), or,
  3. validation by positive instances (NC).

A satisfactory resolution should also explain why there naively appears to be a paradox. Solutions that accept the paradoxical conclusion can do this by presenting a proposition that we intuitively know to be false but that is easily confused with (PC), while solutions that reject (EC) or (NC) should present a proposition that we intuitively know to be true but that is easily confused with (EC) or (NC).

Accepting non-ravens as relevant

Although this conclusion of the paradox seems counter-intuitive, some approaches accept that observations of (coloured) non-ravens can in fact constitute valid evidence in support of hypotheses about (the universal blackness of) ravens.

Hempel's resolution

Hempel himself accepted the paradoxical conclusion, arguing that the reason the result appears paradoxical is that we possess prior information without which the observation of a non-black non-raven would indeed provide evidence that all ravens are black.

He illustrates this with the example of the generalization "All sodium salts burn yellow", and asks us to consider the observation that occurs when somebody holds a piece of pure ice in a colorless flame that does not turn yellow.

Good's presentation of the argument is perhaps the best known, and variations of the argument have been popular ever since, although it had been presented in 1958 and early forms of the argument appeared as early as 1940.

Good's argument involves calculating the weight of evidence provided by the observation of a black raven or a white shoe in favor of the hypothesis that all the ravens in a collection of objects are black. The weight of evidence is the logarithm of the Bayes factor, which in this case is simply the factor by which the odds of the hypothesis change when the observation is made. The argument goes as follows:
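Good's own figures are not reproduced here. As a rough sketch of a calculation of this kind, suppose a collection of `n_objects` items contains a known number of ravens and of black objects, that one object is drawn uniformly at random, and that, if the hypothesis is false, every possible number of non-black ravens from 1 up to the number of ravens is equally likely. All of these modelling choices, and the numbers used, are assumptions made for illustration.

```python
import math


def toy_bayes_factors(n_objects: int, n_ravens: int, n_black: int):
    """Bayes factors in favour of H0 ("all ravens are black") in a toy model.

    Returns (factor for seeing a black raven, factor for seeing a white shoe),
    where a "white shoe" stands for any non-black non-raven.
    """
    n_non_black = n_objects - n_black
    if not (2 <= n_ravens <= min(n_black, n_non_black)):
        raise ValueError("toy model needs 2 <= ravens <= min(black, non-black)")

    # If H0 is false, the number i of non-black ravens is 1..n_ravens,
    # each value assumed equally likely.
    counts = range(1, n_ravens + 1)

    # Probability of drawing a black raven under H0 and averaged over the alternatives.
    p_raven_h0 = n_ravens / n_objects
    p_raven_alt = sum((n_ravens - i) / n_objects for i in counts) / n_ravens

    # Probability of drawing a non-black non-raven under H0 and under the alternatives.
    p_shoe_h0 = n_non_black / n_objects
    p_shoe_alt = sum((n_non_black - i) / n_objects for i in counts) / n_ravens

    return p_raven_h0 / p_raven_alt, p_shoe_h0 / p_shoe_alt


bf_raven, bf_shoe = toy_bayes_factors(n_objects=1_000_000, n_ravens=100, n_black=10_000)
weight_raven = math.log(bf_raven)  # weight of evidence from a black raven
weight_shoe = math.log(bf_shoe)    # weight of evidence from a white shoe
print(bf_raven, bf_shoe, weight_raven, weight_shoe)
```

In this toy model the black raven roughly doubles the odds of the hypothesis, while the white shoe raises them by a factor only slightly above one, so both observations count as evidence but with very different weights.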

Many of the proponents of this resolution and variants of it have been advocates of Bayesian probability, and it is now commonly called the Bayesian Solution, although, as Chihara observes, "there is no such thing as the Bayesian solution. There are many different 'solutions' that Bayesians have put forward using Bayesian techniques." Noteworthy approaches using Bayesian techniques (some of which accept !PC and instead reject NC) include those of Earman, Eells, Gibson, Hosiasson-Lindenbaum, Mackie, and Hintikka, who claims that his approach is "more Bayesian than the so-called 'Bayesian solution' of the same paradox". Bayesian approaches that make use of Carnap's theory of inductive inference include those of Humburg and Maher. The term "Standard Bayesian Solution" was introduced to avoid confusion.

Carnap approach

Good gives an example of background knowledge with respect to which the observation of a black raven decreases the probability that all ravens are black:

Good concludes that the white shoe is a "red herring": Sometimes even a black raven can constitute evidence against the hypothesis that all ravens are black, so the fact that the observation of a white shoe can support it is not surprising and not worth attention. Nicod's criterion is false, according to Good, and so the paradoxical conclusion does not follow.

Hempel rejected this as a solution to the paradox, insisting that the proposition 'c is a raven and is black' must be considered "by itself and without reference to any other information", and pointing out that it "was emphasized in section 5.2(b) of my article in Mind ... that the very appearance of paradoxicality in cases like that of the white shoe results in part from a failure to observe this maxim."

The question that then arises is whether the paradox is to be understood in the context of absolutely no background information (as Hempel suggests), or in the context of the background information that we actually possess regarding ravens and black objects, or with regard to all possible configurations of background information.

Good had shown that, for some configurations of background knowledge, Nicod's criterion is false (provided that we are willing to equate "inductively support" with "increase the probability of" – see below). The possibility remained that, with respect to our actual configuration of knowledge, which is very different from Good's example, Nicod's criterion might still be true and so we could still reach the paradoxical conclusion. Hempel, on the other hand, insists our background knowledge itself is the red herring, and that we should consider induction with respect to a condition of perfect ignorance.

Good's baby

In his proposed resolution, Maher implicitly made use of the fact that the proposition "All ravens are black" is highly probable when it is highly probable that there are no ravens. Good had used this fact before to respond to Hempel's insistence that Nicod's criterion was to be understood to hold in the absence of background information:

This, according to Good, is as close as one can reasonably expect to get to a condition of perfect ignorance, and it appears that Nicod's criterion is still false. Maher made Good's argument more precise by using Carnap's theory of induction to formalize the notion that if there is one raven, then it is likely that there are many.

Maher's argument considers a universe of exactly two objects, each of which is very unlikely to be a raven (a one in a thousand chance) and reasonably unlikely to be black (a one in ten chance). Using Carnap's formula for induction, he finds that the probability that all ravens are black decreases from 0.9985 to 0.8995 when it is discovered that one of the two objects is a black raven.
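The text does not say which version of Carnap's formula or which parameter values were used. The sketch below is one reconstruction, not a documented account of Maher's calculation: it assumes a Carnap-style predictive rule with λ = 2 and an equal-weight mixture of two components, one that treats the four raven/black combinations as a single family and one that learns about "raven" and "black" separately. These assumptions were chosen because, under them, both probabilities round to the figures quoted above.

```python
from itertools import product

LAMBDA = 2.0                   # assumed value of Carnap's lambda
MIX = 0.5                      # assumed weight of the "combined" component
P_RAVEN, P_BLACK = 0.001, 0.1  # chances quoted in the text


def prob(p: float, flag: bool) -> float:
    """Prior probability of a property (flag=True) or of its absence."""
    return p if flag else 1.0 - p


def carnap_pair(prior_first: float, prior_second: float, same: bool) -> float:
    """P(first object in one cell, second in another) under the lambda rule."""
    return prior_first * ((1.0 if same else 0.0) + LAMBDA * prior_second) / (1.0 + LAMBDA)


def joint(a, b) -> float:
    """Mixture probability that object 1 is a and object 2 is b.

    Each of a and b is a pair (is_raven, is_black).
    """
    (ra, ba), (rb, bb) = a, b
    # Component 1: the four raven/black combinations form one family.
    cell_a = prob(P_RAVEN, ra) * prob(P_BLACK, ba)
    cell_b = prob(P_RAVEN, rb) * prob(P_BLACK, bb)
    combined = carnap_pair(cell_a, cell_b, a == b)
    # Component 2: "raven" and "black" are learned about independently.
    separate = (carnap_pair(prob(P_RAVEN, ra), prob(P_RAVEN, rb), ra == rb)
                * carnap_pair(prob(P_BLACK, ba), prob(P_BLACK, bb), ba == bb))
    return MIX * combined + (1.0 - MIX) * separate


KINDS = list(product([True, False], repeat=2))  # every (is_raven, is_black) pair


def consistent(kind) -> bool:
    """True unless the object is a non-black raven."""
    is_raven, is_black = kind
    return is_black or not is_raven


# Probability that neither object is a non-black raven.
prior = sum(joint(a, b) for a in KINDS for b in KINDS if consistent(a) and consistent(b))

# The same probability after learning that object 1 is a black raven.
black_raven = (True, True)
posterior = (sum(joint(black_raven, b) for b in KINDS if consistent(b))
             / sum(joint(black_raven, b) for b in KINDS))

print(round(prior, 4), round(posterior, 4))
```

The drop comes almost entirely from the second component: seeing one raven makes a second raven far more likely, and a second raven is the only thing that could be a counterexample.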

Maher concludes that not only is the paradoxical conclusion true, but that Nicod's criterion is false in the absence of background knowledge (except for the knowledge that the number of objects in the universe is two and that ravens are less likely than black things).

Distinguished predicates

Quine argued that the solution to the paradox lies in the recognition that certain predicates, which he called natural kinds, have a distinguished status with respect to induction. This can be illustrated with Nelson Goodman's example of the predicate grue. An object is grue if it is blue before (say) some fixed time <math>t</math> and green afterwards. Clearly, we expect objects that were blue before <math>t</math> to remain blue afterwards, but we do not expect the objects that were found to be grue before <math>t</math> to be blue after <math>t</math>, since after <math>t</math> they would be green. Quine's explanation is that "blue" is a natural kind, a privileged predicate we can use for induction, while "grue" is not a natural kind and using induction with it leads to error.

This suggests a resolution to the paradox – Nicod's criterion is true for natural kinds, such as "blue" and "black", but is false for artificially contrived predicates, such as "grue" or "non-raven". The paradox arises, according to this resolution, because we implicitly interpret Nicod's criterion as applying to all predicates when in fact it only applies to natural kinds.

Another approach, which favours specific predicates over others, was taken by Hintikka.

Scheffler and Goodman took an approach to the paradox that incorporates Karl Popper's view that scientific hypotheses are never really confirmed, only falsified.

The approach begins by noting that the observation of a black raven does not prove that "All ravens are black" but it falsifies the contrary hypothesis, "No ravens are black". A non-black non-raven, on the other hand, is consistent with both "All ravens are black" and with "No ravens are black". As the authors put it:

Selective confirmation violates the equivalence condition since a black raven selectively confirms "All ravens are black" but not "All non-black things are non-ravens".

Probabilistic or non-probabilistic induction

Scheffler and Goodman's concept of selective confirmation is an example of an interpretation of "provides evidence in favor of..." which does not coincide with "increase the probability of...". This must be a general feature of all resolutions that reject the equivalence condition, since logically equivalent propositions must always have the same probability.

It is impossible for the observation of a black raven to increase the probability of the proposition "All ravens are black" without causing exactly the same change to the probability that "All non-black things are non-ravens". If an observation inductively supports the former but not the latter, then "inductively support" must refer to something other than changes in the probabilities of propositions. A possible loophole is to interpret "All" as "Nearly all" – "Nearly all ravens are black" is not equivalent to "Nearly all non-black things are non-ravens", and these propositions can have very different probabilities.
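The "Nearly all" loophole can be made concrete with a small count. The sketch below uses a made-up world (every number is hypothetical) and an arbitrary threshold of 99% for "nearly all", and compares the strict and the relaxed readings of the two propositions.

```python
# Hypothetical counts of objects in a toy world.
black_ravens = 90
non_black_ravens = 10
non_black_non_ravens = 999_990

ravens = black_ravens + non_black_ravens
non_black_things = non_black_ravens + non_black_non_ravens

# Strict readings: both depend only on whether a non-black raven exists,
# so they always share one truth value.
all_ravens_black = non_black_ravens == 0
all_non_black_are_non_ravens = non_black_ravens == 0

# Relaxed readings: each proportion is taken over a different population.
NEARLY_ALL = 0.99  # arbitrary threshold, chosen for illustration
share_ravens_black = black_ravens / ravens
share_non_black_non_ravens = non_black_non_ravens / non_black_things

nearly_all_ravens_black = share_ravens_black >= NEARLY_ALL
nearly_all_non_black_are_non_ravens = share_non_black_non_ravens >= NEARLY_ALL

print(all_ravens_black, all_non_black_are_non_ravens)
print(share_ravens_black, share_non_black_non_ravens)
print(nearly_all_ravens_black, nearly_all_non_black_are_non_ravens)
```

In this world the two strict statements are false together, while the relaxed statements come apart: only 90% of ravens are black, yet all but a tiny fraction of non-black things are non-ravens.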

This raises the broader question of the relation of probability theory to inductive reasoning. Karl Popper argued that probability theory alone cannot account for induction. His argument involves splitting a hypothesis, <math>H</math>, into a part that is deductively entailed by the evidence, <math>E</math>, and another part. This can be done in two ways.

First, consider the splitting:

<math display="block">H=A\ and\ B \ \ \ \ \ \ E=B\ and\ C</math>

where <math>A</math>, <math>B</math> and <math>C</math> are probabilistically independent: <math>P(A\ and\ B)=P(A)P(B)</math> and so on. The condition that is necessary for such a splitting of <math>H</math> and <math>E</math> to be possible is <math>P(H|E)>P(H)</math>, that is, that <math>H</math> is probabilistically supported by <math>E</math>.

Popper's observation is that the part, <math>B</math>, of <math>H</math> that receives support from <math>E</math> actually follows deductively from <math>E</math>, while the part of <math>H</math> that does not follow deductively from <math>E</math> receives no support at all from <math>E</math> – that is, <math>P(A|E)=P(A)</math>.
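This first splitting can be checked numerically. The sketch below assigns arbitrary probabilities to three independent propositions (the values 0.3, 0.5 and 0.4 are assumptions chosen only for illustration), enumerates the eight possible worlds, and computes the relevant conditional probabilities.

```python
from itertools import product

P_A, P_B, P_C = 0.3, 0.5, 0.4  # arbitrary illustrative probabilities

# Each world is a triple of truth values (a, b, c).
WORLDS = list(product([True, False], repeat=3))


def weight(world) -> float:
    """Probability of a world when A, B and C are independent."""
    a, b, c = world
    return ((P_A if a else 1 - P_A)
            * (P_B if b else 1 - P_B)
            * (P_C if c else 1 - P_C))


def prob(event, given=lambda world: True) -> float:
    """Conditional probability P(event | given) by enumeration."""
    joint = sum(weight(w) for w in WORLDS if event(w) and given(w))
    return joint / sum(weight(w) for w in WORLDS if given(w))


def hypothesis(w):  # H = A and B
    return w[0] and w[1]


def evidence(w):    # E = B and C
    return w[1] and w[2]


def part_a(w):      # the part of H that E does not entail
    return w[0]


def part_b(w):      # the part of H that E entails
    return w[1]


p_h, p_h_given_e = prob(hypothesis), prob(hypothesis, evidence)
p_a, p_a_given_e = prob(part_a), prob(part_a, evidence)
p_b_given_e = prob(part_b, evidence)

print(p_h, p_h_given_e)  # E raises the probability of H
print(p_a, p_a_given_e)  # but leaves the probability of A unchanged
print(p_b_given_e)       # B follows from E
```

The whole of the increase from <math>P(H)</math> to <math>P(H|E)</math> is accounted for by <math>B</math> becoming certain.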

Second, the splitting:

<math display="block">H=(H\ or\ E)\ and\ (H\ or\ \overline{E})</math>

separates <math>H</math> into <math>(H\ or\ E)</math>, which as Popper says, "is the logically strongest part of <math>H</math> (or of the content of <math>H</math>) that follows [deductively] from <math>E</math>", and <math>(H\ or\ \overline{E})</math>, which, he says, "contains all of <math>H</math> that goes beyond <math>E</math>". He continues:

Orthodox approach

The orthodox Neyman–Pearson theory of hypothesis testing considers how to decide whether to accept or reject a hypothesis, rather than what probability to assign to the hypothesis. From this point of view, the hypothesis that "All ravens are black" is not accepted gradually, as its probability increases towards one when more and more observations are made, but is accepted in a single action as the result of evaluating the data that has already been collected. As Neyman and Pearson put it:

According to this approach, it is not necessary to assign any value to the probability of a hypothesis, although one must certainly take into account the probability of the data given the hypothesis, or given a competing hypothesis, when deciding whether to accept or to reject. The acceptance or rejection of a hypothesis carries with it the risk of error.

This contrasts with the Bayesian approach, which requires that the hypothesis be assigned a prior probability, which is revised in the light of the observed data to obtain the final probability of the hypothesis. Within the Bayesian framework there is no risk of error since hypotheses are not accepted or rejected; instead they are assigned probabilities.

An analysis of the paradox from the orthodox point of view has been performed, and leads to, among other insights, a rejection of the equivalence condition:

Rejecting material implication

The following propositions all imply one another: "Every object is either black or not a raven", "Every raven is black", and "Every non-black object is a non-raven." They are therefore, by definition, logically equivalent. However, the three propositions have different domains: the first proposition says something about "every object", while the second says something about "every raven".

The first proposition is the only one whose domain of quantification is unrestricted ("all objects"), so this is the only one that can be expressed in first-order logic. It is logically equivalent to:

<math display="block">\forall\ x, Rx\ \rightarrow\ Bx</math>

and also to

<math display="block">\forall\ x, \overline{Bx}\ \rightarrow\ \overline{Rx}</math>

where <math>\rightarrow</math> indicates the material conditional, according to which "If <math>A</math> then <math>B</math>" can be understood to mean "<math>B</math> or <math>\overline{A}</math>".

It has been argued by several authors that material implication does not fully capture the meaning of "If <math>A</math> then <math>B</math>" (see the paradoxes of material implication). "For every object, <math>x</math> is either black or not a raven" is true when there are no ravens. It is because of this that "All ravens are black" is regarded as true when there are no ravens. Furthermore, the arguments that Good and Maher used to criticize Nicod's criterion (see "Good's baby", above) relied on this fact: that "All ravens are black" is highly probable when it is highly probable that there are no ravens.

To say that all ravens are black in the absence of any ravens is an empty statement. It refers to nothing. "All ravens are white" is equally relevant and true, if this statement is considered to have any truth or relevance.
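The same behaviour appears in programming languages whose universal quantifier follows the material conditional. A minimal sketch, using a hypothetical `Thing` record and a world that happens to contain no ravens:

```python
from dataclasses import dataclass


@dataclass
class Thing:
    kind: str
    colour: str


def all_ravens_are(colour: str, things) -> bool:
    """True if every raven among `things` has the given colour.

    Python's all() returns True for an empty sequence, so the claim
    holds vacuously when there are no ravens at all.
    """
    return all(t.colour == colour for t in things if t.kind == "raven")


# A hypothetical world without ravens.
world = [Thing("apple", "green"), Thing("shoe", "white")]

print(all_ravens_are("black", world))
print(all_ravens_are("white", world))

# Adding a single raven makes the two claims come apart.
world_with_raven = world + [Thing("raven", "black")]
print(all_ravens_are("black", world_with_raven))
print(all_ravens_are("white", world_with_raven))
```

In the raven-free world both calls return `True`, which mirrors the point that "All ravens are black" and "All ravens are white" are equally true there.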

Some approaches to the paradox have sought to find other ways of interpreting "If <math>A</math> then <math>B</math>" and "All <math>A</math> are <math>B</math>", which would eliminate the perceived equivalence between "All ravens are black" and "All non-black things are non-ravens."

One such approach involves introducing a many-valued logic according to which "If <math>A</math> then <math>B</math>" has a truth value meaning "Indeterminate" or "Inappropriate" when <math>A</math> is false. In such a system, contraposition is not automatically allowed: "If <math>A</math> then <math>B</math>" is not equivalent to "If <math>\overline{B}</math> then <math>\overline{A}</math>". Consequently, "All ravens are black" is not equivalent to "All non-black things are non-ravens".
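A conditional of this kind can be sketched with three values, using Python's `None` for the indeterminate case. The encoding is an assumption made for illustration and is not taken from any particular many-valued system.

```python
from itertools import product
from typing import Optional


def conditional(a: bool, b: bool) -> Optional[bool]:
    """'If a then b': indeterminate (None) when a is false, otherwise the value of b."""
    return b if a else None


def contrapositive(a: bool, b: bool) -> Optional[bool]:
    """'If not b then not a', evaluated with the same three-valued conditional."""
    return conditional(not b, not a)


# Cases (a, b) in which the conditional and its contrapositive disagree.
mismatches = [
    (a, b)
    for a, b in product([True, False], repeat=2)
    if conditional(a, b) != contrapositive(a, b)
]

for a, b in product([True, False], repeat=2):
    print(a, b, conditional(a, b), contrapositive(a, b))
print("mismatches:", mismatches)
```

Reading `a` as "is a raven" and `b` as "is black", the two statements disagree exactly for a black raven and for a non-black non-raven: each of these objects makes one conditional true while leaving the other indeterminate.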

In this system, when contraposition occurs, the modality of the conditional involved changes from the indicative ("If that piece of butter has been heated to 32 °C then it has melted") to the counterfactual ("If that piece of butter had been heated to 32 °C then it would have melted"). According to this argument, this removes the alleged equivalence that is necessary to conclude that yellow cows can inform us about ravens: