In statistics, the Rao–Blackwell theorem, sometimes referred to as the Rao–Blackwell–Kolmogorov theorem, is a result that characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar criteria.
The Rao–Blackwell theorem states that if <math>\delta(X)</math> is any kind of estimator of a parameter <math>\theta</math>, then the conditional expectation of <math>\delta(X)</math> given <math>T(X)</math>, where <math>T</math> is a sufficient statistic, is typically a better estimator of <math>\theta</math>, and is never worse. Sometimes one can very easily construct a very crude estimator <math>\delta(X)</math>, and then evaluate that conditional expected value to get an estimator that is in various senses optimal.
The theorem is named after C.R. Rao and David Blackwell. The process of transforming an estimator using the Rao–Blackwell theorem can be referred to as Rao–Blackwellization. The transformed estimator is called the Rao–Blackwell estimator.
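Writing <math>\delta_1(X) = \operatorname{E}[\delta(X) \mid T(X)]</math> for the Rao–Blackwell estimator, the law of total expectation, applied to the squared error conditional on <math>T(X)</math>, gives the following decomposition of its mean squared error:
:<math>
\operatorname{E}[(\delta_1(X) - \theta)^2] =
\operatorname{E}[(\delta(X) - \theta)^2] -
\operatorname{E}\left[\operatorname{Var}[\delta(X) \mid T(X)]\right]
</math>
Since <math> \operatorname{E}\left[\operatorname{Var}[\delta(X) \mid T(X)]\right] \ge 0 </math>,
the Rao–Blackwell theorem immediately follows.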
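The decomposition can be checked numerically. The following is a minimal sketch under an illustrative setup that is not taken from the text above: <math>X_1, \ldots, X_n</math> are assumed i.i.d. Bernoulli(<math>p</math>), the crude estimator is <math>\delta(X) = X_1</math>, and the sufficient statistic is <math>T(X) = \sum_i X_i</math>, so that <math>\delta_1(X) = T/n</math>. The function name `rao_blackwell_bernoulli` and its default parameters are hypothetical.

```python
import random


def rao_blackwell_bernoulli(p=0.3, n=10, reps=100_000, seed=0):
    """Monte Carlo check of MSE(delta_1) = MSE(delta) - E[Var(delta | T)].

    Assumed setup (illustrative only): X_1..X_n i.i.d. Bernoulli(p),
    crude estimator delta = X_1, sufficient statistic T = sum of the X_i.
    Returns (MSE of delta, MSE of delta_1, mean conditional variance).
    """
    rng = random.Random(seed)
    se_crude = se_rb = cond_var = 0.0
    for _ in range(reps):
        x = [1 if rng.random() < p else 0 for _ in range(n)]
        delta = x[0]            # crude unbiased estimator of p
        delta1 = sum(x) / n     # E[X_1 | T] = T / n
        se_crude += (delta - p) ** 2
        se_rb += (delta1 - p) ** 2
        # Given T, X_1 is Bernoulli(T/n), so Var(X_1 | T) = (T/n)(1 - T/n).
        cond_var += delta1 * (1.0 - delta1)
    return se_crude / reps, se_rb / reps, cond_var / reps


if __name__ == "__main__":
    mse_crude, mse_rb, mean_cond_var = rao_blackwell_bernoulli()
    print(f"MSE of crude estimator:          {mse_crude:.5f}")
    print(f"MSE of Rao-Blackwell estimator:  {mse_rb:.5f}")
    print(f"Mean conditional variance:       {mean_cond_var:.5f}")
    print(f"Crude MSE minus conditional var: {mse_crude - mean_cond_var:.5f}")
```

Up to Monte Carlo error, the last two quantities printed should reproduce the second: the mean squared error of the crude estimator, less the expected conditional variance, matches that of the Rao–Blackwell estimator.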
Convex loss generalization
The more general version of the Rao–Blackwell theorem speaks of the "expected loss" or risk function:
:<math> \operatorname{E}[L(\delta_1(X))] \leq \operatorname{E}[L(\delta(X))] </math>
where the "loss function" <math>L</math> may be any convex function. If the loss function is twice-differentiable, as in the case of mean-squared error, then we have the sharper inequality
:<math>\operatorname{E}[L(\delta(X))] - \operatorname{E}[L(\delta_1(X))] \ge \frac{1}{2} \operatorname{E}_T\left[\inf_x L''(x) \operatorname{Var}(\delta(X) \mid T)\right].</math>
The Rao–Blackwell estimator need not be optimal when the sufficient statistic is not complete, as the following example shows. Let <math>X_1, \ldots, X_n</math> be a random sample from a scale-uniform distribution <math>X \sim U \left( (1-k) \theta, (1+k) \theta \right),</math> with unknown mean <math>E[X]=\theta</math> and known design parameter <math>k \in (0,1)</math>. In the search for "best" possible unbiased estimators for <math>\theta,</math> it is natural to consider <math>X_1</math> as an initial (crude) unbiased estimator for <math>\theta</math> and then try to improve it. Since <math>X_1</math> is not a function of <math>T = \left( X_{(1)}, X_{(n)} \right)</math>, the minimal sufficient statistic for <math>\theta</math> (where <math>X_{(1)} = \min( X_i )</math> and <math>X_{(n)} = \max( X_i )</math>), it may be improved using the Rao–Blackwell theorem as follows:
:<math>\hat{\theta}_{RB}=E_{\theta} \left [X_1 \mid X_{(1)}, X_{(n)} \right ]=\frac{X_{(1)}+X_{(n)}}{2}.</math>
However, the following unbiased estimator can be shown to have lower variance:
:<math>\hat{\theta}_{LV} = \frac{1}{2 \left (k^2 \frac{n-1}{n+1}+1\right )} \left[ (1-k)X_{(1)}+(1+k)X_{(n)} \right].</math>
In fact, it can be improved even further by using the following estimator:
:<math>\hat{\theta}_{BAYES} =\frac{n+1}{n} \left[ 1-\frac{\frac{\left( \frac{X_{(1)}}{1-k} \right)}{\left( \frac{X_{(n)}}{1+k} \right)}-1}{\left[ \frac{\left( \frac{X_{(1)}}{1-k} \right)}{\left( \frac{X_{(n)}}{1+k} \right)} \right]^{n+1}-1} \right] \frac{X_{(n)}}{1+k}</math>
The model is a scale model. Optimal equivariant estimators can then be derived for loss functions that are invariant.
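The comparison between the estimators in this example can be explored by simulation. The sketch below is illustrative only: the function names `estimators` and `compare_scale_uniform` and all default values are hypothetical. It assumes that <math>\hat{\theta}_{LV}</math> is the combination <math>(1-k)X_{(1)}+(1+k)X_{(n)}</math> rescaled to be unbiased, and that <math>\hat{\theta}_{BAYES}</math> is built from the ratio of <math>X_{(1)}/(1-k)</math> to <math>X_{(n)}/(1+k)</math>. Both formulas should be checked against the original source before relying on the numbers.

```python
import random


def estimators(x_min, x_max, k, n):
    """Return the three estimates of theta from the sample extremes.

    The LV and BAYES formulas are assumptions as stated in the lead-in.
    """
    rb = (x_min + x_max) / 2
    lv = ((1 - k) * x_min + (1 + k) * x_max) / (
        2 * (k * k * (n - 1) / (n + 1) + 1)
    )
    a = x_max / (1 + k)          # smallest theta compatible with the sample
    r = (x_min / (1 - k)) / a    # ratio of largest to smallest compatible theta
    if r == 1.0:
        bayes = a                # limiting value when the two bounds coincide
    else:
        bayes = (n + 1) / n * (1 - (r - 1) / (r ** (n + 1) - 1)) * a
    return {"RB": rb, "LV": lv, "BAYES": bayes}


def compare_scale_uniform(theta=1.0, k=0.5, n=10, reps=100_000, seed=0):
    """Estimate the mean squared error of each estimator by Monte Carlo."""
    rng = random.Random(seed)
    lo, hi = (1 - k) * theta, (1 + k) * theta
    sq_err = {"RB": 0.0, "LV": 0.0, "BAYES": 0.0}
    for _ in range(reps):
        xs = [rng.uniform(lo, hi) for _ in range(n)]
        est = estimators(min(xs), max(xs), k, n)
        for name, value in est.items():
            sq_err[name] += (value - theta) ** 2
    return {name: total / reps for name, total in sq_err.items()}


if __name__ == "__main__":
    for name, mse in compare_scale_uniform().items():
        print(f"{name:>5}: estimated MSE = {mse:.6f}")
```

Because the model is a scale model, the ranking of the estimators should not depend on the value of `theta` chosen for the simulation; varying `k` and `n` shows how the gap between <math>\hat{\theta}_{RB}</math> and <math>\hat{\theta}_{LV}</math> changes.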
See also
- Basu's theorem — Another result on complete sufficient and ancillary statistics
