{\Gamma(\nu/2)}~

\frac{\exp\left[ \frac{-\nu \tau^2}{2 x}\right]}{x^{1+\nu/2}}</math>|

cdf =<math>\Gamma\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right)

\left/\Gamma\left(\frac{\nu}{2}\right)\right.</math>|

mean =<math>\frac{\nu \tau^2}{\nu-2}</math> for <math>\nu >2\,</math>|

median =|

mode =<math>\frac{\nu \tau^2}{\nu+2}</math>|

variance =<math>\frac{2 \nu^2 \tau^4}{(\nu-2)^2 (\nu-4)}</math> for <math>\nu >4\,</math>|

skewness =<math>\frac{4}{\nu-6}\sqrt{2(\nu-4)}</math> for <math>\nu >6\,</math>|

kurtosis =<math>\frac{12(5\nu-22)}{(\nu-6)(\nu-8)}</math> for <math>\nu >8\,</math>|

entropy =<math>\frac{\nu}{2}

\!+\!\ln\left(\frac{\tau^2\nu}{2}\Gamma\left(\frac{\nu}{2}\right)\right)</math>

<math>\!-\!\left(1\!+\!\frac{\nu}{2}\right)\psi\left(\frac{\nu}{2}\right)</math>|

mgf =<math>\frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2\tau^2\nu t}\right)</math>|

char =<math>\frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-i\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2i\tau^2\nu t}\right)</math>|

The scaled inverse chi-squared distribution <math> \psi \, \mbox{inv-} \chi^2(\nu)</math>, where <math> \psi </math> is the scale parameter, equals the univariate inverse Wishart distribution <math> \mathcal{W}^{-1}(\psi,\nu)</math> with degrees of freedom <math>\nu</math>.

This family of scaled inverse chi-squared distributions is linked to the inverse-chi-squared distribution and to the chi-squared distribution:

If <math>X \sim \psi \, \mbox{inv-} \chi^2(\nu)</math> then <math> X/\psi \sim \mbox{inv-} \chi^2(\nu) </math> as well as <math> \psi/X \sim \chi^2(\nu) </math> and <math> 1/X \sim \psi^{-1}\chi^2(\nu) </math>.

Instead of <math> \psi</math>, however, the scaled inverse chi-squared distribution is most frequently parametrized by the scale parameter <math>\tau^2 = \psi/\nu</math>, and the distribution <math>\nu \tau^2 \, \mbox{inv-} \chi^2(\nu)</math> is denoted by <math>\mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math>.

In terms of <math>\tau^2</math> the above relations can be written as follows:

If <math>X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math> then <math> \frac{X}{\nu \tau^2} \sim \mbox{inv-} \chi^2(\nu) </math> as well as <math> \frac{\nu \tau^2}{X} \sim \chi^2(\nu) </math> and <math> 1/X \sim \frac{1}{\nu \tau^2}\chi^2(\nu) </math>.
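The relation <math>\nu\tau^2/X \sim \chi^2(\nu)</math> suggests a simple way to draw random variates. The following is a minimal sketch using only the Python standard library; the function name <code>sample_scaled_inv_chi2</code>, the seed and the parameter values are illustrative assumptions, not taken from any particular package. It compares the sample mean with the mean <math>\nu\tau^2/(\nu-2)</math>.

```python
import random


def sample_scaled_inv_chi2(nu, tau2, rng):
    """Draw one value from Scale-inv-chi^2(nu, tau2).

    Uses nu * tau2 / X ~ chi^2(nu), where a chi^2(nu) variate is drawn as
    a gamma variate with shape nu/2 and scale 2.
    """
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return nu * tau2 / chi2


rng = random.Random(0)   # arbitrary fixed seed, for reproducibility only
nu, tau2 = 10.0, 3.0     # illustrative parameter values (nu > 2)
draws = [sample_scaled_inv_chi2(nu, tau2, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
exact_mean = nu * tau2 / (nu - 2.0)  # mean formula, valid because nu > 2
print("sample mean:", sample_mean)
print("exact mean: ", exact_mean)
```

The two printed values should agree to within Monte Carlo error; the agreement deteriorates as <math>\nu</math> approaches 4, where the variance becomes infinite.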

This family of scaled inverse chi-squared distributions is a reparametrization of the inverse-gamma distribution.

Specifically, if

:<math>X \sim \psi \, \mbox{inv-} \chi^2(\nu) = \mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math> &nbsp;&nbsp;then&nbsp;&nbsp; <math>X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\psi}{2}\right) = \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right)</math>

Either form may be used to represent the maximum entropy distribution for a fixed first inverse moment <math>(E(1/X))</math> and first logarithmic moment <math>(E(\ln(X)))</math>.

The scaled inverse chi-squared distribution also has a particular use in Bayesian statistics. Specifically, the scaled inverse chi-squared distribution can be used as a conjugate prior for the variance parameter of a normal distribution.

The same prior in an alternative parametrization is given by the inverse-gamma distribution.

Characterization

The probability density function of the scaled inverse chi-squared distribution extends over the domain <math>x>0</math> and is

:<math>
f(x; \nu, \tau^2)=
\frac{(\tau^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}~
\frac{\exp\left[ \frac{-\nu \tau^2}{2 x}\right]}{x^{1+\nu/2}}
</math>

where <math>\nu</math> is the degrees of freedom parameter and <math>\tau^2</math> is the scale parameter. The cumulative distribution function is

:<math>F(x; \nu, \tau^2)=

\Gamma\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right)

\left/\Gamma\left(\frac{\nu}{2}\right)\right.</math>

:<math>=Q\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right)</math>

where <math>\Gamma(a,x)</math> is the incomplete gamma function, <math>\Gamma(x)</math> is the gamma function and <math>Q(a,x)</math> is a regularized gamma function. The characteristic function is

:<math>\varphi(t;\nu,\tau^2)=</math>

:<math>\frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-i\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2i\tau^2\nu t}\right) ,</math>

where <math>K_{\frac{\nu}{2}}(z)</math> is the modified Bessel function of the second kind.
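The density and the cumulative distribution function can be evaluated numerically along the following lines. This is a sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, and because the Python standard library has no incomplete gamma function, <math>Q(a,z)</math> is computed here with the textbook power series (for <math>z < a+1</math>) and continued fraction (otherwise).

```python
import math


def pdf(x, nu, tau2):
    """Density of Scale-inv-chi^2(nu, tau2) at x, evaluated in log space."""
    if x <= 0.0:
        return 0.0
    a = nu / 2.0
    b = nu * tau2 / 2.0
    log_f = a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(x) - b / x
    return math.exp(log_f)


def upper_gamma_q(a, z, eps=1e-13, max_iter=10_000):
    """Regularized upper incomplete gamma function Q(a, z), a > 0, z >= 0."""
    if z <= 0.0:
        return 1.0
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    if z < a + 1.0:
        # Power series for P(a, z) = 1 - Q(a, z).
        term = 1.0 / a
        total = term
        for k in range(1, max_iter):
            term *= z / (a + k)
            total += term
            if term <= eps * total:
                break
        return 1.0 - math.exp(log_prefactor) * total
    # Continued fraction for Q(a, z), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return math.exp(log_prefactor) * h


def cdf(x, nu, tau2):
    """F(x; nu, tau2) = Q(nu/2, nu * tau2 / (2 x))."""
    if x <= 0.0:
        return 0.0
    return upper_gamma_q(nu / 2.0, nu * tau2 / (2.0 * x))


# For nu = 2 the CDF reduces to exp(-tau2 / x), which gives a quick check.
print(cdf(1.0, 2.0, 1.0), math.exp(-1.0))
```

For <math>\nu = 2</math> the regularized gamma function reduces to <math>Q(1,z) = e^{-z}</math>, so <math>F(x) = \exp(-\tau^2/x)</math>; the last line uses this as a check.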

Parameter estimation

The maximum likelihood estimate of <math>\tau^2</math> is

:<math>\tau^2 = n/\sum_{i=1}^n \frac{1}{x_i}.</math>

The maximum likelihood estimate of <math>\frac{\nu}{2}</math> can be found using Newton's method on:

:<math>\ln\left(\frac{\nu}{2}\right) - \psi\left(\frac{\nu}{2}\right) = \frac{1}{n} \sum_{i=1}^n \ln\left(x_i\right) - \ln\left(\tau^2\right) ,</math>

where <math>\psi(x)</math> is the digamma function. An initial estimate can be found by taking the formula for mean and solving it for <math>\nu.</math> Let <math>\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i</math> be the sample mean. Then an initial estimate for <math>\nu</math> is given by:

:<math>\frac{\nu}{2} = \frac{\bar{x}}{\bar{x} - \tau^2}.</math>
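The estimation procedure above can be sketched in Python as follows. The digamma and trigamma helpers are hand-written approximations (recurrence plus an asymptotic series), included only because the standard library does not provide them. The function name <code>fit_scaled_inv_chi2</code> and the safeguard against negative Newton iterates are assumptions of this sketch, and the sample is assumed to be positive and not constant.

```python
import math


def digamma(x):
    """Digamma function for x > 0: recurrence, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Trigamma function for x > 0: recurrence, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = (inv2 / x) * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return result + 1.0 / x + 0.5 * inv2 + series


def fit_scaled_inv_chi2(xs, tol=1e-10, max_iter=100):
    """Return maximum likelihood estimates (nu, tau2) for positive data xs."""
    n = len(xs)
    tau2 = n / sum(1.0 / x for x in xs)          # harmonic mean of the sample
    mean_x = sum(xs) / n
    rhs = sum(math.log(x) for x in xs) / n - math.log(tau2)
    a = mean_x / (mean_x - tau2)                 # initial estimate of nu/2
    for _ in range(max_iter):
        g = math.log(a) - digamma(a) - rhs       # equation to be solved: g(a) = 0
        dg = 1.0 / a - trigamma(a)               # derivative of g
        a_new = a - g / dg                       # Newton step
        if a_new <= 0.0:                         # safeguard: stay in the domain
            a_new = a / 2.0
        converged = abs(a_new - a) < tol * a
        a = a_new
        if converged:
            break
    return 2.0 * a, tau2
```

Calling <code>fit_scaled_inv_chi2(xs)</code> on a list of positive observations returns the pair of estimates for <math>\nu</math> and <math>\tau^2</math>.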

Bayesian estimation of the variance of a normal distribution

The scaled inverse chi-squared distribution has a second important application, in the Bayesian estimation of the variance of a normal distribution.

According to Bayes' theorem, the posterior probability distribution for quantities of interest is proportional to the product of a prior distribution for the quantities and a likelihood function:

:<math>p(\sigma^2|D,I) \propto p(\sigma^2|I) \; p(D|\sigma^2)</math>

where D represents the data and I represents any initial information about &sigma;<sup>2</sup> that we may already have.

The simplest scenario arises if the mean &mu; is already known; or, alternatively, if it is the conditional distribution of &sigma;<sup>2</sup> that is sought, for a particular assumed value of &mu;.

Then the likelihood term L(&sigma;<sup>2</sup>|D) = p(D|&sigma;<sup>2</sup>) has the familiar form

:<math>\mathcal{L}(\sigma^2|D,\mu) = \frac{1}{\left(\sqrt{2\pi}\sigma\right)^n} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right]</math>

Combining this with the rescaling-invariant prior p(&sigma;<sup>2</sup>|I) = 1/&sigma;<sup>2</sup>, which can be argued (e.g. following Jeffreys) to be the least informative possible prior for &sigma;<sup>2</sup> in this problem, gives a combined posterior probability

:<math>p(\sigma^2|D, I, \mu) \propto \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right]</math>

This form can be recognised as that of a scaled inverse chi-squared distribution, with parameters &nu; = n and &tau;<sup>2</sup> = s<sup>2</sup> = (1/n) Σ (x<sub>i</sub>-&mu;)<sup>2</sup>.
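As an illustration, the posterior parameters and an approximate credible interval for &sigma;<sup>2</sup> can be computed as in the sketch below. The data values, the assumed known mean, the function names and the Monte Carlo approach to the interval are all assumptions made for this example.

```python
import random


def posterior_known_mean(data, mu):
    """Posterior (nu, tau2) for sigma^2 under the 1/sigma^2 prior, mu known."""
    n = len(data)
    s2 = sum((x - mu) ** 2 for x in data) / n
    return n, s2


def credible_interval(nu, tau2, level=0.95, draws=50_000, seed=0):
    """Equal-tailed Monte Carlo interval for Scale-inv-chi^2(nu, tau2)."""
    rng = random.Random(seed)
    # nu * tau2 / X ~ chi^2(nu), drawn as a gamma variate (shape nu/2, scale 2).
    samples = sorted(
        nu * tau2 / rng.gammavariate(nu / 2.0, 2.0) for _ in range(draws)
    )
    lower = samples[int((1.0 - level) / 2.0 * draws)]
    upper = samples[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper


data = [4.1, 5.3, 6.2, 4.8, 5.9, 5.1]   # made-up observations
mu = 5.0                                # assumed known mean
nu_post, tau2_post = posterior_known_mean(data, mu)
interval = credible_interval(nu_post, tau2_post)
print("posterior:", nu_post, tau2_post)
print("approximate 95% interval for sigma^2:", interval)
```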

Gelman and co-authors remark that the re-appearance of this distribution, previously seen in a sampling context, may seem remarkable; but given the choice of prior "this result is not surprising."

In particular, the choice of a rescaling-invariant prior for &sigma;<sup>2</sup> has the result that the probability for the ratio of &sigma;<sup>2</sup> / s<sup>2</sup> has the same form (independent of the conditioning variable) when conditioned on s<sup>2</sup> as when conditioned on &sigma;<sup>2</sup>:

:<math>p(\tfrac{\sigma^2}{s^2}|s^2) = p(\tfrac{\sigma^2}{s^2}|\sigma^2)</math>

In the sampling-theory case, conditioned on &sigma;<sup>2</sup>, the probability distribution for (1/s<sup>2</sup>) is a scaled inverse chi-squared distribution; and so the probability distribution for &sigma;<sup>2</sup> conditioned on s<sup>2</sup>, given a scale-agnostic prior, is also a scaled inverse chi-squared distribution.

Use as an informative prior

If more is known about the possible values of &sigma;<sup>2</sup>, a distribution from the scaled inverse chi-squared family, such as Scale-inv-&chi;<sup>2</sup>(n<sub>0</sub>, s<sub>0</sub><sup>2</sup>) can be a convenient form to represent a more informative prior for &sigma;<sup>2</sup>, as if from the result of n<sub>0</sub> previous observations (though n<sub>0</sub> need not necessarily be a whole number):

:<math>p(\sigma^2|I^\prime, \mu) \propto \frac{1}{\sigma^{n_0+2}} \; \exp \left[ -\frac{n_0 s_0^2}{2\sigma^2} \right]</math>

Such a prior would lead to the posterior distribution

:<math>p(\sigma^2|D, I^\prime, \mu) \propto \frac{1}{\sigma^{n+n_0+2}} \; \exp \left[ -\frac{ns^2 + n_0 s_0^2}{2\sigma^2} \right]</math>

which is itself a scaled inverse chi-squared distribution. The scaled inverse chi-squared distributions are thus a convenient conjugate prior family for &sigma;<sup>2</sup> estimation.
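Matching the posterior above against the density of the scaled inverse chi-squared distribution gives the parameters <math>\nu = n_0 + n</math> and <math>\tau^2 = (n_0 s_0^2 + n s^2)/(n_0 + n)</math>. A minimal sketch of this conjugate update follows; the function name and the numbers are illustrative assumptions.

```python
def update_scaled_inv_chi2(n0, s0_sq, data, mu):
    """Conjugate update of a Scale-inv-chi^2(n0, s0_sq) prior for sigma^2.

    The mean mu of the normal distribution is treated as known.
    Returns the posterior parameters (nu, tau2).
    """
    n = len(data)
    n_s2 = sum((x - mu) ** 2 for x in data)   # equals n * s^2
    nu_post = n0 + n
    tau2_post = (n0 * s0_sq + n_s2) / nu_post
    return nu_post, tau2_post


data = [4.1, 5.3, 6.2, 4.8, 5.9, 5.1]   # made-up observations
prior = (4.0, 1.0)                      # hypothetical prior: n0 = 4, s0^2 = 1
posterior = update_scaled_inv_chi2(prior[0], prior[1], data, mu=5.0)
print(posterior)
```

Because the posterior lies in the same family as the prior, the update can be applied to the data in batches, with the posterior from one batch serving as the prior for the next.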

Estimation of variance when mean is unknown

If the mean is not known, the most uninformative prior that can be taken for it is arguably the translation-invariant prior p(&mu;|I)&nbsp;∝&nbsp;const., which gives the following joint posterior distribution for &mu; and &sigma;<sup>2</sup>,

:<math>

\begin{align}

p(\mu, \sigma^2 \mid D, I) & \propto \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right] \\

& = \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right]

\end{align}

</math>

The marginal posterior distribution for &sigma;<sup>2</sup> is obtained from the joint posterior distribution by integrating out over &mu;,

:<math>\begin{align}

p(\sigma^2|D, I) \; \propto \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \int_{-\infty}^{\infty} \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right] d\mu\\

= \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \sqrt{2 \pi \sigma^2 / n} \\

\propto \; & (\sigma^2)^{-(n+1)/2} \; \exp \left[ -\frac{(n-1)s^2}{2\sigma^2} \right]

\end{align}</math>

This is again a scaled inverse chi-squared distribution, with parameters <math>\scriptstyle{n-1}\;</math> and <math>\scriptstyle{s^2 = \sum (x_i - \bar{x})^2/(n-1)}</math>.
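The factorization of the joint posterior also shows that, conditional on &sigma;<sup>2</sup>, the mean &mu; is normally distributed about <math>\bar{x}</math> with variance &sigma;<sup>2</sup>/n. Joint draws of (&mu;, &sigma;<sup>2</sup>) can therefore be generated in two steps, as in the following sketch; the function name and the data are assumptions for illustration only.

```python
import math
import random
import statistics


def sample_joint_posterior(data, draws, seed=0):
    """Draw (mu, sigma2) pairs from the joint posterior under flat priors.

    Step 1: sigma2 ~ Scale-inv-chi^2(n - 1, s^2), with s^2 the sample variance.
    Step 2: mu | sigma2 ~ Normal(xbar, sigma2 / n).
    """
    rng = random.Random(seed)
    n = len(data)
    xbar = statistics.fmean(data)
    s2 = statistics.variance(data)      # divides by n - 1
    pairs = []
    for _ in range(draws):
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)   # chi^2 with n - 1 d.o.f.
        sigma2 = (n - 1) * s2 / chi2
        mu = rng.gauss(xbar, math.sqrt(sigma2 / n))
        pairs.append((mu, sigma2))
    return pairs


data = [4.1, 5.3, 6.2, 4.8, 5.9, 5.1]   # made-up observations
pairs = sample_joint_posterior(data, draws=10_000)
print("posterior mean of mu (approx.):", statistics.fmean(p[0] for p in pairs))
```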

Related distributions

  • If <math>X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math> then <math> k X \sim \mbox{Scale-inv-}\chi^2(\nu, k \tau^2)\, </math>
  • If <math>X \sim \mbox{inv-}\chi^2(\nu) \, </math> (Inverse-chi-squared distribution) then <math>X \sim \mbox{Scale-inv-}\chi^2(\nu, 1/\nu) \,</math>
  • If <math>X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math> then <math> \frac{X}{\tau^2 \nu} \sim \mbox{inv-}\chi^2(\nu) \, </math> (Inverse-chi-squared distribution)
  • If <math>X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)</math> then <math>X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right)</math> (Inverse-gamma distribution)
  • The scaled inverse chi-squared distribution is a special case of the type 5 Pearson distribution

References