Generalized extreme value distribution

| skewness = <math>\begin{cases} \sgn(\xi) \dfrac{g_3 - 3 g_2 g_1 + 2 g_1^3}{(g_2 - g_1^2)^{3/2}} ~ & \text{if } ~ \xi \neq 0 ~~ \text{ and } ~~ \xi < \tfrac{1}{3} \\

\dfrac{12 \sqrt{6} \, \zeta(3)}{\pi^3} ~~ & \text{if } ~~ \xi = 0 \end{cases}</math> <br> where <math> ~ \sgn(x) ~ </math> is the sign function <br/> and <math> ~ \zeta(x) ~ </math> is the Riemann zeta function

| kurtosis = <math>\begin{cases} \dfrac{g_4 - 4 g_3 g_1 + 6 g_1^2 g_2 - 3 g_1^4}{(g_2 - g_1^2)^2} - 3 ~ & \text{if } ~ \xi \neq 0 ~ \text{ and } ~ \xi < \tfrac{1}{4} \\

\tfrac{12}{5} ~ & \text{if } ~ \xi = 0 \end{cases}</math> <br> (excess kurtosis) where <math> ~ g_k = \Gamma(1 - k \xi) ~ </math>

| entropy = <math> \ln(\sigma) + \gamma\xi + \gamma + 1 </math>

| mgf = see

In probability theory and statistics, the generalized extreme value (GEV) distribution is a family of continuous probability distributions developed within extreme value theory to combine the Gumbel, Fréchet and Weibull families, also known as type I, II and III extreme value distributions. By the extreme value theorem, the GEV distribution is the only possible limit distribution of properly normalized maxima of a sequence of independent and identically distributed random variables.

though allegedly

it could also have been given by .

Specification

Using the standardized variable <math>\ s \equiv \tfrac{x - \mu}{\sigma}\ ,</math> where the location parameter <math>\ \mu\ </math> can be any real number and <math>\ \sigma > 0\ </math> is the scale parameter, the cumulative distribution function of the GEV distribution is

:<math display="block"> F(s; \xi) = \begin{cases} \exp (-\mathrm{e}^{-s}) \quad & \text{for } \quad \xi = 0\ , \\

\exp \bigl( \! - ( 1 + \xi s)^{-1/\xi} \bigr) & \text{for } \quad \xi \neq 0 ~~ \text{ and } ~~ \xi s > -1\ , \\

0 & \text{for } \quad \xi > 0 ~~ \text{ and } ~~ s \le -\tfrac{1}{\xi}\ , \\

1 & \text{for } \quad \xi < 0 ~~ \text{ and } ~~ s \ge \tfrac{1}{|\xi|}\ , \end{cases}</math>

where <math>\ \xi\ ,</math> the shape parameter, can be any real number. Thus, for <math>\ \xi > 0\ ,</math> the expression is valid for <math>\ s > -\tfrac{1}{\xi}\ ,</math> while for <math>\ \xi < 0\ </math> it is valid for <math>\ s < - \tfrac{1}{\xi} ~.</math> In the first case, <math>\ -\tfrac{1}{\xi}\ </math> is the negative, lower end-point, where <math>\ F\ </math> is 0; in the second case, <math>\ -\tfrac{1}{\xi}\ </math> is the positive, upper end-point, where <math>\ F\ </math> is 1. For <math>\ \xi = 0\ ,</math> the second expression is formally undefined and is replaced by the first expression, which is the limit of the second as <math>\ \xi \to 0\ ,</math> in which case <math>\ s\ </math> can be any real number.

In the special case of <math>\ x = \mu\ ,</math> we have <math>\ s = 0\ ,</math> so then <math>\ F(0; \xi) = \mathrm{e}^{-1} \approx 0.368\ </math> regardless of the values of <math>\ \xi\ </math> and <math>\ \sigma ~.</math>
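
The piecewise definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `gev_cdf` is a hypothetical choice made for this example, and no care is taken over overflow for extreme arguments.

```python
import math


def gev_cdf(s: float, xi: float) -> float:
    """Standardized GEV CDF F(s; xi), with s = (x - mu) / sigma."""
    if xi == 0.0:
        # Gumbel case: the limit of the general expression as xi -> 0.
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: at or below the lower end-point when xi > 0,
        # at or above the upper end-point when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


# At s = 0 the value is exp(-1) whatever the shape parameter is.
for xi in (-0.5, 0.0, 0.5):
    print(xi, gev_cdf(0.0, xi))
```

Treating <math>\ \xi = 0\ </math> as a separate branch avoids the division by zero in the exponent <math>\ -1/\xi ~.</math>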

The probability density function of the standardized distribution is

:<math display="block">f(s; \xi) = \begin{cases} \mathrm{e}^{-s} \exp (-\mathrm{e}^{-s} ) & \text{for } \quad \xi = 0\ , \\

(1 + \xi s)^{-( 1 + 1/\xi )} \exp\bigl(\! -( 1 + \xi s )^{-1/\xi} \bigr) & \text{for } \quad \xi \neq 0 ~~ \text{ and } ~~ \xi s > -1\ , \\

0 & \text{otherwise;} \end{cases}</math>

again valid for <math>\ s > -\tfrac{1}{\xi}\ </math> in the case <math>\ \xi > 0\ </math>, and for <math>\ s < -\tfrac{1}{\xi}\ </math> in the case <math>\ \xi < 0 ~.</math> The density is zero outside of the relevant range. In the case <math>\ \xi = 0\ ,</math> the density is positive on the whole real line.
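
As a rough numerical check of the density, the sketch below evaluates it and integrates it with a midpoint rule. The names `gev_pdf` and `integrate` are hypothetical helpers for this illustration, and the integration limits are assumptions chosen so that the mass outside them is negligible for the two shapes tested.

```python
import math


def gev_pdf(s: float, xi: float) -> float:
    """Standardized GEV density f(s; xi)."""
    if xi == 0.0:
        return math.exp(-s - math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0  # outside the support
    return t ** (-(1.0 + 1.0 / xi)) * math.exp(-(t ** (-1.0 / xi)))


def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Composite midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# xi = -0.5: the support ends at the upper end-point s = -1/xi = 2.
print(integrate(lambda s: gev_pdf(s, -0.5), -10.0, 2.0))
# xi = 0: positive on the whole real line, with light tails on both sides.
print(integrate(lambda s: gev_pdf(s, 0.0), -10.0, 40.0))
```

Both integrals should come out very close to 1. For <math>\ \xi > 0\ </math> the upper tail decays only polynomially, so a much wider range would be needed for the same accuracy.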

Since the cumulative distribution function is invertible, the quantile function for the GEV distribution has an explicit expression, namely

:<math display="block">Q(p; \mu, \sigma, \xi) =

\begin{cases}

\mu - \sigma\ \ln ( -\ln p ) ~~ & \text{for } ~~ \xi = 0 ~~ \text{ and } ~~ p \in (0, 1)\ , \\

\mu + \dfrac{\sigma}{\xi} \Big(\ ( -\ln p)^{-\xi} - 1\ \Big) ~~ & \text{for } ~~ \xi > 0 ~~ \text{ and } ~~ p \in [0, 1)\ , ~~ \text{ or for } ~~ \xi < 0 ~~ \text{ and } ~~ p \in (0, 1] ;

\end{cases}</math>

and therefore the quantile density function <math>\ q = \tfrac{\mathrm{d}Q}{\mathrm{d}p}\ </math> is

:<math>q(p; \sigma, \xi) = \frac{\sigma}{(-\ln p)^{\xi + 1} \,p} \qquad \text{for } p \in (0, 1)\ ,</math>

valid for <math>\ \sigma > 0\ </math> and for any real <math>\ \xi ~.</math>
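
The explicit quantile function makes inverse-transform sampling straightforward. The following sketch is one possible implementation; the function names are hypothetical, and the quantile is evaluated only for <math>\ p\ </math> strictly inside <math>\ (0, 1) ~.</math>

```python
import math
import random


def gev_quantile(p: float, mu: float, sigma: float, xi: float) -> float:
    """Quantile function Q(p; mu, sigma, xi) for 0 < p < 1."""
    y = -math.log(p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


def gev_quantile_density(p: float, sigma: float, xi: float) -> float:
    """Quantile density q(p) = dQ/dp = sigma / ((-ln p)^(xi + 1) * p)."""
    return sigma / ((-math.log(p)) ** (xi + 1.0) * p)


def sample_gev(mu: float, sigma: float, xi: float, rng: random.Random) -> float:
    """Inverse-transform sampling: apply Q to a uniform draw from (0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, where Q is not evaluated here
        u = rng.random()
    return gev_quantile(u, mu, sigma, xi)


rng = random.Random(0)
draws = [sample_gev(3.0, 2.0, 0.2, rng) for _ in range(5)]
print(draws)
```

Because <math>\ F(0; \xi) = \mathrm{e}^{-1}\ ,</math> the quantile at <math>\ p = \mathrm{e}^{-1}\ </math> equals <math>\ \mu\ </math> for every shape parameter, which gives a quick sanity check on any implementation.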

Example of probability density functions for distributions of the GEV family.

Fitted GEV probability distribution to monthly maximum one-day rainfalls in October, Surinam.

However, the resulting shape parameters have been found to lie in the range leading to undefined means and variances, which underlines the fact that reliable data analysis is often impossible.
In hydrology the GEV distribution is applied to extreme events such as annual maximum one-day rainfalls and river discharges. The figure illustrates an example of fitting the GEV distribution to ranked annual maximum one-day rainfalls, also showing the 90% confidence belt based on the binomial distribution. The rainfall data are represented by plotting positions as part of the cumulative frequency analysis.

Prediction

It is often of interest to predict probabilities of out-of-sample data under the assumption that both the training data and the out-of-sample data follow a GEV distribution.
Predictions of probabilities generated by substituting maximum likelihood estimates of the GEV parameters into the cumulative distribution function ignore parameter uncertainty. As a result, the probabilities are not well calibrated, do not reflect the frequencies of out-of-sample events, and, in particular, underestimate the probabilities of out-of-sample tail events.
Predictions generated using the objective Bayesian approach of calibrating prior prediction have been shown to greatly reduce this underestimation, although not completely eliminate it.

Example for Normally distributed variables

Let <math>\ \left\{\ X_i\ \big|\ 1 \le i \le n\ \right\}\ </math> be i.i.d. normally distributed random variables with mean 0 and variance 1.

The Fisher–Tippett–Gnedenko theorem tells us that, for large <math>\ n\ ,</math> approximately

<math>\ \max \{\ X_i\ \big|\ 1 \le i \le n\ \} \sim GEV(\mu_n, \sigma_n, 0)\ ,</math>

where

<math>

\begin{align}

\mu_n &= \Phi^{-1}\left( 1 - \frac{\ 1\ }{ n } \right) \\

\sigma_n &= \Phi^{-1}\left( 1 - \frac{ 1 }{\ n\ \mathrm{e}\ } \right)- \Phi^{-1}\left(1-\frac{\ 1\ }{ n } \right) ~.

\end{align}

</math>

This allows us to estimate, for example, the mean of <math>\ \max \{\ X_i\ \big|\ 1 \le i \le n\ \}\ </math> from the mean of the GEV distribution:

<math>

\begin{align}

\operatorname{\mathbb E}\left\{\ \max\left\{\ X_i\ \big|\ 1 \le i \le n\ \right\}\ \right\}

& \approx \mu_n + \gamma_{\mathsf E}\ \sigma_n \\

&= (1 - \gamma_{\mathsf E})\ \Phi^{-1}\left( 1 - \frac{\ 1\ }{ n } \right) + \gamma_{\mathsf E}\ \Phi^{-1}\left( 1 - \frac{1}{\ e\ n\ } \right) \\

&= \sqrt{\log \left(\frac{ n^2 }{\ 2 \pi\ \log \left(\frac{n^2}{2\pi} \right)\ }\right) ~}\ \cdot\ \left(1 + \frac{ \gamma_{\mathsf E} }{\ \log n\ } + o \left(\frac{ 1 }{\ \log n\ } \right) \right)\ ,

\end{align}

</math>

where <math>\ \gamma_{\mathsf E}\ </math> is the Euler–Mascheroni constant.
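
To get a feel for how good this approximation is at moderate <math>\ n\ ,</math> the sketch below computes <math>\ \mu_n + \gamma_{\mathsf E}\ \sigma_n\ </math> and compares it with the exact expected maximum, obtained by numerically integrating <math>\ x\, n\, \varphi(x)\, \Phi(x)^{n-1} ~.</math> The function names, the integration range and the step count are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant
STD_NORMAL = NormalDist()         # mean 0, variance 1


def gumbel_norming_constants(n: int) -> tuple[float, float]:
    """Return (mu_n, sigma_n) as defined in the text."""
    mu_n = STD_NORMAL.inv_cdf(1.0 - 1.0 / n)
    sigma_n = STD_NORMAL.inv_cdf(1.0 - 1.0 / (n * math.e)) - mu_n
    return mu_n, sigma_n


def approx_mean_max(n: int) -> float:
    """GEV-based approximation mu_n + gamma_E * sigma_n of E[max]."""
    mu_n, sigma_n = gumbel_norming_constants(n)
    return mu_n + EULER_GAMMA * sigma_n


def exact_mean_max(n: int, lo: float = -10.0, hi: float = 10.0,
                   steps: int = 20_000) -> float:
    """E[max] as the integral of x * n * phi(x) * Phi(x)^(n-1), midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += x * n * STD_NORMAL.pdf(x) * STD_NORMAL.cdf(x) ** (n - 1)
    return total * h


for n in (10, 100, 1000):
    print(n, approx_mean_max(n), exact_mean_max(n))
```

Convergence of normal maxima to the Gumbel limit is known to be slow, so a visible gap between the two columns at these sample sizes is to be expected.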

Related distributions

If <math>\ X \sim \textrm{GEV}(\mu,\,\sigma,\,\xi)\ </math> then, for <math>\ m > 0\ ,</math> <math>\ m X + b \sim \textrm{GEV}(m \mu+b,\ m\sigma,\ \xi)\ </math>
If <math>\ X \sim \textrm{Gumbel}(\mu,\ \sigma)\ </math> (Gumbel distribution) then <math>\ X \sim \textrm{GEV}(\mu,\,\sigma,\,0)\ </math>
If <math>\ X \sim \textrm{Weibull}(\sigma,\,\mu)\ </math> (Weibull distribution) then <math>\ \mu\left(1-\sigma\log \tfrac{X}{\sigma} \right) \sim \textrm{GEV}(\mu,\,\sigma,\,0)\ </math>
If <math>\ X \sim \textrm{GEV}(\mu,\,\sigma,\,0)\ </math> then <math>\ \sigma \exp (-\tfrac{X-\mu}{\mu \sigma} ) \sim \textrm{Weibull}(\sigma,\,\mu)\ </math> (Weibull distribution)
If <math>\ X \sim \textrm{Exponential}(1)\ </math> (Exponential distribution) then <math>\ \mu - \sigma \log X \sim \textrm{GEV}(\mu,\,\sigma,\,0)\ </math>
If <math>\ X \sim \mathrm{Gumbel}(\alpha_X, \beta)\ </math> and <math>\ Y \sim \mathrm{Gumbel}(\alpha_Y, \beta)\ </math> then <math>\ X-Y \sim \mathrm{Logistic}(\alpha_X-\alpha_Y,\beta)\ </math> (see Logistic distribution).
If <math>\ X\ </math> and <math>\ Y \sim \mathrm{Gumbel}(\alpha, \beta)\ </math> then <math>\ X+Y \nsim \mathrm{Logistic}(2 \alpha,\beta)\ </math> (The sum is not a logistic distribution).

:: Note that <math>\ \operatorname{\mathbb E}\{\ X + Y\ \} = 2\alpha+2\beta\gamma \neq 2\alpha = \operatorname{\mathbb E}\left\{\ \operatorname{Logistic}(2 \alpha,\beta)\ \right\} ~.</math>
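
The two Gumbel statements above can be checked by simulation. The sketch below draws Gumbel variates through the exponential relation listed earlier, then compares the empirical distribution of a difference with the logistic CDF and the sample mean of a sum with <math>\ 2\alpha + 2\beta\gamma ~.</math> The parameter values, seed, sample size and helper names are arbitrary choices for this illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def sample_gumbel(alpha: float, beta: float, rng: random.Random) -> float:
    """Gumbel(alpha, beta) draw as alpha - beta * ln(E), with E ~ Exponential(1)."""
    return alpha - beta * math.log(rng.expovariate(1.0))


def logistic_cdf(d: float, loc: float, scale: float) -> float:
    return 1.0 / (1.0 + math.exp(-(d - loc) / scale))


def ecdf(data: list[float], d: float) -> float:
    """Empirical CDF of data evaluated at d."""
    return sum(v <= d for v in data) / len(data)


rng = random.Random(0)
N = 100_000
alpha_x, alpha_y, beta = 1.0, -0.5, 1.5

diffs = [sample_gumbel(alpha_x, beta, rng) - sample_gumbel(alpha_y, beta, rng)
         for _ in range(N)]
sums = [sample_gumbel(alpha_x, beta, rng) + sample_gumbel(alpha_x, beta, rng)
        for _ in range(N)]

# Largest gap between the empirical CDF of X - Y and the logistic CDF.
max_gap = max(abs(ecdf(diffs, d) - logistic_cdf(d, alpha_x - alpha_y, beta))
              for d in (-2.0, 0.0, 1.5, 3.0, 5.0))
mean_sum = sum(sums) / N

print(max_gap)
print(mean_sum, 2 * alpha_x + 2 * beta * EULER_GAMMA)
```

A small gap supports the logistic claim for the difference, while the mean of the sum sits near <math>\ 2\alpha + 2\beta\gamma\ </math> rather than <math>\ 2\alpha ~.</math>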

Proofs

4. Let <math>\ X \sim \textrm{Weibull}(\sigma,\,\mu)\ ,</math> then the cumulative distribution of <math>\ g(X) = \mu\left(1-\sigma\log\frac{X}{\sigma} \right)\ </math> is:

:<math>

\begin{align}

\operatorname{\mathbb P}\left\{\ \mu \left(1-\sigma\log\frac{\ X\ }{ \sigma } \right) < x\ \right\} &= \operatorname{\mathbb P}\left\{\ \log\frac{X}{\sigma} > \frac{1 - x/\mu}{\sigma}\ \right\} \\ {} \\

& \mathsf{\ Since\ the\ logarithm\ is\ always\ increasing:\ } \\ {} \\

&= \operatorname{\mathbb P}\left\{\ X > \sigma \exp\left[ \frac{1 - x/\mu}{\sigma} \right]\ \right\} \\

&= \exp\left( - \left(\cancel{\sigma} \exp\left[ \frac{1 - x/\mu}{\sigma} \right] \cdot \cancel{\frac{1}{\sigma}} \right)^\mu \right) \\

&= \exp\left( - \left( \exp\left[ \frac{\cancelto{\mu}{1} - x/\cancel{\mu}}{\sigma} \right] \right)^\cancel{\mu} \right) \\

&= \exp\left( - \exp\left[ \frac{\mu - x}{\sigma} \right] \right) \\

&= \exp\left( - \exp\left[ - s \right] \right), \quad s = \frac{x - \mu}{\sigma}\ ,

\end{align}

</math>

:which is the cdf of <math>\ \textrm{GEV}(\mu,\,\sigma,\,0) ~.</math>

5. Let <math>\ X \sim \textrm{Exponential}(1)\ ,</math> then the cumulative distribution of <math>\ g(X) = \mu - \sigma \log X\ </math> is:

:<math>

\begin{align}

\operatorname{\mathbb P}\left\{\ \mu - \sigma \log X < x\ \right\} &= \operatorname{\mathbb P}\left\{\ \log X > \frac{\mu - x}{\sigma}\ \right\} \\ {} \\

& \mathsf{\ Since\ the\ logarithm\ is\ always\ increasing:\ } \\ {} \\

&= \operatorname{\mathbb P}\left\{\ X > \exp\left( \frac{\ \mu - x\ }{ \sigma } \right)\ \right\} \\

&= \exp\left[- \exp\left( \frac{\ \mu - x\ }{ \sigma } \right) \right] \\

&= \exp\left[- \exp(-s) \right]\ , \quad ~\mathsf{ where }~ \quad s \equiv \frac{x - \mu}{\sigma}\ ;

\end{align}

</math>

:which is the cumulative distribution of <math>\ \operatorname{GEV}(\mu, \sigma, 0) ~.</math>
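
Both derivations can be sanity-checked by Monte Carlo. The sketch below transforms Weibull and exponential draws as in proofs 4 and 5 and compares the empirical CDFs with <math>\ \exp(-\exp(-s)) ~.</math> It assumes that <math>\ \textrm{Weibull}(\sigma, \mu)\ </math> has scale <math>\ \sigma\ </math> and shape <math>\ \mu\ ,</math> as the survival function used in proof 4 implies; the parameter values, seed and helper names are illustrative only.

```python
import math
import random


def gumbel_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of GEV(mu, sigma, 0): exp(-exp(-s)) with s = (x - mu) / sigma."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def ecdf(data: list[float], x: float) -> float:
    """Empirical CDF of data evaluated at x."""
    return sum(v <= x for v in data) / len(data)


rng = random.Random(1)
N = 100_000
mu, sigma = 2.0, 0.5

# Proof 4: X ~ Weibull with scale sigma and shape mu.
# random.weibullvariate takes (scale, shape) in that order.
from_weibull = [
    mu * (1.0 - sigma * math.log(rng.weibullvariate(sigma, mu) / sigma))
    for _ in range(N)
]

# Proof 5: X ~ Exponential(1).
from_exponential = [mu - sigma * math.log(rng.expovariate(1.0)) for _ in range(N)]

points = (1.5, 2.0, 2.5, 3.5)
gap_weibull = max(abs(ecdf(from_weibull, x) - gumbel_cdf(x, mu, sigma)) for x in points)
gap_exponential = max(abs(ecdf(from_exponential, x) - gumbel_cdf(x, mu, sigma)) for x in points)

print(gap_weibull, gap_exponential)
```

With this sample size both gaps should be of the order of a few thousandths, which is the size of the sampling noise.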

References
