</math>

|kurtosis =<math>\frac{1 - 6pq}{pq}</math>

|entropy =<math>-q\ln q - p\ln p</math>

|mgf =<math>q+pe^t</math>

|char =<math>q+pe^{it}</math>

|pgf =<math>q+pz</math>

|fisher =<math> \frac{1}{pq} </math>

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability <math>p</math> and the value 0 with probability <math>q = 1-p</math>. Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcomes that are Boolean-valued: a single bit whose value is success/yes/true/one with probability p and failure/no/false/zero with probability q. It can be used to represent a (possibly biased) coin toss where 1 and 0 would represent "heads" and "tails", respectively, and p would be the probability of the coin landing on heads (or vice versa where 1 would represent tails and p would be the probability of tails). In particular, unfair coins would have <math>p \neq 1/2.</math>

The Bernoulli distribution is a special case of the binomial distribution where a single trial is conducted (so n would be 1 for such a binomial distribution). It is also a special case of the two-point distribution, for which the possible outcomes need not be 0 and 1.

Properties

If <math>X</math> is a random variable with a Bernoulli distribution, then:

<math display="block">\begin{align}

\Pr(X{=}1) &= p, \\

\Pr(X{=}0) &= q =1 - p.

\end{align}</math>

The probability mass function <math>f</math> of this distribution, over possible outcomes k, is

<math display="block"> f(k;p) = \begin{cases}

p & \text{if }k=1, \\

q = 1-p & \text {if } k = 0.

\end{cases}</math>

This can also be expressed as

<math display="block">f(k;p) = p^k (1-p)^{1-k} \quad \text{for } k\in\{0,1\}</math>

or as

<math display="block">f(k;p)=pk+(1-p)(1-k) \quad \text{for } k\in\{0,1\}.</math>
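As a quick check that the three expressions above agree, the following minimal sketch evaluates each form for several values of <math>p</math>. The function names are illustrative and not part of any library; exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def pmf_cases(k: int, p: Fraction) -> Fraction:
    """Piecewise form: p if k == 1, q = 1 - p if k == 0."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    raise ValueError("k must be 0 or 1")


def pmf_power(k: int, p: Fraction) -> Fraction:
    """Product form: p^k (1 - p)^(1 - k)."""
    return p**k * (1 - p) ** (1 - k)


def pmf_linear(k: int, p: Fraction) -> Fraction:
    """Linear form: p k + (1 - p)(1 - k)."""
    return p * k + (1 - p) * (1 - k)


# Compare the three forms on the support {0, 1}, including the endpoints of p.
for p in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(9, 10), Fraction(1)):
    for k in (0, 1):
        assert pmf_cases(k, p) == pmf_power(k, p) == pmf_linear(k, p)
    # The probabilities of the two outcomes sum to one.
    assert pmf_cases(0, p) + pmf_cases(1, p) == 1
```

The forms coincide only on the support <math>k\in\{0,1\}</math>; outside it, the product and linear expressions no longer describe a probability.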

The Bernoulli distribution is a special case of the binomial distribution with <math>n = 1.</math>
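To see the reduction explicitly, one can substitute <math>n = 1</math> into the binomial probability mass function. The symbol <math>f_{\text{bin}}</math> is introduced here only for illustration:

<math display="block">f_{\text{bin}}(k;1,p) = \binom{1}{k} p^k (1-p)^{1-k} = p^k (1-p)^{1-k} \quad \text{for } k\in\{0,1\},</math>

since <math display="inline">\binom{1}{0} = \binom{1}{1} = 1</math>. This matches the product form of <math>f(k;p)</math> given above.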

The kurtosis goes to infinity as <math>p</math> approaches 0 or 1, but for <math>p=1/2</math> the two-point distributions, including the Bernoulli distribution, have a lower excess kurtosis, namely −2, than any other probability distribution.
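The behaviour of the excess kurtosis can be checked numerically. The sketch below computes it directly from the second and fourth central moments of the two-point distribution and compares the result with the closed form <math>(1-6pq)/(pq)</math>. The function names are illustrative, and <math>0 < p < 1</math> is assumed so that the variance is non-zero.

```python
from fractions import Fraction


def excess_kurtosis_from_moments(p: Fraction) -> Fraction:
    """Excess kurtosis from central moments; assumes 0 < p < 1."""
    q = 1 - p
    mean = p
    # Central moments of a variable that is 0 with probability q, 1 with probability p.
    var = q * (0 - mean) ** 2 + p * (1 - mean) ** 2
    fourth = q * (0 - mean) ** 4 + p * (1 - mean) ** 4
    return fourth / var**2 - 3


def excess_kurtosis_closed_form(p: Fraction) -> Fraction:
    """Closed form (1 - 6pq) / (pq); assumes 0 < p < 1."""
    q = 1 - p
    return (1 - 6 * p * q) / (p * q)


for p in (Fraction(1, 100), Fraction(1, 4), Fraction(1, 2), Fraction(99, 100)):
    assert excess_kurtosis_from_moments(p) == excess_kurtosis_closed_form(p)

print(excess_kurtosis_closed_form(Fraction(1, 2)))  # -2
```

Evaluating the closed form for values of <math>p</math> closer and closer to 0 or 1 shows the excess kurtosis growing without bound, while <math>p = 1/2</math> gives the minimum value of −2.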

The Bernoulli distributions for <math>0 \le p \le 1</math> form an exponential family.
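One way to make the exponential family structure visible, assuming <math>0 < p < 1</math> so that the logarithms are defined, is to rewrite the probability mass function as

<math display="block">f(k;p) = p^k (1-p)^{1-k} = \exp\left(k \ln\frac{p}{1-p} + \ln(1-p)\right) \quad \text{for } k\in\{0,1\}.</math>

In this form the sufficient statistic is <math>k</math> and the natural parameter is the log-odds <math display="inline">\eta = \ln\frac{p}{1-p}</math>, with <math display="inline">\ln(1-p) = -\ln\left(1+e^{\eta}\right)</math>. The symbol <math>\eta</math> is introduced here for illustration; the endpoints <math>p = 0</math> and <math>p = 1</math> are not covered by this particular rewriting.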

The maximum likelihood estimator of <math>p</math> based on a random sample is the sample mean.
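As an illustration, the sketch below draws a simulated sample and estimates <math>p</math> by the sample mean. The helper names, the seed, and the choice <math>p = 0.3</math> are assumptions made for the example only.

```python
import random


def sample_bernoulli(p: float, n: int, rng: random.Random) -> list[int]:
    """Draw n independent Bernoulli(p) outcomes as 0/1 integers."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def mle_p(sample: list[int]) -> float:
    """Maximum likelihood estimate of p: the fraction of ones in the sample."""
    if not sample:
        raise ValueError("sample must be non-empty")
    return sum(sample) / len(sample)


rng = random.Random(0)  # fixed seed so the run is reproducible
data = sample_bernoulli(0.3, 10_000, rng)
p_hat = mle_p(data)
print(f"estimated p = {p_hat:.4f}")
```

The estimate follows from maximizing the log-likelihood <math display="inline">\sum_i \left[x_i \ln p + (1-x_i)\ln(1-p)\right]</math> over <math>p</math>; setting its derivative to zero gives the sample mean.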

Figure: The probability mass function of a Bernoulli experiment along with its corresponding cumulative distribution function.

Mean

The expected value of a Bernoulli random variable <math>X</math> is

<math display="block">\operatorname{E}[X]=p</math>

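This is because for a Bernoulli distributed random variable <math>X</math> with <math>\Pr(X{=}1) = p</math> and <math display="inline">\Pr(X{=}0) = q</math> we find

<math display="block">\operatorname{E}[X] = \Pr(X{=}1)\cdot 1 + \Pr(X{=}0)\cdot 0 = p \cdot 1 + q \cdot 0 = p.</math>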

  • The geometric distribution models the number of independent and identical Bernoulli trials needed to get one success.
  • If <math display="inline">Y \sim \mathrm{Bernoulli}\left(\frac{1}{2}\right)</math>, then <math display="inline">2Y - 1</math> has a Rademacher distribution.
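The two relationships listed above can be sketched with a simple simulation. The function names are hypothetical; the geometric helper assumes <math>p > 0</math>, since otherwise no success ever occurs and the loop would not terminate.

```python
import random


def bernoulli(p: float, rng: random.Random) -> int:
    """Return 1 with probability p and 0 otherwise."""
    return 1 if rng.random() < p else 0


def trials_until_success(p: float, rng: random.Random) -> int:
    """Count Bernoulli(p) trials up to and including the first success.

    Assumes p > 0.
    """
    count = 1
    while bernoulli(p, rng) == 0:
        count += 1
    return count


def rademacher(rng: random.Random) -> int:
    """Map Y ~ Bernoulli(1/2) to 2Y - 1, which takes the values -1 and +1."""
    return 2 * bernoulli(0.5, rng) - 1


rng = random.Random(1)  # fixed seed so the run is reproducible
waits = [trials_until_success(0.25, rng) for _ in range(20_000)]
signs = [rademacher(rng) for _ in range(20_000)]

print(sum(waits) / len(waits))  # sample mean of the waiting time, near 1/p = 4
print(sum(signs) / len(signs))  # sample mean of the signs, near 0
```

The count returned by `trials_until_success` follows the geometric distribution on <math>\{1, 2, 3, \ldots\}</math>, and `rademacher` returns −1 or +1 with equal probability.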

See also

  • Bernoulli process, a random process consisting of a sequence of independent Bernoulli trials
  • Bernoulli sampling
  • Binary entropy function
  • Binary decision diagram

Author's mention

  • Interactive graphic: Univariate Distribution Relationships.