In the theory of evolution and natural selection, the Price equation (also known as Price's equation or Price's theorem) describes how a "characteristic" of a population changes in frequency over time as the result of reproduction and natural selection. A characteristic may be a physical or behavioral trait (phenotype) or a particular genetic makeup (allele).

The equation uses the covariance between a characteristic and fitness to give a mathematical description of evolution and natural selection. It provides a way to understand the effects that gene transmission and natural selection have on the frequency of alleles and/or phenotypes within each new generation of a population. The Price equation was derived by George R. Price, working in London to re-derive W.D. Hamilton's work on kin selection.

Examples of the Price equation have been constructed for various evolutionary cases. For example, Collins and Gardner use the Price equation to partition the total change in toxin resistance in microbial communities into evolutionary change, ecological change and physiological change. Ellner et al. use the Price equation to disentangle "ecological impacts of evolution vs. non-heritable trait change", using examples from data on birds, fish and zooplankton. The Price equation also has applications in economics.

Statement

Figure: Example for a characteristic under positive selection.

The Price equation shows that a change in the average amount <math>z</math> of a characteristic in a population from one generation to the next (<math>\Delta z</math>) is determined by the covariance between the amounts <math>z_i</math> of the characteristic for subpopulation <math>i</math> and the fitnesses <math>w_i</math> of the subpopulations, together with the expected change in the characteristic due to fitness, namely <math>\mathrm{E}(w_i \Delta z_i)</math>:

:<math>\Delta{z} = \frac{1}{w}\operatorname{cov}(w_i, z_i) + \frac{1}{w}\operatorname{E}(w_i\,\Delta z_i).</math>

Here <math>w</math> is the average fitness over the population, and <math>\operatorname{E}</math> and <math>\operatorname{cov}</math> represent the population mean and covariance respectively. The fitness <math>w</math> is the ratio of the number of offspring of the whole population to the number of adult individuals in the population (that is, the average number of offspring per adult), and <math>w_i</math> is the same ratio for subpopulation <math>i</math> alone.

If the covariance between fitness (<math>w_i</math>) and the characteristic (<math>z_i</math>) is positive, the characteristic is expected to rise on average across the population. If the covariance is negative, the characteristic is harmful, and its frequency is expected to drop.

By noting that the covariance is the product of the correlation and the two standard deviations (<math> \operatorname{cov}(X,Y) = r_{X,Y} \sigma_X \sigma_Y </math>) we can rewrite this as

:<math>\frac{\Delta z}{\sigma_z} = \frac{r_{w,z}}{w/\sigma_w} + \frac{1}{w/\sigma_w} \operatorname{E} \left ( \frac{w_i}{\sigma_w} \frac{\Delta z_i}{\sigma_z} \right) </math>

:<math>\frac{\Delta z}{\sigma_z} = CV_w r_{w,z} + CV_w \operatorname{E} \left ( \frac{w_i}{\sigma_w} \frac{\Delta z_i}{\sigma_z} \right) </math>

where <math> CV_w = \frac{\sigma_w}{w} </math> is the coefficient of variation of the fitness. That is, we can rewrite the quantities in standardized form.

We see that if the correlation between the characteristic and fitness is held constant, greater relative variation in fitness (a larger <math>CV_w</math>) increases the magnitude of the response to selection.
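The standardized form can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `mean` and `both_sides_standardized` and the three-subpopulation data are illustrative assumptions. It evaluates the left side from the unstandardized equation and the right side from the <math>CV_w</math> form, and compares them.

```python
from math import sqrt, isclose

def mean(q, x):
    """Population mean E(x_i) = sum_i q_i * x_i for frequencies q_i."""
    return sum(qi * xi for qi, xi in zip(q, x))

def both_sides_standardized(q, w_i, z_i, dz_i):
    """Return (lhs, rhs) of the standardized Price equation."""
    w, z = mean(q, w_i), mean(q, z_i)
    sd_w = sqrt(mean(q, [(a - w) ** 2 for a in w_i]))
    sd_z = sqrt(mean(q, [(b - z) ** 2 for b in z_i]))
    cov = mean(q, [a * b for a, b in zip(w_i, z_i)]) - w * z
    r = cov / (sd_w * sd_z)   # correlation r_{w,z}
    cv_w = sd_w / w           # coefficient of variation of fitness
    # Left side: Delta z from the unstandardized equation, in units of sigma_z.
    delta_z = (cov + mean(q, [a * d for a, d in zip(w_i, dz_i)])) / w
    lhs = delta_z / sd_z
    # Right side: CV_w * r + CV_w * E((w_i / sigma_w) * (dz_i / sigma_z)).
    scaled = [(a / sd_w) * (d / sd_z) for a, d in zip(w_i, dz_i)]
    rhs = cv_w * r + cv_w * mean(q, scaled)
    return lhs, rhs

# Made-up example data (an assumption for illustration): three subpopulations.
q = [0.5, 0.3, 0.2]        # frequencies q_i
w_i = [1.0, 2.0, 3.0]      # fitnesses
z_i = [1.0, 2.0, 4.0]      # characteristic values
dz_i = [0.1, -0.2, 0.0]    # within-type changes Delta z_i
lhs, rhs = both_sides_standardized(q, w_i, z_i, dz_i)
print(isclose(lhs, rhs))
```

The check only makes sense when both standard deviations are nonzero; with no variation in fitness or in the characteristic, the correlation is undefined.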

The second term, <math>\mathrm{E}(w_i \Delta z_i)</math>, represents the portion of <math>\Delta z</math> due to all factors other than direct selection which can affect the evolution of the characteristic. This term can encompass genetic drift, mutation bias, or meiotic drive. Additionally, this term can encompass the effects of multi-level selection or group selection.

Proof

As with any concept, the more general its application, the more difficult it usually is to understand. The Price equation is no exception, and its proof is rather extended and detailed. There are a number of equations in evolutionary biology which are special cases of the Price equation and are more easily understood, such as the Robertson-Price identity (Simple Price equation), the breeder's equation, and Fisher's fundamental theorem of natural selection.

The Price equation relies on a simplified description of the properties of evolutionary systems.

All these properties can be summarized in a diagram representing two generations, in which the ancestors are parents and the children are the descendants.

Figure: This diagram illustrates the basic properties of evolutionary systems used in some derivations of the Price equation. In this example, the characteristic of interest is leaf length (measured in cm). This characteristic changes over time. One source of change is that the parent with larger leaves produces more children than the parent with smaller leaves.

This simplified description of evolution can be formalized by studying four equal-length lists of real numbers: the abundance of individuals of type <math>i</math> in the past, <math>n_i</math>; the characteristic of individuals of type <math>i</math> in the past, <math>z_i</math>; the abundance of individuals of type <math>i</math> in the present, <math>n_i'</math>; and the characteristic of individuals of type <math>i</math> in the present, <math>z_i'</math>. From these we may define <math>w_i=n_i'/n_i</math>. <math>n_i</math> and <math>z_i</math> will be called the parent population numbers and characteristics associated with each index <math>i</math>. Likewise <math>n_i'</math> and <math>z_i'</math> will be called the child population numbers and characteristics, and <math>w_i</math> will be called the fitness associated with index <math>i</math>. (Equivalently, we could have been given <math>n_i</math>, <math>z_i</math>, <math>w_i</math>, <math>z_i'</math> with <math>n_i'=w_i n_i</math>.) Define the parent and child population totals:

:{| cellspacing="20"

|-

| <math>n\;\stackrel{\mathrm{def}}{=}\;\sum_i n_i</math> || <math>n'\;\stackrel{\mathrm{def}}{=}\;\sum_i n_i'</math>

|}

and the probabilities (or frequencies):

:{|cellspacing=20

|-

| <math>q_i\;\stackrel{\mathrm{def}}{=}\;n_i/n</math> || <math>q_i'\;\stackrel{\mathrm{def}}{=}\;n_i'/n'</math>

|}

Note that these are of the form of probability mass functions in that <math>\sum_i q_i = \sum_i q_i' = 1</math> and are in fact the probabilities that a random individual drawn from the parent or child population has a characteristic <math>z_i</math>. Define the fitnesses:

:<math>w_i\;\stackrel{\mathrm{def}}{=}\;n_i'/n_i</math>

The average of any list <math>x_i</math> is given by:

:<math>E(x_i)=\sum_i q_i x_i</math>

so the average characteristics are defined as:

:{|cellspacing=20

|-

|<math>z\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i z_i</math> || <math>z'\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i' z_i'</math>

|}

and the average fitness is:

:<math>w\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i w_i</math>

A simple theorem can be proved:

<math>q_i w_i = \left(\frac{n_i}{n}\right)\left(\frac{n_i'}{n_i}\right) = \left(\frac{n_i'}{n'}\right) \left(\frac{n'}{n}\right)=q_i'\left(\frac{n'}{n}\right)</math>

so that:

:<math>w=\frac{n'}{n}\sum_i q_i' = \frac{n'}{n}</math>

and

:<math>q_i w_i = w\,q_i'</math>

The covariance of <math>w_i</math> and <math>z_i</math> is defined by:

:<math>\operatorname{cov}(w_i,z_i)\;\stackrel{\mathrm{def}}{=}\;E(w_i z_i)-E(w_i)E(z_i) = \sum_i q_i w_i z_i - w z</math>

Defining <math>\Delta z_i \;\stackrel{\mathrm{def}}{=}\; z_i'-z_i</math>, the expectation value of <math>w_i \Delta z_i</math> is

:<math>E(w_i \Delta z_i) = \sum_i q_i w_i (z_i'-z_i) = \sum_i q_i w_i z_i' - \sum_i q_i w_i z_i</math>

The sum of the two terms is:

:<math>\operatorname{cov}(w_i,z_i)+E(w_i \Delta z_i) = \sum_i q_i w_i z_i - w z + \sum_i q_i w_i z_i' - \sum_i q_i w_i z_i = \sum_i q_i w_i z_i' - w z </math>

Using the above mentioned simple theorem, the sum becomes

:<math>\operatorname{cov}(w_i,z_i)+E(w_i \Delta z_i) = w\sum_i q_i' z_i' - w z = w z'-wz = w\Delta z</math>

where

:<math>\Delta z\;\stackrel{\mathrm{def}}{=}\;z'-z</math>.
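The steps of the proof can be traced on concrete numbers. The sketch below is an illustration under stated assumptions: the function name `price_terms` and the sample lists are made up, and exact rational arithmetic is used so that the identities hold without rounding. It builds <math>q_i</math>, <math>q_i'</math> and <math>w_i</math> from the four lists, checks the "simple theorem", and returns the quantities that appear in the final identity.

```python
from fractions import Fraction as F

def price_terms(n, z, n_child, z_child):
    """Return (w, delta_z, cov, transmission) for one parent-child step."""
    n_tot, n_child_tot = sum(n), sum(n_child)
    q = [F(a, n_tot) for a in n]                      # q_i  = n_i / n
    q_child = [F(a, n_child_tot) for a in n_child]    # q_i' = n_i' / n'
    w_i = [F(b, a) for a, b in zip(n, n_child)]       # w_i  = n_i' / n_i
    w = sum(qi * wi for qi, wi in zip(q, w_i))        # average fitness
    # The "simple theorem": w = n'/n and q_i w_i = w q_i'.
    assert w == F(n_child_tot, n_tot)
    assert all(qi * wi == w * qc for qi, wi, qc in zip(q, w_i, q_child))
    z_bar = sum(qi * zi for qi, zi in zip(q, z))
    z_bar_child = sum(qc * zc for qc, zc in zip(q_child, z_child))
    cov = sum(qi * wi * zi for qi, wi, zi in zip(q, w_i, z)) - w * z_bar
    transmission = sum(qi * wi * (zc - zi)
                       for qi, wi, zi, zc in zip(q, w_i, z, z_child))
    return w, z_bar_child - z_bar, cov, transmission

# Made-up integer abundances and characteristics (assumed for illustration).
n, z = [10, 20, 30], [1, 2, 3]
n_child, z_child = [10, 30, 60], [F(3, 2), 2, F(5, 2)]
w, delta_z, cov, transmission = price_terms(n, z, n_child, z_child)
print(cov + transmission == w * delta_z)
```

Abundances are assumed to be positive integers here, since <math>w_i = n_i'/n_i</math> is undefined for a type with no parents.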

Derivation of the continuous-time Price equation

Consider a set of subgroups with populations of size <math>x_1, x_2, \dots, x_n</math>. Suppose they each grow exponentially over time with their own rates, <math>r_1, r_2, \dots, r_n</math>:<math display="block">\dot x_i\;\overset{\underset{\mathrm{def}}{}}{=}\;\frac{d x_i}{dt} = r_i x_i.</math>Let <math>f_i = x_i / \sum_{k=1}^n x_k</math> be the fraction of the total population in subgroup <math>i</math>, let <math>z_i</math> be the value of the characteristic in subgroup <math>i</math>, and let <math>\mathbb E[z] = \sum_{i=1}^n f_i z_i</math> be its population average. By the product rule, the average changes as<math display="block">\begin{align}
\frac{d}{dt} \mathbb E[z] &= \sum_{i=1}^n (\dot f_i z_i + f_i \dot z_i),\\
&= \left(\sum_{i=1}^n \dot f_i z_i\right) + \left(\sum_{i=1}^n f_i \dot z_i\right).
\end{align}</math>The second term is just <math>\mathbb E[\dot z]</math>, the expected change in the characteristic <math>z_i</math> per group <math>i</math>, averaged across groups by their current sizes. It accounts for <math>z</math> varying within subgroups but not the variation in <math>z</math> due to some subgroups growing faster than others. The first term accounts for this selection pressure. To calculate the first term, we can use the chain rule:<math display="block">\dot f_i = \sum_{j=1}^n \frac{d x_j}{dt} \frac{\partial f_i}{\partial x_j} = \sum_{j=1}^n (r_jx_j) \frac{\partial f_i}{\partial x_j}.</math>We can derive that<math display="block">\frac{\partial f_i}{\partial x_j} = \begin{cases}

(1 - f_i) / \sum_{k=1}^n x_k & \text{if }i=j,\\

-f_i / \sum_{k=1}^n x_k & \text{otherwise},

\end{cases}</math>so that<math display="block">\begin{align}

\dot f_i &= \frac{(r_ix_i) - f_i \sum_{j=1}^n (r_j x_j)}{\sum_{k=1}^n x_k} \\

&= f_ir_i - f_i\sum_{j=1}^n f_j r_j \\

&= f_i(r_i - \mathbb E[r]).

\end{align}</math>Therefore the first term in <math>\frac{d}{dt}\mathbb E[z]</math> equals<math display="block">\begin{align}

\sum_{i=1}^n \dot f_i z_i &= \sum_{i=1}^n f_i(r_i - \mathbb E[r])z_i \\

&= \mathbb E\left[(r_i - \mathbb E[r])z_i\right]\\

&= \mathrm{Cov}[r, z].

\end{align}</math>Putting the two components together, we arrive at the continuous-time Price equation:<math display="block">\frac{d}{dt} \mathbb{E}[z] = \underbrace{\text{Cov}[r, z]}_{\text{Selection effect}} + \underbrace{\mathbb{E}[\dot{z}]}_{\text{Dynamic effect}}.</math>When the second term is zero, we get the replicator equation, which is the continuous analogue of the simple Price equation below.
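The continuous-time result can be compared against a finite-difference derivative. The sketch below makes assumptions that are not in the derivation itself: each characteristic drifts linearly in time (<math>z_i(t) = z_i(0) + c_i t</math>), and the function names and parameter values are made up for illustration.

```python
from math import exp, isclose

def state(t, x0, r, z0, zdot):
    """Frequencies f_i and characteristics z_i at time t."""
    x = [a * exp(ri * t) for a, ri in zip(x0, r)]   # x_i(t) = x_i(0) e^{r_i t}
    total = sum(x)
    f = [xi / total for xi in x]
    z = [zi + ci * t for zi, ci in zip(z0, zdot)]   # assumed linear drift in z_i
    return f, z

def mean_z(t, x0, r, z0, zdot):
    """Population average E[z] at time t."""
    f, z = state(t, x0, r, z0, zdot)
    return sum(fi * zi for fi, zi in zip(f, z))

def price_rhs(t, x0, r, z0, zdot):
    """Cov[r, z] + E[z_dot], the right side of the continuous-time equation."""
    f, z = state(t, x0, r, z0, zdot)
    mean_r = sum(fi * ri for fi, ri in zip(f, r))
    cov = sum(fi * (ri - mean_r) * zi for fi, ri, zi in zip(f, r, z))
    dynamic = sum(fi * ci for fi, ci in zip(f, zdot))
    return cov + dynamic

# Made-up parameters: initial sizes, growth rates, initial z_i, drift rates c_i.
params = ([5.0, 3.0, 2.0], [0.1, 0.3, -0.2], [1.0, 2.0, 4.0], [0.05, 0.0, -0.1])
t, h = 1.0, 1e-5
# Central finite difference approximating d/dt E[z].
numeric = (mean_z(t + h, *params) - mean_z(t - h, *params)) / (2 * h)
print(isclose(numeric, price_rhs(t, *params), rel_tol=1e-6, abs_tol=1e-9))
```

The agreement is only approximate because the left side is a numerical derivative; the tolerance reflects the step size `h`, not any inexactness in the equation.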

Simple Price equation (Robertson-Price identity)

When the characteristic values <math>z_i</math> do not change from the parent to the child generation, the second term in the Price equation becomes zero, resulting in a simplified version of the Price equation:

:<math>w\,\Delta z = \operatorname{cov}\left(w_i, z_i\right)</math>

which can be restated as:

:<math>\Delta z = \operatorname{cov}\left(v_i, z_i\right)</math>

where <math>v_i</math> is the fractional fitness: <math>v_i=w_i/w</math>.

This simple Price equation can be proven using the definition of the covariance given above. It makes this fundamental statement about evolution: "If a certain inheritable characteristic is correlated with an increase in fractional fitness, the average value of that characteristic in the child population will be increased over that in the parent population."
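A small worked case: if <math>z_i</math> is 1 for carriers of an allele and 0 otherwise, then <math>z</math> is the allele frequency and the simple Price equation gives its change in one generation. The sketch below is illustrative only; the function name `allele_frequency_change` and the numbers are assumptions, not taken from the text.

```python
from fractions import Fraction as F

def allele_frequency_change(n, carries_allele, w_i):
    """Return (direct change in mean z, cov(v_i, z_i)) for an unchanged 0/1 trait."""
    n_tot = sum(n)
    q = [F(a, n_tot) for a in n]
    z_i = [1 if c else 0 for c in carries_allele]
    w = sum(qi * wi for qi, wi in zip(q, w_i))
    v_i = [wi / w for wi in w_i]                      # fractional fitness v_i = w_i / w
    z_bar = sum(qi * zi for qi, zi in zip(q, z_i))
    # cov(v_i, z_i) = E(v_i z_i) - E(v_i) E(z_i); note that E(v_i) = 1.
    mean_v = sum(qi * vi for qi, vi in zip(q, v_i))
    cov = sum(qi * vi * zi for qi, vi, zi in zip(q, v_i, z_i)) - mean_v * z_bar
    # Direct computation from child abundances n_i' = w_i n_i.
    n_child = [wi * a for wi, a in zip(w_i, n)]
    z_bar_child = sum(nc * zi for nc, zi in zip(n_child, z_i)) / sum(n_child)
    return z_bar_child - z_bar, cov

# Made-up example: 60 non-carriers with fitness 1, 40 carriers with fitness 3/2.
direct, cov = allele_frequency_change([60, 40], [False, True], [F(1), F(3, 2)])
print(direct, cov)
```

In this example the allele frequency rises from 2/5 to 1/2, and both returned values equal 1/10.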

Applications

The Price equation can describe any system that changes over time, but is most often applied in evolutionary biology. The evolution of sight provides an example of simple directional selection. The evolution of sickle cell anemia shows how a heterozygote advantage can affect characteristic evolution. The Price equation can also be applied to population context dependent characteristics such as the evolution of sex ratios. Additionally, the Price equation is flexible enough to model second order characteristics such as the evolution of mutability. The Price equation also provides an extension to the founder effect, which shows change in population characteristics in different settlements.

Dynamical sufficiency and the simple Price equation

Sometimes the genetic model being used encodes enough information into the parameters used by the Price equation to allow the calculation of the parameters for all subsequent generations. This property is referred to as dynamical sufficiency. For simplicity, the following looks at dynamical sufficiency for the simple Price equation, but is also valid for the full Price equation.

Using the definition of the covariance, and writing <math>\langle\cdot\rangle</math> for the population mean, the simple Price equation for the character <math>z</math> can be written:

:<math>w(z' - z) = \langle w_i z_i \rangle - wz</math>

For the second generation:

:<math>w'(z'' - z') = \langle w'_i z'_i \rangle - w'z'</math>

The simple Price equation for <math>z</math> only gives us the value of <math>z'</math> for the first generation, but does not give us the values of <math>w'</math> and <math>\langle w'_iz'_i\rangle</math>, which are needed to calculate <math>z''</math> for the second generation. The quantities <math>w_i</math> and <math>w_iz_i</math> can both be thought of as characteristics of the first generation, so the Price equation can be used to calculate their averages as well:

:<math>\begin{align}

w(w' - w) &= \langle w_i^2\rangle - w^2 \\

w\left(\langle w'_i z'_i\rangle - \langle w_i z_i\rangle\right) &= \langle w_i ^2 z_i\rangle - w\langle w_i z_i\rangle

\end{align}</math>

The five 0-generation variables <math>w</math>, <math>z</math>, <math>\langle w_iz_i\rangle</math>, <math>\langle w_i^2\rangle</math>, and <math>\langle w_i^2z_i\rangle</math> must be known before proceeding to calculate the three first generation variables <math>w'</math>, <math>z'</math>, and <math>\langle w'_iz'_i\rangle</math>, which are needed to calculate <math>z''</math> for the second generation. It can be seen that in general the Price equation cannot be used to propagate forward in time unless there is a way of calculating the higher moments <math>\langle w_i^n\rangle</math> and <math>\langle w_i^nz_i\rangle</math> from the lower moments in a way that is independent of the generation. Dynamical sufficiency means that such equations can be found in the genetic model, allowing the Price equation to be used alone as a propagator of the dynamics of the model forward in time.
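The moment bookkeeping can be illustrated on a toy model. The sketch below assumes, as an illustration only, that each type keeps the same fitness and the same character in every generation (<math>w'_i = w_i</math>, <math>z'_i = z_i</math>); under that assumption the three moment equations hold, and the second-generation mean obtained from the five generation-0 moments can be compared with a direct two-step iteration. The function names and data are made up.

```python
from fractions import Fraction as F

def avg(q, x):
    """Population mean <x_i> = sum_i q_i * x_i."""
    return sum(qi * xi for qi, xi in zip(q, x))

def z2_from_moments(q, w_i, z_i):
    """Second-generation mean using only the five generation-0 moments."""
    w = avg(q, w_i)
    z = avg(q, z_i)
    wz = avg(q, [a * b for a, b in zip(w_i, z_i)])        # <w_i z_i>
    w2 = avg(q, [a * a for a in w_i])                     # <w_i^2>
    w2z = avg(q, [a * a * b for a, b in zip(w_i, z_i)])   # <w_i^2 z_i>
    z1 = z + (wz - w * z) / w        # w (z' - z) = <w z> - w z
    w1 = w + (w2 - w * w) / w        # w (w' - w) = <w^2> - w^2
    wz1 = wz + (w2z - w * wz) / w    # w (<w'z'> - <w z>) = <w^2 z> - w <w z>
    return z1 + (wz1 - w1 * z1) / w1 # w' (z'' - z') = <w'z'> - w' z'

def z2_by_iteration(q, w_i, z_i):
    """Second-generation mean by updating frequencies twice (assumed constant w_i)."""
    for _ in range(2):
        w = avg(q, w_i)
        q = [qi * wi / w for qi, wi in zip(q, w_i)]       # q_i' = q_i w_i / w
    return avg(q, z_i)

# Made-up frequencies, fitnesses and characters.
q = [F(1, 2), F(1, 3), F(1, 6)]
w_i, z_i = [1, 2, 3], [1, 2, 4]
print(z2_from_moments(q, w_i, z_i), z2_by_iteration(q, w_i, z_i))
```

Going one generation further by moments alone would require <math>\langle w_i^3\rangle</math> and <math>\langle w_i^3z_i\rangle</math>, which is the open-ended hierarchy the paragraph above describes.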

Full Price equation

The simple Price equation was based on the assumption that the characters <math>z_i</math> do not change over one generation. If it is assumed that they do change, with <math>z_i'</math> being the value of the character in the child population, then the full Price equation must be used. A change in character can come about in a number of ways. The following two examples illustrate two such possibilities, each of which introduces new insight into the Price equation.

Genotype fitness

We focus on the idea of the fitness of the genotype. The index <math>i</math> indicates the genotype and the number of type <math>i</math> genotypes in the child population is:

:<math>n'_i = \sum_j w_{ji}n_j\,</math>

which gives fitness:

:<math>w_i = \frac{n'_i}{n_i}</math>

Since the individual mutability <math>z_i</math> does not change, the average mutabilities will be:

:<math>\begin{align}

z &= \frac{1}{n}\sum_i z_i n_i \\

z' &= \frac{1}{n'}\sum_i z_i n'_i

\end{align}</math>

with these definitions, the simple Price equation now applies.
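The genotype-fitness bookkeeping can be sketched as follows. This is an illustration under assumptions: `W[j][i]` stands for the <math>w_{ji}</math> of the text, read here as the number of <math>i</math>-type children per <math>j</math>-type parent, and the function name `genotype_step` and all numbers are made up.

```python
from fractions import Fraction as F

def genotype_step(n, z_i, W):
    """Return (w * delta_z, cov(w_i, z_i)) with fitness defined per genotype."""
    k = len(n)
    # n'_i = sum_j w_ji n_j: children of genotype i, whatever their parents were.
    n_child = [sum(W[j][i] * n[j] for j in range(k)) for i in range(k)]
    w_i = [F(n_child[i], n[i]) for i in range(k)]         # w_i = n'_i / n_i
    n_tot, n_child_tot = sum(n), sum(n_child)
    w = F(n_child_tot, n_tot)                             # average fitness n'/n
    z_bar = F(sum(z_i[i] * n[i] for i in range(k)), n_tot)
    z_bar_child = F(sum(z_i[i] * n_child[i] for i in range(k)), n_child_tot)
    cov = sum(F(n[i], n_tot) * w_i[i] * z_i[i] for i in range(k)) - w * z_bar
    return w * (z_bar_child - z_bar), cov

# Made-up example: type 0 parents have 2 type-0 and 1 type-1 children each,
# type 1 parents have 3 type-1 children each.
lhs, rhs = genotype_step([10, 20], [1, 3], [[2, 1], [0, 3]])
print(lhs, rhs)
```

Both values equal 2/3 for this input, as the simple Price equation requires when <math>z_i</math> is attached to the genotype and does not change.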

Lineage fitness

In this case we want to look at the idea that fitness is measured by the number of children an organism has, regardless of their genotype. Note that we now have two methods of grouping, by lineage, and by genotype. It is this complication that will introduce the need for the full Price equation. The number of children an <math>i</math>-type organism has is:

:<math>n'_i = n_i\sum_j w_{ij}\,</math>

which gives fitness:

:<math>w_i = \frac{n'_i}{n_i} = \sum_j w_{ij}</math>

We now have characters in the child population which are the average character of the <math>i</math>-th parent.

:<math>z'_j = \frac{\sum_i n_i z_i w_{ij} }{\sum_i n_i w_{ij}}</math>

with global characters:

:<math>\begin{align}

z &= \frac{1}{n}\sum_i z_i n_i \\

z' &= \frac{1}{n'}\sum_i z_i n'_i

\end{align}</math>

with these definitions, the full Price equation now applies.

Criticism

The use of the change in average characteristic (<math>z'-z</math>) per generation as a measure of evolutionary progress is not always appropriate. There may be cases where the average remains unchanged (and the covariance between fitness and characteristic is zero) while evolution is nevertheless in progress. For example, if we have <math>z_i=(1,2,3)</math>, <math>n_i=(1,1,1)</math>, and <math>w_i=(1,4,1)</math>, then for the child population, <math>n_i'=(1,4,1)</math>, showing that the peak fitness at <math>w_2=4</math> is in fact fractionally increasing the population of individuals with <math>z_i=2</math>. However, the average characteristics are <math>z=2</math> and <math>z'=2</math>, so that <math>\Delta z=0</math>. The covariance <math>\mathrm{cov}(z_i,w_i)</math> is also zero. Since the characteristic values do not change, the simple Price equation applies here, and it yields <math>0=0</math>. In other words, it yields no information regarding the progress of evolution in this system.
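The numbers in this example can be reproduced directly. The sketch below uses the values given above; the function name `summarize` is an assumption, and the variance is computed only as an illustrative addition to show one quantity that does change while the mean stays fixed.

```python
from fractions import Fraction as F

def summarize(n, z_i):
    """Return (mean, variance) of the characteristic for abundances n."""
    n_tot = sum(n)
    mean = F(sum(a * b for a, b in zip(n, z_i)), n_tot)
    var = sum(F(a, n_tot) * (b - mean) ** 2 for a, b in zip(n, z_i))
    return mean, var

z_i = [1, 2, 3]
n = [1, 1, 1]
w_i = [1, 4, 1]
n_child = [w * a for w, a in zip(w_i, n)]           # n_i' = w_i n_i = (1, 4, 1)

mean_parent, var_parent = summarize(n, z_i)
mean_child, var_child = summarize(n_child, z_i)

w_bar = F(sum(n_child), sum(n))                     # average fitness n'/n
cov = sum(F(a, sum(n)) * w * z for a, w, z in zip(n, w_i, z_i)) - w_bar * mean_parent

print(mean_parent, mean_child, cov)                 # mean unchanged, covariance zero
print(var_parent, var_child)                        # the spread around the mean shrinks
```

The mean and covariance agree with the text (2, 2 and 0), while the variance falls from 2/3 to 1/3, which is the kind of change the equation for the mean alone does not register.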

A critical discussion of the use of the Price equation can be found in van&nbsp;Veelen (2005), van&nbsp;Veelen et al. (2012), and van&nbsp;Veelen (2020). Frank (2012) discusses the criticism in van&nbsp;Veelen et al. (2012).

Cultural references

Price's equation features in the plot and title of the 2008 thriller film WΔZ.

The Price equation also features in posters in the computer game BioShock 2, in which a consumer of a "Brain Boost" tonic is seen deriving the Price equation while simultaneously reading a book. The game is set in the 1950s, substantially before Price's work.

See also

  • The breeder's equation, which is a special case of the Price equation.
  • Fisher's fundamental theorem of natural selection, which is a special case of the Price equation.