In statistical mechanics, a semi-classical derivation of entropy that does not take into account the indistinguishability of particles yields an expression for entropy which is not extensive (is not proportional to the amount of substance in question). This leads to a paradox known as the Gibbs paradox, after Josiah Willard Gibbs, who proposed this thought experiment in 1874‒1875. Two containers of an ideal gas sit side-by-side. The gas in container #1 is identical in every respect to the gas in container #2 (i.e. in volume, mass, temperature, pressure, etc). Accordingly, they have the same entropy S. Now a door in the container wall is opened to allow the gas particles to mix between the containers. No macroscopic changes occur, as the system is in equilibrium. But if the formula for entropy is not extensive, the entropy of the combined system will not be 2S. In fact, the particular non-extensive entropy quantity considered by Gibbs predicts additional entropy (more than 2S). Closing the door then reduces the entropy again to S per box, in apparent violation of the second law of thermodynamics.

As understood by Gibbs, and reemphasized more recently, this is a misuse of Gibbs' non-extensive entropy quantity. If the gas particles are distinguishable, closing the doors will not return the system to its original state, because many of the particles will have switched containers. There is a freedom in defining what is "ordered", and it would be a mistake to conclude that the entropy has not increased. In particular, Gibbs' non-extensive entropy quantity for an ideal gas is not intended for situations where the number of particles changes.

The paradox is averted by assuming the indistinguishability (at least effective indistinguishability) of the particles in the volume. This results in the extensive Sackur–Tetrode equation for entropy, as derived next.

Calculating the entropy of ideal gas, and making it extensive

In classical mechanics, the state of an ideal gas of energy U, volume V and with N particles, each particle having mass m, is represented by specifying the momentum vector p and the position vector x for each particle. This can be thought of as specifying a point in a 6N-dimensional phase space, where each of the axes corresponds to one of the momentum or position coordinates of one of the particles. The set of points in phase space that the gas could occupy is specified by the constraint that the gas will have a particular energy:

<math display="block">U = \frac{1}{2m} \sum_{i=1}^N \left(p_{ix}^2 + p_{iy}^2 + p_{iz}^2\right)</math>

and be contained inside of the volume V (let's say V is a cube of side X so that V = X<sup>3</sup>):

<math display="block">0 \le x_{ij} \le X</math>

for <math>i = 1, \dots, N</math> and <math>j = 1, 2, 3</math>

The first constraint defines the surface of a 3N-dimensional ball of radius (2mU)<sup>1/2</sup>, which is a 3N-1 dimensional hypersphere, and the second is a 3N-dimensional hypercube of volume V<sup>N</sup>. These combine to form a 6N-dimensional hypercylinder. Just as the area of the wall of a cylinder is the circumference of the base times the height, so the area φ of the wall of this hypercylinder is:

<math display="block">\phi(U, V, N) = V^N \, \frac{2\pi^{\frac{3N}{2}} {\left(2mU\right)}^{\frac{3N-1}{2}}}{\Gamma{\left(\frac{3N}{2}\right)}} \qquad (1)</math>

The entropy is proportional to the logarithm of the number of states that the gas could have while satisfying these constraints. In classical physics, the number of states is infinitely large, but according to quantum mechanics it is finite. Before the advent of quantum mechanics, this infinity was regularized by making phase space discrete. Phase space was divided up in blocks of volume h<sup>3N</sup>. The constant h thus appeared as a result of a mathematical trick and was thought to have no physical significance. However, using quantum mechanics one recovers the same formalism in the semi-classical limit, but now with h being the Planck constant. One can qualitatively see this from Heisenberg's uncertainty principle: a volume in N-particle phase space smaller than h<sup>3N</sup> cannot be specified.

To compute the number of states we must compute the volume in phase space in which the system can be found and divide that by h<sup>3N</sup>. This leads us to another problem: the volume seems to approach zero, as the region in phase space in which the system can be is an area of zero thickness. This problem is an artifact of having specified the energy U with infinite accuracy. In a generic system without symmetries, a full quantum treatment would yield a discrete non-degenerate set of energy eigenstates. An exact specification of the energy would then fix the precise state the system is in, so the number of states available to the system would be one, and the entropy would thus be zero.

When we specify the internal energy to be U, what we really mean is that the total energy of the gas lies somewhere in an interval of length <math>\delta U</math> around U. Here <math>\delta U</math> is taken to be very small; it turns out that the entropy doesn't depend strongly on the choice of <math>\delta U</math> for large N. This means that the above "area" φ must be extended to a shell of a thickness equal to an uncertainty in momentum <math display="inline">\delta p = \delta\left(\sqrt{2 m U}\right) = \sqrt{\frac{m}{2U}} \, \delta U</math>, so the entropy is given by:

<math display="block">

S = k_\text{B} \, \ln(\phi \delta p/h^{3N})

</math>

where the constant of proportionality is k<sub>B</sub>, the Boltzmann constant. Using Stirling's approximation for the Gamma function which omits terms of less than order N, the entropy for large N becomes:

<math display="block">
S = k_\text{B} N \ln \left( \frac{V U^{3/2}}{N^{3/2}} \right) +
\frac{3}{2} k_\text{B} N \left( 1+ \ln\frac{4\pi m}{3h^2}\right)
</math>
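As a rough numerical check of the large-N formula, the sketch below evaluates the logarithm of the state count directly from Equation 1 (using the log-gamma function) and compares it with the Stirling form. The function names, the natural units m = h = 1, the shell width `dU`, and the fixed density V/N = 1, U/N = 3/2 are all illustrative assumptions, not part of the derivation.

```python
import math

def exact_log_states(N, V, U, m=1.0, h=1.0, dU=1e-3):
    """ln(phi * delta_p / h^(3N)), with phi taken from Equation 1."""
    log_phi = (N * math.log(V)
               + math.log(2.0)
               + 1.5 * N * math.log(math.pi)
               + 0.5 * (3 * N - 1) * math.log(2.0 * m * U)
               - math.lgamma(1.5 * N))
    # Shell thickness in momentum: delta_p = sqrt(m / (2U)) * dU
    log_dp = math.log(math.sqrt(m / (2.0 * U)) * dU)
    return log_phi + log_dp - 3 * N * math.log(h)

def stirling_log_states(N, V, U, m=1.0, h=1.0):
    """Large-N form of S / k_B, keeping only terms of order N or larger."""
    return (N * (math.log(V) + 1.5 * math.log(U / N))
            + 1.5 * N * (1.0 + math.log(4.0 * math.pi * m / (3.0 * h ** 2))))

def relative_gap(N):
    """Relative difference between the two expressions at fixed density."""
    V, U = float(N), 1.5 * N
    approx = stirling_log_states(N, V, U)
    return abs(exact_log_states(N, V, U) - approx) / abs(approx)

for N in (10, 1000, 100000):
    print(N, relative_gap(N))
```

Under these assumptions the relative gap shrinks as N grows, which is consistent with the statement that the omitted terms (including the dependence on the shell width) are of lower order than N.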

This expression for S is not extensive, as can be seen by considering two identical volumes with the same particle number and the same energy. Suppose the two volumes are separated by a barrier in the beginning. Removing or reinserting the barrier is reversible, yet this formula predicts that the entropy increases when the barrier is removed, by the amount

<math display="block">

\delta S = k_\text{B} \left[ 2N \ln(2V) - N\ln V - N \ln V \right] = 2 k_\text{B} N \ln 2 > 0

</math>

which contradicts thermodynamics: reinserting the barrier would have to lower the entropy again by the same amount. This is the Gibbs paradox.
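The spurious entropy increase can be reproduced numerically. The following sketch evaluates the non-extensive large-N entropy for one box and for the joined system; the function name, the sample values of N, V and U, and the natural units m = h = k<sub>B</sub> = 1 are assumptions made only for illustration.

```python
import math

def entropy_distinguishable(N, V, U, m=1.0, h=1.0, k_B=1.0):
    """Large-N entropy without the N! correction (not extensive)."""
    return k_B * (N * (math.log(V) + 1.5 * math.log(U / N))
                  + 1.5 * N * (1.0 + math.log(4.0 * math.pi * m / (3.0 * h ** 2))))

# Two identical boxes, then the joined system with doubled N, V and U.
N, V, U = 1.0e23, 1.0, 1.5e23
S_one = entropy_distinguishable(N, V, U)
S_joined = entropy_distinguishable(2 * N, 2 * V, 2 * U)

delta_S = S_joined - 2 * S_one
expected = 2 * N * math.log(2)   # 2 k_B N ln 2 with k_B = 1
print(delta_S, expected)
```

The energy-dependent and constant terms cancel in the difference, so only the volume term contributes, as in the expression for δS above.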

The paradox is resolved by postulating that the gas particles are in fact indistinguishable. This means that all states that differ only by a permutation of particles should be considered as the same state. For example, if we have a 2-particle gas and we specify AB as a state of the gas where the first particle (A) has momentum p<sub>1</sub> and the second particle (B) has momentum p<sub>2</sub>, then this state as well as the BA state where the B particle has momentum p<sub>1</sub> and the A particle has momentum p<sub>2</sub> should be counted as the same state.
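The AB/BA example can be checked by brute-force counting. The sketch below places a few particles into distinct single-particle states and counts the arrangements once with particle labels and once without; the helper `count_states` and the small numbers used are hypothetical, chosen so that the enumeration stays tiny.

```python
from itertools import combinations, permutations
from math import factorial

def count_states(num_particles, num_levels):
    """Count arrangements with every particle in a different single-particle state.

    Returns (labeled, unlabeled): the count when particles carry labels,
    and the count when states differing only by a permutation are merged.
    """
    levels = range(num_levels)
    labeled = sum(1 for _ in permutations(levels, num_particles))
    unlabeled = sum(1 for _ in combinations(levels, num_particles))
    return labeled, unlabeled

for particles, levels in ((2, 5), (3, 6)):
    labeled, unlabeled = count_states(particles, levels)
    print(particles, levels, labeled, unlabeled, labeled // factorial(particles))
```

In both cases the labeled count exceeds the unlabeled one by exactly the factorial of the particle number, which is the overcounting that the indistinguishability postulate removes.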

In the limit of large N, for an N-particle gas, there are N! states which are identical in this sense, if one assumes that each particle is in a different single particle state. One can safely make this assumption provided the gas isn't at an extremely high density. Under normal conditions, one can thus calculate the volume of phase space occupied by the gas, by dividing Equation 1 by N!. Using the Stirling approximation again for large N, ln(N!) ≈ N ln(N) − N, the entropy for large N is:

<math display="block">
S = k_\text{B} N \ln
\left( \frac{V U^{3/2}}{N^{5/2}} \right) + k_\text{B} N \left( \frac{5}{2} + \frac{3}{2} \ln\frac{4\pi m}{3h^2}\right)
</math>

which can be easily shown to be extensive. This is the Sackur–Tetrode equation.
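Extensivity of this expression can be verified numerically by scaling N, V and U together. The sketch below is a minimal check; the function name, the sample values, and the natural units m = h = k<sub>B</sub> = 1 are assumptions for illustration only.

```python
import math

def sackur_tetrode(N, V, U, m=1.0, h=1.0, k_B=1.0):
    """Sackur-Tetrode entropy, using ln(V U^(3/2) / N^(5/2)) = ln(V/N) + 1.5 ln(U/N)."""
    return k_B * N * (math.log(V / N) + 1.5 * math.log(U / N)
                      + 2.5 + 1.5 * math.log(4.0 * math.pi * m / (3.0 * h ** 2)))

N, V, U = 1.0e23, 1.0e24, 1.5e23
S_one = sackur_tetrode(N, V, U)
S_joined = sackur_tetrode(2 * N, 2 * V, 2 * U)

# For an extensive entropy, joining two identical boxes gives exactly 2 S.
print(S_joined, 2 * S_one)
```

Because the formula depends on V and U only through the ratios V/N and U/N, doubling all three arguments doubles the result up to rounding error.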

Mixing paradox

A closely related paradox to the Gibbs paradox is the mixing paradox. The Gibbs paradox is a special case of the "mixing paradox" which contains all the salient features. The difference is that the mixing paradox deals with arbitrary distinctions in the two gases, not just distinctions in particle ordering as Gibbs had considered. In this sense, it is a straightforward generalization to the argument laid out by Gibbs. Again take a box with a partition in it, with gas A on one side, gas B on the other side, and both gases are at the same temperature and pressure. If gas A and B are different gases, there is an entropy that arises once the gases are mixed, the entropy of mixing. If the gases are the same, no additional entropy is calculated. The additional entropy from mixing does not depend on the character of the gases; it only depends on the fact that the gases are different. The two gases may be arbitrarily similar, but the entropy from mixing does not disappear unless they are the same gas – a paradoxical discontinuity.

This "paradox" can be explained by carefully considering the definition of entropy. In particular, as concisely explained by Edwin Thompson Jaynes,

Setup

We will present a simplified version of the calculation. It differs from the full calculation in three ways:

  1. The ideal gas consists of particles confined to one spatial dimension.
  2. We keep only the terms of order <math>n \log(n)</math>, dropping all terms of size n or less, where n is the number of particles. For our purposes, this is enough, because this is where the Gibbs paradox shows up and where it must be resolved. The neglected terms play a role when the number of particles is not very large, such as in computer simulation and nanotechnology. Also, they are needed in deriving the Sackur–Tetrode equation.
  3. The subdivision of phase space into units of the Planck constant (h) is omitted. Instead, the entropy is defined using an integral over the "accessible" portion of phase space. This serves to highlight the purely classical nature of the calculation.

We begin with a version of Boltzmann's entropy in which the integration is over all of accessible phase space:

<math display="block">S = k_\text{B} \ln\Omega = k_\text{B} \ln {\oint\limits_{H(\mathbf{p}, \mathbf{q}) = E}\!\! d^n\mathbf{p} \, d^n\mathbf{q}}</math>

The integral is restricted to a contour of available regions of phase space, subject to conservation of energy. In contrast to the one-dimensional line integrals encountered in elementary physics, the contour of constant energy possesses a vast number of dimensions. The justification for integrating over phase space using the canonical measure involves the assumption of equal probability. The assumption can be made by invoking the ergodic hypothesis as well as Liouville's theorem of Hamiltonian systems.

(The ergodic hypothesis underlies the ability of a physical system to reach thermal equilibrium, but this may not always hold for computer simulations (see the Fermi–Pasta–Ulam–Tsingou problem) or in certain real-world systems such as non-thermal plasmas.)

Liouville's theorem assumes a fixed number of dimensions that the system 'explores'. In calculations of entropy, the number of dimensions is proportional to the number of particles in the system, which forces phase space to abruptly change dimensionality when particles are added or subtracted. This may explain the difficulties in constructing a clear and simple derivation for the dependence of entropy on the number of particles.

For the ideal gas, the accessible phase space is an (n − 1)-sphere (also called a hypersphere) in the n-dimensional <math>\mathbf{v}</math> space:

<math display="block">E = \sum_{j=1}^n \frac{1}{2} m v_j^2\,,</math>

To recover the paradoxical result that entropy is not extensive, we integrate over phase space for a gas of <math>n</math> monatomic particles confined to a single spatial dimension by <math>0<x<\ell</math>. Since our only purpose is to illuminate a paradox, we simplify notation by taking the particle's mass and the Boltzmann constant equal to unity: <math> m = k = 1</math>. We represent points in phase space by 2n-dimensional vectors, and their x and v parts by n-dimensional vectors:

<math display="block">\boldsymbol\xi = [x_1, \dots, x_n, v_1, \dots, v_n] = [ \mathbf{x}, \mathbf{v} ]</math>

where

<math display="block">\begin{align}

\mathbf{x} &= [x_1, \dots, x_n] \\

\mathbf{v} &= [v_1, \dots, v_n]\,.

\end{align}</math>

To calculate entropy, we use the fact that the (n − 1)-sphere, <math display="inline">\sum v_j^2=R^2 ,</math> has an (n − 1)-dimensional "hypersurface volume" of

<math display="block">\tilde A_n(R)=\frac{n\pi^{n/2}}{(n/2)!} R^{n-1}\,.</math>

For example, if n = 2, the 1-sphere is the circle <math>\tilde A_2(R)=2\pi R</math>, a "hypersurface" in the plane. When the sphere is even-dimensional (n odd), it will be necessary to use the gamma function to give meaning to the factorial; see below.
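The hypersurface formula is easy to check against familiar cases. In the sketch below the factorial is evaluated through the gamma function, (n/2)! = Γ(n/2 + 1), which also covers odd n; the function name `sphere_surface` is an illustrative choice.

```python
import math

def sphere_surface(n, R):
    """Hypersurface volume of the (n-1)-sphere of radius R in n dimensions."""
    # (n/2)! is evaluated as Gamma(n/2 + 1) so that odd n is handled as well.
    return n * math.pi ** (n / 2) / math.gamma(n / 2 + 1) * R ** (n - 1)

print(sphere_surface(2, 1.0), 2 * math.pi)   # circumference of the unit circle
print(sphere_surface(3, 1.0), 4 * math.pi)   # area of the unit 2-sphere
```

For n = 2 this reproduces the circumference 2πR, and for n = 3 the familiar sphere area 4πR<sup>2</sup>.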

Gibbs paradox in a one-dimensional gas

The Gibbs paradox arises when entropy is calculated using a <math>2n</math>-dimensional phase space, where <math>n</math> is the number of particles in the gas. These particles are spatially confined to a one-dimensional interval of length <math>\ell</math>, so that configuration space has volume <math>\ell^n</math>. The volume of the surface of fixed energy is

<math display="block">\Omega_{E,\ell} = \left( \int dx_1 \int dx_2 \cdots \int dx_n\right) \underbrace{\left(\int dv_1 \int dv_2 \cdots \int dv_n\right)}_{\sum v_i^2 = 2E}</math>

The subscripts on <math>\Omega</math> are used to define the 'state variables' and will be discussed later, when it is argued that the number of particles, <math>n</math>, lacks full status as a state variable in this calculation. The integral over configuration space is <math>\ell^n</math>. As indicated by the underbrace, the integral over velocity space is restricted to the "surface area" of the (n − 1)-dimensional hypersphere of radius <math>\sqrt{2E}</math>, and is therefore equal to the "area" of that hypersurface. Thus

<math display="block">\Omega_{E,\ell} = \ell^n \frac{n\pi^{n/2}}{(n/2)!} (2E)^{\frac{n-1}{2}}</math>

<!--BEGIN HIDDEN TEXT (do not remove this comment) -->

: {| class="toccolours collapsible collapsed" width="50%" style="text-align:left"

! Click to view the algebraic steps

|- <!--KEEP THIS LINE UNTOUCHED; next line begins with "|". -->

| We begin with:

<math display="block">\Omega_{E,\ell} = \ell^n \frac{n \pi^{n/2}}{(n/2)!} {\left(2E\right)}^{\frac{n-1}{2}}</math>

<math display="block">
\ln\Omega_{E,\ell} = \ln\left(\ell^n n \pi^{n/2} {\left(2E\right)}^{\frac{n-1}{2}} \right) - \ln\left[ (n/2)! \right]
</math>

Both terms on the right hand side have dominant terms. Using the Stirling approximation for large M, <math>\ln M! \approx M\ln M - M + \ln \sqrt{2\pi M}</math>, we have:

<math display="block">
\ln\left(\ell^n n\pi^{n/2} {\left(2E\right)}^{\frac{n-1}{2}} \right) =
\underbrace{\ln\left(\ell^n E^\frac{n}{2}\right)}_{\text{important}} +
\underbrace{\ln\left(\frac{n {\left(2\pi\right)}^\frac{n}{2}}{\sqrt{2E}}\right)}_{\text{drop}}
</math>

<math display="block">\begin{align}
- \ln [(n/2)! ] &\approx -\frac n 2 \ln \frac n 2 + \frac n 2 -\ln \sqrt{n\pi}\\
&= \underbrace{-\frac n 2 \ln n}_{\text{keep}} + \underbrace{\frac n 2 \ln 2 +\frac n 2 -\ln \sqrt{n\pi}}_{\text{drop}}
\end{align}</math>

Terms are neglected if they exhibit less variation with a parameter, and we compare terms that vary with the same parameter. Entropy is defined with an additive arbitrary constant because the area in phase space depends on what units are used. For that reason it does not matter if entropy is large or small for a given value of E. We instead seek how entropy varies with E, i.e., we seek <math>\partial S/\partial E</math>:

  • An expression such as <math>E^{1/2}</math> is much less important than an expression like <math>E^{n/2} </math>
  • An expression like <math>\pi^n </math> is much less important than an expression like <math>n^n </math>. Note that the logarithm is not a strongly increasing function. The neglect of terms proportional to n compared with terms proportional to n&nbsp;ln&nbsp;n is only justified if n is extremely large.

Combining the important terms:

<math display="block">\begin{align}

\ln\Omega_{E,\ell} &=

\ln\left(\ell^n E^\frac{n}{2}\right) - \frac{n}{2} \ln n\\

&= n \ln \ell + \frac n 2 \ln\left(\frac{E}{n}\right)\\

\end{align}</math>

|}<!--KEEP THIS LINE UNTOUCHED "|}" unhides-->

<!--END HIDDEN TEXT (do not remove this comment) -->

After approximating the factorial and dropping the small terms, we obtain

<math display="block">\begin{align}

\ln\Omega_{E,\ell} &\approx n\ln\ell + n \ln\sqrt{\frac E n} + \text{const.}\\

&= \underbrace{n\ln\frac{\ell}{n} + n \ln\sqrt{\frac E n}}_{\text{extensive}} + \,n\ln n +\text{const.}\\

\end{align}</math>

In the second expression, the term <math>n\ln n</math> was subtracted and added, using the fact that <math>\ln\ell-\ln n = \ln (\ell /n)</math>. This was done to highlight exactly how the "entropy" defined here fails to be an extensive property of matter. The first two terms are extensive: if the volume of the system doubles, but gets filled with the same density of particles with the same energy, then each of these terms doubles. But the third term is neither extensive nor intensive and is therefore wrong.
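The failure of extensivity can also be seen without dropping any terms. The sketch below evaluates the exact logarithm of <math>\Omega_{E,\ell}</math> with the log-gamma function and compares a doubled system against two copies of the original; the function name and the sample values of n, ℓ and E are assumptions chosen for illustration.

```python
import math

def log_omega(n, ell, E):
    """Exact ln of Omega = ell^n * n * pi^(n/2) / (n/2)! * (2E)^((n-1)/2)."""
    return (n * math.log(ell)
            + math.log(n)
            + 0.5 * n * math.log(math.pi)
            - math.lgamma(0.5 * n + 1.0)       # ln[(n/2)!]
            + 0.5 * (n - 1) * math.log(2.0 * E))

n, ell, E = 1_000_000, 1.0e6, 0.5e6

# Double the system at fixed density and fixed energy per particle.
excess = log_omega(2 * n, 2 * ell, 2 * E) - 2 * log_omega(n, ell, E)
leading = 2 * n * math.log(2)   # what the n ln n term alone predicts
print(excess, leading)
```

For large n the excess is dominated by 2n ln 2, the contribution of the non-extensive n ln n term; the remainder comes from the terms that were dropped in the approximation.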

The arbitrary constant has been added because entropy can usually be viewed as being defined up to an arbitrary additive constant. This is especially necessary when entropy is defined as the logarithm of a phase space volume measured in units of momentum-position. Any change in how these units are defined will add or subtract a constant from the value of the entropy.

Two standard ways to make the classical entropy extensive

As discussed above, an extensive form of entropy is recovered if we divide the volume of phase space, <math>\Omega_{E,\ell}</math>, by n!. An alternative approach is to argue that the dependence on particle number cannot be trusted on the grounds that changing <math>n</math> also changes the dimensionality of phase space. Such changes in dimensionality lie outside the scope of Hamiltonian mechanics and Liouville's theorem. For that reason it is plausible to allow the arbitrary constant to be a function of <math>n</math>. Defining the function to be <math>f(n)=-\frac32 n\ln n</math>, we have:

<math display="block">\begin{align}

S = \ln\Omega_{E,\ell} &\approx n\ln\ell + n \ln\sqrt{E} + \text{const.}\\

&= n\ln\ell + n \ln\sqrt{E} + f(n)\\

\ln\Omega_{E,\ell,n} &\approx n\ln\frac{\ell}{n} + n \ln\sqrt{\frac E n} +\text{const.},\\

\end{align}</math>

which has extensive scaling:

<math display="block"> S(\alpha E,\alpha\ell,\alpha n) = \alpha\, S(E,\ell,n)</math>
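The scaling property can be checked directly from the leading-order expressions. In the sketch below, `entropy_raw` is the uncorrected form, `f` is the particle-number-dependent "constant", and their sum is tested for extensivity; the function names, the scale factor and the sample values are illustrative assumptions.

```python
import math

def entropy_raw(n, ell, E):
    """Leading-order ln(Omega) before fixing the n-dependent constant."""
    return n * math.log(ell) + n * math.log(math.sqrt(E))

def f(n):
    """The choice f(n) = -(3/2) n ln n for the arbitrary 'constant'."""
    return -1.5 * n * math.log(n)

def entropy_fixed(n, ell, E):
    """Equals n ln(ell/n) + n ln sqrt(E/n), which scales extensively."""
    return entropy_raw(n, ell, E) + f(n)

alpha = 3.0
n, ell, E = 1000.0, 2000.0, 500.0

base = entropy_fixed(n, ell, E)
scaled = entropy_fixed(alpha * n, alpha * ell, alpha * E)
print(scaled, alpha * base)
print(entropy_raw(alpha * n, alpha * ell, alpha * E), alpha * entropy_raw(n, ell, E))
```

With f(n) included, scaling all three arguments by α multiplies the result by α, whereas the uncorrected form does not scale this way.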

Swendsen's particle-exchange approach

Following Swendsen,