Helmholtz free energy

In thermodynamics, the Helmholtz free energy (or Helmholtz energy) is a thermodynamic potential that measures the useful work obtainable from a closed thermodynamic system at a constant temperature (isothermal). The change in the Helmholtz energy during a process is equal to the maximum amount of work that the system can perform in a thermodynamic process in which temperature is held constant. At constant temperature, the Helmholtz free energy is minimized at equilibrium.

In contrast, the Gibbs free energy (or free enthalpy) is the thermodynamic potential most commonly used, especially in chemistry, for processes that occur at constant pressure. The Helmholtz free energy is preferred when pressure is not held constant: in explosives research, for example, it is often used because explosive reactions by their nature induce pressure changes. It is also frequently used to define fundamental equations of state of pure substances.

The concept of free energy was developed by Hermann von Helmholtz, a German physicist, and first presented in 1882 in a lecture called "On the thermodynamics of chemical processes". From the German word Arbeit (work), the International Union of Pure and Applied Chemistry (IUPAC) recommends the symbol A and the name Helmholtz energy. In physics, the symbol F is also used in reference to free energy or Helmholtz function.

Definition

The Helmholtz free energy is defined as

<math display="block">A \equiv U - TS,</math>

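where A is the Helmholtz free energy, U is the internal energy of the system, T is the absolute temperature, and S is the entropy.

The differential of the Helmholtz free energy is <math>\mathrm{d}A = -S\,\mathrm{d}T - P\,\mathrm{d}V + \mu\,\mathrm{d}N</math>, so the entropy, pressure, and chemical potential follow as partial derivatives:

<math display="block">S = \left.-\left( \frac{\partial A}{\partial T} \right) \right|_{V,N}, \quad P = \left.-\left( \frac{\partial A}{\partial V} \right) \right|_{T,N}, \quad \mu = \left.\left( \frac{\partial A}{\partial N} \right) \right|_{T,V}.</math>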

These three equations, together with the free energy expressed in terms of the canonical partition function <math>Z</math> and the Boltzmann constant <math>k_\mathrm{B}</math>,

<math display="block">A = -k_\mathrm{B} T \log Z,</math>

provide an efficient way of calculating thermodynamic variables of interest once the partition function is known, and they are often used in density-of-states calculations. Legendre transformations can also be carried out for other kinds of systems. For example, for a system with a magnetic field or an electric potential,

<math display="block">m = \left.-\left( \frac{\partial A}{\partial B} \right) \right|_{T,N}, \quad

V = \left.\left ( \frac{\partial A}{\partial Q} \right) \right|_{N,T}.</math>
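As an illustration of how these relations are used in practice, the following minimal sketch computes the free energy of a two-level system from its partition function, obtains the entropy from a numerical temperature derivative, and checks that the result agrees with A = U - TS. The energy levels, the temperature, the reduced units (Boltzmann constant set to 1), and all function names are assumptions made for the example.

```python
import math

K_B = 1.0  # Boltzmann constant in reduced units (assumption for this sketch)

def partition_function(levels, T):
    """Canonical partition function Z = sum_r exp(-E_r / (k_B T))."""
    return sum(math.exp(-e / (K_B * T)) for e in levels)

def helmholtz(levels, T):
    """Free energy from the partition function: A = -k_B T log Z."""
    return -K_B * T * math.log(partition_function(levels, T))

def internal_energy(levels, T):
    """Canonical average energy U = sum_r E_r exp(-E_r / (k_B T)) / Z."""
    Z = partition_function(levels, T)
    return sum(e * math.exp(-e / (K_B * T)) for e in levels) / Z

def entropy(levels, T, dT=1e-5):
    """S = -(dA/dT) at fixed levels, estimated by a central difference."""
    return -(helmholtz(levels, T + dT) - helmholtz(levels, T - dT)) / (2 * dT)

levels = [0.0, 1.0]  # hypothetical two-level system
T = 0.7
A = helmholtz(levels, T)
U = internal_energy(levels, T)
S = entropy(levels, T)
print("A from partition function:", A)
print("U - T*S                  :", U - T * S)
```

The two printed values should agree up to the small error of the finite-difference derivative.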

Bogoliubov inequality

Computing the free energy is an intractable problem for all but the simplest models in statistical physics. A powerful approximation method is mean-field theory, which is a variational method based on the Bogoliubov inequality. This inequality can be formulated as follows.

Suppose we replace the real Hamiltonian <math>H</math> of the model by a trial Hamiltonian <math>\tilde{H}</math>, which has different interactions and may depend on extra parameters that are not present in the original model. If we choose this trial Hamiltonian such that

<math display="block">\left\langle\tilde{H}\right\rangle = \langle H \rangle,</math>

where both averages are taken with respect to the canonical distribution defined by the trial Hamiltonian <math>\tilde{H}</math>, then the Bogoliubov inequality states

<math display="block">A \leq \tilde{A},</math>

where <math>A</math> is the free energy of the original Hamiltonian, and <math>\tilde{A}</math> is the free energy of the trial Hamiltonian. We will prove this below.

By including a large number of parameters in the trial Hamiltonian and minimizing the free energy, we can expect to get a close approximation to the exact free energy.

The Bogoliubov inequality is often applied in the following way. If we write the Hamiltonian as

<math display="block">H = H_0 + \Delta H,</math>

where <math>H_0</math> is some exactly solvable Hamiltonian, then we can apply the above inequality by defining

<math display="block">\tilde{H} = H_0 + \langle\Delta H\rangle_0.</math>

Here we have defined <math>\langle X\rangle_0</math> to be the average of X over the canonical ensemble defined by <math>H_0</math>. Since <math>\tilde{H}</math> defined this way differs from <math>H_0</math> by a constant, we have in general

<math display="block">\langle X\rangle_0 = \langle X\rangle,</math>

where <math>\langle X\rangle</math> is still the average over <math>\tilde{H}</math>, as specified above. Therefore,

<math display="block">\left\langle\tilde{H}\right\rangle = \big\langle H_0 + \langle\Delta H\rangle \big\rangle = \langle H\rangle,</math>

and thus the inequality

<math display="block">A \leq \tilde{A}</math>

holds. The free energy <math>\tilde{A}</math> is the free energy of the model defined by <math>H_0</math> plus <math>\langle\Delta H\rangle</math>. This means that

<math display="block">\tilde{A} = \langle H_0\rangle_0 - T S_0 + \langle\Delta H\rangle_0 = \langle H\rangle_0 - T S_0,</math>

and thus

<math display="block">A\leq \langle H\rangle_0 - T S_0.</math>
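The bound can be checked numerically on a model small enough to enumerate. The sketch below is an illustration under stated assumptions, not a general mean-field solver: it takes a ring of four Ising spins as the full Hamiltonian, independent spins in a field h as the solvable reference Hamiltonian, sets the Boltzmann constant to 1, and scans a grid of field values. The model, the parameter values, and the function names are all chosen for the example.

```python
import itertools
import math

N, J, T = 4, 1.0, 2.0  # illustrative values; k_B = 1 (assumption)
STATES = list(itertools.product((-1, 1), repeat=N))

def H(s):
    """Full Hamiltonian: nearest-neighbour Ising ring."""
    return -J * sum(s[i] * s[(i + 1) % N] for i in range(N))

def H0(s, h):
    """Solvable reference Hamiltonian: independent spins in a field h."""
    return -h * sum(s)

def exact_free_energy():
    """A = -T log Z, with Z obtained by brute-force enumeration."""
    Z = sum(math.exp(-H(s) / T) for s in STATES)
    return -T * math.log(Z)

def variational_bound(h):
    """<H>_0 - T S_0, with averages over the canonical ensemble of H0."""
    weights = [math.exp(-H0(s, h) / T) for s in STATES]
    Z0 = sum(weights)
    probs = [w / Z0 for w in weights]
    avg_H = sum(p * H(s) for p, s in zip(probs, STATES))
    S0 = -sum(p * math.log(p) for p in probs)
    return avg_H - T * S0

A_exact = exact_free_energy()
bounds = [variational_bound(0.05 * k) for k in range(101)]  # h from 0 to 5
best = min(bounds)
print("exact free energy     :", A_exact)
print("best variational bound:", best)
```

Every value in `bounds` lies at or above the exact free energy, and minimizing over h gives the tightest estimate available within this one-parameter family of trial Hamiltonians.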

Proof of the Bogoliubov inequality

For a classical model we can prove the Bogoliubov inequality as follows. We denote the canonical probability distributions for the Hamiltonian and the trial Hamiltonian by <math>P_{r}</math> and <math>\tilde{P}_{r}</math>, respectively. From Gibbs' inequality we know that:

<math display="block">\sum_{r} \tilde{P}_{r}\log\left(\tilde{P}_{r}\right)\geq \sum_{r} \tilde{P}_{r}\log\left(P_{r}\right) \,</math>

holds. To see this, consider the difference between the left hand side and the right hand side. We can write this as:

<math display="block">\sum_{r} \tilde{P}_{r}\log\left(\frac{\tilde{P}_{r}}{P_{r}}\right) \,</math>

Since

<math display="block">\log\left(x\right)\geq 1 - \frac{1}{x}\,</math>

it follows that:

<math display="block">\sum_{r} \tilde{P}_{r}\log\left(\frac{\tilde{P}_{r}}{P_{r}}\right)\geq \sum_{r}\left(\tilde{P}_{r} - P_{r}\right) = 0 \,</math>

where in the last step we have used that both probability distributions are normalized to 1.
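A quick numerical check of this inequality may help. The sketch below evaluates the difference between the two sides for a pair of hypothetical three-state distributions, chosen only for illustration, and confirms that it vanishes when the two distributions coincide.

```python
import math

def gibbs_gap(p_tilde, p):
    """sum_r p_tilde_r * log(p_tilde_r / p_r); Gibbs' inequality says this is >= 0."""
    return sum(a * math.log(a / b) for a, b in zip(p_tilde, p) if a > 0)

# Two hypothetical normalized distributions over three states
p_tilde = [0.5, 0.3, 0.2]
p = [0.2, 0.2, 0.6]

gap = gibbs_gap(p_tilde, p)
gap_same = gibbs_gap(p, p)
print("gap for different distributions:", gap)
print("gap for identical distributions:", gap_same)
```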

We can write the inequality as:

<math display="block">\left\langle\log\tilde{P}_{r}\right\rangle \geq \left\langle\log P_{r} \right\rangle</math>

where the averages are taken with respect to <math>\tilde{P}_{r}</math>. If we now substitute in here the expressions for the probability distributions:

<math display="block">P_{r} = \frac{\exp\left[-\beta H(r)\right]}{Z}</math>

and

<math display="block">\tilde{P}_{r} = \frac{\exp\left[-\beta\tilde{H}(r)\right]}{\tilde{Z}}</math>

we get:

<math display="block">\left\langle -\beta \tilde{H} - \log \tilde{Z} \right\rangle\geq \left\langle -\beta H - \log Z \right\rangle</math>

Since the averages of <math>H</math> and <math>\tilde{H}</math> are, by assumption, identical we have:

<math display="block">A\leq\tilde{A}</math>

Here we have used that the partition functions are constants with respect to taking averages and that the free energy is proportional to minus the logarithm of the partition function.
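The final step can be written out explicitly. The sketch below assumes the standard relation <math>A = -\tfrac{1}{\beta}\log Z</math> with <math>\beta = 1/(k_\mathrm{B}T)</math>, which the argument above uses implicitly. Because <math>\log Z</math> and <math>\log\tilde{Z}</math> are constants, they pass through the averages unchanged, and the equal averages of <math>H</math> and <math>\tilde{H}</math> cancel:

<math display="block">-\beta\left\langle\tilde{H}\right\rangle - \log\tilde{Z} \geq -\beta\left\langle H\right\rangle - \log Z \quad\Longrightarrow\quad -\log\tilde{Z} \geq -\log Z \quad\Longrightarrow\quad \tilde{A} = -\frac{1}{\beta}\log\tilde{Z} \geq -\frac{1}{\beta}\log Z = A.</math>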

We can easily generalize this proof to the case of quantum mechanical models. We denote the eigenstates of <math>\tilde{H}</math> by <math>\left|r\right\rangle</math>. We denote the diagonal components of the density matrices for the canonical distributions for <math>H</math> and <math>\tilde{H}</math> in this basis as:

<math display="block">P_{r}=\left\langle r\left|\frac{\exp\left[-\beta H\right]}{Z}\right|r\right\rangle\,</math>

and

<math display="block">\tilde{P}_{r}=\left\langle r\left|\frac{\exp\left[-\beta\tilde{H}\right]}{\tilde{Z}}\right|r\right\rangle=\frac{\exp\left(-\beta\tilde{E}_{r}\right)}{\tilde{Z}}\,</math>

where the <math>\tilde{E}_{r}</math> are the eigenvalues of <math>\tilde{H}</math>.

We assume again that the averages of H and <math>\tilde{H}</math> in the canonical ensemble defined by <math>\tilde{H}</math> are the same:

<math display="block">\left\langle\tilde{H}\right\rangle = \left\langle H\right\rangle \,</math>

where

<math display="block">\left\langle H\right\rangle = \sum_{r}\tilde{P}_{r}\left\langle r\left|H\right|r\right\rangle\,</math>

The inequality

<math display="block">\sum_{r} \tilde{P}_{r} \log \tilde{P}_{r} \geq \sum_{r} \tilde{P}_{r} \log P_{r}</math>

still holds as both the <math>P_{r}</math> and the <math>\tilde{P}_{r}</math> sum to 1. On the left-hand side we can replace:

<math display="block">\log \tilde{P}_{r} = -\beta \tilde{E}_{r} - \log \tilde{Z}</math>

On the right-hand side we can use the inequality

<math display="block">\left\langle e^X \right\rangle_{r} \geq e^{\left\langle X \right\rangle_{r}}</math>

where we have introduced the notation

<math display="block">\left\langle Y\right\rangle_{r}\equiv\left\langle r\left|Y\right|r\right\rangle\,</math>

for the expectation value of the operator Y in the state r. The inequality follows from Jensen's inequality, since the exponential function is convex. Taking the logarithm of this inequality gives:

<math display="block">\log\left[\left\langle e^X \right\rangle_{r}\right]\geq\left\langle X\right\rangle_{r}\,</math>

This allows us to write:

<math display="block">\log P_{r} = \log\left[\left\langle \exp\left(-\beta H - \log Z \right)\right\rangle_{r}\right] \geq \left\langle -\beta H - \log Z\right\rangle_{r}</math>

The fact that the averages of H and <math>\tilde{H}</math> are the same then leads to the same conclusion as in the classical case:

<math display="block">A\leq\tilde{A}</math>

Generalized Helmholtz energy

In the more general case, the mechanical term <math>p\mathrm{d}V</math> must be replaced by the product of volume, stress, and an infinitesimal strain:

<math display="block">\mathrm{d}A = V \sum_{ij} \sigma_{ij}\,\mathrm{d} \varepsilon_{ij} - S\,\mathrm{d}T + \sum_i \mu_i \,\mathrm{d}N_i,</math>

where <math>\sigma_{ij}</math> is the stress tensor, and <math>\varepsilon_{ij}</math> is the strain tensor. In the case of linear elastic materials that obey Hooke's law, the stress is related to the strain by

<math display="block">\sigma_{ij} = C_{ijkl}\varepsilon_{kl},</math>

where we are now using Einstein notation for the tensors, in which repeated indices in a product are summed. We may integrate the expression for <math>\mathrm{d}A</math> to obtain the Helmholtz energy:

<math display="block">\begin{align}
A &= \frac{1}{2}VC_{ijkl}\varepsilon_{ij}\varepsilon_{kl} - ST + \sum_i \mu_i N_i \\
&= \frac{1}{2}V\sigma_{ij}\varepsilon_{ij} - ST + \sum_i \mu_i N_i.
\end{align}</math>
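The elastic term in this expression can be evaluated directly once a stiffness tensor is chosen. The sketch below is only an illustration: it assumes an isotropic material, for which the stiffness tensor is built from two Lamé parameters (an assumption not made in the text above), and it uses hypothetical values for the volume, the parameters, and the strain. The summation over repeated indices is written out as explicit loops.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

def isotropic_stiffness(lame_lambda, lame_mu):
    """C_ijkl = lambda d_ij d_kl + mu (d_ik d_jl + d_il d_jk), isotropic case only."""
    return [[[[lame_lambda * delta(i, j) * delta(k, l)
               + lame_mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               for l in range(3)] for k in range(3)]
             for j in range(3)] for i in range(3)]

def stress(C, eps):
    """Hooke's law: sigma_ij = C_ijkl eps_kl (sum over k and l)."""
    return [[sum(C[i][j][k][l] * eps[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def elastic_term(V, C, eps):
    """Elastic contribution (1/2) V sigma_ij eps_ij (sum over i and j)."""
    sig = stress(C, eps)
    return 0.5 * V * sum(sig[i][j] * eps[i][j] for i in range(3) for j in range(3))

# Hypothetical material parameters and a small symmetric strain
lame_lambda, lame_mu, V = 1.5, 0.8, 2.0
eps = [[0.010, 0.002, 0.000],
       [0.002, -0.003, 0.000],
       [0.000, 0.000, 0.005]]

E_elastic = elastic_term(V, isotropic_stiffness(lame_lambda, lame_mu), eps)
print("elastic contribution to A:", E_elastic)
```

For an isotropic material the same quantity reduces to V times the sum of half the first Lamé parameter times the squared trace of the strain and the second Lamé parameter times the sum of squared strain components, which gives an independent check on the loops.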

Application to fundamental equations of state

The Helmholtz free energy function for a pure substance (together with its partial derivatives) can be used to determine all other thermodynamic properties for the substance. See, for example, the equations of state for water, as given by the IAPWS in their IAPWS-95 release.
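To illustrate how a property follows from a Helmholtz function by differentiation, the sketch below uses the monatomic ideal gas as a stand-in. This is far simpler than IAPWS-95 and is not taken from that release. The reduced units (Boltzmann constant, Planck constant, and particle mass all set to 1) and the function names are assumptions made for the example. The pressure is obtained from a numerical volume derivative and compared with the ideal gas law.

```python
import math

def helmholtz_ideal_gas(T, V, N, k_B=1.0, h=1.0, m=1.0):
    """A(T, V, N) for a monatomic ideal gas in reduced units (assumption)."""
    thermal_wavelength = h / math.sqrt(2.0 * math.pi * m * k_B * T)
    return -N * k_B * T * (math.log(V / (N * thermal_wavelength ** 3)) + 1.0)

def pressure(T, V, N, dV=1e-5):
    """P = -(dA/dV) at fixed T and N, estimated by a central difference."""
    upper = helmholtz_ideal_gas(T, V + dV, N)
    lower = helmholtz_ideal_gas(T, V - dV, N)
    return -(upper - lower) / (2 * dV)

T, V, N = 2.0, 5.0, 3.0  # hypothetical state point
P = pressure(T, V, N)
print("P from dA/dV     :", P)
print("P from N k_B T/V :", N * T / V)
```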

Application to training auto-encoders

Hinton and Zemel "derive an objective function for training auto-encoder based on the minimum description length (MDL) principle". "The description length of an input vector using a particular code is the sum of the code cost and reconstruction cost. They define this to be the energy of the code. Given an input vector, they define the energy of a code to be the sum of the code cost and the reconstruction cost." The true expected combined cost is

<math display="block">F = \sum_i p_i E_i - H,</math>

"which has exactly the form of Helmholtz free energy".
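The analogy can be made concrete with a small numerical sketch. It assumes that the expected combined cost has the form of an expected energy minus an entropy, F = sum_i p_i E_i - H, at unit temperature, where p_i is the probability of choosing code i and H is the entropy of that distribution. The code energies below are hypothetical. For a functional of this form, the Boltzmann distribution minimizes F, and the minimum equals minus the logarithm of the sum of exp(-E_i).

```python
import math

def free_energy(p, E):
    """F = sum_i p_i E_i - H(p), with H the Shannon entropy in nats."""
    expected_energy = sum(pi * ei for pi, ei in zip(p, E))
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return expected_energy - entropy

def boltzmann(E):
    """p_i proportional to exp(-E_i), at unit temperature."""
    weights = [math.exp(-e) for e in E]
    Z = sum(weights)
    return [w / Z for w in weights]

E = [1.0, 2.0, 0.5]  # hypothetical code energies
p_star = boltzmann(E)
F_min = free_energy(p_star, E)
F_uniform = free_energy([1.0 / 3.0] * 3, E)
print("F at Boltzmann distribution:", F_min)
print("-log sum exp(-E)           :", -math.log(sum(math.exp(-e) for e in E)))
print("F at uniform distribution  :", F_uniform)
```

Any other choice of code probabilities, such as the uniform one shown, gives a value of F that is at least as large.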