thumb|The [[logistic curve]]

thumb|Plot of the [[error function]]

A sigmoid function is any mathematical function whose graph has a characteristic S-shaped or sigmoid curve.

A common example of a sigmoid function is the logistic function.
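
For reference, the standard logistic function is usually written as follows (the symbol <math>\sigma</math> is a common notation assumed here; it does not appear in the text above):

<math display="block">\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{1 + e^x} = 1 - \sigma(-x).</math>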

Other sigmoid functions are given in the Examples section. In some fields, most notably in the context of artificial neural networks, the term "sigmoid function" is used as a synonym for "logistic function".

Special cases of sigmoid functions include the Gompertz curve (used in modeling systems that saturate at large values of x) and the ogee curve (used in the spillway of some dams). Sigmoid functions have the domain of all real numbers, with a return (response) value that is most commonly monotonically increasing but may also be decreasing. Sigmoid functions most often take return values (y axis) in the range 0 to 1; another commonly used range is −1 to 1.

There is also the Heaviside step function, which instantaneously transitions between 0 and 1.

A wide variety of sigmoid functions including the logistic and hyperbolic tangent functions have been used as the activation function of artificial neurons. Sigmoid curves are also common in statistics as cumulative distribution functions (which go from 0 to 1), such as the integrals of the logistic density, the normal density, and Student's t probability density functions. The logistic sigmoid function is invertible, and its inverse is the logit function.
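
The relationships mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard logistic function 1/(1 + e<sup>−x</sup>); the function names `logistic` and `logit` are illustrative choices, not taken from any particular library.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in [0, 1) and cannot overflow
    return z / (1.0 + z)


def logit(p: float) -> float:
    """Inverse of the logistic function, defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is only defined for 0 < p < 1")
    return math.log(p / (1.0 - p))


for x in (-3.0, -0.5, 0.0, 2.0):
    y = logistic(x)
    # logit undoes logistic, up to floating-point rounding
    assert math.isclose(logit(y), x, abs_tol=1e-12)
    # tanh is a rescaled logistic with range (-1, 1) instead of (0, 1)
    assert math.isclose(math.tanh(x), 2.0 * logistic(2.0 * x) - 1.0, abs_tol=1e-12)

print(logistic(0.0))  # 0.5
```

The second assertion illustrates why the ranges 0 to 1 and −1 to 1 are both common: the hyperbolic tangent is the logistic function shifted and rescaled.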

Theory

In mathematics, a unitary sigmoid function is a bounded sigmoid-type function normalized to the unit range, typically with lower and upper asymptotes at 0 and 1. The function proposed by Grebenc is instead normalized to (−1, 1):

<math display="block">\begin{align}f(x) &= \begin{cases}
\displaystyle \frac{2}{1+e^{-2m\frac{x}{1-x^2}}} - 1, & |x| < 1 \\
\sgn(x), & |x| \ge 1
\end{cases} \\
&= \begin{cases}
\displaystyle \tanh\left(m\frac{x}{1-x^2}\right), & |x| < 1 \\
\sgn(x), & |x| \ge 1
\end{cases}\end{align}</math>

using the hyperbolic tangent mentioned above. Here, <math>m</math> is a free parameter encoding the slope at <math>x=0</math>, which must be greater than or equal to <math>\sqrt{3}</math> because any smaller value will result in a function with multiple inflection points, which is therefore not a true sigmoid. This function is unusual because it actually attains the limiting values of −1 and 1 within a finite range, meaning that its value is constant at −1 for all <math>x \leq -1</math> and at 1 for all <math>x \geq 1</math>. Nonetheless, it is smooth (infinitely differentiable, <math>C^\infty</math>) everywhere, including at <math>x = \pm 1</math>.
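
The piecewise definition can be sketched in code to check that the exponential and hyperbolic-tangent forms agree and that the slope at the origin equals <math>m</math>. This is an illustrative sketch only; the function names `grebenc_sigmoid` and `grebenc_sigmoid_exp_form` and the default <math>m = \sqrt{3}</math> are assumptions made for the example.

```python
import math


def grebenc_sigmoid(x: float, m: float = math.sqrt(3.0)) -> float:
    """tanh(m*x / (1 - x^2)) for |x| < 1, sgn(x) for |x| >= 1.

    The text requires m >= sqrt(3); this sketch does not enforce it.
    """
    if abs(x) >= 1.0:
        return math.copysign(1.0, x)
    return math.tanh(m * x / (1.0 - x * x))


def grebenc_sigmoid_exp_form(x: float, m: float = math.sqrt(3.0)) -> float:
    """Same function in the form 2 / (1 + e^(-2z)) - 1.

    Note: math.exp can overflow for x very close to -1, so the tanh
    form above is the safer one to evaluate numerically.
    """
    if abs(x) >= 1.0:
        return math.copysign(1.0, x)
    z = m * x / (1.0 - x * x)
    return 2.0 / (1.0 + math.exp(-2.0 * z)) - 1.0


m = 2.0
for x in (-0.9, -0.3, 0.0, 0.5, 0.9):
    # both forms of the definition give the same value
    assert math.isclose(grebenc_sigmoid(x, m), grebenc_sigmoid_exp_form(x, m), abs_tol=1e-12)

# central-difference estimate of the slope at x = 0, which should be close to m
h = 1e-6
slope = (grebenc_sigmoid(h, m) - grebenc_sigmoid(-h, m)) / (2.0 * h)
print(slope)

# the limiting values are attained exactly outside (-1, 1)
print(grebenc_sigmoid(1.0, m), grebenc_sigmoid(-2.5, m))  # 1.0 -1.0
```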

Applications

thumb|right|320px|Inverted logistic S-curve to model the relation between wheat yield and soil salinity

Many natural processes, such as the learning curves of complex systems, exhibit a progression from small beginnings that accelerates and approaches a climax over time. When a specific mathematical model is lacking, a sigmoid function is often used. A hierarchy of sigmoid growth models has been built with the primary goal of re-analyzing kinetic data, the so-called N-t curves, from heterogeneous nucleation experiments in electrochemistry. The hierarchy at present includes three models with 1, 2, and 3 parameters, respectively (not counting the maximal number of nuclei N<sub>max</sub>): a tanh<sup>2</sup>-based model called α<sub>21</sub>, originally devised to describe diffusion-limited crystal growth (not aggregation) in 2D; the Johnson–Mehl–Avrami–Kolmogorov (JMAK) model; and the Richards model. It was shown that even the simplest model works for this purpose, and it was thus implied that the revisited experiments are an example of two-step nucleation, with the first step being the growth of the metastable phase in which the nuclei of the stable phase form.
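
As an illustration of the two-parameter member of this hierarchy, the sketch below evaluates an N-t curve using the textbook Avrami form N(t) = N<sub>max</sub>(1 − exp(−k t<sup>n</sup>)). The parametrization (rate constant `k`, exponent `n`), the function name `jmak_nuclei`, and the sample values are assumptions for this example; the study described above may parametrize the JMAK model differently.

```python
import math


def jmak_nuclei(t: float, n_max: float, k: float, n: float) -> float:
    """Avrami-type curve N(t) = n_max * (1 - exp(-k * t**n)) for t >= 0.

    k (rate constant) and n (Avrami exponent) are the two free parameters;
    n_max is the saturation value. The curve is S-shaped only for n > 1.
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    return n_max * (1.0 - math.exp(-k * t ** n))


# hypothetical parameter values, chosen only to show the S-shape
n_max, k, n = 100.0, 0.05, 3.0
for t in (0.0, 1.0, 2.0, 3.0, 4.0, 6.0):
    print(t, round(jmak_nuclei(t, n_max, k, n), 2))
```

The curve starts at zero, accelerates, and then saturates at N<sub>max</sub>, which is the qualitative behavior described at the start of this section.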


Further reading

  • (NB. See in particular "Chapter 4: Artificial Neural Networks", pp. 96–97, where Mitchell uses the terms "logistic function" and "sigmoid function" synonymously and also calls this function the "squashing function"; the sigmoid (logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.)
  • (NB. Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.)