Directional derivative

In multivariable calculus, the directional derivative measures the instantaneous rate at which a function changes along a specified vector through a given point. If the vector is multiplied by a scalar, the corresponding directional derivative is multiplied by the same scalar.

Some elementary texts instead use the phrase "directional derivative in the direction of <math>\mathbf v</math>" for the rate of change per unit distance in that direction. In that convention the nonzero vector <math>\mathbf v</math> is first normalized to the unit vector <math>\hat{\mathbf v} = \mathbf v/\|\mathbf v\|</math>, where the circumflex (hat) denotes the normalized vector.

The directional derivative of a scalar function f with respect to a vector v may be denoted by any of the following:

<math display="block">\begin{aligned}
\nabla_{\mathbf{v}}{f}(\mathbf{x})
&=f'_\mathbf{v}(\mathbf{x})\\
&=D_\mathbf{v}f(\mathbf{x})\\
&=Df(\mathbf{x})(\mathbf{v})\\
&=\partial_\mathbf{v}f(\mathbf{x})\\
&=\frac{\partial f(\mathbf{x})}{\partial \mathbf{v}}\\
&=\mathbf{v}\cdot{\nabla f(\mathbf{x})}\\
&=\mathbf{v} \cdot \frac{\partial f(\mathbf{x})}{\partial\mathbf{x}}.
\end{aligned}</math>

It therefore generalizes the notion of a partial derivative, in which the rate of change is taken along one of the curvilinear coordinate curves, all other coordinates being constant.

In functional analysis, the analogous notion for functions between topological vector spaces is the Gateaux derivative.

Definition

Figure: A contour plot of <math>f(x, y)=x^2 + y^2</math>, showing the gradient vector in black and the unit vector <math>\mathbf{u}</math>, scaled by the directional derivative in the direction of <math>\mathbf{u}</math>, in orange. The gradient vector is longer because the gradient points in the direction of the greatest rate of increase of the function.

The directional derivative of a scalar function

<math display="block">f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)</math>

along a vector

<math display="block">\mathbf{v} = (v_1, \ldots, v_n)</math>

is the function <math>\nabla_{\mathbf{v}}{f}</math> defined by the limit

<math display="block">\nabla_{\mathbf{v}}{f}(\mathbf{x}) = \lim_{h \to 0}{\frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}} = \left. \frac{\mathrm{d}}{\mathrm{d}t}f(\mathbf{x}+t\mathbf{v})\right|_{t=0}.</math>

This definition is valid in a broad range of contexts, for example, where the norm of a vector is defined. In finite dimensions, it does not depend on the choice of norm, since all norms are equivalent. Its applicability extends to functions on finite-dimensional vector spaces without a metric and to differentiable manifolds, such as in general relativity.
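The limit in the definition can be approximated numerically. The following Python sketch is illustrative only: the helper name `directional_derivative`, the step size `h`, and the sample function and vectors are assumptions chosen for the example. It uses a central difference in place of the one-sided quotient; the two agree in the limit whenever the directional derivative exists.

```python
def directional_derivative(f, x, v, h=1e-6):
    """Central-difference estimate of d/dt f(x + t v) at t = 0."""
    forward = [xi + h * vi for xi, vi in zip(x, v)]
    backward = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(forward) - f(backward)) / (2.0 * h)


def f(p):
    # Sample function f(x, y) = x^2 + y^2 (an assumption for this example).
    return p[0] ** 2 + p[1] ** 2


x = [1.0, 2.0]
v = [3.0, -1.0]

d_v = directional_derivative(f, x, v)
d_2v = directional_derivative(f, x, [2.0 * vi for vi in v])

print(d_v)   # close to 2*1*3 + 2*2*(-1) = 2
print(d_2v)  # close to 4: doubling v doubles the derivative
```

The second value illustrates that the derivative along a vector scales linearly with that vector.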

For differentiable functions

If the function f is differentiable at x, then the directional derivative exists along any vector v at x, and one has

<math display="block">\nabla_{\mathbf{v}}{f}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{v}</math>

where the <math>\nabla</math> on the right denotes the gradient and <math>\cdot</math> is the dot product.
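The role of the differentiability hypothesis can be illustrated numerically. In the Python sketch below, the functions `f` and `g`, the sample points, and the helper names are assumptions chosen for the example. For the smooth function `f` the limit and the gradient formula agree. For `g`, which has directional derivatives in every direction at the origin but is not differentiable there, the gradient formula gives a different value.

```python
import math


def directional_derivative(f, x, v, h=1e-6):
    # Central-difference estimate of d/dt f(x + t v) at t = 0.
    plus = [xi + h * vi for xi, vi in zip(x, v)]
    minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(plus) - f(minus)) / (2.0 * h)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Differentiable case: f(x, y) = x^2 sin(y), gradient written out by hand.
def f(p):
    return p[0] ** 2 * math.sin(p[1])


def grad_f(p):
    return [2.0 * p[0] * math.sin(p[1]), p[0] ** 2 * math.cos(p[1])]


x = [1.5, 0.7]
v = [2.0, -3.0]
limit_value = directional_derivative(f, x, v)
gradient_value = dot(grad_f(x), v)
print(abs(limit_value - gradient_value) < 1e-6)  # the two values agree


# Non-differentiable case: g(x, y) = x^2 y / (x^2 + y^2), with g(0, 0) = 0.
def g(p):
    r2 = p[0] ** 2 + p[1] ** 2
    return 0.0 if r2 == 0.0 else p[0] ** 2 * p[1] / r2


origin = [0.0, 0.0]
w = [1.0, 1.0]
g_along_w = directional_derivative(g, origin, w)
partials = [directional_derivative(g, origin, e)
            for e in ([1.0, 0.0], [0.0, 1.0])]
g_from_partials = dot(partials, w)

print(g_along_w)        # about 0.5
print(g_from_partials)  # zero: the gradient formula does not apply here
```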

It can be derived by using the property that all directional derivatives at a point make up a single tangent plane which can be defined using partial derivatives. This can be used to find a formula for the gradient vector and an alternative formula for the directional derivative, the latter of which can be rewritten as shown above for convenience.

It also follows from defining a path <math>h(t) = x + tv</math> and evaluating the limit in the definition of differentiability along this path:

<math display="block">\begin{align}
0
&=\lim_{t \to 0}\frac {f(x+t v)-f(x)-t\nabla f(x)\cdot v} t \\
&=\lim_{t \to 0}\frac {f(x+t v)-f(x)} t - \nabla f(x)\cdot v \\
&=\nabla_v f(x)-\nabla f(x)\cdot v,
\end{align}</math>

so that <math>\nabla f(x)\cdot v=\nabla_v f(x)</math>.

Using only direction of vector

Figure: The angle α between the tangent A and the horizontal will be maximum if the cutting plane contains the direction of the gradient A.

In a Euclidean space, some authors define the directional derivative to be with respect to an arbitrary nonzero vector v after normalization, thus being independent of its magnitude and depending only on its direction.

This definition gives the rate of increase of f per unit of distance moved in the direction given by v. In this case, one has

<math display="block">\nabla_{\hat{\mathbf{v}}}{f}(\mathbf{x}) = \lim_{h \to 0}{\frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h\|\mathbf{v}\|}},</math>

or in case f is differentiable at x,

<math display="block">\nabla_{\hat{\mathbf{v}}}{f}(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \hat{\mathbf{v}} .</math>
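The difference between the two conventions can be seen in a small numerical experiment. The Python sketch below is an illustration under stated assumptions: the helper names, the sample function, and the vectors are invented for the example. It normalizes the vector before differencing, so rescaling the input vector leaves the result unchanged.

```python
import math


def directional_derivative(f, x, v, h=1e-6):
    # Central-difference estimate of d/dt f(x + t v) at t = 0.
    plus = [xi + h * vi for xi, vi in zip(x, v)]
    minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(plus) - f(minus)) / (2.0 * h)


def unit_directional_derivative(f, x, v, h=1e-6):
    """Rate of change of f per unit distance in the direction of v."""
    length = math.sqrt(sum(vi * vi for vi in v))
    if length == 0.0:
        raise ValueError("the direction vector must be nonzero")
    v_hat = [vi / length for vi in v]
    return directional_derivative(f, x, v_hat, h)


def f(p):
    # Sample function f(x, y) = x^2 + y^2, with gradient (2x, 2y).
    return p[0] ** 2 + p[1] ** 2


x = [1.0, 2.0]
v = [3.0, 4.0]  # length 5, so v_hat = (0.6, 0.8)

raw = directional_derivative(f, x, v)
rate = unit_directional_derivative(f, x, v)
rate_scaled = unit_directional_derivative(f, x, [10.0 * vi for vi in v])

print(raw)          # close to (2, 4) . (3, 4) = 22
print(rate)         # close to (2, 4) . (0.6, 0.8) = 4.4
print(rate_scaled)  # close to 4.4 again: only the direction matters
```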

Properties

Many of the familiar properties of the ordinary derivative hold for the directional derivative. These include, for any functions f and g defined in a neighborhood of, and differentiable at, p:

sum rule: <math display="block">\nabla_{\mathbf{v}} (f + g) = \nabla_{\mathbf{v}} f + \nabla_{\mathbf{v}} g.</math>
constant factor rule: For any constant c, <math display="block">\nabla_{\mathbf{v}} (cf) = c\nabla_{\mathbf{v}} f.</math>
product rule (or Leibniz's rule): <math display="block">\nabla_{\mathbf{v}} (fg) = g\nabla_{\mathbf{v}} f + f\nabla_{\mathbf{v}} g.</math>
chain rule: If g is differentiable at p and h is differentiable at g(p), then <math display="block">\nabla_{\mathbf{v}}(h\circ g)(\mathbf{p}) = h'(g(\mathbf{p})) \nabla_{\mathbf{v}} g (\mathbf{p}).</math>
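These rules can be spot-checked numerically. The Python sketch below is a sanity check rather than a proof. The functions `f`, `g`, the outer function sin, the point `p`, and the vector `v` are assumptions chosen for the example, and each derivative is estimated by a central difference.

```python
import math


def directional_derivative(func, x, v, h=1e-6):
    # Central-difference estimate of d/dt func(x + t v) at t = 0.
    plus = [xi + h * vi for xi, vi in zip(x, v)]
    minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (func(plus) - func(minus)) / (2.0 * h)


def f(q):
    return q[0] * q[1]


def g(q):
    return q[0] + q[1] ** 2


p = [0.5, 1.5]
v = [1.0, 2.0]


def D(func):
    # Directional derivative along the fixed vector v at the fixed point p.
    return directional_derivative(func, p, v)


# Each "gap" is |left-hand side - right-hand side| of the corresponding rule.
sum_gap = abs(D(lambda q: f(q) + g(q)) - (D(f) + D(g)))
factor_gap = abs(D(lambda q: 3.0 * f(q)) - 3.0 * D(f))
product_gap = abs(D(lambda q: f(q) * g(q)) - (g(p) * D(f) + f(p) * D(g)))
# Chain rule with outer function h = sin, so h' = cos.
chain_gap = abs(D(lambda q: math.sin(g(q))) - math.cos(g(p)) * D(g))

print(sum_gap, factor_gap, product_gap, chain_gap)  # all close to zero
```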

In differential geometry

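Let <math>M</math> be a differentiable manifold and <math>\mathbf{p}</math> a point of <math>M</math>. Suppose that <math>f</math> is a function defined in a neighborhood of <math>\mathbf{p}</math>, and differentiable at <math>\mathbf{p}</math>. If <math>\mathbf{v}</math> is a tangent vector to <math>M</math> at <math>\mathbf{p}</math>, then the directional derivative of <math>f</math> along <math>\mathbf{v}</math>, denoted variously as <math>df(\mathbf{v})</math> (see Exterior derivative), <math>\nabla_{\mathbf{v}} f(\mathbf{p})</math> (see Covariant derivative), <math>L_{\mathbf{v}} f(\mathbf{p})</math> (see Lie derivative), or <math>{\mathbf{v}}_{\mathbf{p}}(f)</math>, can be defined as follows. Let <math>\gamma</math> be a differentiable curve in <math>M</math> with <math>\gamma(0) = \mathbf{p}</math> and <math>\gamma'(0) = \mathbf{v}</math>. Then the directional derivative is defined by

<math display="block">\nabla_{\mathbf{v}} f(\mathbf{p}) = \left.\frac{d}{d\tau} f\circ\gamma(\tau)\right|_{\tau=0}.</math>

This definition can be proven independent of the choice of <math>\gamma</math>, provided <math>\gamma</math> is selected in the prescribed manner so that <math>\gamma(0) = \mathbf{p}</math> and <math>\gamma'(0) = \mathbf{v}</math>.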
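The independence from the choice of curve can be illustrated on the unit sphere. In the Python sketch below, the function `f`, the point p = (0, 0, 1), the tangent vector v = (1, 0, 0), and the two curves are assumptions chosen for the example. Both curves pass through p at τ = 0 with velocity v, and the derivative of f along each is estimated by a central difference.

```python
import math


def f(point):
    # Sample smooth function on R^3, restricted here to the unit sphere.
    x, y, z = point
    return x * z + x + 2.0 * y


def gamma_circle(tau):
    # Great circle through p = (0, 0, 1) with velocity (1, 0, 0) at tau = 0.
    return (math.sin(tau), 0.0, math.cos(tau))


def gamma_projected(tau):
    # The line p + tau * v projected radially back onto the sphere.
    r = math.sqrt(1.0 + tau * tau)
    return (tau / r, 0.0, 1.0 / r)


def derivative_along(curve, h=1e-5):
    # Central-difference estimate of d/dtau f(curve(tau)) at tau = 0.
    return (f(curve(h)) - f(curve(-h))) / (2.0 * h)


d_circle = derivative_along(gamma_circle)
d_projected = derivative_along(gamma_projected)

print(d_circle, d_projected)  # both close to 2
```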

The Lie derivative

The Lie derivative of a vector field <math> W^\mu(x)</math> along a vector field <math> V^\mu(x)</math> is given by the difference of two directional derivatives (with vanishing torsion):

<math display="block">\mathcal{L}_V W^\mu=(V\cdot\nabla) W^\mu-(W\cdot\nabla) V^\mu.</math>

In particular, for a scalar field <math> \phi(x)</math>, the Lie derivative reduces to the standard directional derivative:

<math display="block">\mathcal{L}_V \phi=(V\cdot\nabla) \phi.</math>
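The formula for vector fields can be evaluated numerically in the simplest setting. The Python sketch below assumes the flat Euclidean plane with Cartesian coordinates, so that the derivative operator acts as ordinary partial differentiation on components. The fields `V` and `W` and the helper names are assumptions chosen for the example.

```python
def directional_derivative(field, x, v, h=1e-5):
    # Componentwise central difference of a vector-valued field along v at x.
    plus = field([xi + h * vi for xi, vi in zip(x, v)])
    minus = field([xi - h * vi for xi, vi in zip(x, v)])
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]


def V(p):
    # Rotation field V = (-y, x).
    return [-p[1], p[0]]


def W(p):
    # Sample field W = (x^2, x*y).
    return [p[0] ** 2, p[0] * p[1]]


def lie_derivative(V, W, x):
    """(V . grad) W - (W . grad) V at the point x, in Cartesian coordinates."""
    v_dot_grad_w = directional_derivative(W, x, V(x))
    w_dot_grad_v = directional_derivative(V, x, W(x))
    return [a - b for a, b in zip(v_dot_grad_w, w_dot_grad_v)]


x = [1.0, 2.0]
result = lie_derivative(V, W, x)

print(result)  # close to [-2.0, -4.0], i.e. (-x*y, -y^2) at (1, 2)
```

For these fields the hand computation gives (V·∇)W = (−2xy, x² − y²) and (W·∇)V = (−xy, x²), whose difference is (−xy, −y²).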

The Riemann tensor

Directional derivatives are often used in introductory derivations of the Riemann curvature tensor. Consider a curved rectangle with an infinitesimal vector <math>\delta</math> along one edge and <math>\delta'</math> along the other. We translate a covector <math>S</math> along <math>\delta</math> then <math>\delta'</math> and then subtract the translation along <math>\delta'</math> and then <math>\delta</math>. Instead of building the directional derivative using partial derivatives, we use the covariant derivative. The translation operator for <math>\delta</math> is thus

<math display="block">1+\sum_\nu \delta^\nu D_\nu=1+\delta\cdot D,</math>

and for <math>\delta'</math>,

<math display="block">1+\sum_\mu \delta'^\mu D_\mu=1+\delta'\cdot D.</math>

The difference between the two paths is then

<math display="block">(1+\delta'\cdot D)(1+\delta\cdot D)S_\rho-(1+\delta\cdot D)(1+\delta'\cdot D)S_\rho=\sum_{\mu,\nu}\delta'^\mu \delta^\nu[D_\mu,D_\nu]S_\rho.</math>

It can be argued that the noncommutativity of the covariant derivatives measures the curvature of the manifold:

<math display="block">[D_\mu,D_\nu]S_\rho=\pm \sum_\sigma R^\sigma{}_{\rho\mu\nu}S_\sigma,</math>

where <math>R</math> is the Riemann curvature tensor and the sign depends on the sign convention of the author.

In group theory

Translations

In the Poincaré algebra, we can define an infinitesimal translation operator P as

<math display="block">\mathbf{P}=i\nabla.</math>

(The factor i ensures that P is a self-adjoint operator.) For a finite displacement λ, the unitary Hilbert space representation for translations is

<math display="block">U(\boldsymbol{\lambda})=\exp\left(-i\boldsymbol{\lambda}\cdot\mathbf{P}\right).</math>

By using the above definition of the infinitesimal translation operator, we see that the finite translation operator is an exponentiated directional derivative:

<math display="block">U(\boldsymbol{\lambda})=\exp\left(\boldsymbol{\lambda}\cdot\nabla\right).</math>

This is a translation operator in the sense that it acts on multivariable functions f(x) as

<math display="block">U(\boldsymbol{\lambda}) f(\mathbf{x})=\exp\left(\boldsymbol{\lambda}\cdot\nabla\right) f(\mathbf{x}) = f(\mathbf{x}+\boldsymbol{\lambda}).</math>
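For a polynomial the exponential series terminates, so the identity can be checked exactly, up to rounding. The Python sketch below treats the one-variable case, where λ·∇ reduces to λ d/dx. The coefficient-list representation and the function names are assumptions chosen for the example.

```python
from math import factorial


def poly_eval(coeffs, x):
    # coeffs[k] is the coefficient of x**k.
    return sum(c * x ** k for k, c in enumerate(coeffs))


def poly_derivative(coeffs):
    # Differentiate term by term; the constant term drops out.
    return [k * c for k, c in enumerate(coeffs)][1:]


def translate_by_series(coeffs, x, lam):
    """Evaluate sum_n (lam^n / n!) f^(n)(x), i.e. exp(lam d/dx) f at x."""
    total = 0.0
    current = list(coeffs)
    n = 0
    while current:  # the series stops once the polynomial is exhausted
        total += lam ** n / factorial(n) * poly_eval(current, x)
        current = poly_derivative(current)
        n += 1
    return total


coeffs = [1.0, -2.0, 0.0, 3.0]  # f(x) = 1 - 2x + 3x^3
x, lam = 0.5, 1.25

series_value = translate_by_series(coeffs, x, lam)
shifted_value = poly_eval(coeffs, x + lam)

print(series_value, shifted_value)  # both close to f(1.75) = 13.578125
```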

Rotations

The rotation operator also contains a directional derivative. The rotation operator for a rotation vector <math>\boldsymbol{\theta}</math>, i.e. a rotation by an angle <math>\theta = |\boldsymbol{\theta}|</math> about an axis parallel to <math> \hat{\theta} = \boldsymbol{\theta}/\theta</math>, is

<math display="block">U(R(\boldsymbol{\theta}))=\exp(-i\boldsymbol{\theta}\cdot\mathbf{L}).</math>

Here L is the vector operator that generates SO(3):

<math display="block">\mathbf{L}=\begin{pmatrix}
0& 0 & 0\\
0& 0 & 1\\
0& -1 & 0
\end{pmatrix}\mathbf{i}+\begin{pmatrix}
0 &0 & -1\\
0& 0 &0 \\
1 & 0 & 0
\end{pmatrix}\mathbf{j}+\begin{pmatrix}
0&1 &0 \\
-1&0 &0 \\
0 & 0 & 0
\end{pmatrix}\mathbf{k}.</math>

It may be shown geometrically that an infinitesimal right-handed rotation changes the position vector x by

<math display="block">\mathbf{x}\rightarrow \mathbf{x}-\delta\boldsymbol{\theta}\times\mathbf{x}.</math>

So we would expect under infinitesimal rotation:

<math display="block">U(R(\delta\boldsymbol{\theta})) f(\mathbf{x}) = f(\mathbf{x}-\delta\boldsymbol{\theta}\times\mathbf{x})=f(\mathbf{x})-(\delta\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla f.</math>

It follows that

<math display="block">U(R(\delta\boldsymbol{\theta}))=1-(\delta\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla.</math>

Following the same exponentiation procedure as above, we arrive at the rotation operator in the position basis, which is an exponentiated directional derivative:

<math display="block">U(R(\boldsymbol{\theta}))=\exp(-(\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla).</math>

Normal derivative

A normal derivative is a directional derivative taken in the direction normal (that is, orthogonal) to some surface in space, or more generally along a normal vector field orthogonal to some hypersurface. See for example Neumann boundary condition. If the normal direction is denoted by <math>\mathbf{n}</math>, then the normal derivative of a function f is sometimes denoted as <math display="inline">\frac{ \partial f}{\partial \mathbf{n}}</math>. In other notations,

<math display="block">\frac{ \partial f}{\partial \mathbf{n}} = \nabla f(\mathbf{x}) \cdot \mathbf{n} = \nabla_{\mathbf{n}}{f}(\mathbf{x}) = \frac{\partial f}{\partial \mathbf{x}} \cdot \mathbf{n} = Df(\mathbf{x})[\mathbf{n}].</math>
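As an illustration, the normal derivative on a sphere centred at the origin can be estimated numerically. In the Python sketch below, the function `f`, the point on the sphere of radius 3, and the helper names are assumptions chosen for the example. The outward unit normal at a point of such a sphere is the position vector divided by its length.

```python
import math


def directional_derivative(f, x, v, h=1e-6):
    # Central-difference estimate of d/dt f(x + t v) at t = 0.
    plus = [xi + h * vi for xi, vi in zip(x, v)]
    minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(plus) - f(minus)) / (2.0 * h)


def f(p):
    # Sample function f(x, y, z) = x*y + z, with gradient (y, x, 1).
    return p[0] * p[1] + p[2]


def outward_unit_normal(p):
    # Unit normal to a sphere centred at the origin, at the point p.
    r = math.sqrt(sum(c * c for c in p))
    return [c / r for c in p]


point = [1.0, 2.0, 2.0]  # lies on the sphere of radius 3
n = outward_unit_normal(point)
normal_derivative = directional_derivative(f, point, n)

print(normal_derivative)  # close to (2, 1, 1) . (1/3, 2/3, 2/3) = 2
```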

In the continuum mechanics of solids

Several important results in continuum mechanics require the derivatives of vectors with respect to vectors and of tensors with respect to vectors and tensors. The directional derivative provides a systematic way of finding these derivatives.

External links

Directional derivatives at MathWorld.
Directional derivative at PlanetMath.