Geometrical optics

Geometrical optics, or ray optics, is a model of optics that describes light propagation in terms of rays. The ray in geometrical optics is an abstraction useful for approximating the paths along which light propagates under certain circumstances.

The simplifying assumptions of geometrical optics include that light rays:

propagate in straight-line paths as they travel in a homogeneous medium
bend, and in particular circumstances may split in two, at the interface between two dissimilar media
follow curved paths in a medium in which the refractive index changes
may be absorbed or reflected.

Geometrical optics does not account for certain optical effects such as diffraction and interference, which are considered in physical optics. This simplification is useful in practice; it is an excellent approximation when the wavelength is small compared to the size of structures with which the light interacts. The techniques are particularly useful in describing geometrical aspects of imaging, including optical aberrations.

Explanation

[Figure: As light travels through space, it oscillates in amplitude. In this image, each maximum-amplitude crest is marked with a plane to illustrate the wavefront. The ray is the arrow perpendicular to these parallel surfaces.]

A light ray is a line or curve that is perpendicular to the light's wavefronts (and is therefore collinear with the wave vector).

A slightly more rigorous definition of a light ray follows from Fermat's principle, which states that the path taken between two points by a ray of light is the path that can be traversed in the least time.
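Fermat's principle can be checked numerically in a simple setting. The sketch below is only an illustration under stated assumptions: a flat interface at y = 0 between two homogeneous media, a ray travelling from a point above the interface to a point below it, and hypothetical helper names (`travel_time`, `fastest_crossing`). It searches for the crossing point that minimizes the travel time and compares the result with Snell's law, n1 sin θ1 = n2 sin θ2.

```python
import math

def travel_time(x, a, b, d, n1, n2, c=1.0):
    """Travel time from (0, a) to (d, -b), crossing the interface y = 0 at (x, 0).

    The ray moves at speed c/n1 above the interface and c/n2 below it.
    """
    path1 = math.hypot(x, a)
    path2 = math.hypot(d - x, b)
    return (n1 * path1 + n2 * path2) / c

def fastest_crossing(a, b, d, n1, n2, iterations=200):
    """Ternary search on [0, d]; valid because the travel time is convex in x."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, d, n1, n2) < travel_time(m2, a, b, d, n1, n2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Assumed example geometry and refractive indices (air-like to glass-like).
a, b, d = 1.0, 1.0, 2.0
n1, n2 = 1.0, 1.5

x = fastest_crossing(a, b, d, n1, n2)
sin_theta1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin_theta2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction

# The two products agree to within the accuracy of the search.
print(n1 * sin_theta1, n2 * sin_theta2)
```

The least-time path is found by brute-force search here rather than by calculus, which keeps the example independent of any particular derivation.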

Geometrical optics is often simplified by making the paraxial approximation, or "small angle approximation". The mathematical behavior then becomes linear, allowing optical components and systems to be described by simple matrices. This leads to the techniques of Gaussian optics and paraxial ray tracing, which are used to find basic properties of optical systems, such as approximate image and object positions and magnifications.
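The matrix description can be sketched in a few lines. The example below assumes the common convention in which a paraxial ray is the pair (height, angle), free-space propagation over a distance d is [[1, d], [0, 1]], and a thin lens of focal length f is [[1, 0], [-1/f, 1]]. The function names and the numerical values are illustrative choices, not taken from the text.

```python
def matmul(m1, m2):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def free_space(distance):
    """Ray transfer matrix for propagation through a homogeneous medium."""
    return ((1.0, distance), (0.0, 1.0))

def thin_lens(focal_length):
    """Ray transfer matrix for a thin lens (paraxial approximation)."""
    return ((1.0, 0.0), (-1.0 / focal_length, 1.0))

def propagate(matrix, ray):
    """Apply a ray transfer matrix to a ray given as (height, angle)."""
    (a, b), (c, d) = matrix
    y, theta = ray
    return (a * y + b * theta, c * y + d * theta)

f = 0.10    # assumed focal length in metres
s1 = 0.30   # assumed object distance in metres
s2 = 1.0 / (1.0 / f - 1.0 / s1)  # image distance from 1/s1 + 1/s2 = 1/f

# The rightmost matrix acts first: object -> lens -> image plane.
system = matmul(free_space(s2), matmul(thin_lens(f), free_space(s1)))

# At the image plane the upper-right element vanishes, so the final height
# does not depend on the initial angle; the upper-left element is the
# transverse magnification.
print("B element:", system[0][1])
print("magnification:", system[0][0])
print(propagate(system, (0.01, 0.02)), propagate(system, (0.01, -0.05)))
```

Because the system is linear, any number of lenses and gaps can be combined by multiplying their matrices in order.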

Reflection

[Figure: Diagram of specular reflection]

Glossy surfaces such as mirrors reflect light in a simple, predictable way. This allows for production of reflected images that can be associated with an actual (real) or extrapolated (virtual) location in space.

With such surfaces, the direction of the reflected ray is determined by the angle the incident ray makes with the surface normal, a line perpendicular to the surface at the point where the ray hits. The incident and reflected rays lie in a single plane, and the angle between the reflected ray and the surface normal is the same as that between the incident ray and the normal. This is known as the Law of Reflection.
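In vector form, the law of reflection is often written as r = d - 2(d·n)n, where d is the incident direction and n is the unit surface normal. The following minimal sketch assumes that form; the function names and the sample vectors are hypothetical. It confirms that the incident and reflected rays make equal angles with the normal.

```python
import math

def reflect(direction, normal):
    """Reflect a direction vector about a surface normal: r = d - 2 (d . n) n."""
    norm = math.sqrt(sum(c * c for c in normal))
    n = [c / norm for c in normal]                 # normalize the normal
    d_dot_n = sum(d * c for d, c in zip(direction, n))
    return [d - 2.0 * d_dot_n * c for d, c in zip(direction, n)]

def angle_to_normal(vector, normal):
    """Acute angle (radians) between the line of `vector` and the normal."""
    dot = sum(v * c for v, c in zip(vector, normal))
    len_v = math.sqrt(sum(v * v for v in vector))
    len_n = math.sqrt(sum(c * c for c in normal))
    return math.acos(abs(dot) / (len_v * len_n))

incident = [1.0, -1.0, 0.0]   # ray heading down and to the right
normal = [0.0, 1.0, 0.0]      # mirror lying in the plane y = 0

reflected = reflect(incident, normal)
print(reflected)
print(angle_to_normal(incident, normal), angle_to_normal(reflected, normal))
```

The reflected vector is a linear combination of d and n, so it lies in the plane of incidence, as the law requires.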

For flat mirrors, the law of reflection implies that images of objects are upright and the same distance behind the mirror as the objects are in front of the mirror. The image size is the same as the object size. (The magnification of a flat mirror is equal to one.) The law also implies that mirror images are parity inverted, which is perceived as a left-right inversion.

Mirrors with curved surfaces can be modeled by ray tracing and using the law of reflection at each point on the surface. For mirrors with parabolic surfaces, parallel rays incident on the mirror produce reflected rays that converge at a common focus. Other curved surfaces may also focus light, but with aberrations due to the diverging shape causing the focus to be smeared out in space. In particular, spherical mirrors exhibit spherical aberration. Curved mirrors can form images with magnification greater than or less than one, and the image can be upright or inverted. An upright image formed by reflection in a mirror is always virtual, while an inverted image is real and can be projected onto a screen.
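The difference between parabolic and spherical mirrors can be illustrated with a small two-dimensional ray trace. The sketch below assumes a mirror with its vertex at the origin, rays arriving parallel to the axis (travelling in the -y direction), a parabola y = x²/(4f), and a circle of radius R = 2f. All names are illustrative. It applies the law of reflection at the point of incidence and reports where each reflected ray crosses the axis.

```python
import math

def reflect(d, n):
    """Reflect 2D direction d about the (not necessarily unit) normal n."""
    norm = math.hypot(n[0], n[1])
    nx, ny = n[0] / norm, n[1] / norm
    dot = d[0] * nx + d[1] * ny
    return (d[0] - 2.0 * dot * nx, d[1] - 2.0 * dot * ny)

def axis_crossing(x0, y0, normal):
    """Height at which a ray travelling in -y, reflected at (x0, y0), meets x = 0."""
    rx, ry = reflect((0.0, -1.0), normal)
    t = -x0 / rx                      # parameter value where the ray reaches x = 0
    return y0 + t * ry

def parabola_crossing(x0, f):
    """Parabolic mirror y = x^2 / (4 f); the normal is the gradient of y - x^2/(4f)."""
    y0 = x0 * x0 / (4.0 * f)
    return axis_crossing(x0, y0, (-x0 / (2.0 * f), 1.0))

def sphere_crossing(x0, f):
    """Circular mirror of radius R = 2 f centred at (0, R); the normal points to the centre."""
    radius = 2.0 * f
    y0 = radius - math.sqrt(radius * radius - x0 * x0)
    return axis_crossing(x0, y0, (-x0, radius - y0))

f = 1.0  # assumed focal length
for x0 in (0.1, 0.5, 1.0):
    print(x0, parabola_crossing(x0, f), sphere_crossing(x0, f))
```

For the parabola every ray crosses the axis at the focus, while for the circle the crossing point moves toward the mirror as the ray height increases, which is the spherical aberration described above.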

[Figure: A ray tracing diagram for a simple converging lens]

A device which produces converging or diverging light rays due to refraction is known as a lens. Thin lenses produce focal points on either side that can be modeled using the lensmaker's equation. In general, two types of lenses exist: convex lenses, which cause parallel light rays to converge, and concave lenses, which cause parallel light rays to diverge. The detailed prediction of how images are produced by these lenses can be made using ray tracing, as for curved mirrors. Like curved mirrors, thin lenses follow a simple equation that determines the location of the image given a particular focal length (<math>f</math>) and object distance (<math>S_1</math>):

<math display="block">\frac{1}{S_1} + \frac{1}{S_2} = \frac{1}{f}</math>

where <math>S_2</math> is the distance associated with the image and is considered by convention to be negative if on the same side of the lens as the object and positive if on the opposite side of the lens.

Sommerfeld-Runge method

The Sommerfeld-Runge method obtains the equations of geometrical optics from the wave equation in the limit of vanishing wavelength. The derivation was based on an oral remark by Peter Debye. Consider a monochromatic scalar field <math>\psi(\mathbf{r},t)=\phi(\mathbf{r})e^{i\omega t}</math>, where <math>\psi</math> could be any of the components of the electric or magnetic field, and hence the function <math>\phi</math> satisfies the wave equation

<math display="block">\nabla^2\phi + k_o^2 n(\mathbf{r})^2 \phi =0</math>

where <math>k_o = \omega/c = 2\pi/\lambda_o</math> with <math>c</math> being the speed of light in vacuum. Here, <math>n(\mathbf{r})</math> is the refractive index of the medium. Without loss of generality, let us introduce <math>\phi = A(k_o,\mathbf{r}) e^{i k_o S(\mathbf{r})}</math> to convert the equation to

<math display="block">-k_o^2 A[(\nabla S)^2 - n^2] + 2 i k_o(\nabla S\cdot \nabla A) + ik_o A\nabla^2 S + \nabla^2 A =0.</math>

Since the underlying principle of geometrical optics lies in the limit <math>\lambda_o\sim k_o^{-1}\rightarrow 0</math>, the following asymptotic series is assumed,

<math display="block">A(k_o,\mathbf{r}) = \sum_{m=0}^\infty \frac{A_m(\mathbf{r})}{(ik_o)^m}</math>

For a large but finite value of <math>k_o</math>, the series diverges, and one has to be careful to keep only an appropriate number of leading terms. For each value of <math>k_o</math>, one can find an optimum number of terms to keep; adding more terms than this optimum may result in a poorer approximation. Substituting the series into the equation and collecting terms of different orders, one finds

<math display="block">\begin{align}
O(k_o^2): &\quad (\nabla S)^2 = n^2, \\[1ex]
O(k_o) : &\quad 2\nabla S\cdot \nabla A_0 + A_0\nabla^2 S =0, \\[1ex]
O(1): &\quad 2\nabla S\cdot \nabla A_1 + A_1\nabla^2 S =-\nabla^2 A_0,
\end{align}</math>

in general,

<math display="block">O(k_o^{1-m}):\quad 2\nabla S\cdot \nabla A_m + A_m\nabla^2 S =-\nabla^2 A_{m-1}.</math>

The first equation is known as the eikonal equation, which determines the eikonal <math>S(\mathbf{r})</math>. It is a Hamilton–Jacobi equation; written in Cartesian coordinates, for example, it becomes

<math display="block">\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2 = n^2.</math>

The remaining equations determine the functions <math>A_m(\mathbf{r})</math>.
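As a simple illustration (an assumed special case, not part of the derivation above), consider a homogeneous medium with constant <math>n</math> and a spherically symmetric solution, with <math>r = |\mathbf{r}|</math>. The choice <math>S = nr</math> satisfies the eikonal equation, since

<math display="block">\nabla S = n\,\hat{\mathbf{r}}, \qquad (\nabla S)^2 = n^2, \qquad \nabla^2 S = \frac{2n}{r}.</math>

The <math>O(k_o)</math> equation then reduces to an ordinary differential equation for <math>A_0(r)</math>:

<math display="block">2n\,\frac{dA_0}{dr} + \frac{2n}{r}\,A_0 = 0 \quad\Longrightarrow\quad A_0 = \frac{C}{r},</math>

where <math>C</math> is a constant. Because <math>\nabla^2 (1/r) = 0</math> for <math>r > 0</math>, the right-hand side of the <math>O(1)</math> equation vanishes and one may take <math>A_m = 0</math> for all <math>m \ge 1</math>. In this special case the leading term <math>\phi = C e^{i k_o n r}/r</math> is a spherical wave and solves the wave equation exactly away from the origin.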

Luneburg method

The method of obtaining equations of geometrical optics by analysing surfaces of discontinuities of solutions to Maxwell's equations was first described by Rudolf Karl Luneburg in 1944. It does not restrict the electromagnetic field to the special form required by the Sommerfeld-Runge method, which assumes that the amplitude <math>A(k_o,\mathbf{r})</math> and phase <math>S(\mathbf{r})</math> satisfy the equation <math display="inline">\lim_{k_o \to \infty} \frac{1}{k_o}\left(\frac{1}{A}\,\nabla S \cdot \nabla A + \frac{1}{2}\nabla^2 S\right) = 0</math>. This condition is satisfied by, for example, plane waves, but is not additive.

The main conclusion of Luneburg's approach is the following:

Theorem. Suppose the fields <math>\mathbf{E}(x, y, z, t)</math> and <math>\mathbf{H}(x, y, z, t)</math> (in a linear isotropic medium described by dielectric constants <math>\varepsilon(x, y, z)</math> and <math>\mu(x, y, z)</math>) have finite discontinuities along a (moving) surface in <math>\mathbf{R}^3</math> described by the equation <math>\psi(x, y, z) - ct = 0</math>. Then Maxwell's equations in the integral form imply that <math>\psi</math> satisfies the eikonal equation:

<math display="block">\psi_x^2 + \psi_y^2 + \psi_z^2 = \varepsilon\mu = n^2,</math>

where <math>n(x,y,z)</math> is the index of refraction of the medium (Gaussian units).

An example of such surface of discontinuity is the initial wave front emanating from a source that starts radiating at a certain instant of time.

The surfaces of field discontinuity thus become geometrical optics wave fronts with the corresponding geometrical optics fields defined as:

<math display="block">\begin{align}
\mathbf{E}^*(x, y, z) &= \mathbf{E}(x, y, z, \psi(x, y, z)/c) \\[1ex]
\mathbf{H}^*(x, y, z) &= \mathbf{H}(x, y, z, \psi(x, y, z)/c)
\end{align}</math>

Those fields obey transport equations consistent with the transport equations of the Sommerfeld-Runge approach. Light rays in Luneburg's theory are defined as trajectories orthogonal to the discontinuity surfaces and can be shown to obey Fermat's principle of least time, thus establishing the identity of those rays with the light rays of standard optics.

The above developments can be generalised to anisotropic media.

The proof of Luneburg's theorem is based on investigating how Maxwell's equations govern the propagation of discontinuities of solutions. The basic technical lemma is as follows:

A technical lemma. Let <math>\varphi(x, y, z, t) = 0</math> be a hypersurface (a 3-dimensional manifold) in spacetime <math>\mathbf{R}^4</math> on which one or more of: <math>\mathbf{E}(x, y, z, t)</math>, <math>\mathbf{H}(x, y, z, t)</math>, <math>\varepsilon(x, y, z)</math>, <math>\mu(x, y, z)</math>, have a finite discontinuity. Then at each point of the hypersurface the following formulas hold:

<math display="block">\begin{align}
\nabla\varphi \cdot [\varepsilon\mathbf{E}] &= 0 \\[1ex]
\nabla\varphi \cdot [\mu \mathbf{H}] &= 0 \\[1ex]
\nabla\varphi \times [\mathbf{E}] + \frac{1}{c} \, \varphi_t \, [\mu\mathbf{H}] &= 0 \\[1ex]
\nabla\varphi \times [\mathbf{H}] - \frac{1}{c} \, \varphi_t \, [\varepsilon\mathbf{E}] &= 0
\end{align}</math>

where the <math>\nabla</math> operator acts in the <math>xyz</math>-space (for every fixed <math>t</math>) and the square brackets denote the difference in values on both sides of the discontinuity surface (set up according to an arbitrary but fixed convention, e.g. the gradient <math>\nabla\varphi</math> pointing in the direction of the quantities being subtracted from).

Sketch of proof. Start with Maxwell's equations away from the sources (Gaussian units):

<math display="block">\begin{align}
\nabla \cdot \varepsilon\mathbf{E} &= 0 \\[1ex]
\nabla \cdot \mu \mathbf{H} &= 0 \\[1ex]
\nabla \times \mathbf{E} + \tfrac{\mu}{c} \, \mathbf{H}_t &= 0 \\[1ex]
\nabla \times \mathbf{H} - \tfrac{\varepsilon}{c} \, \mathbf{E}_t &= 0
\end{align}</math>

Using Stokes' theorem in <math>\mathbf{R}^4</math> one can conclude from the first of the above equations that for any domain <math>D</math> in <math>\mathbf{R}^4</math> with a piecewise smooth (3-dimensional) boundary <math>\Gamma</math> the following is true:

<math display="block">\oint_\Gamma (\mathbf{M} \cdot \varepsilon\mathbf{E}) \, dS = 0</math>

where <math>\mathbf{M} = (x_N, y_N, z_N)</math> is the projection of the outward unit normal <math>(x_N, y_N, z_N, t_N)</math> of <math>\Gamma</math> onto the 3D slice <math>t = \rm{const}</math>, and <math>dS</math> is the volume 3-form on <math>\Gamma</math>. Similarly, one establishes the following from the remaining Maxwell's equations:

<math display="block">\begin{align}
\oint_\Gamma \left(\mathbf{M} \cdot \mu\mathbf{H}\right) dS &= 0 \\[1.55ex]
\oint_\Gamma \left(\mathbf{M} \times \mathbf{E} + \frac{\mu}{c} \, t_N \, \mathbf{H}\right) dS &= 0 \\[1.55ex]
\oint_\Gamma \left(\mathbf{M} \times \mathbf{H} - \frac{\varepsilon}{c} \, t_N \, \mathbf{E}\right) dS &= 0
\end{align}</math>

Now by considering arbitrary small sub-surfaces <math>\Gamma_0</math> of <math>\Gamma</math> and setting up small neighbourhoods surrounding <math>\Gamma_0</math> in <math>\mathbf{R}^4</math>, and subtracting the above integrals accordingly, one obtains:

<math display="block">\begin{align}
\int_{\Gamma_0} (\nabla\varphi \cdot [\varepsilon\mathbf{E}]) \, {dS\over \|\nabla^{4D}\varphi\|} &= 0 \\[1ex]
\int_{\Gamma_0} (\nabla\varphi \cdot [\mu\mathbf{H}]) \, {dS\over \|\nabla^{4D}\varphi\|} &= 0 \\[1ex]
\int_{\Gamma_0} \left( \nabla\varphi \times [\mathbf{E}] + {1\over c} \, \varphi_t \, [\mu\mathbf{H}] \right) \, \frac{dS}{\|\nabla^{4D}\varphi\|} &= 0 \\[1ex]
\int_{\Gamma_0} \left( \nabla\varphi \times [\mathbf{H}] - {1\over c} \, \varphi_t \, [\varepsilon\mathbf{E}] \right) \, \frac{dS}{\|\nabla^{4D}\varphi\|} &= 0
\end{align}</math>

where <math>\nabla^{4D}</math> denotes the gradient in the 4D <math>xyzt</math>-space. And since <math>\Gamma_0</math> is arbitrary, the integrands must be equal to 0 which proves the lemma.

It is now easy to show that, as they propagate through a continuous medium, the discontinuity surfaces obey the eikonal equation. Specifically, if <math>\varepsilon</math> and <math>\mu</math> are continuous, then the discontinuities of <math>\mathbf{E}</math> and <math>\mathbf{H}</math> satisfy <math>[\varepsilon\mathbf{E}] = \varepsilon[\mathbf{E}]</math> and <math>[\mu\mathbf{H}] = \mu[\mathbf{H}]</math>. In this case the last two equations of the lemma can be written as:

<math display="block">\begin{align}
\nabla\varphi \times [\mathbf{E}] + {\mu\over c} \, \varphi_t \, [\mathbf{H}] &= 0 \\[1ex]
\nabla\varphi \times [\mathbf{H}] - {\varepsilon\over c} \, \varphi_t \, [\mathbf{E}] &= 0
\end{align}</math>

Taking the cross product of the second equation with <math>\nabla\varphi</math> and substituting the first yields:

<math display="block">\nabla\varphi \times (\nabla\varphi \times [\mathbf{H}]) - {\varepsilon\over c} \, \varphi_t \, (\nabla\varphi \times [\mathbf{E}]) = (\nabla\varphi \cdot [\mathbf{H}]) \, \nabla\varphi - \|\nabla\varphi\|^2 \, [\mathbf{H}] + {\varepsilon\mu\over c^2} \varphi_t^2 \, [\mathbf{H}] = 0</math>

The continuity of <math>\mu</math> and the second equation of the lemma imply: <math>\nabla\varphi \cdot [\mathbf{H}] = 0</math>, hence, for points lying on the surface <math>\varphi = 0</math> only:

<math display="block">\|\nabla\varphi\|^2 = {\varepsilon\mu\over c^2} \varphi_t^2</math>

(Notice that the presence of the discontinuity is essential in this step, as one would be dividing by zero otherwise.)
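The step from the two jump conditions to the relation between the gradient and the time derivative of the surface function can be sanity-checked numerically. The sketch below is an illustration only, with assumed values for the permittivity, permeability and the jump in H, and with hypothetical function names. It uses the second jump condition to define the jump in E, then evaluates the residual of the first condition, which vanishes only when the squared gradient equals εμ φ_t²/c².

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    return tuple(s * x for x in a)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def first_equation_residual(grad_phi, phi_t, jump_h, eps, mu, c=1.0):
    """Norm of grad(phi) x [E] + (mu/c) phi_t [H], with [E] taken from the
    second jump condition grad(phi) x [H] - (eps/c) phi_t [E] = 0."""
    jump_e = scale(c / (eps * phi_t), cross(grad_phi, jump_h))
    residual = add(cross(grad_phi, jump_e), scale(mu * phi_t / c, jump_h))
    return math.sqrt(sum(x * x for x in residual))

eps, mu, c = 2.25, 1.0, 1.0       # assumed medium, so n = sqrt(eps * mu) = 1.5
n = math.sqrt(eps * mu)
phi_t = -c                        # corresponds to phi = psi - c t
jump_h = (0.0, 1.0, 0.0)          # chosen perpendicular to grad(phi)

consistent = first_equation_residual((n, 0.0, 0.0), phi_t, jump_h, eps, mu, c)
inconsistent = first_equation_residual((1.0, 0.0, 0.0), phi_t, jump_h, eps, mu, c)
print(consistent, inconsistent)
```

Only the gradient of length n is compatible with a nonzero jump in the fields, which is the content of the relation obtained above.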

On physical grounds, one can assume without loss of generality that <math>\varphi</math> is of the following form:

<math>\varphi(x, y, z, t) = \psi(x, y, z) - ct</math>, i.e. a 2D surface moving through space, modelled as level surfaces of <math>\psi</math>. (Mathematically <math>\psi</math> exists if <math>\varphi_t \ne 0</math> by the implicit function theorem.)

The above equation written in terms of <math>\psi</math> becomes:

<math display="block">\|\nabla\psi\|^2 = {\varepsilon\mu\over c^2} \, (-c)^2 = \varepsilon\mu = n^2</math>

i.e.,

<math display="block">\psi_x^2 + \psi_y^2 + \psi_z^2 = n^2,</math>

which is the eikonal equation; it holds for all <math>x</math>, <math>y</math>, <math>z</math>, since the variable <math>t</math> is absent. Other laws of optics, such as Snell's law and the Fresnel formulae, can be obtained similarly by considering discontinuities in <math>\varepsilon</math> and <math>\mu</math>.
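As a simple illustration (an assumed special case), take a homogeneous medium with constant <math>n</math> and a plane discontinuity surface with unit normal <math>\hat{\mathbf{s}}</math>, a symbol introduced here only for this example:

<math display="block">\psi(\mathbf{r}) = n\,\hat{\mathbf{s}}\cdot\mathbf{r}, \qquad \nabla\psi = n\,\hat{\mathbf{s}}, \qquad \|\nabla\psi\|^2 = n^2.</math>

The surface <math>\varphi = \psi - ct = 0</math> is then the plane <math>\hat{\mathbf{s}}\cdot\mathbf{r} = ct/n</math>, which advances along <math>\hat{\mathbf{s}}</math> at speed <math>c/n</math>. The rays, being orthogonal to these planes, are straight lines parallel to <math>\hat{\mathbf{s}}</math>.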

General equation using four-vector notation

In four-vector notation used in special relativity, the wave equation can be written as

<math display="block">\frac{\partial^2 \psi}{\partial x_i\partial x^i} = 0</math>

and the substitution <math>\psi= A e^{iS / \varepsilon}</math> leads to

<math display="block">-\frac{A}{\varepsilon^2}\frac{\partial S}{\partial x_i} \frac{\partial S}{\partial x^i} + \frac{2i}{\varepsilon} \frac{\partial A}{\partial x_i} \frac{\partial S}{\partial x^i} + \frac{iA}{\varepsilon} \frac{\partial^2 S}{\partial x_i\partial x^i} + \frac{\partial^2 A}{\partial x_i\partial x^i} = 0. </math>

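Therefore, in the geometrical optics limit <math>\varepsilon \to 0</math> (here <math>\varepsilon</math> is a small expansion parameter, not the permittivity), where the term of order <math>1/\varepsilon^2</math> dominates, the eikonal equation is given by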

<math display="block">\frac{\partial S}{\partial x_i} \frac{\partial S}{\partial x^i} = 0.</math>

Once the eikonal is found by solving the above equation, the wave four-vector can be found from

<math display="block">k_i = - \frac{\partial S}{\partial x^i}.</math>
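As a brief illustration under simple assumptions (a plane wave in vacuum with a constant wave four-vector; the component notation <math>k^i = (\omega/c, \mathbf{k})</math> is introduced here only for this example), the eikonal is linear in the coordinates:

<math display="block">S = -k_i x^i \quad\Longrightarrow\quad \frac{\partial S}{\partial x^i} = -k_i .</math>

Substituting into the eikonal equation gives

<math display="block">k_i k^i = \frac{\omega^2}{c^2} - |\mathbf{k}|^2 = 0,</math>

that is, <math>\omega = c|\mathbf{k}|</math>, the dispersion relation of light in vacuum.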

External links

Feynman's lecture on Geometrical Optics
