Figure: Depth perception shown in a two-dimensional image. Perspective, relative size, occultation and texture gradients all contribute to the three-dimensional appearance of the photo.
Depth perception is the ability to perceive distance to objects in the world using the visual system and visual perception. It is a major factor in perceiving the world in three dimensions.
Depth sensation is the corresponding term for non-human animals, since although it is known that they can sense the distance of an object, it is not known whether they perceive it in the same way that humans do.
Depth perception arises from a variety of depth cues. These are typically classified into binocular cues and monocular cues. Binocular cues are based on the receipt of sensory information in three dimensions by both eyes while monocular cues can be observed with just one eye. Binocular cues include retinal disparity, which exploits parallax and vergence. Stereopsis is made possible with binocular vision. Monocular cues include relative size (distant objects subtend smaller visual angles than near objects), texture gradient, occlusion, linear perspective, contrast differences, and motion parallax.
Monocular cues
Monocular cues provide depth information even when viewing a scene with only one eye.
Motion parallax
When an observer moves, the apparent relative motion of several stationary objects against a background gives hints about their relative distance. If the direction and velocity of the movement are known, motion parallax can provide absolute depth information. This effect can be seen clearly when riding in a car: nearby things pass quickly, while far-off objects appear stationary. Some animals that lack binocular vision because their eyes share little common field of view employ motion parallax more explicitly than humans for depth cueing (for example, some types of birds, which bob their heads to achieve motion parallax, and squirrels, which move in lines orthogonal to an object of interest to do the same).
Depth can also be perceived from the motion of the object itself. When an object moves toward the observer, its retinal projection expands over time. This dynamic stimulus change enables the observer not only to see the object as moving, but to perceive the distance of the moving object. Thus, in this context, the changing size serves as a distance cue. A related phenomenon is the visual system's capacity to calculate time-to-contact (TTC) of an approaching object from the rate of optical expansion, a useful ability in contexts ranging from driving a car to playing a ball game. However, the calculation of TTC is, strictly speaking, a perception of velocity rather than depth.
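The two quantities described above can be illustrated with a minimal sketch. It assumes an idealized geometry that is not part of the original text: the observer moves in a straight line at a known speed, the stationary object is directly to the side at that instant, and visual angles are small. The function names and the example values are hypothetical.

```python
def distance_from_motion_parallax(observer_speed_m_s, angular_velocity_rad_s):
    """Distance to a stationary object directly to the side of an observer
    moving in a straight line: the object sweeps past at v / d radians per
    second, so d = v / angular velocity."""
    if angular_velocity_rad_s <= 0:
        raise ValueError("angular velocity must be positive")
    return observer_speed_m_s / angular_velocity_rad_s


def time_to_contact(angular_size_rad, expansion_rate_rad_s):
    """Time-to-contact estimate tau = theta / (d theta / dt).

    Assumes small visual angles and a constant closing speed."""
    if expansion_rate_rad_s <= 0:
        raise ValueError("object is not approaching")
    return angular_size_rad / expansion_rate_rad_s


if __name__ == "__main__":
    # Hypothetical values: a car travelling at 25 m/s.
    for omega in (5.0, 0.05):
        print(omega, "rad/s ->", distance_from_motion_parallax(25.0, omega), "m")

    # A 0.22 m ball, 10 m away, approaching at 5 m/s.
    theta = 0.22 / 10.0                 # visual angle, small-angle approximation
    theta_dot = 0.22 * 5.0 / 10.0 ** 2  # rate of optical expansion
    print("time to contact:", time_to_contact(theta, theta_dot), "s")
```

Note that the time-to-contact estimate uses only the visual angle and its rate of change. Neither the object's physical size nor its distance needs to be known, which is consistent with the remark that TTC is not strictly a depth estimate.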
Kinetic depth effect
If a stationary rigid figure (for example, a wire cube) is placed in front of a point source of light so that its shadow falls on a translucent screen, an observer on the other side of the screen will see a two-dimensional pattern of lines. But if the cube rotates, the visual system will extract the necessary information for perception of the third dimension from the movements of the lines, and a cube is seen. This is an example of the kinetic depth effect. The effect also occurs when the rotating object is solid (rather than an outline figure), provided that the projected shadow consists of lines which have definite corners or end points, and that these lines change in both length and orientation during the rotation.
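The shadow setup can be simulated as a rough sketch. It assumes a point light on the z axis, a flat screen perpendicular to that axis, and rotation about the vertical axis; the distances and the chosen edge are hypothetical. The output shows that the shadow of a single cube edge changes in both length and orientation as the cube rotates, which is the condition named above.

```python
import math

LIGHT_DISTANCE = 5.0   # light sits at z = -5 (hypothetical units)
SCREEN_DISTANCE = 2.0  # screen is the plane z = +2


def shadow(point, angle):
    """Rotate a point about the vertical (y) axis, then project it from the
    point light onto the screen. Returns 2D screen coordinates."""
    x, y, z = point
    xr = x * math.cos(angle) + z * math.sin(angle)
    zr = -x * math.sin(angle) + z * math.cos(angle)
    # Central projection: points closer to the light cast larger shadows.
    scale = (LIGHT_DISTANCE + SCREEN_DISTANCE) / (LIGHT_DISTANCE + zr)
    return (xr * scale, y * scale)


def shadow_edge(p, q, angle):
    """Length and orientation (radians) of the shadow of the edge p-q."""
    (x1, y1), (x2, y2) = shadow(p, angle), shadow(q, angle)
    return math.hypot(x2 - x1, y2 - y1), math.atan2(y2 - y1, x2 - x1)


if __name__ == "__main__":
    edge = ((-1.0, -1.0, -1.0), (1.0, -1.0, -1.0))  # one edge of a wire cube
    for degrees in (0, 30, 60, 90):
        length, orientation = shadow_edge(edge[0], edge[1], math.radians(degrees))
        print(degrees, round(length, 3), round(math.degrees(orientation), 1))
```

A single frame of this output is just a flat line segment. Only the sequence of frames carries the information from which a rigid three-dimensional shape could be recovered.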
Perspective
Because parallel lines appear to converge in the distance, toward a vanishing point at infinity, the visual system can reconstruct the relative distance of two parts of an object, or of landscape features. An example is standing on a straight road, looking down it, and noticing that the road narrows as it recedes into the distance. Visual perception of perspective in real space, for instance in rooms, in settlements and in nature, results from several optical impressions and their interpretation by the visual system. The angle of vision is important for apparent size: a nearby object is imaged on a larger area of the retina, while the same object, or an object of the same size, farther away is imaged on a smaller area. Perspective can be perceived with one eye only, but stereoscopic vision enhances the impression of spatial depth. Regardless of whether the light rays entering the eye come from a three-dimensional space or from a two-dimensional image, they strike the retina, which is a surface. What a person sees is based on a reconstruction by the visual system, in which one and the same retinal image can be interpreted both two-dimensionally and three-dimensionally. Once a three-dimensional interpretation has been recognised, it is preferred and determines the perception.
<gallery>
Perspektivisches Sehen und Interpretation.png|Context-dependent interpretation of the size
08913-Perspective Run.jpg|Shots at different distances
Study in Vanishing Perspective.jpg|The horizon line is at the height of the armrests.
Spatial vision and perspective.jpg|View from a window on the 2nd floor of a house
Mountain panorama in France 3.jpg|Mountain peak near the snow line and several mountain peaks above the snow line
ISS-40 Sicily and Italy.jpg|Earth curvature
</gallery>
In spatial vision, the horizontal line of sight can play a role. In the picture taken from the window of a house, the horizontal line of sight is at the level of the second floor (yellow line). Below this line, the farther away objects are, the higher up in the visual field they appear. Above the horizontal line of sight, objects that are farther away appear lower than those that are closer. To represent spatial impressions in graphical perspective, one can use a vanishing point. Over long geographical distances, perspective effects result only partly from the angle of vision. In the fifth picture of the gallery, Mont Blanc, the highest mountain in the Alps, is in the background. It appears lower than the mountain in front of it in the center of the picture. Measurements and calculations can be used to determine how much of the subjectively perceived proportions is due to the curvature of the Earth.
Relative size
If two objects are known to be the same size (for example, two trees) but their absolute size is unknown, relative size cues can provide information about the relative depth of the two objects. If one subtends a larger visual angle on the retina than the other, the object which subtends the larger visual angle appears closer.
Familiar size
Since the visual angle of an object projected onto the retina decreases with distance, this information can be combined with previous knowledge of the object's size to determine the absolute depth of the object. For example, people are generally familiar with the size of an average automobile. This prior knowledge can be combined with information about the angle it subtends on the retina to determine the absolute depth of an automobile in a scene.
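The relation between known size, visual angle and distance can be written as a short sketch. It assumes the object is viewed face-on and centred on the line of sight; the car length of 4.5 m is an illustrative assumption, not a figure from the text.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))


def distance_from_familiar_size(known_size_m, angle_deg):
    """Invert the relation above: distance from a known size and visual angle."""
    half_angle = math.radians(angle_deg) / 2.0
    return known_size_m / (2.0 * math.tan(half_angle))


if __name__ == "__main__":
    car_length_m = 4.5  # assumed typical car length
    for angle in (20.0, 5.0, 1.0):
        print(angle, "deg ->", round(distance_from_familiar_size(car_length_m, angle), 1), "m")
```

The same visual angle is compatible with a small near object or a large far one, which is why the computation only yields an absolute distance once the size is known.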
Absolute size
Even if the actual size of the object is unknown and there is only one object visible, a smaller object seems farther away than a larger object presented at the same location.
Aerial perspective
Due to light scattering by the atmosphere, objects that are a great distance away have lower luminance contrast and lower color saturation. Due to this, images seem hazy the farther they are away from a person's point of view. In computer graphics, this is often called "distance fog". The foreground has high contrast; the background has low contrast. Objects differing only in their contrast with a background appear to be at different depths. The color of distant objects is also shifted toward the blue end of the spectrum (for example, distant mountains). Some painters, notably Cézanne, employ "warm" pigments (red, yellow and orange) to bring features forward towards the viewer, and "cool" ones (blue, violet, and blue-green) to indicate the part of a form that curves away from the picture plane.
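The loss of contrast with distance, and the "distance fog" used in computer graphics, can both be sketched with an exponential attenuation model. This is a common simplification and not necessarily the model any particular renderer uses; the extinction coefficient, fog density and colours below are hypothetical.

```python
import math


def apparent_contrast(intrinsic_contrast, distance_m, extinction_per_m):
    """Contrast remaining after light has travelled distance_m through a
    uniformly scattering atmosphere (exponential attenuation)."""
    return intrinsic_contrast * math.exp(-extinction_per_m * distance_m)


def fogged_color(object_rgb, fog_rgb, distance_m, density):
    """Exponential distance fog: blend the object colour toward the fog colour."""
    visibility = math.exp(-density * distance_m)
    return tuple(visibility * o + (1.0 - visibility) * f
                 for o, f in zip(object_rgb, fog_rgb))


if __name__ == "__main__":
    haze_blue = (0.6, 0.7, 0.9)     # hypothetical bluish haze colour
    dark_green = (0.1, 0.3, 0.1)    # hypothetical forested hillside
    for d in (0.0, 1000.0, 10000.0):
        print(d, [round(c, 2) for c in fogged_color(dark_green, haze_blue, d, 0.0002)])
```

With a bluish fog colour, distant surfaces drift toward blue and lose saturation, matching the description of distant mountains.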
Accommodation
Accommodation is an oculomotor cue for depth perception. When humans try to focus on distant objects, the ciliary muscles relax, allowing the eye lens to become thinner, which increases its focal length. The kinesthetic sensations of the contracting and relaxing ciliary muscles (intraocular muscles) are sent to the visual cortex, where they are used for interpreting distance and depth. Accommodation is only effective for distances of less than 2 meters; depth perception of more distant objects relies on other cues.
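One way to see why accommodation stops being informative beyond a couple of meters is to express the focusing demand in diopters, the reciprocal of the viewing distance in meters. The sketch below is only an illustration of that arithmetic; the function name is hypothetical.

```python
def accommodation_demand_diopters(distance_m):
    """Extra lens power (diopters) needed to focus at distance_m,
    relative to focusing at infinity."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m


if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 10.0, 100.0):
        print(d, "m ->", accommodation_demand_diopters(d), "D")
```

Everything from 2 m to infinity falls within a span of half a diopter, so the muscular signal changes very little over that entire range.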
Occultation
Occultation (also referred to as interposition) happens when near surfaces overlap far surfaces. If one object partially blocks the view of another object, humans perceive it as closer. However, this information only allows the observer to rank objects by relative nearness. Monocular ambient occlusions arise from an object's texture and geometry. These phenomena can reduce depth perception latency for both natural and artificial stimuli.
Curvilinear perspective
At the outer extremes of the visual field, parallel lines become curved, as in a photo taken through a fisheye lens. This effect, although it is usually eliminated from both art and photos by the cropping or framing of a picture, greatly enhances the viewer's sense of being positioned within a real, three-dimensional space. (Classical perspective has no use for this so-called "distortion", although in fact the "distortions" strictly obey optical laws and provide perfectly valid visual information, just as classical perspective does for the part of the field of vision that falls within its frame.)
Texture gradient
Fine details on nearby objects can be seen clearly, whereas such details are not visible on faraway objects. A texture gradient is the gradual change in the apparent grain of a surface with distance. For example, on a long gravel road, the shape, size and colour of the gravel near the observer can be clearly seen. In the distance, the road's texture cannot be clearly differentiated.
Lighting and shading
The way that light falls on an object and reflects off its surfaces, and the shadows that are cast by objects provide an effective cue for the brain to determine the shape of objects and their position in space.
Defocus blur
Selective image blurring is very commonly used in photography and video to establish the impression of depth. This can act as a monocular cue even when all other cues are removed. It may contribute to depth perception in natural retinal images, because the depth of focus of the human eye is limited. In addition, there are several depth estimation algorithms based on defocus and blurring. Some jumping spiders are known to use image defocus to judge depth.
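The link between defocus and distance can be sketched with the thin-lens blur-circle relation. This is an idealized optical model rather than any specific depth-from-defocus algorithm, and the lens values (a 50 mm lens with a 25 mm aperture, focused at 2 m) are hypothetical.

```python
def blur_circle_diameter(object_distance, focus_distance, focal_length, aperture_diameter):
    """Diameter of the blur circle on the sensor for a thin lens.

    All arguments are in the same length unit (meters here). The lens is
    focused at focus_distance; an object at object_distance is imaged as a
    disc of the returned diameter."""
    if focus_distance <= focal_length or object_distance <= 0:
        raise ValueError("distances must be positive and focus beyond the focal length")
    return (aperture_diameter * focal_length * abs(object_distance - focus_distance)
            / (object_distance * (focus_distance - focal_length)))


if __name__ == "__main__":
    for distance in (1.0, 2.0, 3.0, 4.0, 8.0):
        blur_mm = 1000.0 * blur_circle_diameter(distance, 2.0, 0.05, 0.025)
        print(distance, "m ->", round(blur_mm, 3), "mm")
```

Blur grows with distance from the focal plane on both sides of it, so the amount of blur alone does not say whether an object is nearer or farther than the point of focus.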
Elevation
When an object is visible relative to the horizon, humans tend to perceive objects which are closer to the horizon as being farther away from them, and objects which are farther from the horizon as being closer to them. In addition, if an object moves from a position close to the horizon to a position higher or lower than the horizon, it will appear to move closer to the viewer.
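For objects resting on the ground, the elevation cue has a simple geometric reading: the smaller the angle between the object's base and the horizon, the farther away it is. The sketch below assumes a flat ground plane and an eye height of 1.6 m, both of which are illustrative assumptions.

```python
import math


def ground_distance_from_declination(eye_height_m, declination_deg):
    """Distance along flat ground to a point seen declination_deg below the horizon."""
    if not 0.0 < declination_deg < 90.0:
        raise ValueError("declination must be between 0 and 90 degrees")
    return eye_height_m / math.tan(math.radians(declination_deg))


if __name__ == "__main__":
    for angle in (45.0, 10.0, 2.0, 0.5):
        print(angle, "deg below horizon ->",
              round(ground_distance_from_declination(1.6, angle), 1), "m")
```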
Ocular parallax
Ocular parallax is a perceptual effect where the rotation of the eye causes perspective-dependent image shifts. This happens because the optical center and the rotation center of the eye are not the same. Ocular parallax does not require head movement. It is separate and distinct from motion parallax.
Binocular cues
Binocular cues provide depth information when viewing a scene with both eyes.
Stereopsis, or retinal (binocular) disparity, or binocular parallax
Animals that have their eyes placed frontally can also use information derived from the different projections of objects onto each retina to judge depth. By using two images of the same scene obtained from slightly different angles, it is possible to triangulate the distance to an object with a high degree of accuracy. Each eye views an object from a slightly different angle because the eyes are horizontally separated; the resulting difference between the two retinal images is the binocular disparity. If an object is far away, the disparity of its image on the two retinas will be small. If the object is near, the disparity will be large. It is stereopsis that tricks people into thinking they perceive depth when viewing Magic Eye pictures, autostereograms, 3D movies, and stereoscopic photos.
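The triangulation idea has a direct analogue in computer vision. The sketch below uses the standard relation for two parallel, rectified pinhole cameras, which is a simplification of what converging eyes do; the focal length in pixels and the 6.5 cm baseline are hypothetical values.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by two parallel pinhole cameras.

    disparity_px is the horizontal shift of the point between the left and
    right images; depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means infinitely far)")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    for disparity in (52.0, 26.0, 5.2, 0.52):
        print(disparity, "px ->", round(depth_from_disparity(800.0, 0.065, disparity), 2), "m")
```

Large disparities correspond to near points and small disparities to far ones, as described above.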
Convergence
Convergence is a binocular oculomotor cue for distance and depth perception. Because of stereopsis, the two eyeballs focus on the same object; in doing so they converge. The convergence stretches the extraocular muscles; the receptors for this are muscle spindles. As with the monocular accommodation cue, kinesthetic sensations from these extraocular muscles also help in distance and depth perception. The angle of convergence is smaller when the eyes are fixating on objects that are far away. Convergence is effective for distances less than 10 meters.
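The geometry behind this cue can be sketched for a target straight ahead on the midline between the eyes. The interocular separation of 63 mm is an assumed typical value, and the function names are hypothetical.

```python
import math

INTEROCULAR_M = 0.063  # assumed separation between the eyes


def vergence_angle_deg(distance_m):
    """Angle (degrees) between the two lines of sight when fixating a target
    straight ahead at distance_m."""
    return math.degrees(2.0 * math.atan(INTEROCULAR_M / (2.0 * distance_m)))


def distance_from_vergence(angle_deg):
    """Invert the relation above: fixation distance from the vergence angle."""
    return INTEROCULAR_M / (2.0 * math.tan(math.radians(angle_deg) / 2.0))


if __name__ == "__main__":
    for d in (0.25, 1.0, 10.0, 100.0):
        print(d, "m ->", round(vergence_angle_deg(d), 3), "deg")
```

Beyond roughly 10 m the angle is a fraction of a degree and changes very slowly with distance, which fits the stated effective range.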
Shadow stereopsis
Antonio Medina Puerta demonstrated that retinal images with no parallax disparity but with different shadows were fused stereoscopically, imparting depth perception to the imaged scene. He named the phenomenon "shadow stereopsis". Shadows are therefore an important stereoscopic cue for depth perception.
Of these various cues, only convergence, accommodation and familiar size provide absolute distance information. All other cues are relative (as in, they can only be used to tell which objects are closer relative to others). Stereopsis is merely relative because a greater or lesser disparity for nearby objects could either mean that those objects differ more or less substantially in relative depth or that the foveated object is nearer or further away (the further away a scene is, the smaller is the retinal disparity indicating the same depth difference).
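The point in parentheses, that the same depth difference produces a smaller disparity in a more distant scene, can be checked numerically. The sketch assumes both points lie straight ahead on the midline and an interocular separation of 63 mm; both are illustrative assumptions.

```python
import math

INTEROCULAR_M = 0.063  # assumed separation between the eyes


def vergence_angle(distance_m):
    """Angle (radians) between the lines of sight for a midline target."""
    return 2.0 * math.atan(INTEROCULAR_M / (2.0 * distance_m))


def relative_disparity(fixation_distance_m, depth_step_m):
    """Angular disparity (radians) between a fixated point and a second
    point depth_step_m farther away along the midline."""
    return vergence_angle(fixation_distance_m) - vergence_angle(fixation_distance_m + depth_step_m)


if __name__ == "__main__":
    for fixation in (0.5, 1.0, 5.0, 20.0):
        arcmin = math.degrees(relative_disparity(fixation, 0.1)) * 60.0
        print("10 cm step at", fixation, "m ->", round(arcmin, 2), "arcmin")
```

Because a given disparity could come from a small depth step nearby or a large one far away, disparity on its own fixes only relative depth unless the fixation distance is known from another cue.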
Theories of evolution
The law of Newton–Müller–Gudden
Isaac Newton proposed that the optic nerve of humans and other primates has a specific architecture on its way from the eye to the brain. Nearly half of the fibres from the human retina project to the brain hemisphere on the same side as the eye from which they originate. That architecture is labelled hemi-decussation or ipsilateral (same sided) visual projections (IVP). In most other animals, these nerve fibres cross to the opposite side of the brain.
Bernhard von Gudden showed that the optic chiasm (OC) contains both crossed and uncrossed retinal fibers, and Ramon y Cajal observed that the grade of hemidecussation differs between species. The law states that the number of fibers that do not cross the midline is proportional to the size of the binocular visual field. However, a problem with the Newton–Müller–Gudden law is the considerable interspecific variation in IVP seen in non-mammalian species. That variation is unrelated to mode of life, taxonomic situation, and the overlap of visual fields.
Thus, the general hypothesis was long that the arrangement of nerve fibres in the optic chiasm in primates and humans developed primarily to create accurate depth perception (stereopsis): the eyes observe an object from somewhat dissimilar angles, and this difference in angle helps the brain to evaluate the distance.
The eye-forelimb (EF) hypothesis
The eye-forelimb (EF) hypothesis suggests that the need for accurate eye-hand control was key in the evolution of stereopsis. According to the EF hypothesis, stereopsis is an evolutionary spinoff from a more vital process: the construction of the optic chiasm and the position of the eyes (the degree of lateral or frontal direction) are shaped by evolution to help the animal coordinate its limbs (hands, claws, wings or fins).
The EF hypothesis postulates that it has a selective value to have short neural pathways between areas of the brain that receive visual information about the hand and the motor nuclei that control the coordination of the hand. The essence of the EF hypothesis is that evolutionary transformation in OC will affect the length and thereby speed of these neural pathways.
Having the primate type of OC means that motor neurons controlling or executing, say, right-hand movement, neurons receiving sensory (e.g. tactile) information about the right hand, and neurons obtaining visual information about the right hand will all be situated in the same (left) brain hemisphere. The reverse is true for the left hand: the processing of visual and tactile information and of motor commands all takes place in the right hemisphere. Cats and arboreal (tree-climbing) marsupials have analogous arrangements (between 30 and 45% of IVP and forward-directed eyes). The result is that visual information about their forelimbs reaches the proper (executing) hemisphere.
Evolution has resulted in small and gradual fluctuations in the direction of the nerve pathways in the OC. This transformation can go in either direction.
Snakes, cyclostomes and other animals that lack extremities have relatively many IVP. Notably, these animals have no limbs (hands, paws, fins or wings) to direct. Besides, the left and right body parts of snakelike animals cannot move independently of each other. For example, if a snake coils clockwise, its left eye sees only the left body part, and in an anticlockwise position the same eye sees just the right body part. For that reason, it is functional for snakes to have some IVP in the OC. Cyclostome descendants (in other words, most vertebrates) that ceased to curl and instead developed forelimbs would be favored by achieving completely crossed pathways, as long as the forelimbs were primarily occupied in a lateral direction. Reptiles such as snakes that lost their limbs would gain by reacquiring a cluster of uncrossed fibres in their evolution. That seems to have happened, providing further support for the EF hypothesis.
In art
Stereoscopes and Viewmasters, as well as 3D films, employ binocular vision by forcing the viewer to see two images created from slightly different positions (points of view). Charles Wheatstone was the first to discuss binocular disparity as a cue for depth perception. He invented the stereoscope, an instrument with two eyepieces that displays two photographs of the same scene taken from slightly different angles. When observed separately by each eye, the pairs of images induce a clear sense of depth. By contrast, a telephoto lens (used in televised sports, for example, to zero in on members of a stadium audience) has the opposite effect. The viewer sees the size and detail of the scene as if it were close enough to touch, but the camera's perspective is still derived from its actual position a hundred meters away, so background faces and objects appear about the same size as those in the foreground.
Trained artists are keenly aware of the various methods for indicating spatial depth (color shading, distance fog, perspective and relative size), and take advantage of them to make their works appear "real". The viewer feels it would be possible to reach in and grab the nose of a Rembrandt portrait or an apple in a Cézanne still life—or step inside a landscape and walk around among its trees and rocks.
Cubism was based on the idea of incorporating multiple points of view in a painted image, as if to simulate the visual experience of being physically in the presence of the subject and seeing it from different angles. The radical experiments of Georges Braque and Pablo Picasso, Jean Metzinger's Nu à la cheminée, Albert Gleizes's La Femme aux Phlox, and Robert Delaunay's views of the Eiffel Tower employ the explosive angularity of Cubism to exaggerate the traditional illusion of three-dimensional space. The subtle use of multiple points of view can be found in the pioneering late work of Cézanne, which both anticipated and inspired the first actual Cubists. Cézanne's landscapes and still lifes powerfully suggest the artist's own highly developed depth perception. At the same time, like the other Post-Impressionists, Cézanne had learned from Japanese art the significance of respecting the flat (two-dimensional) rectangle of the picture itself; Hokusai and Hiroshige ignored or even reversed linear perspective and thereby remind the viewer that a picture can only be "true" when it acknowledges the truth of its own flat surface. By contrast, European "academic" painting was devoted to a sort of Big Lie: that the surface of the canvas is only an enchanted doorway to a "real" scene unfolding beyond, and that the artist's main task is to distract the viewer from any disenchanting awareness of the presence of the painted canvas. Cubism, and indeed most of modern art, is an attempt to confront, if not resolve, the paradox of suggesting spatial depth on a flat surface, and to explore that inherent contradiction through innovative ways of seeing, as well as new methods of drawing and painting.
In robotics and computer vision
In robotics and computer vision, depth perception is often achieved using sensors such as RGB-D cameras, which record a depth value for each pixel alongside its colour.
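As a rough illustration of how a per-pixel depth value becomes a 3D position, the sketch below back-projects a pixel through a pinhole camera model. The intrinsic parameters (focal lengths `fx`, `fy` and principal point `cx`, `cy`) are hypothetical; real sensors supply their own calibration, and this is not the API of any particular camera library.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert a pixel (u, v) with measured depth into a 3D point (x, y, z)
    in the camera frame, using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def depth_image_to_points(depth_rows, fx, fy, cx, cy):
    """Back-project every pixel with a valid (positive) depth reading."""
    points = []
    for v, row in enumerate(depth_rows):
        for u, depth_m in enumerate(row):
            if depth_m > 0:  # zero is treated here as "no reading"
                points.append(backproject(u, v, depth_m, fx, fy, cx, cy))
    return points


if __name__ == "__main__":
    tiny_depth_image = [[2.0, 2.0], [0.0, 1.5]]  # meters, made-up values
    for point in depth_image_to_points(tiny_depth_image, 500.0, 500.0, 0.5, 0.5):
        print(point)
```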
See also
- Arboreal theory
- Cyclopean stimuli
- Optical illusion
- Orthoptics
- Peripheral vision
- Senses
- Vision therapy
- Visual cliff
- Vista paradox
External links
- Depth perception example | GO Illusions.
- Monocular Giants
- What is Binocular (Two-eyed) Depth Perception?
- Why Some People Can't See in Depth
- Space perception | Webvision.
- Depth perception | Webvision.
- Make3D.
- Depth Cues for Film, TV and Photography
