Distance (graph theory)

thumb|upright=1.35|Distances in various graphs between selected vertices. Some have no defined distance (marked as infinite distance) because they are in different [[component (graph theory)|connected components]], or because edges in a directed graph cannot lead from the first to the second. The latter may occur even if the distance in the other direction between the same two vertices is defined.

In the mathematical field of graph theory, the distance between two vertices in a graph is the number of edges in a shortest path (also called a graph geodesic) connecting them. This is also known as the geodesic distance or shortest-path distance. Notice that there may be more than one shortest path between two vertices. If there is no path connecting the two vertices, i.e., if they belong to different connected components, then conventionally the distance is defined as infinite.
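The definition above can be illustrated with a breadth-first search, which visits vertices in order of increasing distance. The following is a minimal sketch, not a reference implementation; the adjacency-list dictionary `graph` and the function name `bfs_distances` are assumptions made for the example.

```python
from collections import deque
from math import inf

def bfs_distances(adj, source):
    """Return {vertex: number of edges on a shortest path from source}."""
    dist = {v: inf for v in adj}   # unreachable vertices keep distance infinity
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == inf:     # the first visit is along a shortest path
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Hypothetical example: a path a-b-c and a separate component x-y.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "x": ["y"], "y": ["x"],
}
distances_from_a = bfs_distances(graph, "a")
```

In this sketch `distances_from_a["c"]` is 2, while `distances_from_a["x"]` stays infinite because `x` lies in a different connected component.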

In the case of a directed graph, the distance <math>d(u, v)</math> between two vertices <math>u</math> and <math>v</math> is defined as the length of a shortest directed path from <math>u</math> to <math>v</math> consisting of arcs, provided at least one such path exists. In contrast with the case of undirected graphs, <math>d(u, v)</math> does not necessarily coincide with <math>d(v, u)</math>, so it is only a quasi-metric, and one of the two may be defined while the other is not.

Computational representation

Distances between vertices can be represented computationally by a distance matrix (also called the all-pairs shortest-path matrix). This is the square matrix <math>D = (d_{ij})</math>, where each entry <math>d_{ij}</math> is the length of a shortest path between the vertices <math>v_i</math> and <math>v_j</math>.
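As an illustration, the distance matrix of an unweighted graph can be filled in by running one breadth-first search per vertex. This is a sketch under the assumption that vertices are numbered 0 to n-1 and the graph is given as a list of adjacency lists; `distance_matrix` and `cycle` are names chosen for the example.

```python
from collections import deque
from math import inf

def distance_matrix(adj):
    """All-pairs distances of an unweighted graph; D[i][j] is d(v_i, v_j)."""
    n = len(adj)
    D = [[inf] * n for _ in range(n)]
    for s in range(n):
        D[s][s] = 0
        queue = deque([s])
        while queue:                      # plain BFS started at vertex s
            u = queue.popleft()
            for w in adj[u]:
                if D[s][w] == inf:
                    D[s][w] = D[s][u] + 1
                    queue.append(w)
    return D

# Hypothetical example: the 4-cycle 0-1-2-3-0.
cycle = [[1, 3], [0, 2], [1, 3], [0, 2]]
D = distance_matrix(cycle)
```

For an undirected graph the resulting matrix is symmetric with a zero diagonal; here opposite vertices of the cycle are at distance 2.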

center|342px|An example of a simple undirected graph and its corresponding distance matrix.

Distance matrices have applications in fields such as telecommunications and chemistry. In chemical graph theory, several topological indices that characterize molecular structure are derived from the distance matrix.

Beyond exact matrix representations, graph distances can also be represented using metric embeddings. This approach maps the vertices of a graph to points in a geometric space (such as Euclidean space) in a way that preserves the graph distances as closely as possible.

Directed graph distance

In a directed graph, edges have an assigned direction, meaning travel between vertices is not necessarily bidirectional. As a result, the distance from <math display="inline">u</math> to <math>v</math> may differ from the distance from <math>v</math> to <math>u</math>. This makes directed graph distance a quasi-metric rather than a true metric.

Formally, for vertices <math>u</math> and <math>v</math> in a directed graph <math>G</math>, it is not guaranteed that <math>d(u, v) = d(v, u)</math>, and <math>d(u, v)</math> may be undefined if no directed path from <math>u</math> to <math>v</math> exists.
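The asymmetry can be checked on a small example. The sketch below assumes a directed graph stored as a dictionary of successor lists; `directed_distance`, `cycle` and `one_way` are hypothetical names, and a missing path is reported as infinity.

```python
from collections import deque
from math import inf

def directed_distance(succ, source, target):
    """Length of a shortest directed path from source to target, or inf."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in succ.get(u, []):         # follow arcs in their direction only
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return inf

# Hypothetical directed 3-cycle A -> B -> C -> A.
cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
# Hypothetical graph with a single arc A -> B.
one_way = {"A": ["B"], "B": []}
```

In the cycle, `directed_distance(cycle, "A", "B")` is 1 but `directed_distance(cycle, "B", "A")` is 2. In `one_way`, the distance from `A` to `B` is 1 while the reverse direction has no path at all.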

center|277x277px|A directed graph with three vertices A, B, and C illustrating the asymmetric nature of distance in directed graphs: d(A,B) = 1 via a direct edge, while d(B,A) = 2 because the only path from B to A passes through C.

Related concepts

A metric space defined over a set of points in terms of distances in a graph defined over the set is called a graph metric.

The vertex set (of an undirected graph) and the distance function form a metric space if and only if the graph is connected.

The eccentricity <math>\epsilon(v)</math> of a vertex <math>v</math> is the greatest distance between <math>v</math> and any other vertex; in symbols,

:<math>\epsilon(v) = \max_{u \in V}d(v,u).</math>

It can be thought of as how far a node is from the node most distant from it in the graph.

The radius of a graph is the minimum eccentricity of any vertex or, in symbols,

:<math>r = \min_{v \in V} \epsilon(v) = \min_{v \in V}\max_{u \in V}d(v,u).</math>

The diameter <math>d</math> of a graph is the maximum eccentricity of any vertex in the graph. That is, <math>d</math> is the greatest distance between any pair of vertices or, alternatively,

:<math>d = \max_{v \in V}\epsilon(v) = \max_{v \in V}\max_{u \in V}d(v,u).</math>

To find the diameter of a graph, first find the shortest path between each pair of vertices. The greatest length of any of these paths is the diameter of the graph.

A central vertex in a graph of radius <math>r</math> is one whose eccentricity is <math>r</math>, that is, a vertex whose distance from its furthest vertex is equal to the radius; equivalently, a vertex <math>v</math> such that <math>\epsilon(v) = r</math>.

A peripheral vertex in a graph of diameter <math>d</math> is one whose eccentricity is <math>d</math>, that is, a vertex whose distance from its furthest vertex is equal to the diameter. Formally, <math>v</math> is peripheral if <math>\epsilon(v) = d</math>.
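The definitions of eccentricity, radius, diameter, central vertex and peripheral vertex translate directly into code. The sketch below assumes a connected undirected graph stored as a dictionary of adjacency lists; the helper names and the example `graph` are chosen for illustration only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distances from source to every reachable vertex (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricities(adj):
    """Eccentricity of every vertex; assumes the graph is connected."""
    return {v: max(bfs_distances(adj, v).values()) for v in adj}

# Hypothetical example: the path 0-1-2-3 with an extra leaf 4 attached to 1.
graph = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
ecc = eccentricities(graph)
radius = min(ecc.values())        # minimum eccentricity
diameter = max(ecc.values())      # maximum eccentricity
central = sorted(v for v in ecc if ecc[v] == radius)
peripheral = sorted(v for v in ecc if ecc[v] == diameter)
```

For this graph the radius is 2 and the diameter is 3; vertices 1 and 2 are central, and vertices 0, 3 and 4 are peripheral. Running one search per vertex is the simplest approach and is adequate for small graphs.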

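A pseudo-peripheral vertex <math>v</math> has the property that, for any vertex <math>u</math>, if <math>u</math> is as far away from <math>v</math> as possible, then <math>v</math> is as far away from <math>u</math> as possible. Formally, a vertex <math>v</math> is pseudo-peripheral if, for each vertex <math>u</math> with <math>d(u, v) = \epsilon(v)</math>, it holds that <math>\epsilon(u) = \epsilon(v)</math>.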
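The formal condition can be tested by brute force. The following sketch assumes a connected undirected graph given as a dictionary of adjacency lists; `is_pseudo_peripheral` and `path` are hypothetical names.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distances from source to every reachable vertex (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_pseudo_peripheral(adj, v):
    """Check that every vertex farthest from v has the same eccentricity as v."""
    dist_v = bfs_distances(adj, v)
    ecc_v = max(dist_v.values())
    for u, d in dist_v.items():
        if d == ecc_v and max(bfs_distances(adj, u).values()) != ecc_v:
            return False
    return True

# Hypothetical example: the path 0-1-2-3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On this path, the endpoint 0 is pseudo-peripheral (its farthest vertex 3 also has eccentricity 3), whereas vertex 1 is not: its farthest vertex 3 has eccentricity 3, which exceeds the eccentricity 2 of vertex 1.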

A level structure of the graph, given a starting vertex, is a partition of the graph's vertices into subsets by their distances from the starting vertex.
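A level structure falls out of a breadth-first search almost for free. The sketch below assumes an adjacency-list dictionary and returns a list whose entry `i` holds the vertices at distance `i` from the start; the names are illustrative.

```python
from collections import deque

def level_structure(adj, start):
    """Partition the vertices reachable from start into levels by distance."""
    dist = {start: 0}
    levels = [[start]]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                if dist[w] == len(levels):    # first vertex of a new level
                    levels.append([])
                levels[dist[w]].append(w)
                queue.append(w)
    return levels

# Hypothetical example graph.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = level_structure(graph, 0)   # [[0], [1, 2], [3], [4]]
```

The number of levels minus one equals the eccentricity of the starting vertex, provided the graph is connected.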

A geodetic graph is one for which every pair of vertices has a unique shortest path connecting them. For example, all trees are geodetic.
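One way to test this property on a small graph is to count shortest paths during a breadth-first search. The sketch below assumes a connected undirected graph as an adjacency-list dictionary; `count_shortest_paths`, `is_geodetic`, `star` and `square` are names introduced for the example.

```python
from collections import deque

def count_shortest_paths(adj, source):
    """Number of distinct shortest paths from source to each reachable vertex."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:    # u is a predecessor of w on a shortest path
                count[w] += count[u]
    return count

def is_geodetic(adj):
    """True if every pair has exactly one shortest path; assumes connectivity."""
    return all(c == 1 for s in adj for c in count_shortest_paths(adj, s).values())

# Hypothetical examples: a star (a tree) and a 4-cycle.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
```

The star is geodetic, as every tree is. The 4-cycle is not, because opposite vertices are joined by two shortest paths of length 2.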

The weighted shortest-path distance generalises the geodesic distance to weighted graphs. In this case it is assumed that the weight of an edge represents its length or, for complex networks, the cost of the interaction, and the weighted shortest-path distance between two vertices <math>u</math> and <math>v</math> is the minimum sum of weights across all the paths connecting <math>u</math> and <math>v</math>. See the shortest path problem for more details and algorithms.
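For non-negative edge weights, the weighted distance from one vertex to all others can be computed with Dijkstra's algorithm. The following is a minimal sketch, assuming the graph is a dictionary mapping each vertex to a list of `(neighbour, weight)` pairs; the example graph `roads` and its weights are made up.

```python
import heapq
from math import inf

def dijkstra(adj, source):
    """Weighted shortest-path distances from source; weights must be non-negative."""
    dist = {v: inf for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry, already improved
        for w, weight in adj[u]:
            nd = d + weight
            if nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

# Hypothetical weighted undirected graph.
roads = {
    "a": [("b", 7), ("c", 2)],
    "b": [("a", 7), ("c", 3), ("d", 1)],
    "c": [("a", 2), ("b", 3), ("d", 8)],
    "d": [("b", 1), ("c", 8)],
}
dist_from_a = dijkstra(roads, "a")
```

Here the direct edge from `a` to `b` has weight 7, but the route through `c` costs 2 + 3 = 5, so the weighted distance is 5 even though that path uses more edges.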

Algorithm for finding pseudo-peripheral vertices

Many sparse matrix and graph algorithms require a starting vertex of high eccentricity, defined as

<math display="block">\epsilon(v) = \max_{u \in V} d(v, u).</math>

A peripheral vertex, whose eccentricity equals the graph's diameter, would be ideal but is typically expensive to locate since it requires all-pairs distance information. A pseudo-peripheral vertex is a vertex whose eccentricity is close to the diameter; it can be found cheaply and serves as an effective substitute in practice.

The standard heuristic is the George–Liu algorithm:

1. Choose any vertex <math>u \in V</math>.
2. Among all vertices at maximum distance from <math>u</math>, let <math>v</math> be one of minimum degree.
3. If <math>\epsilon(v) > \epsilon(u)</math>, set <math>u := v</math> and return to step 2; otherwise <math>u</math> is a pseudo-peripheral vertex.

The same procedure in pseudocode, using breadth-first search (BFS) to compute eccentricities:

 function PseudoPeripheral(G = (V, E)):
     choose any u ∈ V
     loop
         BFS from u to obtain ε(u) and F = {w : d(u, w) = ε(u)}
         let v ← vertex in F of minimum degree
         BFS from v to obtain ε(v)
         if ε(v) > ε(u) then u ← v
         else return u
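A runnable version of the same heuristic might look as follows. This is a sketch under stated assumptions rather than the authors' implementation: the graph is connected and undirected, it is stored as an adjacency-list dictionary, ties in degree are broken arbitrarily, and the function and variable names are chosen for the example.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distances from source to every reachable vertex (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def pseudo_peripheral(adj, start):
    """Heuristic search for a vertex of high eccentricity (connected graph)."""
    u = start
    dist_u = bfs_distances(adj, u)
    ecc_u = max(dist_u.values())
    while True:
        # Vertices at maximum distance from u; pick one of minimum degree.
        farthest = [w for w, d in dist_u.items() if d == ecc_u]
        v = min(farthest, key=lambda w: len(adj[w]))
        dist_v = bfs_distances(adj, v)
        ecc_v = max(dist_v.values())
        if ecc_v > ecc_u:
            u, dist_u, ecc_u = v, dist_v, ecc_v   # strictly better: restart from v
        else:
            return u

# Hypothetical example: the path 0-1-2-3-4, started from its middle vertex.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = pseudo_peripheral(path, 2)
```

Starting from the middle vertex, whose eccentricity is 2, the loop moves to an endpoint of eccentricity 4 and stops there. Each iteration strictly increases the eccentricity, so the loop terminates after at most as many rounds as the diameter.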

Pseudo-peripheral vertices are most commonly used as starting points for bandwidth and profile reduction orderings of sparse matrices, particularly the reverse Cuthill–McKee method. Comparative evaluations of seven pseudoperipheral vertex finders have found that the George–Liu algorithm remains competitive with more recent alternatives on matrices with symmetric sparsity patterns.

Distance in trees

A tree is a connected, acyclic graph in which there is exactly one simple path between any pair of vertices. The distance d(u, v) is therefore the length of that unique path, unlike in general graphs, where several shortest paths may exist.

For a rooted tree, the distance between two vertices can be computed using their lowest common ancestor (LCA):

:<math>d(u, v) = \operatorname{depth}(u) + \operatorname{depth}(v) - 2\cdot\operatorname{depth}(\operatorname{LCA}(u,v))</math>

where depth(v) is the number of edges from the root to v.
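The formula can be sketched with a naive lowest-common-ancestor search that climbs parent pointers. The representation (a dictionary mapping each vertex to its parent, with the root mapped to `None`) and all names are assumptions for the example; faster LCA methods exist but are not shown here.

```python
def depth(parent, v):
    """Number of edges from the root to v."""
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d

def lca(parent, u, v):
    """Lowest common ancestor found by climbing towards the root."""
    du, dv = depth(parent, u), depth(parent, v)
    while du > dv:                        # lift the deeper vertex first
        u, du = parent[u], du - 1
    while dv > du:
        v, dv = parent[v], dv - 1
    while u != v:                         # then climb both in lockstep
        u, v = parent[u], parent[v]
    return u

def tree_distance(parent, u, v):
    a = lca(parent, u, v)
    return depth(parent, u) + depth(parent, v) - 2 * depth(parent, a)

# Hypothetical rooted tree: r has children a and b; a has children c and d; b has child e.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a", "e": "b"}
```

For the siblings `c` and `d` the LCA is `a`, giving 2 + 2 - 2·1 = 2. For `c` and `e` the LCA is the root, giving 2 + 2 - 0 = 4.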

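The diameter of a tree with at least two vertices is always attained between two leaf nodes and can be found in O(n) time using two breadth-first searches: the first, from an arbitrary vertex, finds a farthest vertex, and the second, started from that vertex, finds the diameter. The center, that is, the set of vertices of minimum eccentricity, is found by iteratively pruning leaf nodes until one or two vertices remain.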
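Both procedures can be sketched briefly. The code assumes the input is a tree stored as an adjacency-list dictionary; `farthest_from`, `tree_diameter`, `tree_center` and the example `tree` are illustrative names.

```python
from collections import deque

def farthest_from(adj, source):
    """Return (vertex, distance) for a vertex farthest from source."""
    dist = {source: 0}
    queue = deque([source])
    last = source
    while queue:
        u = queue.popleft()
        last = u                          # BFS dequeues in non-decreasing distance
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return last, dist[last]

def tree_diameter(adj):
    a, _ = farthest_from(adj, next(iter(adj)))   # first BFS: any start vertex
    _, diameter = farthest_from(adj, a)          # second BFS: from the far end
    return diameter

def tree_center(adj):
    """Prune leaves layer by layer until one or two vertices remain."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = len(adj)
    leaves = [v for v in adj if degree[v] <= 1]
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for w in adj[leaf]:
                degree[w] -= 1
                if degree[w] == 1:
                    next_leaves.append(w)
        leaves = next_leaves
    return sorted(leaves)

# Hypothetical tree: the path 0-1-2-3-4 with an extra leaf 5 attached to 2.
tree = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
```

For this tree the diameter is 4 (between leaves 0 and 4) and the center is the single vertex 2. A path on four vertices, by contrast, has two central vertices.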

Applications

Graph distance has practical applications across several domains of computer science and engineering, where the theoretical notion of shortest-path distance translates directly into real-world problem solving.

Navigation and GPS systems

Road networks can be modeled as weighted graphs, where intersections are vertices and roads are edges with weights corresponding to distance or travel time. Finding the shortest route between two locations, as performed by navigation applications such as Google Maps, is a direct application of weighted graph distance, which can be computed using Dijkstra's algorithm or one of its variants.

Network routing protocols

In computer networking, link-state protocols such as OSPF (Open Shortest Path First) model a network of routers as a weighted graph and use shortest-path distance to determine routes for data packets between routers.
