In compiler theory, a reaching definition for a given instruction is an earlier instruction whose target variable can reach (be assigned to) the given one without an intervening assignment. For example, in the following code:

d1 : y := 3

d2 : x := y

<code>d1</code> is a reaching definition for <code>d2</code>. In the following, example, however:

d1 : y := 3

d2 : y := 4

d3 : x := y

<code>d1</code> is no longer a reaching definition for <code>d3</code>, because <code>d2</code> kills its reach: the value defined in <code>d1</code> is no longer available and cannot reach <code>d3</code>.

As analysis

The similarly named reaching definitions is a data-flow analysis which statically determines which definitions may reach a given point in the code. Because of its simplicity, it is often used as the canonical example of a data-flow analysis in textbooks. The data-flow confluence operator used is set union, and the analysis is forward flow. Reaching definitions are used to compute use-def chains.

The data-flow equations used for a given basic block <math>S</math> in reaching definitions are:

  • <math>{\rm REACH}_{\rm in}[S] = \bigcup_{p \in pred[S]} {\rm REACH}_{\rm out}[p]</math>
  • <math>{\rm REACH}_{\rm out}[S] = {\rm GEN}[S] \cup ({\rm REACH}_{\rm in}[S] - {\rm KILL}[S])</math>

In other words, the set of reaching definitions going into <math>S</math> are all of the reaching definitions from <math>S</math>'s predecessors, <math>pred[S]</math>. <math>pred[S]</math> consists of all of the basic blocks that come before <math>S</math> in the control-flow graph. The reaching definitions coming out of <math>S</math> are all reaching definitions of its predecessors minus those reaching definitions whose variable is killed by <math>S</math> plus any new definitions generated within <math>S</math>.

For a generic instruction, we define the <math>{\rm GEN}</math> and <math>{\rm KILL}</math> sets as follows:

  • <math>{\rm GEN}[d : y \leftarrow f(x_1,\cdots,x_n)] = \{d\}</math> , a set of locally available definitions in a basic block
  • <math>{\rm KILL}[d : y \leftarrow f(x_1,\cdots,x_n)] = {\rm DEFS}[y] - \{d\}</math>, a set of definitions (not locally available, but in the rest of the program) killed by definitions in the basic block.

where <math>{\rm DEFS}[y]</math> is the set of all definitions that assign to the variable <math>y</math>. Here <math>d</math> is a unique label attached to the assigning instruction; thus, the domain of values in reaching definitions are these instruction labels.

Worklist algorithm

Reaching definition is usually calculated using an iterative worklist algorithm.

Input: control-flow graph CFG = (Nodes, Edges, Entry, Exit)

<syntaxhighlight lang="c">

// Initialize

for all CFG nodes n in N,

OUT[n] = emptyset; // can optimize by OUT[n] = GEN[n];

// put all nodes into the changed set

// N is all nodes in graph,

Changed = N;

// Iterate

while (Changed != emptyset)

{

choose a node n in Changed;

// remove it from the changed set

Changed = Changed -{ n };

// init IN[n] to be empty

IN[n] = emptyset;

// calculate IN[n] from predecessors' OUT[p]

for all nodes p in predecessors(n)

IN[n] = IN[n] Union OUT[p];

oldout = OUT[n]; // save old OUT[n]

// update OUT[n] using transfer function f_n ()

OUT[n] = GEN[n] Union (IN[n] -KILL[n]);

// any change to OUT[n] compared to previous value?

if (OUT[n] changed) // compare oldout vs. OUT[n]

{

// if yes, put all successors of n into the changed set

for all nodes s in successors(n)

Changed = Changed U { s };

}

}

</syntaxhighlight>

See also

  • Dead-code elimination
  • Loop-invariant code motion
  • Reachable uses
  • Static single assignment form

Further reading