Nondeterministic finite automaton

In automata theory, a finite-state machine is called a deterministic finite automaton (DFA), if

each of its transitions is uniquely determined by its source state and input symbol, and
reading an input symbol is required for each state transition.

A nondeterministic finite automaton (NFA), or nondeterministic finite-state machine, does not need to obey these restrictions. In particular, every DFA is also an NFA. Sometimes the term NFA is used in a narrower sense, referring to an NFA that is not a DFA, but not in this article.

Using the subset construction algorithm, each NFA can be translated to an equivalent DFA; i.e., a DFA recognizing the same formal language.

Like DFAs, NFAs only recognize regular languages.

NFAs were introduced in 1959 by Michael O. Rabin and Dana Scott, who also showed their equivalence to DFAs. NFAs are used in the implementation of regular expressions: Thompson's construction is an algorithm for compiling a regular expression to an NFA that can efficiently perform pattern matching on strings. Conversely, Kleene's algorithm can be used to convert an NFA into a regular expression (whose size is generally exponential in the input automaton).

NFAs have been generalized in multiple ways, e.g., nondeterministic finite automata with ε-moves, finite-state transducers, pushdown automata, alternating automata, ω-automata, and probabilistic automata.

Besides the DFAs, other known special cases of NFAs

are unambiguous finite automata (UFA)

and self-verifying finite automata (SVFA).

Informal introduction

There are at least two equivalent ways to describe the behavior of an NFA. The first way makes use of the nondeterminism in the name of an NFA. For each input symbol, the NFA transitions to a new state until all input symbols have been consumed. In each step, the automaton nondeterministically "chooses" one of the applicable transitions. If there exists at least one "lucky run", i.e. some sequence of choices leading to an accepting state after completely consuming the input, it is accepted. Otherwise, i.e. if no choice sequence at all can consume all the input and lead to an accepting state, the input is rejected.

In the second way, the NFA consumes a string of input symbols, one by one. In each step, whenever two or more transitions are applicable, it "clones" itself into appropriately many copies, each one following a different transition. If no transition is applicable, the current copy is in a dead end, and it "dies". If, after consuming the complete input, any of the copies is in an accept state, the input is accepted, else, it is rejected.

Formal definition

For a more elementary introduction of the formal definition, see automata theory.

Automaton

An NFA is represented formally by a 5-tuple,

<math>(Q, \Sigma, \delta, q_0, F)</math>, consisting of

a finite set of states <math>Q</math>,
a finite set of input symbols called the alphabet <math>\Sigma</math>,
a transition function <math>\delta</math> : <math>Q\times \Sigma \rightarrow \mathcal{P}(Q)</math>,
an initial (or start) state <math>q_0 \in Q</math>, and
a set of accepting (or final) states <math>F \subseteq Q</math>.

Here, <math>\mathcal{P}(Q)</math> denotes the power set of <math>Q</math>.

Recognized language

Given an NFA <math>M = (Q, \Sigma, \delta, q_0, F)</math>, its recognized language is denoted by <math>L(M)</math>, and is defined as the set of all strings over the alphabet <math>\Sigma</math> that are accepted by <math>M</math>.

Loosely corresponding to the above informal explanations, there are several equivalent formal definitions of a string <math>w = a_1 a_2 ... a_n</math> being accepted by <math>M</math>:

<math>w</math> is accepted if a sequence of states, <math>r_0, r_1, ..., r_n</math>, exists in <math>Q</math> such that:
<math>r_0 = q_0</math>
<math>r_{i+1} \in \delta (r_i, a_{i+1})</math>, for <math>i = 0, \ldots, n-1</math>
<math>r_n \in F</math>.

:In words, the first condition says that the machine starts in the start state <math>q_0</math>. The second condition says that given each character of string <math>w</math>, the machine will transition from state to state according to the transition function <math>\delta</math>. The last condition says that the machine accepts <math>w</math> if the last input of <math>w</math> causes the machine to halt in one of the accepting states. In order for <math>w</math> to be accepted by <math>M</math>, it is not required that every state sequence ends in an accepting state, it is sufficient if one does. Otherwise, i.e. if it is impossible at all to get from <math>q_0</math> to a state from <math>F</math> by following <math>w</math>, it is said that the automaton rejects the string. The set of strings <math>M</math> accepts is the language recognized by <math>M</math> and this language is denoted by <math>L(M)</math>. If, when the automaton has finished reading, it is in an accepting state, the NFA is said to accept the string, otherwise it is said to reject the string.

The set of all strings accepted by an NFA is the language the NFA accepts. This language is a regular language.

For every NFA a deterministic finite automaton (DFA) can be found that accepts the same language. Therefore, it is possible to convert an existing NFA into a DFA for the purpose of implementing a (perhaps) simpler machine. This can be performed using the powerset construction, which may lead to an exponential rise in the number of necessary states. For a formal proof of the powerset construction, please see the Powerset construction article.

Implementation

There are many ways to implement an NFA:

Convert to the equivalent DFA. In some cases this may cause exponential blowup in the number of states.
Keep a set data structure of all states which the NFA might currently be in. On the consumption of an input symbol, unite the results of the transition function applied to all current states to get the set of next states; if ε-moves are allowed, include all states reachable by such a move (ε-closure). Each step requires at most s2 computations, where s is the number of states of the NFA. On the consumption of the last input symbol, if one of the current states is a final state, the machine accepts the string. A string of length n can be processed in time O(ns2), and space O(s).
Create multiple copies. For each n way decision, the NFA creates up to n−1 copies of the machine. Each will enter a separate state. If, upon consuming the last input symbol, at least one copy of the NFA is in the accepting state, the NFA will accept. (This, too, requires linear storage with respect to the number of NFA states, as there can be one machine for every NFA state.)
Explicitly propagate tokens through the transition structure of the NFA and match whenever a token reaches the final state. This is sometimes useful when the NFA should encode additional context about the events that triggered the transition. (For an implementation that uses this technique to keep track of object references have a look at Tracematches.)

Complexity

One can solve in linear time the emptiness problem for NFA, i.e., check whether the language of a given NFA is empty. To do this, we can simply perform a depth-first search from the initial state and check if some final state can be reached.
It is PSPACE-complete to test, given an NFA, whether it is universal, i.e., if there is a string that it does not accept. As a consequence, the same is true of the inclusion problem, i.e., given two NFAs, is the language of one a subset of the language of the other.
Given as input an NFA A and an integer n, the counting problem of determining how many words of length n are accepted by A is intractable; it is #P-hard. In fact, this problem is complete (under parsimonious reductions) for the complexity class SpanL.

Application of NFA

NFAs and DFAs are equivalent in that if a language is recognized by an NFA, it is also recognized by a DFA and vice versa. The establishment of such equivalence is important and useful. It is useful because constructing an NFA to recognize a given language is sometimes much easier than constructing a DFA for that language. It is important because NFAs can be used to reduce the complexity of the mathematical work required to establish many important properties in the theory of computation. For example, it is much easier to prove closure properties of regular languages using NFAs than DFAs.

Notes

References

(See §1.2: Nondeterminism, pp. 47–63.)
(See chapter 2, "Finite Automata".)