Converting FA to Regular Expression
State Elimination Method
Soulimane Mammar
Discrete Event Systems Course
ENPO-MA
Converting FAs into equivalent Regular Expressions
We break this procedure into two parts, using a new type of finite automaton called a
generalized nondeterministic finite automaton, GNFA. First we show how to convert FAs
into GNFAs, and then GNFAs into regular expressions.
Generalized nondeterministic finite automata are simply nondeterministic finite automata
wherein the transition arrows may have any regular expressions as labels, instead of only
members of the alphabet or . The GNFA reads blocks of symbols from the input, not
necessarily just one symbol at a time as in an ordinary NFA. The GNFA moves along a
transition arrow connecting two states by reading a block of symbols from the input, which
themselves constitute a string described by the regular expression on that arrow. A GNFA
is nondeterministic and so may have several different ways to process the same input string.
It accepts its input if its processing can cause the GNFA to be in an accept state at the end
of the input. The following figure presents an example of a GNFA.
Figure 1: A generalized nondeterministic finite automaton
For convenience, we require that GNFAs always have a special form that meets the
following conditions.
• The start state has transition arrows going to every other state but no arrows coming
in from any other state.
• There is only a single accept state, and it has arrows coming in from every other state
but no arrows going to any other state. Furthermore, the accept state is not the same
as the start state.
• Except for the start and accept states, one arrow goes from every state to every other
state and also from each state to itself.
We can easily convert an FA into a GNFA in the special form. We simply add a new start
state with an arrow to the old start state and a new accept state with arrows from the
old accept states. If any arrows have multiple labels (or if there are multiple arrows going
between the same two states in the same direction), we replace each with a single arrow
whose label is the union of the previous labels. Finally, we add arrows labeled ∅ between
1
states that had no arrows. This last step won’t change the language recognized because a
transition labeled with ∅ can never be used. From here on we assume that all GNFAs are
in the special form.
Now we show how to convert a GNFA into a regular expression. Say that the GNFA has
k states. Then, because a GNFA must have a start and an accept state and they must be
different from each other, we know that k ≥ 2. If k > 2, we construct an equivalent GNFA
with k − 1 states. This step can be repeated on the new GNFA until it is reduced to two
states. If k = 2, the GNFA has a single arrow that goes from the start state to the accept
state. The label of this arrow is the equivalent regular expression. For example, the stages
in converting an FA with three states to an equivalent regular expression are shown in the
following figure.
Figure 2: Typical stages in converting a DFA to a regular expression
The crucial step is constructing an equivalent GNFA with one fewer state when k > 2.
We do so by selecting a state, ripping it out of the machine, and repairing the remainder so
that the same language is still recognized. Any state will do, provided that it is not the start
or accept state. We are guaranteed that such a state will exist because k > 2. Let’s call
the removed state qrip . After removing qrip we repair the machine by altering the regular
expressions that label each of the remaining arrows. The new labels compensate for the
absence of qrip by adding back the lost computations. The new label going from a state qi
to a state qj is a regular expression that describes all strings that would take the machine
from qi to qj either directly or via qrip . We illustrate this approach in Figure 3.
Figure 3: Constructing an equivalent GNFA with one fewer state
We make this change for each arrow going from any state qi to any state qj , including
the case where qi = qj . The new machine recognizes the original language.
Example
In this example, we use the preceding algorithm to convert an FA into a regular expression.
We begin with the two-state FA in Figure 4(a). In Figure 4(b), we make a four-state GNFA
by adding a new start state and a new accept state, called s and a instead of qstart and
qaccept so that we can draw them conveniently. To avoid cluttering up the figure, we do not
draw the arrows labeled ∅, even though they are present. Note that we replace the label a,
b on the self-loop at state 2 on the FA with the label a ∪ b at the corresponding point on
2
Figure 4: Converting a two-state DFA to an equivalent regular expression
the GNFA. We do so because the FA’s label represents two transitions, one for a and the
other for b, whereas the GNFA may have only a single transition going from 2 to itself.
In Figure 4(c), we remove state 2 and update the remaining arrow labels. In this case,
the only label that changes is the one from 1 to a. In part (b) it was ∅, but in part (c) it
is b(a ∪ b)∗ . Therefore, the new label on the arrow from 1 to a is (b)(a ∪ b)∗ () ∪ ∅. We
simplify this regular expression to b(a ∪ b)∗ . In Figure 4(d), we remove state 1 from part (c)
and follow the same procedure. Because only the start and accept states remain, the label
on the arrow joining them is the regular expression that is equivalent to the original FA.