REGULAR EXPRESSIONS AND LANGUAGES
Definition:
• A method of representing language as expression is known as
“regular expression”. That is, the language accepted by finite
state automata is easily described by simple expressions.
• The operators of Regular Expressions:
• Let ∑ be a finite set of symbols.
• Let L1,L2 be set of strings in ∑*
1. Union:
The union of two languages L and M denoted L∪M is the set of
strings that are in either L or M or both.
For example
if L={001,10,111} and M={ε,001} then
L∪M = {ε, 10,001,111}
[Link]:
• The concatenation of languages L and M is the set of
strings that can be formed by taking any string in L
and concatenating it with any string in M.
• Example1: If L={001,10,111} and M={ε,001},
then
L.M = {001,10,111,001001,10001,1110001}
• Example 2: If L1={10,01} and L2 = {11,00},
then
L1.L2 = {1011,1000,0111,0100}.
3. Closure:
L* = ⋃ Li ∞ 𝑖=0
Kleenex Closure or Closure of L:Li
• Example: If L1={0,11} then
L*= L0∪L1∪L2∪...
L0={ε}
L1={0,11}
L2={00,011,110,1111}
L3={000,0110,1100,11110,1100,11011,11110,111111}
• L+ = ⋃𝐿𝑖 ∞ 𝑖=0
Positive Closure of L
= L+ - L0
= L+ - {ε}
• Definition:
Let ∑ be an alphabet. The regular expressions over ∑ and
the sets that they denote are defined recursively as follows:
i. ϕ is a regular expression and denotes the empty set.
ii. ε is a regular expression and denotes the set {ε}.
iii. For each a ϵ ∑, ‘a’ is a regular expression and denotes the set
{a}.
iv. If r and s are regular expressions denoting the language R and S
respectively then (r+s), (rs), (r)* are regular expressions that
denotes the sets R∪S, RS and R respectively.
Example 1:
Write a regular expression for the language
accepting all combination of a’s over the set ∑ = {a}.
Solution:
All combinations of a’s means a may be single,
double, triple and so on.
There may be the case that ‘a’ is appearing for
zero times, which means a null string.
So L={ε,a,aa,aaa,aaaa,...}.
R=a*
Example 2:
Design the regular expression for the language
accepting all combinations of a’s except the null string
over ∑={a}.
Solution:
R=a+
Example 3:
Design the regular expression for the language
containing all the strings containing any number of a’s
and b’s.
• Solution:
• R = (a+b)*
Example 4:
Construct the regular expression for the language
accepting all the strings which are ending with 00 over
the set ∑={0,1}.
Solution:
R = (0+1)*00
Example 5:
Write the regular expression for the language accepting
the string which are starting with 1 and ending with 0
over the set ∑={0,1}.
Solution:
• R = 1(0+1)*0
• In order to find out a regular expression of a Finite Automaton, we
use Arden’s Theorem along with the properties of regular
expressions.
Statement −
• Let P and Q be two regular expressions.
• If P does not contain null string, then R = Q + RP has a unique
solution that is R = QP*
Proof −
R = Q + (Q + RP)P [After putting the value R = Q + RP]
= Q + QP + RPP
When we put the value of R recursively again and again, we get the
following equation −
R = Q + QP + QP2 + QP3…..
R = Q (ε + P + P2 + P3 + …. )
R = QP* [As P* represents (ε + P + P2 + P3 + ….) ]
Hence, proved.
Assumptions for Applying Arden’s Theorem
The transition diagram must not have NULL transitions
It must have only one initial state
Method
Step 1 − Create equations as the following form for all the states of
the DFA having n states with initial state q1.
q1 = q1R11 + q2R21 + … + qnRn1 + ε
q2 = q1R12 + q2R22 + … + qnRn2
…………………………
…………………………
qn = q1R1n + q2R2n + … + qnRnn
• Rij represents the set of labels of edges from qi to qj, if no such
edge exists, then Rij = ∅
• Step 2 − Solve these equations to get the equation for the final
state in terms of Rij
Problem
• Construct a regular expression corresponding to the automata given
below −
Solution −
Here the initial state and final state is [Link] equations for the three states
q1, q2, and q3 are as follows
q1 = q1a + q3a + ε (ε move is because q1 is the initial state0
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations −
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (Substituting value of q3)
= q1b + q2(b + ab)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (Substituting value of q3)
= q1a + q1b(b + ab*)aa + ε (Substituting value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε (a+ b(b + ab)*aa)*
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.
Problem
Construct a regular expression corresponding to the automata given below −
Solution −
Here the initial state is q1 and the final state is q2
Now we write down the equations −
q 1 = q 10 + ε
q2 = q11 + q20
q3 = q21 + q30 + q31
Now, we will solve these three equations −
q1 = ε0* [As, εR = R]
So, q1 = 0*
q2 = 0*1 + q20
So, q2 = 0*1(0)* [By Arden’s theorem]
Hence, the regular expression is 0*10*.
We can use Thompson's Construction to find out a Finite
Automaton from a Regular Expression. We will reduce the regular
expression into smallest regular expressions and converting these
to NFA and finally to DFA.
Some basic RA expressions are the following −
• Case 1 − For a regular expression ‘a’, we can construct the
following FA −
• Case 2 − For a regular expression ‘ab’, we can construct the
following FA −
Case 3 − For a regular expression (a+b), we can construct the
following FA
Case 4 − For a regular expression (a+b)*, we can
construct the following FA −
Method
• Step 1 Construct an NFA with Null moves from the given
regular expression.
• Step 2 Remove Null transition from the NFA and convert it
into its equivalent DFA.
• Problem
Convert the following RA into its equivalent DFA − 1 (0 + 1)* 0
• Solution
We will concatenate three expressions "1", "(0 + 1)*" and "0"
• Now we will remove the ε transitions. After we remove
the ε transitions from the NDFA, we get the following −
It is an NDFA corresponding to the RE − 1 (0 + 1)* 0. If you want
to convert it into a DFA, simply apply the method of converting
NDFA to DFA .
Finite Automata with Null Moves (NFA-ε)
A Finite Automaton with null moves (FA-ε) does transit not only
after giving input from the alphabet set but also without any input
symbol. This transition without input is called a null move
• An NFA-ε is represented formally by a 5-tuple (Q, ∑, δ, q0, F),
consisting of
• Q − a finite set of states
• ∑ − a finite set of input symbols
• δ − a transition function δ : Q × (∑ ∪ {ε}) → 2Q
• q0 − an initial state q0 ∈ Q
• F − a set of final state/states of Q (F⊆Q).
The above (FA-ε) accepts a string set − {0, 1, 01}
Removal of Null Moves from Finite Automata
• If in an NDFA, there is ϵ-move between vertex X to vertex Y,
we can remove it using the following steps −
• Find all the outgoing edges from Y.
• Copy all these edges starting from X without changing the
edge labels.
• If X is an initial state, make Y also an initial state.
• If Y is a final state, make X also a final state.
• Problem
• Convert the following NFA-ε to NFA without Null move
Solution
• Step 1 −
• Here the ε transition is between q1 and q2, so
let q1 is X and qf is Y.
• Here the outgoing edges from qf is to qf for inputs 0 and 1.
• Step 2 −
• Now we will Copy all these edges from q1 without changing
the edges from qf and get the following FA −
Step 3 −
• Here q1 is an initial state, so we make qf also an initial state.
• So the FA becomes −
Step 4 −
• Here qf is a final state, so we make q1 also a
final state.
• So the FA becomes −
Pumping Lemma Theorem
Let L be a regular language. Then there exists a
constant ‘c’ such that for every string w in L −
• |w| ≥ c
• We can break w into three strings, w = xyz, such that −
• |y| > 0
• |xy| ≤ c
For all k ≥ 0, the string xykz is also in L.
Applications of Pumping Lemma
• Pumping Lemma is to be applied to show that certain
languages are not regular. It should never be used to show a
language is regular.
• If L is regular, it satisfies Pumping Lemma.
• If L does not satisfy Pumping Lemma, it is non-regular.
Method to prove that a language L is not regular
At first, we have to assume that L is [Link], the pumping
lemma should hold for L.
Use the pumping lemma to obtain a contradiction −
– Select w such that |w| ≥ c
– Select y such that |y| ≥ 1
– Select x such that |xy| ≤ c
– Assign the remaining string to z.
– Select k such that the resulting string is not in L.
Hence L is not regular.
Problem
Prove that L = {aibi | i ≥ 0} is not regular.
Solution −
At first, we assume that L is regular and n is the number of
states.
• Let w = anbn. Thus |w| = 2n ≥ n.
By pumping lemma, let w = xyz, where |xy| ≤ n.
• Let x = ap, y = aq, and z = arbn, where p + q + r = n, p ≠ 0, q
≠ 0, r ≠ 0. Thus |y| ≠ 0.
• Let k = 2. Then xy2z = apa2qarbn.
• Number of as = (p + 2q + r) = (p + q + r) + q = n + q
Hence, xy2z = an+q bn. Since q ≠ 0, xy2z is not of the form anbn.
Thus, xy2z is not in L. Hence L is not regular.