0% found this document useful (0 votes)
8 views33 pages

Understanding Alphabets and Automata

The document covers fundamental concepts in formal languages, including alphabets, strings, deterministic and non-deterministic finite automata, regular expressions, and grammars. It explains the properties and applications of these concepts, such as the Pumping Lemma for proving languages are not regular and techniques for reducing ambiguity in grammars. Additionally, it discusses Chomsky Normal Form and Greibach Normal Form in the context of context-free grammars.

Uploaded by

jeren2167
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views33 pages

Understanding Alphabets and Automata

The document covers fundamental concepts in formal languages, including alphabets, strings, deterministic and non-deterministic finite automata, regular expressions, and grammars. It explains the properties and applications of these concepts, such as the Pumping Lemma for proving languages are not regular and techniques for reducing ambiguity in grammars. Additionally, it discusses Chomsky Normal Form and Greibach Normal Form in the context of context-free grammars.

Uploaded by

jeren2167
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Module 1

===============================================
i) Alphabet (Σ):An alphabet is a finite, non-empty set of symbols used to
construct strings.
# An alphabet contains symbols, not strings. # The set must be finite
(limited number of symbols). # Alphabets are the basic building blocks for
strings and languages. # Examples: Σ = {0, 1} , Σ = {a, b, c} , Σ = {x, y}
ii) Power of Alphabet (Σᵏ): For an alphabet Σ, Σᵏ is the set of all strings of
length k formed using symbols from Σ.
# Each string in Σᵏ has exactly k symbols. # Σ⁰ = {ε}, where ε is the empty
string. # Σ* represents strings of all possible lengths, including ε. #
Example: Let Σ = {0, 1} , Σ¹ = {0, 1} Σ² = {00, 01, 10, 11} Σ³={000, 001,
010, 011, 100, 101, 110, 111}
iii) Language (L): A language is a set of strings formed from an alphabet
Σ, i.e., L ⊆Σ
¿

# A language may be finite or infinite. # Languages consist of valid strings


over an alphabet. # Formal languages are studied using automata and
grammars. # Examples: Σ = {0, 1}, L = {00, 11} # Σ = {a, b}, L = {a, ab,
abb} # L = { w | w has even number of 0’s }
iv)String: A string (also called a word) is a finite sequence of symbols
chosen from a given alphabet.
#A string is formed by arranging symbols from an alphabet in a specific
order.
#The sequence must be finite in length. # Strings are the basic objects
studied in formal languages. # The empty string, denoted by ε, contains
zero symbols.
Examples: If Σ={0 ,1 }, then 01101 and 111 are valid strings. If Σ={a , b }, then ab,
aab, bba are strings.---------------------------------------------------------
Deterministic Finite Automaton (DFA) : A Deterministic Finite
Automaton (DFA) is a finite state machine that accepts or rejects strings of
symbols by moving through a unique sequence of states determined by the
input symbols.
Formal Definition: A DFA is defined as a 5-tuple: A=¿)
1. Q – A finite, non-empty set of states.
Each state represents the current status of the automaton.
2. Σ – A finite set of input symbols called the alphabet.
3. δ – A transition function , δ :Q × Σ → Q
For every state and input symbol, exactly one transition is defined.
4. q₀ – The start state, where the automaton begins processing the input
string (q ∈ Q).
0

5. F – A set of accepting (final) states, where F ⊆ Q .


Key Characteristics of a DFA
 For each state and input symbol, there is exactly one next state.
 No ε-moves (no transitions without input).
 The automaton either accepts or rejects an input string after reading it
completely.
Acceptance Condition
A string is accepted by the DFA if, after processing the entire input string,
the automaton reaches any state in F; otherwise, it is
rejected.------------------------
Non-Deterministic Finite Automaton (NFA): A Non-Deterministic
Finite Automaton (NFA) is a finite automaton in which for a given state
and input symbol, the next state may be one, many, or none.
An NFA is defined as a 5-tuple: A=(Q , Σ ,δ , q , F )
0

1. Q – A finite, non-empty set of states.


Each state represents a possible current configuration.
2. Σ – A finite set of input symbols (alphabet).
3. δ – A transition function , δ :Q ×( Σ ∪ {ε })→ 2
Q

This means:
o Multiple transitions may exist for the same input.
o ε-transitions (moves without consuming input) are allowed.
4. q₀ – The start state, where computation begins
(q ∈ Q).
0

5. F – A set of accepting (final) states, where


F ⊆ Q.
Key Characteristics of an NFA
 May have more than one transition for the same state and input
symbol.
 ε-moves are allowed.
 Multiple computation paths can exist for a single input string.
Acceptance Condition
A string is accepted by an NFA if at least one computation path ends in
an accepting state after the entire input is processed.

M-5 conti…

• Finite Tape: The amount of memory available to the machine is a linear


function of the length of the input.
• Boundaries: The transition function is defined such that the tape head cannot
move to the left of the left endmarker or to the right of the right endmarker.
• Computational Power: While it has restricted memory, an LBA can still
solve complex problems that finite automata or pushdown automata cannot,
such as recognizing the language .
Module 2
=====================================================
Regular Expressions: A Regular Expression (RE) is a formal notation
used to describe regular languages using symbols and operators.
Formal Definition: Let Σ be an alphabet. A regular expression over Σ is
defined recursively as follows:
Basic Regular Expressions: #∅ is a regular expression denoting the empty
a ∈ Σ, a is a regular expression denoting {a}.
language. # ε is a regular expression denoting the language {ε}. # For each
Inductive Rules: # If r and s are regular expressions, then: # (r + s) denotes
union → L(r )∪ L(s) # (rs) denotes concatenation → L(r )L(s) # (r*) denotes
Kleene star → zero or more repetitions of L(r )
Key Points: Regular expressions describe patterns of strings. # Every
regular expression represents a regular language. # Regular expressions are
equivalent to DFAs and NFAs in expressive power.
Examples
 (a + b)* → all strings over {a, b}
 a*b → strings with zero or more a’s followed by one b
 (0 + 1)*01 → all binary strings ending with 01
-------------------------------------------------------------------------------------------
Applications of Regular Expressions (RE): Regular Expressions are
widely used in computer science and software applications for pattern
matching and text processing.
# Lexical Analysis: Used in compilers to identify tokens such as keywords,
identifiers, operators, and constants. # Text Searching: Used in editors and
tools (e.g., grep, find) to search for specific patterns in text files. # String
Validation: Used to validate input formats such as email IDs, phone
numbers, passwords, and URLs. # Syntax Highlighting: Used in code
editors to highlight keywords, comments, and strings. # Data Extraction:
Helps in extracting required information from large text data (logs,
documents, web pages).
# Search and Replace Operations: Used to find patterns and replace them
with other strings efficiently. # Network Security: Used in intrusion
detection systems and firewalls for detecting malicious patterns.
------------------------------------------------------------------------------------------
Pumping Lemma for Regular Languages:
Let L be a regular language. Then there exists a constant n ≥ 1 (called the
pumping length) such that every string
w ∈ Lwith ∣ w ∣≥ n can be written as: w=xyz
satisfying the following conditions:
1. y ≠ ε
2. ∣ xy ∣≤ n
3. For all k ≥ 0, x y z ∈ L
k
That is, the string y can be pumped (repeated any number of times,
including zero) and the resulting string still belongs to L.
Proof:
1. Since L is a regular language, there exists a DFA A=(Q , Σ ,δ , q , F ) 0

that recognizes L.
2. Let the DFA have n states, i.e., ∣Q ∣=n.
3. Consider any string w=a a ⋯ a ∈ Lwhere m ≥n.
1 2 m

4. As the DFA processes the input, define the sequence of states:


p0=q0 , p i=δ (q 0 ,a 1 a 2 ⋯ ai ), i=1, 2 , … , n
5. There are n+1 states p0 , p 1 , … , pnbut
only n states in the DFA.
By the Pigeonhole Principle, at least two states must be equal.
Hence, there exist integers iand j such that:0 ≤ i< j≤ n and p = pi j

6. Split the string w into three parts:


o x=a 1 a 2 ⋯ ai
o y=ai+1 ai +2 ⋯ a j
o z=a j +1 a j+ 2 ⋯ am
7. From this construction:
o y ≠ ε because j>i
o ∣ xy ∣= j ≤ n
o Reading y takes the DFA from state p back to p , forming a loop
i i

8. Therefore, for any k ≥ 0, the DFA can repeat this loop k times and still
reach an accepting state after reading z.
9. Hence, x y z ∈ L for all k ≥ 0
k

-------------------------------------------------------------------------------------------
Applications of the Pumping Lemma:
# Proving a Language is Not Regular :The primary application is to show
that certain languages cannot be accepted by any DFA. # Distinguishing
Regular and Non-Regular Languages : Helps differentiate languages that
look similar but have different computational power. # Showing
Limitations of Finite Automata : Demonstrates that DFAs have finite
memory and cannot count arbitrarily. # Theoretical Validation Tool :
Used in formal proofs to justify that some patterns cannot be represented
by regular expressions or DFAs. # Automata Theory Education: Helps in
understanding the structure of regular languages and the role of loops in
DFAs. # Proof by Contradiction: Often applied by assuming a language is
regular and then showing it violates pumping conditions.

# Common Languages Proven Non-Regular Using Pumping Lemma


 { 0 n 1n ∣n ≥ 1 } ¿
 { ww ∣ w ∈{0 ,1 } }
 { an bn ∣ n ≥ 0 }
Module 3=================================================
Ambiguous Grammar: A context-free grammar (CFG) is said to be
ambiguous if there exists at least one string in the language that has more than
one distinct parse tree (or more than one leftmost/rightmost derivation).
Example of Ambiguous Grammar:
Consider the grammar: E → E + E ∣ E∗E ∣id
The string id + id * id has two different parse trees:
 One groups * first: id +(id∗id )
 Another groups + first: (id+id)∗id
Hence, the grammar is ambiguous.
Techniques for Reducing Ambiguity: Ambiguity cannot always be removed,
but for practical grammars (like arithmetic expressions), it can often be reduced
using the following techniques:
1. Enforcing Operator Precedence: Operators with higher precedence must be
evaluated first. Ambiguous Grammar, E → E + E ∣ E∗E ∣id
Unambiguous Grammar (with precedence)
E → E+T ∣ T
T → T∗F ∣ F
F → id
✔ Here, * has higher precedence than +.
2. Enforcing Associativity: Associativity decides whether operators group
from the left or right. # Left-Associative Grammar E → E +T ∣T
This forces expressions like: id+id+ id=(id +id )+id
✔ Eliminates ambiguity caused by multiple groupings.
3. Grammar Restructuring: Rewrite the grammar to produce only one valid
parse tree per string.# Ambiguous Grammar S → SS ∣ a
# Unambiguous Grammar S → aS ∣a
✔ Restricts derivations and avoids multiple parse trees.
4. Using Parser Declarations (YACC Approach): Parser generators like
YACC resolve ambiguity by declaring: Operator precedence , Associativity
Example (YACC): %left '+' , %left '*'
✔ * has higher precedence than +
✔ Both operators are left-associative
Important Notes
 Not all ambiguous grammars can be made unambiguous.
 Some languages are inherently ambiguous.
 Ambiguity must be handled carefully in compiler design.
-----------------------------------------------------------------------------------------------
i. Grammar: A grammar is a formal device used to generate strings of a
language.
A context-free grammar (CFG) is defined as a 4-tuple: G=(V , Σ , P , S)
Components:
 V → set of variables (non-terminals)
 Σ → set of terminals
 P → set of production rules
 S → start symbol
Example : G=({S },{a , b },{S → aSb ∣ ab }, S )----------------------------------------
ii. Derivation: A derivation is a sequence of applications of production rules to
generate a string from the start symbol.
Types: Leftmost derivation, Rightmost derivation
Example: Using grammar: S → aSb ∣ab
Derivation of aabb: S ⇒ aSb ⇒ aabb--------------------------------------------------
iii. Leftmost and Rightmost Derivation:
Leftmost Derivation: The leftmost non-terminal is replaced at each step.
Rightmost Derivation: The rightmost non-terminal is replaced at each step.
Example: Grammar: E → E + E ∣id
Leftmost Derivation of id + id: E ⇒ E+ E ⇒ id + E ⇒ id+id
Rightmost Derivation of id + id: E ⇒ E+ E ⇒ E+id ⇒ id+id -----------------
iv. Ambiguous Grammar: A grammar is said to be ambiguous if there exists at
least one string in the language that has more than one parse tree or more than
one leftmost/rightmost derivation.
Example: Grammar: E → E + E ∣ E∗E ∣id
String: id + id * id
 Two different parse trees exist
 Hence, the grammar is ambiguous-------------------------------------------------
v. Parse Tree: A parse tree is a hierarchical tree representation that shows how
a string is generated from a grammar.
Key Properties: # Root is the start symbol # Internal nodes are non-terminals
# Leaf nodes are terminals
Example: Grammar: S → aSb ∣ab
Parse tree for string aabb:
S
/ \
a b
|
S
/ \
a b
Module 4 ================================================
(i) Inherently Ambiguous Language: A context-free language (CFL) is said to
be inherently ambiguous if every context-free grammar (CFG) that generates
the language is ambiguous. That is, no unambiguous grammar exists for the
language
1. Ambiguity is a property of languages, not just grammars.
2. If all possible CFGs for a language are ambiguous, the language is
inherently ambiguous.
3. Such ambiguity cannot be removed by rewriting or simplifying the
grammar.
Example: L={a b c ∣i= j or j=k , i, j, k ≥ 1}
i j k

 This language can be generated in two different ways:


o by matching a b c , or by matching a b c
i i k i j j

 Any CFG for this language leads to multiple parse trees for some strings.
✅ Hence, L is inherently ambiguous.-----------------------------------------------
(ii) Chomsky Normal Form (CNF): A context-free grammar is said to be in
Chomsky Normal Form if every production is of one of the following forms:
1. A → BC (A, B, C are variables)
 S → ε is allowed only if ε ∈ L(G). #
2. A → a(A is a variable and ais a terminal)
1. No ε-productions, unit productions, or useless symbols (except as
noted).
2. Every CFG (without ε) can be converted into CNF.
3. CNF is useful in parsing algorithms like CYK.
S → AB
Example: Grammar in CNF: BA →
→b
a

This grammar generates the string ab and follows CNF rules.----------------------


(iii) Greibach Normal Form (GNF): A context-free grammar is in Greibach
Normal Form if every production is of the form: A → aα ,where: # Ais a variable
# ais a terminal # α is a (possibly empty) string of variables
1. Each production begins with a terminal symbol.
2. There are no ε-productions (except possibly S → ε).
3. Each derivation step introduces exactly one terminal.
S → aA ∣b
Example A → a
This grammar is in GNF since every production starts with a terminal.
Module 5 ================================================
[Link] Turing Machine. With a neat Block diagram, explain the
working of basic Turing Machine.
->A Turing Machine (TM) is a mathematical model proposed by Alan Turing in
1936 that serves as a formal representation of any possible computation. It is
more powerful than a finite automaton or a pushdown automaton, as it consists
of a finite control and a single tape of infinite length that can be used to both
read and write data.
Formal Definition
A basic Turing Machine is formally described by a 7-tuple: M=(Q,Σ,Γ,δ,q0
,B,F)Q: A finite set of states of the finite control.
 Σ: A finite set of input symbols.
 Γ: The complete set of tape symbols; Σ is always a subset of Γ.
 δ: The transition function, where \δ(q, X) = (p, Y, D). Here, q is the current
state, X is the symbol being scanned, p is the next state, Y is the symbol
written to the tape, and D is the direction of movement (L for left or R for
right).
 q0: The start state, a member of Q where the finite control is found
initially.
 B: The blank symbol, which is in \Γ but not in \Σ. This symbol appears
initially in all tape cells except those containing the input.
 F: The set of final or accepting states, a subset of Q.
Block Diagram of a Turing Machine: Based on the sources, a Turing Machine
can be visualized as having three primary components: a finite control, an
infinite tape, and a tape head.
+------------------+
| Finite Control |
| (State q) |
+--------+---------+
|
| (Tape Head)
v
+----+----+----+----+----+----+----+
| B | X1 | X2 | Xi | Xn | B |... | (Infinite Tape)
+----+----+----+----+----+----+----+
Working of a Basic Turing Machine: The operation of a Turing Machine is a
cycle of moves based on its current state and the symbol currently under the
tape head.
1. Initial Configuration: At the start, the input (a finite-length string of
symbols from \Σ) is placed on the tape. All other cells on the infinite tape
are filled with the blank symbol (B). The tape head is positioned at the
leftmost cell containing the input, and the finite control is set to the start
state (q_0).
2. The Move: In one move, the machine performs the following three actions
based on the current state and the tape symbol being scanned:
o Change State: The machine transitions to a next state (which may be
the same as the current state).
o Write Symbol: A tape symbol is written into the cell being scanned,
replacing whatever symbol was previously there.
o Move the Head: The tape head moves one cell to the left (L) or one
cell to the right (R).
3. Acceptance: The machine continues its moves until it enters an accepting
state (a state in F), at which point the input is accepted. Alternatively, the
machine may "halt" if it enters a state and scans a symbol for which no
transition is defined.
------------------------------------------------------------------------------------------------
[Link] a short note on: a) Multitape Turing Machine b) Nondeterministic
Turing Machine
a) Multitape Turing Machine: A Multitape Turing Machine consists of a finite
control with a finite number of tapes, each having its own independently
moving tape head. Initially, the input string is placed on the first tape while all
other tapes contain blanks, and all heads are positioned at the left end of their
respective tapes.
The operation of the machine is determined by a transition function where a
single move depends on the current state and the symbols scanned by all heads
simultaneously. In response, the machine enters a new state, writes a new
symbol on each of its tapes, and moves each head independently to the left (),
right (), or keeps it stationary (). While this model is often more convenient for
simulating real-world computers and performing certain tasks faster, it does not
add any additional power to the basic model. The sources state that every
language accepted by a multitape Turing Machine is also accepted by a standard
one-tape Turing Machine.
b) Nondeterministic Turing Machine (NTM): A Nondeterministic Turing
Machine differs from the basic model by having a transition function that
returns a set of possible moves for a given state and scanned symbol. This
means that at any step, the machine can choose which move to make from a
finite number of options.
An NTM is said to accept an input if there exists at least one sequence of
choices that leads the machine from the initial configuration to an accepting
state. If a particular branch of choices does not lead to an accepting state, it
does not prevent the input from being accepted, as long as one valid accepting
path exists. Despite this flexibility, Nondeterministic Turing Machines are
equivalent in power to deterministic ones; any language accepted by an NTM
can also be recognized by a deterministic TM, though the simulation may
require exponentially more time.
------------------------------------------------------------------------------------------------
[Link] explain The Techniques for Turing Machine construction. Also
write applications of Turing Machine.
To make the design of a Turing Machine (TM) more transparent and
manageable, several programming techniques are used to extend the basic
model's utility without changing its fundamental power.
Techniques for Turing Machine Construction
1. Storage in the State: The finite control can be used to hold a finite amount
of data, much like a computer's register. To achieve this, a state is represented
as a tuple , where is the control state and are data elements being
"remembered". This allows the machine to track a specific symbol it has just
seen while moving to another part of the tape.
2. Multiple Tracks: You can treat a single tape as if it has multiple tracks. In
this technique, the tape alphabet consists of tuples, where each component of
the tuple represents a different track. This is particularly useful for marking
positions (e.g., "checking off" symbols) without losing the original input data.
3. Subroutines: Similar to traditional programming, a TM can be designed as a
collection of interacting components. A subroutine is a set of states that
performs a specific task (like copying a string or performing addition). The
main TM "calls" the subroutine by entering its start state and regains control
when the subroutine enters a designated return state.
4. Shifting Over: This is a common operation where the machine needs to
create space on the tape by moving a block of symbols one or more cells to the
right.
5. Simulating Higher-Level Models: TMs can be constructed to simulate more
complex systems, such as multitape machines, nondeterministic machines,
or even the architecture of a real computer (including its CPU, memory, and
instruction cycle).
------------------------------------------------------------------------------------------------
Applications of Turing Machines
The Turing Machine is primarily used as a formal model of computation to
define what is "computable". Its key applications include:
• Defining Decidability: TMs are used to distinguish between problems that are
decidable (solvable by an algorithm) and those that are undecidable (problems
no computer can ever solve, such as the "hello-world" problem).
• Language Recognition: They serve as the most powerful type of recognizer,
capable of accepting recursively enumerable languages (RE languages),
which is a broader class than those accepted by finite automata or pushdown
automata.
• Mathematical Function Computation: Beyond recognizing strings, TMs can
act as computers of integer-valued functions. For instance, they can be
designed to perform arithmetic like proper subtraction or multiplication by
treating tallies on the tape as numbers.
• Complexity Theory: TMs provide a standardized framework for analyzing
the time and space complexity of algorithms, helping to define classes like P
and NP.
• Universal Computation: The concept of a Universal Turing Machine (one
that can simulate any other TM when given its description as input) serves as
the theoretical foundation for the stored-program computer.
Analogy: Constructing a Turing Machine is like building with modular
blocks. You can create a "Copying" block or an "Addition" block (Subroutines)
and stack them together, or you can give your builder a multi-colored pen
(Multiple Tracks) to keep track of different types of information on the same
piece of paper.
------------------------------------------------------------------------------------------------

Q4.i. Explain The Church-Turing machine with neat a diagram ii. Explain
Multiple TM with a neat diagram.
i. The Church-Turing Thesis: The Church-Turing thesis (also known as
Church’s hypothesis) is the unprovable assumption that the Turing Machine is
a formal way of expressing any problem that can be solved by a computer.
In 1936, Alan Turing proposed this model as a representation of "any possible
computation," whether performed by a human, an electronic computer, or even
an electromechanical device.
The thesis states that any general way to compute will allow us to compute only
the same functions or recognize the same languages as a Turing Machine. While
there were other serious proposals for models of computation at the time—such
as those by logician A. Church—they were all found to have the same power as
the Turing Machine. Because the Turing Machine is a simple and precise
representation of what a computer can do, it is used to develop the theory of
"undecidable" problems (problems no computer can solve).
Diagram of a Basic Turing Machine (Representing the Church-Turing
Model): The basic model used to represent this thesis consists of a finite
control, an infinite tape divided into cells, and a tape head.
+------------------+
| Finite Control |
| (State q) |
+--------+---------+
|
| (Tape Head)
v
+----+----+----+----+----+----+----+
| B | X1 | X2 | Xi | Xn | B |... | (Infinite Tape)
+----+----+----+----+----+----+----+
----------------------------------------------------------------------------------------------
ii. Multitape Turing Machine: A Multitape Turing Machine is an extension
of the basic model that consists of a finite control with independent tapes.
Each tape is divided into cells and has its own independent tape head.
Working of a Multitape TM:
• Initial Configuration: The input string is placed on the first tape, while all
other tapes are filled with the blank symbol (). The finite control starts in the
initial state, and the head of the first tape is at the left end of the input.
• The Move: A single move is determined by the current state and the symbols
scanned by all tape heads simultaneously.
• Action: In one move, the machine:
1. Enters a new state.
2. Writes a new symbol on each tape (this can be the same as the existing
symbol).
3. Moves each head independently to the left (), right (), or keeps it
stationary ().
Diagram of a Multitape Turing Machine: The diagram below illustrates how
multiple heads interact with a single finite control.
+----------------+
| Finite Control |
+-------+--------+
/ | \
Head 1/ Head 2\ Head 3\
v v v
Tape 1: [X][Y][Z]...
Tape 2: [B][B][B]...
Tape 3: [B][B][B]...
Note on Power: Although a multitape machine might be more efficient or
easier to "program" for certain tasks, it adds no additional language-defining
power; any language accepted by a multitape TM can also be accepted by a
standard one-tape TM.
------------------------------------------------------------------------------------------------
[Link] the model of Linear bound automata(LBA) with neat
diagram
Definition of Linear Bounded Automata (LBA): A Linear Bounded
Automaton is a restricted type of nondeterministic Turing Machine. Unlike a
standard Turing Machine, which has an infinite tape, an LBA has a tape that is
limited to the portion occupied by the input string. To ensure the machine stays
within these boundaries, the input is typically enclosed between two special
markers:
• Left Endmarker (¢): Prevents the tape head from moving to the left of the
input.
• Right Endmarker ($): Prevents the tape head from moving to the right of the
input.
An LBA is the formal model for Context-Sensitive Languages (Type 1 in the
Chomsky hierarchy), making it more powerful than a Pushdown Automaton but
less powerful than a standard Turing Machine.
--------------------------------------------------------------------------------
Model/Diagram of a Linear Bounded Automaton:
The LBA consists of a finite control and a tape head that can read and write on
a tape of fixed length , where is the length of the input string.
+------------------+
| Finite Control |
| (State q) |
+--------+---------+
|
| (Tape Head)
v
+----+----+----+----+----+----+----+
| ¢ | a1 | a2 | .. | an | $ | | (Tape limited to input length)
+----+----+----+----+----+----+----+
^ ^
| |
Left Endmarker Right Endmarker
Key Characteristics (External Information)
UNIX Regular Expression Notations (-----------------Module 2----------------)
UNIX regular expressions provide shorthand symbols and operators to
describe patterns concisely.
1. Character and Character-Class Notations
Notation Meaning Example
. Any single character a.c → abc, axc
[abc] Any one of a, b, or c [aeiou]
[a-z] Any lowercase letter [a-z]+
[A-Z] Any uppercase letter [A-Z]*
[0-9] Any digit [0-9]+
[^abc] Any character except a, b, c [^0-9]
2. Predefined Character Classes
Notation Meaning Equivalent
[:digit:] Digits [0-9]
[:alpha:] Alphabetic characters [A-Za-z]
[:alnum:] Alphanumeric characters [A-Za-z0-9]
[:space:] Whitespace space, tab
3. Operators and Quantifiers
Operator Meaning Example
` ` Union (OR)
* Zero or more a*
+ One or more [0-9]+
? Zero or one colou?r
{n} Exactly n times [0-9]{5}
{m,n} m to n times [a-z]{2,4}
4. Grouping and Anchors
Notation Meaning Example
() Grouping (ab)+
^ Start of line ^Hello
$ End of line world$
Examples of Regular Expressions with UNIX Notation
Regular Expression Description
[0-9]+ One or more digits
[A-Za-z]+ One or more letters
^[A-Za-z][A-Za-z0-9]* Valid identifier
[+-]?[0-9]+ Signed integer
[0-9]{5} Exactly 5 digits
[A-Za-z0-9]+@[A-Za-z]+\.com Simple email pattern

You might also like