Data Structure Unit 1
Data Structure Unit 1
Data Structure is a branch of Computer Science. The study of data structure allows us to understand the
organization of data and the management of the data flow in order to increase the efficiency of any
process or program. Data Structure is a particular way of storing and organizing data in the memory of
the computer so that these data can easily be retrieved and efficiently utilized in the future when required.
The data can be managed in various ways, like the logical or mathematical model for a specific
organization of data is known as a data structure.
The following are some fundamental terminologies used whenever the data structures are involved:
1. Data: We can define data as an elementary value or a collection of values. For example, the Employee's
name and ID are the data related to the Employee.
2. Data Items: A Single unit of value is known as Data Item.
3. Group Items: Data Items that have subordinate data items are known as Group Items. For example, an
employee's name can have a first, middle, and last name.
4. Elementary Items: Data Items that are unable to divide into sub-items are known as Elementary Items. For
example, the ID of an Employee.
5. Entity and Attribute: A class of certain objects is represented by an Entity. It consists of different Attributes.
Each Attribute symbolizes the specific property of that Entity
1. Robustness: Generally, all computer programmers aim to produce software that yields correct
output for every possible input, along with efficient execution on all hardware platforms. This type
of robust software must manage both valid and invalid inputs.
2. Adaptability: Building software applications like Web Browsers, Word Processors, and Internet
Search Engine include huge software systems that require correct and efficient working or
execution for many years. Moreover, software evolves due to emerging technologies or ever-
changing market conditions.
3. Reusability: The features like Reusability and Adaptability go hand in hand. It is known that the
programmer needs many resources to build any software, making it a costly enterprise. However,
if the software is developed in a reusable and adaptable way, then it can be applied in most future
applications. Thus, by executing quality data structures, it is possible to build reusable software,
which appears to be cost-effective and timesaving.
Classification of Data Structures
A Data Structure delivers a structured set of variables related to each other in various ways. It forms the
basis of a programming tool that signifies the relationship between the data elements and allows
programmers to process the data efficiently.
Based on memory allocation, the Linear Data Structures are further classified into two types:
1. Static Data Structures: The data structures having a fixed size are known as Static Data Structures. The
memory for these data structures is allocated at the compiler time, and their size cannot be changed by the
user after being compiled; however, the data stored in them can be altered.
The Array is the best example of the Static Data Structure as they have a fixed size, and its data can be
modified later.
2. Dynamic Data Structures: The data structures having a dynamic size are known as Dynamic Data
Structures. The memory of these data structures is allocated at the run time, and their size varies during the
run time of the code. Moreover, the user can change the size as well as the data elements stored in these
data structures at the run time of the code.
Linked Lists, Stacks, and Queues are common examples of dynamic data structures
1. Arrays
An Array is a data structure used to collect multiple data elements of the same data type into one variable.
Instead of storing multiple values of the same data types in separate variable names, we could store all
of them together into one variable. This statement doesn't imply that we will have to unite all the values
of the same data type in any program into one array of that data type. But there will often be times when
some specific variables of the same data types are all related to one another in a way appropriate for an
array.
An Array is a list of elements where each element has a unique place in the list. The data elements of the
array share the same variable name; however, each carries a different index number called a subscript.
We can access any data element from the list with the help of its location in the list. Thus, the key feature
of the arrays to understand is that the data is stored in contiguous memory locations, making it possible
for the users to traverse through the data elements of the array using their respective indexes.
Figure 3. An Array
Arrays can be classified into different types:
a. One-Dimensional Array: An Array with only one row of data elements is known as a One-Dimensional
Array. It is stored in ascending storage location.
b. Two-Dimensional Array: An Array consisting of multiple rows and columns of data elements is called a
Two-Dimensional Array. It is also known as a Matrix.
c. Multidimensional Array: We can define Multidimensional Array as an Array of Arrays. Multidimensional
Arrays are not bounded to two indices or two dimensions as they can include as many indices are per the
need.
a. We can store a list of data elements belonging to the same data type.
b. Array acts as an auxiliary storage for other data structures.
c. The array also helps store data elements of a binary tree of the fixed count.
d. Array also acts as a storage of matrices.
2. Linked Lists
A Linked List is another example of a linear data structure used to store a collection of data elements
dynamically. Data elements in this data structure are represented by the Nodes, connected using links or
pointers. Each node contains two fields, the information field consists of the actual data, and the pointer
field consists of the address of the subsequent nodes in the list. The pointer of the last node of the linked
list consists of a null pointer, as it points to nothing. Unlike the Arrays, the user can dynamically adjust
the size of a Linked List as per the requirements.
a. Singly Linked List: A Singly Linked List is the most common type of Linked List. Each node has data and a
pointer field containing an address to the next node.
b. Doubly Linked List: A Doubly Linked List consists of an information field and two pointer fields. The
information field contains the data. The first pointer field contains an address of the previous node, whereas
another pointer field contains a reference to the next node. Thus, we can go in both directions (backward
as well as forward).
c. Circular Linked List: The Circular Linked List is similar to the Singly Linked List. The only key difference is
that the last node contains the address of the first node, forming a circular loop in the Circular Linked List.
Some Applications of Linked Lists:
a. The Linked Lists help us implement stacks, queues, binary trees, and graphs of predefined size.
b. We can also implement Operating System's function for dynamic memory management.
c. Linked Lists also allow polynomial implementation for mathematical operations.
d. We can use Circular Linked List to implement Operating Systems or application functions that Round Robin
execution of tasks.
e. Circular Linked List is also helpful in a Slide Show where a user requires to go back to the first slide after
the last slide is presented.
f. Doubly Linked List is utilized to implement forward and backward buttons in a browser to move forward
and backward in the opened pages of a website.
3. Stacks
A Stack is a Linear Data Structure that follows the LIFO (Last In, First Out) principle that allows operations
like insertion and deletion from one end of the Stack, i.e., Top. Stacks can be implemented with the help
of contiguous memory, an Array, and non-contiguous memory, a Linked List. Real-life examples of Stacks
are piles of books, a deck of cards, piles of money, and many more.
The above figure represents the real-life example of a Stack where the operations are performed from
one end only, like the insertion and removal of new books from the top of the Stack. It implies that the
insertion and deletion in the Stack can be done only from the top of the Stack. We can access only the
Stack's tops at any given time.
Figure 6. A Stack
4. Queues
A Queue is a linear data structure similar to a Stack with some limitations on the insertion and deletion
of the elements. The insertion of an element in a Queue is done at one end, and the removal is done at
another or opposite end. Thus, we can conclude that the Queue data structure follows FIFO (First In, First
Out) principle to manipulate the data elements. Implementation of Queues can be done using Arrays,
Linked Lists, or Stacks. Some real-life examples of Queues are a line at the ticket counter, an escalator, a
car wash, and many more.
The above image is a real-life illustration of a movie ticket counter that can help us understand the Queue
where the customer who comes first is always served first. The customer arriving last will undoubtedly be
served last. Both ends of the Queue are open and can execute different operations. Another example is a
food court line where the customer is inserted from the rear end while the customer is removed at the
front end after providing the service they asked for.
a. Enqueue: The insertion or Addition of some data elements to the Queue is called Enqueue. The element
insertion is always done with the help of the rear pointer.
b. Dequeue: Deleting or removing data elements from the Queue is termed Dequeue. The deletion of the
element is always done with the help of the front pointer.
Figure A Queue
Some Applications of Queues:
1. Trees
A Tree is a Non-Linear Data Structure and a hierarchy containing a collection of nodes such that each
node of the tree stores a value and a list of references to other nodes (the "children").
The Tree data structure is a specialized method to arrange and collect data in the computer to be utilized
more effectively. It contains a central node, structural nodes, and sub-nodes connected via edges. We can
also say that the tree data structure consists of roots, branches, and leaves connected.
Figure A Tree
a. Binary Tree: A Tree data structure where each parent node can have at most two children is termed a
Binary Tree.
b. Binary Search Tree: A Binary Search Tree is a Tree data structure where we can easily maintain a sorted list
of numbers.
c. AVL Tree: An AVL Tree is a self-balancing Binary Search Tree where each node maintains extra information
known as a Balance Factor whose value is either -1, 0, or +1.
d. B-Tree: A B-Tree is a special type of self-balancing Binary Search Tree where each node consists of multiple
keys and can have more than two children.
a. Trees implement hierarchical structures in computer systems like directories and file systems.
b. Trees are also used to implement the navigation structure of a website.
c. We can generate code like Huffman's code using Trees.
d. Trees are also helpful in decision-making in Gaming applications.
e. Trees are responsible for implementing priority queues for priority-based OS scheduling functions.
f. Trees are also responsible for parsing expressions and statements in the compilers of different
programming languages.
g. We can use Trees to store data keys for indexing for Database Management System (DBMS).
h. Spanning Trees allows us to route decisions in Computer and Communications Networks.
i. Trees are also used in the path-finding algorithm implemented in Artificial Intelligence (AI), Robotics, and
Video Games Applications.
Consider a Binary Tree T. T will be maintained in memory by means of a linked list representation which
uses three parallel arrays; INFO, LEFT, and RIGHT pointer variable ROOT as follows. In Binary Tree each
node N of T will correspond to a location k such that
In this representation of binary tree root will contain the location of the root R of T. If any one of the
subtree is empty, then the corresponding pointer will contain the null value if the tree T itself is empty,
the ROOT will contain the null value.
Example
Consider the binary tree T in the figure. A schematic diagram of the linked list representation
of T appears in the following figure. Observe that each node is pictured with its three fields, and that
the empty subtree is pictured by using x for null entries.
Binary Tree
Let us consider that we have a tree T. let our tree T is a binary tree that us complete binary tree. Then
there is an efficient way of representing T in the memory called the sequential representation or array
representation of T. This representation uses only a linear array TREE as follows:
For Example:
DS Algorithm
An algorithm is a process or a set of rules required to perform calculations or some other problem-solving
operations especially by a computer. The formal definition of an algorithm is that it contains the finite set
of instructions which are being carried in a specific order to perform the specific task. It is not the complete
program or code; it is just a solution (logic) of a problem, which can be represented either as an informal
description using a Flowchart or Pseudocode.
Characteristics of an Algorithm
The following are the characteristics of an algorithm:
o Input: An algorithm has some input values. We can pass 0 or some input value to an algorithm.
o Output: We will get 1 or more output at the end of an algorithm.
o Unambiguity: An algorithm should be unambiguous which means that the instructions in an algorithm
should be clear and simple.
o Finiteness: An algorithm should have finiteness. Here, finiteness means that the algorithm should contain
a limited number of instructions, i.e., the instructions should be countable.
o Effectiveness: An algorithm should be effective as each instruction in an algorithm affects the overall
process.
o Language independent: An algorithm must be language-independent so that the instructions in an
algorithm can be implemented in any of the languages with the same output.
Dataflow of an Algorithm
o Problem: A problem can be a real-world problem or any instance from the real-world problem for which
we need to create a program or the set of instructions. The set of instructions is known as an algorithm.
o Algorithm: An algorithm will be designed for a problem which is a step by step procedure.
o Input: After designing an algorithm, the required and the desired inputs are provided to the algorithm.
o Processing unit: The input will be given to the processing unit, and the processing unit will produce the
desired output.
o Output: The output is the outcome or the result of the program.
Algorithm Analysis
The algorithm can be analyzed in two levels, i.e., first is before creating the algorithm, and second is after
creating the algorithm. The following are the two analysis of an algorithm:
o Priori Analysis: Here, priori analysis is the theoretical analysis of an algorithm which is done before
implementing the algorithm. Various factors can be considered before implementing the algorithm like
processor speed, which has no effect on the implementation part.
o Posterior Analysis: Here, posterior analysis is a practical analysis of an algorithm. The practical analysis is
achieved by implementing the algorithm using any programming language. This analysis basically evaluate
that how much running time and space taken by the algorithm.
Algorithm Complexity
The performance of the algorithm can be measured in two factors:
o Time complexity: The time complexity of an algorithm is the amount of time required to complete the
execution. The time complexity of an algorithm is denoted by the big O notation. Here, big O notation is
the asymptotic notation to represent the time complexity. The time complexity is mainly calculated by
counting the number of steps to finish the execution. Let's understand the time complexity through an
example.
1. sum=0;
2. // Suppose we have to calculate the sum of n numbers.
3. for i=1 to n
4. sum=sum+i;
5. // when the loop ends then sum holds the sum of the n numbers
6. return sum;
In the above code, the time complexity of the loop statement will be atleast n, and if the value of n
increases, then the time complexity also increases. While the complexity of the code, i.e., return sum will
be constant as its value is not dependent on the value of n and will provide the result in one step only.
We generally consider the worst-time complexity as it is the maximum time taken for any given input
size.
o Space complexity: An algorithm's space complexity is the amount of space required to solve a problem
and produce an output. Similar to the time complexity, space complexity is also expressed in big O notation.
Auxiliary space: The extra space required by the algorithm, excluding the input size, is known as an
auxiliary space. The space complexity considers both the spaces, i.e., auxiliary space, and space used by
the input.
So,
Space complexity = Auxiliary space + Input size.
Stack
A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle. Stack has one end,
whereas the Queue has two ends (front and rear). It contains only one pointer top pointer pointing to
the topmost element of the stack. Whenever an element is added in the stack, it is added on the top of
the stack, and the element can be deleted only from the stack. In other words, a stack can be defined as
a container in which insertion and deletion can be done from the one end known as the top of the
stack.
Working of Stack
Stack works on the LIFO pattern. As we can observe in the below figure there are five memory blocks in
the stack; therefore, the size of the stack is 5.
Suppose we want to store the elements in a stack and let's assume that stack is empty. We have taken
the stack of size 5 as shown below in which we are pushing the elements one by one until the stack
becomes full.
Since our stack is full as the size of the stack is 5. In the above cases, we can observe that it goes from the
top to the bottom when we were entering the new element in the stack. The stack gets filled up from the
bottom to the top.
When we perform the delete operation on the stack, there is only one way for entry and exit as the other
end is closed. It follows the LIFO pattern, which means that the value entered first will be removed last. In
the above case, the value 5 is entered first, so it will be removed only after the deletion of all the other
elements.
PUSH operation
The steps involved in the PUSH operation is given below:
POP operation
The steps involved in the POP operation is given below:
o Before deleting the element from the stack, we check whether the stack is empty.
o If we try to delete the element from the empty stack, then the underflow condition occurs.
o If the stack is not empty, we first access the element which is pointed by the top
o Once the pop operation is performed, the top is decremented by 1, i.e., top=top-1.
Applications of Stack
The following are the applications of the stack:
o Balancing of symbols: Stack is used for balancing a symbol. For example, we have the following program:
1. int main()
2. {
3. cout<<"Hello";
4. cout<<"javaTpoint";
5. }
As we know, each program has an opening and closing braces; when the opening braces come, we push
the braces in a stack, and when the closing braces appear, we pop the opening braces from the stack.
Therefore, the net value comes out to be zero. If any symbol is left in the stack, it means that some syntax
occurs in a program.
o String reversal: Stack is also used for reversing a string. For example, we want to reverse a "javaTpoint"
string, so we can achieve this with the help of a stack.
First, we push all the characters of the string in a stack until we reach the null character.
After pushing all the characters, we start taking out the character one by one until we reach the bottom of
the stack.
o UNDO/REDO: It can also be used for performing UNDO/REDO operations. For example, we have an editor
in which we write 'a', then 'b', and then 'c'; therefore, the text written in an editor is abc. So, there are three
states, a, ab, and abc, which are stored in a stack. There would be two stacks in which one stack shows
UNDO state, and the other shows REDO state.
If we want to perform UNDO operation, and want to achieve 'ab' state, then we implement pop operation.
o Recursion: The recursion means that the function is calling itself again. To maintain the previous states, the
compiler creates a system stack in which all the previous records of the function are maintained.
o DFS(Depth First Search): This search is implemented on a Graph, and Graph uses the stack data structure.
o Backtracking: Suppose we have to create a path to solve a maze problem. If we are moving in a particular
path, and we realize that we come on the wrong way. In order to come at the beginning of the path to
create a new path, we have to use the stack data structure.
o Expression conversion: Stack can also be used for expression conversion. This is one of the most important
applications of stack. The list of the expression conversion is given below:
Infix to prefix
Infix to postfix
Prefix to infix
Prefix to postfix
Postfix to infix
o Memory management: The stack manages the memory. The memory is assigned in the contiguous
memory blocks. The memory is known as stack memory as all the variables are assigned in a function call
stack memory. The memory size assigned to the program is known to the compiler. When the function is
created, all its variables are assigned in the stack memory. When the function completed its execution, all
the variables assigned in the stack are released.
Examples:
Input: A + B * C + D
Output: ABC*+D+
Input: ((A + B) – C * (D / E)) + F
Output: AB+CDE/*-F+
Illustration:
Follow the below illustration for a better understanding
Consider the infix expression exp = “a+b*c+d”
and the infix expression is scanned using the iterator i, which is initialized as i = 0.
1st Step: Here i = 0 and exp[i] = ‘a’ i.e., an operand. So add this in the postfix expression.
Therefore, postfix = “a”.
Add ‘a’ in the postfix
2nd Step: Here i = 1 and exp[i] = ‘+’ i.e., an operator. Push this into the stack. postfix =
“a” and stack = {+}.
3rd Step: Now i = 2 and exp[i] = ‘b’ i.e., an operand. So add this in the postfix
expression. postfix = “ab” and stack = {+}.
5th Step: Now i = 4 and exp[i] = ‘c’ i.e., an operand. Add this in the postfix expression. postfix
= “abc” and stack = {+, *}.
6th Step: Now i = 5 and exp[i] = ‘+’ i.e., an operator. The topmost element of the stack has
higher precedence. So pop until the stack becomes empty or the top element has less
precedence. ‘*’ is popped and added in postfix. So postfix = “abc*” and stack = {+}.
7th Step: Now i = 6 and exp[i] = ‘d’ i.e., an operand. Add this in the postfix
expression. postfix = “abc*+d”.
Final Step: Now no element is left. So empty the stack and add it in the postfix
expression. postfix = “abc*+d+”.
Pop ‘+’ and add it in postfix
Examples:
Input: A * B + C / D
Output: + * A B/ C D
Input: (A – B/C) * (A/K-L)
Output: *-A/BC-/AKL
Illustration:
See the below image for a clear idea:
Convert infix expression to prefix expression
Postfix expression: The expression of the form “a b operator” (ab+) i.e., when a pair of
operands is followed by an operator.
Examples:
Input: str = “2 3 1 * + 9 -“
Output: -4
Explanation: If the expression is converted into an infix expression, it will be 2 + (3 * 1) – 9 = 5
– 9 = -4.
Input: str = “100 200 + 2 / 5 * 7 +”
Output: 757
Evaluation of Postfix Expression using Stack:
To evaluate a postfix expression we can use a stack.
Iterate the expression from left to right and keep on storing the operands into a stack. Once an
operator is received, pop the two topmost elements and evaluate them and push the result in
the stack again.
Illustration:
Follow the below illustration for a better understanding:
Consider the expression: exp = “2 3 1 * + 9 -“
Scan 2, it’s a number, So push it into stack. Stack contains ‘2’.
Scan 3, again a number, push it to stack, stack now contains ‘2 3’ (from bottom to top)
Scan *, it’s an operator. Pop two operands from stack, apply the * operator on operands. We
get 3*1 which results in 3. We push the result 3 to stack. The stack now becomes ‘2 3’.
Scan 9, it’s a number. So we push it to the stack. The stack now becomes ‘5 9’.
There are no more elements to scan, we return the top element from the stack (which is the
only element left in a stack).
So the result becomes -4.
Follow the steps mentioned below to evaluate postfix expression using stack: