Data Structure
Data Structure
Data is a collection of facts and figures or a set of values or values of a specific format that refers to a single set
of item values. The data items are then classified into sub-items, which is the group of items that are not known
as the simple primary form of the item.
Data Structure is a branch of Computer Science. The study of data structure allows us to understand the
organization of data and the management of the data flow in order to increase the efficiency of any process or
program. Data Structure is a particular way of storing and organizing data in the memory of the computer so that
these data can easily be retrieved and efficiently utilized in the future when required. The data can be managed
in various ways, like the logical or mathematical model for a specific organization of data is known as a data
structure.
1. First, it must be loaded enough into the structure to reflect the definite correlation of the data with a real-
world object.
2. Second, the formation should be so straightforward that one can adapt to process the data efficiently
whenever necessary.
Some examples of Data Structures Are Arrays, Linked Lists, Stack, Queue, Trees, etc. Data Structures are widely
used in almost every aspect of Computer Science, i.e., Compiler Design, Operating Systems, Graphics, Artificial
Intelligence, and many more.
Data Structures are the main part of many Computer Science Algorithms as they allow the programmers to
manage the data in an effective way. It plays a crucial role in improving the performance of a program or software,
as the main objective of the software is to store and retrieve the user's data as fast as possible.
Basic Terminologies related to Data Structures
Data Structures are the building blocks of any software or program. Selecting the suitable data structure for a
program is an extremely challenging task for a programmer.
The following are some fundamental terminologies used whenever the data structures are involved:
1. Data: We can define data as an elementary value or a collection of values. For example, the Employee's
name and ID are the data related to the Employee.
2. Data Items: A Single unit of value is known as Data Item.
3. Group Items: Data Items that have subordinate data items are known as Group Items. For example, an
employee's name can have a first, middle, and last name.
4. Elementary Items: Data Items that are unable to divide into sub-items are known as Elementary Items.
For example, the ID of an Employee.
5. Entity and Attribute: A class of certain objects is represented by an Entity. It consists of different
Attributes. Each Attribute symbolizes the specific property of that Entity. For example,
Entities with similar attributes form an Entity Set. Each attribute of an entity set has a range of values, the set of
all possible values that could be assigned to the specific attribute.
The term "information" is sometimes utilized for data with given attributes of meaningful or processed data.
1. Field: A single elementary unit of information symbolizing the Attribute of an Entity is known as Field.
2. Record: A collection of different data items are known as a Record. For example, if we talk about the
employee entity, then its name, id, address, and job title can be grouped to form the record for the
employee.
3. File: A collection of different Records of one entity type is known as a File. For example, if there are 100
employees, there will be 25 records in the related file containing data about each employee.
1. Data Structure: A data structure is a way of organizing and storing data in a computer's memory or storage
system, along with operations that can be performed on that data.
2. Element: An element is an individual data item stored within a data structure. For example, in an array, each
value stored in the array is considered an element.
3. Collection: A collection refers to a group of elements that are stored and managed together within a data
structure. Collections can be homogeneous (containing elements of the same type) or heterogeneous
(containing elements of different types).
4. Access Operation: An access operation refers to the process of retrieving or accessing a specific element or
group of elements from a data structure. Common access operations include searching, retrieving, or updating
elements.
5. Insertion Operation: An insertion operation involves adding a new element to a data structure. Insertion
operations may vary in complexity depending on the type of data structure and its organization.
6. Deletion Operation: A deletion operation involves removing an existing element from a data structure.
Similar to insertion operations, deletion operations can vary in complexity based on the data structure being
used.
7. Traversal: Traversal refers to the process of visiting and accessing each element in a data structure in a
specific order. Traversal is often used for searching, sorting, or processing all elements within a data structure.
8. Search Operation: A search operation involves finding a specific element or group of elements within a data
structure. The efficiency of search operations can vary depending on factors such as the size of the data
structure and the search algorithm used.
9. Sorting: Sorting refers to the process of arranging the elements of a data structure in a specified order, such
as numerical or lexicographic order. Sorting algorithms are commonly used to organize data for efficient
searching and retrieval.
10. Algorithm: An algorithm is a step-by-step procedure or set of instructions used to perform a specific task or
solve a problem. Data structures often work in conjunction with algorithms to manipulate and process data
efficiently.
Rich Standard Template Library (STL): The C++ STL provides a wide range of pre-implemented data
structures (like vectors, maps, sets, queues, and stacks) and algorithms, which greatly accelerate
development and simplify complex tasks.
Low-Level Memory Manipulation: C++ gives programmers explicit control over memory allocation and
deallocation through dynamic memory management (using pointers), allowing for fine-grained
optimization crucial for performance-critical applications and embedded systems.
Object-Oriented Programming (OOP): C++ is an OOP language, which enables the use of classes,
inheritance, and polymorphism. This approach helps in building scalable, modular, and reusable code,
making programs easier to maintain and debug.
Scalability: Well-chosen C++ data structures can handle a large increase in data volume gracefully, which
is vital for enterprise-level systems and large-scale applications.
Portability: C++ programs are largely portable, meaning that code written on one platform can often run
on another with minimal or no modification, especially if adhering to standard C++
Manual Memory Management: The programmer is responsible for managing memory (allocating and
deallocating it manually). The absence of an automatic garbage collector can lead to memory leaks and
increase the risk of memory-related bugs.
Complexity and Steep Learning Curve: C++ is considered a complex language, especially for
beginners. Features like pointers, templates, and manual memory management can be difficult to master
and may lead to errors if not handled correctly.
Security Concerns: Features that offer low-level control, such as global variables and pointers, can
make C++ programs more vulnerable to security issues, such as memory corruption.
Verbose and Strict Syntax: C++ has a very strict and sometimes verbose syntax, meaning a small error
can result in numerous compilation errors and more time spent debugging.
Lack of Built-in Thread Support: Compared to other languages like Java, C++ has historically lacked
built-in support for multi-threading, making the process more complicated, though modern C++
standards are addressing this
1. Primitive Data Structures are the data structures consisting of the numbers and the characters that
come in-built into programs.
2. These data structures can be manipulated or operated directly by machine-level instructions.
3. Basic data types like Integer, Float, Character, and Boolean come under the Primitive Data Structures.
4. These data types are also called Simple data types, as they contain characters that can't be divided further
1. Non-Primitive Data Structures are those data structures derived from Primitive Data Structures.
2. These data structures can't be manipulated or operated directly by machine-level instructions.
3. The focus of these data structures is on forming a set of data elements that is either homogeneous (same
data type) or heterogeneous (different data types).
4. Based on the structure and arrangement of data, we can divide these data structures into two sub-categories
1. Linear Data Structures
2. Non-Linear Data Structures
Linear Data Structure: A data structure in which data elements are arranged sequentially or linearly,
where each element is attached to its previous and next adjacent elements, is called a linear data structure.
Examples are array, stack, queue, etc.
Non-linear Data Structure: Data structures where data elements are not placed sequentially or linearly
are called non-linear data structures. Examples are trees and graphs.
2. Linked Lists:
Like arrays, Linked List is a linear data structure. Unlike arrays, linked list elements are not stored at a
contiguous location; the elements are linked using pointers.
3. Stack:
Stack is a linear data structure which follows a particular order in which the operations are performed. The
order may be LIFO (Last In First Out) or FILO (First In Last Out). In stack, all insertion and deletion are
permitted at only one end of the list.
4. Queue:
Like Stack, Queue is a linear structure which follows a particular order in which the operations are performed.
The order is First in First Out (FIFO). In the queue, items are inserted at one end and deleted from the other
end. A good example of the queue is any queue of consumers for a resource where the consumer that came first
is served first. The difference between stacks and queues is in removing. In a stack we remove the item the
most recently added; in a queue, we remove the item the least recently added.
1. Graph:
Graph is a data structure that consists of a collection of nodes (vertices) connected by edges. Graphs are used
to represent relationships between objects and are widely used in computer science, mathematics, and other
fields. Graphs can be used to model a wide variety of real-world systems, such as social networks,
transportation networks, and computer networks.
2. Tree:
Unlike Arrays, Linked Lists, Stack and queues, which are linear data structures, trees are hierarchical data
structures. A binary tree is a tree data structure in which each node has at most two children, which are referred
to as the left child and the right child. It is implemented mainly using Links.
A Tree is represented by a pointer to the topmost node in the tree. If the tree is empty, then the value of root is
NULL.
A data structure is crucial for efficient problem solving in computer science as it provides a structured way to
organize and manage data, allowing for faster access, manipulation, and processing, which is essential for
designing effective algorithms to tackle complex problems; choosing the right data structure based on the
problem characteristics can significantly impact the performance and scalability of a solution, making it a key
consideration in software development.
Efficient Data Access: Data structures enable optimized data retrieval and manipulation by organizing
data in a logical manner, reducing search time and improving overall performance
Algorithm Design: Algorithms often rely heavily on specific data structures, and selecting the right data
structure can simplify the design and implementation of an algorithm.
Memory Management: Data structures help manage memory efficiently by utilizing appropriate storage
techniques based on the problem requirements, minimizing memory waste and improving resource
allocation.
Problem Analysis: Understanding the relationships between data elements within a problem allows for
the selection of the most suitable data structure to solve it effectively.
Scalability: Different data structures are better suited for handling large datasets, ensuring that a solution
can scale efficiently as data volume increases.
Searching: To quickly find a specific element in a large dataset, a binary search tree can be used, which
leverages the sorted nature of the data to significantly reduce search time.
Sorting: Algorithms like quicksort or merge sort utilize arrays or linked lists to efficiently arrange data
in ascending or descending order.
Graph Traversal: Representing a network or relationship between entities using a graph data structure
allows for efficient navigation and analysis of connections.
Stack Implementation: For operations like backtracking or managing function calls, a stack data
structure is used to maintain a LIFO (Last in First Out) order.
Queue Implementation: When dealing with first-in-first-out (FIFO) processing, like managing a waiting
line, a queue data structure is ideal.
Stack is a linear data structure that follows LIFO (Last in First Out) Principle, the last element inserted is
the first to be popped out. It means both insertion and deletion operations happen at one end only.
LIFO (Last In First Out) Principle
Here are some real world examples of LIFO
Consider a stack of plates. When we add a plate, we add at the top. When we remove, we remove from the
top.
A shuttlecock box (or any other box that is closed from one end) is another great real-world example of
the LIFO (Last In, First Out) principle where do insertions and removals from the same end.
Types of Stack:
Fixed Size Stack: As the name suggests, a fixed size stack has a fixed size and cannot grow or shrink
dynamically. If the stack is full and an attempt is made to add an element to it, an overflow error occurs. If
the stack is empty and an attempt is made to remove an element from it, an underflow error occurs.
Dynamic Size Stack: A dynamic size stack can grow or shrink dynamically. When the stack is full, it
automatically increases its size to accommodate the new element, and when the stack is empty, it decreases
its size. This type of stack is implemented using a linked list, as it allows for easy resizing of the stack.
In order to make manipulations in a stack, there are certain operations provided to us.
Pop () – Pop operation refers to removing an element from the stack. As we know that we can only work with
one element at a time, we remove the top of the stack.
Peek () – This operation allows us to view the element which is at the top. There are no changes made in the
stack here.
isEmpty () – It checks if the stack is empty or not to prevent the programmers from performing operations on an
empty stack. This operation returns a Boolean value – ‘True’ if size = 0, otherwise it returns ‘false’
isFull () – It’s the exact opposite of isEmpty operation, as it returns ‘true’ if the stack is full, else returns ‘false’.
Push Operation on Stack
Adds an item to the stack. If the stack is full, then it is said to be an Overflow condition.
if TOP = MAX-1
return "Overflow"
endif
top = top + 1
stack[top] = item
end
Removes an item from the stack. The items are popped in the reversed order in which they are pushed. If the
stack is empty, then it is said to be an Underflow condition.
If TOP=-1
return "Underflow"
endif
item=Stack[Top]
top=top-1
return Item
end
An array is used to store an ordered list of elements. Using an array for representation of stack is one of the
easy techniques to manage the data. But there is a major difference between an array and a stack.
While, in a stack, there is no fixed size since the size of stack changed with the number of elements
inserted or deleted to and from it.
Despite the difference, an array can be used to represent a stack by taking an array of maximum size; big
enough to manage a stack.
For Example:
We are given a stack of elements: 12, 08, 21, 33, 18, 40.
Applications of Stacks:
Function calls: Stacks are used to keep track of the return addresses of function calls, allowing the program
to return to the correct location after a function has finished executing.
Recursion: Stacks are used to store the local variables and return addresses of recursive function calls,
allowing the program to keep track of the current state of the recursion.
Expression evaluation: Stacks are used to evaluate expressions in postfix notation (Reverse Polish
Notation).
Syntax parsing: Stacks are used to check the validity of syntax in programming languages and other formal
languages.
Memory management: Stacks are used to allocate and manage memory in some operating systems and
programming languages.
Used to solve popular problems like Next Greater, Previous Greater, Next Smaller, Previous
Smaller, Largest Area in a Histogram and Stock Span Problems.
Disadvantages of Stacks:
Limited access: Elements in a stack can only be accessed from the top, making it difficult to retrieve or
modify elements in the middle of the stack.
Potential for overflow: If more elements are pushed onto a stack than it can hold, an overflow error will
occur, resulting in a loss of data.
Not suitable for random access: Stacks do not allow for random access to elements, making them
unsuitable for applications where elements need to be accessed in a specific order.
Limited capacity: Stacks have a fixed capacity, which can be a limitation if the number of elements that
need to be stored is unknown or highly variable.
Rules
The puzzle involves three rods (source, auxiliary, destination) and a number of disks of different sizes. The rules
are:
Only one disk can be moved at a time.
Each move consists of taking the top disk from one stack and placing it on top of another stack.
A larger disk can never be placed on top of a smaller one.
Recursive Algorithm
To move disks from a Source rod to a Destination rod using an Auxiliary rod, the recursive algorithm follows
these steps:
1. Move the top disks from the Source rod to the Auxiliary rod, using the Destination rod as temporary
storage.
2. Move the largest disk (the disk) from the Source rod to the Destination rod.
3. Move the disks from the Auxiliary rod to the Destination rod, using the Source rod as temporary
storage. The base case is when there is only one disk(); it is simply moved from the source to the
destination. The minimum number of moves required is
Example
#include <iostream.h>
#include <conio.h>
// Recursive function to solve Tower of Hanoi
void towerOfHanoi (int n, char source, char destination, char auxiliary)
{
if (n == 1)
{
cout << "Move disk 1 from " << source << " to " << destination << endl;
return;
}
towerOfHanoi (n - 1, source, auxiliary, destination);
cout << "Move disk " << n << " from " << source << " to " << destination << endl;
towerOfHanoi (n - 1, auxiliary, destination, source);
}
int main()
{
int n;
clrscr();
cout << "Enter the number of disks: ";
cin >> n;
cout << "The sequence of moves involved in the Tower of Hanoi are:\n";
towerOfHanoi(n, 'A', 'C', 'B'); // A=Source, C=Destination, B=Auxiliary
getch();
return 0;
}
Infix, Prefix and Postfix Expressions
1. Infix expression:
In this notation, we place operator in the middle of the operands.
<Operand> <Operator> <Operand>
2. Prefix expressions:
In this notation, we place operator at the beginning of the
operands.
<Operator> <Operand> <Operand>
3. Postfix expression:
In this notation, we place operator at the end of the operands.
<Operand> <Operand> <Operator>
We will evaluate the given postfix expression step by step using a stack-based approach. At each scan point,
we will explain the operation and update the stack. Following diagram will fully explain that how to
evaluate Postfix Expression (7 5 3 * 5 1 ^ / + 3 2 -+). Let start
Above diagram is fully explained below
Initial Setup
Expression: 7 5 3 * 5 1 ^ / + 3 2 -+
Stack: Start with an empty stack.
Step 01: First we scan each element
Scan 7:
Explanation: 7 is an operand. Push it onto the stack.
Stack: [7]
Scan 5:
Explanation: 5 is an operand. Push it onto the stack.
Stack: [7, 5]
Scan 3:
Explanation: 3 is an operand. Push it onto the stack.
Stack: [7, 5, 3]
Scan *:
Explanation: * is an operator. Pop the top two operands (5 and 3) from the stack. Compute5 * 3 = 15,
then push the result back onto the stack.
Stack: [7, 15]
Scan 5:
Explanation: 5 is an operand. Push it onto the stack.
Stack: [7, 15, 5]
Scan 1:
Explanation: 1 is an operand. Push it onto the stack.
Stack: [7, 15, 5, 1]
Scan ^:
Explanation: ^ is an operator. Pop the top two operands (5 and 1) from the stack. Compute 5 ^ 1 =
5 (exponentiation), then push the result back onto the stack.
Stack: [7, 15, 5]
Scan /:
Explanation: / is an operator. Pop the top two operands (15 and 5) from the stack. Compute 15 / 5 = 3,
then push the result back onto the stack.
Stack: [7, 3]
Scan +:
Explanation: + is an operator. Pop the top two operands (7 and 3) from the stack. Compute 7 + 3 = 10,
then push the result back onto the stack.
Stack: [10]
Scan 3:
Explanation: 3 is an operand. Push it onto the stack.
Stack: [10, 3]
Scan 2:
Explanation: 2 is an operand. Push it onto the stack.
Stack: [10, 3, 2]
Scan -:
Explanation: - is an operator. Pop the top two operands (3 and 2) from the stack. Compute 3 - 2 = 1, then
push the result back onto the stack.
Stack: [10, 1]
Scan +:
Explanation: + is an operator. Pop the top two operands (10 and 1) from the stack. Compute 10 + 1 = 11,
then push the result back onto the stack.
Stack: [11]
Final Step
Explanation: After scanning all symbols of postfix expression7 5 3 * 5 1 ^ / + 3 2 -+, the stack contains
one value: 11. This is the final result.
Stack: [11]
A Queue Data Structure is a fundamental concept in computer science used for storing and managing data in
a specific order. It follows the principle of "First in, First out" (FIFO), where the first element added to the
queue is the first one to be removed. Queues are commonly used in various algorithms and applications for
their simplicity and efficiency in managing data flow.
Representation of Queues
Similar to the stack ADT, a queue ADT can also be implemented using arrays, linked lists, or pointers. As a small
example in this tutorial, we implement queues using a one-dimensional array.
Operation 1: enqueue()
Inserts an element at the end of the queue i.e. at the rear end.
The following steps should be taken to enqueue (insert) data into a queue:
Check if the queue is full.
If the queue is full, return overflow error and exit.
If the queue is not full, increment the rear pointer to point to the next empty space.
Add the data element to the queue location, where the rear is pointing.
return success.
Operation 2: dequeue()
This operation removes and returns an element that is at the front end of the queue.
The following steps are taken to perform the dequeue operation:
Check if the queue is empty.
If the queue is empty, return the underflow error and exit.
If the queue is not empty, access the data where the front is pointing.
Increment the front pointer to point to the next available data element.
The Return success.
Types of Queue
o Simple Queue or Linear Queue
o Circular Queue
o Priority Queue
o Double Ended Queue (or Deque)
The major drawback of using a linear Queue is that insertion is done only from the rear end. If the first three
elements are deleted from the Queue, we cannot insert more elements even though the space is available in a
Linear Queue. In this case, the linear Queue shows the overflow condition as the rear is pointing to the last
element of the Queue.
Circular Queue
In Circular Queue, all the nodes are represented as circular. It is similar to the linear Queue except that the last
element of the queue is connected to the first element. It is also known as Ring Buffer, as all the ends are
connected to another end. The representation of circular queue is shown in the below image -
The drawback that occurs in a linear queue is overcome by using the circular queue. If the empty space is available
in a circular queue, the new element can be added in an empty space by simply incrementing the value of rear.
The main advantage of using the circular queue is better memory utilization.
Priority Queue
It is a special type of queue in which the elements are arranged based on the priority. It is a special type of queue
data structure in which every element has a priority associated with it. Suppose some elements occur with the
same priority, they will be arranged according to the FIFO principle. The representation of priority queue is
shown in the below image -
Insertion in priority queue takes place based on the arrival, while deletion in the priority queue occurs based on
the priority. Priority queue is mainly used to implement the CPU scheduling algorithms.
There are two types of priority queue that are discussed as follows -
o Ascending priority queue - In ascending priority queue, elements can be inserted in arbitrary order, but
only smallest can be deleted first. Suppose an array with elements 7, 5, and 3 in the same order, so,
insertion can be done with the same sequence, but the order of deleting the elements is 3, 5, 7.
o Descending priority queue - In descending priority queue, elements can be inserted in arbitrary order,
but only the largest element can be deleted first. Suppose an array with elements 7, 3, and 5 in the same
order, so, insertion can be done with the same sequence, but the order of deleting the elements is 7, 5, 3.
Deque can be used both as stack and queue as it allows the insertion and deletion operations on both ends. Deque
can be considered as stack because stack follows the LIFO (Last In First Out) principle in which insertion and
deletion both can be performed only from one end. And in deque, it is possible to perform both insertion and
deletion from one end, and Deque does not follow the FIFO principle.
o Input restricted deque - As the name implies, in input restricted queue, insertion operation can be
performed at only one end, while deletion can be performed from both ends.
Output restricted deque - As the name implies, in output restricted queue, deletion operation can be performed
at only one end, while insertion can be performed from both ends.
1. Task Scheduling
Queues are vital for task scheduling in operating systems, where processes are managed in the order they arrive.
The scheduler uses a queue to allocate CPU time fairly, ensuring that all processes get a chance to execute. This
FIFO approach helps prevent starvation and ensures efficient use of system resources, contributing to overall
system performance and responsiveness.
2. Buffering
Queues function as buffers in scenarios like data streaming and I/O operations. They temporarily store data
packets, allowing the system to manage differences in processing speeds between producers and consumers.
For instance, in video streaming, queues ensure smooth playback by queuing data even when retrieval is delayed,
preventing interruptions and enhancing user experience.
8. Concurrency Control
In multithreaded programming, queues facilitate safe access to shared resources. By using a queue to manage
requests, threads can communicate and coordinate effectively without causing race conditions.
This control mechanism enhances program stability and ensures that multiple threads can operate smoothly
without interfering with each other.
9. Simulation Systems
Queues are used in simulation systems to model real-world processes, such as customer arrivals at service points
or vehicles at toll booths. By queuing entities, these systems analyze performance metrics, optimize resource
allocation, and identify potential bottlenecks. This modeling helps organizations improve efficiency and plan
better for varying demand scenarios.