Choosing Algorithms: Bubble vs. Merge Sort
Choosing Algorithms: Bubble vs. Merge Sort
Q1 :
How will you choose when more than one algorithm is available for a
problem? State various parameters to analyze and choose an algorithm. Write
your comments on how to choose between the following alternative
algorithms “bubble sort” and “merge sort”.
( 10 )
Q2 :
What are sparse and dense matrices? How are sparse matrices represented in
memory? Discuss some bottlenecks of using sparse matrices.
(7)
Q3 :
What are abstract data types? What are its various kinds of operations?
Explain in detail.
(8)
1(b)
Q4 :
Write algorithms to perform the following operations:
i. Insert a node before a node with data ‘10’ in a doubly linked list.
ii. Reverse doubly linked list.
( 10 )
Q5 :
Differentiate between
i. Static and dynamic memory allocation
ii. Recursion and iteration
(8)
Q6 :
i. Convert the given expression into a prefix using stack.
(A+(B*C-(D/E*F)^G)+3*H-2)
ii. What do you mean by priority queue? Write an algorithm to create a
priority queue.
( 3,4 )
Q1 :
How will you choose when more than one algorithm is available for a
problem? State various parameters to analyze and choose an algorithm.
Write your comments on how to choose between the following alternative
algorithms “bubble sort” and “merge sort”.
When faced with multiple algorithms to solve a problem, the choice of the most appropriate algorithm
depends on a careful analysis of various performance metrics, primarily time and space complexity.
The algorithm with the best combination of time and space complexity is typically the most efficient
choice for implementation, as it will provide the optimal balance between computational speed and
memory usage. However, this theoretical analysis must also be complemented by practical runtime
analysis and consideration of the specific characteristics of the input data, as real-world performance
can sometimes deviate from the predicted asymptotic bounds. A comprehensive evaluation of the
trade-offs between algorithms, taking into account both theoretical and empirical factors, is crucial in
selecting the most suitable solution for a given problem and its requirements.
1. Time Complexity : One of the most common ways to measure algorithm performance is time
complexity, which is the amount of time it takes for an algorithm to complete its task as a
function of the input size. Time complexity is usually expressed using the big O notation,
which gives the upper bound of the worst-case scenario. For example, O(n) means that the
algorithm takes at most linear time proportional to the input size, while O(n^2) means that it
takes quadratic time. Time complexity helps you estimate how your algorithm scales with
larger inputs and how it compares to other algorithms with different big O values.
Analyze the time complexity of each algorithm, which describes how the algorithm's running
time scales with the size of the input. Algorithms with lower time complexity, such as O(n log
n) for merge sort, are generally preferred over algorithms with higher time complexity, such
as O(n^2) for bubble sort, especially for large inputs.
Analyze the amount of additional memory or storage required by each algorithm. Algorithms
with lower space complexity are preferred, especially when memory is limited. Merge sort has
a higher space complexity than bubble sort, as it requires additional memory to store the
temporary arrays during the merge process.
3. Runtime Analysis : A more practical way to measure algorithm performance is runtime
analysis, which is the actual measurement of how long an algorithm takes to run on a specific
machine or environment. Runtime analysis can be done using various tools, such as timers,
profilers, or benchmarks, that record the execution time of an algorithm or a part of it.
Runtime analysis helps you test your algorithm on real data and conditions and identify any
bottlenecks or inefficiencies that may not be obvious from the theoretical analysis.
While Bubble Sort has a time complexity of O(n^2) in the average and worst cases, its runtime
mah vary significantly based on the input data, performing better on already-sorted or
nearly-sorted inputs. In contrast, Merge Sort's consistent O(n log n) time complexity makes it
generally more predictable and efficient, especially for larger randomly-ordered inputs.
4. Asymptotic Analysis : A more general way to measure algorithm performance is asymptotic
analysis, which is the study of how an algorithm behaves as the input size approaches infinity.
Asymptotic analysis is useful for comparing algorithms that have different time or space
complexities and for determining the best or worst cases. Asymptotic analysis uses different
notations, such as big O, big Theta, and big Omega, to describe the upper, lower, or tight
bounds of an algorithm's performance. Asymptotic analysis helps you understand the
fundamental limits and trade-offs of your algorithm and how it relates to the problem domain
Analyze the characteristics of the input data, such as the size, distribution, and whether it is
already partially sorted. Bubble sort may perform better than merge sort for small inputs or
nearly-sorted inputs, as it has lower overhead. Merge sort, on the other hand, is more efficient
for larger inputs and can handle a wider range of input distributions.
6. Algorithm Design Techniques : One of the best ways to improve algorithm performance is
to use algorithm design techniques, which are strategies or principles that guide you to create
efficient and effective algorithms. Algorithm design techniques can be divided into two
categories: problem reduction and solution construction. Problem reduction techniques, such
as divide and conquer, dynamic programming, or greedy algorithms, help you simplify or
transform a complex problem into smaller or easier subproblems. Solution construction
techniques, such as brute force, backtracking, or heuristics, help you generate or search for a
feasible or optimal solution for a problem. Algorithm design techniques help you optimize
your algorithm's time and space complexity, runtime, asymptotic behavior, or empirical
performance.
Bubble sort is generally simpler to implement than merge sort, which involves more complex
logic for the divide-and-conquer approach. Also, Bubble sort may perform better than merge
sort for small inputs or nearly-sorted inputs, as it has lower overhead. On the other hand,
Merge sort is more efficient for larger inputs and can handle a wider range of input
distributions.
Q2 :
What are sparse and dense matrices? How are sparse matrices represented
in memory? Discuss some bottlenecks of using sparse matrices.
Dense matrices are the matrices where most of the values are non-empty. In a much simpler way, we
can say the dense matrix is the general matrix being used for algebraic calculations.
However, any matrix which comprises mostly zero values can be termed as a sparse matrix. Roughly,
if more than two-thirds of the data points are zero, it is safe to assume that it can be stored as a sparse
matrix though there is no strict definite proportion. A dense matrix is the opposite with mostly
non-zero values. A sparse matrix can always be stored as a dense matrix too.
A matrix is sparse if many of its coefficients are zero. The interest in sparsity arises because its
exploitation can lead to enormous computational savings and because many large matrix problems
that occur in practice are sparse.
Sparsity : This is the ratio of the total number of zero elements divided by the total number of
elements in a matrix. Sparsity (s) = no of zeros / total number of elements in the matrix
Sparse matrices provide efficient storage of double or logical data that has a large percentage of zeros.
While full (or dense) matrices store every single element in memory regardless of value, sparse
matrices store only the nonzero elements and their row indices. For this reason, using sparse matrices
can significantly reduce the amount of memory required for data storage.
Sparse matrices are more memory-efficient than dense matrices when dealing with large matrices that
have a significant number of zero elements. By storing only the non-zero elements explicitly, sparse
matrices can save memory and computation time for certain operations
However, there are situations where operations or algorithms may require dense matrices. For
example, some mathematical operations, such as matrix multiplication or certain linear algebra
operations, may be more efficient or easier to implement with dense matrices. Dense matrices allow
for straightforward element-wise operations and can leverage optimized libraries for linear algebra
computations.
In Machine learning, there are a bulk of operations that are performed in a series. In a case where we
have a big matrix, converting it into a sparse matrix always helps. A sparse matrix is stored in a
different format than a dense matrix. When we store a dense matrix as a sparse matrix, the program
only stores the location of non-zero values, making it smaller in memory size and much faster for any
computations.
Storage : We know that a sparse matrix contains lesser non-zero elements than zero, so less memory
can be used to store elements. It evaluates only the non-zero elements.
Computing time : In the case of searching in a sparse matrix, we need to traverse only the non-zero
elements rather than traversing all the sparse matrix elements. It saves computing time by logically
designing a data structure traversing non-zero elements.
The non-zero elements in the sparse matrix can be stored using triplets that are rows, columns, and
values. There are two ways to represent the sparse matrix that are listed as follows :
- Array representation
- Linked list representation
Representing a sparse matrix by a 2D array leads to the wastage of lots of memory. This is because
zeroes in the matrix are of no use, so storing zeroes with non-zero elements is wastage of memory. To
avoid such wastage, we can store only non-zero elements. If we store only non-zero elements, it
reduces the traversal time and the storage space.
In 2D array representation of sparse matrix, there are three fields used that are named as -
○ Row - It is the index of a row where a non-zero element is located in the matrix.
○ Column - It is the index of the column where a non-zero element is located in the matrix.
○ Value - It is the value of the non-zero element that is located at the index (row, column).
In a linked list representation, the linked list data structure is used to represent the sparse matrix. The
advantage of using a linked list to represent the sparse matrix is that the complexity of inserting or
deleting a node in a linked list is lesser than the array. Unlike the array representation, a node in the
linked list representation consists of four fields.
Some bottlenecks of using sparse matrices is that not all problems can be represented as sparse
matrices, sparse matrices may lose information since many elements are zero, and choosing the right
algorithms and libraries for sparse matrix operations can be challenging.
Q3 :
What are abstract data types? What are its various kinds of operations?
Explain in detail.
An abstract data type is a model for data consisting of values and operations, whose concrete structure
is hidden. To be strictly correct, the word “model” is an operative part of this definition : an abstract
data type is an idea, describing a data type that could be represented in different languages.
Colloquially, it’s common to use the term “abstract data type” to refer both to the formal definition of
the data type as well as to concrete implementations of it.
ADTs are more theoretical and high-level. They define a set of operations that can be performed on
data without specifying how those operations are implemented. This allows programmers to think in
terms of what a data type can do, rather than how it does it. For instance, a stack ADT involves
operations like push, pop, and peek, without specifying whether it’s implemented using an array, a
linked list, or some other structure.
Abstraction, at its core, involves thinking or stating without reference to a specific instance. Similarly,
ADTs in programming encapsulate complexity, providing users with clear, simple interfaces.
An Abstract Data Type is a model for a certain data structure that defines the type of data stored, the
operations allowed on them, and the types of parameters of the operations. ADTs are defined by their
behavior from the point of view of a user, rather than how they are implemented.
Common ADTs :
The List ADT models a sequence of positions storing arbitrary objects -establishes a before/after
relation between positions. This is an important property associated with lists. A list A of size N can
be represented as A[0], A[1], …, A[N-1] where each element A[k] has a unique position k in the list.
In addition, remember that the elements of a list can be arbitrarily complex. The basic list operations
include insertion, deletion, retrieval and traversal.
Operations on Lists
We must remember that the LIST ADT having a different set of operations is a different ADT.
However, whatever the purpose for which we are designing the List ADT, it must have the following
methods associated with it : Constructor, isEmpty(), insert(), delete(), display().
These are some of the Update methods associated with Lists.
Here p stands for position and e for element.
A stack is a sequence of items that are accessible at only one end of the sequence. It is an ordered
group of homogeneous items or elements. Elements are added to and removed from a specially
designated end called the top of the stack (the most recently added items are at the top of the stack).
The last element to be added is the first to be removed (LIFO: Last In, First Out). In other words, a
stack is a restricted linear list in which all additions and deletions are made at one end, the top.
The Stack Abstract Data Type (Stack ADT) is a collection of data together with the operations on that
data. The interface of the collection tells us what we need to know in order to interact with it, i.e. how
to use the operations associated with the stack. The stack can be specified by the user by defining :
1. bool isEmpty – This is a Boolean operation that checks whether a stack is empty and signals a
true value if the stack is empty.
2. push (ItemType newItem) – This is the push operation that adds a new item newItem of type
ItemType to the stack.
3. pop () – This is the pop operation where we remove from the Stack, the item that has been
added most recently. In this definition of pop we do not output the item removed, we just
change the contents of the stack.
4. top() – This returns the last inserted element without removing it in other words gets the item
that was added to stack most recently.
In addition to the most important operations described above, a set of auxiliary stack operations are
also described below which will be part of the definition of a Stack ADT depending on the
application. These auxiliary operations are described below:
1. size() – This returns the number of elements that is currently available in the stack
2. Create – This operation creates an Empty Stack with Stack Pointer (top) pointing to null and
the number of Stack Elements set to 0
3. Check if Full – This is a Boolean operation that will be true when the number of Elements =
maximum number of elements allowed. However this operation is valid only for array based
Implementation.
Any list implementation such as arrays could be used to implement a stack. When using arrays the
implementation is static and the size of stack is to be fixed and given initially. Another
implementation of stacks is linked lists where the stack is dynamic and in general can never become
full.
.
The key characteristic of a queue is that it follows the First-In-First-Out (FIFO) principle. This means
that the first element added to the queue will be the first one to be removed.
Q4 :
Write algorithms to perform the following operations:
i. Insert a node before a node with data ‘10’ in a doubly linked list.
ii. Reverse doubly linked list.
5. IF current is NULL:
PRINT "Target node not found"
RETURN head
This approach has been used to insert a node before the target
position. Two dummy nodes have been used, previous and current,
where the current node will point to the node at the target
position and the previous node will point to the node at
target-1 position, once the traversing has been done.
Algorithm ReverseDoublyLinkedList(head):
1. IF head is NULL or [Link] is NULL:
RETURN head
2. SET prev = NULL
3. SET curr = head
4. WHILE curr is not NULL:
4.1 SET next = [Link]
4.2 SET [Link] = prev
4.3 SET [Link] = next
4.4 SET prev = curr
4.5 SET curr = next
5. RETURN prev
Q5 :
Differentiate between
i. Static and dynamic memory allocation
ii. Recursion and iteration
I. A computer’s memory is just a series of “buckets” that can hold numbers, characters, or
boolean values. When variables take the space of the computer’s memory, we called it allocation
of memory. Deallocations indicate that the computer has regained the space and that the variable
can no longer be accessed. The memory is divided into two parts; Stack memory and Heap
memory.
Stack memory is temporary memory storage that stores temporary variables created by a function.
In the stack, variables are declared, stored, and initialized during runtime Heap memory is greater
than stack memory and is utilized in dynamic memory allocation.
Memory allocation is the process of allocating physical or virtual memory space to computer
programs and services. Memory allocation takes place either before or during program execution.
There are two basic types of memory allocation:
stack is always faster than heap memory Heap memory is always slower than the stack
because the location for memory allocation is memory, because heap memory searches for a
always at the top of stack and there was no free space before allocating memory.
need to search for free memory.
fragmentation can happen, where memory is
fragmentation does not happen, memory is available yet not in a contiguous block that is
always stored in a contiguous block manner if required therefore resulting in an error.
available.
dynamic memory is managed by the operating
static memory is managed by cpu's stack system and the memory allocator provided by
pointer and handling the stack frames. the compiler.
II.
Recursion is a method of solving a problem by breaking it into small sub problems till the time it can
be solved easily. It follows a divide and conquer approach and in a program, it is generally
implemented by using a function which calls itself again and again till a certain baseline condition is
met, when the baseline condition is met it provides a solution without recursion which then helps in
solving the bigger sub problems.
Recursion function has two main parts: the recursive function and the base case. The base case is
important as without it function would in theory repeat forever and would practically result in the
stack overflow.
Iteration is implemented using a loop like for, while, do-while. It uses a simple variable to store
control data and does not have the overhead of repeated function calls and thus is faster than the
recursive function.
The difference between them is that recursion is simply a method call in which the method being
called is the same as the one making the call while iteration is when a loop is repeatedly executed
until a certain condition is met. Under the hood, both recursion and iteration depend on a condition so
as to know when to stop but recursion is simply a process, always applied to a function.
Recursion : Iteration :
* Recursion and iteration both will execute a * Whereas Iteration occurs when a loop
set of statements. However, recursion occurs repeatedly executes until the condition is
when a function calls itself repeatedly in a satisfied.
block of code. * Iteration is a repetitive structure and has
* Recursion will have a selective structure and more lines of code.
less lines of code. * It involves a looping method which is used
* Recursion achieves the output through to run a particular code repeatedly.
repeated method calls. * Iteration repeatedly executes code. This can
* Recursion program terminates when a base be less expensive in both processor time and
case is recognized. memory space.
* Recursion repeatedly invokes the * Infinite loop or iteration may cause system
mechanism, and constantly the overhead of freeze or more pressure on CPU.
method calls which can be expensive in terms * Iteration doesn’t return a value in fact its task
of processor time and memory space. is to perform a specific task.
* If it is an infinite recursion without setting a
proper base case system may collapse.
Major Differences In Recursion & Iteration
● A conditional statement decides the termination of recursion while a control variable’s value
decides the termination of the iteration statement (except in the case of a while loop).
● Infinite recursion can lead to system crash whereas infinite iteration consumes CPU cycles.
● Recursion repeatedly invokes the mechanism, and consequently the overhead, of method
calls. This can be expensive in both processor time and memory space while iteration doesn’t.
● Recursion makes code smaller while iteration makes it longer.
Q6 :
i. Convert the given expression into a prefix using stack.
(A+(B*C-(D/E*F)^G)+3*H-2)
To convert Infix to Prefix expression, computers usually use the stack data structure.
1. Reverse the infix expression.
2. Obtain the “nearly” postfix expression of the modified expression.
3. Reverse the postfix expression.
0 (
1 ( ((
2 2 (( 2
3 - ((- 2
4 H ((- 2H
5 * ((-* 2H
6 3 ((-* 2H3
7 + ((-+ 2H3*
8 ( ((-+( 2H3*
9 G ((-+( 2H3*G
10 ^ ((-+(^ 2H3*G
11 ( ((-+(^( 2H3*G
12 F ((-+(^( 2H3*GF
13 * ((-+(^(* 2H3*GF
14 E ((-+(^(* 2H3*GFE
15 / ((-+(^(*/ 2H3*GFE
16 D ((-+(^(*/ 2H3*GFED
17 ) ((-+(^ 2H3*GFED/*
18 - ((-+(- 2H3*GFED/*^
19 C ((-+(- 2H3*GFED/*^C
20 * ((-+(-* 2H3*GFED/*^C
21 B ((-+(-* 2H3*GFED/*^CB
22 ) ((-+ 2H3*GFED/*^CB*-
23 + ((-++ 2H3*GFED/*^CB*-
24 A ((-++ 2H3*GFED/*^CB*-A
25 ) ( 2H3*GFED/*^CB*-A+
+-
Heap is a complete binary tree data structure that satisfies the heap property: for every node, the value
of its children is greater than or equal to its own value. Heaps are usually used to implement priority
queues, where the smallest (or largest) element is always at the root of the tree.
After heapifying your heap, it takes only O(1) operations to access the maximum or minimum value
in a max/min heap. In this case, you could say that you’re treating the heap like it’s a queue — you
“pop” (dequeue) the first element, aka the root element. The highest priority items are always at the
front of the queue, and the lowest priority towards the back.
Class Node:
data
priority
next
Class PriorityQueue:
head = NULL
Function CreatePriorityQueue():
1. SET head = NULL
RETURN head
Function Enqueue(data, priority):
1. CREATE newNode with data and priority.
2. IF head is NULL OR [Link] > [Link]:
SET [Link] = head
SET head = newNode
3. ELSE:
SET current = head