Data Structures and Algorithms (DSA) Tutorial
Data structures and algorithms (DSA) are two important aspects of any programming
language. Every programming language has its own data structures and different types of
algorithms to handle these data structures.
Data Structures are used to organise and store data to use it in an effective way when
performing data operations.
Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independent of
underlying languages, i.e. an algorithm can be implemented in more than one programming
language.
Almost every enterprise application uses data structures in one way or another, so data
structures and algorithms are an important part of a programmer's day-to-day work.
A data structure is a particular way to arrange data so it can be saved in memory and
retrieved for later use, whereas an algorithm is a set of steps for solving a known problem.
Data Structures and Algorithms is abbreviated as DSA in the context of Computer Science.
Why Learn Data Structures & Algorithms (DSA)?
As applications become more complex and data rich, they commonly face three problems.
Data Search − Consider a store inventory of 1 million ($10^6$) items. A naive search has to
examine up to 1 million items every time an item is looked up, so search becomes slower as
the data grows.
Processor speed − Although processor speed is very high, it becomes a limiting factor once
the data grows to billions of records.
Multiple requests − When thousands of users search data simultaneously on a web server,
even a fast server can fail to keep up.
What is Data Structure?
A data structure is a particular way of organising data in a computer so that it can be used
effectively. The idea is to reduce the space and time complexities of different tasks.
The choice of a good data structure makes it possible to perform a variety of critical
operations effectively. An efficient data structure also uses minimum memory space and
execution time to process the structure. A data structure is not only used for organising the
data. It is also used for processing, retrieving, and storing data. There are different basic and
advanced types of data structures that are used in almost every program or software system
that has been developed. So we must have good knowledge of data structures.
Need Of Data Structure:
The structure of the data and the design of the algorithm depend on each other. Data
presentation must be easy to understand so that the developer, as well as the user, can
implement operations efficiently.
Data structures provide an easy way of organising, retrieving, managing, and storing data.
Here is a list of the needs that data structures address:
Modifying the data is easy.
Operations require less time.
Storage memory space is saved.
Data representation is easy.
Large databases are easy to access.
Classification/Types of Data Structures:
1. Linear Data Structure
2. Non-Linear Data Structure.
Linear Data Structure:
Elements are arranged in one dimension, also known as the linear dimension.
Example: lists, stacks, queues, etc.
Non-Linear Data Structure
Elements are arranged in one-to-many, many-to-one and many-to-many relationships.
Example: trees, graphs, tables, etc.
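The difference between the two categories can be sketched with Python's built-in types. This is only an illustration: the sample values, the `tree` dictionary, and the `graph` adjacency list are hypothetical, and real programs often use dedicated classes instead.

```python
from collections import deque

# Linear structures: every element has at most one predecessor and one successor.
stack = []                 # a Python list used as a stack (last in, first out)
stack.append("a")
stack.append("b")
top = stack.pop()          # removes "b", the most recently pushed item

queue = deque()            # a deque used as a queue (first in, first out)
queue.append("a")
queue.append("b")
front = queue.popleft()    # removes "a", the earliest item

# Non-linear structures: one element may connect to several others.
# Hypothetical tree stored as parent -> list of children.
tree = {"root": ["left", "right"], "left": ["leaf"], "right": [], "leaf": []}

# Hypothetical undirected graph stored as an adjacency list.
graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
```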
How to start learning Data Structures & Algorithms (DSA)?
The basic steps to learn DSA are as follows:
Step 1 - Learn Time and Space complexities
Time and Space complexities are the measures of the amount of time required to execute the
code (Time Complexity) and amount of space required to execute the code (Space
Complexity).
Step 2 - Learn Different Data Structures
Here we learn different types of data structures such as Array, Stack, Queue, Linked List, etc.
Step 3 - Learn Different Algorithms
Once you have a good understanding of the various data structures, you can start learning
the associated algorithms that process the data stored in them. These algorithms
include searching, sorting, and others.
Applications of Data Structures & Algorithms (DSA)
From the data structure point of view, following are some important categories of algorithms
−
Search − Algorithm to search an item in a data structure.
Sort − Algorithm to sort items in a certain order.
Insert − Algorithm to insert item in a data structure.
Update − Algorithm to update an existing item in a data structure.
Delete − Algorithm to delete an existing item from a data structure.
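As a minimal sketch, the five categories above can be tried on a sorted Python list. The sample values and the helper name `find` are assumptions made for illustration; the standard `bisect` module supplies the binary search.

```python
import bisect

items = [42, 7, 19, 3]          # hypothetical sample data

# Sort: arrange the items in ascending order.
items.sort()                    # [3, 7, 19, 42]

# Search: binary search on the sorted list.
def find(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent."""
    i = bisect.bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

position = find(items, 19)      # index 2

# Insert: add a new item while keeping the list sorted.
bisect.insort(items, 10)        # [3, 7, 10, 19, 42]

# Update: replace an existing item in place (11 keeps the list sorted).
items[find(items, 10)] = 11     # [3, 7, 11, 19, 42]

# Delete: remove an existing item.
items.remove(7)                 # [3, 11, 19, 42]
```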
The following computer problems can be solved using Data Structures −
Fibonacci number series
Knapsack problem
Tower of Hanoi
All pair shortest path by Floyd-Warshall
Shortest path by Dijkstra
Project scheduling
What is Algorithm?
Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independent of
underlying languages, i.e. an algorithm can be implemented in more than one programming
language.
Types of Algorithms
Here are the different types of algorithms covered in this tutorial:
DSA - Searching Algorithms
DSA - Sorting Algorithms
DSA - Approximation Algorithms
DSA - Divide and Conquer Algorithms
DSA - Greedy Algorithms
DSA - Recursion Algorithm
DSA - Backtracking Algorithm
DSA - Randomized Algorithms
DSA - Dynamic Programming
DSA - Pattern Searching
DSA - Mathematical Algorithms
DSA - Geometric Algorithms
DSA - Bitwise Algorithms
DSA - Branch and Bound Algorithm
Characteristics of a Data Structure
Correctness − Data structure implementation should implement its interface correctly.
Time Complexity − Running time or the execution time of operations of data structure must
be as small as possible.
Space Complexity − Memory usage of a data structure operation should be as little as
possible.
Execution Time Cases
Three cases are usually used to compare the execution time of data structure operations in
a relative manner.
Worst Case − This is the scenario where a particular data structure operation takes the
maximum time it can take. If an operation's worst-case time is ƒ(n), then the operation will
not take more than ƒ(n) time, where ƒ(n) represents a function of n.
Average Case − This is the scenario depicting the average execution time of an operation of
a data structure. If an operation takes ƒ(n) time on average, then m operations will take
about mƒ(n) time.
Best Case − This is the scenario depicting the least possible execution time of an operation of
a data structure. If an operation's best-case time is ƒ(n), then the operation will never take
less than ƒ(n) time; its actual time may be anything from ƒ(n) up to the worst-case time.
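The three cases can be illustrated with a linear search that counts its comparisons. This is a sketch under assumptions: the function name `linear_search`, the input `data`, and the choice of comparisons as the unit of time are all made up for the example.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons

data = list(range(1, 11))   # hypothetical input: the numbers 1 to 10

best = linear_search(data, 1)[1]      # target is the first element
worst = linear_search(data, 99)[1]    # target is absent, every element is checked
# Average over successful searches, assuming each element is equally likely.
average = sum(linear_search(data, t)[1] for t in data) / len(data)
```

For this input the best case needs 1 comparison, the worst case needs 10, and the average over successful searches is 5.5.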
Basic DSA Terminologies
Data − Data are values or set of values.
Data Item − A data item refers to a single unit of values.
Group Items − Data items that are divided into sub-items are called Group Items.
Elementary Items − Data items that cannot be divided are called Elementary
Items.
Attribute and Entity − An entity is that which contains certain attributes or
properties, which may be assigned values.
Entity Set − Entities of similar attributes form an entity set.
Field − Field is a single elementary unit of information representing an attribute of an
entity.
Record − Record is a collection of field values of a given entity.
File − File is a collection of records of the entities in a given entity set.
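One possible way to map these terms onto code is sketched below with a Python dataclass. The `Student` entity, its fields, and the sample values are hypothetical and serve only to label the terminology.

```python
from dataclasses import dataclass

@dataclass
class Student:            # entity: something that has attributes
    roll_no: int          # field holding an elementary item
    first_name: str       # first_name and last_name are sub-items of the
    last_name: str        # group item "name"

# Record: the collection of field values for one entity.
record = Student(roll_no=1, first_name="Ada", last_name="Lovelace")

# File: the records of all entities in one entity set.
file = [record, Student(2, "Alan", "Turing")]
```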
Characteristics of an Algorithm
Not all procedures can be called an algorithm. An algorithm should have the following
characteristics −
Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases),
and their inputs/outputs should be clear and must lead to only one meaning.
Input − An algorithm should have 0 or more well-defined inputs.
Output − An algorithm should have 1 or more well-defined outputs, and should match the
desired output.
Finiteness − Algorithms must terminate after a finite number of steps.
Feasibility − Should be feasible with the available resources.
Independent − An algorithm should have step-by-step directions, which should be
independent of any programming code.
How to Write an Algorithm?
There are no well-defined standards for writing algorithms. Rather, it is problem and resource
dependent. Algorithms are never written to support a particular programming code.
As we know, all programming languages share basic code constructs such as loops (do, for,
while) and flow control (if-else). These common constructs can be used to write an
algorithm.
We write algorithms in a step-by-step manner, but it is not always the case. Algorithm
writing is a process and is executed after the problem domain is well-defined. That is, we
should know the problem domain, for which we are designing a solution.
Example
Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the algorithm can
be written as −
Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP
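The steps above translate almost line for line into code. The following is one possible Python rendering; the function name `add` and the sample inputs are chosen only for illustration.

```python
def add(a, b):
    """Take the values of a and b, compute c, and hand it back."""
    c = a + b          # c ← a + b
    return c

result = add(2, 3)     # hypothetical inputs
print(result)          # display c
```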
In design and analysis of algorithms, the second method is usually used to describe an
algorithm. It lets the analyst analyze the algorithm while ignoring all unwanted
definitions, and observe what operations are being used and how the process is flowing.
Algorithm Analysis
Efficiency of an algorithm can be analyzed at two different stages, before implementation and
after implementation. They are the following −
A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an algorithm
is measured by assuming that all other factors, for example, processor speed, are constant and
have no effect on the implementation.
A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected algorithm
is implemented in a programming language and then executed on the target machine.
In this analysis, actual statistics such as running time and space required are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution
or running time of various operations involved. The running time of an operation can be
defined as the number of computer instructions executed per operation.
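For contrast with a priori analysis, an empirical (after implementation) measurement might look like the sketch below. The helper names `total` and `measure` are assumptions; `time.perf_counter` is the standard-library clock used for timing. The elapsed time depends on the machine, so no particular value should be expected.

```python
import time

def total(values):
    """Sum the values with an explicit loop so there is some work to time."""
    result = 0
    for v in values:
        result += v
    return result

def measure(func, *args):
    """Return (result, elapsed_seconds) for one call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

result, elapsed = measure(total, range(100_000))
print(f"sum = {result}, took {elapsed:.6f} s")
```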
Algorithm Complexity
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
Space Factor − Space is measured by counting the maximum memory space required by the
algorithm.
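As an illustration of the time factor, the sketch below counts the comparisons made by a selection sort. Selection sort is picked here only because its comparison count is easy to predict; the function name and input are hypothetical.

```python
def selection_sort(values):
    """Return (sorted_copy, comparisons) using selection sort."""
    items = list(values)
    comparisons = 0
    n = len(items)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1              # one key operation
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items, comparisons

sorted_items, comparisons = selection_sort([5, 2, 9, 1, 7])
```

For an input of size n this version always performs n(n − 1)/2 comparisons, which is 10 for the five values above.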
The complexity of an algorithm f(n) gives the running time and/or the storage space required
by the algorithm in terms of n as the size of input data.
Space Complexity
Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components −
A fixed part that is a space required to store certain data and variables, that are independent of
the size of the problem. For example, simple variables and constants used, program size, etc.
A variable part is a space required by variables, whose size depends on the size of the
problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is $S(P) = C + S_P(I)$, where C is the fixed part
and $S_P(I)$ is the variable part of the algorithm, which depends on instance characteristic I.
Following is a simple example that tries to explain the concept −
Algorithm: SUM(A, B)
Step 1 − START
Step 2 − C ← A + B + 10
Step 3 − Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. The actual
space depends on the data types of the variables and the constant, so each count is
multiplied by the size of the corresponding type.
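The variable part of the space requirement can be made visible by comparing an iterative and a recursive sum. This is a sketch under assumptions: the function names are hypothetical, and recursion depth is used as a stand-in for stack space.

```python
def sum_iterative(values):
    """Uses a fixed number of extra variables regardless of len(values)."""
    total = 0
    for v in values:
        total += v
    return total

def sum_recursive(values, i=0, depth=1):
    """Return (total, max_depth); each pending call occupies one stack frame."""
    if i == len(values):
        return 0, depth
    rest, max_depth = sum_recursive(values, i + 1, depth + 1)
    return values[i] + rest, max_depth

data = [1, 2, 3, 4, 5]
total, max_depth = sum_recursive(data)
```

Both functions return the same sum, but the recursive one reaches a depth of n + 1 calls for n values, so its stack space grows with the input size.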
Time Complexity
Time complexity of an algorithm represents the amount of time required by the algorithm to
run to completion. Time requirements can be defined as a numerical function T(n), where
T(n) can be measured as the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total
computational time is T(n) = c ∗ n, where c is the time taken for the addition of two bits.
Here, we observe that T(n) grows linearly as the input size increases.
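The n-bit addition example can be sketched as a bit-by-bit adder that counts its steps. The representation (lists of bits, least significant bit first) and the name `add_bits` are assumptions made for the illustration.

```python
def add_bits(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, steps), where steps counts single-bit additions.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("both inputs must have the same number of bits")
    carry = 0
    result = []
    steps = 0
    for a, b in zip(a_bits, b_bits):
        s = a + b + carry
        result.append(s % 2)    # sum bit for this position
        carry = s // 2          # carry into the next position
        steps += 1
    result.append(carry)        # final carry becomes the top bit
    return result, steps

# 6 (binary 110) + 3 (binary 011), written least significant bit first.
sum_bits, steps = add_bits([0, 1, 1], [1, 1, 0])
```

Here the two 3-bit inputs take 3 steps and produce the bits of 9, so the step count equals n.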
Asymptotic Analysis
Asymptotic analysis of an algorithm refers to defining the mathematical foundation/framing
of its run-time performance. Using asymptotic analysis, we can very well conclude the best
case, average case, and worst case scenario of an algorithm.
Asymptotic analysis is input bound i.e., if there's no input to the algorithm, it is concluded to
work in a constant time. Other than the "input" all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical
units of computation. For example, the running time of one operation may be computed as
f(n) and that of another operation as $g(n^2)$. This means the first operation's running
time will increase linearly with n, while the running time of the second operation
will increase quadratically as n increases. Similarly, the running time of both operations
will be nearly the same if n is small.
Usually, the time required by an algorithm falls under three types −
Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.
Asymptotic Notations
Execution time of an algorithm depends on the instruction set, processor speed, disk I/O
speed, etc. Hence, we estimate the efficiency of an algorithm asymptotically.
Time function of an algorithm is represented by T(n), where n is the input size.
Different types of asymptotic notations are used to represent the complexity of an algorithm.
Following asymptotic notations are used to calculate the running time complexity of an
algorithm.
O − Big Oh Notation
Ω − Big omega Notation
θ − Big theta Notation
o − Little Oh Notation
ω − Little omega Notation
Big Oh, O: Asymptotic Upper Bound
The notation O(n) is the formal way to express the upper bound of an algorithm's running time.
It is the most commonly used notation. It measures the worst case time complexity or the
longest amount of time an algorithm can possibly take to complete.
A function f(n) is said to be of the order of g(n), written O(g(n)), if there exist a
positive integer $n_0$ and a positive constant c such that
$f(n) \le c \cdot g(n)$ for all $n > n_0$.
Hence, function g(n) is an upper bound for function f(n): beyond $n_0$, f(n) grows no faster
than a constant multiple of g(n).
Example
Let us consider a given function, $f(n) = 4n^3 + 10n^2 + 5n + 1$.
Considering $g(n) = n^3$,
$f(n) \le 5 \cdot g(n)$ for all values of $n > 10$, because $10n^2 + 5n + 1 \le n^3$ once $n \ge 11$.
Hence, the complexity of f(n) can be represented as $O(g(n))$, i.e. $O(n^3)$.
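As a rough numeric check of this upper-bound example, the sketch below evaluates f(n) and 5·g(n) over a finite range and records where the inequality fails. The range 1 to 1000 is an arbitrary choice, and a finite check is not a proof; it only shows where the bound starts to hold.

```python
def f(n):
    return 4 * n**3 + 10 * n**2 + 5 * n + 1

def g(n):
    return n**3

c = 5
# Values of n in the tested range where f(n) <= c * g(n) does NOT hold.
failures = [n for n in range(1, 1001) if f(n) > c * g(n)]
last_failure = max(failures)
print(f"inequality fails for n = {failures[0]}..{last_failure}")
```

In this range the inequality fails only for n from 1 to 10 and holds from n = 11 onward.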
Big Omega, Ω: Asymptotic Lower Bound
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best case time complexity or the least amount of time an algorithm can
possibly take to complete.
We say that $f(n) = \Omega(g(n))$ when there exists a positive
constant c such that $f(n) \ge c \cdot g(n)$ for all sufficiently large values of n. Here n is a
positive integer. It means function g is a lower bound for function f; after a certain value
of n, f will never go below $c \cdot g$.
Example
Let us consider a given function, $f(n) = 4n^3 + 10n^2 + 5n + 1$.
Considering $g(n) = n^3$, $f(n) \ge 4 \cdot g(n)$ for all values of $n > 0$.
Hence, the complexity of f(n) can be represented as $\Omega(g(n))$, i.e. $\Omega(n^3)$.
Theta, θ: Asymptotic Tight Bound
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. Some may confuse theta notation with the average case time
complexity; while theta notation can often describe the average case accurately,
other notations could be used as well.
We say that $f(n) = \theta(g(n))$ when there exist positive
constants $c_1$ and $c_2$ such that $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all
sufficiently large values of n. Here n is a positive integer.
This means function g is a tight bound for function f.
Example
Let us consider a given function, $f(n) = 4n^3 + 10n^2 + 5n + 1$.
Considering $g(n) = n^3$, $4 \cdot g(n) \le f(n) \le 5 \cdot g(n)$ for all the
large values of n.
Hence, the complexity of f(n) can be represented as $\theta(g(n))$, i.e. $\theta(n^3)$.
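One way to see the tight bound is to look at the ratio f(n)/g(n) for a few large n. The sample points below are arbitrary, and the check is a numeric illustration rather than a proof.

```python
def f(n):
    return 4 * n**3 + 10 * n**2 + 5 * n + 1

def g(n):
    return n**3

sample_ns = (11, 100, 1000, 10_000)            # hypothetical sample points
ratios = [f(n) / g(n) for n in sample_ns]

# For these n the ratio stays between the two constants 4 and 5.
within_bounds = all(4 <= r <= 5 for r in ratios)
```

The ratio stays between 4 and 5 at every sample point and moves toward 4 as n grows, which is what the two-sided bound describes.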
Little Oh, o
The asymptotic upper bound provided by O-notation may or may not be asymptotically
tight. The bound $2n^2 = O(n^2)$ is asymptotically tight, but the
bound $2n = O(n^2)$ is not.
We use o-notation to denote an upper bound that is not asymptotically tight.
We formally define o(g(n)) (little-oh of g of n) as the set of functions f(n) such that, for
any positive constant $c > 0$, there exists a value $n_0 > 0$ such
that $0 \le f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
Intuitively, in the o-notation, the function f(n) becomes insignificant relative
to g(n) as n approaches infinity; that is,
$\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$
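The limit can be illustrated numerically with the two bounds mentioned in this section, 2n and 2n² against n². The sample values of n are arbitrary, and a handful of points only suggests the trend.

```python
ns = [10, 100, 1000, 10_000]                   # hypothetical sample points

loose = [(2 * n) / n**2 for n in ns]           # 2n against n^2
tight = [(2 * n**2) / n**2 for n in ns]        # 2n^2 against n^2

for n, lo, ti in zip(ns, loose, tight):
    print(f"n={n:>6}  2n/n^2={lo:.4f}  2n^2/n^2={ti:.1f}")
```

The first ratio keeps shrinking toward 0, so 2n is o(n²), while the second stays at 2, so 2n² is O(n²) but not o(n²).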
Common Asymptotic Notations
Following is a list of some common asymptotic notations −
constant − O(1)
logarithmic − O(log n)
linear − O(n)
n log n − O(n log n)
quadratic − $O(n^2)$
cubic − $O(n^3)$
polynomial − $n^{O(1)}$
exponential − $2^{O(n)}$
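To get a feel for how differently these classes grow, the sketch below evaluates one representative function per class at a few input sizes. The representative functions (for example base-2 logarithms and 2^n) and the sizes 8, 16 and 32 are assumptions chosen for illustration.

```python
import math

# One representative function for each growth class.
growth = {
    "constant": lambda n: 1,
    "logarithmic": lambda n: math.log2(n),
    "linear": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "quadratic": lambda n: n**2,
    "cubic": lambda n: n**3,
    "exponential": lambda n: 2**n,
}

for name, func in growth.items():
    print(f"{name:12} {func(8):>10.0f} {func(16):>10.0f} {func(32):>12.0f}")
```

Each row shows the value at n = 8, 16 and 32; the gap between rows widens quickly as n doubles.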