0% found this document useful (0 votes)
3 views35 pages

Data Structures Overview and Concepts

Module-1 of BCS304 covers the fundamentals of data structures, including definitions of data, data items, and entities, as well as classifications of data structures into primitive and non-primitive types. It explains operations on data structures, the concept of arrays, structures, unions, and pointers, along with memory allocation techniques in C. Additionally, it discusses dynamic memory management functions such as malloc, calloc, and realloc, emphasizing the importance of proper memory handling to avoid issues like dangling references.

Uploaded by

lookashley93
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views35 pages

Data Structures Overview and Concepts

Module-1 of BCS304 covers the fundamentals of data structures, including definitions of data, data items, and entities, as well as classifications of data structures into primitive and non-primitive types. It explains operations on data structures, the concept of arrays, structures, unions, and pointers, along with memory allocation techniques in C. Additionally, it discusses dynamic memory management functions such as malloc, calloc, and realloc, emphasizing the importance of proper memory handling to avoid issues like dangling references.

Uploaded by

lookashley93
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Module-1 BCS304

Data Structures and Applications


Module-1
1. Introduction

Data is a value or a set of values. Data as such may not convey any meaning. Example 90, Bob

When data is interpreted to convey a meaning we call it an information. For example Bob scored 90
marks

A data item refers to a single unit of values. Data items that are divided into sub items are called
group items. Example :Name of an employee can be divided to three subitems- first name, middle
name and last name

Data items that are not divided into sub items are called elementary items.

Collection of data is organized into a hierarchy of fields, records and files.

An entity is something that has certain attributes to which values can be assigned

Example: For an entity called employee the attributes and values can be as follows.

Attributes Name Age Emp_id


Values Bob 35 12

Entities with similar attributes form an entity set. Example all employees of an orgnanizaation.

The data organization field can be compared to an attribute of an entity. Record is the field values
for a given entity and file is the entity set.

Records may be classified according to length.


• Fixed Length Records: All data items contain the same type of data and the same amount of
space is assigned to each data item.
• Variable Length Records: File records may contain different lenght

2. Introduction to data Structures

A data structure is a particular method of storing and organizing data in a computer so that it can be
used efficiently

The data Structure is classified into


➢ Primitive data structure: These can be manipulated directly by the machine instructions. Example
integer character, float etc
➢ Non primitive data structures: They cannot be manipulated directly by the machine instructions.
The non primitive data structures are further classified into linear and non linear data structures.
• Linear data structures: show the relationship of adjacency between the elements of the data
structures. Example are arrays, stacks, queues , list etc.

CSE@HKBKCE 1
Module-1 BCS304

• Non linear data structure: They do not show the relationship of adjacency between the elements.
Example are Trees and graphs

1.1 Operations on data structures

1. Create: Creating a new data structure


2. Insert: Adding a new record to the structure.
3. Delete: Removing the record from the structure.
4. Search: Finding the location of a particular record with a given key value, or finding the location of
all records which satisfy one or more conditions.
5. Sort: Managing the data or record in some logical order (Ascending or descending order).
6. Merge: Combining the record in two different sorted files into a single sorted file
7. Traversal: Accessing each record exactly once so that certain items in the record may be processed.

1.2 Review of arrays


Array is a collection of elements of the same data type
An array is declared by appending brackets to the name of a variable.

For example
int list[5]; // declares an array that can store 5 integers
In C all array index start at 0 and so list[0],list[1],list[2],list[3],list[4] are the names of the five array
elements each of which contains an integer value.

1.3 Structures
Structure is basically a user-defined data type that can store related information that may be of same or
different data types together. A structure is therefore a collection of variables under a single name. The
variables within a structure are of different data types and each has a name that is used to select it from
the structure.

Syntax:

struct structure_name {
data_type member1;
data_type member2;
data_type member3;
...
};

Example,
Struct student {
char sname[10];
int age;
float average_marks;
} st;

creates a variable whose name is st and that has three fields:


• a name that is a character array
• an integer value representing the age of the student
• a float value representing the average marks of the individual student.

CSE@HKBKCE 2
Module-1 BCS304

To assign values to these fields dot operator (. ) is used as the structure member operator. We use
this operator to select a particular member of the structure.

strcpy([Link],"james");
[Link] = 10;
st.average_marks = 35;

We can create our own structure data types by using the typedef statement
Consider an example that creates a structure for the employee details.

typedef struct Employee {


char name[10];
int age;
float salary;
};

or

typedef struct {
char name[10];
int age;
float salary;
} Employee;

Employee p1, p2;


Employee p1, p2;
This says that Employee is the name of the type defined by the structure definition,

Nested Structure
a structure can be embedded within another structure. That is a structure can have another structure as its
member such a structure is called a nested structure.

For example, associated with employee structure we may wish to include the date of Birth of an
employee by using nested structure

typedef struct {
int month;
int day;
int year;
} date;

typedef struct {
char name[10];
int age;
float salary;
date dob;
}employee;

CSE@HKBKCE 3
Module-1 BCS304

If we have to store the information of a person we declare a variable as


Employee p1;

A person born on September 10, 1974, would have the values for the date struct set as:
[Link] = 9;
[Link] = 10;
[Link] = 1974;

1.2.1 Array of Structures

In the case of a student or the employee we may not store the details of only 1 student or 1
employee. When we have to store the details of a group of students we can declare an array of
structures.

Example: struct student s[10];

Now we can access the details of the ith student as follows

strcpy(s[i[.sname,"james");
s[i].age = 10;
s[i].average_marks = 35;

Self-Referential Structures

A self-referential structure is one in which one or more of its data members is a pointer to itself. They
require dynamic memory allocation (malloc and free) to explicitly obtain and release memory.

Example:
typedef struct list {
int data;
list *link ;
};

Each instance of the structure list will have two components, data and link. data is a single character,
while link is a pointer to a list structure. The value of link is either the address in memory of an instance
of list or the null pointer.

Consider these statements, which create three structures and assign values to their respective fields:
list item1, item2, item3;
[Link] = 5
[Link] = 10
[Link] = 15
[Link] = [Link] = [Link] = NULL;

CSE@HKBKCE 4
Module-1 BCS304

We can attach these structures together by replacing the null link field in item 2 with one that points to item
3 and by replacing the null link field in item 1 with one that points to item 2.
[Link] = &item2; item2.1ink = &item3;

1.4 Unions
A union is a user-defined data type that can store related information that may be of different data types or
same data type, but the fields of a union must share their memory space. This means that only one field of
the union is "active" at any given time.

Example1,
Suppose a program uses either a number that is int or float we can define a union as

Union num
{
int a;
Float b;
};

Union num n1;

Now we can store values as n1.a=5 or n2.b= 3.14 only one member is active at a point of time.

Example 2:

Since only one of the data member is active in union we can use tag field to indicate which of the fields are
active.

Struct number{
Enum tagfield {i,f} d;
Union
{
int a;
Float b;
}num;
};

Struct number n1;

Now if the value of n1.d is i we could assign the value [Link].a=5


If the value of n1.d is f we could assign the value [Link].b=3.14

C does not verify and validate whether we use the appropriate field. C does not require us to use the
correct fields of a union.

CSE@HKBKCE 5
Module-1 BCS304

1.5 Pointer variables, Declaration and Definition


Pointer variable is a variable that stores the address of another variable.

Declaring Pointer variables

A variable can be declared as a pointer variable by using an indirection operator(*)


Syntax: type * identifier;
Type signifies the type of the pointer variable
For example
char *p;
int *m;
The variable p is declared as a pointer variable of type character. The variable m is declared as a pointer
variable of type integer.

Initialization of pointer variables

Uninitialized variables have unknown garbage values stored in them, similarly uninitialized pointer
variables will have uninitialized memory address stored inside them which may be interpreted as a memory
location, and may lead to runtime error. These errors are difficult to debug and correct, therefore a pointer
should always be initialized with a valid memory address.

A pointer can be initialized as follows


int a;
int *p;
Here the variable a and the pointer variable p are of the same data type. To make p to point at a we have
to write a statement
p=&a; now the address of a is stored in the pointer variable p and now p is said to be
pointing at a.
If we do not want the pointer variable to point at anything we can initialize it to point at NULL
For example int *p =NULL;
When we dereference a null pointer, we are using address zero, which is a valid address in the computer.

NOTE: A pointer variable can only point at a variable of the same type.
We can have more than one pointer variable pointing at the same variable. For example
int a;
int *p,*q;
p=&a;
q=&a;
now both the pointer variable p and q are pointing at the same variable a. There is no limit to the number
of pointer variable that can point to a variable.

Accessing variables through pointers


A Variable can be accessed through a pointer variable by using an indirection operator.

Indirection operator(*): An indirection operator is a unary operator whose operand must be a pointer
value.

CSE@HKBKCE 6
Module-1 BCS304

For example to access a variable a through a pointer variable p we have to code it as follows
Void main()
{
int a=5;
int *p
p=&a;// p is now pointing at a
*p=*p+1
printf(“ %d %d %p”, a, *p,p);
}
Output: 6 6 XXXXX(address of variable a)
Now the value of a is modified through the pointer variable p

Note:
➢ we need parenthesis for expressions like (*p) ++ as the precedence of postfix increment is more
than precedence of the indirection operator (*). If the parenthesis is not used the address will be
incremented.
➢ The indirection and the address operators are the inverse of each other when combined in an
expression such as *&a they cancel each other

Write a program to add two numbers using pointers


Write a program to swap two numbers

Pointers and Functions


When we call a function by passing the address of a variable we call it as pass by reference. By passing
the address of variables defined in main we can directly store the data in the calling function rather
returning the value. Pointers can also be used when we have to return more than one value from a
function

program to swap two characters.


void main()
{
char a ,b;
printf(“\nenter two characters\n”);
scanf(“%c %c”, &a,&b);
printf(“the value before swap: %c %c” a,b);
swap(&a,&b);
printf(“the value after swap: %c %c” a,b);
}
void swap(char *p1,char *p2)
{
char temp;
temp=*p1;
*p1=*p2;
*p2=temp;
}

1.6 Memory allocation functions


In high level languages the data structures are fully defined at compile time. Modern languages like C can
allocate memory at execution this feature is known as dynamic memory allocation.

CSE@HKBKCE 7
Module-1 BCS304

There are two ways in which we can reserve memory locations for a variable

➢ Static memory allocation: the declaration and definition of memory should be specified in the source
program. The number of bytes reserved cannot be changed during runtime
➢ Dynamic memory allocation : Data definition can be done at runtime .It uses predefined functions to
allocate and release memory for data while the program is running. To use dynamic memory allocation
the programmer must use either standard data types or must declare derived data types

Memory usage
Four memory management functions are used with dynamic memory. malloc, calloc and realloc are used
for memory allocation. The function free is used to return memory when it is not used.

Heap: It is the unused memory allocated to the program When requests are made by memory allocating
functions, memory is allocated from the heap at run time.

Memory Allocation (malloc)


When a malloc function is invoked requesting for memory, it allocates a block of memory that contains the
number of bytes specified in its parameter and returns a pointer to the start of the allocated memory. When
the requested memory is not available the pointer NULL is returned.

syntax: void *malloc (long int size);


Example: void *malloc(sizeof(int));

The pointer returned by the malloc function can be type cast to the pointer of the required type by
making use of type cast expressions

Example:
To allocate an integer in the heap we code
int *pint
pint=(int*)malloc(sizeof(int))

Releasing memory (free)


When memory locations allocated are no longer needed, they should be freed by using the predefined
function free.

Syntax: free(void*);
Example: int *p,a;
p=&a;
free(p);

Program showing the allocation of memory using malloc


int i,*pi;
float f,*pf;
Pi= (int*) malloc (sizeof((int));
Pf= (float *) malloc(sizeof(float));
*pi= 1344;
*pf= 3.14
Printf(“integer value= %d float value= %f”,*pi, *pf);
Free(pi);

CSE@HKBKCE 8
Module-1 BCS304

Free(pf);

Contiguous memory allocation (calloc)

This function is used to allocate contiguous block of memory. It is primarily used to allocate memory for
arrays.

The function calloc() allocates a user specified amount of memory and initializes the allocated memory to
0. A pointer to the start of the allocated memory is returned. In case there is insufficient memory it returns
NULL

syntax :Void * calloc (long int count , long int size);

Example:
To allocate a one dimensional array of integers whose capacity is n the following code can be written.
Int *ptr
ptr=(int*)calloc(n,sizeof(int))

Reallocation of memory(realloc)
The function realloc resizes the memory previously allocated by either malloc or calloc.

syntax: Void * realloc (void * ptr , long int new_size);

Example
int *p;
p=(int*)calloc(n,sizeof(int))
p=realloc(p,s) /*where s is the new size*/

The statement realloc(p,s)


Changes the size of the memory pointed by p to s. The existing contents of the block remain unchanged.

➢ When s> oldsize(Block size increases) the additional (s – oldsize )have unspecified value
➢ When s<oddsize (Block size reduces) the rightmost (oldsize-s) bytes of the old block are freed.
➢ When realloc is able to do the resizing it returns a pointer to the start of the new block
➢ When is not able to do the resizing the old block is unchanged and the function returns the value
NULL.

1.6.1. Dangling Reference


Once a pointer is freed using the free function then there is no way to retrieve this storage and any
reference to this location is called dangling reference.

Example2:
int i,*p,*f;
i=2;
p=&i;
f=p;
free(p);

CSE@HKBKCE 9
Module-1 BCS304

*f=*f+2 /* Invalid dangling reference*/

The location that holds the value 2 is freed but still there exist a reference to this location through f and
pointer f will try to access a location that is freed so the pointer f is a dangling reference

1.6.2 pointers can be dangerous

When pointers are used the following points needs to be taken care

1. When a pointer is not pointing at any object it is a good practise to set it to NULL so that there is no
attempt made to access a memory location that is out of range of our program or that does not contain
a pointer reference to the legitimate object.
2. Use explicit type casts when converting between pointer types.
int *pi;
float *pf;
Pi= (int*) malloc (sizeof((int));
Pf= (float *)pi;
3. Define explicit return types for functions. If the return type is omitted it defaults to integer which
has the same size as a pointer and can be later interpreted as a pointer

1.7 Dynamically allocated Arrays

One dimensional array


When we cannot determine the exact size of the array the space of the array can be allocated at
runtime.
For example consider the code given below
int i,n,*list;
printf(“enter the size of the array”);
scanf(“%d”,&n);
if (n<1)
{
fprintf(stderr,”Improper values of n \n”);
exit();
}
List=(int*) malloc (n*sizeof(n))/* or list=(int*)calloc(n,sizeof(int))

1.8 Arrays
1.8.1 Linear Arrays
A Linear Array is a list of finite number (n) of homogenous data elements.
a. The elements of the array are referenced by an index set consisting of n consecutive
numbers(0....(n-1)).
b. The elements of the array are stored in successive memory locations

The number n of elements is called the length or size of the array. Length of the array can be obtained
from the index set using the formula
Length = Upper bound – Lowe bound +1

CSE@HKBKCE 10
Module-1 BCS304

The elements of an array may be denoted by a[0],a[2]………a[n-1]. The number k in a[k] is called a
subscript or index and a[k] is called the subscripted value.
An array is usually implemented as a consecutive set of memory locations

Declaration

Linear arrays are declared by adding a bracket to the name of a variable. The size of the array is
mentioned within the brackets.

Eg :- int list[5];

Declares an array containing five integer elements.

In C all arrays start at index 0


Therefore, list[0], list[1], list[2], list[3], and list[4] are the names of the five array elements ,each of
which contains an integer value.

Representation of Linear Arrays in memory


When the compiler encounters an array declaration such as int list[5], to create list, it allocates five
consecutive memory locations. Each memory location is large enough to hold a single integer.

The address of the first element list[0], is called the base address
base address=address(list[0])
Using the base address the address of any element of list can be calculated using the formula

address(list[k])= base addres + w.k


Where w is the size of each element in the array list

Example:
int list[5]

The elements of the array is list[0]……..list[4]


If the size of an integer on the machine is denoted by sizeof(int), then the memory Address(list [k])= base
address+ sizeof (int).k.

Variable Memory Address canlculatio


Let α the base address , the address of list[0]
Let w=sizeof(int)
address(list[1] )= α + w*1
address(list[2]) = α + w*2
address(list[3]) = α + w*3
address(list[4]) = α + w*4

1.8.2 Multi dimensional arrays

Two dimensional arrays


C uses the array of array representation to represent a multidimensional array. In this representation a 2
dimensional array is represented as a one dimensional array in which each element is itself a one
dimensional array.

CSE@HKBKCE 11
Module-1 BCS304

A two dimensional m X n array A is a collection of m* n data elements such that each element is
specified by a pair of integers called subscripts.

An element with subscript i and j will be represented as A[i][j]

Declaration: int A[3][5];

It declares an array A that contains three elements where each element is a one dimensional array. Each
one dimensional array has 5 integer elements.

Example : A 2 dimensional array A[3][4] is represented as

0 1 2 3
0 A[0][0] A[0][1] A[0][2] A[0][3]
Rows 1 A[1][0] A[1][1] A[1][2] A[1][3]
2 A[2][0] A[2][1] A[2][2] A[2][3]

Representation of two dimensional arrays in memory


A two dimensional m X n array A is stored in the memory as a collection of m*n sequential memory
[Link] the array is stored column by column it is called column major order and if the array is stored
row by row it is called row major order.

Example: Representation of the two dimensional array A[3][4] in row major order and column major
order

A Subscript A Subscript
A[0][0] A[0][0] Column1
A[0][1] A[1][0]
Row1
A[0][2] A[2][0]
A[0][3] A[0][1] Column2
A[1][0] A[1][1]
A[1][1] A[2][1]
Row2
A[1][2] A[0][2] Column3
A[1][3] A[1][2]
A[2][0] A[2][2]
A[2][1] A[0][3] Column4
Row3
A[2][2] A[1][3]
A[2][3] A[2][3]

Row Major Order Column major order

Figure 1.1 Representation of Two Dimensional array

Using the base address , the address of any element in an array A of size row X col can be calculated
using the formula

Row Major order

CSE@HKBKCE 12
Module-1 BCS304

Address (A[i][j]) = Base address + w[ i*col+ j] considering the array indexing starts at 0

Column Major order


Address (A[i][j]) = Base address + w[ i+row.j] considering the array indexing starts at 0

Example

When the compiler encounters an array declaration such as int A[3][4] it creates an array A and allocates
20 consecutive memory locations. Each memory location is large enough to hold a single integer.

Let α be the address of the first element A[0][0], is called the base address
Considering Row major order
Using the bases address we can calculate the addresses of other element
Address of A[0][1] = 100 +2[0*4+ 1]= 100 +2=102
Address of A[0][2] = 100 +2[0*4+ 2]= 100 +4=104
Address of A[1][0] = 100 +2[1*4+ 0]= 100 +8=108
Addres of A[2][3]= 100 +2[2*4+3]= 100+22= 122

1.8.3 Representation of Multidimensional Arrays


In C, multidimensional arrays are represented using the array-of-arrays representation The linear list is then
stored in consecutive memory just as we store a one-dimensional array.

If an n-dimensional array a is declared as a[upper0][upper1] ⋯ [uppern-1];


then the number of elements in the array is: upper0*upper1*……uppern-1 also represented as

where Π is the product of the upperi's. For instance, if we declare a as a[10][10][10], then we require
10·10·10 = 1000 memory cell to hold the array. There are two common ways to represent multidimensional
arrays: row major order and column major order. We consider only row major order here. As its name
implies, row major order stores multidimensional arrays by rows.
Two dimensional arrays
For instance, we interpret the two dimensional Array a[upper0][upper1] as upper0 rows, , each row
containing upper1 elements.
If we assume that α the base address is the address of a[0][0]

Then the address of an arbitrary element,


Address( a[i][j])= α + i·upper1 + j. ----------(1)

Here the size is not considered. Considering the size the formula can be written as
Address (a[i][j]) = α + w(i·upper1 + j) where w is the size of each unit of memory location.

Representation of a three-dimensional array

a[upper0][upper1][upper2], we interpret the array as upper0 two dimensional arrays of


dimension upper1 × upper2.

address of a[i][j][k]= α + i·upper1·upper2 + j·upper2 + k ---------(2)

CSE@HKBKCE 13
Module-1 BCS304

Representation of an n-dimensional array


Generalizing on the preceding discussion, we can obtain the addressing formula for any element
a[i0][i1] … [in-1] in an n dimensional array declared as:

a[upper0][upper1] … [uppern-1]

If α is the address for A[0][0] … [0] the base address Then

Repeating in this way the


Address (a[i0][i1] … [in-1] )is=

1.9 Polynomials
A polynomial is a sum of terms, where each term has a form axe, where x is the variable, a is the
coefficient, and e is the exponent.

Example for polynomials :

A(x) = 3x20 + 2x5 + 4


B(x) = x4 + 10x3 + 3x2 + 1

The largest (or leading) exponent of a polynomial is called its degree. Coefficients that are zero are not
displayed.
Standard mathematical definitions for the sum and product of polynomials are:

Assume that we have two polynomials

then

CSE@HKBKCE 14
Module-1 BCS304

1.10.1 Polynomial Representation

A polynomial can be represented as an array of structures as follows.

Only one global array, terms, is used to store all the polynomials. The C declarations needed are:

# define MAX_TERMS 100 /*size of terms array*/


typedef struct {
float coef;
int expon;
} polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;

Consider the two polynomials


A(x) = 2x1000 + 1 and
B(x) = x4 + 10x3 + 3x2 + 1.
Figure below shows how these polynomials are stored in the array terms. The index of the first term of A
and B is given by startA and startB, respectively, finishA and finishB give the index of the last term of A
and B respectively. The index of the next free location in the array is given by avail.

For our example, startA = 0, finishA = 1, startB = 2, finishB = 5, and avail =6.

Start A Finish A startB FinishB avail

Coef 2 1 1 10 3 1
exp 1000 0 4 3 2 0
0 1 2 3 4 5 6 7 8

Figure 1.3 Representation of polynomial in array

• There is no limit on the number of polynomials that we can place in terms.


• The total number of nonzero terms must not be greater than MAX_TERMS.

since A (x) = 2x 1000 + 1 uses only six units of storage: one for startA, one for finishA, two for the
coefficients, and two for the exponents. However, when all the terms are nonzero, the current
representation requires about twice as much space as the first one. This representation is useful only when
the number of non zero terms are more.

1.10.2 Polynomial addition

C function that adds two polynomials, A and B to obtain the resultant polynomial D = A + B. The
polynomial is added term by term. The attach function places the terms of D into the array, terms starting

CSE@HKBKCE 15
Module-1 BCS304

at position avail,. If there is not enough space in terms to accommodate D, an error message is printed to
the standard error device and we exit the program with an error condition.

void padd(int startA,int finishA,int startB, int finishB, int *startD,int *finishD)
{
/ * add A(x) and B(x) to obtain D(x) */
float coefficient;
*startD = avail;
while (startA <= finishA && startB <= finishB)
{
switch(COMPARE(terms[startA].expon, terms[startB].expon))
{
case -1: attach(terms[startB].coef,terms[startB].expon);
startB++;
break;
case 0: coefficient = terms[startA].coef + terms[startB].coef;
if (coefficient)
attach(coefficient,terms[startA].expon);
startA++;
startB++;
break;
case 1: attach(terms[startA].coef,terms[startA].expon);
startA++;
}
}
While(startA <= finishA)
{
attach(terms[startA].coef,terms[startA].expon); /* add in remaining terms of A(x) */
startA++;
}

While(startB <= finishB)


{
attach(terms[startB].coef, terms[startB].expon); /* add in remaining terms of B(x) */
startB++;
}

*finishD = avail-1;
}

/* add a new term to the polynomial */

void attach(float coefficient, int exponent)


{
if (avail >= MAX_TERMS)
{
fprintf(stderr,"Too many terms in the polynomial\n") ;
exit(EXIT_FAILURE);

CSE@HKBKCE 16
Module-1 BCS304

}
terms[avail].coef = coefficient;
terms[avail].expon = exponent;
avail++;
}

Time complexity analysis


The number of non zero terms in A and in B are the most important factors in the time [Link]
first loop can iterate m times and the second can iterate n times. So, the asymptotic computing time of
this algorithm is O(n +m).

1.10 Sparse Matrices


If a matrix contains m rows and n columns the total number of elements in such a matrix is m*n. If m
equals n the matrix is a square matrix. When a matrix is represented as a two dimensional array defined as
a[max_rows][max_cols], we can locate each element quickly by writing a[i][j] where i is the row index
and j is the column index.

Consider the matrix given below. It contains many zero entries, such a matrix is called a sparse matrix

Col0 Col1 Col2 Col3 Col4 Col5


Row0 15 0 0 22 0 -15
Row1 0 11 3 0 0 0
Row2 0 0 0 -6 0 0
Row3 0 0 0 0 0 0
Row4 91 0 0 0 0 0
Row5 0 0 28 0 0 0

When a sparse matrix is represented as a two dimensional array space is wasted for example if we
have 1000x 1000 matrix with only 2000 non zero element, the corresponding 2 dimensional array
requires space for 1,000,000 elements

1.11.1 Sparse Matrix Representation

A Sparse matrix can be represented by using an array of triple <row, col, value >.
In addition to ensure the operations terminate , it is necessary to know the number of rows and columns,
and the number of nonzero elements in the matrix. Putting all this information together a sparse matrix
can be created as follows

SparseMatrix Create(maxRow, maxCol) ::=


#define MAX_TERMS 101 /* maximum number of terms +1*/
typedef struct {
int col;
int row;
int value;
} term;

CSE@HKBKCE 17
Module-1 BCS304

term a[MAX_TERMS] ;

Example

Col0 Col1 Col2 Col3 Col4 Col5 Row Col value


Row0 15 0 0 22 0 -15 a[0] 6 6 8
Row1 0 11 3 0 0 0 a[1] 0 0 15
Row2 0 0 0 -6 0 0 a[2] 0 3 22
Row3 0 0 0 0 0 0 a[3] 0 5 -15
Row4 91 0 0 0 0 0 a[4] 1 1 11
Row5 0 0 28 0 0 0 a[5] 1 2 3
a[6] 2 3 -6
a[7] 4 0 91
a[8] 5 2 28

a)Two dimensional array b) Sparse matrix stored as triples

Figure 1.4 two dimensional array and its sparse matrix stored as triples

1.11.2 Transposing a Matrix

To transpose a matrix we must interchange the rows and columns. This means that each element a[i][j] in
the original matrix becomes element b[j][i] in the transpose matrix.

The algorithm finds all the elements in column 0 and store them in row 0 of the transpose matrix, find all the elements
in column 1 and store them in row 1, etc." Since the original matrix was ordered by rows and the columns were
ordered within each row. The transpose matrix will also be arranged in ascending order. The variable, currentb,
holds the position in b that will contain the next transposed term. The terms in b is generated by rows by collecting
the nonzero terms from column i of a

The transpose b of the sparse matrix a of figure 1.4b is shown in figure 1.5

Row Col value


b[0] 6 6 8
b[1] 0 0 15
b[2] 0 4 91
b[3] 1 1 11
b[4] 2 1 3
b[5] 2 5 28
b[6] 3 0 22
b[7] 3 2 -6
b[8] 5 0 -15

Figure 1.5 Transpose of the matrix

Function to find the transpose of a sparse matrix

void transpose(term a[], term b[]) /* b is set to the transpose of a */


{
int n,i,j, currentb;

CSE@HKBKCE 18
Module-1 BCS304

n = a[0].value; /* total number of elements */


b[0].row = a[0].col; /* rows in b = columns in a */
b[0] .col = a[0] .row; /* columns in b = rows in a */
b[0].value = n;
if (n > 0 ) /* non zero matrix */
{
currentb = 1;
for (i = 0; i < a[0].col; i++) /* transpose by the columns in a */
for (j = 1; j <= n; j++)
if (a[j].col == i) /* find elements from the current column */
{
b[currentb].row = a[j].col; /* element is in current column, add it to b */
b[currentb].col = a[j].row;
b[currentb].value = a[j].value;
currentb++;
}
}
}
Analysis of transpose: Hence, the asymptotic time complexity of the transpose algorithm is
O(columns·elements).

1.11.3 Fast Transpose


We can transpose a matrix represented as a sequence of triples in O(columns + elements) time. This
algorithm, fastTranspose is listed below .

It first determines the number of elements in each column of the original matrix. This gives us the number
of elements in each row of the transpose matrix. From this information, we can determine the starting
position of each row in the transpose matrix. We now can move the elements in the original matrix one by
one into their correct position in the transpose matrix. We assume that the number of columns in the original
matrix never exceeds MAX_COL.

Program Fast Transpose

void fastTranspose(term a[], term b[]) /* the transpose of a is placed in b */


{
int rowTerms[MAX_COL], startingPos[MAX_COL];
int i,j, numCols = a[0].col, numTerms = a[0].value;
b[0].row = numCols;
b[0].col = a[0].row;

CSE@HKBKCE 19
Module-1 BCS304

b[0].value = numTerms;
if (numTerms > 0) { /* nonzero matrix */
for (i = 0; i < numCols; i++)
rowTerms[i] = 0;
for (i = 1; i <= numTerms; i++)
rowTerms[a[i].col]++;
startingPos[0] = 1;
for (i = 1; i < numCols; i++)
startingPos[i] = startingPos[i-1] + rowTerms[i-1];

for (i = 1; i <= numTerms; i++)


{
j = startingPos[a[i].col]++;
b[j].row = a[i].col;
b[j].col = a[i].row;
b[j].value = a[i].value;
}
}
}
Analysis of Fast Transpose

• The first two for loops compute the values for rowTerms, the third for loop carries out the
computation of startingPos, and the last for loop places the triples into the transpose matrix. These
four loops determine the computing time of fastTranspose.
• The bodies of the loops are executed numCols, numTerms, numCols - 1, and numTerms times,
respectively. The computing time for the algorithm is O(columns + elements).
• However, transpose requires less space than fastTranspose since the latter function must allocate
space for the rowTerms and startingPos arrays.

1.11 Strings
A string is an array of characters that is delimited by the null character (\0).

Example1:
Char s[100] = {“class”} ;
The string is internally represented as follows

C L A S S \0
S[0] S[1] S[2] S[3] S[4] S[5]

The same can also be declared as char s[]={“class”} ;

Using this declaration the compiler would have reserved just enough space to hold each character word
including the null character. In such cases we cannot store a string of length more than 5 in s

C provides several string functions which we access by including the header file string.h

CSE@HKBKCE 20
Module-1 BCS304

Given below is a set of C string functions

char *strcat(char *dest, const char Appends the string pointed to, by src to the end of the
*src) string pointed to by dest.

int strcmp(const char *str1, const Compares the string pointed to, by str1 to the string
char *str2) pointed to bystr2.

char *strcpy(char *dest, const char Copies the string pointed to, by src to dest and return dest
*src)

size_t strlen(const char *str) Returns the length of the string str . But not including the
terminating null character.

1.12.1 Implementation of string Manipulation Functions

Function to find the length of a given string


int string_length(const char *s) {
int i = 0,len=0;
for(i=0;s[i] != '\0;i++') {
len++;
}
return len;
}

Function to concatenate the given two strings


Given two stings s1 and s2 , s2 string is copied to the end of s1 string
void concat_str(char s1[],char s2[])
{
int i=0,j=0;
while(s1[i]!= '\0')
i++;
while(s2[j]!='\0')
{
s1[i]=s2[j];
j++;
i++;
}
s1[i]='\0';
}

CSE@HKBKCE 21
Module-1 BCS304

Function to reverse a string

void reverseString(char str[]) {


int start = 0;
int end = strlen(str) - 1;
char temp;
// Swap characters from start to end until they meet in the middle
while (start < end)
{
temp = str[start];
str[start] = str[end];
str[end] = temp;
start++;
end--;
}
}

Function for String Copy


Copies string s2 to s1

void str_copy(char s1[],char s2[])


{
int i=0,j=0;
while(s2[j]!='\0')
{
s1[i]=s2[j];
j++;
i++;
}
s1[i]='\0';
}

Function to compare two strings


This function compares two strings and returns 1 if the two strings are equal else returns 0;
void str_compare(char s1[],char s2[])
{
int i=0,j=0;
if (strlen(s1) !=strlen(s2)
return(0)

else
{
while(s1[i]!='\0')
{
if((s1[i]!=s2[j])
return(0);
j++;
i++;
}

CSE@HKBKCE 22
Module-1 BCS304

retun(1);
}

1.12.2 Pattern matching


Consider two strings str and pat where pat is a pattern to be searched for in stri

The easiest way to find if the pat is in str is by using the built in function strstr.
Example: If we have a declaration as follows
Char pat[max_size], str[max_size], *t;

The pattern matching can be carried out as follows


If((t= strstr(str,pat))
Printf(“The pattern found in the string is %s”,t);
Else
Printf(“ The pattern was not found”);

Since there are different methods to find pattern matching discussed below are two functions that finds
pattern matching in a more efficient way.

The easiest and the least efficient method in pattern matching is sequential search. The computing time is
of O(n.m).

Exhaustive patter matching is improved in nfind function by

 Quitting when strlen(pat) is greater than the number of remaining characters


 Compare the first and last character if pat and string before we check the remaining
characters

int nfind(char *string,char *pat)


{
Int i,j,start=0;
int lasts=strlen(string)-1;
int lastp=strlen(pat)-1;
int endmatch= lastp

for(i=0;endmatch<=lasts;endmatch++,strt++)
{
If(string[endmatch] == pat[lastp])
{
j=0;i= start;
while(j<lastp && string[i]== pat[j])
{
i++;
j++);
}
}
if(j==lastp)
return start;
}

CSE@HKBKCE 23
Module-1 BCS304

return -1
}

Simulation of nfind
Pattern

a a B

j lastp

a b a b b A a b a a

s em ls
No Match

a b a b b A a b a a

s em ls
No Match

a b a b b A a b a a

s i em ls

No Match

a b a b b A a b a a

s Em ls

No Match

a b a b b A a b a a

s em ls
No Match

a b a b b A a b a a

CSE@HKBKCE 24
Module-1 BCS304

S em ls
Match
Analysis of nfind algorithm.

The speed of the program is linear O(m) the length of the string in the best and average case but the worst
case computing is still O(n.m)

Knuth, Morris, pratt pattern matching algorithm

Ideally we would like an algorithm that works in O(strlen(str)+ strlen(pat)) time. This is optimal for this
problem as in the worst cast it is necessary to look at all characters in the pattern and string at least once.
We want to search the string for the pattern without moving backwards in the string. That is if a mismatch
occurs we want to use the knowledge of the characters in the pattern and the position in the pattern where
the mismatch occurred to determine where the search should continue. Knuth, Morris, and pratt have
developed an algorithm that works in this way and has linear complexity.

The following declarations are assumed.


#define max_string_size 100
#define max_pat_size 100
int pmatch();
void fail();
int failure[max_pat_size];
char string[max_string_size];
char pat[max_pat_size];

int pmatch(char *string,char *pat)


{
Int i=0,j=0;
Int lens= strlen(string);
Int lenp= strlen(spat);
While (i<lens && j<lenp)
{
If (string[i] == pat[j])
{
i++; j++; }
else if (j==0) i++;
else j= failure[j-1] +1;
}
return ((j=lenp) ? (i-lenp) : -1);
}

Void fail(char *pat)


{
int n=strlen(pat);
failure[0]=-1;
for (j=1;j<n;j++)
{
i= failure[j-1];

CSE@HKBKCE 25
Module-1 BCS304

while((pat[j]!=pat[i+1] && (i>=0)


i= failure[i];
if (pat[j]== pat[i+1])
failure[j]= i+1;
Else failure[j]= -1;
}
}

Example:

For the pattern pat=abcabcacab we have the failure values calculated as below

j 0 1 2 3 4 5 6 7 8 9

pat a b c a b c a c a b
failure -1 -1 -1 0 1 2 3 -1 0 1

Analysis of pmatch
The time complexity of function pmatch is O (m) = O(strlen(string))

Analysis of fail

The computing time of fails is O(n)=O(strlem(pa)).

Therefore when the failure function is not known in advance the total computing time is
O(strlen(string)) + O(strlem(pa))

1.13 Stacks
1.13.1 Stack Definition and Examples

stack is an ordered list in which insertions (also called push) and deletions (also called pops ) are made at
one end called the top.

Given a stack S = (a0, ...., an-1), we say that a0 is the bottom element, an-1 is the top element,
and ai is on top of element ai-1, 0 < i < n.

Since the last element inserted into a stack is the first element removed, a stack is also known
as a Last-In-First-Out (LIFO) list.

Illustration of stack operations

CSE@HKBKCE 26
Module-1 BCS304

Illustration of the push and pop operations

Implementation of stack
Stackis implemented by using a one-dimensional array, stack [MAX_STACK_SIZE], where MAX is
the maximum number of entries.
• The first, or bottom, element of the stack is stored in stack [0], the second in stack [1], and the ith
in stack [i-1].
• Variable top points to the top element in the stack.
• Top= -1 to denote an empty stack.

Array Implementation of a stack of integers


#include<stdio.h>
#define MAX 10
int top= -1,stack[MAX];

void push(int item)


{
if (top==MAX-1)
printf("Stack Overflow\n");
else
stack[++top]=item;
}

int pop()
{
int itemdel;
if (top==-1)
return 0;
else
{
itemdel=stack[top--];
return itemdel;
}
}

void display()
{
int i;
if(top==-1)
printf("Stack Empty\n");
else

CSE@HKBKCE 27
Module-1 BCS304

{
printf("Elements Are:\n");
for(i=top;i>=0;i--)
printf("%d\n",stack[i]);
}
}

void main()
{
int ch,item,num,itemdel;
while(1)
{
printf("\nEnter the Choice\[Link]\[Link]\[Link]\[Link]\n");
scanf("%d",&ch);
switch(ch)
{
case 1: printf("Enter item to be inserted\n");
scanf("%d",&item);
push(item);
break;
case 2: itemdel=pop();
if(itemdel)
printf("\n Deleted Item is:%d\n",itemdel);
else
printf("Stack Underflow\n");
break;
case 3: display();

break;
case 4: exit(0);
}
}
}

1.13.2 Stacks Using Dynamic Arrays


If we do not know the maximum size of the stack at compile time, space can be allocated for the
elements dynamically at run time and the size of the array can be increases as needed.

Creation of stack
Here the capacity of the stack is taken as 1. The value of the capacity can be altered specific to the
application

StackCreateS() ::=

int *stack
Stack=(int*)malloc(stack, sizeof(int));
int capacity = 1;
int top = -1;

CSE@HKBKCE 28
Module-1 BCS304

BooleanIsEmpty(Stack) ::= top < 0;


BooleanIsFull(Stack) ::= top >= capacity-1;

The function push remains the same except that MAX_STACK_SIZE is replaced with capacity
void push(element item)
{
if (top >=capacity-1)
stackFull();
stack[++top] = item;
}

The code for the pop function remains unchanged

element pop()
{/* delete and return the top element from the stack */
if (top == -1)
return stackEmpty(); /* returns an error key */
return stack[top--];
}

Stackfull with Array doubling

The code for stackFull is changed. The new code for stackFull attempts to increase the capacity of the array
stack so that we can add an additional element to the stack. In array doubling, the capacity of the array is
doubled whenever it becomes necessary to increase the capacity of an array.
void stackFull()
{
stack=(int*)realloc(stack, 2 * capacity * sizeof(int))
capacity =capacity * 2;
}

Analysis
• In the worst case, the realloc function needs to allocate 2*capacity *sizeof (*stack) bytes of memory
and copy capacity*sizeof (*stack)) bytes of memory from the old array into the new one.
• Under the assumptions that memory may be allocated in O(1) time and that a stack element can be
copied in O(1) time, the time required by array doubling is O(capacity). The total time spent in
array doubling is of O(n) where n is the total number of push operations.
• Hence even with the time spent on array doubling in the total time of push over all n pushes in O(n).
This conclusion is valid even the stack array is resized by a factor c>1.

1.13.3 Application of stack


➢ Conversion of Expression
➢ Evaluation of expression
➢ Recursion

Infix, postfix(Suffix) and Prefix(polish)

CSE@HKBKCE 29
Module-1 BCS304

Expression is a collection of operands and operators


An expression can be represented in three different ways.
➢ Infix expression: operators are in between the two operands . Example a+b
➢ Prefix expression: operators are present before the operands. Example +ab
➢ Postfix expression: operators are present after the operands. Example ab+

The prefixes “pre”, “post”, and “in” refer to the relative position of the operator with respect to the two
operands.

To convert an expression from infix to prefix or postfix we follow the rules of precedence.
Precedence : The order in which different operators are evaluated in an expression is called precendence
Associativity : The order in which operators of same precedence are evaluated in an expression is called
Associativity.

The operators are listed in the order of higher precedence down to lower precedence

Operator Associativity
--,++ left-to-right
Unary operators ,!,-,+, &, Right to left
*,sizeof
*,/,% left-to-right
+,- left-to-right

Converting an expression from infix to postfix


The operands in the infix and the postfix expression are in the same order. With respect to operators ,
precedence of operators plays an important role in converting an infix expression to postfix expression. We
make use of the stack to insert the operators according to their precedence.

The following operations are performed to convert an infix expression to postfix.

Scan the symbol character by character


➢ If the symbol is an operand place it in the postfix string
➢ If the symbol is an opening parenthesis push it on to the stack
➢ If the symbol is a closing parenthesis pop the contents of the stack until we see an opening
parenthesis and place it in the postfix string. The opening parenthesis and the closing parenthesis is
not placed in the postfix string.
➢ If the symbol is an operator and if the precedence of the input symbol is more than the precedence
of the symbol on top of the stack, then the operator is pushed on to the stack. If the precedence of
the input symbol is lesser than the symbol on top of the stack we pop each such operators and place
it in the postfix string

Algorithm Polish(Q,P)
Suppose Q is an arithmetic expression written in infix notation. This algorithm finds the equivalent
Postfix expression P.

1. Push ‘(’ on to STACK to identify the end of the stack

CSE@HKBKCE 30
Module-1 BCS304

2. Scan Q from Left to right and repeat steps 3 to 6 for each character of Q until the end of the string
3. If an operand is encountered add it to P
4. If a left parenthesis is encountered ,push it to onto STACK.
5. If an operator  is encountered then
a. Repeatedly pop each operator that has equal or higher precedence and add to P.
b. Add  to Stack
6. If a right Parenthesis is encountered, then
a. Repeatedly pop from stack and add to P each operator on top of stack until until a left
parenthesis is encountered.
b. Remove the left Paranthesis.[ do not add left parenthesis to stack]
[End of If Structure.]
[End of Step 2 loop]
7. Repeatedly pop stack until it is empty and add to P
8. Exit

Function to convert an infix expression to postfix

Assume the stack and the top is declared globally


char s[25];
int top=-1;

int precd(char op)


{
int r;
switch(op)
{
Case ‘^’:
Case ‘$’ r=3; break;
case '*':
case '/': r=2;break;
case '+':
case '-': r=1;break;
case '(': r=0;break;
case '#': r=-1;break;
}
return(r);
}

void infix_postfix(char infix[],char postfix[])


{
int i,p=0;
char symbol,item;
push('#');
for (i=0;infix[i]!='\0';i++)
{
symbol=infix[i];
switch(symbol)
{

CSE@HKBKCE 31
Module-1 BCS304

case '(': push(symbol);


break;
case ')':item=pop();
while(item!='(')
{
postfix[p++]=item;
item=pop();
}
break;
case '+':
case '-':
case '*':
case '/':
case '%':while(precd(s[top])>=precd(symbol))
{
item=pop();
postfix[p++]=item;
}
push(symbol);
break;

case '^': while(precd(s[top])>precd(symbol))


{
item=pop();
postfix[p++]=item;
}
push(symbol);
break;
default: postfix[p++]=symbol;
break;
}
}
while(top>0)
{
item=pop();
postfix[p++]=item;
}
postfix[p]='\0';
}

Example: Translation of the infix string a*(b+c)*d to postfix

Infix Stack Postfix


Token [0] [1] [2] [3]
a # a
* # * a
( # * ( a
b # * ( ab
+ # * ( + ab

CSE@HKBKCE 32
Module-1 BCS304

c # * ( + abc
) # * abc+
* # * abc+*
d # * abc+*d
Eos(\0) # abc+*d*

Analysis:
Let n be length of the infix string. (n) time is spent extracting tokens . There are two while loop where
the total time spent is (n) since the number of tokens that get stacked and unstacked is linear in n . So
the complexity of the function is (n)

Evaluating a postfix expression

Each operator in a postfix string refers to the previous two operands in the string. If we are parsing a
string, each time we read operands we push it to the stack and when we read a operator, its operands will
be the two topmost elements in the stack. We can then pop these two operands and perform the indicated
operation and push the result back to the stack so that it can be used by the next operator.

Example: Evaluation of postfix string 6 2 3 + - 3 8 2 / + * 2 $ 3 +

Token Stack content Operand1 Operand2 Result


[0] [1] [2] [3]
6 6
2 6 2
3 6 2 3
+ 6 5 2 3 5
- 1 6 5 1
3 1 3
8 1 3 8
2 1 3 8 2
/ 1 3 4 8 2 4
+ 1 7 3 4 7
* 7 1 7 7
2 7 2
$ 49 7 2 49
3 49 3
+ 52 49 3 52

The following function evaluates a postfix expression using a stack.


A stack of float elements is declared globally

float s[25];
int top;

float Evaluate(char *postfix)


{

CSE@HKBKCE 33
Module-1 BCS304

int i=0;
float res=0,op1,op2;
char symbol;

While(postfix[i]!=’\0’)
{

Symbol=postfix[i];
If isdigit(symbol)
{
Push(symbol-‘0’);
}
Else
{
op2=pop();
op1=pop();
res=operation(symbol,op1,op2);
push(res);
}
}
res=pop();
return(res);
}

float operation(char op, float op1,float op2)


{
float res;
switch(op)
{
case '+': res= op1+op2;
case '-': res= op1-op2;
case '*': res= op1*op2;
case '/': res= op1/op2;
case '^': res= pow(op1,op2);
}
return (res);
}

Limitations of the program

➢ It does not check if the postfix expression is valid or not. If we input erroneous expression it returns
wrong result
➢ We cannot enter negative numbers, as the symbol to indicate negation will be misinterpreted as
subtraction operation

Analysis:
Let n be length of the postfix string then the complexity of the function is (n)

Algorithm PostfixEval(P)

CSE@HKBKCE 34
Module-1 BCS304

This algorithm finds the VALUE of an arithmetic expression P written in postfix notation
1. Scan P from Left to righ
2. Repeats steps 3 and 4 until we reach the end of P
3. If an operand is encountered put it on stack
4. If an operator  is encountered then:
a) remove two top elements of STACK, where A is the top element and B is the next
top- element
b) Evaluate B  A
c) Place the result of (4) in STACK.
[End of If structure]
[End of step 2 Loop]
5. Set VALUE equal to the top element of STACK.
6. EXit

CSE@HKBKCE 35

You might also like