Department of Computer Applications
URCW
B-Tree
A B-Tree is a specialized m-way tree designed to optimize data access, especially on
disk-based storage systems. In a B-Tree of order m, each node can have up to m children and
m-1 keys, allowing it to efficiently manage large datasets. The value of m is decided based on
disk block and key [Link] of the standout features of a B-Tree is its ability to store a
significant number of keys within a single node, including large key values. It significantly
reduces the tree’s height, hence reducing costly disk operations.
B Trees allow faster data retrieval and updates, making them an ideal choice for systems
requiring efficient and scalable data management. By maintaining a balanced structure at all
times, B-Trees deliver consistent and efficient performance for critical operations such as
search, insertion, and deletion.
Properties of a B-Tree
A B Tree of order m can be defined as an m-way search tree which satisfies the
following properties:
All leaf nodes of a B tree are at the same level, i.e. they have the same depth (height
of the tree).
The keys of each node of a B tree (in case of multiple keys), should be stored in the
ascending order.
In a B tree, all non-leaf nodes (except root node) should have at least m/2 children.
All nodes (except root node) should have at least m/2 - 1 keys.
If the root node is a leaf node (only node in the tree), then it will have no children and
will have at least one key. If the root node is a non-leaf node, then it will have at least
2 children and at least one key.
A non-leaf node with n-1 key values should have n non NULL children.
Introduction to B TREE and its operations, invention of B- tree
A B tree is designed to store sorted data and allows search, insertion and deletion operation
to be performed in logarithmic time. As In multiway search tree, there are so many nodes
which have left subtree but no right subtree. Similarly, they have right subtree but no left
subtree. As is known, access time in the tree is totally dependent on the level of the tree. So
our aim is to minimize the access time which can be through balance tree only.
For balancing the tree each node should contain n/2 keys. So the B tree of order n can be
defined as:
1. All leaf nodes should be at same level.
2. All leaf nodes can contain maximum n-1 keys.
3. The root has at least two children.
4. The maximum number of children should be n and each node can contain k keys.
Where, k<=n-1.
5. Each node has at least n/2 and maximum n nonempty children.
Department of Computer Applications
URCW
6. Keys in the non-leaf node will divide the left and right sub-tree where the value of left
subtree keys will be less and value of right subtree keys will be more than that
particular key.
Operations performed on B Tree
1. Insertion in B-Tree
2. Deletion from B-Tree
1) Insertion in B-Tree
The insertion of a key in a B tree requires the first traversal in B-tree. Through the traversal, it
is easy to find that key which needs to be inserted is already existed or not. There are
basically two cases for inserting the key that are:
1. Node is not full
2. Node is already full
If the leaf node in which the key is to be inserted is not full, then the insertion is done in the
node.
If the node were to be full then insert the key in order into existing set of keys in the node,
split the node at its median into two nodes at the same level, pushing the median element up
by one level.
Let us take a list of keys and create a B-Tree: 5,9,3,7,1,2,8,6,0,4
1) Insert 5
2) Insert 9: B-tree insert simply calls B tree insert non-full, putting 9 to the right of 5.
3) Insert 3: Again B-tree insert non-full is called
4) Insert 7: Tree is full. We allocate a new empty node, make it the root, split a former root,
and then pull 5 into a new root.
Department of Computer Applications
URCW
5) Insert 1: It goes with 3
6) Insert 2: It goes with 3
7) Insert 8, 6: As firstly 8 goes with the 9 and then 6 would go with 7, 8, 9 but that node is
full. So we split it bring its middle child into the root.
Department of Computer Applications
URCW
8) Insert 0, 4: 0 would go with the 1, 2, and 3 which are full, so we split it sending the middle
child up to the root. Now it would be nice to just stick 4 in with 3, but the B-tree algorithm
requires us to split the full root. Now we can insert 4 assured that future insertion will work.
2) Deletion from B Tree
Let us take a B-tree of order 5,
Deletion of the key also requires the first traversal in B tree, after reaching on a particular
node two cases may be occurred that are:
1. Node is leaf node
2. Node is non leaf node
Example: Let us take a B tree of order 5
1) Delete 190: Here 190 is in leaf node, so delete it from only leaf node.
Department of Computer Applications
URCW
2) Delete 60: Here 60 is in non leaf node. So first it will be deleted from the node and then
the element of the right child will come in that node.
B+ Tree
Introduction
A B+ Tree is a type of data structure used in computer science for efficient data storage and
retrieval. Unlike B Tree, the B+ Tree stores all values in the leaf nodes and use internal
nodes only for indexing. This structure ensures fast data access, making it ideal for database
indexing and file systems.
B+ Tree
A B+ Tree is a type of data structure used in computer science to store and organize data. It is
a special kind of tree where all the values (data) are stored in the leaf nodes, and the internal
nodes are used only for indexing (like a table of contents). This structure helps in quickly
finding, inserting, and deleting data. In a B Plus Tree, the leaf nodes are linked together like
a linked list, which makes it easy to find a range of values quickly.
Properties of B+ Tree
Balanced Tree Structure: B+ Trees are balanced, meaning all leaf nodes are at the
same level.
Leaf Nodes Store All Values: All actual data values are stored in the leaf nodes.
Internal Nodes for Indexing: Internal nodes contain only keys and pointers to child
nodes.
Department of Computer Applications
URCW
Linked List of Leaf Nodes: Leaf nodes are linked together in a linked list.
Order of the Tree: The order (m) of a B+ Tree defines the maximum number of
children each internal node can have.
Minimum Degree: The minimum degree (t) of a B+ Tree is defined as ⌈m/2⌉.
Efficient Disk Read/Write: B+ Trees are designed to minimize disk I/O operations.
Nodes in B+ Tree
In a B+ Tree, nodes are of two main types: internal nodes and leaf nodes.
Internal Nodes
Internal nodes are used for indexing and routing purposes within the tree. They do not store
actual data values but rather keys that help in navigating the tree.
Leaf Nodes
Leaf nodes store the actual data values. All the data in a B+ Tree is contained within the leaf
nodes.
Order of B+ Tree
The order of a B+ Tree, often denoted as m, is a fundamental property that determines the
tree's structure and balance. The order of a B+ Tree specifies the maximum number of
children that any node in the tree can have. Here’s a detailed explanation of what the order of
a B+ Tree means and its implications:
Order m:
The order of a B+ Tree is the maximum number of children that any internal node can have.
For a B+ Tree of order m:
Each internal node can have at most m children.
Each internal node (except the root) must have at least (m/2) children.
Each internal node must have between (m/2) - 1 and m - 1 keys.
Each leaf node must have between (m/2) and m values.
Operations on B+ Trees
B+ Trees support various operations such as insertion, deletion, and searching. These
operations ensure that the tree remains balanced and efficient for data retrieval and
modification.
Department of Computer Applications
URCW
1. Insertion
1. Start from the root and find the appropriate leaf node where the key should be
inserted.
2. Insert the key in the leaf node in sorted order.
3. Split the Node (if necessary):
If the leaf node has more than m−1 keys after insertion, split the node into two.
Promote the middle key to the parent node.
If the parent node also overflows, repeat the splitting process up to the root.
4. Ensure that leaf nodes are linked correctly after any split.
2. Deletion
1. Start from the root and find the leaf node containing the key.
2. Remove the key from the leaf node.
3. Merge or Redistribute Nodes (if necessary):
If the leaf node has fewer than ⌈m/2⌉ keys after deletion, merge or redistribute nodes.
If merging, remove the corresponding key from the parent and merge nodes.
If the parent node underflows, repeat the process up to the root.
3. Searching
1. Begin the search from the root node.
2. Compare the key with the keys in the current node:
If the key is less than a key in the node, follow the corresponding child pointer.
If the key is greater, move to the next key or child pointer.
3. Continue this process until you reach the leaf node.
4. Look for the key in the leaf node.
Department of Computer Applications
URCW
Applications of B+ Tree
B Plus Tree is widely used in various applications due to their efficient data storage, retrieval,
and range query capabilities:
1. Database Indexing
B+ Trees are extensively used in database management systems to implement indexing.
Example:
SQL databases use B+ Trees to index columns, enabling fast lookups for query operations
such as SELECT, UPDATE, and DELETE.
2. File Systems
B+ Trees are used in file systems to manage files and directories.
Example:
Many modern file systems, such as NTFS (New Technology File System) and ReiserFS
(Reiser File System), use B+ Trees for directory indexing and file metadata management.
3. Metadata Indexing
B+ Trees are used to index metadata in various systems.
Example:
Content management systems (CMS) use B+ Trees to index metadata for documents, images,
and other digital assets, facilitating efficient searches and access control.
4. Caching Systems
B+ Trees are used in caching systems to manage cache entries.
Example:
Web browsers and web servers use B+ Trees to manage cached web pages and resources,
ensuring fast retrieval of frequently accessed content.
5. Transaction Processing
B+ Trees are used in transaction processing systems to manage transaction logs.
Example:
Financial systems and banking applications use B+ Trees to manage transaction logs,
ensuring accurate and efficient processing of financial transactions.
6. Memory Management
B+ Trees are used in memory management systems to track free memory blocks.
Example:
Operating systems use B+ Trees to manage virtual memory and allocate memory to running
processes, ensuring efficient use of available memory.
7. Geographic Information Systems (GIS)
B+ Trees are used in GIS to index spatial data.
Example:
GIS applications use B+ Trees to manage and query geographic coordinates, enabling quick
searches for locations and regions.