### Inserting a key into the B/B+-tree

• Assumption:

 We assume the search keys in the B-tree are unique, i.e., no duplicate keys are stored in the B-tree

• Overview: Inserting the search key x into a B-tree

• Overview of the B-tree insert algorithm:

1. A new search key (and its corresponding record pointer) is always inserted into a leaf node

 The insertion can cause an overflow condition, which splits the leaf node into 2 leaf nodes

2. An overflow condition triggers the execution of the insert algorithm for an internal node

 The insert algorithm for an internal node is similar to (but not the same as) the insert algorithm for a leaf node !!! (So we will actually discuss 2 insertion algorithms !!!)

• Definition: the middle key

• Middle key:

• Let n = the maximum number of keys in a B+ tree node

The middle key is the key with the index equal to:

 ``` m = ⌈ (n+1)/2 ⌉ ```

• Examples: the middle key

```
    n    |   m    |  key indices
---------+--------+---------------
    3    |   2    |  1 2 3
    4    |   3    |  1 2 3 4
    5    |   3    |  1 2 3 4 5
    6    |   4    |  1 2 3 4 5 6
   ...   |  ...   |
```

• Note:

• We start counting keys at the number 1:

 First key = k1, second key = k2, third key = k3, ...

(Normally, computer science starts counting at 0 !!! :))
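As a quick check of the formula, here is a minimal Python sketch that recomputes the table above. The function name `middle_key_index` is an assumption made for this illustration; it returns a 1-based index, following the counting convention in the note.

```python
import math

def middle_key_index(num_keys: int) -> int:
    """Return the 1-based index m = ceil((num_keys + 1) / 2) of the middle key."""
    return math.ceil((num_keys + 1) / 2)

# Recompute the rows of the table: n -> m
for n in (3, 4, 5, 6):
    print(n, middle_key_index(n))
```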

• Inserting search key x and its record pointer into a B+-tree

• Given:

• A B+-tree
• Search key value x
• Block/record pointer of x

Goal:

 Insert (x, ptr(x)) into the B+-tree

• Insert Algorithm:

```
/* ========================================================
   Insert( x, record ptr(x) ) into B+-tree
   ======================================================== */
Insert( x, px )
{
   Use the B-tree search algorithm to find the leaf node L
   where x belongs

   Let: L = leaf node used to insert (x, px)

   /* ============================================
      Insert (x, px) in leaf node L
      ============================================ */
   if ( L is not full )
   {
      Shift keys to make room for x
      Insert (x, px) into its position in leaf node L
      return
   }
   else
   {
      /* -------------------------------------------
         Leaf node L has n keys + n ptrs (full !!!):

            L: |p1|k1|p2|k2|....|pn|kn|next|
         ------------------------------------------- */
      Make the following virtual node by inserting px,x in L:

            |p1|k1|p2|k2|...|px|x|...|pn|kn|next|
            (There are (n+1) keys in this node !!!)

      Split this node into 2 "equal" halves:

         Let: m = index of the middle key of the virtual node
                = ⌈((n+1)+1)/2⌉      (the virtual node has n+1 keys)

         L: |p1|k1|p2|k2|...|pm-1|km-1|R|     <--- use the old node for L
         R: |pm|km|....|px|x|...|pn|kn|next|  <--- use a new node for R

      /* ==================================================
         We need to fix the information at 1 level up
         to direct the search to the new node R

         We know that all keys in R: >= km
         ================================================== */
      if ( L == root node of B+ tree )
      {
         Make a new root node containing (L, km, R)
         return;
      }
      else
      {
         /* ---------------------------------------------
            Use the InsertInternal alg to insert:
            (km, R) into the parent node of L
            --------------------------------------------- */
         InsertInternal( (km, R), parent(L) );
                            // Note: all keys in R are ≥ km
                            // So: R is the right subtree of km
      }
   }
}
```
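The leaf-level step can be sketched in Python as follows. This is only an illustration under stated assumptions: a leaf is modelled as a sorted list of `(key, ptr)` pairs, the name `insert_into_leaf` and the pointer strings are hypothetical, and the middle key is taken over the n+1 entries of the virtual node. Instead of calling InsertInternal, the function returns the key that would be passed up.

```python
import bisect
import math

def insert_into_leaf(entries, x, px, n):
    """Insert (x, px) into a leaf holding at most n sorted (key, ptr) pairs.

    Returns (left, right, up_key). When the leaf does not overflow,
    right and up_key are None and left is the updated leaf.
    """
    keys = [k for k, _ in entries]
    pos = bisect.bisect_left(keys, x)
    if pos < len(keys) and keys[pos] == x:
        raise ValueError("duplicate key")      # keys are assumed unique
    # The "virtual node": the leaf with (x, px) in its sorted position.
    virtual = entries[:pos] + [(x, px)] + entries[pos:]
    if len(virtual) <= n:
        return virtual, None, None             # there was room, no split
    m = math.ceil((len(virtual) + 1) / 2)      # 1-based middle of the n+1 entries
    left = virtual[:m - 1]                     # stays in the old node L
    right = virtual[m - 1:]                    # goes to the new node R, starts at km
    return left, right, right[0][0]            # km is the key passed to the parent

# Illustrative values only (n = 3, pointers shown as strings)
full_leaf = [(31, "r31"), (37, "r37"), (41, "r41")]
print(insert_into_leaf(full_leaf, 40, "r40", n=3))
```

Returning the three parts keeps the sketch independent of how parent nodes are stored.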

• Inserting a (search key, right subtree pointer) into an internal node of a B+-tree

• Given:

• An internal node N in the B+-tree
• Search key value x
• The right subtree pointer ( RSub(x) ) of x

Goal:

 Insert (x, RSub(x)) into the internal node N of the B+-tree

• Insert Algorithm for an internal node:

```
/* ========================================================
   Insert (x, RSub(x)) into internal node N of B+-tree
   ======================================================== */
InsertInternal( (x, rx), N )
{
   if ( N is not full )
   {
      Shift keys to make room for x
      Insert (x, rx) into its position in N
      return
   }
   else
   {
      /* -------------------------------------------
         Internal node N has: n keys + n+1 node ptrs:

            N: |p1|k1|p2|k2|....|pn|kn|pn+1|
         ------------------------------------------- */
      Make the following virtual node by inserting x,rx in N:

            |p1|k1|p2|k2|...|x|rx|...|pn|kn|pn+1|
            (There are (n+1) keys and (n+2) pointers in this node !!!)

      Split this node into 3 parts:

         1. Take the middle key out
         2. L = the "left" half   (use the old node to do this)
         3. R = the "right" half  (create a new node to do this)

         Let: m = index of the middle key of the virtual node
                = ⌈((n+1)+1)/2⌉      (the virtual node has n+1 keys)

         1. Take km out
         2. L = |p1|k1|p2|k2|...|pm-1|km-1|pm|        (old node N)
         3. R = |pm+1|km+1|....|x|rx|...|pn|kn|pn+1|  (new node)

      if ( N == root )     // N is same node as L
      {
         Make a new root node containing (L, km, R)
         return;
      }
      else
      {
         InsertInternal( (km, R), parent(N) );   // Recurse !!
      }
   }
}
```
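The 3-way split of an internal node can be sketched in the same way. The assumptions here are that an internal node is modelled as a sorted key list plus a child list with one more entry, that `insert_into_internal` and the child labels are hypothetical names, and that the middle key is taken over the n+1 keys of the virtual node.

```python
import bisect
import math

def insert_into_internal(keys, children, x, rx, n):
    """Insert key x with right subtree rx into an internal node.

    keys: sorted list of at most n keys; children: len(keys) + 1 subtrees.
    Returns (left, up_key, right), where left and right are (keys, children)
    pairs. Without overflow, up_key and right are None.
    """
    pos = bisect.bisect_left(keys, x)
    # Virtual node: x goes at pos, rx becomes the subtree just right of x.
    vkeys = keys[:pos] + [x] + keys[pos:]
    vchildren = children[:pos + 1] + [rx] + children[pos + 1:]
    if len(vkeys) <= n:
        return (vkeys, vchildren), None, None
    m = math.ceil((len(vkeys) + 1) / 2)        # 1-based middle of the n+1 keys
    up_key = vkeys[m - 1]                      # taken out, goes to the parent
    left = (vkeys[:m - 1], vchildren[:m])      # k1..km-1 with p1..pm
    right = (vkeys[m:], vchildren[m:])         # km+1.. with pm+1..
    return left, up_key, right

# Illustrative values only (n = 3, subtrees shown as labels)
print(insert_into_internal([23, 31, 43], ["c0", "c1", "c2", "c3"], 40, "R", n=3))
```

Unlike the leaf case, the middle key appears in neither half: it exists only in the parent after the split.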

• Example 1 of Insert Algorithm: leaf node has space

• Insert ( 10, recordPtr(10) ) into the following B+-tree:

• Insert Algorithm:

• Find the leaf node L that would contain search key 10:

• Shift keys and insert ( 10, recordPtr(10) ) into the correct position:

• Example 2 of Leaf Insert Algorithm: leaf node is full - cascade insert up 1 level

• Insert ( 4, recordPtr(4) ) into the following B+-tree:

• Leaf Insert Algorithm:

• Find the leaf node L that would contain search key 4:

• Make a virtual node by inserting ( 4, recordPtr(4) ) into the correct position:

(The virtual node does not exist ! This step is only done to help you visualize the insert algorithm.)

• Split this virtual node into 2 "equal" halves:

Redrawn:

Next:

• Use InsertInternal( ) to insert:

 The middle search key (= 4) (and the new node pointer R) --- (4, R) --- into the parent node of L

Graphically:

• Continue with the InsertInternal( (4,R), parent(L) ) algorithm:

• Insert (4, R) into the parent node of L:

Result:

Notice that the node R is the right subtree of the key 4 !!!

DONE !!!

• Example 3 of Insert Algorithm: leaf node is full - cascade insert up multiple levels

• Insert ( 40, recordPtr(40) ) into the following B+-tree:

• Insert Algorithm:

• Find the leaf node L that would contain search key 40:

• Make a virtual node by inserting ( 40, recordPtr(40) ) into the correct position:

(The virtual node does not exist ! This step is only done to help you visualize the insert algorithm.)

• Split this virtual node into 2 "equal" halves:

Redrawn:

Next:

 Use InsertInternal( ) to insert (40, R) into the parent node of L

Graphically:

• Continue with the InsertInternal( (40,R), parent(L) ) algorithm:

• Insert (40, R) into the parent node of L:

Result:

(The virtual node does not exist ! It is shown here only to make the algorithm more concrete and to help you visualize the insert algorithm.)

BTW: I will re-use the letters L and R for the next level of node split --- the letters L and R now denote different nodes !!!

• Split this virtual node into 3 parts:

• Take the middle key out !!!
• Make a left half node L
• Make a right half node R

Graphically:

Result:

Notice that the new node R is the right subtree of the key 40 !!!

Recurse:

 ``` InsertInternal( (40, R), parent(L) ) ```

• InsertInternal( (40, R), parent(L) ):

Result:

DONE !!!
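To see the cascade end to end, here is a compact Python sketch of both insert algorithms working together. It makes several simplifying assumptions that are not in the notes: record pointers are omitted, the recursion returns the pair (key, new node) to its caller instead of following a parent pointer, and the names `Node`, `insert` and `leaf_keys` are hypothetical.

```python
import bisect
import math

class Node:
    def __init__(self, keys, children=None, leaf=True):
        self.keys = keys            # sorted search keys
        self.children = children    # subtrees (internal nodes only)
        self.leaf = leaf
        self.next = None            # right sibling (leaf nodes only)

def _insert(node, x, n):
    """Insert x below node. Return None, or (key, new right node) for the parent."""
    if node.leaf:
        bisect.insort(node.keys, x)                 # keys are assumed unique
        if len(node.keys) <= n:
            return None
        m = math.ceil((len(node.keys) + 1) / 2)     # middle of the n+1 keys
        right = Node(node.keys[m - 1:])
        node.keys = node.keys[:m - 1]
        right.next, node.next = node.next, right    # link the new leaf into the chain
        return right.keys[0], right
    i = bisect.bisect_right(node.keys, x)
    split = _insert(node.children[i], x, n)
    if split is None:
        return None
    up_key, new_child = split
    node.keys.insert(i, up_key)                     # new_child is the right subtree
    node.children.insert(i + 1, new_child)          # of up_key
    if len(node.keys) <= n:
        return None
    m = math.ceil((len(node.keys) + 1) / 2)
    middle = node.keys[m - 1]                       # taken out of the node
    right = Node(node.keys[m:], node.children[m:], leaf=False)
    node.keys, node.children = node.keys[:m - 1], node.children[:m]
    return middle, right

def insert(root, x, n):
    """Insert x and return the (possibly new) root."""
    split = _insert(root, x, n)
    if split is None:
        return root
    up_key, right = split
    return Node([up_key], [root, right], leaf=False)   # the old root was split

def leaf_keys(root):
    """Walk the leaf chain from left to right and collect all keys."""
    node = root
    while not node.leaf:
        node = node.children[0]
    out = []
    while node is not None:
        out.extend(node.keys)
        node = node.next
    return out

root = Node([])
for key in [13, 2, 47, 23, 7, 31, 5, 41, 3, 19, 11, 43, 17, 29, 37, 40]:
    root = insert(root, key, n=3)
print(leaf_keys(root))   # the inserted keys in ascending order
```

Returning the split result up the call stack plays the role of InsertInternal( (km, R), parent(N) ), so the nodes need no parent pointers.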

• Implementation details: How to split an overflow node into 2 nodes

• Splitting a leaf node:

• Recall we made a virtual node:

and split the virtual node into two:

• In actual programming code, you should do the following:

```
Allocate a new node R (the right node);

/* ========================================
   Link the leaf nodes
   ======================================== */
R.last pointer = L.last pointer
L.last pointer = R

/* ============================================
   Divide the keys in half over the 2 nodes
   ============================================ */
Let: x = insertedKey;
     m = middle key in L (= k⌈(n+1)/2⌉);

if ( x > m )
{
   /* ======================================
      x belongs to the right half
      ====================================== */
   for (key = "middle + 1" key; key < x; next key)
      Move key from L to R ;

   Put x in R;

   for ( current key ; last key ; next key )
      Move key from L to R
}
else
{
   /* ======================================
      x belongs to the left half
      ====================================== */
   for (key = "middle" key; last key; next key)
      Move key from L to R ;

   /* --------------------------
      Make an empty space for x
      -------------------------- */
   for (key = "middle − 1" key; key > x; previous key)
      Move the key 1 position to the right within L;

   Put x in the empty slot within L;
}
```
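A minimal Python sketch of this procedure is shown below. It never builds the virtual node: it compares x with the middle key of the full node L and moves keys directly. The class `Leaf`, the function `split_leaf` and the sample keys are assumptions made for this illustration, and record pointers are left out. For odd n this gives the same two halves as splitting the virtual node at its middle key.

```python
import bisect
import math

class Leaf:
    def __init__(self, keys):
        self.keys = keys    # sorted keys
        self.next = None    # the "last pointer": right sibling in the leaf chain

def split_leaf(L, x):
    """Split the full leaf L while inserting key x; return the new right node R."""
    n = len(L.keys)
    mid = math.ceil((n + 1) / 2)       # 1-based index of the middle key in L
    R = Leaf([])
    R.next = L.next                    # copy first, while L.next is still the old sibling
    L.next = R
    if x > L.keys[mid - 1]:
        R.keys = L.keys[mid:]          # keys after the middle key move to R
        del L.keys[mid:]
        bisect.insort(R.keys, x)       # x belongs to the right half
    else:
        R.keys = L.keys[mid - 1:]      # the middle key and all later keys move to R
        del L.keys[mid - 1:]
        bisect.insort(L.keys, x)       # x belongs to the left half
    return R

# Illustrative values only
L = Leaf([31, 37, 41])
L.next = Leaf([43, 47])
R = split_leaf(L, 40)
print(L.keys, R.keys, R.next.keys)
```

The order of the two pointer assignments matters: setting `L.next = R` first would lose the reference to the old right sibling.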

• Splitting an internal node:

• The procedure is similar to the one I described above and you should be able to work out the details yourself....

• You do need to define a helper variable to store the key that you extract from the node.

 The extracted key is the middle key of the (n+1) keys formed by the keys of the node together with x. Depending on where x falls, this is either one of the keys already in the node or x itself.

• Postscript