### Multiway search trees (in general)

• Multiway search trees

• A binary search tree is an ordered tree where each internal node has at most 2 internal child nodes

Example:

Note:

  • Each internal node has exactly 2 (internal or external) child nodes
  • External nodes have 0 child nodes (they are "terminal nodes")
  • (External nodes are often simply a null value :-))

• Multiway trees:

• A multiway (m-way) search tree is a generalization of the binary search tree:

  • Each internal node of an m-way tree has exactly m (internal or external) child nodes
  • External nodes have 0 child nodes (they are "terminal nodes")
  • (External nodes are often simply a null value :-))

Example: a 4-way tree

Notice that:

  • Each internal node has 4 (internal or external) child nodes
  • External nodes have no children

• d-node

• Terminology: d-node (d ≤ m)

• A d-node = a node with d internal child nodes

Each d-node contains:

  • d references (pointers) to d subtrees (= child nodes)
  • d − 1 (key, value) pairs (k_0, v_0), (k_1, v_1), ..., (k_{d−2}, v_{d−2}) such that: k_0 < k_1 < ... < k_{d−2}

• Graphical depiction of a d-node (they say that a picture is worth 1000 words....):

• A d-node has d subtrees organized by d−1 keys:

• Keys in the left-most subtree must be < k_0
• Keys in the subtree sandwiched between k_0 and k_1 must lie strictly between k_0 and k_1
• Keys in the subtree sandwiched between k_1 and k_2 must lie strictly between k_1 and k_2
• ...
• Keys in the right-most subtree must be > k_{d−2}
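
The ordering rules above determine which subtree can contain a given key. A minimal sketch of that decision for one node, assuming the node's keys are held in a sorted `int[]`; the names `SubtreeChooser` and `childIndexFor` are made up for illustration and are not part of the course code:

```java
// Hypothetical helper for illustration only, not part of the course code.
class SubtreeChooser {
    /**
     * keys: the d-1 keys of one d-node, in strictly increasing order.
     * Returns the index (0 .. d-1) of the subtree that may contain k,
     * or -1 if k is equal to one of the node's own keys.
     */
    static int childIndexFor(int[] keys, int k) {
        for (int i = 0; i < keys.length; i++) {
            if (k == keys[i]) return -1;   // k is stored in this node itself
            if (k < keys[i]) return i;     // k belongs to the subtree left of keys[i]
        }
        return keys.length;                // k > last key: right-most subtree
    }
}
```

For a node with keys {5, 10}, key 3 maps to subtree 0, key 8 to subtree 1, and key 12 to subtree 2.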

Example:

• Notable facts

• Leaf nodes have zero (0) internal child nodes (i.e., no subtrees)

Example:

• Internal nodes have ≥ 2 internal child nodes (i.e., ≥ 2 subtrees)

Example:

• However:

• There are no nodes that have exactly one internal child node (exactly 1 subtree) !!!

  • An internal node will never have just one (1) subtree
  • An internal node will always have 2 or more subtrees

(You will see that the node insertion/deletion algorithms always produce such a tree.)

• Data structure for a node

• Each node contains:

  • d subtree references
  • d−1 keys (and possibly associated values)
  • 1 reference to the parent node !!!

• Variables:

```java
public class Entry
{
   public String  key;      // Again, I use concrete data types..
   public Integer value;

   public Entry(String k, Integer v)
   {
      key = k;
      value = v;
   }
}

public class Node
{
   public Entry[] e;        // Keys and their associated values
   public Node[]  child;    // Subtrees
   public Node    parent;   // The parent node
   ...
}
```

Correspondence of the variables:

• Allocating the arrays:

• In Java, declaring an array reference variable does not allocate the array; the array itself is allocated separately (with new), after the variable has been defined

• That's why you see:

```java
public class Node
{
   public Entry[] e;        // Keys and their associated values
   public Node[]  child;    // Subtrees
   public Node    parent;   // The parent node
   ...
}
```

• It's the job of the constructor to allocate (create) the arrays.

Example constructor:

```java
public class Node
{
   public Entry[] e;        // Keys and their associated values
   public Node[]  child;    // Subtrees
   public Node    parent;   // The parent node

   /* ===========================================================
      Sample constructor: create a node that contains
      at most 4 subtrees (and 3 keys)
      =========================================================== */
   public Node()
   {
      int i;

      child = new Node[4];     // 4 subtree pointers
      e     = new Entry[3];    // We need 3 keys to separate the 4 subtrees

      /* ------------------------------------
         Initialize all references to null
         ------------------------------------ */
      for ( i = 0; i < 3 ; i++ )
      {
         e[i] = null;
      }

      for ( i = 0; i < 4 ; i++ )
         child[i] = null;

      parent = null;           // Don't forget the parent link...
   }

   ...
}
```
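
The sample constructor hard-codes 4 subtrees. A minimal sketch of a variant that takes the maximum number of subtrees as a parameter is shown below. The names `MEntry` and `MNode` and the parameter `m` are assumptions made for this illustration, not part of the course code. Java sets every element of a newly allocated reference array to null, so this sketch leaves out the explicit initialization loops.

```java
// Hypothetical variant of the Entry/Node classes, for illustration only.
class MEntry {
    String key;
    Integer value;

    MEntry(String k, Integer v) {
        key = k;
        value = v;
    }
}

class MNode {
    MEntry[] e;      // up to m-1 keys with their associated values
    MNode[] child;   // up to m subtrees
    MNode parent;    // the parent node (null for the root)

    // m = maximum number of subtrees per node (assumed to be at least 2)
    MNode(int m) {
        if (m < 2) {
            throw new IllegalArgumentException("m must be at least 2");
        }
        child = new MNode[m];     // every element starts out as null
        e = new MEntry[m - 1];    // m-1 keys separate the m subtrees
        parent = null;
    }
}
```

With this sketch, `new MNode(4)` allocates the same array sizes as the sample constructor: 4 subtree references and 3 entries.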

• Data structure to represent a multi-way tree

• Facts:

  • A multi-way tree is still a tree
  • A tree is just a generalization of a linked list

• So just like a linked list, you must remember the root node of the multi-way tree in the data structure:

```java
public class MultiWayTree
{
   /* =================================================
      Variable to represent a multiway tree: root !
      ================================================= */
   public Node root;

   /* =================================================
      Constructor: make an empty tree
      ================================================= */
   public MultiWayTree()
   {
      root = null;
   }

   ....
}
```
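
Once the root is known, every key can be reached from it. Because of the key ordering inside each node, visiting child[0], e[0], child[1], e[1], ... recursively yields the keys in increasing order. The following is a minimal sketch of such an in-order traversal. It uses its own simplified, hypothetical classes (`TNode` with keys only, and `InOrder`) rather than the course's `Node` class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified node: keys only, at most 4 subtrees.
class TNode {
    String[] key = new String[3];    // unused slots are null
    TNode[] child = new TNode[4];    // missing subtrees are null
}

class InOrder {
    // Appends all keys in the subtree rooted at n to out, in increasing order.
    static void collect(TNode n, List<String> out) {
        if (n == null) return;                  // external node: nothing to visit
        for (int i = 0; i < n.key.length; i++) {
            collect(n.child[i], out);           // subtree to the left of key[i]
            if (n.key[i] == null) return;       // no more keys in this node
            out.add(n.key[i]);
        }
        collect(n.child[n.key.length], out);    // right-most subtree of a full node
    }

    // Convenience wrapper: returns the keys of the whole tree as a list.
    static List<String> keysOf(TNode root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }
}
```

An empty tree (root == null) simply produces an empty list.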

• Traversing/Searching in a multiway search tree

• Example: Find key 8 in the multi-way tree:

• Start at the root and find:

 either an entry in the node that contains key 8, or the subtree that may contain key 8

In this case, we go left:

• In the next node, find:

 either an entry in the node that contains key 8, or the subtree that may contain key 8

In this case, we go between 5 and 10:

• In the next node, find:

 either an entry in the node that contains key 8, or the subtree that may contain key 8

In this case, we found 8:

We return the entry found....

• Example: Find key 7 that is not in the tree --- preparing for insertion :

• Start with both current and previous at the root, and find in the current node:

 either an entry in the node that contains key 7, or the subtree that may contain key 7

In this case, we go left:

Make previous point to the current node, then use current to descend the tree.

• In the next node, find:

 either an entry in the node that contains key 7, or the subtree that may contain key 7

In this case, we go between 5 and 10:

Make previous point to the current node, then use current to descend the tree.

• In the next node, find:

 either an entry in the node that contains key 7, or the subtree that may contain key 7

In this case, we go between 6 and 8:

Make previous point to the current node, then use current to descend the tree.

• Notice that:

We have:

 current == null !!!

More importantly:

 We remember the last node visited (previous), because we may use this node for insertion !!!
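
The two walk-throughs above can be sketched in code. The tree in this sketch is an assumption: it is only one shape that matches the steps described (go left at the root, go between 5 and 10, then reach the node holding 6 and 8), and the root key 20 is invented. The names `SNode` and `SearchTrace` are hypothetical, and int keys are used for brevity.

```java
// Hypothetical, simplified node with Integer keys (null marks an unused slot).
class SNode {
    Integer[] key = new Integer[3];
    SNode[] child = new SNode[4];

    SNode(Integer... keys) {
        System.arraycopy(keys, 0, key, 0, keys.length);
    }
}

class SearchTrace {
    SNode root;
    SNode previous;   // last node visited by the most recent search

    // Returns the node that holds k, or null if k is not in the tree.
    // Either way, previous ends up at the last node that was visited.
    SNode find(int k) {
        SNode current = root;
        previous = root;
        while (current != null) {
            previous = current;
            int i = 0;
            // Skip over the keys in this node that are smaller than k
            while (i < current.key.length && current.key[i] != null
                    && k > current.key[i]) {
                i++;
            }
            if (i < current.key.length && current.key[i] != null
                    && k == current.key[i]) {
                return current;            // found k in this node
            }
            current = current.child[i];    // descend; may become null
        }
        return null;                       // k is not in the tree
    }
}
```

With the assumed tree (root [20], its left child [5, 10], and the node [6, 8] between 5 and 10), `find(8)` returns the node [6, 8]. `find(7)` returns null and leaves `previous` at that same node, which is where an insertion of 7 would start.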

• A high level description of the Multi-way tree search algorithm:

```
Node searchEndPos;   // Use an instance variable to hold the value longer
                     // Remember that LOCAL variables
                     // ONLY exist INSIDE a METHOD

keySearch(k)
{
   Node curr;

   curr = root;
   searchEndPos = curr;

   /* ============================================================
      It is important to know that a node looks like this:

        Node:    child[0]  e[0]  child[1]  e[1]  ...

        0-node:  null      e[0]  null      e[1]  ...
        1-node:  impossible !!!
        2-node:  child[0]  e[0]  child[1]  null  ...
        3-node:  child[0]  e[0]  child[1]  e[1]  child[2]  null ...
      ============================================================= */

   while ( curr != null )
   {
      searchEndPos = curr;     // Remember the last visited node

      if ( e[0] != null && k < e[0].key )
      {
         search in child[0] subtree:  curr = child[0];
      }
      else if ( e[0] != null && k == e[0].key )
      {
         found key k:  return e[0];
      }
      else if ( e[0] is the last entry (the node holds 1 entry) )
      {
         search in child[1] subtree:  curr = child[1];
      }
      // =====================================================
      else if ( e[1] != null && k < e[1].key )
      {
         search in child[1] subtree:  curr = child[1];
      }
      else if ( e[1] != null && k == e[1].key )
      {
         found key k:  return e[1];
      }
      else if ( e[1] is the last entry (the node holds 2 entries) )
      {
         search in child[2] subtree:  curr = child[2];
      }

      .... and so on ...
   }

   return null;     // k not found....
}
```

• Search algorithm for multiway trees: (it's called keySearch instead of findEntry)

```java
/* ===============================================================
   keySearch(k): find entry containing key k

   Return value:
        e[i]  if found (e[i].key == k)
        null  if not found

   AND:  searchEndPos = node last visited in search
   =============================================================== */
Node searchEndPos;   // Use an instance variable to hold the value longer

public Entry keySearch(String k)
{
   int i;
   Node curr;              // current node

   searchEndPos = root;    // searchEndPos = previous node
                           // This variable is NOT local !!!
   curr = root;

   while ( curr != null )
   {
      searchEndPos = curr;     // Remember the last visited node

      /* ============================================================
         It is important to know that a node looks like this:

           Node:    child[0]  e[0]  child[1]  e[1]  ...

           0-node:  null      null  null      null  ...
           1-node:  impossible !!!
           2-node:  child[0]  e[0]  child[1]  null  ...
           3-node:  child[0]  e[0]  child[1]  e[1]  child[2]  null ...
         ============================================================= */

      // d is not declared in this excerpt; the loop bound and the e[i+1]
      // test only work if d equals the number of entry slots (e.length)
      for ( i = 0; i < d; i++ )
      {
         if ( curr.e[i] != null && k.compareTo( curr.e[i].key ) < 0 )
         {
            curr = curr.child[i];
            break;                  // end the for loop
         }

         if ( curr.e[i] != null && k.compareTo( curr.e[i].key ) == 0 )
         {
            return( curr.e[i] );    // found key
         }

         if ( i == d-1 || curr.e[i+1] == null )   // e[i] is the last key
         {
            curr = curr.child[i+1];
            break;                  // end the for loop
         }
      }
   }

   return(null);   // k not found; searchEndPos holds the last node visited
}
```

• Example Program: (Demo above code)

How to run the program:

  • Right click on link(s) and save in a scratch directory
  • To compile: javac TestProg.java
  • To run: java TestProg

Sample output:

```
2:((m,6),(-),(-))
3:((ka,6),(l,6),(-))
1:((kb,6),(-),(-))
0:((j,6),(k,6),(-))
1:((h,8),(-),(-))
2:((g,7),(-),(-))
0:((e,5),(f,6),(-))
0:((ba,6),(d,4),(i,9))
1:((ca,6),(-),(-))
1:((c,3),(-),(-))
0:((bb,6),(-),(-))
3:((b,2),(-),(-))
2:((ae,6),(af,6),(-))
0:((ab,6),(ad,6),(ax,6))
1:((ac,6),(-),(-))
0:((a,1),(aa,6),(-))
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Enter a key: h
get(h):
((ba,6),(d,4),(i,9))
---- traverse LEFT subtree of (i,9)
((g,7),(-),(-))
---- traverse RIGHT subtree of (g,7)
((h,8),(-),(-))
---- FOUND: (h,8)
== get(h) = 8
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Enter a key: x
get(x):
((ba,6),(d,4),(i,9))
---- traverse RIGHT subtree of (i,9)
((ka,6),(l,6),(-))
---- traverse RIGHT subtree of (l,6)
((m,6),(-),(-))
---- traverse RIGHT subtree of (m,6)
===== Not found...
Search ended at node: ((m,6),(-),(-))
== get(x) = null
```

• Postscript: Goodrich's approach is unnecessarily complex

• Goodrich used one single loop to accomplish the entire algorithm, which made it unnecessarily complex (as he did throughout his whole textbook).

His loop is complex because it handles 2 things at the same time:

  • Iterating through all the subtrees
  • Finding out which subtree is the last subtree in the node

• I would split up the 2 tasks and use 2 loops:

```java
/* ===============================================================
   keySearch(k): find entry containing key k

   Return value:
        e[i]  if found (e[i].key == k)
        null  if not found

   AND:  searchEndPos = node last visited in search
   =============================================================== */
Node searchEndPos;   // Use an instance variable to hold the value longer

public Entry keySearch(String k)
{
   int i;
   Node curr;              // current node
   int N;                  // Number of entries
   boolean found;          // Did we find the subtree already ?

   searchEndPos = root;    // searchEndPos = previous node
                           // This variable is NOT local !!!
   curr = root;

   while ( curr != null )
   {
      searchEndPos = curr;     // Remember the last visited node

      /* ===========================================================
         Find out how many entries are stored in the current node
         (d is not declared in this excerpt; it must equal e.length)
         =========================================================== */
      for ( N = 0; N < d; N++ )
         if ( curr.e[N] == null )
            break;             // first unused slot found: N entries

      found = false;   // Flag whether we need to take the right-most subtree

      for ( i = 0; i < N; i++ )
      {
         if ( k.compareTo( curr.e[i].key ) < 0 )
         {
            curr = curr.child[i];   // Go to the left subtree
            found = true;           // We found the subtree !
                                    // DO NOT take the right-most subtree
            break;                  // end the for loop
         }

         if ( k.compareTo( curr.e[i].key ) == 0 )
         {
            return( curr.e[i] );    // found key
         }
      }

      if ( !found )
         curr = curr.child[N];      // Go to the right-most subtree
                                    // because no left subtree was found
   }

   return(null);   // k not found; searchEndPos holds the last node visited
}
```

• Example Program: (Same DEMO, but using my own keySearch algorithm)

How to run the program:

  • Right click on link(s) and save in a scratch directory
  • To compile: javac TestProg.java
  • To run: java TestProg