### Using a kd-tree (for common multi-dim queries)

• Performance of the kd-tree for commonly used multi-dim. queries

• Partial Match queries

 The query specifies conditions on some dimensions but not on all dimensions

Search Algorithm:

• For a dimension for which the search value is given (= specified):

 Take the (one) branch of the subtree for the search value

• For a dimension for which the search value is not given (= not specified):

 Take both branches of the subtree !!!
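The two rules above can be sketched in Python. This is a minimal illustration, not the course's implementation: the `Node` layout (one point stored per node), the `build` helper, the `(age, salary)` sample data and the use of `None` for "not specified" are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]
Query = Tuple[Optional[int], ...]  # None means "dimension not specified"


class Node:
    """One kd-tree node: a stored point plus two subtrees (hypothetical layout)."""

    def __init__(self, point: Point, left: "Optional[Node]", right: "Optional[Node]"):
        self.point = point
        self.left = left
        self.right = right


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a roughly balanced kd-tree, cycling through the dimensions by level."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    # Step back to the first copy of the median value, so the left subtree
    # holds values < split and the right subtree holds values >= split.
    while mid > 0 and pts[mid - 1][axis] == pts[mid][axis]:
        mid -= 1
    return Node(pts[mid], build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))


def partial_match(node: Optional[Node], query: Query, depth: int = 0) -> List[Point]:
    """Return all points that agree with the query on every specified dimension."""
    if node is None:
        return []
    found: List[Point] = []
    if all(q is None or q == v for q, v in zip(query, node.point)):
        found.append(node.point)
    axis = depth % len(query)
    wanted = query[axis]
    if wanted is None:
        # Dimension not specified: take both branches.
        found += partial_match(node.left, query, depth + 1)
        found += partial_match(node.right, query, depth + 1)
    elif wanted < node.point[axis]:
        # Dimension specified: take the one branch that can hold the value.
        found += partial_match(node.left, query, depth + 1)
    else:
        found += partial_match(node.right, query, depth + 1)
    return found


# Sample (age, salary) records, made up for this sketch.
people = [(25, 60), (50, 120), (50, 75), (45, 60), (70, 110), (30, 260), (50, 100)]
root = build(people)
print(sorted(partial_match(root, (50, None))))  # [(50, 75), (50, 100), (50, 120)]
```

The query `(50, None)` fixes the age and leaves the salary open, so the search follows a single branch on the age levels and both branches on the salary levels.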

Example:

• Find all persons with age = 50

• Analysis: how efficient is the kd-tree ?

• Assume the kd-tree is perfectly balanced (same height everywhere).

There are 2 dimensions and 2n levels. The dimension given in the query (here: age) is tested at the odd levels 1, 3, 5, ...; at the even levels the search must take both branches:

• At level 1, we can eliminate half of all records:

 We only need to process 1/2 of all records

• At level 3, we can eliminate half of the remaining records:

 We only need to process 1/4 of all records

• At level 5, we can eliminate half of the remaining records again:

 We only need to process 1/8 of all records

And so on !!!

• If we have 2n levels in the kd-tree:

 We only need to process $(1/2)^n$ of all records

Example: a 4-level (= 2 × 2) kd-tree:
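For a tree of this size, the fraction can be checked by counting leaf buckets. The count below is a sketch under the same assumptions as the analysis (a perfectly balanced tree whose levels alternate between the queried dimension and the other one); treating the leaves as equally sized buckets is an additional assumption of this example.

$$
\underbrace{2^{2n}}_{\text{all leaf buckets}}
\qquad
\underbrace{1^{n} \cdot 2^{n} = 2^{n}}_{\text{buckets reached by the search}}
\qquad
\frac{2^{n}}{2^{2n}} = \left(\frac{1}{2}\right)^{n}
$$

With n = 2 (4 levels) the search reaches 4 of the 16 buckets, i.e. 1/4 of all records. Put differently, with $N = 2^{2n}$ buckets the search touches about $\sqrt{N}$ of them.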

• Range queries

 Find objects that are located either partially or wholly within a certain range

Search Algorithm:

• If the search range is completely contained in the left subtree:

 Take only the left branch of the subtree

• If the search range is completely contained in the right subtree:

 Take only the right branch of the subtree

• Otherwise (the search range straddles the node's split value):

 Search both subtrees
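The three cases above can be sketched in Python as follows. As before, this is only an illustration under stated assumptions: the `Node` layout, the `build` helper, the inclusive bounds `lo` and `hi`, and the `(age, salary)` sample data are hypothetical and not taken from the notes.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


class Node:
    """One kd-tree node: a stored point plus two subtrees (hypothetical layout)."""

    def __init__(self, point: Point, left: "Optional[Node]", right: "Optional[Node]"):
        self.point = point
        self.left = left
        self.right = right


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a roughly balanced kd-tree, cycling through the dimensions by level."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    # Left subtree gets values < split, right subtree gets values >= split.
    while mid > 0 and pts[mid - 1][axis] == pts[mid][axis]:
        mid -= 1
    return Node(pts[mid], build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))


def range_search(node: Optional[Node], lo: Point, hi: Point, depth: int = 0) -> List[Point]:
    """Return all points p with lo[i] <= p[i] <= hi[i] in every dimension i."""
    if node is None:
        return []
    found: List[Point] = []
    if all(l <= v <= h for l, v, h in zip(lo, node.point, hi)):
        found.append(node.point)
    axis = depth % len(lo)
    split = node.point[axis]
    if hi[axis] < split:
        # Range lies completely on the left side of the split value.
        found += range_search(node.left, lo, hi, depth + 1)
    elif lo[axis] >= split:
        # Range lies completely on the right side of the split value.
        found += range_search(node.right, lo, hi, depth + 1)
    else:
        # Range straddles the split value: search both subtrees.
        found += range_search(node.left, lo, hi, depth + 1)
        found += range_search(node.right, lo, hi, depth + 1)
    return found


# Sample (age, salary) records, made up for this sketch.
people = [(25, 60), (50, 120), (50, 75), (45, 60), (70, 110), (30, 260), (50, 100)]
root = build(people)
# Age between 40 and 60, salary between 70 and 110 (both inclusive).
print(sorted(range_search(root, (40, 70), (60, 110))))  # [(50, 75), (50, 100)]
```

A subtree is skipped only when the range cannot overlap it in the dimension tested at that level, so the search degrades toward a full scan as the range grows.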


• Nearest neighbor queries:

• Not easy to find the nearest neighbor using a kd-tree index:

It requires both downward and upward traversal (backtracking) in the kd-tree.
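To make the "down and up" traversal concrete, here is a sketch of one common way to do it: descend toward the query point first, then on the way back up visit the other subtree only if it could still hold a closer point. The `Node` layout, the `build` helper, the Euclidean distance and the sample data are assumptions of this example, not part of the notes.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


class Node:
    """One kd-tree node: a stored point plus two subtrees (hypothetical layout)."""

    def __init__(self, point: Point, left: "Optional[Node]", right: "Optional[Node]"):
        self.point = point
        self.left = left
        self.right = right


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced kd-tree by splitting at the median of the current dimension."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))


def nearest(node: Optional[Node], target: Point, depth: int = 0,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to target (Euclidean distance)."""
    if node is None:
        return best
    if best is None or math.dist(node.point, target) < math.dist(best, target):
        best = node.point
    axis = depth % len(target)
    diff = target[axis] - node.point[axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    # Going down: search the side of the split that contains the target.
    best = nearest(near, target, depth + 1, best)
    # Coming back up: the far side can only help if the splitting line is
    # closer to the target than the best point found so far.
    if abs(diff) < math.dist(best, target):
        best = nearest(far, target, depth + 1, best)
    return best


# Sample (age, salary) records, made up for this sketch.
people = [(25, 60), (50, 120), (50, 75), (45, 60), (70, 110), (30, 260), (50, 100)]
root = build(people)
print(nearest(root, (48, 90)))
```

The second recursive call is what makes the query more involved than a partial match or range query: the first leaf reached is not guaranteed to hold the nearest point, so parts of the tree may have to be revisited.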

• Where-am-I queries:

• Not applicable:

 A kd-tree can only store points; it cannot store objects