### Maintaining good performance in Hashing

• Performance of Hashing

• Recall that the performance of hashing depends on the load factor (occupancy factor) a:

 ```
 running time for insert, lookup and delete = O( 1 / (1-a) )
 ```

• Sample running times:

 ```
   load factor a     |  Average running time of Hashing
 --------------------+--------------------------------------------
   0.1 (10% full)    |  1/(0.9) =  1.11 compare operations
   0.2 (20% full)    |  1/(0.8) =  1.25 compare operations
   0.3 (30% full)    |  1/(0.7) =  1.43 compare operations
   0.4 (40% full)    |  1/(0.6) =  1.67 compare operations
   0.5 (50% full)    |  1/(0.5) =  2.00 compare operations
   0.6 (60% full)    |  1/(0.4) =  2.50 compare operations
   0.7 (70% full)    |  1/(0.3) =  3.33 compare operations
   0.8 (80% full)    |  1/(0.2) =  5.00 compare operations
   0.9 (90% full)    |  1/(0.1) = 10.00 compare operations
 ```
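The sample values can be reproduced with a few lines of Java. This is only a sketch of the formula 1/(1-a); the class name `LoadFactorTable` and the method name `expectedCompares` are made up for this illustration and are not part of the course code.

```java
// Sketch: evaluate the estimate 1 / (1 - a) for several load factors.
// LoadFactorTable and expectedCompares are hypothetical names.
class LoadFactorTable {

    // Estimated average number of compare operations at load factor a.
    // The formula is only meaningful for 0 <= a < 1.
    static double expectedCompares(double a) {
        if (a < 0.0 || a >= 1.0) {
            throw new IllegalArgumentException("load factor must be in [0, 1)");
        }
        return 1.0 / (1.0 - a);
    }

    public static void main(String[] args) {
        // Print one row per load factor 0.1, 0.2, ..., 0.9
        for (int i = 1; i <= 9; i++) {
            double a = i / 10.0;
            System.out.printf("%.1f  ->  %.2f compare operations%n",
                              a, expectedCompares(a));
        }
    }
}
```

Note how the estimate stays near 2 up to a load factor of 0.5 and then grows quickly as the table fills up.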

• Fact:

 To keep the average running time of hashing low, we must keep the load factor (occupancy level) low

Common practice:

 Keep the load factor to at most 50%

• Rehashing

• Rehashing:

• Rehashing = converting one hashing scheme (hash table + hash function) into another hashing scheme

• Fact:

 The rehash operation usually moves the entries into a hash table of a different (typically larger) size !!!

• Example:

• Hashing scheme 1 using the hash function: hash( x ) = x % 5 in an array of 5 elements

Bucket assignment:

• Suppose we want to rehash the content of the hash table into a new hash table of 10 entries using the hash function: hash( x ) = x % 10

• The rehash operation will compute the new hash index for every key stored in the existing hash table and insert the element at the appropriate location:

• Pseudo code:

 ```
 rehash()
 {
    create a new (larger) hash table new_Hash_Table;

    select a new hash function H_new(x) for the new hashing scheme;

    for ( each element e in the old hash table ) do
    {
       insert e in new_Hash_Table using hash function H_new(e.key);
    }
 }
 ```

• Java code:

 ```
 /* **************************************************************
    rehash(): increase the hash table size
    ************************************************************** */
 public void rehash()
 {
    Entry[] old = bucket;              // old contains the original entries

    capacity = 2*capacity;
    bucket = new Entry[capacity];      // Allocate a new bucket array that is twice as big

    /* ----------------------------------------------
       Make a new MAD compression function
       (This is not really necessary...)
       ---------------------------------------------- */
    java.util.Random rand = new java.util.Random();
    MAD_a = rand.nextInt(MAD_p-1) + 1;   // new hash scaling factor
    MAD_b = rand.nextInt(MAD_p);         // new hash shifting factor

    /* ============================================================
       Rehash all entries from the old hash table into the new one
       ============================================================ */
    for ( int i=0; i < old.length; i++ )
    {
       Entry e = old[i];               // Process old entry i

       if ( (e != null) && (e != AVAILABLE) )
       {
          // e contains a non-empty entry
          findEntry(e.getKey());       // Find index using the key
                                       // The variable index_for_key
                                       // contains the index
          bucket[index_for_key] = e;   // Insert in new bucket
       }
    }
 }
 ```
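To make the x % 5 to x % 10 example concrete, here is a minimal stand-alone sketch. It is not the course's HashMap code: the class `RehashExample`, the keys 5, 11, 17 and 23, and the use of linear probing to resolve collisions are all assumptions chosen for illustration. It also assumes non-negative keys and a table that never becomes completely full.

```java
import java.util.Arrays;

// Sketch: rehash integer keys from a table of size 5 (hash(x) = x % 5)
// into a table of size 10 (hash(x) = x % 10).
// Class name, keys and linear probing are assumptions for this example.
class RehashExample {

    // Insert key using hash(x) = x % table.length.
    // On a collision, try the next slot (linear probing).
    // Assumes key >= 0 and at least one empty slot in the table.
    static void insert(Integer[] table, int key) {
        int i = key % table.length;
        while (table[i] != null) {
            i = (i + 1) % table.length;
        }
        table[i] = key;
    }

    // Create a new table of newSize entries and re-insert every key
    // of the old table using the new hash function x % newSize.
    static Integer[] rehash(Integer[] old, int newSize) {
        Integer[] fresh = new Integer[newSize];
        for (Integer key : old) {
            if (key != null) {
                insert(fresh, key);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Integer[] small = new Integer[5];
        for (int key : new int[] {5, 11, 17, 23}) {
            insert(small, key);
        }
        System.out.println(Arrays.toString(small));
        // prints [5, 11, 17, 23, null]

        Integer[] big = rehash(small, 10);
        System.out.println(Arrays.toString(big));
        // prints [null, 11, null, 23, null, 5, null, 17, null, null]
    }
}
```

Every key has to be re-inserted because its index depends on the table size: key 17 sits in slot 2 of the small table (17 % 5) but in slot 7 of the large one (17 % 10).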

• HashMap implementation with rehash

• When do we need to rehash:

 Rehash increases the size of the hash table, which lowers the load factor. The load factor only goes up when we insert a new key, so the put() method is the only place where we need to check whether to rehash.

• Modified implementation of the put() method:

 ```
 public Integer put (String key, int value)
 {
    int found = findEntry(key);   // find the appropriate spot for this entry

    if ( found > 0 )              // key found
    {
       Integer oldValue = bucket[index_for_key].setValue(value); // set new value
       return ( oldValue );       // Return old value
    }

    /* =======================================================
       NEW: Keep occupancy (load factor) of array below 50%
       ======================================================= */
    if ( NItems >= capacity/2 )
    {
       rehash();                  // rehash to keep the load factor <= 0.5

       found = findEntry(key);    // find the appropriate spot for this entry
                                  // We must do it again, because the hash
                                  // method has changed !!!
    }

    /* ===================================================
       Insert (key, value) in bucket[index_for_key]
       =================================================== */
    bucket[index_for_key] = new Entry(key, value);   // Insert it
    NItems++;

    return null;                  // there was no previous value
 }
 ```
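The interaction between put() and rehash() can be tried out in isolation with the following simplified stand-in. It is a sketch under assumptions, not the HashMap.java of this page: the class `TinyMap`, its initial capacity of 4, linear probing, the use of `String.hashCode()` instead of a MAD compression function, and the absence of remove() (so no AVAILABLE marker is needed) are all choices made for this illustration.

```java
// Sketch of an open-addressing map whose put() rehashes before the
// load factor would exceed 0.5. TinyMap is a hypothetical class;
// it supports put/get only (no remove).
class TinyMap {
    private String[] keys = new String[4];   // assumed initial capacity
    private int[] values = new int[4];
    private int nItems = 0;

    // Return the slot that holds key, or the empty slot where it belongs.
    // Terminates because the table always keeps at least one empty slot.
    private int findSlot(String key) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length;        // linear probing
        }
        return i;
    }

    // Double the capacity and re-insert every stored entry.
    private void rehash() {
        String[] oldKeys = keys;
        int[] oldValues = values;
        keys = new String[2 * oldKeys.length];
        values = new int[2 * oldValues.length];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                int slot = findSlot(oldKeys[i]);   // index in the NEW table
                keys[slot] = oldKeys[i];
                values[slot] = oldValues[i];
            }
        }
    }

    // Insert or update; returns the previous value, or null if none.
    public Integer put(String key, int value) {
        int slot = findSlot(key);
        if (keys[slot] != null) {              // key already present
            int old = values[slot];
            values[slot] = value;
            return old;
        }
        if (nItems >= keys.length / 2) {       // same test as in put() above
            rehash();
            slot = findSlot(key);              // table changed: look again
        }
        keys[slot] = key;
        values[slot] = value;
        nItems++;
        return null;
    }

    public Integer get(String key) {
        int slot = findSlot(key);
        return keys[slot] == null ? null : values[slot];
    }

    public int size()     { return nItems; }
    public int capacity() { return keys.length; }

    public static void main(String[] args) {
        TinyMap m = new TinyMap();
        String[] words = {"a", "b", "c", "d", "e"};
        for (int i = 0; i < words.length; i++) {
            m.put(words[i], i + 1);
        }
        // Capacity grew 4 -> 8 (third insert) -> 16 (fifth insert)
        System.out.println(m.size() + " items, capacity " + m.capacity());
    }
}
```

Because the check happens before the new entry is stored, the number of items never exceeds half the capacity after a put() completes.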

• Example Program: (Demo above code)

• HashMap.java (with rehash()) Prog file: click here
• MyTestMap.java Test Prog file: click here (test get, put, remove)
• TestMap1.java Test Prog file: click here (the word counting program, but uses own HashMap)

How to run the program:

 Right click on the link(s) and save the file(s) in a scratch directory

 To compile:   javac MyTestMap.java
 To run:       java MyTestMap