### Improving the recursive (top-down) merge sort algorithm

• Shortcoming of the naive recursive (top-down) merge sort algorithm

• The naive version of the merge sort algorithm creates many arrays during its execution:

```
public static void sort(double[] a)
{
   double[] left, right;

   if ( a.length <= 1 )
   {
      // 0 or 1 elements: no need to sort !
      return;
   }

   /* =================================================================
      Split   a[0 ..... middle-1   middle .... a.length-1]
                  /                        \
         left[0 .. middle-1]     right[0 ... a.length-1-middle]
      ================================================================= */
   int middle = a.length/2;

   left = new double[middle];             // Create array to hold left half

   for ( int i = 0; i < middle; i++ )
      left[i] = a[i];

   right = new double[a.length-middle];   // Create array to hold right half

   for ( int i = 0; i < a.length-middle; i++ )
      right[i] = a[i+middle];

   /* ======================================
      Sort both halves of the arrays
      ====================================== */
   sort( left );         // Recursion
   sort( right );        // Recursion

   /* ======================================
      Merge both sorted arrays back
      ====================================== */
   merge( a, left, right );   // We have discussed the Merge alg. already...
}
```
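To make "many arrays" concrete, the sketch below instruments the naive algorithm with a counter. The class name `NaiveMergeSortCount` and the field `arraysCreated` are hypothetical additions for illustration only, and `Arrays.copyOfRange` stands in for the hand-written copy loops. Every call on more than one element creates 2 arrays, so sorting n elements creates 2(n-1) arrays in total.

```java
import java.util.Arrays;

class NaiveMergeSortCount {
    static int arraysCreated = 0;   // hypothetical counter, not part of the original code

    static void sort(double[] a) {
        if (a.length <= 1) return;                                  // nothing to sort
        int middle = a.length / 2;
        double[] left  = Arrays.copyOfRange(a, 0, middle);          // new array #1
        double[] right = Arrays.copyOfRange(a, middle, a.length);   // new array #2
        arraysCreated += 2;
        sort(left);
        sort(right);
        merge(a, left, right);
    }

    // Merge sorted a[] and b[] into result[]
    static void merge(double[] result, double[] a, double[] b) {
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length)
            result[k++] = (a[i] < b[j]) ? a[i++] : b[j++];
        while (i < a.length) result[k++] = a[i++];
        while (j < b.length) result[k++] = b[j++];
    }

    public static void main(String[] args) {
        double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1};
        sort(x);
        System.out.println(Arrays.toString(x));
        System.out.println("arrays created: " + arraysCreated);   // 14 for 8 elements
    }
}
```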

• Problem:

 Re-write the merge sort algorithm so that it will not create any array during its execution.

• A more efficient way to pass 2 adjacent arrays to be sorted

• The naive algorithm passes the arrays by creating new arrays:

```
int middle = a.length/2;

left = new double[middle];             // Create array to hold left half

/* ==================================
   Copy the left half in first array
   ================================== */
for ( int i = 0; i < middle; i++ )
   left[i] = a[i];

right = new double[a.length-middle];   // Create array to hold right half

/* ===================================
   Copy the right half in second array
   =================================== */
for ( int i = 0; i < a.length-middle; i++ )
   right[i] = a[i+middle];
```

• A more efficient way is:

 Use the input array itself (i.e., the input parameter of the merge sort).

 Pass the range of the array that you want to sort.

What do I mean:

• We must augment the sort method with 2 additional parameters:

```
sort( double[] a, int left, int right )
{
   ...
}
```

Meaning of the parameters:

```
a = the input array that we must sort

The sort() method must sort these elements:

     a[left]  a[left+1]  ....  a[right-1]
```

• How to split an array using the new scheme:

Original algorithm:

```
public static void sort(double[] a)
{
   double[] left, right;

   if ( a.length <= 1 )
   {
      // 0 or 1 elements: no need to sort !
      return;
   }

   /* =================================================================
      Split   a[0 ..... middle-1   middle .... a.length-1]
                  /                        \
         left[0 .. middle-1]     right[0 ... a.length-1-middle]
      ================================================================= */
   int middle = a.length/2;

   left = new double[middle];             // Create array to hold left half

   for ( int i = 0; i < middle; i++ )
      left[i] = a[i];

   right = new double[a.length-middle];   // Create array to hold right half

   for ( int i = 0; i < a.length-middle; i++ )
      right[i] = a[i+middle];

   /* ======================================
      Sort both halves of the arrays
      ====================================== */
   sort( left );         // Recursion
   sort( right );        // Recursion

   /* ======================================
      Merge both sorted arrays back
      ====================================== */
   merge( a, left, right );   // We have discussed the Merge alg. already...
}
```

New way to split an array into 2 halves:

```
/* ==================================================
   The method sort(a, left, right) will sort these elements:

        a[left]  a[left+1]  ...  a[right-1]
   ================================================== */
public static void sort( double[] a, int left, int right )
{
   // Note: no local arrays are declared any more; left and right are indices

   if ( (right-left) <= 1 )
   {
      // 0 or 1 elements in the range --- No need to sort !
      return;
   }

   /* ************************** NEW *************************** */

   /* =================================================================
      Split   a[left ..... middle-1   middle .... right-1]
                  /                        \
         a[left .. middle-1]          a[middle ... right-1]
      ================================================================= */
   int middle = (right + left)/2;

   sort( a, left, middle );    // Sort left part
   sort( a, middle, right );   // Sort right part

   /* ==========================================================
      Merge both sorted parts back
      ========================================================== */
   Merge( a, left, middle, right );      // Merge !!! (modified merge, see below)
}
```

Test program:

```
// Requires:  import java.util.Arrays;

public static void main( String[] args )
{
   double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1} ;

   System.out.println("Before sort:  " + Arrays.toString(x) );

   MergeSort1.sort( x, 0, x.length );     // Merge sort

   System.out.println("\nAfter sort:   " + Arrays.toString(x) );
}
```

• Example Program: (Demo above code)

How to run the program:

 Right click on link(s) and save in a scratch directory

 To compile:   javac testProg1.java
 To run:       java testProg1

• Programming note: we must use a modified merge algorithm

• The new merge sort algorithm uses a slightly modified merge algorithm

Reason:

• The original merge() method receives 3 different arrays as input:

```
/* ==================================================================
   Input:  a[ ] and b[ ] (both sorted)
   Output: merge the values from a[ ] and b[ ] into array result[ ]
   ================================================================== */
public static void merge(double[] result, double[] a, double[] b)
{
   int i, j, k;

   i = j = k = 0;

   while ( i < a.length || j < b.length )
   {
      if ( i < a.length && j < b.length )
      {  // Both arrays have elements
         if ( a[i] < b[j] )
            result[k++] = a[i++];
         else
            result[k++] = b[j++];
      }
      else if ( i == a.length )
         result[k++] = b[j++];     // a is exhausted
      else if ( j == b.length )
         result[k++] = a[i++];     // b is exhausted
   }
}
```

• Since the new merge sort algorithm does not create arrays, the merge() algorithm must be changed to operate on a single input array passed as a parameter:

```
The call:

     merge( double[] a, int left, int middle, int right )

Notice merge has only ONE array (a[ ]) as input parameter.

Merge() must merge the values in these 2 array halves:

     left part:   a[left]   a[left+1]   ...  a[middle-1]
     right part:  a[middle] a[middle+1] ...  a[right-1]

into this array:

     a[left]  a[left+1]  .....  a[right-1]
```

• Graphic example of what merge( double[] a, int left, int middle, int right ) must accomplish:

• The call merge( a, 0, 2, 4) will merge these 2 array halves:

```
left part:   a[0] a[1]     (because middle = 2, so 2-1 = 1)
right part:  a[2] a[3]     (because right  = 4, so 4-1 = 3)
```

Example:

The call merge( a, 0, 2, 4) will accomplish this:
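As a concrete illustration, here is a minimal sketch of what such a call does to an array. The class name `MergeRangeExample` and the sample values are made up for this example, and this `merge()` is only one possible implementation of the contract described above (it uses a temporary array of exactly `right - left` elements).

```java
import java.util.Arrays;

class MergeRangeExample {
    // Merges sorted a[left..middle-1] and sorted a[middle..right-1]
    // into a[left..right-1]. Elements outside that range are not touched.
    static void merge(double[] a, int left, int middle, int right) {
        double[] tmp = new double[right - left];
        int i = left, j = middle, k = 0;
        while (i < middle && j < right)
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < middle) tmp[k++] = a[i++];   // rest of the left part
        while (j < right)  tmp[k++] = a[j++];   // rest of the right part
        System.arraycopy(tmp, 0, a, left, tmp.length);
    }

    public static void main(String[] args) {
        // a[0..1] = {3.5, 6.4} and a[2..3] = {2.5, 7.5} are each sorted;
        // a[4] lies outside the range and must stay where it is
        double[] a = {3.5, 6.4, 2.5, 7.5, 9.9};
        merge(a, 0, 2, 4);
        System.out.println(Arrays.toString(a));   // [2.5, 3.5, 6.4, 7.5, 9.9]
    }
}
```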

• The modified merge algorithm used in the new merge sort algorithm

• Differences between the original merge algorithm and the modified merge algorithm:

• Original merge algorithm uses 2 arrays:

```
left[ ]:    left[0]   left[1]   ...
right[ ]:   right[0]  right[1]  ....
```

The new merge algorithm must use 2 parts of the same array:

```
left part:   a[left]   a[left+1]   ...  a[middle-1]
right part:  a[middle] a[middle+1] ...  a[right-1]
```

• We must create a temporary array to hold the merge result (otherwise we will overwrite array values that have not been merged yet !!!)

 We must copy the merge result stored in the temporary array back into a[ ] at the end of the merge algorithm

Graphically illustrated:

• Create a temporary array.

• Merge the first (smallest) array element into the temporary array.

• Merge the next array element into the temporary array.

• Merge the next array element into the temporary array.

• And so on, until both parts are used up.

• When you are done, you must copy the merged result back to the array a[ ].
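The steps above can be followed in a small trace. The sketch below is an illustration under assumptions: the class name `MergeTrace` and the sample values are hypothetical, and the print statement exists only to show how tmp[ ] fills up one element per step (slots that are not yet filled print as 0.0).

```java
import java.util.Arrays;

class MergeTrace {
    static double[] merge(double[] a, int left, int middle, int right) {
        double[] tmp = new double[a.length];    // step 1: create the temporary array
        int i = left, j = middle, k = left;

        while (i < middle || j < right) {
            // Take from the left part when the right part is used up,
            // or when both parts have elements and the left one is smaller
            if (j == right || (i < middle && a[i] < a[j]))
                tmp[k++] = a[i++];
            else
                tmp[k++] = a[j++];
            System.out.println("tmp = " + Arrays.toString(tmp));   // one line per merge step
        }

        for (k = left; k < right; k++)          // last step: copy the result back
            a[k] = tmp[k];
        return a;
    }

    public static void main(String[] args) {
        double[] a = {3.5, 6.4, 2.5, 7.5};      // two sorted halves
        merge(a, 0, 2, 4);
        System.out.println("a   = " + Arrays.toString(a));   // [2.5, 3.5, 6.4, 7.5]
    }
}
```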

• Modified merge algorithm in Java:

```
/* ======================================================
   Merge does this:

     1. Create a tmp[ ] array for the merge operation

     2. Merge 2 SORTED adjacent pieces of array a[ ] into
        a SORTED array tmp[ ]:

          left part (sorted)        right part (sorted)
        a[iLeft ... iMiddle-1]    a[iMiddle ... iRight-1]
                   \                   /
                    \                 /   merge
                     \               /
                  tmp[ iLeft ... iRight-1 ]
                            |
                            |  Copy back !
                            V
                   a[ iLeft ... iRight-1 ]

     3. Copy the merged result in tmp[ ] back to a[ ]
        The SAME portion of the array tmp[ ] is used !!!
   ====================================================== */
public static void Merge(double[] a, int iLeft, int iMiddle, int iRight)
{
   int i, j, k;
   double[] tmp;

   tmp = new double[a.length];   // Create tmp[] array for merge op.

   i = iLeft;                    // Re-adjust the indices
   j = iMiddle;
   k = iLeft;

   while ( i < iMiddle || j < iRight )     // It's the same algorithm !
   {
      if ( i < iMiddle && j < iRight )
      {  // Both parts have elements
         if ( a[i] < a[j] )
            tmp[k++] = a[i++];
         else
            tmp[k++] = a[j++];
      }
      else if ( i == iMiddle )
         tmp[k++] = a[j++];      // left part is exhausted
      else if ( j == iRight )
         tmp[k++] = a[i++];      // right part is exhausted
   }

   /* =================================
      Copy tmp[] back to a[]
      ================================= */
   for ( i = iLeft; i < iRight; i++ )
      a[i] = tmp[i];
}
```

You can compare it with the original merge algorithm:

```
/* ==================================================================
   Input:  a[ ] and b[ ] (both sorted)
   Output: merge the values from a[ ] and b[ ] into array result[ ]
   ================================================================== */
public static void merge(double[] result, double[] a, double[] b)
{
   int i, j, k;

   i = j = k = 0;

   while ( i < a.length || j < b.length )
   {
      if ( i < a.length && j < b.length )
      {  // Both arrays have elements
         if ( a[i] < b[j] )
            result[k++] = a[i++];
         else
            result[k++] = b[j++];
      }
      else if ( i == a.length )
         result[k++] = b[j++];     // a is exhausted
      else if ( j == b.length )
         result[k++] = a[i++];     // b is exhausted
   }
}
```

You can see that the differences are very small.

Here's a lesson you can learn from this exercise:

 Good programmers do not re-invent the tool (e.g., a screwdriver) to solve a slightly modified problem; they adjust the tool !!!
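Putting the range-based sort() and the modified Merge() together gives a complete program. The following is a minimal sketch, assuming a hypothetical class name `MergeSortRange`; it also shows a side effect of passing a range: the same method can sort just a part of an array.

```java
import java.util.Arrays;

class MergeSortRange {
    // Sorts a[left], a[left+1], ..., a[right-1]
    static void sort(double[] a, int left, int right) {
        if (right - left <= 1) return;          // 0 or 1 elements: nothing to do
        int middle = (left + right) / 2;
        sort(a, left, middle);                  // sort a[left..middle-1]
        sort(a, middle, right);                 // sort a[middle..right-1]
        Merge(a, left, middle, right);
    }

    // Merges sorted a[left..middle-1] and a[middle..right-1] via a temporary array
    static void Merge(double[] a, int left, int middle, int right) {
        double[] tmp = new double[a.length];
        int i = left, j = middle, k = left;
        while (i < middle || j < right) {
            if (i < middle && j < right) {      // both parts have elements
                if (a[i] < a[j]) tmp[k++] = a[i++];
                else             tmp[k++] = a[j++];
            } else if (i == middle) {
                tmp[k++] = a[j++];              // left part is exhausted
            } else {
                tmp[k++] = a[i++];              // right part is exhausted
            }
        }
        for (i = left; i < right; i++) a[i] = tmp[i];   // copy back
    }

    public static void main(String[] args) {
        double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1};
        sort(x, 0, x.length);                   // sort the whole array
        System.out.println(Arrays.toString(x)); // [1.1, 2.5, 3.5, 4.2, 6.4, 7.5, 8.9, 9.2]

        double[] y = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1};
        sort(y, 2, 6);                          // sort only y[2..5]
        System.out.println(Arrays.toString(y)); // [6.4, 3.5, 2.5, 4.2, 7.5, 8.9, 9.2, 1.1]
    }
}
```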

• Final note

• The new merge() method will work cleanly in most programming languages that have

 User-controlled memory allocation functions (this is the case in C/C++, where the temporary array can be freed explicitly at the end of merge())

• In Java, however, the new merge() method will create a lot of garbage:

• Look at the new merge() method:

```
public static void Merge(double[] a, int left, int leftEnd, int rightEnd)
{
   double[] tmp = new double[a.length];    // Help array for merging

   int i, j, k;

   i = left;
   j = leftEnd;
   k = left;

   while ( i < leftEnd || j < rightEnd )
   {
      if ( i < leftEnd && j < rightEnd )
      {  // Both parts have elements
         if ( a[i] < a[j] )
            tmp[k++] = a[i++];
         else
            tmp[k++] = a[j++];
      }
      else if ( i == leftEnd )
         tmp[k++] = a[j++];     // left half is exhausted
      else if ( j == rightEnd )
         tmp[k++] = a[i++];     // right half is exhausted
   }

   // Copy tmp[] back to a[] !
   for ( i = left; i < rightEnd; i++ )
      a[i] = tmp[i];
}
```

• Fact:

• The tmp[ ] array created in the Merge() method will become garbage when the Merge() method exits !!!

 This will happen every time the Merge() method is called !!!

• The Merge() method will be called many times !!!

So:

 The Merge() method will result in a lot of garbage in Java !!
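A rough way to see how much garbage this is: count the allocations. The sketch below is an illustration under assumptions. The class name `MergeGarbageCount` and both counters are hypothetical, and the number of elements allocated is only a proxy for the work left to the garbage collector. Merge() is called n-1 times when sorting n elements, and each call allocates a.length doubles.

```java
class MergeGarbageCount {
    static long tmpArrays = 0;     // hypothetical counter: tmp[] arrays created
    static long tmpElements = 0;   // hypothetical counter: total doubles allocated for tmp[]

    static void sort(double[] a, int left, int right) {
        if (right - left <= 1) return;
        int middle = (left + right) / 2;
        sort(a, left, middle);
        sort(a, middle, right);
        Merge(a, left, middle, right);
    }

    static void Merge(double[] a, int left, int middle, int right) {
        double[] tmp = new double[a.length];   // becomes garbage when Merge() returns
        tmpArrays++;
        tmpElements += tmp.length;

        int i = left, j = middle, k = left;
        while (i < middle || j < right) {
            if (j == right || (i < middle && a[i] < a[j]))
                tmp[k++] = a[i++];
            else
                tmp[k++] = a[j++];
        }
        for (k = left; k < right; k++) a[k] = tmp[k];
    }

    public static void main(String[] args) {
        double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1};
        sort(x, 0, x.length);
        System.out.println("tmp arrays created:     " + tmpArrays);     // 7
        System.out.println("tmp elements allocated: " + tmpElements);   // 56
    }
}
```

For 8 elements that is 7 temporary arrays holding 56 doubles in total; for 1000 elements it would be 999 arrays and 999000 doubles.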

• Improved organization for Java implementation

• Let the user create 2 arrays:

• Array a[ ] is used to store the input data that needs to be sorted

• A help array tmp[ ] (of the same size as array a[ ]) that is used in the merge operation

 The user passes both arrays to the merge sort algorithm

Then the merge() method does not need to create a temporary array each time it is called.

• This is what I mean:

• The merge() method gets the help array tmp[ ] as a parameter:

```
public static void Merge(double[] a, int left, int leftEnd, int rightEnd, double[] tmp)
{
   // We do NOT create the tmp array anymore !!!!

   int i, j, k;

   i = left;
   j = leftEnd;
   k = left;

   while ( i < leftEnd || j < rightEnd )
   {
      if ( i < leftEnd && j < rightEnd )
      {  // Both parts have elements
         if ( a[i] < a[j] )
            tmp[k++] = a[i++];
         else
            tmp[k++] = a[j++];
      }
      else if ( i == leftEnd )
         tmp[k++] = a[j++];     // left half is exhausted
      else if ( j == rightEnd )
         tmp[k++] = a[i++];     // right half is exhausted
   }

   // Copy tmp back
   for ( i = left; i < rightEnd; i++ )
      a[i] = tmp[i];
}
```

• The sort() method will also get the help array tmp[ ] as parameter:

```
public static void sort(double[] a, int left, int right, double[] tmp)
{
   if ( (right-left) <= 1 )
   {
      // 0 or 1 elements in the range --- No need to sort !
      return;
   }

   /* ******************************************************************
      Same as the sort() above, except that tmp[ ] is passed along
      ****************************************************************** */

   /* =================================================================
      Split   a[left ..... middle-1   middle .... right-1]
                  /                        \
         a[left .. middle-1]          a[middle ... right-1]
      ================================================================= */
   int middle = (right + left)/2;

   sort( a, left, middle, tmp );    // Pass tmp also to be consistent
   sort( a, middle, right, tmp );   // Pass tmp also to be consistent

   /* ==========================================================
      Merge both sorted parts back
      ========================================================== */
   Merge( a, left, middle, right, tmp );    // Pass tmp to help merge !
}
```

• The user must create the tmp[ ] for the merge sort algorithm:

```
// Requires:  import java.util.Arrays;

public static void main( String[] args )
{
   double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1} ;
   double[] tmp = new double[x.length];     // Help array, created only ONCE

   System.out.println("Before sort:  " + Arrays.toString(x) );

   MergeSort1.sort( x, 0, x.length, tmp );  // Pass tmp to help merge

   System.out.println("\nAfter sort:   " + Arrays.toString(x) );
}
```

• Example Program: (Demo above code)

How to run the program:

 Right click on link(s) and save in a scratch directory

 To compile:   javac testProg1.java
 To run:       java testProg1
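For reference, the pieces of this last version can be assembled into one self-contained program. This is a sketch under assumptions: the class name `MergeSortSharedTmp` is hypothetical, and the one-argument `sort(double[] a)` wrapper is an addition that is not in the notes above. It creates tmp[ ] once so that callers do not have to.

```java
import java.util.Arrays;

class MergeSortSharedTmp {
    // Hypothetical convenience wrapper: allocates the help array exactly once
    static void sort(double[] a) {
        double[] tmp = new double[a.length];
        sort(a, 0, a.length, tmp);
    }

    // Sorts a[left..right-1], using tmp[] as scratch space
    static void sort(double[] a, int left, int right, double[] tmp) {
        if (right - left <= 1) return;
        int middle = (left + right) / 2;
        sort(a, left, middle, tmp);
        sort(a, middle, right, tmp);
        Merge(a, left, middle, right, tmp);
    }

    // Merges sorted a[left..middle-1] and a[middle..right-1]; no allocation here
    static void Merge(double[] a, int left, int middle, int right, double[] tmp) {
        int i = left, j = middle, k = left;
        while (i < middle || j < right) {
            if (j == right || (i < middle && a[i] < a[j]))
                tmp[k++] = a[i++];
            else
                tmp[k++] = a[j++];
        }
        System.arraycopy(tmp, left, a, left, right - left);   // copy tmp back to a
    }

    public static void main(String[] args) {
        double[] x = {6.4, 3.5, 7.5, 2.5, 8.9, 4.2, 9.2, 1.1};
        System.out.println("Before sort:  " + Arrays.toString(x));
        sort(x);
        System.out.println("After sort:   " + Arrays.toString(x));
    }
}
```

With this organization the whole sort performs one array allocation, no matter how many times Merge() is called.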