### Running time analysis of merge sort

• Operation of the Merge Sort algorithm

• Merge Sort Algorithm:

```
public static void sort(double[] a, int left, int right, double[] tmp)
{
   if ( (right - left) <= 1 )
   {
      // At most one element in a[left..right-1] --- No need to sort !
      return;
   }

   int middle = (right + left)/2;

   sort( a, left, middle, tmp );     // Sort left HALF of a[ ]
   sort( a, middle, right, tmp );    // Sort right HALF of a[ ]

   /* ==========================================================
      Merge both sorted arrays back
      ========================================================== */
   merge( a, left, middle, right, tmp );  // Takes N steps to merge
                                          // an array of N elements
                                          // (It really takes 2*N steps,
                                          // but if you only want to
                                          // find the ORDER, you can use N)
}
```
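The `merge` method is called above but is not shown in this section. The following is a minimal sketch of how it might look, assuming `sort(a, left, right, tmp)` works on the half-open range `a[left..right-1]` (as the base case `right - left == 1` suggests). The class name `MergeSortSketch` and the body of `merge` are illustrative assumptions, not the course's own code.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Sorts a[left..right-1]; tmp is scratch space at least as long as a.
    static void sort(double[] a, int left, int right, double[] tmp) {
        if (right - left <= 1) {
            return;                          // 0 or 1 element: already sorted
        }
        int middle = (left + right) / 2;
        sort(a, left, middle, tmp);          // sort a[left..middle-1]
        sort(a, middle, right, tmp);         // sort a[middle..right-1]
        merge(a, left, middle, right, tmp);
    }

    // Hypothetical merge: combines sorted a[left..middle-1] and a[middle..right-1].
    // Each element is copied into tmp once and back into a once (about 2*N steps).
    static void merge(double[] a, int left, int middle, int right, double[] tmp) {
        int i = left, j = middle, k = left;
        while (i < middle && j < right) {
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i < middle) tmp[k++] = a[i++];   // leftover of the left half
        while (j < right)  tmp[k++] = a[j++];   // leftover of the right half
        for (k = left; k < right; k++) a[k] = tmp[k];
    }

    public static void main(String[] args) {
        double[] a = {6.0, 2.5, 9.0, 1.0, 7.5, 3.0, 8.0};
        sort(a, 0, a.length, new double[a.length]);
        System.out.println(Arrays.toString(a));
    }
}
```

The copy into `tmp` and the copy back are the "2*N steps" mentioned in the comment above; for the order of growth only the factor N matters.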

• Define:

```
T(N) = # statements that you need to execute to merge sort an array of N elements
```

• From the merge sort algorithm, we can see that:

```
sort( array of N elements ) will call:

    sort( array of N/2 elements );     // Sort left HALF of array
    sort( array of N/2 elements );     // Sort right HALF of array
    merge( array of N elements );      // Takes N statements
```

Therefore:

```
Amount of work done by sort(array of size N)

     = Amount of work done by sort(array of size N/2)
       + Amount of work done by sort(array of size N/2)
       + N

Or:

     T(N) = T(N/2) + T(N/2) + N
```

Furthermore:

```
T(1) = 1       // Time to execute the if-statement
```

• The performance of merge sort is then given by this recurrence relation:

```
T(N) = 2*T(N/2) + N       ...... (1)
T(1) = 1                  ...... (2)
```

• Solving the recurrence relation of Merge Sort analysis

• We can use the expansion method to find a solution:

```
T(N) = 2*T(N/2) + N                     ( T(N/2) = 2*T(N/4) + N/2 )
     = 2*(2*T(N/4) + N/2) + N
     = 4*T(N/4) + N + N
     = 4*T(N/4) + 2*N

T(N) = 4*T(N/4) + 2*N                   ( T(N/4) = 2*T(N/8) + N/4 )
     = 4*(2*T(N/8) + N/4) + 2*N
     = 8*T(N/8) + N + 2*N
     = 8*T(N/8) + 3*N

T(N) = 8*T(N/8) + 3*N                   ( T(N/8) = 2*T(N/16) + N/8 )
     = 8*(2*T(N/16) + N/8) + 3*N
     = 16*T(N/16) + N + 3*N
     = 16*T(N/16) + 4*N

And so on....
```

We can detect this pattern:

```
T(N) = 2*T(N/2) + N
     = 4*T(N/4) + 2*N     = 2^2*T(N/(2^2)) + 2*N
     = 8*T(N/8) + 3*N     = 2^3*T(N/(2^3)) + 3*N

In general:

T(N) = 2^k*T(N/(2^k)) + k*N

and:

T(1) = 1       (from Equation (2))
```

• For simplicity, let us assume that N is equal to some power of 2

For example:

```
N = 1024      (2^10)

or:

N = 1048576   (2^20)
```

If N is equal to some power of 2, we can solve T(N) exactly !

For example:

```
If N = 1024 (2^10)

T(N) = 2*T(N/2) + N
     = 4*T(N/4) + 2*N
     = 8*T(N/8) + 3*N
     = 16*T(N/16) + 4*N
     = 32*T(N/32) + 5*N
     = ...
     = 1024*T(N/1024) + 10*N          (N = 1024)
     = N*T(1) + 10*N                  (T(1) = 1 !)
     = N + 10*N
     = 11*N

or:

If N = 1048576 (2^20)

T(N) = 2*T(N/2) + N
     = 4*T(N/4) + 2*N
     = 8*T(N/8) + 3*N
     = 16*T(N/16) + 4*N
     = 32*T(N/32) + 5*N
     = ...
     = 1024*T(N/1024) + 10*N
     = 2048*T(N/2048) + 11*N
     = ...
     = 1048576*T(N/1048576) + 20*N    (N = 1048576)
     = N*T(1) + 20*N                  (T(1) = 1 !)
     = N + 20*N
     = 21*N
```

• In general, if N = 2^k, we can solve T(N) exactly as follows:

```
If N = 2^k

T(N) = 2*T(N/2) + N
     = 4*T(N/4) + 2*N
     = 8*T(N/8) + 3*N
     = 16*T(N/16) + 4*N
     = 32*T(N/32) + 5*N
     = ...
     = (2^k)*T(N/(2^k)) + k*N      (N = 2^k, so: N/(2^k) = 1)
     = (2^k)*T(1) + k*N            (2^k = N)
     = N*T(1) + k*N                (T(1) = 1)
     = N + k*N
     = (k+1)*N                     (N = 2^k ⇒ k = lg(N))
     = (lg(N)+1)*N
```
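As a sanity check, the closed form T(N) = (k+1)*N for N = 2^k can be compared against a direct evaluation of the recurrence. The sketch below assumes only Equations (1) and (2); the class and method names (`RecurrenceCheck`, `t`, `closedForm`) are made up for this illustration.

```java
class RecurrenceCheck {
    // Direct evaluation of T(N) = 2*T(N/2) + N with T(1) = 1.
    // Only meaningful when n is a power of 2.
    static long t(long n) {
        if (n == 1) {
            return 1;
        }
        return 2 * t(n / 2) + n;
    }

    // Closed form (k+1)*N, where N = 2^k.
    static long closedForm(int k) {
        long n = 1L << k;
        return (k + 1) * n;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 20; k++) {
            long n = 1L << k;
            System.out.println("N = " + n
                    + "   T(N) = " + t(n)
                    + "   (k+1)*N = " + closedForm(k));
        }
    }
}
```

For every k from 0 to 20 the two columns should agree, including the cases N = 1024 (11*N) and N = 1048576 (21*N) worked out above.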

• If N is not a power of 2, we will use an upper bound:

• Let 2^k be the smallest power of 2 that is larger than N. Then:

 Sorting an array of N elements takes at most the amount of time needed to sort an array of 2^k elements (because that array has more elements than the one you are sorting)

• Since sorting an array of size 2^k will take:

 ```
 running time of merge sort = (k+1)*2^k
 ```

and 2^k < 2*N (so k < lg(N) + 1), sorting an array of size N < 2^k will take at most:

 ```
 running time of merge sort ≤ (k+1)*2^k < (lg(N)+2)*2*N
 ```

• Therefore:

• The running time of merge sort to sort an array of size N is bounded by:

 (lg(N)+2) × 2×N = O(N×log(N))
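The upper-bound argument relies on a larger array never costing less than a smaller one. This can be checked numerically under one extra assumption that the text does not spell out: for odd N, the two halves have sizes floor(N/2) and ceil(N/2), matching `middle = (right + left)/2` in the code. The names `UpperBoundCheck`, `table`, `nextPow2` and `boundHolds` are hypothetical.

```java
class UpperBoundCheck {
    // Smallest power of 2 that is >= n (for n >= 1).
    static int nextPow2(int n) {
        int p = 1;
        while (p < n) {
            p *= 2;
        }
        return p;
    }

    // t[n] = T(n), using T(n) = T(floor(n/2)) + T(ceil(n/2)) + n and T(1) = 1.
    // Index 0 is unused.
    static long[] table(int max) {
        long[] t = new long[max + 1];
        t[1] = 1;
        for (int n = 2; n <= max; n++) {
            t[n] = t[n / 2] + t[n - n / 2] + n;
        }
        return t;
    }

    // True if T(n) <= T(nextPow2(n)) for every n in 1..max.
    static boolean boundHolds(int max) {
        long[] t = table(nextPow2(max));
        for (int n = 1; n <= max; n++) {
            if (t[n] > t[nextPow2(n)]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Bound holds for N = 1..5000: " + boundHolds(5000));
        long[] t = table(1024);
        System.out.println("T(1000) = " + t[1000] + ",  T(1024) = " + t[1024]);
    }
}
```

For powers of 2 the table reproduces the exact solution (k+1)*N, so the check compares each T(N) against the exact cost of the next larger power of 2.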

• Compare performance of Merge Sort and Bubble Sort

• We generate n random numbers and sort them with Bubble sort and Merge sort

• The Bubble Sort performance test:

How to run the program:

 Right click on link(s) and save in a scratch directory

 To compile: javac testProg.java

 To run: java testProg

• The Merge Sort performance test:

How to run the program:

 Right click on link(s) and save in a scratch directory

 To compile: javac testProg.java

 To run: java testProg

• Run both tests and compare the running times: you will understand why people prefer merge sort !
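The linked test programs are not reproduced in these notes. As a rough stand-in, the sketch below times both algorithms on the same random input. Everything in it (the class name `SortTimingSketch`, the array size, the random seed, and both sort implementations) is an assumption made for illustration and may differ from the actual `testProg.java` files.

```java
import java.util.Arrays;
import java.util.Random;

class SortTimingSketch {
    // Plain bubble sort: repeatedly swaps adjacent out-of-order pairs.
    static void bubbleSort(double[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    double h = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = h;
                }
            }
        }
    }

    // Merge sort on a[left..right-1], with the merge step written inline.
    static void mergeSort(double[] a, int left, int right, double[] tmp) {
        if (right - left <= 1) {
            return;
        }
        int middle = (left + right) / 2;
        mergeSort(a, left, middle, tmp);
        mergeSort(a, middle, right, tmp);
        int i = left, j = middle, k = left;
        while (i < middle && j < right) {
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i < middle) tmp[k++] = a[i++];
        while (j < right)  tmp[k++] = a[j++];
        System.arraycopy(tmp, left, a, left, right - left);
    }

    public static void main(String[] args) {
        int n = 20000;                                  // arbitrary test size
        double[] a = new Random(42).doubles(n).toArray();
        double[] b = a.clone();                         // same input for both sorts

        long t0 = System.nanoTime();
        bubbleSort(a);
        long t1 = System.nanoTime();
        mergeSort(b, 0, b.length, new double[b.length]);
        long t2 = System.nanoTime();

        System.out.println("Bubble sort: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("Merge sort:  " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("Same result: " + Arrays.equals(a, b));
    }
}
```

Exact timings depend on the machine and the JVM, so no numbers are quoted here. Doubling `n` should roughly quadruple the bubble sort time while only slightly more than doubling the merge sort time.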