# Another way to distribute the workload

• Find the Minimum value in an array - take 2

• Let's do the "Find min" example again, now splitting the task of "Finding the minimum value" in an array in a different manner

• Solution 2:

• Split the array into 2 (approximately) equal halves
• Thread 0 finds the minimum in the even-indexed elements of the array
(I.e.: x[0], x[2], x[4], etc)
• Thread 1 finds the minimum in the odd-indexed elements of the array
(I.e.: x[1], x[3], x[5], etc)
• Main thread waits for the results and finds the actual minimum.

Pictorially:

```
           values handled by thread 0
|   |   |   |   |   |   |   |   |   |   |   |   |   |
V   V   V   V   V   V   V   V   V   V   V   V   V   V
|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
  ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^
  |   |   |   |   |   |   |   |   |   |   |   |   |   |
           values handled by thread 1


         Thread 0                  Thread 1
            |                         |
            |                         |
            V                         V
          min[0]                    min[1]
              \                       /
               \                     /
                \                   /
                 \                 /
                  \               /
                     main thread
                          |
                          |
                          V
                    Actual minimum
```

• The division of labor in general is:

```
// -----------------------------------
// Create worker threads....
// -----------------------------------
for (i = 0; i < num_threads; i = i + 1)
{
   start[i] = i;     // Pass ID to thread in a private variable

   if ( pthread_create(&tid[i], NULL, worker, (void *)&start[i]) )
   {
      cout << "Cannot create thread" << endl;
      exit(1);
   }
}

// -----------------------------------
// Wait for worker threads to end....
// -----------------------------------
for (i = 0; i < num_threads; i = i + 1)
   pthread_join(tid[i], NULL);

// ----------------------------------------
// Post processing: Find actual minimum
// ----------------------------------------
my_min = min[0];

for (i = 1; i < num_threads; i++)
   if ( min[i] < my_min )
      my_min = min[i];
```

```
void *worker(void *arg)
{
   int i, s;
   double my_min;

   s = * (int *) arg;     // Convert arg to an integer

   // --------------------------------------
   // Find min in my range
   // --------------------------------------
   my_min = x[s];

   for (i = s+num_threads; i < MAX; i += num_threads)
   {
      if ( x[i] < my_min )
         my_min = x[i];
   }

   min[s] = my_min;       // Store min in private slot

   return(NULL);          /* Thread exits (dies) */
}
```

Thread s processes the elements x[s], x[s+num_threads], x[s+2*num_threads], etc., up to the end of the array.
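To make the access pattern concrete, here is a small sketch that lists the indices one worker visits. The helper `indices_for_thread` is hypothetical (it is not part of the program above); it only mirrors the stride used in the worker's loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not in the original program): list the indices
// that worker s visits when num_threads workers share an array of n elements.
std::vector<int> indices_for_thread(int s, int num_threads, int n)
{
    std::vector<int> idx;

    // Same pattern as the worker: start at s, then step by num_threads
    for (int i = s; i < n; i += num_threads)
        idx.push_back(i);

    return idx;
}
```

With 2 threads and 7 elements, thread 0 gets indices 0, 2, 4, 6 and thread 1 gets indices 1, 3, 5.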

It's much easier to code the worker thread!

• Example Program: (Demo above code)
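The demo program itself is not reproduced here. As a rough stand-in, the sketch below assembles the fragments above into one self-contained function. The names `StridedTask`, `strided_worker` and `find_min_strided` are made up for this illustration, and the sketch passes each worker a small struct instead of using the global variables (`x`, `min`, `num_threads`, `MAX`) of the original code. It assumes 1 <= num_threads <= number of elements.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>
#include <vector>

// Illustrative names only; not taken from the course's demo program.
struct StridedTask {
    const double *x;     // shared input array (only read by the workers)
    int n;               // number of elements in x
    int id;              // worker ID: 0 .. num_threads-1
    int num_threads;     // stride between the elements this worker visits
    double result;       // private slot for this worker's minimum
};

static void *strided_worker(void *arg)
{
    StridedTask *t = static_cast<StridedTask *>(arg);

    // Find min over x[id], x[id+num_threads], x[id+2*num_threads], ...
    double my_min = t->x[t->id];
    for (int i = t->id + t->num_threads; i < t->n; i += t->num_threads)
        if (t->x[i] < my_min)
            my_min = t->x[i];

    t->result = my_min;  // store min in this worker's private slot
    return NULL;
}

// Assumption: 1 <= num_threads <= x.size(), so every worker has an element.
double find_min_strided(const std::vector<double> &x, int num_threads)
{
    std::vector<pthread_t> tid(num_threads);
    std::vector<StridedTask> task(num_threads);

    // Create worker threads
    for (int i = 0; i < num_threads; i++) {
        task[i].x = x.data();
        task[i].n = static_cast<int>(x.size());
        task[i].id = i;
        task[i].num_threads = num_threads;
        task[i].result = 0.0;

        if (pthread_create(&tid[i], NULL, strided_worker, &task[i]) != 0) {
            for (int j = 0; j < i; j++)      // clean up workers already started
                pthread_join(tid[j], NULL);
            throw std::runtime_error("Cannot create thread");
        }
    }

    // Wait for worker threads to end
    for (int i = 0; i < num_threads; i++)
        pthread_join(tid[i], NULL);

    // Post processing: find actual minimum among the per-thread minima
    double my_min = task[0].result;
    for (int i = 1; i < num_threads; i++)
        if (task[i].result < my_min)
            my_min = task[i].result;

    return my_min;
}
```

A call such as `find_min_strided(values, 2)` plays the role of the main thread in the picture: it starts the workers, waits for them, and combines their results. On many systems the program has to be compiled with the `-pthread` option.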

• Speed up...

• Try running the programs using different numbers of threads (the program prints the elapsed time)

• Notice that the first version has drastically improved times on multi-processors (e.g. on compute

But the second version... not so much...
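How the demo programs measure the elapsed time is not shown here. One possible way to time the thread-create/join section yourself, assuming a C++11 compiler with `<chrono>`, is sketched below. The helper name `elapsed_seconds` is made up for this illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `work` once and return the wall-clock
// time it took, in seconds.
template <typename Work>
double elapsed_seconds(Work work)
{
    auto start = std::chrono::steady_clock::now();
    work();                                   // the code being timed
    auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the "create worker threads", "wait for worker threads" and "post processing" steps in a lambda and passing it to `elapsed_seconds` gives one number per run. Repeating the run for 1, 2, 4, ... threads makes the two versions comparable.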

• \$60,000 question:

• Why is the second version not doing so well?