Lecture 12--Quicksort: average behavior, pivot choice. 10/21/97.

=============================================================================

12.1. What we know about quicksort so far. From Lecture 11 we know the following facts about quicksort:

1. It is a divide-and-conquer algorithm satisfying the recurrence relation

T(0) = T(1) = b for a constant b

T(n) = T(j) + T(n-j) + cn for a constant c and for some j with 1 <= j <= n-1.

2. The term cn comes from the time it takes to split the remaining list into two parts and is proportional to the number of comparisons done.

3. The quicksort splitting can be done "in place", so that the partitioning step requires only a constant amount of extra space.

4. Because we cannot guarantee the sizes of the two lists with any simple choice of pivot element, in the worst case quicksort is a $\Theta(n^2)$ time algorithm. In this worst case the stack size can grow to $\Theta(n)$, so quicksort does not use space efficiently in this case. We can alleviate the space problem by replacing one recursive call with a loop, i.e., by recursively sorting one part of the array and then updating the "first" or "last" pointer to delimit what is left. If the part sorted recursively is always the smaller one, the stack depth stays at $O(\log_2 n)$.

5. If we COULD guarantee that the remaining part of the array was always split into two subarrays of about equal size, then quicksort would always finish in time $\Theta(n \log_2 n)$.
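The remedy described in item 4 can be sketched in Python as follows. This is only an illustration, not the procedure from Lecture 11: the names split and quicksort, the use of the first element as pivot, and the rule of always recursing on the smaller part are assumptions of the sketch.

```python
def split(a, first, last):
    """Partition a[first..last] around a[first]; return the pivot's final index."""
    x = a[first]
    pivot = first
    for index in range(first + 1, last + 1):
        if a[index] < x:
            pivot += 1
            a[pivot], a[index] = a[index], a[pivot]
    a[first], a[pivot] = a[pivot], a[first]
    return pivot


def quicksort(a, first=0, last=None):
    """Sort a[first..last] in place; only the smaller part is sorted recursively."""
    if last is None:
        last = len(a) - 1
    while first < last:
        p = split(a, first, last)
        if p - first < last - p:
            quicksort(a, first, p - 1)  # smaller left part: recursive call
            first = p + 1               # larger right part: handled by the loop
        else:
            quicksort(a, p + 1, last)   # smaller right part: recursive call
            last = p - 1                # larger left part: handled by the loop
    return a
```

On an already sorted array this sketch still takes time proportional to $n^2$, but the recursion never goes deeper than about $\log_2 n$ levels, because each recursive call receives fewer than half of the remaining elements.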

12.2. Average time taken by quicksort. In fact, we can show that ON THE AVERAGE, where we average over all n! possible arrangements of the input and assume that each of these arrangements is equally likely to be input, quicksort will only take time $\Theta(n \log_2 n)$. Because of this average behavior, quicksort is probably the sort most often used in practice for general sorting of an array.

We assume that the data input to quicksort does not have any repeated values. This assumption is just to simplify the following proof somewhat. With a little more work it can be removed.
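To make the averaging over all n! arrangements concrete, the following Python sketch counts comparisons for every ordering of n distinct keys and returns the exact average. It is an illustration only: the function names are hypothetical, the pivot is assumed to be the first item, and the partition is written with lists rather than in place.

```python
from fractions import Fraction
from itertools import permutations


def comparisons(items):
    """Count comparisons when the first item is the pivot (distinct items assumed)."""
    if len(items) < 2:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x > pivot]
    # Charge one comparison against the pivot per remaining item, then recurse.
    return len(rest) + comparisons(smaller) + comparisons(larger)


def average_comparisons(n):
    """Exact average number of comparisons over all n! orderings of n distinct keys."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        total += comparisons(list(perm))
        count += 1
    return Fraction(total, count)
```

For example, average_comparisons(3) is 8/3: the two orderings that start with the middle key cost 2 comparisons each, and the other four cost 3 each. The enumeration is only practical for small n, since the number of orderings grows as n!.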

Theorem. If we assume all inputs to the quicksort algorithm are equally likely, then the expected time for quicksort to sort an input, which is proportional to the number of comparisons done by quicksort, is $\Theta(n \log_2 n)$, i.e., if $T_E(n)$ denotes the expected time for quicksort to sort an array of n (distinct) items then

\[ T_E(n) = \Theta(n \log_2 n). \]

Proof. At each call of the quicksort procedure we choose a pivot p. Suppose that p is the jth smallest element, $1 \le j \le n$. Then the two recursive calls to quicksort will require expected time $T_E(j-1)$ and $T_E(n-j)$. Now, for each j, p is the jth smallest element with probability $1/n$. Therefore we have

\[
\begin{aligned}
T_E(n) &\le cn + \frac{1}{n} \sum_{j=1}^{n} \{ T_E(j-1) + T_E(n-j) \} \\
&= cn + \frac{2}{n} \sum_{j=0}^{n-1} T_E(j).
\end{aligned}
\]

Now let $b = T_E(0) = T_E(1)$. We prove by induction that for $n \ge 2$

\[ T_E(n) \le (2c + 2b)\, n \log_2 n. \]

Step 1. For n = 2 the above summation gives

\[ T_E(2) \le 2c + \frac{2}{2}\,(T_E(0) + T_E(1)) = 2c + 2b \le (2c + 2b) \cdot 2 \log_2 2, \]

and so step 1 is complete.

Step 2. Now assume we have proved the bound for all k with $2 \le k \le n-1$. Then, by the induction hypothesis and the fact that $x \log_2 x$ is increasing (so that the sum is bounded by the integral), we have

\[
\begin{aligned}
T_E(n) &\le cn + \frac{2}{n} \sum_{j=0}^{n-1} T_E(j) \\
&\le cn + \frac{2}{n}\,(T_E(0) + T_E(1)) + \frac{2}{n} \sum_{j=2}^{n-1} (2c + 2b)\, j \log_2 j \\
&\le cn + \frac{4b}{n} + \frac{2}{n}\,(2c + 2b) \int_{2}^{n} x \log_2 x \, dx \\
&\le cn + \frac{4b}{n} + \frac{2}{n}\,(2c + 2b) \left\{ \frac{n^2 \log_2 n}{2} - \frac{n^2}{4} \right\} \\
&= cn + \frac{4b}{n} + (2c + 2b)\, n \log_2 n - (2c + 2b)\, \frac{n}{2} \\
&\le (2c + 2b)\, n \log_2 n
\end{aligned}
\]

(since $cn + 4b/n - (2c + 2b)(n/2) = 4b/n - bn \le 0$ for $n \ge 2$).
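The bound on the integral used in Step 2 is stated without justification. One way to check it, written out as a short worked computation (the antiderivative is standard; the remarks on the constants are added here for illustration), is:

```latex
\[
\int_{2}^{n} x \log_2 x \, dx
  = \left[ \frac{x^2 \log_2 x}{2} - \frac{x^2}{4 \ln 2} \right]_{2}^{n}
  = \frac{n^2 \log_2 n}{2} - \frac{n^2}{4 \ln 2} - \left( 2 - \frac{1}{\ln 2} \right).
\]
% Since 1 < 1/\ln 2 < 2, we have n^2/(4 \ln 2) \ge n^2/4 and 2 - 1/\ln 2 > 0, hence
\[
\int_{2}^{n} x \log_2 x \, dx \le \frac{n^2 \log_2 n}{2} - \frac{n^2}{4}.
\]
```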

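Step 3. Since we have verified Step 1 and Step 2, we have shown by induction that the inequality is valid. This gives the upper bound $T_E(n) = O(n \log_2 n)$. The matching lower bound holds because no sequence of pivot choices does better than the even splits of item 5 in 12.1, which already take time proportional to $n \log_2 n$.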
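As a numerical sanity check on the induction, the recurrence can be evaluated directly (taking it with equality) and compared against the claimed bound. The Python sketch below does this; the function names and the sample values b = c = 1 are assumptions chosen only for illustration.

```python
import math


def expected_cost(n_max, b=1.0, c=1.0):
    """Evaluate T(n) = c*n + (2/n) * (T(0) + ... + T(n-1)) with T(0) = T(1) = b."""
    t = [b, b]
    prefix = 2 * b  # running sum T(0) + ... + T(n-1)
    for n in range(2, n_max + 1):
        value = c * n + 2.0 * prefix / n
        t.append(value)
        prefix += value
    return t


def claimed_bound(n, b=1.0, c=1.0):
    """The bound (2c + 2b) * n * log2(n) proved by induction for n >= 2."""
    return (2 * c + 2 * b) * n * math.log2(n)
```

With b = c = 1 the recurrence gives T(2) = 4 and T(3) = 7, against bounds of 8 and about 19.02, so the inequality holds with room to spare for these values.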

12.3. Two "random" algorithms. The above proof assumes that the data input to quicksort is in "random order" (more precisely, that the pivot p chosen at each step is equally likely to be the jth smallest element for any j with $1 \le j \le n$). We can modify the quicksort algorithm slightly to increase the chances of this being true. We give two modified versions of quicksort. In each version we assume the existence of a (pseudo-) random number generator. If we run one of these versions twice on the exact same input, we may get different running times, because we are making some random choices about what to do at various points in the program. (We will, of course, get the correct sorted output in any case.) So, by introducing some overhead to generate random numbers, we can come closer to guaranteeing that, regardless of the initial data input, quicksort will exhibit running time proportional to $n \log_2 n$ rather than $n^2$. An algorithm which makes some random choices during its execution, and which can therefore exhibit different behavior on the same input, is called a "random" algorithm (the exercises below use the equivalent term "randomized").

Quicksort (first, last). \\ Random version 1.

     This procedure is given in the programming assignment.  At each
     stage choose the pivot randomly, i.e., any item in the array 
     remaining to be sorted is equally likely to be chosen.
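The assignment's own procedure is not reproduced in these notes. As a rough idea of what random version 1 might look like, here is a minimal Python sketch; the function name randomized_quicksort, the rng parameter, and the trick of swapping the randomly chosen pivot to the front before splitting are all assumptions of the sketch, not details taken from the assignment.

```python
import random


def randomized_quicksort(a, first=0, last=None, rng=random):
    """Sort a[first..last] in place, choosing each pivot uniformly at random."""
    if last is None:
        last = len(a) - 1
    if first >= last:
        return a
    # Every position in a[first..last] is equally likely to supply the pivot.
    r = rng.randint(first, last)
    a[first], a[r] = a[r], a[first]
    x = a[first]
    pivot = first
    for index in range(first + 1, last + 1):
        if a[index] < x:
            pivot += 1
            a[pivot], a[index] = a[index], a[pivot]
    a[first], a[pivot] = a[pivot], a[first]
    randomized_quicksort(a, first, pivot - 1, rng)
    randomized_quicksort(a, pivot + 1, last, rng)
    return a
```

Passing a seeded generator, for example rng=random.Random(1), makes a run repeatable; with the default generator, two runs on the same input may take different numbers of steps while producing the same sorted output.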

Quicksort (first, last). \\ Random version 2.

     Replace procedure split (first, last, pivot) with:

     random-split (first, last, pivot):
          generate a random permutation P of the integers first, first + 1, . . . , last
          reorder the array by placing the element at position j in position P(j)
          x = A[first]
          pivot = first
          for index = first + 1 to last do
               if A[index] < x then
                    pivot = pivot + 1
                    swap (A[pivot], A[index])
          swap (A[first], A[pivot])
          return pivot
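A Python rendering of random version 2 might look like the sketch below. It follows the pseudocode step by step, but the names random_split and quicksort_v2, the rng parameter, and the use of shuffle to produce the random permutation are assumptions made for illustration.

```python
import random


def random_split(a, first, last, rng=random):
    """Randomly permute a[first..last], then split around the new a[first]."""
    segment = a[first:last + 1]
    rng.shuffle(segment)           # random permutation of the remaining part
    a[first:last + 1] = segment
    x = a[first]
    pivot = first
    for index in range(first + 1, last + 1):
        if a[index] < x:
            pivot += 1
            a[pivot], a[index] = a[index], a[pivot]
    a[first], a[pivot] = a[pivot], a[first]
    return pivot


def quicksort_v2(a, first=0, last=None, rng=random):
    """Sort a[first..last] in place using random_split at every stage."""
    if last is None:
        last = len(a) - 1
    if first < last:
        p = random_split(a, first, last, rng)
        quicksort_v2(a, first, p - 1, rng)
        quicksort_v2(a, p + 1, last, rng)
    return a
```

After the shuffle, the element at position first is equally likely to be any element of the remaining part, which is the property the average-case proof relies on.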

Exercise 12.1. Explain how the randomization steps in the two randomized versions of quicksort affect the running time of the algorithm.

Exercise 12.2. Give a randomized algorithm to search an array of n elements for an element X. Assume the array is unordered and X is equally likely to be at any position or not in the array.

Exercise 12.3. (grad) Calculate the running time of the algorithm you developed in Exercise 12.2.

Exercise 12.4. Explain how a random number generator which generates only two values, 0 and 1, can be used to implement the randomized versions of quicksort. What is the effect on the running time of each algorithm if such a bit generator is used?