Lecture 7--Searching a list: average time. 10/6/97
================================================================
In Lecture 6 we analyzed the linear search and binary search algorithms for searching a list. Here we consider the average-time behavior of these algorithms. First we review some probability. For most, if not all, of what we do this quarter, we will need only simple probabilistic arguments. So we assume we have a finite set S and, for each element s in S, a nonnegative weight P(s), chosen so that
$$\sum_{s \in S} P(s) = 1.$$
We call P(s) the probability of s. If R is a subset of S, then we can define
$$P(R) = \sum_{s \in R} P(s).$$
Example. Let S be the set of outcomes when a coin is flipped. There are two outcomes, H (heads) or T (tails). We have P(H) = p and P(T) = q, with p + q = 1. If p = q = 1/2, we say the coin is "fair".
Often we will be concerned with some combination of events. If A and B are sets, we use the notation
AB = A and B (i.e., A intersect B).
A + B = A or B (i.e., A union B).
A' = not A (i.e., A complement).
For A, B subsets of S, with P(B) ≠ 0, we define the conditional probability of A given B,
P[A | B] = P(AB) / P(B).
If P[A | B] = P(A) (i.e., if P(AB) = P(A)P(B)), then we say A and B are independent.
Example. Suppose we flip the above coin twice. Then there are four possible outcomes: HH, HT, TH, TT. If the two flips are independent, then we have
P(HH) = p^2, P(TT) = q^2, P(HT) = P(TH) = pq.
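The definitions so far can be checked numerically on the two-flip example. The following is a minimal Python sketch, not part of the original lecture; the helper names `prob` and `cond_prob` and the bias p = 0.3 are illustrative assumptions.

```python
from itertools import product

p = 0.3          # assumed bias P(H); any value in [0, 1] would do
q = 1 - p        # P(T)
flip = {"H": p, "T": q}

# Probability space for two independent flips: P(xy) = P(x) * P(y).
P = {a + b: flip[a] * flip[b] for a, b in product("HT", repeat=2)}

def prob(event):
    """P(R): the sum of P(s) over the outcomes s in the event R."""
    return sum(P[s] for s in event)

def cond_prob(a, b):
    """P[A | B] = P(AB) / P(B); only meaningful when P(B) != 0."""
    return prob(a & b) / prob(b)

A = {s for s in P if s[0] == "H"}   # event: first flip is heads
B = {s for s in P if s[1] == "H"}   # event: second flip is heads

print(abs(sum(P.values()) - 1) < 1e-12)         # the weights sum to 1
print(abs(cond_prob(A, B) - prob(A)) < 1e-12)   # A and B are independent
```

Representing events as Python sets makes AB simply `a & b`, which mirrors the intersection notation above.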
Example. Suppose we flip a fair coin 3 times. What is the probability of getting exactly two heads? Using brute force, we calculate
P(exactly two heads) = P(HHT) + P(HTH) + P(THH) = 3/8.
More elegantly,
P(exactly two heads) = C(3,2) * P(one outcome) = 3 * (1/8) = 3/8.
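Both calculations are small enough to reproduce by enumeration. The sketch below (Python, with variable names chosen here for illustration) lists all eight outcomes of three fair flips and compares the brute-force sum with the counting argument.

```python
from fractions import Fraction
from itertools import product
from math import comb

# All 2^3 outcomes of three flips, e.g. "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]
each = Fraction(1, len(outcomes))   # fair coin: every outcome has weight 1/8

# Brute force: add up the weights of the outcomes with exactly two heads.
brute = sum(each for s in outcomes if s.count("H") == 2)

# Counting: C(3,2) outcomes, each with probability 1/8.
elegant = comb(3, 2) * each

print(brute, elegant)   # 3/8 3/8
```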
Example. Suppose we have an N-element array containing N distinct integers i0 < i1 < i2 < i3 < . . . < i(N-1). Suppose we have randomly selected the position in the array for each integer (i.e., each integer is equally likely to appear at any one of the N positions). What is the probability that i0 is at position 0? Clearly
P(i0 is at position 0) = 1 / N .
What is the probability that i0 is at position 0 and i1 is at position 1?
This is just
P(i0 is at position 0) * P(i1 is at position 1 | i0 is at position 0)
= (1 / N) * (1 / (N - 1)),
since once i0 occupies position 0, i1 is equally likely to be in any of the remaining N - 1 positions.
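For a small array this product can be confirmed by listing every arrangement. The sketch below is in Python and assumes N = 4 and that every arrangement of the integers is equally likely; the integers are represented by their ranks 0, 1, ..., N-1.

```python
from fractions import Fraction
from itertools import permutations

N = 4   # assumed small size so that full enumeration is cheap

# Each arrangement a is a tuple; a[pos] is the rank of the integer at pos.
arrangements = list(permutations(range(N)))
total = len(arrangements)

# Count arrangements with i0 at position 0, and with i1 at position 1 as well.
first = sum(1 for a in arrangements if a[0] == 0)
both = sum(1 for a in arrangements if a[0] == 0 and a[1] == 1)

p_first = Fraction(first, total)
p_both = Fraction(both, total)

print(p_first)   # 1/4, i.e. 1/N
print(p_both)    # 1/12, i.e. (1/N) * (1/(N-1))
```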
Exercise 7.1. What is the probability that ij is at position j for all j with 0 <= j <= N-1?
Exercise 7.2. How many different ways are there to arrange the N integers in the array A? Explain your answer.
Expected value. Recall that if we have a probability function P defined on a set S, then the expected value of a function f defined on S is just the weighted average with respect to P,
$$E(f) = \sum_{s \in S} f(s) \cdot P(s).$$
Remember that a (real-valued) function defined on a probability space, such as the above f, is called a random variable.
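As an illustration of the definition, the sketch below computes an expected value directly as a weighted sum. It is written in Python; the helper name `expected_value` and the choice of random variable (the number of heads in three fair flips) are assumptions made here for the example.

```python
from fractions import Fraction
from itertools import product

def expected_value(f, P):
    """E(f): the sum over s in S of f(s) * P(s), with P given as a dict."""
    return sum(f(s) * P[s] for s in P)

# Probability space: three flips of a fair coin, each outcome with weight 1/8.
P = {"".join(t): Fraction(1, 8) for t in product("HT", repeat=3)}

def heads(s):
    """Random variable: the number of heads in outcome s."""
    return s.count("H")

print(expected_value(heads, P))   # 3/2
```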
Example. What is the expected value of the position of i0 in the array A? We know that i0 is equally likely to be at any position 0 through N-1. So let S be the set of possible positions of i0, i.e., S = {0, 1, 2, 3, . . . , N - 1}, and let f(s) = s. We assume each position is equally likely. Then
$$
\begin{aligned}
\text{Expected position of } i_0 &= \sum_{0 \le s \le N-1} s \cdot P(s) \\
&= \sum_{0 \le s \le N-1} s \cdot \frac{1}{N} \\
&= \frac{1}{N} \cdot \frac{(N-1)N}{2} \\
&= \frac{N-1}{2},
\end{aligned}
$$
i.e., i0's expected position is "in the middle" of the array A.
Now we can compute the expected time to search an array, using linear search, for a given element X. Assume X is equally likely to be in any one of the N array positions or not in the array at all, so each of these N + 1 cases has probability 1/(N+1). Inspecting the linear search algorithm, we see that the number of time units it takes to find X in position i is a constant k times (i+1). (Here position N represents "not found".) Then the expected time to search for X is
$$
\begin{aligned}
\sum_{0 \le i \le N} k(i+1) \cdot P(\text{X is in position } i)
&= \sum_{0 \le i \le N} k(i+1) \cdot \frac{1}{N+1} \\
&= \frac{(N+1)(N+2)}{2} \cdot \frac{1}{N+1} \cdot k \\
&= \frac{k(N+2)}{2} \\
&= \Theta(N).
\end{aligned}
$$
So the average time, k(N+2)/2, is about half the worst-case time, k(N+1).
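The closed form can be checked by running a linear search over every equally likely case and averaging the cost. The sketch below is in Python; it takes k = 1, assumes N = 10, and follows the cost model above by charging N + 1 units when X is absent. The function name `linear_search_steps` is hypothetical and is not the Lecture 6 code.

```python
from fractions import Fraction

def linear_search_steps(arr, x):
    """Cost of a linear search for x with k = 1: i + 1 units if x is at
    index i, and len(arr) + 1 units if x is absent (the model used above)."""
    for i, v in enumerate(arr):
        if v == x:
            return i + 1
    return len(arr) + 1   # "position N" stands for not found

N = 10                    # assumed array size
arr = list(range(N))
targets = arr + [-1]      # N present keys plus one absent key, equally likely

avg = Fraction(sum(linear_search_steps(arr, x) for x in targets), N + 1)
print(avg)                # 6, which equals (N + 2) / 2
```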
We can also show that the average time to search an ordered list using binary search is approximately
$$\lfloor \log_2 N \rfloor + \frac{1}{2},$$
while the worst-case time is about
$$\lfloor \log_2 N \rfloor + 1,$$
i.e., there is not much difference between the average time and the worst-case time for binary search. Since the times are about the same and since the proof is quite long, we will omit it here.
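Although the proof is omitted, the closeness of the two times is easy to observe empirically. The sketch below is in Python and counts probes (array elements examined) in a standard iterative binary search, averaged over successful searches only, with N = 1023 as an assumed size. It is not the Lecture 6 code, and the exact constant in the average depends on what is counted as a step and on whether unsuccessful searches are included.

```python
def binary_search_probes(arr, x):
    """Return how many array elements are examined while searching the
    sorted list arr for x."""
    lo, hi = 0, len(arr) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if arr[mid] == x:
            return probes
        if arr[mid] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes

N = 1023                          # assumed size: 2^10 - 1
arr = list(range(N))
counts = [binary_search_probes(arr, x) for x in arr]   # every key present

average = sum(counts) / N
worst = max(counts)
print(worst)                      # 10
print(round(average, 2))          # 9.01
```

Here the average is within one probe of the worst case, in line with the claim that the two times differ by little.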
Exercise 7.3. We define the variance of a random variable f to be
$$E\left( (f - E(f))^2 \right).$$
What is the variance of f if f is the number of steps it takes to find whether or not X is in an array of N items using linear search?
Exercise 7.4. What is the expected running time of your algorithm to find the maximum element in an unsorted array of N elements (lecture 3)?
Some computer scientists frown on using expected time rather than worst-case time to evaluate an algorithm, because they say that the worst case might be the one encountered in practice. In many cases, however, an algorithm incorporated in a program for a utility job such as sorting or searching will be run over and over on many different data sets, and so calculating its average running time will give a reasonable prediction of how long it will take to process these many sets in practice.
Exercise 7.5. Suppose we have N integers stored in an array. What is the probability that they are in order (either nondecreasing or nonincreasing)? You may assume the integers are all distinct.
Exercise 7.6. (grad) In the above array, what is the probability that either the integers are in order or that exactly one of them, the smallest, is in the wrong place?