Lecture 1--Review (diagnostic test). 10/3/97

================================================================================

An answer key to the diagnostic test will be distributed when your work is returned to you.

================================================================================

Lecture 2. Example algorithm (Euclid); measuring algorithm efficiency. 10/3/97

================================================================================

In this course we will study the design and analysis of algorithms. We will be concerned with theoretical analysis and also with how algorithms behave in practice. Along with the references listed for this course, you should be aware of an encyclopaedic collection of information on algorithms for many different problems:

D. Knuth, The Art of Computer Programming, Addison-Wesley. Volume 1, Fundamental Algorithms, 2nd edition, 1973. Volume 2, Seminumerical Algorithms. Volume 3, Sorting and Searching.

What is an algorithm? What characteristics of an algorithm are important? In his Volume 1, Knuth defines an algorithm to be a finite set of rules giving a sequence of operations for solving a specific type of problem. The word algorithm, Knuth points out, comes from the name of the Persian Al-Kowarizmi, who wrote a book on arithmetic in about 825. We already know many algorithms. For example, in grade school we all learned an algorithm for adding two integers and another algorithm for subtracting one integer from another. We also learned algorithms for multiplication of two integers and for "long division". One of the oldest known procedures which fits the definition of an algorithm is Euclid's algorithm for finding the greatest common divisor of two positive integers. A modern version of it goes as follows:

Input:  positive integers m and n
Output:  g, the greatest common divisor of m and n

Begin
 
while n <> 0 do
  
     temp = m mod n
     m = n
     n = temp

g = m

End

Note that we have written this algorithm in "pseudocode", i.e., we have not used the syntax of any particular computer language, but we have used constructs which can be found in most modern computer languages. In particular we have used the "while" loop construct.
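For readers who want to run the algorithm, the pseudocode above can be transcribed almost line for line. The following is a minimal sketch in Python; the function name euclid_gcd and the check that the inputs are positive are our own additions for illustration, not part of the algorithm as stated.

```python
def euclid_gcd(m: int, n: int) -> int:
    """Return the greatest common divisor of positive integers m and n."""
    # Input check added for this sketch; the pseudocode simply assumes it.
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    while n != 0:          # "while n <> 0 do"
        temp = m % n       # remainder when m is divided by n
        m = n
        n = temp
    return m               # "g = m"


if __name__ == "__main__":
    print(euclid_gcd(15, 35))
```

The three assignments inside the loop are kept separate to mirror the pseudocode; in Python they could be combined into the single statement m, n = n, m % n.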

What properties of an algorithm are important?

1. Correctness. We will require that an algorithm A be implementable on a computer and give a correct answer to a given problem P (in a finite amount of time).

2. Efficiency. A good algorithm should make efficient use of the two main resources in a computer system, time and space.

3. A good algorithm should be easy to implement, relative to the number of times it will be run.

4. A good algorithm should make efficient use of time and space not only in the theoretical sense but also in its actual running time for the cases of interest to the user of the algorithm.

5. A good algorithm takes into account special characteristics of the data that will be processed.

Property 1 is, of course, essential. An algorithm must be implementable and correctly give the output desired. Properties 2-5 are relative. It may be that by using extra space (or memory) we can have a faster algorithm. It may be that the algorithm will be used thousands of times once it is implemented, so we can choose a more efficient algorithm which is not so easy to implement. It may be that an algorithm makes efficient use of time and space in theory but is inefficient in practice or for the sizes of input we will be encountering. We will see many examples of this as we go along.

As we mentioned above, we will usually measure efficiency in terms of space and time used, since these are the resources important in a computer system. We will mostly be concerned with a theoretical analysis of the efficiency of various algorithms, but a good algorithm developer should always keep in mind that practical considerations cannot be ignored. Important practical considerations include the difficulty of implementing an algorithm, whether the resulting program will be used only a few times or many times, and specific characteristics of the data to which the algorithm will actually be applied.

Example (Euclid's algorithm given above).
If m = 15, n = 35, we get: 

iteration      m      n 
      0        15     35
      1        35     15
      2        15      5
      3         5      0

output is g = m = 5
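The table above can be reproduced by recording m and n after each pass through the loop. The following Python sketch does this; the function name euclid_trace and the column widths used for printing are our own choices for illustration.

```python
def euclid_trace(m: int, n: int) -> list:
    """Return rows (iteration, m, n); iteration 0 holds the inputs."""
    rows = [(0, m, n)]
    iteration = 0
    while n != 0:
        m, n = n, m % n    # one pass through the loop of Euclid's algorithm
        iteration += 1
        rows.append((iteration, m, n))
    return rows


if __name__ == "__main__":
    print("iteration      m      n")
    for iteration, m, n in euclid_trace(15, 35):
        print(f"{iteration:>9} {m:>6} {n:>6}")
```

Note that when m < n, as in this example, the first iteration does nothing more than exchange m and n.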

In fact, this algorithm is quite difficult to analyze. First we must know that the problem of computing the gcd is well-defined. This means we need to know enough algebra to know that every integer greater than 1 has a unique factorization into a product of powers of primes and that from the representations for m and n the gcd of m and n can be extracted. Then we must know that the algorithm is correct. This follows from knowing that we can always write

m = kn + r,

where k is an integer and 0 <= r < n. Divisibility properties of integers imply that any integer x which divides both m and n must also divide r = m - kn, and conversely that any integer which divides both n and r must also divide m = kn + r. So the pairs (m,n) and (n,r) have exactly the same common divisors, and gcd(m,n) = gcd(n,r). Each iteration replaces the pair (m,n) by the pair (n,r), so the gcd of the current pair never changes. When the algorithm finishes we have n = 0, and the gcd of m and 0 is m, which is the value output. We also see that the algorithm must finish eventually, since the r value being computed is always less than n (and less than m after the first iteration), so n strictly decreases with each iteration, and there are only finitely many integers between the initial values of m, n and 0. So the algorithm is well-defined and correct.
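The correctness argument rests on the fact that replacing the pair (m,n) by the pair (n, m mod n) does not change the greatest common divisor. This can be spot-checked by machine. The sketch below compares the two gcd values for all small pairs, using math.gcd from the Python standard library as an independent reference; the function name check_invariant and the limit of 60 are arbitrary choices of ours. A finite check of this kind is, of course, not a proof.

```python
import math


def check_invariant(limit: int) -> bool:
    """Check gcd(m, n) == gcd(n, m mod n) for all 1 <= m, n <= limit."""
    for m in range(1, limit + 1):
        for n in range(1, limit + 1):
            r = m % n                      # so m = k*n + r with 0 <= r < n
            if not (0 <= r < n):
                return False
            if math.gcd(m, n) != math.gcd(n, r):
                return False
    return True


if __name__ == "__main__":
    print(check_invariant(60))
```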

Efficiency is also difficult to measure. It is fairly easy to see that we are only using about 3 memory cells, to hold m, n, and temp, but how many iterations will be required? It turns out that the number of iterations is bounded by a constant times the maximum number of bits in m or n, i.e., the time for the algorithm to run is at most proportional to the maximum of the base 2 logarithm of m and the base 2 logarithm of n. Usually fewer iterations will be required, but the Fibonacci numbers (lecture 4) give examples of pairs m,n for which the maximum number of iterations is required.
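To get a feel for the iteration count, one can count passes through the while loop for pairs of consecutive Fibonacci numbers and compare the count with the number of bits in the larger input. The following Python sketch does so; the names euclid_iterations and fibonacci_pairs, and the choice to start from the pair (2, 1), are our own assumptions for illustration.

```python
def euclid_iterations(m: int, n: int) -> int:
    """Count how many times the while loop of Euclid's algorithm runs."""
    count = 0
    while n != 0:
        m, n = n, m % n
        count += 1
    return count


def fibonacci_pairs(how_many: int):
    """Yield consecutive Fibonacci pairs (2, 1), (3, 2), (5, 3), ..."""
    a, b = 2, 1
    for _ in range(how_many):
        yield a, b
        a, b = a + b, a


if __name__ == "__main__":
    print("     m      n  iterations  bits")
    for m, n in fibonacci_pairs(10):
        bits = max(m.bit_length(), n.bit_length())
        print(f"{m:>6} {n:>6} {euclid_iterations(m, n):>11} {bits:>5}")
```

For these pairs every remainder step removes only one copy of n from m, so each step moves to the preceding Fibonacci pair and the iteration count grows by one from each pair to the next.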

Exercise 2.1. For Euclid's algorithm given above

a. How many memory cells will be used?

b. Suppose for inputs m,n the while loop will be executed a total of K times. How many statements will be executed? (Explain how you are counting the statements in the algorithm; for example, how many statements are you counting for "while n <> 0"?)

c. What is the "best case" running time of Euclid's algorithm, i.e., what is the fastest it can run? Give an example of values m,n for which this time is achieved.

Exercise 2.2. Give an algorithm to find the maximum value stored in an (unsorted) array A of n integers. In terms of n, calculate how much space your algorithm uses and how many statements will be executed when it is run

a. in the "worst" case.

b. in the "best" case.

Give an example of a specific array A on which your algorithm will achieve "best case" behavior and an array A on which your algorithm will achieve "worst case" behavior. (For this part you may assume n = 10.)