Program Efficiency
&
Complexity Analysis
Algorithm Review
◼ An algorithm is a definite procedure for
solving a problem in finite number of steps
◼ Algorithm is a well defined computational
procedure that takes some value (s) as
input, and produces some value (s) as
output.
◼ Algorithm is finite number of computational
statements that transform input into the
output
Good Algorithms?
◼ Run in less time
◼ Consume less memory
But computational resources (time
complexity) is usually more important
Measuring Efficiency
◼ The efficiency of an algorithm is a measure of
the amount of resources consumed in solving a
problem of size n.
◼ The resource we are most interested in is time
◼ We can use the same techniques to analyze the
consumption of other resources, such as memory
space.
◼ It would seem that the most obvious way to
measure the efficiency of an algorithm is to run it
and measure how much processor time is
needed
◼ But is it correct???
Factors
◼ Hardware
◼ Operating System
◼ Compiler
◼ Size of input
◼ Nature of Input
◼ Algorithm
Which should be improved?
Running Time of an Algorithm
◼ Depends upon
◼ Input Size
◼ Nature of Input
◼ Generally time grows with size of input, so
running time of an algorithm is usually
measured as function of input size.
◼ Running time is measured in terms of
number of steps/primitive operations
performed
◼ Independent from machine, OS
Finding running time of an
Algorithm / Analyzing an Algorithm
◼ Running time is measured by number of
steps/primitive operations performed
◼ Steps means elementary operation like
,+, *,<, =, A[i] etc
◼ We will measure number of steps taken in
term of size of input
Simple Example (1)
// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N)
{
int s=0;
for (int i=0; i< N; i++)
s = s + A[i];
return s;
}
How should we analyse this?
Simple Example (2)
// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N){
int s=0; 1
for (int i=0; i< N; i++)
2 3 4
s = s + A[i];
5 1,2,8: Once
return s;
6 7
3,4,5,6,7: Once per each iteration
}
8 of for loop, N iteration
Total: 5N + 3
The complexity function of the
algorithm is : f(N) = 5N +3
Simple Example (3)
Growth of 5n+3
Estimated running time for different values of N:
N = 10 => 53 steps
N = 100 => 503 steps
N = 1,000 => 5003 steps
N = 1,000,000 => 5,000,003 steps
As N grows, the number of steps grow in linear
proportion to N for this function “Sum”
What Dominates in Previous
Example?
What about the +3 and 5 in 5N+3?
As N gets large, the +3 becomes insignificant
5 is inaccurate, as different operations require varying
amounts of time and also does not have any significant
importance
What is fundamental is that the time is linear in N.
Asymptotic Complexity: As N gets large, concentrate on
the highest order term:
◼ Drop lower order terms such as +3
◼ Drop the constant coefficient of the highest order term
i.e. N
Asymptotic Complexity
◼ The 5N+3 time bound is said to "grow
asymptotically" like N
◼ This gives us an approximation of the
complexity of the algorithm
◼ Ignores lots of (machine dependent)
details, concentrate on the bigger picture
Comparing Functions: Asymptotic
Notation
◼ Big Oh Notation: Upper bound
◼ Omega Notation: Lower bound
◼ Theta Notation: Tighter bound
Big Oh Notation [1]
If f(N) and g(N) are two complexity functions, we say
f(N) = O(g(N))
(read "f(N) is order g(N)", or "f(N) is big-O of g(N)")
if there are constants c and N0 such that for N > N0,
f(N) ≤ c * g(N)
for all sufficiently large N.
Big Oh Notation [2]
◼ O(f(n)) =
{g(n) : there exists positive constants c and n0
such that 0 <= g(n) <= c f(n) }
◼ O(f(n)) is a set of functions.
◼ n = O(n2) means that function n belongs to
the set of functions O(n2)
BIG OMEGA NOTATION
◼ If
we wanted to say “running time is at least…”
we use Ω
◼ Big Omega notation, Ω, is used to express the
lower bounds on a function.
◼ If f(n) and g(n) are two complexity functions then
we can say:
f(n) is Ω(g(n)) if there exist
positive 5n+3=Ω(n)
numbers c and n0
such that 0<=f(n)>=cΩ(n)
for all n>=n0
BIG THETA NOTATION
◼ If we wish to express tight bounds we use the theta notation, Θ
◼ f(n) = Θ(g(n)) means that f(n) = O(c2g(n)) and f(n) = Ω(c g(n))
1
17
WHAT DOES THIS ALL MEAN?
◼ If f(n) = Θ(g(n)) we say that f(n) and g(n)
grow at the same rate, asymptotically
◼ If f(n) = O(g(n)) and f(n) ≠ Ω(g(n)), then we
say that f(n) is asymptotically slower
growing than g(n).
◼ If f(n) = Ω(g(n)) and f(n) ≠ O(g(n)), then we
say that f(n) is asymptotically faster growing
than g(n).
18
WHICH NOTATION DO WE USE?
◼ To
express the efficiency of our algorithms
which of the three notations should we use?
◼ Ascomputer scientist we generally like to
express our algorithms as big O since we
would like to know the upper bounds of our
algorithms.
19
◼ Why?
O(f(n))
Example (1)
◼ Consider
f(n)=2n2+3
and g(n)=n2
Is f(n)=O(g(n))? i.e. Is 2n2+3 = O(n2)?
Proof:
2n2+3 ≤ c * n2
Assume N0 =1 and c=1?
Assume N0 =1 and c=2?
Assume N0 =1 and c=3?
◼ If true for one pair of N0 and c, then there exists infinite set of such
pairs of N0 and c
Example (2): Comparing
Functions 4000
◼Which function
3500
is better?
3000
10 n2 Vs n3
2500
10 n^2
2000
n^3
1500
1000
500
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Comparing Functions
◼ As inputs get larger, any algorithm of a
smaller order will be more efficient than an
algorithm of a larger order
0.05 N2 = O(N2)
Time (steps)
3N = O(N)
Input (size)
N = 60
Big-Oh Notation
◼ Even though it is correct to say “7n - 3 is
O(n3)”, a better statement is “7n - 3 is O(n)”,
that is, one should make the approximation as
tight as possible
◼ Simple Rule:
Drop lower order terms and constant
factors
7n-3 is O(n)
8n2log n + 5n2 + n is O(n2log n)
Some Questions
3n2 - 100n + 6 = O(n2)?
3n2 - 100n + 6 = O(n3)?
3n2 - 100n + 6 = O(n)?
3n2 - 100n + 6 = (n2)?
3n2 - 100n + 6 = (n3)?
3n2 - 100n + 6 = (n)?
3n2 - 100n + 6 = (n2)?
3n2 - 100n + 6 = (n3)?
3n2 - 100n + 6 = (n)?
Performance Classification
f(n) Classification
1 Constant: run time is fixed, and does not depend upon n. Most instructions are
executed once, or only a few times, regardless of the amount of information being
processed
log n Logarithmic: when n increases, so does run time, but much slower. Common in
programs which solve large problems by transforming them into smaller problems.
n Linear: run time varies directly with n. Typically, a small amount of processing is
done on each element.
n log n When n doubles, run time slightly more than doubles. Common in programs which
break a problem down into smaller sub-problems, solves them independently, then
combines solutions
n2 Quadratic: when n doubles, runtime increases fourfold. Practical only for small
problems; typically the program processes all pairs of input (e.g. in a double nested
loop).
n3 Cubic: when n doubles, runtime increases eightfold
2n Exponential: when n doubles, run time squares. This is often the result of a natural,
“brute force” solution.
Size does matter[1]
What happens if we double the input size N?
N log2N 5N N log2N N2 2N
8 3 40 24 64 256
16 4 80 64 256 65536
32 5 160 160 1024 ~109
64 6 320 384 4096 ~1019
128 7 640 896 16384 ~1038
256 8 1280 2048 65536 ~1076
COMPLEXITY CLASSES
Time (steps)
28
28
Size does matter[2]
◼ Suppose a program has run time O(n!) and the
run time for
n = 10 is 1 second
For n = 12, the run time is 2 minutes
For n = 14, the run time is 6 hours
For n = 16, the run time is 2 months
For n = 18, the run time is 50 years
For n = 20, the run time is 200 centuries
Standard Analysis Techniques
◼ Constant time statements
◼ Analyzing Loops
◼ Analyzing Nested Loops
◼ Analyzing Sequence of Statements
◼ Analyzing Conditional Statements
Constant time statements
◼ Simplest case: O(1) time statements
◼ Assignment statements of simple data types
int x = y;
◼ Arithmetic operations:
x = 5 * y + 4 - z;
◼ Array referencing:
A[j] = 5;
◼ Array assignment:
j, A[j] = 5;
◼ Most conditional tests:
if (x < 12) ...
Analyzing Loops[1]
◼ Any loop has two parts:
How many iterations are performed?
How many steps per iteration?
int sum = 0,j;
for (j=0; j < N; j++)
sum = sum +j;
Loop executes N times (0..N-1)
4 = O(1) steps per iteration
◼ Total time is N * O(1) = O(N*1) = O(N)
ANALYZING LOOPS – LINEAR LOOPS
◼ Example (have a look at this code segment):
◼ Efficiency is proportional to the number of iterations.
◼ Efficiency time function is :
f(n) = 1 + (n-1) + c*(n-1) +( n-1)
= (c+2)*(n-1) + 1
= (c+2)n – (c+2) +1
33
◼ Asymptotically, efficiency is : O(n)
33
Analyzing Loops[2]
◼ What about this for loop?
int sum =0, j;
for (j=0; j < 100; j++)
sum = sum +j;
◼ Loop executes 100 times
◼ 4 = O(1) steps per iteration
◼ Total time is 100 * O(1) = O(100 * 1) =
O(100) = O(1)
Analyzing Nested Loops[1]
◼ Treat just like a single loop and evaluate each
level of nesting as needed:
int j,k;
for (j=0; j<N; j++)
for (k=N; k>0; k--)
sum += k+j;
◼ Start with outer loop:
How many iterations? N
How much time per iteration? Need to evaluate
inner loop
◼ Inner loop uses O(N) time
◼ Total time is N * O(N) = O(N*N) = O(N2)
Analyzing Nested Loops[2]
◼ What if the number of iterations of one loop
depends on the counter of the other?
int j,k;
for (j=0; j < N; j++)
for (k=0; k < j; k++)
sum += k+j;
◼ Analyze inner and outer loop together:
◼ Number of iterations of the inner loop is:
◼ 0 + 1 + 2 + ... + (N-1) = O(N2)
HOW DID WE GET THIS ANSWER?
◼ When doing Big-O analysis, we sometimes have
to compute a series like: 1 + 2 + 3 + ... + (n-1) + n
◼ i.e.
Sum of first n numbers. What is the
complexity of this?
◼ Gaussfigured out that the sum of the first n
numbers is always:
37
37
CONDITIONAL STATEMENTS
◼ What about conditional statements such as
if (condition)
statement1;
else
statement2;
◼ wherestatement1 runs in O(n) time and
statement2 runs in O(n2) time?
◼ We use "worst case" complexity: among all inputs
of size n, what is the maximum running time?
38
◼ The
38 analysis for the example above is O(n2)
DERIVING A RECURRENCE EQUATION
◼ So far, all algorithms that we have been analyzing have been non
recursive
◼ Example : Recursive power method
◼ If N = 1, then running time T(N) is 2
◼ However if N ≥ 2, then running time T(N) is the cost of each step taken plus
time required to compute power(x,n-1). (i.e. T(N) = 2+T(N-1) for N ≥ 2)
39
◼ How do we solve this? One way is to use the iteration method.
39
ITERATION METHOD
◼ This is sometimes known as “Back Substituting”.
◼ Involves expanding the recurrence in order to see a pattern.
◼ Solving formula from previous example using the iteration
method :
◼ Solution : Expand and apply to itself :
Let T(1) = n0 = 2, so T(N) = nk
T(N) = 2 + T(N-1)
= 2 + 2 + T(N-2)
= 2 + 2 + 2 + T(N-3)
= 2 + 2 + 2 + ……+ 2 + T(1)
= 2N + 2 remember that T(1) = n0 = 2 for N = 1
40
◼ So T(N) = 2N+2 is O(N) for last example.
40
Analyzing Sequence of Statements
◼ For a sequence of statements, compute their
complexity functions individually and add them
up
for (j=0; j < N; j++)
for (k =0; k < j; k++) O(N2)
sum = sum + j*k;
for (l=0; l < N; l++) O(N)
sum = sum -l;
cout<<“Sum=”<<sum; O(1)
Total cost is O(N2) + O(N) +O(1) = O(N2)
SUM RULE
Best Case
◼ Best case is defined as which input of size
n is cheapest among all inputs of size n.
◼ “The best case for my algorithm is n=1
because that is the fastest.” WRONG!
Misunderstanding
Some Properties of Big “O”
◼ Transitive property
If f is O(g) and g is O(h) then f is O(h)
◼ Product of upper bounds is upper bound for the
product
If f is O(g) and h is O(r) then fh is O(gr)
◼ Exponential functions grow faster than
polynomials
nk is O(bn ) b > 1 and k ≥ 0
e.g. n20 is O( 1.05n)
◼ Logarithms grow more slowly than powers
logbn is O( nk) b > 1 and k 0
e.g. log2n is O( n0.5)
SUMMARY
◼ Algorithms can be classified according to their
complexity => O-Notation
only relevant for large input sizes
◼ "Measurements" are machine independent
worst-, average-, best-case analysis
44
44