0% found this document useful (0 votes)
12 views8 pages

Asymptotic Analysis and Notation Guide

mt2

Uploaded by

Ankit Abhinav
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views8 pages

Asymptotic Analysis and Notation Guide

mt2

Uploaded by

Ankit Abhinav
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

A Theorist’s Toolkit (CMU 18-859T, Fall 2013)

Lecture 1: Asymptotics
September 9, 2013
Lecturer: Ryan O’Donnell Scribe: Misha Lavrov

In addition to the book references provided at the end of this document, two chapters of
lecture notes on asymptotics by A.J. Hildebrand can be found at:

• [Link]

• [Link]

1 Asymptotic Notation
We all know that n
X n(n + 1) 1 1
i= = n2 + n.
i=1
2 2 2
If we want to describe the behavior of this function for large n, the quadratic term is the
more important. We can write this as
n
X
i = O(n2 ).
i=1

Formally, the notation f (x) = O(g(x)) (“big-oh of g(x)”) means that for x sufficiently large,
there is a constant C > 0 such that 0 ≤ f (x) ≤ Cg(x).1 This describes the asymptotic
behavior of f (x) as x → ∞. Sometimes we will also talk about asymptotics of some function
f (x) as x → 0+ . In that case, f (x) = O(g(x)) means that there is some x0 > 0 such that for
0 ≤ x ≤ x0 , 0 ≤ f (x) ≤ Cg(x). It will probably be clear from context which one is meant;
generally, a variable named n goes to ∞, while a variable named  goes to 0.
If we use O(g(x)) in an expression, such as f (x) = 2O(g(x)) , what we mean is that O(g(x))
can be replaced with some anonymous function of x which is O(g(x)). For example, we could
write n   
X 1 2 1 2 1
i = n + O(n) = n 1 + O .
i=1
2 2 n
One use of this that deserves special mention is O(1): this is a function of x that is eventually
bounded above by some constant.
In addition to O, there are several other symbols that can be used to say slightly different
things:
1
This is not entirely standard: some people allow −Cg(x) ≤ f (x) ≤ Cg(x).

1
• f (x) = Ω(g(x)) means that g(x) = O(f (x)): in other words, for x sufficiently large/small,
there is a constant C > 0 such that f (x) ≥ Cg(x).

• f (x) = Θ(g(x)) means that f (x) = Ω(g(x)) and f (x) = O(g(x)) simultaneously: for
x sufficiently large/small, there are constants C1 , C2 > 0 such that C1 g(x) ≤ f (x) ≤
C2 g(x).
f (x)
• f (x) ∼ g(x) means that g(x)
→ 1 in the limit. For example, we could write
n
X 1
i ∼ n2 .
i=1
2

f (x) f (x)
• f (x) = o(g(x)) means that g(x)
→ 0, and f (x) = ω(g(x)) means that g(x)
→ ∞. For
example, we could write
n
X 1
i = n2 (1 + o(1)).
i=1
2

• f (x) ≤ poly(g(x))) means that f (x) = g(x)O(1) : f (x) is bounded by some polynomial
function of g(x).

• f (x) = O(g(x))
e means that f (x) ≤ g(x) · poly(log g(x)): we forget about some polyno-
mial in log g(x), which is insignificant compared to g itself. For example, we can write
n5 · 3n = O(3
e n ). Note that n5 · 3n 6= O(2e n ): the difference between 2n and 3n is too
big. We can put a tilde on other things as well, such as Θ(g(x)).
e

• If x → 0+ , then f (x) = O(g(x))


e instead means f (x) ≤ g(x) · poly(log 1/g(x)), since
g(x) is probably something small. For example,  · log2 (1/) = O().
e

2 The Harmonic Number


Let Hn = 1 + 12 + · · · + n1 . This is called the n-th harmonic number. It comes up in many
applications including, for example, the coupon collector problem: if there are n different
coupons, and you pick coupons uniformly at random with replacement, you will need to look
at n · Hn coupons (in expectation) before you have seen all of them at least once.
We can get a simple upper bound for Hn as follows: if k = dlog2 (n + 1)e, then
1 1 1
Hn = 1 + + + ··· +
2 3 n
1 1 1
≤ 1 + + + ··· + k
2 3 2 −1

2
2.0 2.0 2.0

1.5 1.5 1.5

1.0 1.0 1.0

0.5 0.5 0.5

0.0 0.0 0.0


0 1 2 3 4 5 6 0 1 2 3 4 5 6 0 1 2 3 4 5 6
Z n
1 1 1 1 1 1
(a) + + ··· + (b) dx (c) 1 + + ··· +
2 3 n 1 x 2 n−1

Figure 1: Comparing the harmonic series to an integral

     
1 1 1 1 1 1 1 1
≤1+ + + + + + + ··· + + ··· + k
2 3 4 5 6 7 2k−1 2 −1
     
1 1 1 1 1 1 1 1
≤1+ + + + + + + ··· + + · · · + k−1
2 2 4 4 4 4 2k−1 2
≤ 1| + 1 +{z· · · + 1} = k.
k

Therefore Hn ≤ dlog2 (n + 1)e ∼ log2 n.


A similar technique can give us a lower bound on Hn : if k = blog2 nc, then
1 1 1
Hn = 1 + + + ··· +
2 3 n
1 1 1
≥1+ + + ··· + k
2 3
 2    
1 1 1 1 1 1 1 1 1
≤1+ + + + + + + + ··· + + ··· + k
2 3 4 5 6 7 8 2k−1 + 1 2
     
1 1 1 1 1 1 1 1 1
≤1+ + + + + + + + ··· + k
+ ··· + k
2 4 4 8 8 8 8 2 2
1 1 1 1
≤1+ + + · · · + = 1 + k.
|2 2 {z 2} 2
k

Therefore Hn ≥ 1 + 12 blog2 nc ∼ 12 log2 n. This is already enough to conclude Hn = Θ(log n).


(Note that the exact base of the log is irrelevant in big-Θ notation.)
In this class, this may be the last time you see the base 2 in log2 n written explicitly; we
will assume log n is log2 n (which is sometimes also written as lg n); for the natural log, we
will write ln n (pronounced “lon enn”).
What should we do if we want to Pnfigure out how big Hn is more precisely?
1
R n 1We can1 approximate the sum i=1 i by an integral. A lower bound for the integral
1 x
dx is 2 + 3 + · · · + n = Hn − 1, and an upper bound is 1 + 21 + · · · + n−1
1 1 1
= Hn−1 (see
Figure 1).

3
1
Since the antiderivative of x
is ln x, the integral is ln n − ln 1 = ln n, so we have

Hn − 1 ≤ ln n ≤ Hn−1 ⇔ ln(n + 1) ≤ Hn ≤ 1 + ln n.

We can write down a slightly less accurate but prettier estimate: ln n ≤ Hn ≤ 1 + ln n. In


particular, Hn ∼ ln n.
What is the error when we replace ln(n + 1) by ln n?
  
1
ln(n + 1) = ln n · 1 +
n
 
1
= ln n + ln 1 +
n
 
1
= ln n + Θ .
n

The last step follows from the Taylor expansion of ln x: for −1 < x ≤ 1.

x 2 x3
ln(1 + x) = x − + − ··· .
2 3
Since x in our case is n1 , which goes to 0, the first term is most significant, and we can
approximate ln 1 + n1 by n1 . The error bound in Taylor’s theorem states that ln 1 + n1 =
2
1 + n1 − ξ2 for some 0 ≤ ξ ≤ n1 , so actually we can say ln(n + 1) = ln n + n1 − O n12 .


This is related to one of the most useful asymptotic approximations you will use: ex is
approoximately 1 + x for small x. To be more precise,

x2 x3
ex = 1 + x + + + ···
2! 3!
= 1 + x + O(x2 ).

The Taylor expansion holds for all x, but only for small x is x2 less significant than x.
However, it’s true for all x that ex ≥ 1 + x.
Although we will not prove this, Hn can actually be described even more precisely:
 
1
Hn = ln n + γ + O ,
n

where γ ≈ 0.577 is known as the Euler-Mascheroni constant.

3 The Birthday Paradox


Suppose we have m bins which can hold balls, and we chuck n balls into the bins at random
(to be precise, each ball chooses a bin uniformly at random, and the choices are independent).
What is the probability that no two balls land in the same bin?

4
In the case m = 366, we get the birthday paradox: if you have n people, what is the
probability that two will share a birthday? The reason it’s a “paradox” is that the probability
is surprisingly large: although only n = 367 guarantees that two people share a birthday, in
a group of 30 people the probability is already over 70%.2
In general, the probability of no collisions between n balls thrown into m bins is
    
1 2 n−1
pn,m = 1 − 1− ··· 1 − .
m m m

A reasonable question to ask is: for what n, as a function of m, is pn,m ≈ 21 ?


By using the inequality ex ≥ 1 + x from the previous section, we can get a simple upper
bound:

pn,m ≤ e−1/m · e−2/m · · · e−(n−1)/m


 
1 2 n−1
= exp − − − ··· −
m m m
 
n(n − 1)
= exp − .
2m
 
n2
The n − 1 is somewhat annoying: we’d like to write pn,m ≤ exp − 2m
, but this doesn’t
quite follow from the above. In any case, if we believe that this inequality is close to being
the truth, we could answer the question “when is pn,m ≈ 12 ” by solving this for n, obtaining
√ √
n ∼ 2 ln 2 · m.
We also want a lower bound. The inequality 1 + x ≤ ex came from ln(1 + x) ≤ x, or
ln x ≤ x − 1. By extending the Taylor series for ln x, we can get a matching lower bound
that looks like 1 + x ≥ exp(x − O(x2 )). It follows that:
 2 
(n − 1)2
      
1 1 2 2 n−1
pn,m ≥ exp − − O exp − − O · · · exp − −O
m m2 m m2 m m2
   
1 2 n−1 1 2 2 2

= exp − − − ··· − exp O 1 + 2 + · · · + (n − 1)
m m m m2
    3 
n(n − 1) n
= exp − exp −O .
2m m2

We get the same thing we got in the upper bound,


 multiplied by an errorfactor.
 But when
√ 
1

1
n = Θ( m), this error factor looks like exp −O √m , which is 1 − O √m .

The implied constant in n√ = Θ( m) doesn’t matter too much. But it often helps to
know that when there are O( m) balls and m bins, you probably get a collision.
2
In our class, Alex Kazachkov shares a birthday with me (Misha Lavrov).

5
2.0 2.0 2.0

1.5 1.5 1.5

1.0 1.0 1.0

0.5 0.5 0.5

0.0 0.0 0.0


0 1 2 3 4 5 6 0 1 2 3 4 5 6 0 1 2 3 4 5 6

Rn 1
(a) ln 2 + · · · + ln n (b) 1 ln x dx (c) 2 ln n

Figure 2: Comparing the sum ln 1 + ln 2 + · · · + ln n to an integral

4 Stirling’s Formula
Recall that n! = n(n − 1)(n − 2) · · · · · 2 · 1. How large is n!, asymptotically?
A very simple upper bound for n!: we can replace every factor by n, getting n! ≤ nn .
For a lower bound, we could try several things:
• All factors except 1 are at least 2, so n! ≥ 2n−1 .
 dn/2e n n/2
• The first 21 n factors are at least 12 n , so n! ≥ n2
    
≥ 2
.
Because all of these are very large products, we can get a better sense of how large they are
by taking logs. We can then conclude that
 n n/2
n−1
2 ≤ ≤ n! ≤ nn
2
because
n n
(n − 1) ln 2 ≤
ln ≤ ln(n!) ≤ n ln n.
2 2
This is already enough to say that ln(n!) = Θ(n ln n), and therefore n! = 2Θ(n log n) .
To get a better estimate, we can expand

ln(n!) = ln 2 + ln 3 + · · · + ln n.

Just like we did for harmonic numbers, we can approximate this sum R n by an integral. The
sum ln 2+· · ·+ln n (in Figure 2a) is an over-estimate of the integral 1 ln x dx (in Figure 2b).
The antiderivative of ln x is x ln x − x, so we conclude

ln(n!) ≥ (n ln n − n) − (1 ln 1 − 1) = n ln n − n + 1.

To get an upper bound on ln(n!), we can subtract off the triangles in Figure 2c, which
will leave an under-estimate of the integral. Looked at sideways, the triangles have height 1
and their bases sum to ln n, so their area is 21 ln n, and therefore

1
ln(n!) ≤ n ln n − n + ln n + 1.
2
6
If we take these bounds and exponentiate, we get:
 n n √  n n
e· ≤ n! ≤ e n · .
e e
e n n .
 
In particular, n! = Θ e
We can still hope to do better. The error in our upper bound is the area of the slivers
that form the overlap between Figure 2b and Figure 2c. It can be shown3 that the total area
of these slivers is O(1). As a consequence,
√  n n
n! = Θ( n) · .
e
This is known as Stirling’s formula, of which a more precise variant states that
√  n n   
1
n! = 2πn · 1±O .
e n

We will see where the 2π comes from when we talk about the Central Limit Theorem.
We can use Stirling’s formula to estimate the binomial coefficients nk = k!(n−k)! n!

when
k = pn (to simplify notation, let q = 1 − p, so that n − k = qn). As n → ∞ for constant p,
 
n n! n!
= =
k k!(n − k)! (pn)!(qn)!
√ n
2πn ne (1 ± O( n1 ))
=√ pn √ qn qn
2πpn pn 1 1

e
2πqn e
(1 ± O( pn ))(1 ± O( qn ))
 n
1 1 1
∼√ ·√
2πn pq pp q q
1 1
=√ · √ · 2H(p)n
2πpq n

where H(p) = p log p1 + q log 1q . This quantity is of independent interest and is known as the
binary entropy of p. In particular, H(1/2) = 1, so we can conclude
n
 r  
n/2 2 1
∼ =Θ √ .
2n πn n

When n is even, this is the probability that if n fair coins are flipped, exactly half will come
up heads.
3
Cut the k-th triangle by the tangent line to ln x at k + 1. This bounds the k-th sliver in a smaller
1
triangle with height 1 and base ln(k + 1) − ln k − k+1 . The sum of the areas of the first n triangles is
1 1
therefore 2 (ln(n + 1) − Hn+1 + 1), which converges to 2 (1 − γ) as n → ∞.

7
When k = o(n), things are slightly different: the error factor depending on pn is no longer
comparable to the one depending on n. In this case, it’s more accurate to approximate nk
k
by nk! . The error analysis is as follows:
 
n n(n − 1)(· · · )(n − k + 1)
=
k k!
k
    
n 1 2 k−1
= 1− 1− ··· 1 − .
k! n n n

The error factor is exactly the probability pk,n we looked


√ at in the Birthday
√ Paradox section.
We know that pk,n has a constant value for k = Θ( n), so for k = o( n), pk,n = 1 − o(1),
and we have
nk
 
n
∼ .
k k!
Since n(n − 1)(· · · )(n − k + 1) ≤ nk , this is also an upper bound for all k.

References
[DB70] Nicolaas Govert De Bruijn. Asymptotic Methods in Analysis, volume 4. Dover
Publications, 1970.

[GKP94] Ronald L Graham, Donald E Knuth, and Oren Patashnik. Concrete Mathematics:
A Foundation for Computer Science. Addison-Wesley Longman Publishing Co.,
Inc., 1994.

You might also like