Decision and Game Theory Insights
4. Maximize expected outcome: Choose the option with the best expected outcome.
... decisions, often influenced by biases, emotions, and cognitive limitations.
✓ Growing field of behavioral decision theory that studies these human deviations from rationality.
✓ Decision analysis is a process that provides a structured method with analytical tools designed to improve one’s decision-making skills. It provides a conceptual framework that allows the problem to be broken down into smaller and easier sub-problems.
✓ Game Theory studies strategic interactions where the outcome depends not just on a single decision maker but on the choices of others.
• multiple objectives, and
• competing viewpoints.
➢ Too many consequences, some of which may have impacts that are hard to evaluate, such as intangible benefits.
Uncertainty
Uncertainty could be the major handicap in making a decision. Future events are beyond the DM’s control and hence could result in good or bad consequences.

Multiple objectives
➢ Progress towards one objective may impede progress in others
✓ Quality vs. cost
✓ Short-term profit vs. long-term profit
✓ Economic benefits against environmental damage
✓ Shareholders’ benefits against employees’
➢ In 1953 Maurice Allais, winner of the Nobel Prize in Economics (1988), published a paper that contradicts the expected utility hypothesis, based on the following gambles:
• Gamble A: Win $1 million (1.0)
• Gamble B: Win $5 million (0.10) + Win $1 million (0.89) + Win $0 (0.01)
• Gamble C: Win $1 million (0.11) + Win $0 (0.89)
• Gamble D: Win $5 million (0.10) + Win $0 (0.90)
✓ In numerous experiments, most people prefer A to B and D to C.
Are these choices consistent with one another?
• The expected values of A and B are
EV(A) = 1.0*$1m = $1m
EV(B) = 0.1*$5m + 0.89*$1m + 0.01*$0m = $1.39m
• So if someone prefers A over B, this supposes a maximization of the expected utility and not of the expected value. Which means:
U($1m) > 0.1*U($5m) + 0.89*U($1m) + 0.01*U($0m)
• The expected values of C and D are
EV(C) = 0.11*$1m + 0.89*$0m = $110,000
EV(D) = 0.1*$5m + 0.90*$0m = $500,000
• So if someone prefers D over C, this supposes a maximization of the expected value.
• If we choose A over B and D over C, we are violating expected utility theory. Let’s get a better understanding of the paradox by rearranging how we have written out these gambles.
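The violation can be made explicit with a short derivation. This is a sketch under one assumption: Gamble D is taken to be "Win $5 million (0.10) + Win $0 (0.90)", as implied by the EV(D) formula. Amounts are written in millions, so U(1) stands for U($1m).

```latex
\begin{align*}
A \succ B &\iff U(1) > 0.10\,U(5) + 0.89\,U(1) + 0.01\,U(0) \\
          &\iff 0.11\,U(1) > 0.10\,U(5) + 0.01\,U(0), \\[4pt]
D \succ C &\iff 0.10\,U(5) + 0.90\,U(0) > 0.11\,U(1) + 0.89\,U(0) \\
          &\iff 0.10\,U(5) + 0.01\,U(0) > 0.11\,U(1).
\end{align*}
```

The two final inequalities are opposites, so no single utility function U can support both preferences at once.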
The framing of the problem confuses participants
As originally presented:
• Gamble A: Win $1 million (1.0)
• Gamble B: Win $5 million (0.10) + Win $1 million (0.89) + Win $0 (0.01)
• Gamble C: Win $1 million (0.11) + Win $0 (0.89)
Rearranged:
• Gamble A: Win $1 million (0.11) + Win $1 million (0.89)
• Gamble B: Win $5 million (0.10) + Win $0 (0.01) + Win $1 million (0.89)
• Gamble C: Win $1 million (0.11) + Win $0 (0.89)

Decision and Game Theory
Chapter 1: Problem Formulation
Sonia REBAI - M. Naceur Azaiez
The first step in the decision analysis process is the problem formulation. We begin with a verbal statement of the problem. We then identify the following elements:
• the objective to optimize,
• the decision alternatives,
• the uncertain future events, referred to as chance events or states of nature,
• the consequences associated with each decision alternative and each chance event outcome, referred to as the outcome (gain or loss) function.

Setting objectives
An objective is a specific goal that a decision-maker attempts to reach.
Setting the right objectives is vital for a decision problem.
• It should be measurable to evaluate the alternatives leading to different levels of the objective.
• Knowing the right objectives helps in identifying the adequate alternatives to consider.
• It also dictates the way to measure the results and the type of uncertainty inherent in the problem.
• Some objectives may be related; others may serve for more important ones. The DM should distinguish between what is important for an objective and what is the ultimate objective.
• It is possible to have a hierarchy of objectives. For example, by having a university degree, one may get a good job that can generate a high salary which can improve the quality of life of the DM. Hence, improving the quality of life is the ultimate objective in this hierarchy.
• It is sometimes difficult to understand the objective itself. For instance, business firms seek “success” or “excellence”.
✓ How to measure it accurately?
✓ Would profit maximization, cost reduction, or operating according to international standards be good indicators to track progress towards such an objective?
Identifying alternatives
Once the decision situation is clearly identified and the objectives are adequately set, one may attempt to discover and create alternatives.
• Some of the alternatives may arise in a straightforward way: e.g., accept/reject, select among a list. Others may be obtained only after some careful consideration and analysis of the problem.
• Among possible actions to consider:
✓ Wait & see: in order to gain more information before acting.
✓ Do nothing: in order not to end up worse off than in the current situation.
✓ Pay for safety or insurance: in order to get rid of risk.
• The DM should make sure that the identified actions would serve the final objective(s) either directly or indirectly.
• In complex situations, some alternatives are less obvious to identify. This is the case where alternatives are of a technical nature: legal procedures, financial solutions, reforms, engineering methods, medical procedures. These would be obtained by consulting the right experts.
• Other alternatives may be less technical yet require deep thinking. These may be identified by calling meetings, taking more time to think, and employing competencies & know-how. Among scientific tools used are ...

Environment of the decision problem
The decision problem environment can be identified through the level of knowledge of the states of nature. Three levels can be distinguished.
[Figure: scale of increasing information: Uncertainty, Risk, Certainty]
✓ Uncertainty
It reflects the extreme case where situations are out of the control of the decision-maker without knowledge on the likelihood of their occurrence
✓ Risk
The available information is used to assess a probability distribution of the occurrences of the situations that are out of the control of the DM

In general, it is not necessary that the problem falls in one of the three cases above. However, the modeling approach would dictate the environment as a choice such as investigating the uncertainty or totally neglecting it.

Uncertain Environment
Example 1
A company manufacturing electronic components realizes that the computer market is growing rapidly. It has reached the maximum of its production for the moment and can no longer meet any additional demand.
The managers, whose policy is to satisfy the demand, are considering the opportunity of purchasing a machine from two preselected ones or moving towards subcontracting.
The possible results in million TD are estimated as follows:

                  High         Medium       Weak         No
                  Additional   Additional   Additional   Additional
                  Demand       Demand       Demand       Demand
Machine 1         0.7          0.3          -0.1         -0.3
Machine 2         0.5          0.4          0.1          -0.2
Subcontracting    0.3          0.2          0.3          0

Alternatives
• A1: Buying machine 1
• A2: Buying machine 2
• A3: Subcontracting
States of Nature
• E1: High Additional Demand
• E2: Medium Additional Demand
• E3: Weak Additional Demand
• E4: Null Additional Demand
[Figure: decision tree for Example 1. Recoverable branch values: A2 with E3: 0.1 and E4: -0.2; A3 with E1: 0.3, E2: 0.2, E3: 0.3, E4: 0]
• Decision nodes are represented by squares and
• Probability nodes are represented by circles.
➢ The different results are indicated at the extremities of the tree.
• In this chapter we consider approaches that do not require knowledge of the probabilities of the states of nature.
• These approaches are appropriate in situations in which a simple best-case and worst-case analysis is desirable.
• Because different approaches sometimes lead to different decision recommendations, the DM needs to select the specific approach that, according to him, is the most appropriate.

Optimistic approach
• The DM assumes that each alternative will generate the best possible result, then he chooses the one that gives him the best result among the most favorable results.
• If the objective of the problem is a maximization, the DM uses the Max-max criterion.
• If the objective is a minimization, the DM uses the Min-min criterion.
Max-max table (optimistic approach)
      e1    e2    e3    e4    Max
a1    250   350   350   400   400
a2    225   300   380   420   420
a3    200   200   400   500   500   Max

Conservative (Pessimistic) approach
• A DM assumes that each decision alternative generates the worst payoff, then he chooses the action that gives him the best result among the least favorable ones (Wald's criterion).
• If the objective of the problem is a maximization of the payoff, the DM uses the Max-min criterion.
Max-min table (conservative approach)
      e1    e2    e3    e4    Min
a1    250   350   350   400   250   Max
a2    225   300   380   420   225
a3    200   200   400   500   200

Hurwicz criterion
• Most decision makers are not totally pessimistic or optimistic.
• To measure the degree of optimism of such a DM, we use a scale α ∈ [0,1], where
✓ 0 indicates a totally pessimistic DM.
Payoff table
      e1    e2    e3    e4
a1    250   350   350   400
a2    225   300   380   420
a3    200   200   400   500

Regret table
      e1    e2    e3    e4    Max
a1    0     0     50    100   100
a2    25    50    20    80    80    Min
a3    50    150   0     0     150

Remark
• In all the above criteria the maximum and/or the minimum payoff is used as a basis for decision
• Except for the regret criterion, all other entries are ignored in choosing the “best” alternative
• Note that these extreme payoffs that were used for decision might be extremely unlikely
Laplace criterion
• A rational DM tries to incorporate all the information into the evaluation process. He uses the average payoff over all entries.
• Because no probabilities are identified, all payoffs are considered as equally likely.
• The criterion selects the best average (largest when maximizing a payoff and smallest when minimizing it)

      e1    e2    e3    e4    Mean
a1    250   350   350   400   337.5    Max
a2    225   300   380   420   331.25
a3    200   200   400   500   325
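The criteria above can be checked on the payoff table with a few lines of Python. This is a minimal sketch, not part of the original slides: the helper names are illustrative, the Hurwicz score is written in its usual form (alpha times the best payoff plus (1 - alpha) times the worst), and alpha = 0.5 is an arbitrary value chosen only for illustration.

```python
# Payoff table from the slides (rows a1..a3, columns e1..e4), maximization case.
payoffs = {
    "a1": [250, 350, 350, 400],
    "a2": [225, 300, 380, 420],
    "a3": [200, 200, 400, 500],
}

def best(scores, maximize=True):
    """Return the alternative with the best score."""
    pick = max if maximize else min
    return pick(scores, key=scores.get)

maximax = {a: max(row) for a, row in payoffs.items()}
maximin = {a: min(row) for a, row in payoffs.items()}
laplace = {a: sum(row) / len(row) for a, row in payoffs.items()}

# Regret = best payoff in the column minus the payoff actually obtained.
n_states = len(payoffs["a1"])
col_best = [max(row[j] for row in payoffs.values()) for j in range(n_states)]
regret = {a: [col_best[j] - row[j] for j in range(n_states)]
          for a, row in payoffs.items()}
max_regret = {a: max(r) for a, r in regret.items()}

def hurwicz(alpha):
    """Assumed standard form: alpha = 1 is fully optimistic, 0 fully pessimistic."""
    return {a: alpha * max(row) + (1 - alpha) * min(row)
            for a, row in payoffs.items()}

choices = {
    "maximax": best(maximax),
    "maximin": best(maximin),
    "laplace": best(laplace),
    "minimax regret": best(max_regret, maximize=False),
    "hurwicz(0.5)": best(hurwicz(0.5)),  # alpha = 0.5 is an arbitrary illustration
}
print(choices)
```

On this table the criteria disagree (a3, a1, a1, a2 and a3 respectively), which illustrates why the DM has to choose the criterion that fits his attitude.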
      e1    e2    e3    e4
a1    250   350   350   400   350   Max
a2    225   300   380   420   300
a3    200   200   400   500   200

      e1     e2     e3     e4
      (0.2)  (0.4)  (0.1)  (0.3)  EMV
a1    250    350    350    400    345   Max
a2    225    300    380    420    329
a3    200    200    400    500    310
EOL (expected opportunity loss) table
      S1     S2     S3     S4
      (0.2)  (0.4)  (0.1)  (0.3)  EOL
a1    0      0      50     100    35    Min
a2    25     50     20     80     51
a3    50     150    0      0      70

... where Z is a random variable representing payoff and k is a predefined constant reflecting the decision-maker's risk aversion factor. It is used as a weight indicating the degree of importance of the variance of Z in relation to its expected value. For example, a decision maker who is particularly sensitive to results far below EV will choose a value for k greater than 1.
Note that EMV and EOL yield the same optimal decision.
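The agreement between EMV and EOL can be verified numerically. The sketch below uses the payoff table and the probabilities (0.2, 0.4, 0.1, 0.3) from the slides; the variable names are illustrative, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

probs = [Fraction(2, 10), Fraction(4, 10), Fraction(1, 10), Fraction(3, 10)]
payoffs = {
    "a1": [250, 350, 350, 400],
    "a2": [225, 300, 380, 420],
    "a3": [200, 200, 400, 500],
}

def expected(values):
    """Probability-weighted average over the states of nature."""
    return sum(p * v for p, v in zip(probs, values))

# Expected monetary value of each alternative.
emv = {a: expected(row) for a, row in payoffs.items()}

# Opportunity loss (regret) per state, then its expectation.
col_best = [max(row[j] for row in payoffs.values()) for j in range(len(probs))]
eol = {a: expected([col_best[j] - row[j] for j in range(len(probs))])
       for a, row in payoffs.items()}

best_by_emv = max(emv, key=emv.get)
best_by_eol = min(eol, key=eol.get)

# EMV + EOL is the same constant for every alternative (the expected payoff
# when the best action is taken in each state), so maximizing EMV and
# minimizing EOL must select the same alternative.
totals = {a: emv[a] + eol[a] for a in payoffs}
print(best_by_emv, best_by_eol, {a: float(t) for a, t in totals.items()})
```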
Pennzoil-Getty negotiations.
Pennzoil won the case; in late 1985, it was awarded $11.1 billion, the largest judgment ever in the United States at that time. A Texas appeals court reduced the judgment by $2 billion, but interest and penalties drove the total back up to $10.3 billion. James Kinnear, Texaco’s chief executive officer, had said that Texaco would file for bankruptcy if Pennzoil obtained court permission to secure the judgment by filing liens against Texaco’s assets. Furthermore, Kinnear had promised to fight the case all the way to the U.S. Supreme Court if necessary, arguing in part that Pennzoil had not followed Securities and Exchange Commission regulations in its negotiations with Getty.

In April 1987, just before Pennzoil began to file the liens, Texaco offered to pay Pennzoil $2 billion to settle the entire case. Hugh Liedtke, chairman of Pennzoil, indicated that his advisors were telling him that a settlement between $3 and $5 billion would be fair.

What do you think Liedtke should do? Should he accept the offer of $2 billion, or should he refuse and make a firm counteroffer? If he refuses the sure $2 billion, he faces a risky situation. Texaco might agree to pay $5 billion, a reasonable amount in Liedtke’s mind. If he counteroffered $5 billion as a settlement amount, perhaps Texaco would counter with $3 billion or simply pursue further appeals.
Below is a decision tree that shows a simplified version of Liedtke’s problem.

Experts expect that the Supreme Court will keep the fine with only a 20% chance, will reduce it to $5 billion with a 50% chance, or will eliminate it completely with a 30% chance.
It was also believed that Texaco would accept a counter-offer of $5 billion with a 1/6 chance and would place a counter-offer of $3 billion with a 1/3 chance.
Find the optimal strategy.
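The tree can be rolled back in a few lines. This is a sketch under stated assumptions: amounts are in billions of dollars, the probability that Texaco simply refuses is taken as the remainder 1 - 1/6 - 1/3 = 1/2, and the variable names are illustrative.

```python
# Final court decision: (probability, award in $ billion).
court = [(0.2, 10.3), (0.5, 5.0), (0.3, 0.0)]

def expected_value(lottery):
    """Expected value of a list of (probability, amount) pairs."""
    return sum(p * x for p, x in lottery)

court_ev = expected_value(court)

# Second decision: if Texaco counteroffers $3 billion, Liedtke either
# accepts 3 or refuses and lets the court decide.
after_counter_choice = "refuse" if court_ev > 3.0 else "accept $3 billion"
after_counter_value = max(3.0, court_ev)

# Texaco's reaction to a $5 billion counteroffer.
p_accept, p_counter = 1 / 6, 1 / 3
p_refuse = 1 - p_accept - p_counter  # assumed: the remaining probability

counteroffer_ev = (p_accept * 5.0
                   + p_refuse * court_ev
                   + p_counter * after_counter_value)
accept_ev = 2.0  # the sure $2 billion

first_choice = ("counteroffer $5 billion" if counteroffer_ev > accept_ev
                else "accept $2 billion")
print(first_choice, round(counteroffer_ev, 2), after_counter_choice)
```

With these inputs the court lottery is worth about 4.56 and the counteroffer about 4.63, so the sketch recommends counteroffering and then refusing a $3 billion counteroffer.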
The decision tree shows that his best current choice is to make the $5 billion counteroffer with an expected payoff of $4.63 billion. The decision tree shows clearly what Liedtke should do if Texaco counteroffers $3 billion: He should refuse.

Example 2
A government is inviting bids for the exploitation of an oil field. A company plans to make an offer of 110 million TD. The company estimates that it has a 60% chance of winning the contract. If it wins the contract it can choose between three methods for exploiting the field: the new method (NM), the existing method (EM), or subcontracting (SC). The outcomes are provided below.
The cost of preparing and submitting the bid is 2 million TD. If the company does not bid, it can invest in another alternative that guarantees a return of 30 million TD. Build a decision tree and determine the company's optimal strategy.
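[Figure: decision tree for Example 2, values in million TD.
Do not submit an offer: 30.
Submit an offer (-2): Win (0.6), pay the bid (-110), then choose among
  NM: Moderate Success (0.6): 300; Failure (0.1): -100; value shown at the NM node: 350
  EM: Success (0.5): 300; Moderate Success (0.3): 200; Failure (0.2): -40; value shown at the EM node: 202
  SC: Moderate Success (1): 250
Lose (0.4): 0.
Rolled-back values shown on the tree: 144 at the Win/Lose node and 142 for submitting an offer.]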
Risk Profiles
✓ EMV alone does not tell us the whole story; it does not inform us about how much variation there is in the consequences.
✓ To help us choose the best alternative, we should consider both the EMV and the set of possible consequences for each alternative.

Strategy: “Counteroffer $5 Billion; Accept $3 Billion” (EMV $4.12 Billion)
Outcome          Probability
$0               15%
$3 Billion       33%
$5 Billion       42%
$10.3 Billion    10%

[Figure: Risk profile for the “Accept $2 Billion” alternative]
[Figure: Risk profile for the “Counteroffer $5 Billion; Refuse Texaco Counteroffer” strategy]
[Figure: Risk profile for the “Counteroffer $5 Billion; Accept $3 Billion” strategy]

✓ Risk profiles can be calculated for strategies that might not have appeared as optimal in an expected-value analysis.
✓ For example, comparing the two previous figures indicates that the strategy “Counteroffer $5 Billion; Accept $3 Billion,” which we ruled out on the basis of EMV, yields a smaller chance of getting nothing, but also less chance of a $10.3 billion judgment. Compensating for this is the greater chance of getting something in the middle: $3 or $5 billion.
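A risk profile is just the probability distribution of the final payoff under a fixed strategy. The sketch below computes it for the two counteroffer strategies; it assumes the rounded probabilities 0.17, 0.50 and 0.33 for Texaco's reaction, and the function and key names are illustrative.

```python
from collections import defaultdict

# Assumed rounded probabilities of Texaco's reaction to a $5 billion counteroffer.
texaco = {"accept": 0.17, "refuse": 0.50, "counter3": 0.33}
# Final court award in $ billion -> probability.
court = {10.3: 0.2, 5.0: 0.5, 0.0: 0.3}

def risk_profile(accept_three):
    """Distribution of the payoff when Liedtke counteroffers $5 billion.

    accept_three says what he does if Texaco counteroffers $3 billion.
    """
    profile = defaultdict(float)
    profile[5.0] += texaco["accept"]                  # Texaco pays 5
    for amount, p in court.items():                   # Texaco refuses: court decides
        profile[amount] += texaco["refuse"] * p
    if accept_three:
        profile[3.0] += texaco["counter3"]            # settle at 3
    else:
        for amount, p in court.items():               # refuse 3: court decides
            profile[amount] += texaco["counter3"] * p
    return dict(profile)

accept3 = risk_profile(accept_three=True)
refuse3 = risk_profile(accept_three=False)
emv_accept3 = sum(x * p for x, p in accept3.items())

for name, prof in (("accept $3B", accept3), ("refuse $3B", refuse3)):
    print(name, {x: round(p, 3) for x, p in sorted(prof.items())})
```

Comparing the two dictionaries shows the trade-off described above: accepting $3 billion lowers both the chance of $0 and the chance of $10.3 billion.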
• Most often, additional information is obtained through experiments, consulting advice, or a forecast.
• Such information is usually acquired at some cost. The question is whether the information is worth the cost paid.
• To answer this question we need to calculate the Expected Value of the additional information.

A company has the choice between producing itself or buying from a supplier one of the electronic components that it uses in its activity. Net profit depends on the level of demand for the product requiring this component. This demand can be low, medium or high. The a priori distribution is estimated at 0.35, 0.35 and 0.30.

           S1            S2              S3
           Low Demand    Medium Demand   High Demand
Produce    -20           40              100
Buy        10            45              70
The company has the opportunity to test the demand on the market. Two results are possible for this test: favorable (F) or unfavorable (U). The following conditional probabilities were estimated:

              S1             S2             S3
Favorable     P(F|S1)=0.10   P(F|S2)=0.40   P(F|S3)=0.60
Unfavorable   P(U|S1)=0.90   P(U|S2)=0.60   P(U|S3)=0.40

Find the optimal strategy to adopt.

[Figure: decision tree. Survey, then result F or U, then Produce (S1: -20, S2: 40, S3: 100) or Buy (S1: 10, S2: 45, S3: 70); No Survey, then Produce or Buy with the same payoffs]
Bayes’ theorem
P(Ij | Si): Conditional Probability
P(Ij ∩ Si) = P(Si ∩ Ij): Joint Probability

     P(Ij|S1)  P(Ij|S2)  P(Ij|S3)   P(Ij∩S1)  P(Ij∩S2)  P(Ij∩S3)   P(Ij)    P(S1|Ij)  P(S2|Ij)  P(S3|Ij)
F    0.1       0.4       0.6        0.035     0.14      0.18       0.355    0.09859   0.39437   0.50704
U    0.9       0.6       0.4        0.315     0.21      0.12       0.645    0.48837   0.32558   0.18605
[Figure: decision tree with posterior probabilities. After an unfavorable result (U): Produce with S3 (0.18605): 100; Buy with S1 (0.48837): 10, S2 (0.32558): 45, S3 (0.18605): 70, value 32.56. Survey branch value: 43.90. No Survey: Produce with S1 (0.35): -20, S2 (0.35): 40, S3 (0.30): 100, value 37; Buy with S1 (0.35): 10, S2 (0.35): 45, S3 (0.30): 70, value 40.25]

• EVII = 43.90 – 40.25 = 3.65
• So if the cost paid for this information is less than 3.65, it is worth acquiring. Otherwise, the information is not worth its cost.
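The posterior probabilities and the EVII can be reproduced with a short script. This is a minimal sketch using the prior, payoff and conditional probability tables of the produce-or-buy example; function and variable names are illustrative.

```python
prior = [0.35, 0.35, 0.30]                      # P(S1), P(S2), P(S3)
payoff = {"Produce": [-20, 40, 100], "Buy": [10, 45, 70]}
likelihood = {"F": [0.10, 0.40, 0.60],          # P(F | Si)
              "U": [0.90, 0.60, 0.40]}          # P(U | Si)

def best_action(probs):
    """Return (expected payoff, action) of the EMV-maximizing action."""
    return max((sum(p * v for p, v in zip(probs, row)), action)
               for action, row in payoff.items())

no_info_value, no_info_action = best_action(prior)

plan = {}
with_info_value = 0.0
for signal, like in likelihood.items():
    joint = [l * p for l, p in zip(like, prior)]   # P(signal and Si)
    marginal = sum(joint)                          # P(signal)
    posterior = [j / marginal for j in joint]      # Bayes: P(Si | signal)
    value, action = best_action(posterior)
    plan[signal] = action
    with_info_value += marginal * value

evii = with_info_value - no_info_value
print(plan, round(with_info_value, 2), round(evii, 2))
```

The resulting plan is to produce after a favorable result and to buy after an unfavorable one.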
• Additional information reduces risk in decision making. It can in some extreme situations completely remove the risk and provide a certainty environment. In general, such information is very expensive.
• How to evaluate the value of perfect information (EVPI)?
• The idea behind EVPI is that if the state of nature that will occur is known with certainty, then the best alternative can be determined with certainty as well.
• The value of EVPI is simply the expected value under certainty minus the expected value under uncertainty.
EVPI = Expected payoff under certainty - Expected payoff with no information
• To compute the expected value under certainty simply take the best payoff under each state of nature and multiply it by its prior probability and sum these.
• EVPI places an upper bound on what one would pay for additional information.
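The recipe above translates directly into code. The sketch below applies it to the produce-or-buy example (prior 0.35, 0.35, 0.30); the variable names are illustrative.

```python
prior = [0.35, 0.35, 0.30]
payoff = {"Produce": [-20, 40, 100], "Buy": [10, 45, 70]}

# Expected payoff under certainty: best payoff in each state, weighted by the prior.
ev_under_certainty = sum(
    p * max(row[j] for row in payoff.values())
    for j, p in enumerate(prior)
)

# Expected payoff with no information: best EMV over the alternatives.
emv = {a: sum(p * v for p, v in zip(prior, row)) for a, row in payoff.items()}
best_emv = max(emv.values())

evpi = ev_under_certainty - best_emv
print(round(ev_under_certainty, 2), round(best_emv, 2), round(evpi, 2))
```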
• The efficiency of the imperfect information is the ratio of EVII to EVPI.
• As the EVPI provides an upper bound for the EVII, efficiency is always a number between 0 and 1.
• The efficiency of the survey = EVII/EVPI = (3.65)/(9) = 40.556%

... probabilities of 20%, 30%, and 50%. Correspondingly, the estimated profits under these scenarios are 100,000 TD, 60,000 TD, and a loss of 20,000 TD, respectively.
1. Say whether the introduction of the new product would be beneficial for Company A.
Company A can gather additional information about the state of the market at a cost of 5,000 TD. Historical data indicates a 50% probability of accurately predicting a given situation, and a 25% likelihood of misjudging one of the two remaining scenarios.
2. What is your advice to Company A?
Company B has put forth an offer to buy all of Company A's production at 12 TD per unit. Company A may accept and receive a profit of 25,000 TD. Alternatively, Company A has the option to counter with a price of 15 TD per unit. If Company B consents to this counter-offer, Company A stands to earn a profit of 30,000 TD. Company B might also propose a revised offer of 13 TD per unit, enabling Company A to collect a profit of 27,000 TD. If negotiations fail, Company B has the right to withdraw its offer entirely, leaving Company A without an avenue to introduce its product to the market. The prevailing belief is that there's a 40% likelihood for the first event, a 50% chance for the second, and a 10% probability for the third event to occur.
3. Assist Company A in making a decision.
4. Determine the expected value of perfect information in relation to Company B's response to Company A's potential counter-offer.
Example 3: Marketing a new product
Assume that a feasibility study at Getz company of a new product led to encouraging the introduction of this product to the market. The management of Getz is not sure whether a large plant or a small one should be built to manufacture the product. The relevant data is presented in the table below.

                           Favorable market   Unfavorable market
                           (0.5)              (0.5)
Construct a large plant    $200,000           -$180,000
Construct a small plant    $100,000           -$20,000
Do nothing                 $0                 $0

1. A marketing research company requests $65,000 for perfect information. How much is this information worth?
2. Getz has the possibility to conduct a survey for $10,000. The survey will result in a favorable or an unfavorable prediction. Past experience indicates a 70% probability of predicting favorable market conditions when the market is favorable, and a 20% probability when the market is unfavorable. What should be the optimal strategy of the company?
3. Calculate the efficiency of the survey
                           Favorable market   Unfavorable market   EMV
                           (0.5)              (0.5)
Construct a large plant    $200,000           -$180,000            $10,000
Construct a small plant    $100,000           -$20,000             $40,000
Do nothing                 $0                 $0                   $0

1. EVPI = Expected Value Under Certainty - Max(EMV)
= ($200,000*0.50 + 0*0.50) - $40,000
= $100,000 - $40,000 = $60,000
So Getz should not be willing to pay more than $60,000

2. Posterior distribution calculations

States     FAV      UNF      Joint    Joint    The     P(FAV│“.”)   P(UNF│“.”)
(prior)    (0.5)    (0.5)    FAV      UNF      sum
Survey
“FAV”      0.7      0.2      0.35     0.10     0.45    0.78         0.22
“UNF”      0.3      0.8      0.15     0.40     0.55    0.27         0.73
[Figure: decision tree for the Getz example, with two decision points; payoffs on the survey branch are net of the $10,000 survey cost.
Survey, “FAV” (0.45): large plant $106,400 (Fav. Mkt 0.78: $190,000; Unfav. Mkt 0.22: -$190,000); small plant $63,600 (Fav. Mkt 0.78: $90,000; Unfav. Mkt 0.22: -$30,000); no plant -$10,000.
Survey, “UNF” (0.55): large plant -$87,400 (Fav. Mkt 0.27: $190,000; Unfav. Mkt 0.73: -$190,000); small plant $2,400; no plant -$10,000.
Value of the survey branch: $49,200.
No survey: large plant $10,000 (Fav. Mkt 0.5: $200,000; Unfav. Mkt 0.5: -$180,000); small plant $40,000 (Fav. Mkt 0.5: $100,000; Unfav. Mkt 0.5: -$20,000); no plant $0.]

Hence, if the survey results are favorable, the company should build a large plant. However, if they are unfavorable, it should build a small plant.
EVII = ($49,200 + $10,000) - $40,000 = $19,200
The efficiency of the survey = EVII/EVPI = (19,200)/(60,000) = 32%

Example 3: Texaco vs. Pennzoil
Recall that the decision tree shows that Liedtke's best current choice is to make the $5 billion counteroffer with an expected payoff of $4.63 billion, and that he should refuse if Texaco counteroffers $3 billion.
1. Calculate the EVPI regarding Texaco’s reaction to a counteroffer of $5 billion. Can you explain this result intuitively?
2. The timing of information acquisition may make a difference.
a. Suppose that Pennzoil could obtain information about the final court decision before making its current decision (taking the $2 billion or counteroffering $5 billion). What would be the EVPI of this information?
b. Suppose that Pennzoil knew that it would be able to obtain perfect information only after it has made its current decision but before it would have to respond to a potential Texaco counteroffer of $3 billion. What would be the EVPI in this case?
3. In question 2, EVPI for (b) should be less than EVPI calculated in (a). Can you explain why?
4. What is EVPI if Liedtke can learn both Texaco’s reaction and the final court decision before he makes up his mind about the current $2 billion offer? Can you explain why the interaction of the two bits of information should have this effect?
1.
                          Texaco accepts   Texaco refuses   Texaco counteroffers 3   EMV
                          (0.17)           (0.5)            (0.33)
Accept 2                  2                2                2                        2
Counteroffer 5            5                4.56             4.56                     4.63

The counteroffer is the better choice whatever Texaco's reaction, so the expected value under certainty is also 4.63 and EVPI = 4.63 - 4.63 = 0: knowing Texaco's reaction in advance would not change the decision.
2. b.
EVPI = EV(under certainty) - EMV
= 4.93 - 4.63
= $0.30 billion.
3. The earlier the information, the more valuable it is. In fact, we saw in part (a) that, if the award turned out to be zero, Pennzoil would accept the $2 billion. Obtaining the information later does not allow Pennzoil to use the information in this way.
4.
EVPI = EV(with information) - EMV(without) = 5.23 - 4.63 = $0.60 billion.
▪ Then, the DM is asked to determine CE(L), where L = (x*, 0.5; x0, 0.5), denoted x0.5. It is obvious that U(x0.5) = 0.5.
▪ Next, find the CE of the lottery L = (x*, 0.5; x0.5, 0.5), denoted x0.75, with U(x0.75) = 0.75.
▪ Next, find the CE of the lottery L = (x0.5, 0.5; x0, 0.5), denoted x0.25, with U(x0.25) = 0.25.
▪ Fit the curve passing through (x*,1); (x0.75,0.75); (x0.5,0.5); (x0.25,0.25); and (x0,0).

DM attitudes toward risk (profit case)
• A DM is averse to risk if for any lottery L, the certainty equivalent of L is worse than the expected value of the lottery L. In the profit case, we have E(L) > CE(L)
▪ The difference RP(L) = E(L)-CE(L) > 0 is called the risk premium of L.
▪ RP(L) is the amount of money the DM is willing to pay to avoid risk.
• A DM is risk-neutral if for any lottery L, the certainty equivalent of L is equal to the expected value of L: E(L) = CE(L)
▪ Therefore, the risk premium is zero
• A DM is prone to risk (a risk seeker, or risk taker) if for any lottery L, the certainty equivalent of L is better than the expected value of the lottery L. In the profit case we have E(L) < CE(L)
▪ The risk premium of L is negative: RP(L) = E(L)-CE(L) < 0
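The three attitudes can be illustrated by computing the certainty equivalent and the risk premium of one lottery under three utility functions. Everything in this sketch is hypothetical and chosen only for illustration: the 50/50 lottery between 100 and 0, and the utilities sqrt(x) (concave), x (linear) and x^2 (convex).

```python
import math

def certainty_equivalent(lottery, u, u_inverse):
    """CE(L) is the sure amount whose utility equals the expected utility of L."""
    expected_utility = sum(p * u(x) for x, p in lottery)
    return u_inverse(expected_utility)

# Hypothetical lottery: win 100 or 0 with equal probability (profit case).
lottery = [(100.0, 0.5), (0.0, 0.5)]
expected_value = sum(p * x for x, p in lottery)

utilities = {
    "risk-averse (sqrt)": (math.sqrt, lambda v: v ** 2),
    "risk-neutral (linear)": (lambda x: x, lambda v: v),
    "risk-seeking (square)": (lambda x: x ** 2, math.sqrt),
}

risk_premium = {}
for name, (u, u_inv) in utilities.items():
    ce = certainty_equivalent(lottery, u, u_inv)
    risk_premium[name] = expected_value - ce   # RP(L) = E(L) - CE(L)
    print(f"{name}: CE = {ce:.2f}, RP = {risk_premium[name]:.2f}")
```

The signs of the three risk premiums (positive, zero, negative) match the definitions above.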
Convexity/Concavity and Attitude toward risk
Interpretation
• Let the lottery L have values x with probability p and y with probability 1-p.
• z = px + (1-p)y
• U(L) = pU(x) + (1-p)U(y)
• If the DM is neutral to risk, he would accept the expected value criterion: U(z) = U(E(L)) = pU(x) + (1-p)U(y)

Concave Function
• U(z) = U(E(L)) ≥ E(U(L)).
• The utility function is concave
[Figure: concave utility curve over x, z, y, showing U(z) above U(L) = pU(x)+(1-p)U(y)]

• If the DM is risk-seeker, the lottery L is preferred to the certain situation z = px+(1-p)y
• U(z) = U(E(L)) ≤ E(U(L))
• The utility function is convex
[Figure: convex utility curve over x, z, y, showing U(z) below U(L) = pU(x)+(1-p)U(y)]

Theorem
• A DM is risk-averse if and only if his utility function is concave.
• A DM is risk-neutral if and only if his utility function is linear.
• A DM is risk-seeker if and only if his utility function is a convex function
[Figure: three utility curves U(M) plotted against M: Risk Averse, Risk prone, Risk Neutral]
• The higher the curvature the more averse or seeker the DM.
• The same DM may have different risk attitudes based on the problem situation or the values considered.

Example
An individual owns a house with a value of 150,000 TND. The risk of losing his house from fire or another catastrophic event is 0.1% each year. His utility function for the house is of the form:
U(x) = ax + b
How much is he willing to pay an insurance company to preserve the value of his asset? What is the corresponding risk premium?

Solution
Let u be the utility function of the loss of the asset of the DM, where u(0) = 1 and u(150,000) = 0. Then, b = 1 and a = -1/150,000. That is,
u(x) = -x/150,000 + 1
The lottery L (lose 150,000 with probability 0.001, lose nothing with probability 0.999) has expected utility 0.999*u(0) + 0.001*u(150,000) = 0.999.
Let us find x such that u(x) = 0.999, that is, the certainty equivalent of L, CE(L).
Solving 0.999 = -x/150,000 + 1 yields x = 150 TND.
This equals the expected loss, 0.001*150,000 = 150 TND, so the risk premium is zero, as expected for a linear utility function.
Sonia REBAI - Naceur Azaiez
Tunis Business School
University of Tunis

... of their own goals but also of those of other protagonists.
✓ Decision-makers are supposed to be rational and to pursue exogenous objectives
✓ Game theory concerns social problems for which you are not the only person who makes a decision. You interact with others.
✓ You need to think about what is best for you, which depends on what others do. Such a situation is called a strategic situation.
✓ Game theory helps in examining and predicting how people behave in a strategic situation.
✓ Game theory provides you a unified way for solving many kinds of problems.

Three elements should be specified
✓ Players: individuals, firms, countries, etc. A player has the ability to choose among a set of possible actions. The specific identity of the players is irrelevant to the game.
✓ Strategies can be very simple or very complex but each is assumed to be well-defined.
✓ Payoffs: The final returns of the players at the end of the game. They are measured in terms of utility obtained by the player.
✓ The goal of each player is to maximize his utility, which depends on his decisions (strategies) and those of all other players.
✓ Each player is free to choose his decision, but not that of the others.
✓ They take into account the knowledge they have or the anticipations they make of the behavior of other decision-makers.

Types of Games
Games are generally divided into two categories:
✓ Cooperative games: players can make irrevocable agreements with each other.
✓ Non-cooperative games:
• Either it is assumed that players cannot make irrevocable agreements before engaging in the action (impossibility of communicating or a prohibition of consultation between competitors),
• or it is assumed that the players are in no way compelled to take any action respecting the agreements that they may have previously concluded between them.
Further,
✓ Sequential vs. Simultaneous moves
✓ Single Play vs. Iterated
✓ Zero vs. non-zero sum
✓ Perfect vs. Imperfect information

Description of Non-Cooperative Games
A non-cooperative game can be described in two ways
1. A matrix form: it consists in giving strategies and payoffs for each set of players.
2. An extensive form: it provides an extended description of the game as a tree revealing outcomes from each set of players’ strategies and possible actions that each set of players can take in response to other players’ moves. This description is suitable for dynamic games or those with ...
✓ For this example, sequential moves put the player who moves first at a disadvantage. The other player will always choose an action that results in a win.
➢ A wife and husband may either go shopping or go to a football match
• both prefer spending time together
• the wife prefers shopping and the husband prefers football
[Payoff table with strategies Football and Shopping for the wife; entries not recoverable]

✓ Two suspects are arrested for a crime. They are put in different rooms and interrogated by the police.
✓ Both players have two choices. Either to ...
✓ The Deal
• “if you defect to confess, but your companion remains silent, you will be set free, and your companion will be harshly punished for 15 years”
• “if you both defect to confess, you will each get a ten-year sentence”

                        Player 2
                        Cooperate   Defect
Player 1   Cooperate    -1, -1      -15, 0
           Defect       0, -15      -10, -10
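A short check of the payoff table shows why both suspects end up confessing. The sketch below tests every pair of actions for the property that neither player gains by deviating alone (a pure-strategy Nash equilibrium); the helper name `is_equilibrium` is illustrative, and "Cooperate" means staying silent while "Defect" means confessing, as in the deal above.

```python
actions = ["Cooperate", "Defect"]

# (player 1 action, player 2 action) -> (payoff to 1, payoff to 2), in years of prison.
payoffs = {
    ("Cooperate", "Cooperate"): (-1, -1),
    ("Cooperate", "Defect"): (-15, 0),
    ("Defect", "Cooperate"): (0, -15),
    ("Defect", "Defect"): (-10, -10),
}

def is_equilibrium(a1, a2):
    """True if neither player can do strictly better by deviating alone."""
    u1, u2 = payoffs[(a1, a2)]
    player1_ok = all(payoffs[(d, a2)][0] <= u1 for d in actions)
    player2_ok = all(payoffs[(a1, d)][1] <= u2 for d in actions)
    return player1_ok and player2_ok

equilibria = [(a1, a2) for a1 in actions for a2 in actions
              if is_equilibrium(a1, a2)]

# Defect is a dominant strategy for player 1: better whatever player 2 does.
defect_dominates = all(
    payoffs[("Defect", a2)][0] > payoffs[("Cooperate", a2)][0] for a2 in actions
)
print(equilibria, defect_dominates)
```

The only equilibrium is mutual defection, even though mutual cooperation would leave both players better off.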
... hesitation since his best choice is independent of that of the other ...
... eliminate them.
... maximizes the minimum allowed gains (offensive).
✓ However, the 2nd player minimizes the maximum loss (defensive).
Player 1 should play strategy 1 and player 2 should play C.
Since min-max = max-min, we say that the game possesses a saddle point leading to a stable solution.
✓ A saddle point is the combination of strategies in which each player can find the highest possible payoff assuming the best possible play by the opponent.
✓ None of the players will benefit from changing his move in an attempt to take advantage of the opponent's strategy to improve his own position.
✓ When there is a saddle point, both players should stick to their min-max and max-min solutions respectively.
✓ Every saddle point is an equilibrium point but not every equilibrium point is a saddle point.
✓ A saddle point occurs when each player is achieving the highest possible payoff and thus neither would benefit from changing strategies if the other didn’t also change - which is why it is also called an equilibrium point.
✓ The optimal strategies and value for a two-player constant-sum game may be found by the same methods used to find the optimal strategies and value for a two-player zero-sum game.
✓ A game does not always possess a saddle point.
✓ Each player has to look for the frequency to play each of the strategies so that the result will be independent of the policy adopted by the other player.

Matrix 2x2
      B    C
1     1    5
2     7    4

✓ Let P be the probability of playing strategy 1. The expected value of the gain is independent of the game of the 2nd player iff
1*P + 7*(1-P) = 5*P + 4*(1-P) or P = 3/7
✓ Thus, player 1 must play 3/7 of the time the 1st strategy and 4/7 of the time the 2nd strategy. His expected profit = 31/7
✓ Similarly, for the 2nd player, the probability q of playing strategy B is chosen so that the expected value of his loss is independent of the game of the 1st player.
1*q + 5*(1-q) = 7*q + 4*(1-q) or q = 1/7
✓ Thus, player 2 must play 1/7 of the time the strategy B and 6/7 of the time the strategy C. His expected loss is
1 * 1/7 + 5 * 6/7 = 31/7
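The two indifference equations can be solved once and for all for a 2x2 game. The sketch below does this in closed form with exact fractions; the function name `solve_2x2` is illustrative, and the formulas assume the game has no saddle point (otherwise the pure max-min and min-max strategies apply instead).

```python
from fractions import Fraction

def solve_2x2(a, b, c, d):
    """Mixed strategies for the row player's payoff matrix [[a, b], [c, d]].

    Assumes no saddle point. Returns (p, q, value) where p is the probability
    of row 1, q the probability of column 1, and value the game value.
    """
    denom = a - b - c + d
    # Row player: p*a + (1-p)*c = p*b + (1-p)*d
    p = Fraction(d - c, denom)
    # Column player: q*a + (1-q)*b = q*c + (1-q)*d
    q = Fraction(d - b, denom)
    value = Fraction(a * d - b * c, denom)
    return p, q, value

# Matrix from the slides: rows 1, 2 and columns B, C.
p, q, value = solve_2x2(1, 5, 7, 4)
print(p, q, value)
```

For the matrix above this gives P = 3/7, q = 1/7 and a game value of 31/7, matching the hand computation.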
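Matrix 2xm
                   Player 2
                   A     B     C     D
Player 1     1     16    4     18    6
             2     16    22    12    18

With p the probability that player 1 plays strategy 1, the binding columns are C and D. Setting player 1's expected payoffs against them equal, π_C = π_D, i.e. 18p + 12(1-p) = 6p + 18(1-p). Hence p* = 1/3.
✓ Thus player 1 must play 1/3 of the time the 1st strategy and 2/3 of the time the 2nd strategy. Game value = 14
✓ Thus player 2 must play 2/3 of the time strategy C and 1/3 of the time strategy D. His expected loss = 14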
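For a 2xm game the row player's guaranteed payoff is the lower envelope of m straight lines in p, so its maximum lies at p = 0, p = 1 or at an intersection of two lines. The sketch below enumerates these candidates with exact fractions; it is one way to mechanize the graphical method, and the function name `solve_2xm` is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def solve_2xm(row1, row2):
    """Row player's optimal mix for a 2 x m zero-sum game.

    Returns (p, value), where p is the probability of playing row 1.
    """
    def guaranteed(p):
        # Worst expected payoff over the column player's pure strategies.
        return min(p * a + (1 - p) * b for a, b in zip(row1, row2))

    candidates = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(zip(row1, row2), 2):
        # Column j gives the line b_j + p*(a_j - b_j); intersect two such lines.
        denom = (a1 - b1) - (a2 - b2)
        if denom != 0:
            p = Fraction(b2 - b1, denom)
            if 0 <= p <= 1:
                candidates.add(p)

    best_p = max(candidates, key=guaranteed)
    return best_p, guaranteed(best_p)

# Matrix from the slides: columns A, B, C, D.
p_star, game_value = solve_2xm([16, 4, 18, 6], [16, 22, 12, 18])
print(p_star, game_value)
```

At the optimum the minimum is attained by columns C and D, which is why player 2 mixes only between those two strategies.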
Matrix mxn
Let pij = the payoff under strategies i, j respectively for player 1 and 2 (i=1,2,…,m) and (j=1,2,…,n)
✓ Let xi = the probability that Player 1 will use strategy i (i=1,2,…,m).
✓ Let yj = the probability that Player 2 will use strategy j (j=1,2,…,n).

Player 1 will solve the following LP
Max v
s.t. p11x1 + p21x2 + … + pm1xm ≥ v
     p12x1 + p22x2 + … + pm2xm ≥ v
     .
     .
     .
     p1nx1 + p2nx2 + … + pmnxm ≥ v
     x1 + x2 + … + xm = 1
     x1, x2, …, xm ≥ 0
Note that the jth constraint implies that the expected reward against column j must at least equal v; otherwise, the column player could hold the row player’s expected reward below v by choosing column j.

Player 2 will solve the following LP (dual of player 1’s LP)
Min w
s.t. p11y1 + p12y2 + … + p1nyn ≤ w
     p21y1 + p22y2 + … + p2nyn ≤ w
     .
     .
     .
     pm1y1 + pm2y2 + … + pmnyn ≤ w
     y1 + y2 + … + yn = 1
     y1, y2, …, yn ≥ 0
The ith constraint in the column player’s LP implies that if the row player chooses row i, then the column player’s expected losses cannot exceed w; otherwise, the row player could obtain an expected reward that exceeded w by choosing row i.
Example
      A    B
1     14   6
2     2    12
Let us denote by
² X1 and X2 the probabilities of playing respectively Strategies 1 and 2.
² Y1 and Y2 the probabilities of playing respectively Strategies A and B.
1st player
Max V
14X1+2X2 ≥ V
6X1+12X2 ≥ V
X1+X2=1
X1, X2 ≥ 0
2nd player
Min W
14Y1+6Y2 ≤ W
2Y1+12Y2 ≤ W
Y1+Y2 = 1
Y1, Y2 ≥ 0
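One way to check a proposed solution of the two LPs is to compute the best V allowed by player 1's constraints and the best W allowed by player 2's constraints. Since V can never exceed W, equality certifies that both candidates are optimal. The candidate probabilities below were worked out by hand for this illustration and are not given in the slides; `check_lp_solution` is a hypothetical helper name.

```python
from fractions import Fraction as F


def check_lp_solution(payoff, x, y):
    """Return (v, w) for candidate mixed strategies x (rows) and y (columns)."""
    m, n = len(payoff), len(payoff[0])
    if sum(x) != 1 or sum(y) != 1 or min(x) < 0 or min(y) < 0:
        raise ValueError("x and y must be probability vectors")
    # Largest v satisfying every column constraint of player 1's LP.
    v = min(sum(payoff[i][j] * x[i] for i in range(m)) for j in range(n))
    # Smallest w satisfying every row constraint of player 2's LP.
    w = max(sum(payoff[i][j] * y[j] for j in range(n)) for i in range(m))
    return v, w


payoff = [[14, 6],
          [2, 12]]
x = [F(5, 9), F(4, 9)]  # assumed candidate for X1, X2
y = [F(1, 3), F(2, 3)]  # assumed candidate for Y1, Y2
v, w = check_lp_solution(payoff, x, y)
print(v, w)  # 26/3 26/3
```

Here V = W = 26/3, so under these assumptions the candidates solve both LPs and 26/3 is the value of the game.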
Decision & Game Theory
Chapter 8
Nash Equilibrium
ü Game theory became a field with the book by John von Neumann (a computer scientist, mathematician and theoretical physicist) and Oskar Morgenstern (a professor of economics at Princeton University), published in 1944.
formulated as a simple mathematical model of game by specifying the players, the strategies and the payoffs.
ü They posed the question of finding a unified or governing principle which can be applied to all social interactions.
ü Specifically, they asked the following question: “Is there any single solution concept which can be applied to all social interactions?”, which remained an open question.
ü The answer to this question was discovered by a mathematical genius, John Nash.
ü He received his Ph.D. from Princeton University with a 28-page thesis on his 22nd birthday.
ü He invented the notion of Nash equilibrium.
John Nash (1928-2015)
ü He suffered from schizophrenia and could not continue his research.
ü It was soon discovered that his unified principle was very useful in addressing many economics questions.
ü In 1994 he received the Nobel Prize in economics.
ü His life was so interesting that it was made into a movie called A Beautiful Mind.
ü Nash found the answer in a cup of coffee.
ü Nash got the ingenious idea of considering the surface of a coffee as the set of all possible human behavior in a social problem.
ü If you stir the surface of a coffee, you get a vortex, a point that doesn't move.
[Figure: The vortex]
ü A point in this picture represents original behavior of people in a social situation.
ü The destination represents another social situation where players are moving towards best replies.
ü At the vortex point, all players are doing their best against the others.
ü This is called Nash equilibrium.
[Figure labels: "Original behavior of players", "best replies", "All players are doing their best against others"]
Nash Equilibrium
ü A Nash equilibrium is a set of strategies, one for each player, that are each best responses against one another.
ü In a two-player game, a Nash equilibrium is a pair of strategies (a*,b*) such that a* is an optimal strategy for player A against b* and b* is an optimal strategy for player B against a*.
ü A Nash equilibrium is where the strategy of one player is the best strategy when other players play their best strategy.
ü The existence theorem does not guarantee the existence of a pure-strategy Nash equilibrium.
ü A quick way to find the Nash equilibrium is to underline the best-response payoffs. The Nash equilibria correspond to the boxes in which every player’s payoff is highlighted.
Example
                Player R
                R       S       P
Player F   R    (D,D)   (W,L)   (L,W)
           S    (L,W)   (D,D)   (W,L)
           P    (W,L)   (L,W)   (D,D)
The children’s hand game has no pure Nash equilibrium

        C          D
C     -1, -1     -15, 0
D     0, -15     -10, -10
D is always best, irrespective of the opponent’s strategy
game.
Player 1
2   (4,1)   (5,3)
3   (6,5)   (4,7)

Player 1
2   (2,0)   (1,1)   (2,0)
3   (0,3)   (0,2)   (3,0)
ü When every player has a strictly dominant strategy, the resulting strategy profile is the unique Nash equilibrium.
ü There is no dominant strategy in the battle of the sexes game.
                Player 2
                A       B       C       D
Player 1   1    (5,2)   (5,3)   (7,5)   (6,3)
           2    (2,0)   (6,5)   (4,5)   (4,3)
           3    (0,8)   (4,3)   (3,7)   (5,6)
           4    (4,2)   (2,2)   (3,0)   (8,3)
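The "underline the best-response payoffs" procedure can be sketched as a small search: a cell is a pure Nash equilibrium when player 1's payoff is the largest in its column and player 2's payoff is the largest in its row. The function name `pure_nash` is an illustrative assumption; the data is the 4x4 game above.

```python
def pure_nash(bimatrix):
    """Return all (row, col) index pairs that are pure Nash equilibria.

    bimatrix[i][j] is the pair (payoff to player 1, payoff to player 2).
    """
    rows, cols = len(bimatrix), len(bimatrix[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            u1, u2 = bimatrix[i][j]
            # Player 1 cannot gain by switching rows within column j.
            best_row = all(u1 >= bimatrix[k][j][0] for k in range(rows))
            # Player 2 cannot gain by switching columns within row i.
            best_col = all(u2 >= bimatrix[i][k][1] for k in range(cols))
            if best_row and best_col:
                equilibria.append((i, j))
    return equilibria


# Rows 1-4 belong to player 1, columns A-D to player 2.
game = [
    [(5, 2), (5, 3), (7, 5), (6, 3)],
    [(2, 0), (6, 5), (4, 5), (4, 3)],
    [(0, 8), (4, 3), (3, 7), (5, 6)],
    [(4, 2), (2, 2), (3, 0), (8, 3)],
]
print([(i + 1, "ABCD"[j]) for i, j in pure_nash(game)])
# [(1, 'C'), (2, 'B'), (4, 'D')]
```

Ties are treated as best responses here, which is why (2, B) qualifies even though player 2 is indifferent between B and C in row 2.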
To find the pure Nash equilibria, we need to identify the best response for each player with regard to each possible strategy chosen by his opponent. Each identified pair of best response strategies corresponds to a pure Nash equilibrium. In the following examples, the best responses for each player with regard to the choice of his opponent are tinted in red and the pure Nash equilibria are highlighted in yellow.
               Player 2
               A       B
Player 1   1   (3,7)   (0,0)
           2   (4,1)   (5,3)
           3   (6,5)   (4,7)

               Player 2
               A       B       C
Player 1   1   (3,0)   (0,2)   (0,3)
           2   (2,0)   (1,1)   (2,0)
           3   (0,3)   (0,2)   (3,0)

Solve the following game
      A    B    C    D
1     38   16   38   18
2     34   52   30   42
3     18   50   14   26
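Strategy 3 of player 1 is dominated by strategy 2, and strategy A of player 2 is dominated by strategy C, so only strategies 1 and 2 and columns B, C and D remain. Let p be the probability that player 1 plays strategy 1. His expected payoffs against B, C and D are:
Π_B = 16p + 52(1 − p) = 52 − 36p
Π_C = 38p + 30(1 − p) = 30 + 8p
Π_D = 18p + 42(1 − p) = 42 − 24p
[Figure: Π_B, Π_C and Π_D plotted against p, for p from 0 to 1]
Optimal value of p is obtained by solving Π_C = Π_D; 30 + 8p = 42 − 24p then p = 3/8
Player 1 should play strategy 1 with a probability 3/8 and strategy 2 with probability 5/8. The game value is 30 + 8(3/8) = 33.
Player 2, however, should play only strategies C and D, since Π_B = 38.5 exceeds the game value at p = 3/8.
Let’s denote by q the probability to play strategy C. The payoffs of player 2 are then as follows:
Π_1 = 38q + 18(1 − q) = 18 + 20q
Π_2 = 30q + 42(1 − q) = 42 − 12q
Π_1 = Π_2; 18 + 20q = 42 − 12q then q = 3/4
Player 2 should play strategy C with a probability 3/4 and strategy D with probability 1/4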
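The solution above works only with strategies 1 and 2 and columns B, C and D. A minimal sketch of how that reduction could be obtained is iterated elimination of weakly dominated strategies, shown below; `eliminate_dominated` is a hypothetical helper, and the choice of weak rather than strict dominance is an assumption of this sketch.

```python
def eliminate_dominated(payoff, row_labels, col_labels):
    """Iteratively drop weakly dominated rows and columns of a zero-sum game.

    payoff[i][j] is the gain of player 1 (rows) and the loss of player 2.
    """
    rows = list(range(len(payoff)))
    cols = list(range(len(payoff[0])))

    def row_dominates(s, r):
        # Row s is at least as good as row r everywhere, better somewhere.
        return (all(payoff[s][j] >= payoff[r][j] for j in cols)
                and any(payoff[s][j] > payoff[r][j] for j in cols))

    def col_dominates(d, c):
        # Column d costs player 2 no more than column c, and less somewhere.
        return (all(payoff[i][d] <= payoff[i][c] for i in rows)
                and any(payoff[i][d] < payoff[i][c] for i in rows))

    changed = True
    while changed:
        changed = False
        for r in rows:
            if any(row_dominates(s, r) for s in rows if s != r):
                rows.remove(r)
                changed = True
                break
        if changed:
            continue
        for c in cols:
            if any(col_dominates(d, c) for d in cols if d != c):
                cols.remove(c)
                changed = True
                break
    return [row_labels[i] for i in rows], [col_labels[j] for j in cols]


game = [[38, 16, 38, 18],
        [34, 52, 30, 42],
        [18, 50, 14, 26]]
print(eliminate_dominated(game, [1, 2, 3], "ABCD"))
# ([1, 2], ['B', 'C', 'D'])
```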