Lecture 4
Bräss' Paradox, and more on Mixed Strategy

6.8L transcribed by Satyavarta

1  Introduction

In this lecture, we will discuss Bräss' Paradox, mixed strategy NE in two-player zero-sum games, Min-Max theorem, and Extensive Form Games.

2  Bräss' Paradox

Consider the situation shown in the figure. A and B are two cities, with two connecting roads between them, one passing through C and the other through D. AC and DB are short but narrow roads, while AD and CB are wide but long.

Figure

Let Xi = the traffic (cars/sec) on road i, and Di the delay on road i. Hence, for the situation in the figure,

                            D1 = 5X1 + 1
                            D2 = X2 + 32
                            D3 = 5X3 + 1
                            D4 = X4 + 32

Let the total traffic between the cities be 6 cars/sec.

Each path consists of one narrow and one wide road carrying the same traffic, so the delays on the two possible paths may be expressed as

                            Dpath1 = 6X1 + 33
                            Dpath2 = 6X2 + 33

In equilibrium the two paths must have equal delay, which dictates the following distribution of traffic

                            X1 = X2 = 3
                            Dpath1 = Dpath2 = 6·3 + 33 = 51
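The equilibrium split can be checked numerically. The following is a minimal sketch, assuming that path 1 and path 2 each consist of one narrow road (delay 5X + 1) and one wide road (delay X + 32); the helper names are illustrative and not part of the lecture.

```python
TOTAL = 6  # total traffic between A and B, in cars/sec


def narrow(x):
    """Delay on a short but narrow road carrying x cars/sec."""
    return 5 * x + 1


def wide(x):
    """Delay on a wide but long road carrying x cars/sec."""
    return x + 32


def path_delays(x1):
    """Delays of path 1 and path 2 when x1 cars/sec use path 1."""
    x2 = TOTAL - x1
    return narrow(x1) + wide(x1), wide(x2) + narrow(x2)


# Keep the integer splits at which neither path is faster than the other.
equilibrium = [x for x in range(TOTAL + 1)
               if path_delays(x)[0] == path_delays(x)[1]]
print(equilibrium, path_delays(3))
```

With these delay functions the only balanced split is 3 cars/sec on each path, and each path then has delay 6·3 + 33 = 51.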

Suppose a wide road is built from C to D with constant delay zero.

Figure

From C,

                            the cost of CB is 32 + X

                            the cost of CDB is 5X + 1

This makes the path CDB cheaper than CB for all possible values of X ∈ [0,6]: CDB costs at most 5·6 + 1 = 31, while CB costs at least 32.

Then all cars go over ACDB. But this raises the average delay to 31 + 31 = 62. Thus, if every individual acts greedily, the average delay for every individual may increase. The objective function that society seeks to minimize is the total delay, so that the scheme is beneficial to everyone. Formally, this objective function is
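The claim that all traffic ends up on ACDB can be sketched as follows. This assumes AC and DB are the narrow roads (5X + 1), AD and CB the wide roads (X + 32), and CD has zero delay; the function `path_costs` is a hypothetical helper, and a single deviating car is treated as negligibly small.

```python
TOTAL = 6  # cars/sec


def narrow(x):
    return 5 * x + 1


def wide(x):
    return x + 32


def path_costs(f_acb, f_adb, f_acdb):
    """Delay of each A-to-B path, given the flow on each path."""
    x_ac = f_acb + f_acdb   # AC is shared by ACB and ACDB
    x_db = f_adb + f_acdb   # DB is shared by ADB and ACDB
    return {
        "ACB": narrow(x_ac) + wide(f_acb),
        "ADB": wide(f_adb) + narrow(x_db),
        "ACDB": narrow(x_ac) + 0 + narrow(x_db),
    }


# Old 3/3 split: the new route looks much cheaper, so cars switch.
old_split = path_costs(3, 3, 0)
# Everyone on ACDB: no other route is cheaper, so nobody switches back.
all_on_link = path_costs(0, 0, TOTAL)
print(old_split, all_on_link)
```

Under these assumptions the old split offers a route of cost 32 against 51, while with all cars on ACDB the alternatives cost 63 against 62, so the congested state is stable.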

                            minimize Σ_P X_P d(P)     (total delay)

where X_P is the traffic on path P and d(P) is the delay of path P.
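The total-delay objective can be evaluated by brute force on the example network. This is only a sketch: it assumes whole cars/sec on the three paths ACB, ADB and ACDB, narrow roads AC and DB (5X + 1), wide roads AD and CB (X + 32), and a zero-delay CD link; all names are illustrative.

```python
TOTAL = 6  # cars/sec


def narrow(x):
    return 5 * x + 1


def wide(x):
    return x + 32


def total_delay(f_acb, f_adb, f_acdb):
    """Sum over paths of (traffic on the path) * (delay of the path)."""
    x_ac = f_acb + f_acdb
    x_db = f_adb + f_acdb
    d_acb = narrow(x_ac) + wide(f_acb)
    d_adb = wide(f_adb) + narrow(x_db)
    d_acdb = narrow(x_ac) + narrow(x_db)
    return f_acb * d_acb + f_adb * d_adb + f_acdb * d_acdb


# Enumerate every way to split 6 cars/sec over the three paths.
flows = [(a, b, TOTAL - a - b)
         for a in range(TOTAL + 1)
         for b in range(TOTAL + 1 - a)]
best = min(flows, key=lambda f: total_delay(*f))
print(best, total_delay(*best))
print(total_delay(3, 3, 0), total_delay(0, 0, TOTAL))
```

In this integer sketch the greedy outcome (all cars on ACDB) has total delay 372, the even split without the link has 306, and the best integer split found is (2, 2, 2) with total delay 304.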

Every individual can be charged an amount equivalent to the damage he causes to society. Then the system has a desirable equilibrium. In the example here, a toll could be levied on cars using CD to increase the effective cost of CD, so that the equilibrium is reached at minimal average delay.
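As a rough illustration of the toll idea, the sketch below asks how large a toll on CD must be before a car in the even 3/3 split no longer gains by switching to ACDB. It assumes the toll is measured in the same units as delay, that a single deviating car is negligibly small, and the delay functions 5X + 1 (narrow) and X + 32 (wide); it does not claim that this toll yields the exact social optimum.

```python
def narrow(x):
    return 5 * x + 1


def wide(x):
    return x + 32


def deviation_gain(toll):
    """Delay saved by one car leaving the 3/3 split for ACDB."""
    stay = narrow(3) + wide(3)             # cost of ACB (or ADB)
    switch = narrow(3) + toll + narrow(3)  # AC, tolled CD, DB
    return stay - switch


# Smallest whole-number toll at which switching no longer pays.
smallest_toll = next(t for t in range(100) if deviation_gain(t) <= 0)
print(smallest_toll)
```

With no toll a deviating car saves 19 units of delay, so under these assumptions any toll of at least 19 removes the incentive to use the new link from the even split.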

The cost is dependent on the individual. We shall soon study more theories on this behaviour, e.g. the Utility Theory.

3  Mixed Strategy Nash Equilibrium in two-player zero-sum games

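A two-player, zero-sum game occurs as follows:

1. A payoff matrix A is given and is known to both players.
2. The row player chooses a row i and the column player chooses a column j.
3. The row player receives aij and the column player receives -aij.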

Playing a mixed strategy means that the moves made by the players (in the second step above) are not made deterministically. Rather, each strategy is a probability distribution over rows (columns),

                            p = (p1, p2, ..., pn)

where pi is the probability of choosing the ith row (column).

Since exactly one row (column) may be selected ("played"),

                            Σ_i pi = 1, p ≥ 0.

This is in contrast to a pure strategy, where pi = 1 for exactly one i and 0 for all others.

Hence, the following symbols may be defined for further analysis of this game.

                            A : Payoff matrix
                            p : mixed strategy for row player
                            q : mixed strategy for column player

The payoff with such a scheme is calculated as aij for row player (-aij for column player), when i is selected by row player, and j by the column player. The expected payoff for a player is thus the sum of payoffs multiplied by the probability of that payoff. Thus,

                            Expected payoff to row player = Σ_{i,j} pi qj aij

Then, since we are considering a zero-sum game, the expected payoffs for the players are

                            row player Ur = p^T A q
                            column player Uc = -p^T A q
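The expected payoffs can be computed directly from the double sum. Below is a minimal sketch in plain Python; the 2x2 matrix is the matching-pennies game, chosen here only as an illustration and not taken from the lecture.

```python
def expected_payoff(A, p, q):
    """Expected payoff to the row player: sum over i, j of p[i] * q[j] * A[i][j]."""
    return sum(p[i] * q[j] * A[i][j]
               for i in range(len(p))
               for j in range(len(q)))


# Illustrative payoff matrix (matching pennies), entries are row-player payoffs.
A = [[1, -1],
     [-1, 1]]
p = [0.5, 0.5]   # mixed strategy of the row player
q = [0.5, 0.5]   # mixed strategy of the column player

u_row = expected_payoff(A, p, q)
u_col = -u_row   # zero-sum: the column player gets the negative
print(u_row, u_col)
```

A pure strategy is the special case where one entry of p (or q) is 1, for example `expected_payoff(A, [1, 0], [0, 1])` picks out the single entry A[0][1].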

Each player wants to maximize his profit, and since he does not know what strategy the other one is using, he wants to maximize his expected profit irrespective of what the opponent plays. Hence the row player wants p′ to maximize min_q p′^T A q, and the column player wants q′ to minimize max_p p^T A q′.

For any pair of strategies (p′, q′)

                                     min_q p′^T A q ≤ p′^T A q′ ≤ max_p p^T A q′

Now if a saddle point strategy pair (p*, q*) exists (that is, a saddle point of p^T A q over all mixed strategies p and q), then by definition of a saddle point,

                            ∀ p, q          p^T A q* ≤ p*^T A q* ≤ p*^T A q

This is also a Nash Equilibrium, since neither player can increase his own profit beyond p*^T A q* by changing his strategy alone.
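The saddle point condition can be tested for a candidate pair of strategies. The sketch below relies on the fact that the payoff is linear in p and in q, so it is enough to test deviations to pure strategies; the function names and the example matrix (matching pennies) are illustrative assumptions.

```python
def payoff(A, p, q):
    """Expected payoff to the row player."""
    return sum(p[i] * q[j] * A[i][j]
               for i in range(len(p))
               for j in range(len(q)))


def is_saddle_point(A, p_star, q_star, tol=1e-9):
    """Check p^T A q* <= p*^T A q* <= p*^T A q against all pure deviations."""
    m, n = len(A), len(A[0])
    value = payoff(A, p_star, q_star)
    pure_rows = [[1.0 if k == i else 0.0 for k in range(m)] for i in range(m)]
    pure_cols = [[1.0 if k == j else 0.0 for k in range(n)] for j in range(n)]
    # The row player cannot gain by switching to any single row.
    row_ok = all(payoff(A, e, q_star) <= value + tol for e in pure_rows)
    # The column player cannot push the payoff lower with any single column.
    col_ok = all(payoff(A, p_star, e) >= value - tol for e in pure_cols)
    return row_ok and col_ok


A = [[1, -1],
     [-1, 1]]
print(is_saddle_point(A, [0.5, 0.5], [0.5, 0.5]))
print(is_saddle_point(A, [1.0, 0.0], [0.5, 0.5]))
```

In the second call the row player commits to the first row, and the column player can then lower the payoff by answering with the second column, so the pair is not a saddle point.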

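Assuming rational behaviour from the players, so that they play the best strategy available to them, the guaranteed payoffs to the players are

                            row player Vr = max_p min_q p^T A q = p*^T A q1

                            column player Vc = min_q max_p p^T A q = p1^T A q*

where q1 is a best response of the column player to p*, and p1 is a best response of the row player to q*.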

Property:

                            Vr ≤ Vc

Proof: Using the fact that (p*, q*) is a saddle point, we may write

                            Vr = p*^T A q1 ≤ p*^T A q*     (q1 minimizes p*^T A q over q)

                            Vc = p1^T A q* ≥ p*^T A q*     (p1 maximizes p^T A q* over p)

Hence,

                            Vr ≤ Vc

We will use this property to prove the following property.

Property: Iff the guaranteed payoffs of the two players are equal, a saddle point (and hence, Nash Equilibrium) exists.

                            (p*, q*) is Saddle Point ⇔ Vr = Vc

Proof:

4  Min-Max Theorem

A relevant question to ask in such a situation is "Does a Saddle Point always exist?" The Min-Max Theorem says, yes, it does.

                            max_p min_q p^T A q = min_q max_p p^T A q

Proof: The proof for this theorem is on the lines of the concept of duality in Linear Programming, wherein these two behave as duals of each other. (The proof I illustrate here is taken from the scribe notes by Tugkan Batu, ORIE630 Mathematical Programming Fall 1998, of lectures by Jon Kleinberg.)

We write two linear programs, one for each player, with the expected payoff as the objective function, and show that these two linear programs are duals of each other, which proves the theorem.

The row player's objective function is max_p min_q p^T A q. Given some p′, the problem min (p′^T A) q subject to Σ_j qj = 1, q ≥ 0 looks like a search over infinitely many q. However, for a given p′, there is an optimal pure strategy for the column player. The reason for this is that if the column player knows the row player's mixed strategy, the column player can choose the single column that is best for him, that is, the column that minimizes the row player's expected payoff (instead of getting a weighted average over all columns). So, now we can write the row player's objective function as

                            max_p [ min_j Σ_{i=1}^{m} Aij pi ]
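The reduction of the inner minimum to pure columns can be sketched as follows. The grid search over p is only a stand-in for the linear program and works here because the illustrative game (matching pennies, not from the lecture) has two rows; the names are hypothetical.

```python
def guaranteed_payoff(A, p):
    """min over pure columns j of sum_i A[i][j] * p[i]."""
    m, n = len(A), len(A[0])
    return min(sum(A[i][j] * p[i] for i in range(m)) for j in range(n))


A = [[1, -1],
     [-1, 1]]

# For a two-row game a mixed strategy is (p1, 1 - p1); scan p1 on a grid.
grid = [k / 100 for k in range(101)]
best_p1 = max(grid, key=lambda p1: guaranteed_payoff(A, [p1, 1 - p1]))
v_r = guaranteed_payoff(A, [best_p1, 1 - best_p1])
print(best_p1, v_r)
```

Because the column player's best reply to a known p is always some single column, the inner problem needs only n evaluations rather than a search over all mixed q.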

Now, we can write the row player's problem as a linear program as follows:

                            max Z

                                     Z - Σ_{i=1}^{m} Aij pi ≤ 0  ∀j
                                     Σ_{i=1}^{m} pi = 1
                                     p ≥ 0

Similarly, the column player's problem is formulated as a linear program as follows:

                            min W

                                     W - Σ_{j=1}^{n} Aij qj ≥ 0  ∀i
                                     Σ_{j=1}^{n} qj = 1
                                     q ≥ 0
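The two programs can be checked on a small example without an LP solver. The sketch below only tests feasibility of candidate solutions, reading the row player's constraint as Z ≤ Σ_i Aij pi for every column j and the column player's constraint as W ≥ Σ_j Aij qj for every row i; the function names and the example game are assumptions made for illustration.

```python
def row_lp_feasible(A, Z, p, tol=1e-9):
    """Is (Z, p) feasible for the row player's program?"""
    m, n = len(A), len(A[0])
    return (abs(sum(p) - 1) <= tol
            and all(pi >= -tol for pi in p)
            and all(Z - sum(A[i][j] * p[i] for i in range(m)) <= tol
                    for j in range(n)))


def col_lp_feasible(A, W, q, tol=1e-9):
    """Is (W, q) feasible for the column player's program?"""
    m, n = len(A), len(A[0])
    return (abs(sum(q) - 1) <= tol
            and all(qj >= -tol for qj in q)
            and all(W - sum(A[i][j] * q[j] for j in range(n)) >= -tol
                    for i in range(m)))


A = [[1, -1],
     [-1, 1]]
print(row_lp_feasible(A, 0.0, [0.5, 0.5]))   # Z = 0 is achievable
print(col_lp_feasible(A, 0.0, [0.5, 0.5]))   # W = 0 is achievable
print(row_lp_feasible(A, 0.1, [0.5, 0.5]))   # Z = 0.1 is too optimistic
```

Any feasible Z is at most any feasible W, so a feasible pair with Z = W, as in this example, certifies that both candidates are optimal.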

Now, if we take e* to be the all-ones vector, then we can write this pair of linear programs as follows:

Row player                      Column player
max 1·Z + 0·p                   min 1·W + 0·q
Z·e* + p(-A) ≤ 0                W·e* + (-A)q ≥ 0
Z·0 + p·e* = 1                  W·0 + e*·q = 1
p ≥ 0                           q ≥ 0

The generic primal/dual pair is described as

Primal                          Dual
min cx + c̄x̄                     max yb + y′b′
x ≥ 0                           y′ ≥ 0
Mx + M̄x̄ = b                     yM + y′M′ ≤ c
M′x + M̄′x̄ ≥ b′                  yM̄ + y′M̄′ = c̄

Here x̄ and y are unrestricted in sign.

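Thus, the linear programs for the row and column player strategies are instantiations of this generic form with the substitution {b′ = 0, b = 1, c = 0, c̄ = 1, M = e*, M̄ = 0, M′ = -A, M̄′ = e*}. The column player's program is the primal (x = q, x̄ = W) and the row player's program is the dual (y = Z, y′ = p). By LP duality their optimal values are equal, which is the statement of the Min-Max Theorem.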

5  Extensive Form Games

Modelling Chess

                            B(s) = set of moves black can make.

The sequence of moves made by the two opponents is a path along a tree of moves, which can be constructed with the board position at each node and all the possible next moves as its children. The discussion on extensive form games spills over to the next lecture.
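The tree of moves can be sketched as a small data structure. Everything here is a toy assumption for illustration: the `Node` class, the position labels and the move labels are invented and have nothing to do with real chess positions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A board position together with the moves available from it."""
    position: str
    children: dict = field(default_factory=dict)  # move label -> next Node


def plays(node, prefix=()):
    """Yield every root-to-leaf sequence of moves (one complete game each)."""
    if not node.children:
        yield prefix
    for move, child in node.children.items():
        yield from plays(child, prefix + (move,))


# Toy tree: two moves at the start, then two replies after move "a".
root = Node("start", {
    "a": Node("s1", {"c": Node("s3"), "d": Node("s4")}),
    "b": Node("s2"),
})
all_plays = list(plays(root))
print(all_plays)
```

Each tuple produced is one path from the root to a leaf, that is, one complete sequence of moves by the two players.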
