Lecture 4
Bräss' Paradox, and more on Mixed Strategy

6.8L transcribed by Satyavarta

Introduction

In this lecture, we will discuss Bräss' Paradox, mixed strategy Nash Equilibria (NE) in two-player zero-sum games, the Min-Max theorem, and Extensive Form Games.

Bräss' Paradox

Consider the situation shown in the figure. $A$ and $B$ are two cities, with two connecting roads between them, one passing through $C$ and the other through $D$. $AC$ and $DB$ are short but narrow roads, while $AD$ and $CB$ are wide but long.

\includegraphics[]{figure.1.eps}

Let $X_i$ be the traffic (cars/sec) on road $i$, and $D_i$ the delay on road $i$. For the situation in the figure,

$D_1 = 5X_1 + 1$
$D_2 = X_2 + 32$
$D_3 = 5X_3 + 1$
$D_4 = X_4 + 32$

Let the total traffic between the cities be 6 cars/sec.

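Each path consists of one narrow road and one wide road, so a path carrying traffic $X$ has delay $(5X + 1) + (X + 32) = 6X + 33$. Writing $X_1$ and $X_2$ for the traffic on the two paths, the delays on the two possible paths are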

$D_{path_1} = 6X_1 + 33$
$D_{path_2} = 6X_2 + 33$

In equilibrium both paths must have the same delay, and $X_1 + X_2 = 6$, which dictates the following distribution of traffic:

$X_1 = X_2 = 3$
$D_{P_1} = D_{P_2} = 6 \cdot 3 + 33 = 51$
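The equilibrium split can be checked numerically. The following is a minimal sketch, not part of the lecture: the names `path_delay` and `equilibrium_split` are hypothetical, and it assumes that each path is one narrow road ($5X + 1$) followed by one wide road ($X + 32$), and that only whole-number splits of the 6 cars/sec need to be tried.

```python
# Illustrative sketch; function names are assumptions, not from the lecture.
TOTAL_TRAFFIC = 6  # cars/sec travelling between A and B


def path_delay(x):
    """Delay on a path carrying traffic x: narrow road (5x + 1) plus wide road (x + 32)."""
    return (5 * x + 1) + (x + 32)


def equilibrium_split(total=TOTAL_TRAFFIC):
    """Return (x1, x2, delay) for the integer split at which both paths have equal delay."""
    for x1 in range(total + 1):
        x2 = total - x1
        if path_delay(x1) == path_delay(x2):
            return x1, x2, path_delay(x1)
    return None  # no integer split equalizes the two delays


print(equilibrium_split())
```

Equal delay is the equilibrium condition because, if one path were faster, drivers on the slower path would switch to it.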

Suppose a wide road is built from $C$ to $D$ with constant delay zero.

\includegraphics[]{figure.2.eps}

From $C$,

the cost of $CB$ is $32 + X$

the cost of $CDB$ is $5X + 1$

Since $5X + 1 < 32 + X$ whenever $X < 7.75$, the path $CDB$ is cheaper than $CB$ for all possible values of $X \in [0,6]$.

Then all cars go over $ACDB$. But this raises the average delay to $(5 \cdot 6 + 1) + 0 + (5 \cdot 6 + 1) = 31 + 31 = 62$. Thus, if every individual acts greedily, the average delay for every individual may increase. The objective function that society seeks to minimize is the total delay, so that the scheme is beneficial to everyone. Formally, this objective function is
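The incentive to switch to $ACDB$, and the lack of any incentive to switch back, can be checked with a short sketch. It is an illustration under stated assumptions, not the lecture's own method: the function names are hypothetical, $AC$ and $DB$ are taken to be the narrow roads ($5X + 1$) and $AD$ and $CB$ the wide ones ($X + 32$), and a single driver is assumed to be too small to change the delay on the road he moves to.

```python
# Illustrative sketch; function names are assumptions, not from the lecture.

def narrow(x):
    """Delay on a short, narrow road (AC or DB) carrying traffic x."""
    return 5 * x + 1


def wide(x):
    """Delay on a long, wide road (AD or CB) carrying traffic x."""
    return x + 32


def path_delays(f_acb, f_adb, f_acdb):
    """Delay of each A-to-B path, given the traffic routed along each path."""
    ac = narrow(f_acb + f_acdb)   # AC is shared by paths ACB and ACDB
    cb = wide(f_acb)
    ad = wide(f_adb)
    db = narrow(f_adb + f_acdb)   # DB is shared by paths ADB and ACDB
    return {"ACB": ac + cb, "ADB": ad + db, "ACDB": ac + 0 + db}


# Old 3/3 split, just after CD opens: which path looks best to one driver?
before = path_delays(3, 3, 0)
# Everyone on ACDB: would a single driver gain by moving to ACB or ADB?
after = path_delays(0, 0, 6)
print(before)
print(after)
```

In `before`, the unused path $ACDB$ is the cheapest entry, so drivers start to switch. In `after`, both alternatives cost more than $ACDB$, so nobody switches back even though everyone is worse off.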

minimize $\sum_{p} X_p \, d(p)$ (total delay), where $X_p$ is the traffic on path $p$ and $d(p)$ is the delay along it
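For the example network, the total delay can be minimized by brute force. The sketch below is illustrative only: the names `total_delay` and `best_integer_flow` are hypothetical, traffic is restricted to whole cars/sec for simplicity, and the road types are assumed as before ($AC$, $DB$ narrow; $AD$, $CB$ wide; $CD$ free).

```python
# Illustrative sketch; names and the integer restriction are assumptions.
TOTAL_TRAFFIC = 6  # cars/sec travelling between A and B


def narrow(x):
    return 5 * x + 1   # delay on AC or DB


def wide(x):
    return x + 32      # delay on AD or CB


def total_delay(f_acb, f_adb, f_acdb):
    """Sum over paths of (traffic on the path) * (delay along the path)."""
    ac = narrow(f_acb + f_acdb)
    cb = wide(f_acb)
    ad = wide(f_adb)
    db = narrow(f_adb + f_acdb)
    return f_acb * (ac + cb) + f_adb * (ad + db) + f_acdb * (ac + db)


def best_integer_flow(total=TOTAL_TRAFFIC):
    """Try every integer split of the traffic over the three paths."""
    candidates = [(a, b, total - a - b)
                  for a in range(total + 1)
                  for b in range(total + 1 - a)]
    return min(candidates, key=lambda flow: total_delay(*flow))


best = best_integer_flow()
print(best, total_delay(*best))
print("all on ACDB:", total_delay(0, 0, TOTAL_TRAFFIC))
```

Comparing the two printed totals shows how far the selfish outcome (everyone on $ACDB$) is from the best total delay found by the search.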

Every individual can be charged an amount equivalent to the damage he causes to society. Then the system has a desirable equilibrium. In the example here, a toll could be levied on each car to increase the effective cost of $CD$, so that the equilibrium occurs at the minimal average delay.

The cost is dependent on the individual. We shall soon study more theories on this behaviour, e.g. Utility Theory.

Mixed Strategy Nash Equilibrium in two-player zero-sum games

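A two-player, zero-sum game proceeds as follows:

1. A payoff matrix $A$ is known to both players.
2. The row player chooses a row $i$ and, simultaneously, the column player chooses a column $j$.
3. The row player receives $a_{ij}$ and the column player receives $-a_{ij}$.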

Playing a mixed strategy means that the moves made by the players (in the second step above) are not made deterministically. Rather, each strategy is a probability distribution over rows (columns),

$p = (p_1, p_2, \ldots, p_n)$

where $p_i$ is the probability of choosing the $i^{th}$ row (column).

Since exactly one row (column) may be selected (``played''),

$\Sigma_{i}p_i = 1, \quad p_i \geq 0$ for all $i$.

This is in contrast to a pure strategy, where $p_i = 1$ for exactly one $i$ and $0$ for all others.

Hence, the following symbols may be defined for further analysis of this game.

A : Payoff matrix
p : mixed strategy for row player
q : mixed strategy for column player

The payoff with such a scheme is calculated as $a_{ij}$ for row player ($-a_{ij}$ for column player), when $i$ is selected by row player, and $j$ by the column player. The expected payoff for a player is thus the sum of payoffs multiplied by the probability of that payoff. Thus,

Expected payoff to row player = $\Sigma_{i,j}p_iq_ja_{ij}$

Then, since we are considering a zero-sum game, the expected payoffs for the players are

row player $U_r = p^TAq$
column player $U_c = -p^TAq$
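As a concrete illustration, the expected payoff can be computed directly from the double sum. This is a minimal sketch: the function name `expected_payoff` is hypothetical, and the matching-pennies matrix and the two mixed strategies are example values chosen here, not taken from the lecture.

```python
from fractions import Fraction


def expected_payoff(p, A, q):
    """Row player's expected payoff: sum over i, j of p_i * a_ij * q_j."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))


# Example (assumed) payoff matrix: matching pennies.
A = [[1, -1],
     [-1, 1]]
p = [Fraction(3, 4), Fraction(1, 4)]   # row player's mixed strategy
q = [Fraction(1, 4), Fraction(3, 4)]   # column player's mixed strategy

u_row = expected_payoff(p, A, q)
u_col = -u_row                          # zero-sum: the column player gets the negative
print(u_row, u_col)
```

`Fraction` is used so that the probabilities stay exact rather than being rounded as floats.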

Each player wants to maximize his profit, and since he does not know what strategy the other one is using, he wants to maximize his expected profit irrespective of what the opponent plays. Hence the row player wants $p'$ to maximize $\min_q {p'}^TAq$, and the column player wants $q'$ to minimize $\max_p p^TAq'$.

For any pair of strategies $(p',q')$

$ \hspace*{30pt} \min_q {p'}^TAq \leq {p'}^TAq' \leq \max_p p^TAq' $

Now if a saddle point strategy pair $(p^*, q^*)$ exists (that is, a saddle point of $p^TAq$ viewed as a function of the mixed strategies $p$ and $q$), then by definition of a saddle point,

$ \forall p, q \hspace*{30pt} p^TAq^* \leq {p^*}^TAq^* \leq {p^*}^TAq $

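This is also a Nash Equilibrium, since neither player can gain by deviating unilaterally: against $q^*$ the row player cannot receive more than ${p^*}^TAq^*$, and against $p^*$ the column player cannot pay less than ${p^*}^TAq^*$.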
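The saddle point condition can be tested mechanically. Because $p^TAq$ is linear in $p$ for fixed $q$ (and in $q$ for fixed $p$), it is enough to check deviations to pure strategies. The sketch below is illustrative: `is_saddle_point` is a hypothetical helper, and the matching-pennies matrix is an assumed example.

```python
from fractions import Fraction


def is_saddle_point(p, A, q):
    """Check p^T A q* <= p*^T A q* <= p*^T A q using pure deviations only."""
    rows, cols = range(len(A)), range(len(A[0]))
    value = sum(p[i] * A[i][j] * q[j] for i in rows for j in cols)
    # Payoff of each pure row against q, and of p against each pure column.
    row_payoffs = [sum(A[i][j] * q[j] for j in cols) for i in rows]
    col_payoffs = [sum(p[i] * A[i][j] for i in rows) for j in cols]
    return max(row_payoffs) <= value <= min(col_payoffs)


# Example (assumed) payoff matrix: matching pennies.
A = [[1, -1],
     [-1, 1]]
half = Fraction(1, 2)

print(is_saddle_point([half, half], A, [half, half]))  # uniform mixing
print(is_saddle_point([1, 0], A, [1, 0]))              # both play their first pure strategy
```

In this example the uniform pair passes the check, while the pure pair fails because the column player would rather switch columns.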

Assuming rational behaviour from the players, so that they play the best strategy available to them, the guaranteed payoffs to the players are

row player $V_r = \max_p \min_q p^TAq = {p^*}^TAq_1$

column player $V_c = \min_q \max_p p^TAq = p_1^TAq^*$

where $q_1$ is the column player's best response to $p^*$ and $p_1$ is the row player's best response to $q^*$.

Property:

$V_r \leq V_c $

Proof: Using the fact that $(p^*,q^*)$ is a saddle point, we may write

$V_r = {p^*}^TAq_1 \leq {p^*}^TAq^* $

$V_c = p_1^TAq^* \geq {p^*}^TAq^* $

Hence,

$V_r \leq V_c $

We will use this property to prove the following property.

Property: Iff the guaranteed payoffs of the two players are equal, a saddle point (and hence, Nash Equilibrium) exists.

$(p^*, q^*)$ is Saddle Point $\Longleftrightarrow V_r = V_c $

Proof:

Min-Max Theorem

A relevant question to ask in such a situation is ``Does a Saddle Point always exist?'' The Min-Max Theorem says, yes, it does.


\begin{displaymath}\max_p \min_q p^TAq = \min_q \max_p p^TAq\end{displaymath}

Proof: The proof of this theorem follows the concept of duality in Linear Programming, wherein these two quantities behave as duals of each other. (The proof I illustrate here is taken from the scribe notes by Tugkan Batu, ORIE630 Mathematical Programming Fall 1998, of lectures by Jon Kleinberg.)

We write two linear programs, one for each player, with the expected payoff as the objective function, and show that these linear programs are duals of each other, which proves the theorem.

The row player's objective function is $\max_p \min_q p^TAq$. Given some $p'$, the problem $\min_q ({p'}^TA)q$ subject to $\Sigma_j q_j = 1, q \geq 0$ looks as if it ranges over infinitely many possibilities for $q$. However, for a given $p'$, the column player always has an optimal pure strategy. The reason is that if the column player knows the row player's mixed strategy, he can simply choose the single column that minimizes the row player's expected payoff (any mixed $q$ only yields a weighted average over the columns, which cannot be smaller than the smallest column value). So we can write the row player's objective function as


\begin{displaymath}\max_p \left[ \min_j \Sigma_{i=1}^{n}A_{ij}p_i \right]\end{displaymath}
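The claim that a pure column is always a best reply can be illustrated on a small example. This is a sketch under assumptions: the $2 \times 2$ matrix, the strategy $p$, the helper names, and the grid of mixed replies are all chosen here for illustration and do not come from the lecture.

```python
from fractions import Fraction


def column_payoffs(p, A):
    """Row player's expected payoff against each pure column j: sum_i A_ij * p_i."""
    return [sum(p[i] * A[i][j] for i in range(len(p)))
            for j in range(len(A[0]))]


def payoff_against(p, A, q):
    """Row player's expected payoff against a mixed q: weighted average of the columns."""
    return sum(c * qj for c, qj in zip(column_payoffs(p, A), q))


# Example (assumed) payoff matrix and row strategy.
A = [[3, -1],
     [-2, 1]]
p = [Fraction(1, 2), Fraction(1, 2)]

best_pure = min(column_payoffs(p, A))

# Mixed replies q = (k/10, 1 - k/10) never push the payoff below the best pure column.
mixed = [payoff_against(p, A, [Fraction(k, 10), 1 - Fraction(k, 10)])
         for k in range(11)]
print(best_pure, min(mixed))
```

The two printed values agree, which is why the inner minimum over $q$ can be replaced by a minimum over columns $j$.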

Now, we can write the row player's problem as a linear program as follows:

$\max Z$ subject to

$Z - \Sigma_{i=1}^{n}A_{ij}p_i \leq 0 \quad \forall j$
$\Sigma_{i=1}^{n} p_i = 1$
$p \geq 0$

Similarly, the column player's problem is formulated as a linear program as follows:

$\min W$ subject to

$W - \Sigma_{j=1}^{n}A_{ij}q_j \geq 0 \quad \forall i$
$\Sigma_{j=1}^{n} q_j = 1$
$q \geq 0$
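For a $2 \times 2$ game the two optimal values can be approximated without an LP solver, since each player's mixed strategy is described by a single number. The sketch below is illustrative only: the matrix is an assumed example, the names are hypothetical, and a grid search over probabilities $k/700$ stands in for solving the linear programs (the grid is chosen so that it contains this particular game's optimal strategies).

```python
from fractions import Fraction

# Example (assumed) payoff matrix for the row player.
A = [[3, -1],
     [-2, 1]]
GRID = [Fraction(k, 700) for k in range(701)]   # candidate probabilities


def row_guarantee(x):
    """Largest feasible Z for p = (x, 1 - x): the worst column for the row player."""
    p = (x, 1 - x)
    return min(p[0] * A[0][j] + p[1] * A[1][j] for j in range(2))


def col_guarantee(y):
    """Smallest feasible W for q = (y, 1 - y): the best row against q."""
    q = (y, 1 - y)
    return max(A[i][0] * q[0] + A[i][1] * q[1] for i in range(2))


best_x = max(GRID, key=row_guarantee)
best_y = min(GRID, key=col_guarantee)
Z = row_guarantee(best_x)
W = col_guarantee(best_y)
print(best_x, Z)
print(best_y, W)
```

For larger games this brute-force search is impractical, and a proper LP solver would be used instead.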

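Now, if we take $e^*$ to be the all-ones vector, we can write this pair of linear programs side by side (row player on the left, column player on the right):

\begin{displaymath}
\begin{array}{ll}
\max\ 1 \cdot Z + 0 \cdot p & \min\ 1 \cdot W + 0 \cdot q \\
Z \cdot e^* + p(-A) \leq 0 & W \cdot e^* + (-A)q \geq 0 \\
Z \cdot 0 + p \cdot e^* = 1 & W \cdot 0 + e^* \cdot q = 1 \\
p \geq 0 & q \geq 0
\end{array}
\end{displaymath}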

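The generic primal/dual pair is described as follows ($\overline{x}$ and $y$ are unrestricted in sign):

\begin{displaymath}
\begin{array}{ll}
\mbox{Primal} & \mbox{Dual} \\
\min\ cx + \overline{c}\,\overline{x} & \max\ yb + y'b' \\
x \geq 0 & y' \geq 0 \\
Mx+\overline{M}\,\overline{x} = b & yM + y'M' \leq c \\
M'x+\overline{M}'\overline{x} \geq b' & y\overline{M} + y'\overline{M}' = \overline{c}
\end{array}
\end{displaymath}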

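Thus, the linear programs for the row and column player strategies are instantiations of this generic form, with the column player's program as the primal ($x = q$, $\overline{x} = W$) and the row player's program as the dual ($y = Z$, $y' = p$), under the substitution {$b'=0$, $b=1$, $c=0$, $\overline{c}=1$, $M=e^*$, $\overline{M}=0$, $M'=-A$, $\overline{M}'=e^*$}. By strong duality the two optimal values coincide, $\max Z = \min W$, which is the statement of the theorem.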

Extensive Form Games

Modelling Chess

$B(s)$ = set of moves black can make.

A game is then a path along a tree of moves: the board position sits at each node, and all the possible next moves are its children. The discussion on extensive form games spills over to the next lecture.
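The tree of moves can be sketched for a game far smaller than chess. Everything in the example below is an assumption made for illustration: the toy game (players alternately remove 1 or 2 tokens from a pile), the dictionary layout of a node, and the names `moves`, `build_tree` and `count_games`.

```python
# Illustrative sketch of a move tree for a hypothetical toy game, not chess.

def moves(position):
    """Legal moves from a position: remove 1 or 2 tokens from the pile."""
    return [m for m in (1, 2) if m <= position]


def build_tree(position, to_move="white"):
    """Node = position plus one child per legal move; players alternate."""
    other = "black" if to_move == "white" else "white"
    return {
        "position": position,
        "to_move": to_move,
        "children": [build_tree(position - m, other) for m in moves(position)],
    }


def count_games(node):
    """Number of root-to-leaf paths, i.e. distinct complete games."""
    if not node["children"]:
        return 1
    return sum(count_games(child) for child in node["children"])


tree = build_tree(4)
print(count_games(tree))
```

Even this tiny game branches quickly; for chess the same construction applies in principle, but the tree is far too large to build explicitly.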

$\vdots$



Satyavarta 2002-09-14