Lecture 4
Braess' Paradox, and more on Mixed Strategy

6.8L transcribed by Satyavarta

Introduction

In this lecture, we will discuss Braess' Paradox, mixed strategy Nash Equilibria (NE) in two-player zero-sum games, the Min-Max theorem, and Extensive Form Games.

Braess' Paradox

Consider the situation shown in the figure. Two cities are connected by two roads, each passing through a different intermediate point. Each route consists of one short but narrow segment and one wide but long segment.

$\includegraphics[]{figure.1.eps}$

Let $X_i$ denote the traffic (cars/sec) on path $i$. The delay on each road depends on the traffic it carries, as marked in the figure.

Let the total traffic between the cities be 6 cars/sec.

The delay on the two possible paths may then be expressed as

$D_{path_1} = 6X_1 + 33$
$D_{path_2} = 6X_2 + 33$

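In equilibrium no driver can gain by switching paths, so the two delays must be equal. Together with $X_1 + X_2 = 6$, this dictates the following distribution of traffic

$X_1 = X_2 = 3$

$D_{path_1} = D_{path_2} = 6 \cdot 3 + 33 = 51$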
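The equilibrium split can be checked with a few lines of Python. This is only a sketch that evaluates the two path-delay formulas above; the function names are illustrative and not part of the lecture.

```python
from fractions import Fraction

def path_delay(x):
    """Delay on either path when it carries x cars/sec: D = 6x + 33."""
    return 6 * x + 33

def equal_delay_split(total):
    """Split `total` cars/sec so that both paths have the same delay.

    Both paths have the same delay formula, so 6*x1 + 33 == 6*x2 + 33
    together with x1 + x2 == total forces x1 == x2 == total / 2.
    """
    x1 = Fraction(total, 2)
    x2 = total - x1
    assert path_delay(x1) == path_delay(x2)
    return x1, x2

x1, x2 = equal_delay_split(6)
print(x1, x2, path_delay(x1))  # 3 3 51
```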

Suppose a wide road with constant delay is built between the two intermediate points.

$\includegraphics[]{figure.2.eps}$

For an individual driver, the route over the new road is then cheaper than either original path for all possible values of $X \in [0,6]$ .

Then all cars take the route over the new road. But this raises the delay of every car to 31+31=62. Thus, if every individual acts greedily, the average delay for every individual may increase. The objective that society seeks to minimize is the total delay, so that the scheme is beneficial to everyone. Formally, this objective function is

minimize $\sum_{p} X_p \, d(p)$ (total delay)
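The effect on total delay can be illustrated numerically. The per-road delay functions in the figure are not recoverable from the text, so the sketch below ASSUMES hypothetical ones: 5x + 1 for a narrow road, x + 32 for a wide road, and zero delay on the new road. They are chosen only because they reproduce the path delay 6X + 33 and the 31+31=62 figure quoted above; the lecture's actual functions may differ.

```python
# ASSUMED delay functions (hypothetical; the figure is missing from the notes).
def narrow(x):
    """Short but narrow road carrying x cars/sec."""
    return 5 * x + 1

def wide(x):
    """Wide but long road carrying x cars/sec."""
    return x + 32

NEW_ROAD_DELAY = 0  # assumed constant delay of the new road

def path_delays(f1, f2, f3):
    """Delays of path 1, path 2, and the route narrow -> new road -> narrow.

    f1, f2 are the flows on the two original paths, f3 the flow over the
    new road. Each narrow road is shared with the new route.
    """
    first_narrow = narrow(f1 + f3)
    second_narrow = narrow(f2 + f3)
    d1 = first_narrow + wide(f1)
    d2 = wide(f2) + second_narrow
    d3 = first_narrow + NEW_ROAD_DELAY + second_narrow
    return d1, d2, d3

def total_delay(f1, f2, f3):
    """Objective sum_p X_p * d(p) for the three routes."""
    d1, d2, d3 = path_delays(f1, f2, f3)
    return f1 * d1 + f2 * d2 + f3 * d3

print(path_delays(3, 3, 0), total_delay(3, 3, 0))  # (51, 51, 32) 306
print(path_delays(0, 0, 6), total_delay(0, 0, 6))  # (63, 63, 62) 372
```

Under these assumptions, at the old split (3, 3, 0) the new route looks much cheaper (32 against 51), so drivers switch. Once all six use it, nobody gains by switching back (63 against 62), yet the total delay has risen from 306 to 372.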

Every individual can be charged an amount equivalent to the damage he causes to society. Then the system has a desirable equilibrium. In the example here, a toll could be levied on each car using the new road to increase its effective cost, so that the equilibrium occurs at the minimal average delay.

The cost is dependent on the individual. We shall soon study a theory of this behaviour, Utility Theory.

Mixed Strategy Nash Equilibrium in two-player zero-sum games

A two-player zero-sum game proceeds as follows:

There are two players, denoted as row player and column player.
Each player makes a move simultaneously, chosen from corresponding sets of strategies known to both players.
Then, a payoff is determined based on the pair of moves made.
For this to be a zero-sum game,
payoff of row player = -(payoff of column player)

Playing a mixed strategy means that the moves made by the players (in the second step above) are not made deterministically. Rather, each strategy is a probability distribution over rows (columns)

$p = (p_1, p_2, \ldots, p_n)$

where $p_i$ is the probability of choosing the $i^{th}$ row (column).

Since exactly one row (column) may be selected (``played''),

$\Sigma_{i}p_i = 1, \; p \geq 0$ .

This is in contrast to a pure strategy, where $p_i = 1$ for exactly one $i$ and $p_j = 0$ for all others.

Hence, the following symbols may be defined for further analysis of this game.

A : Payoff matrix
p : mixed strategy for row player
q : mixed strategy for column player

The payoff with such a scheme is $a_{ij}$ for the row player ( $-a_{ij}$ for the column player) when row $i$ is selected by the row player and column $j$ by the column player. The expected payoff for a player is thus the sum of the payoffs, each multiplied by the probability of that payoff. Thus,

Expected payoff to row player = $\Sigma_{i,j}p_iq_ja_{ij} = p^TAq$

Then, since we are considering a zero-sum game, the expected payoffs for the players are

row player: $p^TAq$
column player: $-p^TAq$
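The expected payoff formula can be evaluated directly. The sketch below is illustrative only: the payoff matrix (matching pennies) and the function names are assumptions, not taken from the lecture.

```python
def is_mixed_strategy(p, tol=1e-9):
    """True if p is a probability distribution: p >= 0 and sum(p) == 1."""
    return all(x >= 0 for x in p) and abs(sum(p) - 1) <= tol

def expected_payoff(A, p, q):
    """Expected payoff to the row player: sum over i, j of p_i * q_j * a_ij."""
    assert is_mixed_strategy(p) and is_mixed_strategy(q)
    return sum(p[i] * q[j] * A[i][j]
               for i in range(len(p))
               for j in range(len(q)))

# Hypothetical example: matching pennies (row player wins on a match).
A = [[1, -1],
     [-1, 1]]

print(expected_payoff(A, [0.5, 0.5], [0.5, 0.5]))  # 0.0
print(expected_payoff(A, [1, 0], [0, 1]))          # -1 (two pure strategies)
```

The column player's expected payoff is simply the negative of the returned value.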

Each player wants to maximize his profit, and since he does not know what strategy the other one is using, he wants to maximize the expected profit he is guaranteed irrespective of what the opponent plays. Hence the row player wants to choose $p'$ to maximize $\min_q {p'}^TAq$ , and the column player wants to choose $q'$ to minimize $\max_p p^TAq'$ .

For any pair of strategies $(p', q')$ ,

$\min_q {p'}^TAq \leq {p'}^TAq' \leq \max_p p^TAq'$

Now if a saddle point $(p^*, q^*)$ of $p^TAq$ exists (where $p$ and $q$ are the vectors of $p_i$'s and $q_j$'s), then by definition of a saddle point,

$\forall p, q \hspace*{30pt} p^TAq^* \leq {p^*}^TAq^* \leq {p^*}^TAq$

This is also a Nash Equilibrium: by deviating unilaterally, the row player cannot raise his payoff above ${p^*}^TAq^*$ , and the column player cannot push the row player's payoff below it.
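The saddle point condition can be tested mechanically. Since $p^TAq$ is linear in $p$ for fixed $q$ (and in $q$ for fixed $p$), it is enough to check deviations to pure strategies. The sketch below relies on that observation; the matrix and function names are illustrative assumptions.

```python
from fractions import Fraction

def payoff(A, p, q):
    """Row player's expected payoff p^T A q."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(A))
               for j in range(len(A[0])))

def is_saddle_point(A, p_star, q_star):
    """Check p^T A q* <= p*^T A q* <= p*^T A q against all pure deviations."""
    m, n = len(A), len(A[0])
    v = payoff(A, p_star, q_star)
    # Row player deviating to pure row i must not earn more than v.
    row_ok = all(sum(A[i][j] * q_star[j] for j in range(n)) <= v
                 for i in range(m))
    # Column player deviating to pure column j must not push the payoff below v.
    col_ok = all(sum(p_star[i] * A[i][j] for i in range(m)) >= v
                 for j in range(n))
    return row_ok and col_ok

# Hypothetical example: matching pennies.
A = [[1, -1],
     [-1, 1]]
half = Fraction(1, 2)

print(is_saddle_point(A, [half, half], [half, half]))  # True
print(is_saddle_point(A, [1, 0], [1, 0]))              # False
```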

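Assuming rational behaviour from the players, so that each plays the best strategy available to him, the guaranteed payoffs to the players are

row player $V_r = \max_p \min_q p^TAq = {p^*}^TAq_1$

column player $V_c = \min_q \max_p p^TAq = p_1^TAq^*$

Here $p^*$ attains the outer maximum and $q_1$ is the column player's best response to it; likewise $q^*$ attains the outer minimum and $p_1$ is the row player's best response to it.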

Property:

$V_r \leq V_c$

Proof: Since $q_1$ minimizes ${p^*}^TAq$ over $q$ and $p_1$ maximizes $p^TAq^*$ over $p$ , we may write

$V_r = {p^*}^TAq_1 \leq {p^*}^TAq^*$

$V_c = p_1^TAq^* \geq {p^*}^TAq^*$

Hence,

$V_r \leq V_c$
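The same inequality already holds when both players are restricted to pure strategies, which is easy to compute. The sketch below makes that restriction explicitly (it is an assumption of the example, not the mixed-strategy setting of the property above); the matrices are hypothetical.

```python
def pure_maximin(A):
    """Best payoff the row player can guarantee with a pure row."""
    return max(min(row) for row in A)

def pure_minimax(A):
    """Smallest cap the column player can enforce with a pure column."""
    return min(max(col) for col in zip(*A))

pennies = [[1, -1],
           [-1, 1]]      # no saddle point in pure strategies
dominant = [[4, 2],
            [3, 1]]      # entry 2 (row 0, column 1) is a pure saddle point

print(pure_maximin(pennies), pure_minimax(pennies))    # -1 1
print(pure_maximin(dominant), pure_minimax(dominant))  # 2 2
```

In the first game the two values differ, so no pure saddle point exists; in the second they coincide.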

We will use this property to prove the following property.

Property: A saddle point (and hence a Nash Equilibrium) exists if and only if the guaranteed payoffs of the two players are equal.

$(p^*, q^*)$ is a Saddle Point $\Longleftrightarrow V_r = V_c$

Proof:

$\Rightarrow$ Suppose $(p^*, q^*)$ is a Saddle Point. Then,
$V_r = \max_p \min_q p^TAq = {p^*}^TAq^*$
$V_c = \min_q \max_p p^TAq = {p^*}^TAq^*$
$\Rightarrow$ $V_r = V_c$
$\Leftarrow$ Suppose $V_r = V_c$ .
From the above property,
$V_r \leq {p^*}^TAq^* \leq V_c$
$\Rightarrow$ $V_r = V_c = {p^*}^TAq^*$
Then for all $p, q$ : $p^TAq^* \leq V_c = {p^*}^TAq^* = V_r \leq {p^*}^TAq$ , which is the saddle point condition.
Hence, the proof.

Min-Max Theorem

A relevant question to ask in such a situation is ``Does a Saddle Point always exist?'' The Min-Max Theorem says, yes, it does.

$\max_p \min_q p^TAq = \min_q \max_p p^TAq$

Proof: The proof of this theorem follows the concept of duality in Linear Programming, wherein the two sides behave as duals of each other. (The proof I illustrate here is taken from the scribe notes by Tugkan Batu, ORIE630 Mathematical Programming Fall 1998, of lectures by Jon Kleinberg.)

We write two linear programs, one for each player with expected payoff being the objective function, and show that this pair of linear programs are actually duals of each other which will prove the theorem.

The row player's objective function is $\max_p \min_q p^TAq$ . Given some $p$ , the problem $\min_q p^TAq$ (subject to $\Sigma_jq_j = 1 , q \geq 0$ ) looks infinite, because of the infinitely many possibilities of $q$ . However, for a given $p$ , there is an optimal pure strategy for the column player. The reason is that if the column player knows the row player's mixed strategy, he can choose the single column that is best for him, i.e. the one that minimizes the row player's expected payoff (instead of getting a weighted average over all columns). So now we can write the row player's objective function as

$\max_p \left[ \min_j \Sigma_{i=1}^m A_{ij}p_i \right]$
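The reduction to pure columns can be sanity-checked numerically. The sketch below computes the inner minimum over pure columns and confirms, on a grid of mixed strategies $q$ , that no mixture does better for the column player. The matrix, the strategy $p$ and the grid are arbitrary illustrative choices.

```python
from fractions import Fraction

def payoff(A, p, q):
    """Row player's expected payoff p^T A q."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(A))
               for j in range(len(A[0])))

def row_guarantee(A, p):
    """min_j sum_i A_ij p_i: the row player's payoff against the best pure column."""
    m, n = len(A), len(A[0])
    return min(sum(A[i][j] * p[i] for i in range(m)) for j in range(n))

# Hypothetical 2x2 game and row strategy.
A = [[3, -1],
     [-2, 1]]
p = [Fraction(1, 2), Fraction(1, 2)]

g = row_guarantee(A, p)
# Mixed column strategies q = (k/10, 1 - k/10) never beat the best pure column.
grid = [[Fraction(k, 10), 1 - Fraction(k, 10)] for k in range(11)]
assert all(payoff(A, p, q) >= g for q in grid)
print(g)  # 0
```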

Now, we can write the row player's problem as a linear program as follows:

max $Z$ subject to
$Z - \Sigma_{i=1}^{m}A_{ij}p_i \leq 0 \;\; \forall j$
$\Sigma_{i=1}^{m} p_i = 1$
$p \geq 0$

Similarly, the column player's problem is formulated as a linear program as follows:

min $W$ subject to
$W - \Sigma_{j=1}^{n}A_{ij}q_j \geq 0 \;\; \forall i$
$\Sigma_{j=1}^{n} q_j = 1$
$q \geq 0$
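As a worked instance, the two programs can be written out for a small game. The payoff matrix below is hypothetical, chosen only for illustration and not taken from the lecture.

```latex
% Hypothetical payoff matrix (illustration only)
A = \begin{pmatrix} 3 & -1 \\ -2 & 1 \end{pmatrix}

% Row player's program
\max Z \quad \text{s.t.} \quad
Z - (3p_1 - 2p_2) \leq 0, \quad
Z - (-p_1 + p_2) \leq 0, \quad
p_1 + p_2 = 1, \quad p \geq 0

% Column player's program
\min W \quad \text{s.t.} \quad
W - (3q_1 - q_2) \geq 0, \quad
W - (-2q_1 + q_2) \geq 0, \quad
q_1 + q_2 = 1, \quad q \geq 0
```

For this matrix, $p = (3/7, 4/7)$ makes both row-player constraints tight at $Z = 1/7$ , and $q = (2/7, 5/7)$ makes both column-player constraints tight at $W = 1/7$ .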

Now, if we take $e$ to be the all-ones vector, then we can write this pair of linear programs as follows:

max $Z$	min $W$
$Ze^T + p^T(-A) \leq 0$	$We + (-A)q \geq 0$
$p^Te = 1$	$e^Tq = 1$
$p \geq 0$	$q \geq 0$

The generic primal/dual pair is described as

Primal	Dual
min $cx + \overline{c}\overline{x}$	max $yb + y'b'$
$x \geq 0$	$y' \geq 0$
$Mx+\overline{M}\overline{x} = b$	$yM + y'M' \leq c$
$M'x+\overline{M}'\overline{x} \geq b'$	$y\overline{M} + y'\overline{M}' = \overline{c}$

Thus, the linear programs for the column and row player strategies are instantiations of the primal and the dual of this generic form, respectively, with $x = q$ , $\overline{x} = W$ , $y = Z$ , $y' = p^T$ and the substitution { $b=1$ , $b'=0$ , $c=0$ , $\overline{c}=1$ , $M=e^T$ , $\overline{M}=0$ , $M'=-A$ , $\overline{M}'=e$ }. By strong LP duality their optimal values are equal, which proves the theorem.
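The equality of the two optimal values can be checked exactly on a small game. The sketch below does NOT use a general LP solver. For a 2x2 game it exploits the fact that the row player's worst-case payoff is a concave piecewise-linear function of one probability, so its maximum lies at an endpoint or at a crossing of two column lines. The matrix and function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def solve_two_row_game(A):
    """For a 2-row game, return (value, p) where p is the probability of
    row 0 that maximizes min_j [p*A[0][j] + (1-p)*A[1][j]]."""
    n = len(A[0])

    def worst_case(p):
        return min(p * A[0][j] + (1 - p) * A[1][j] for j in range(n))

    # Candidate optima: the endpoints and every crossing of two column lines.
    candidates = [Fraction(0), Fraction(1)]
    for j, k in combinations(range(n), 2):
        slope_diff = (A[0][j] - A[1][j]) - (A[0][k] - A[1][k])
        if slope_diff != 0:
            p = Fraction(A[1][k] - A[1][j], slope_diff)
            if 0 <= p <= 1:
                candidates.append(p)
    best = max(candidates, key=worst_case)
    return worst_case(best), best

def solve_column_player(A):
    """For a 2-column game, return (W, q): min over q of max_i (A q)_i.

    Minimizing max_i (A q)_i equals maximizing min_i (-A q)_i, which is
    the row player's problem on the negated transpose of A.
    """
    B = [[-A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
    neg_value, q = solve_two_row_game(B)
    return -neg_value, q

# Hypothetical 2x2 game.
A = [[3, -1],
     [-2, 1]]
Z, p = solve_two_row_game(A)
W, q = solve_column_player(A)
print(Z, p, W, q)  # 1/7 3/7 1/7 2/7
```

Both optimal values equal 1/7, as the theorem predicts. Larger games would need a proper LP solver.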

Extensive Form Games

Modelling Chess

At each turn there is a set of moves black can make (and likewise for white).

The successive moves of the two opponents trace a path along a tree of moves, with the board position at each node and all the possible next moves as its children. The discussion on extensive form games spills over to the next lecture.
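The tree structure described above can be sketched as a small data structure. The toy positions below are placeholders and not real chess positions; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A board position together with the positions reachable in one move."""
    position: str
    children: list = field(default_factory=list)

def all_plays(node, prefix=()):
    """Enumerate every root-to-leaf path, i.e. every complete play of the game."""
    path = prefix + (node.position,)
    if not node.children:
        return [path]
    plays = []
    for child in node.children:
        plays.extend(all_plays(child, path))
    return plays

# Toy tree: white moves first (w1 or w2), then black replies.
root = Node("start", [
    Node("w1", [Node("w1-b1"), Node("w1-b2")]),
    Node("w2", [Node("w2-b1")]),
])

for play in all_plays(root):
    print(" -> ".join(play))
```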
