7 Correlated equilibrium
Keywords: correlated equilibrium, coarse correlated equilibrium
“If there is intelligent life on other planets, in a majority of them, they would have discovered correlated equilibrium before Nash equilibrium.” – Roger Myerson
Consider the following “traffic stop” game, where each cell shows the payoff of the row player (player 1) followed by the payoff of the column player (player 2):
|                   | \(\mathsf{Stop}\) | \(\mathsf{Go}\)  |
|-------------------|-------------------|------------------|
| \(\mathsf{Stop}\) | \(0, 0\)          | \(0, 1\)         |
| \(\mathsf{Go}\)   | \(1, 0\)          | \(-100, -100\)   |
There are two pure strategy Nash equilibria: \((\mathsf{Stop}, \mathsf{Go})\) and \((\mathsf{Go}, \mathsf{Stop})\). In both equilibria, one player gets a payoff of \(1\) and the other gets a payoff of \(0\).
In addition, there is a symmetric mixed strategy Nash equilibrium in which each player plays \(\mathsf{Stop}\) with probability \(\tfrac{100}{101}\) and \(\mathsf{Go}\) with probability \(\tfrac{1}{101}\). The mixed strategy equilibrium seems worse for both players: on average each gets a payoff of \(0\) while risking a huge penalty of \(-100\).
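To see where these probabilities come from, suppose player 2 plays \(\mathsf{Stop}\) with probability \(q\) (the symbol \(q\) is introduced only for this derivation). Player 1 is indifferent between stopping and going when \[ 0 = q \cdot 1 + (1 - q)(-100), \] i.e., when \(q = \tfrac{100}{101}\); by symmetry the same condition applies to player 2.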
The mixed strategy Nash equilibrium induces the following probability distribution on the action profiles (entries are rounded):

|                   | \(\mathsf{Stop}\) | \(\mathsf{Go}\) |
|-------------------|-------------------|-----------------|
| \(\mathsf{Stop}\) | \(0.980\)         | \(0.010\)       |
| \(\mathsf{Go}\)   | \(0.010\)         | \(0.0001\)      |
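These probabilities are simply the product of the two players' independent randomizations. The short Python sketch below reproduces them and the average payoff of \(0\); the dictionary encoding of the game is an illustrative choice and not part of the original notes.

```python
from fractions import Fraction

# Mixed strategy of each player: P(Stop) = 100/101, P(Go) = 1/101.
p_stop = Fraction(100, 101)
p = {"Stop": p_stop, "Go": 1 - p_stop}

# Player 1's payoffs in the traffic stop game (player 2's are symmetric).
u1 = {("Stop", "Stop"): 0, ("Stop", "Go"): 0,
      ("Go", "Stop"): 1, ("Go", "Go"): -100}

# Independent mixing induces a product distribution over action profiles.
joint = {(a1, a2): p[a1] * p[a2] for a1 in p for a2 in p}
expected_u1 = sum(joint[s] * u1[s] for s in joint)

for profile, prob in joint.items():
    print(profile, float(prob))
print("Expected payoff of player 1:", expected_u1)  # prints 0
```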
A better solution is to randomize between the two pure strategy Nash equilibria, i.e., use the following probability distribution over the action profiles:
|                   | \(\mathsf{Stop}\) | \(\mathsf{Go}\) |
|-------------------|-------------------|-----------------|
| \(\mathsf{Stop}\) | \(0\)             | \(0.5\)         |
| \(\mathsf{Go}\)   | \(0.5\)           | \(0\)           |
This is consistent with how we resolve such conflicts in real life: a traffic light tells each driver whether to stop or go. Once the traffic light makes a joint recommendation to both players, it is in each player's best interest to follow that recommendation.
It is not always possible to achieve such “coordination” via mixed strategies. The reason is that, in mixed strategies, the players randomize independently, so joint distributions of the form above cannot be achieved.
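To make this concrete, suppose player 1 plays \(\mathsf{Stop}\) with probability \(p\) and player 2 plays \(\mathsf{Stop}\) with probability \(q\) (with \(p\) and \(q\) introduced only for this argument). Any product distribution satisfies \[ \Pr(\mathsf{Stop},\mathsf{Stop}) \cdot \Pr(\mathsf{Go},\mathsf{Go}) = pq(1-p)(1-q) = \Pr(\mathsf{Stop},\mathsf{Go}) \cdot \Pr(\mathsf{Go},\mathsf{Stop}). \] The distribution above violates this identity: the left-hand side would be \(0 \times 0 = 0\) while the right-hand side would be \(0.5 \times 0.5 = 0.25\).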
7.2 Examples
7.2.1 A Hawk-Dove game
Consider the following variation of the “hawk-dove” game:
|                 | \(\mathsf{D}\) | \(\mathsf{H}\)  |
|-----------------|----------------|-----------------|
| \(\mathsf{D}\)  | \(2, 2\)       | \(0, 3\)        |
| \(\mathsf{H}\)  | \(3, 0\)       | \(-10, -10\)    |
We can verify that this game has two pure strategy Nash equilibria \((\mathsf{D}, \mathsf{H})\) and \((\mathsf{H}, \mathsf{D})\). In addition, there is a symmetric mixed strategy Nash equilibrium in which each player plays \(\mathsf{D}\) with probability \(\tfrac{10}{11}\) and \(\mathsf{H}\) with probability \(\tfrac{1}{11}\), which has a payoff of \((\tfrac{20}{11}, \tfrac{20}{11})\).
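As before, the mixed equilibrium follows from an indifference condition: if player 2 plays \(\mathsf{D}\) with probability \(q\) (again introduced only for the derivation), player 1 is indifferent when \[ 2q = 3q - 10(1 - q), \] i.e., when \(q = \tfrac{10}{11}\), which gives each player an expected payoff of \(2 \cdot \tfrac{10}{11} = \tfrac{20}{11}\).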
\[ \def\D{\mathsf{D}} \def\H{\mathsf{H}} \]
We claim that the following is a correlated equilibrium for the game: \[ π = \begin{bmatrix} \frac {10}{12} & \frac{1}{12} \\ \frac{1}{12} & 0 \end{bmatrix} \]
We first verify the conditions in \eqref{eq:corr} and then verify the conditions in \eqref{eq:corr2}.
7.2.1.1 Verification of \eqref{eq:corr}
We consider the analysis from the point of view of player 1. By symmetry, the argument is the same for player 2.
Suppose the mediator recommends strategy \(\D\) to player~\(1\). The conditional payoff if player 1 follows the recommendation is \[ \frac{2 π(\D, \D) + 0 π(\D, \H)}{ π(\D, \D) + π(\D, \H) } = \frac{2 \frac{10}{12}}{\frac{11}{12}} = \frac{20}{11}. \] The player’s payoff if they deviate is \[ \frac{3 π(\D, \D) - 10 π(\D, \H)}{ π(\D, \D) + π(\D, \H) } = \frac{3 \frac{10}{12} - 10\frac{1}{12}}{\frac{11}{12}} = \frac{20}{11}. \] Thus, the player has no incentive to deviate.
Now suppose the mediator recommends strategy \(\H\) to player \(1\). Then the player knows that player \(2\) has received a recommendation of \(\D\). Since \((\H, \D)\) is a Nash equilibrium, there is no incentive to deviate.
7.2.1.2 Verification of \eqref{eq:corr2}
We consider each case separately:
\(i = 1\), \(a_1 = \D\), \(b_1 = \H\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\D)[ u_1(\D,\D) - u_1(\H,\D) ] + π(\D,\H)[ u_1(\D,\H) - u_1(\H,\H) ] \\ &= \frac{10}{12}[ 2 - 3 ] + \frac{1}{12}[ 0 + 10] \\ &= 0 \ge 0. \end{align*}\]
\(i = 1\), \(a_1 = \H\), \(b_1 = \D\): \[\begin{align*} \hskip 2em & \hskip -2em π(\H,\D)[ u_1(\H,\D) - u_1(\D,\D) ] + π(\H,\H)[ u_1(\H,\H) - u_1(\D,\H) ] \\ &= \frac{1}{12}[ 3 - 2 ] + 0 [ -10 + 0] \\ &= \frac{1}{12} \ge 0. \end{align*}\]
\(i = 2\), \(a_2 = \D\), \(b_2 = \H\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\D)[ u_2(\D,\D) - u_2(\D,\H) ] + π(\H,\D)[ u_2(\H,\D) - u_2(\H,\H) ] \\ &= \frac{10}{12}[ 2 - 3 ] + \frac{1}{12}[ 0 + 10] \\ &= 0 \ge 0. \end{align*}\]
\(i = 2\), \(a_2 = \H\), \(b_2 = \D\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\H)[ u_2(\D,\H) - u_2(\D,\D) ] + π(\H,\H)[ u_2(\H,\H) - u_2(\H,\D) ] \\ &= \frac{1}{12}[ 3 - 2 ] + 0 [ -10 + 0] \\ &= \frac{1}{12} \ge 0. \end{align*}\]
Thus in all cases, neither player has an incentive to deviate.
Note that the calculations for verifying \eqref{eq:corr2} are the same as the calculations in the numerators when verifying \eqref{eq:corr}.
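The verification above can also be carried out numerically. The Python sketch below is not part of the original notes; the dictionary encoding of the payoffs and the helper `deviation_gain` are choices made for illustration. For every recommended action and every possible deviation, it computes the expected gain from deviating; \(π\) is a correlated equilibrium exactly when every such gain is at most zero (these gains are the negatives of the quantities computed above).

```python
from fractions import Fraction
from itertools import product

A = ["D", "H"]  # actions

# Payoffs of player 1 in the hawk-dove game above; player 2's payoffs by symmetry.
u1 = {("D", "D"): 2, ("D", "H"): 0, ("H", "D"): 3, ("H", "H"): -10}
u2 = {(a1, a2): u1[(a2, a1)] for a1, a2 in u1}

# Candidate correlated equilibrium distribution.
pi = {("D", "D"): Fraction(10, 12), ("D", "H"): Fraction(1, 12),
      ("H", "D"): Fraction(1, 12), ("H", "H"): Fraction(0)}

def deviation_gain(u, player, a, b):
    """Expected gain for `player` from playing b whenever a is recommended
    (not normalized by the probability of a, which does not affect the sign)."""
    total = Fraction(0)
    for other in A:
        profile = (a, other) if player == 1 else (other, a)
        deviated = (b, other) if player == 1 else (other, b)
        total += pi[profile] * (u[deviated] - u[profile])
    return total

# pi is a correlated equilibrium iff every printed gain is <= 0.
for player, u in [(1, u1), (2, u2)]:
    for a, b in product(A, A):
        if a != b:
            print(player, a, "->", b, "gain:", deviation_gain(u, player, a, b))
```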
The social payoff (the sum of the two players' payoffs) under the correlated equilibrium is \[ 4 \cdot \frac{10}{12} + 3 \cdot \frac{1}{12} + 3 \cdot \frac{1}{12} = \frac{46}{12} \approx 3.83, \] which is higher than the social payoff of \(40/11 \approx 3.64\) under the mixed strategy Nash equilibrium but lower than the team optimal payoff of \(4\).
Notes
The notion of correlated equilibrium is due to Aumann (1974). Also see Aumann (1987). See Amir et al. (2017) for a discussion of correlated equilibria in \(2×2\) games. Some of the discussion in this section is adapted from Maschler et al. (2020).

See Papadimitriou and Roughgarden (2008) for algorithmic aspects of computing correlated equilibria.