7  Correlated equilibrium

Updated

January 2, 2025

Keywords

correlated equilibrium, coarse correlated equilibrium

“If there is intelligent life on other planets, in a majority of them, they would have discovered correlated equilibrium before Nash equilibrium.” – Roger Myerson

Consider the following “traffic stop” game:

|                   | \(\mathsf{Stop}\) | \(\mathsf{Go}\)  |
|-------------------|-------------------|------------------|
| \(\mathsf{Stop}\) | \((0, 0)\)        | \((0, 1)\)       |
| \(\mathsf{Go}\)   | \((1, 0)\)        | \((-100, -100)\) |

There are two pure strategy Nash equilibria: \((\mathsf{Stop}, \mathsf{Go})\) and \((\mathsf{Go}, \mathsf{Stop})\). In both equilibria, one player gets a payoff of \(1\) and the other gets a payoff of \(0\).

In addition, there is a mixed strategy Nash equilibrium in which each player plays \(\mathsf{Stop}\) with probability \(\tfrac{100}{101}\) and \(\mathsf{Go}\) with probability \(\tfrac{1}{101}\). This mixed strategy equilibrium appears to be worse for both players: on average each gets a payoff of \(0\) while risking the large penalty of \(-100\).
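To see where these probabilities come from, suppose player \(2\) plays \(\mathsf{Stop}\) with probability \(q\). Player \(1\) is willing to randomize only if \(\mathsf{Stop}\) and \(\mathsf{Go}\) give the same expected payoff:
\[
0 = q \cdot 1 + (1 - q)(-100) \implies q = \tfrac{100}{101}.
\]
By symmetry, player \(1\) must also play \(\mathsf{Stop}\) with probability \(\tfrac{100}{101}\). At this equilibrium, player \(1\)’s expected payoff is \(\tfrac{100}{101} \cdot 0 + \tfrac{1}{101} \bigl( \tfrac{100}{101} \cdot 1 + \tfrac{1}{101} \cdot (-100) \bigr) = 0\), and likewise for player \(2\).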

The mixed strategy Nash equilibrium induces the following probability distribution on the action profiles

|                   | \(\mathsf{Stop}\)                     | \(\mathsf{Go}\)                       |
|-------------------|---------------------------------------|---------------------------------------|
| \(\mathsf{Stop}\) | \(\tfrac{10000}{10201} \approx 0.98\) | \(\tfrac{100}{10201} \approx 0.01\)   |
| \(\mathsf{Go}\)   | \(\tfrac{100}{10201} \approx 0.01\)   | \(\tfrac{1}{10201} \approx 0.0001\)   |
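For readers who prefer a numerical check, here is a minimal sketch (plain Python with NumPy; the variable names are ours) that builds the induced product distribution and confirms that both expected payoffs are zero:

```python
import numpy as np

# Payoff matrices of the traffic stop game; rows and columns are (Stop, Go).
u1 = np.array([[0, 0], [1, -100]])    # row player
u2 = np.array([[0, 1], [0, -100]])    # column player

# Mixed strategy Nash equilibrium: each player stops with probability 100/101.
sigma = np.array([100 / 101, 1 / 101])

# Independent randomization induces a product distribution on action profiles.
joint = np.outer(sigma, sigma)
print(joint.round(4))                 # approximately [[0.9803, 0.0098], [0.0098, 0.0001]]

# Expected payoffs under this distribution (both are 0 up to rounding).
print(round((joint * u1).sum(), 10), round((joint * u2).sum(), 10))
```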

A better solution is to do a randomization between the pure Nash equilibrium strategies, i.e., use the following probability distribution on the action profiles:

|                   | \(\mathsf{Stop}\) | \(\mathsf{Go}\) |
|-------------------|-------------------|-----------------|
| \(\mathsf{Stop}\) | \(0\)             | \(0.5\)         |
| \(\mathsf{Go}\)   | \(0.5\)           | \(0\)           |

This is consistent with how we resolve such conflicts in real life: a traffic light tells each driver whether to go or to stop. Once the traffic light makes a joint recommendation to both players, it is in each player’s best interest to follow that recommendation.

It is not always possible to achieve such “coordination” via mixed strategies. The reason is that in mixed strategies the players randomize independently, so a joint distribution of the form above cannot be achieved: if independent randomizations put positive probability on both \((\mathsf{Stop}, \mathsf{Go})\) and \((\mathsf{Go}, \mathsf{Stop})\), then each player plays both actions with positive probability, and therefore \((\mathsf{Stop}, \mathsf{Stop})\) and \((\mathsf{Go}, \mathsf{Go})\) must also receive positive probability.

7.1 Correlated equilibrium

One way to interpret correlated equilibrium is as follows:

  • There is an omniscient observer (who may be viewed as an information designer) who recommends strategies to the players.
  • The observer randomizes to choose his recommendations based on a probability distribution that is known to the players.
  • The recommendations are private: each player knows only the recommendation addressed to him.
  • The mechanism is common knowledge among the players: each player knows that this mechanism is being used; each player knows that the other players know that this mechanism is being used, and so on.

Now consider a strategic game \(\mathscr{G} = \langle \ALPHABET N, (\ALPHABET A_i)_{i \in \ALPHABET N}, (u_i)_{i \in \ALPHABET N} \rangle\). Let \(\ALPHABET A = \prod_{i \in \ALPHABET N} \ALPHABET A_i\) be the strategy space of all players. Consider a probability distribution \(π\) over \(\ALPHABET A\). Define an extensive form game \(Γ(π)\) with imperfect information as follows:

  • An outside observer (i.e., the information designer) probabilistically chooses an action profile \(a \in \ALPHABET A\) according to \(π\).
  • The observer reveals the coordinate \(a_i\) (but not \(a_{-i}\)) to each player \(i \in \ALPHABET N\). We may interpret this as the observer recommending strategy \(a_i\) to player \(i\).
  • The recommendation of the observer is non-binding. Each player chooses an action \(b_i \in \ALPHABET A_i\), where \(b_i\) may be different from \(a_i\).
  • The payoff of each player \(i\) is \(u_i(b_1, \dots, b_n)\).

Definition 7.1 A (pure) strategy of player \(i\) in game \(Γ(π)\) is a function \(s_i \colon \ALPHABET A_i \to \ALPHABET A_i\), mapping every recommendation \(a_i\) of the observer to an action \(s_i(a_i)\).

A correlated strategy \(π\) is a joint probability distribution over the pure strategy profiles (i.e., the action profiles \(\ALPHABET A\)) of the game. For example, consider any \(2×2\) two-player game. Then, a correlated strategy is of the form \[ π = \begin{bmatrix} π_{11} & π_{12} \\ π_{21} & π_{22} \end{bmatrix}, \quad π_{ij} \ge 0, \quad \sum_{i,j} π_{ij} = 1. \]

Definition 7.2 Given a strategic game \(\mathscr{G} = \langle \ALPHABET N, (\ALPHABET A_i)_{i \in \ALPHABET N}, (u_i)_{i \in \ALPHABET N} \rangle\), a correlated equilibrium is a correlated strategy \(π\) such that, for every player \(i \in \ALPHABET N\) and every recommendation \(a_i \in \ALPHABET A_i\) with \(\PR^{π}(A_i = a_i) > 0\), \[\begin{equation}\label{eq:corr} \EXP^{π}[ u_i(a_i, A_{-i}) \mid A_i = a_i ] \ge \EXP^{π}[ u_i(b_i, A_{-i}) \mid A_i = a_i ] , \quad \forall b_i \in \ALPHABET A_i. \end{equation}\]

Remark

Note that Nash equilibria are a special case of correlated equilibria in which the mediator recommends actions via independent randomizations. Since every finite game has a mixed strategy Nash equilibrium, correlated equilibria always exist.

It can also be shown that any convex combination of Nash equilibria is a correlated equilibrium.
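Here is a quick sketch of why the first claim holds. If \(π\) is the product distribution \(σ_1 \otimes \dots \otimes σ_n\) induced by a mixed strategy Nash equilibrium \((σ_1, \dots, σ_n)\), then conditioning on \(A_i = a_i\) reveals nothing about the other players’ recommendations, so
\[
\EXP^{π}[ u_i(b_i, A_{-i}) \mid A_i = a_i ]
= \sum_{a_{-i}} σ_{-i}(a_{-i})\, u_i(b_i, a_{-i})
= u_i(b_i, σ_{-i}),
\]
and \eqref{eq:corr} reduces to the requirement that every action \(a_i\) in the support of \(σ_i\) is a best response to \(σ_{-i}\), which is exactly the Nash equilibrium condition.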

To understand this definition, we restrict attention to two player games. Consider a correlated strategy \(π\). Then the conditional distribution of \(A_2\) given \(A_1 = a_1\) is \[ \PR(A_2 = a_2 \mid A_1 = a_1) = \frac{π(a_1, a_2)}{\sum_{b_2 \in \ALPHABET A_2} π(a_1, b_2)}. \] Note that the denominator is the same on both sides of the inequality in \eqref{eq:corr}. So, multiplying both sides by it, we can rewrite \eqref{eq:corr} as \[\begin{equation}\label{eq:corr2} \sum_{a_j \in \ALPHABET A_j} π(a_i, a_j) \bigl[ u_i(a_i, a_j) - u_i(b_i, a_j) \bigr] \ge 0, \quad \forall a_i, b_i \in \ALPHABET A_i, \quad \forall i \in \{1,2\}, \end{equation}\] where \(j\) denotes the player other than \(i\).

Note that \eqref{eq:corr2} is a finite set of linear inequalities in the probabilities \(π(a)\), \(a \in \ALPHABET A\). Therefore, the set of all correlated equilibria is convex and can be obtained as the feasible region of a linear program.
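To make this concrete, here is a minimal sketch of such a linear program in Python using `scipy.optimize.linprog`; the helper name `max_welfare_ce` and the choice of maximizing the total payoff \(u_1 + u_2\) as the objective are our own (any linear objective over the feasible region would do).

```python
import numpy as np
from scipy.optimize import linprog

def max_welfare_ce(u1, u2):
    """Correlated equilibrium of a two-player game maximizing u1 + u2.

    u1 and u2 are m-by-n payoff matrices for the row and column player.
    Returns an m-by-n joint distribution pi over action profiles.
    """
    m, n = u1.shape
    A_ub, b_ub = [], []

    # Row player's incentive constraints from eq. (corr2):
    # for every recommendation a and deviation b,
    #   sum_j pi[a, j] * (u1[a, j] - u1[b, j]) >= 0,
    # written with the sign flipped since linprog uses A_ub @ x <= b_ub.
    for a in range(m):
        for b in range(m):
            if a != b:
                row = np.zeros((m, n))
                row[a, :] = u1[b, :] - u1[a, :]
                A_ub.append(row.ravel())
                b_ub.append(0.0)

    # Column player's incentive constraints, analogously.
    for a in range(n):
        for b in range(n):
            if a != b:
                col = np.zeros((m, n))
                col[:, a] = u2[:, b] - u2[:, a]
                A_ub.append(col.ravel())
                b_ub.append(0.0)

    # Probabilities are non-negative (bounds) and sum to one.
    A_eq, b_eq = [np.ones(m * n)], [1.0]

    # linprog minimizes, so negate the social welfare u1 + u2.
    c = -(u1 + u2).ravel()
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, 1))
    return res.x.reshape(m, n)
```

For the traffic stop game above, the optima of this linear program are exactly the distributions supported on \((\mathsf{Stop}, \mathsf{Go})\) and \((\mathsf{Go}, \mathsf{Stop})\); the 50/50 mixture shown earlier is one of them.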

7.2 Examples

7.2.1 A Hawk-Dove game

Consider the following variation of the “hawk-dove” game:

|                | \(\mathsf{D}\) | \(\mathsf{H}\)  |
|----------------|----------------|-----------------|
| \(\mathsf{D}\) | \((2, 2)\)     | \((0, 3)\)      |
| \(\mathsf{H}\) | \((3, 0)\)     | \((-10, -10)\)  |

We can verify that this game has two pure strategy Nash equilibria \((\mathsf{D}, \mathsf{H})\) and \((\mathsf{H}, \mathsf{D})\). In addition, there is a symmetric mixed strategy Nash equilibrium in which each player plays \(\mathsf{D}\) with probability \(\tfrac{10}{11}\) and \(\mathsf{H}\) with probability \(\tfrac{1}{11}\), which gives a payoff of \((\tfrac{20}{11}, \tfrac{20}{11})\).

\[ \def\D{\mathsf{D}} \def\H{\mathsf{H}} \]

We claim that the following correlated strategy, where rows index player \(1\)’s recommendation and columns index player \(2\)’s, is a correlated equilibrium of the game: \[ π = \begin{bmatrix} \frac {10}{12} & \frac{1}{12} \\ \frac{1}{12} & 0 \end{bmatrix} \]

We first verify the conditions in \eqref{eq:corr} and then verify the conditions in \eqref{eq:corr2}.

7.2.1.1 Verification of \eqref{eq:corr}

We present the analysis from the point of view of player 1. By symmetry, the same argument applies to player 2.

  • Suppose the mediator recommends strategy \(\D\) to player \(1\). The conditional payoff if player 1 follows the recommendation is \[ \frac{2 π(\D, \D) + 0 \cdot π(\D, \H)}{ π(\D, \D) + π(\D, \H) } = \frac{2 \cdot \frac{10}{12}}{\frac{11}{12}} = \frac{20}{11}. \] The player’s payoff if they deviate to \(\H\) is \[ \frac{3 π(\D, \D) - 10 π(\D, \H)}{ π(\D, \D) + π(\D, \H) } = \frac{3 \cdot \frac{10}{12} - 10 \cdot \frac{1}{12}}{\frac{11}{12}} = \frac{20}{11}. \] Since the two payoffs are equal, the player cannot gain by deviating.

  • Now suppose the mediator recommends strategy \(\H\) to player \(1\). Then the player knows that player \(2\) has received a recommendation of \(\D\). Since \((\H, \D)\) is a Nash equilibrium, there is no incentive to deviate.

7.2.1.2 Verification of \eqref{eq:corr2}

We consider each case separately:

  • \(i = 1\), \(a_1 = \D\), \(b_1 = \H\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\D)[ u_1(\D,\D) - u_1(\H,\D) ] + π(\D,\H)[ u_1(\D,\H) - u_1(\H,\H) ] \\ &= \frac{10}{12}[ 2 - 3 ] + \frac{1}{12}[ 0 + 10] \\ &= 0 \ge 0. \end{align*}\]

  • \(i = 1\), \(a_1 = \H\), \(b_1 = \D\): \[\begin{align*} \hskip 2em & \hskip -2em π(\H,\D)[ u_1(\H,\D) - u_1(\D,\D) ] + π(\H,\H)[ u_1(\H,\H) - u_1(\D,\H) ] \\ &= \frac{1}{12}[ 3 - 2 ] + 0 [ -10 + 0] \\ &= \frac{1}{12} \ge 0. \end{align*}\]

  • \(i = 2\), \(a_2 = \D\), \(b_2 = \H\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\D)[ u_2(\D,\D) - u_2(\D,\H) ] + π(\H,\D)[ u_2(\H,\D) - u_2(\H,\H) ] \\ &= \frac{10}{12}[ 2 - 3 ] + \frac{1}{12}[ 0 + 10] \\ &= 0 \ge 0. \end{align*}\]

  • \(i = 2\), \(a_2 = \H\), \(b_2 = \D\): \[\begin{align*} \hskip 2em & \hskip -2em π(\D,\H)[ u_2(\D,\H) - u_2(\D,\D) ] + π(\H,\H)[ u_2(\H,\H) - u_2(\H,\D) ] \\ &= \frac{1}{12}[ 3 - 2 ] + 0 [ -10 + 0] \\ &= \frac{1}{12} \ge 0. \end{align*}\]

Thus in all cases, neither player has an incentive to deviate.

Note that the calculations for verifying \eqref{eq:corr2} are the same as the numerator calculations in the verification of \eqref{eq:corr}.
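These hand calculations can also be automated. Below is a minimal sketch (our own) that checks every inequality in \eqref{eq:corr2} for the distribution \(π\) above, working with \(12π\) so that the arithmetic stays exact:

```python
import numpy as np

# Hawk-dove payoffs; rows and columns are ordered as (D, H).
u1 = np.array([[2, 0], [3, -10]])
u2 = np.array([[2, 3], [0, -10]])

# 12 * pi: scaling by a positive constant does not affect the inequalities,
# and integer arithmetic keeps the checks exact.
pi12 = np.array([[10, 1], [1, 0]])

# Row player: recommendation a, deviation b (inequality eq:corr2).
for a in range(2):
    for b in range(2):
        assert pi12[a, :] @ (u1[a, :] - u1[b, :]) >= 0

# Column player: recommendation a, deviation b.
for a in range(2):
    for b in range(2):
        assert pi12[:, a] @ (u2[:, a] - u2[:, b]) >= 0

print("All correlated-equilibrium inequalities hold.")
```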

Remark

The social payoff of the correlated equilibrium strategy is \[ 4 \cdot \frac{10}{12} + 3 \cdot \frac{1}{12} + 3 \cdot \frac{1}{12} = \frac{46}{12} \approx 3.83, \] which is higher than the social welfare of \(\tfrac{40}{11} \approx 3.64\) achieved by the mixed strategy Nash equilibrium but lower than the team optimal payoff of \(4\).
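In fact, maximizing the social payoff over the linear constraints \eqref{eq:corr2} for this game yields exactly the distribution \(π\) above. This can be checked with the LP sketch from Section 7.1 (the snippet below assumes the helper `max_welfare_ce` defined there is in scope):

```python
import numpy as np

# Hawk-dove payoffs; assumes max_welfare_ce from the sketch in Section 7.1.
u1 = np.array([[2, 0], [3, -10]])
u2 = np.array([[2, 3], [0, -10]])
print(max_welfare_ce(u1, u2).round(4))
# Expected output (up to solver tolerance): [[0.8333 0.0833] [0.0833 0.    ]]
```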

Notes

The notion of correlated equilibrium is due to Aumann (1974). Also see Aumann (1987). See Amir et al. (2017) for a discussion of correlated equilibria in \(2×2\) games. Some of the discussion in this section is adapted from Maschler et al. (2020).

See Papadimitriou and Roughgarden (2008) for algorithmic aspects of computing correlated equilibrium.