Chapter 3

Ellipsoid Against Hope

The techniques we have covered so far based on regret minimization lead to a running time complexity growing polynomially in \(1/\epsilon\), where \(\epsilon > 0\) is the approximation quality of the equilibrium—be it a Nash equilibrium in zero-sum games or a (coarse) correlated equilibrium in general-sum games. Specifically, if we employ algorithms with (\(\Phi\)-)regret bounded by \(\sqrt{T}\) as a function of the time horizon, we need \(1/\epsilon^{2}\) iterations to reach an equilibrium. Can we do better in the regime where \(\epsilon\) is, say, exponentially small? The other main approach we have introduced for equilibrium computation is based on linear programming, which has the advantage of finding an exact equilibrium. But, as we have already seen, the LP describing (C)CEs has a number of variables that scales exponentially with the number of players.

3.1 Ellipsoid against hope

We will now introduce ellipsoid against hope (EAH), a polynomial-time algorithm for computing \(\Phi\)-equilibria even in games with many players and complex strategy sets. It was first introduced by Papadimitriou and Roughgarden [PR08[PR08] C. H. Papadimitriou and T. Roughgarden. (2008). Computing correlated equilibria in multi-player games. Journal of the ACM.]. In what follows, we mostly follow the generalized version of the algorithm due to Farina and Pipis [FP24[FP24] G. Farina and C. Pipis. (2024). Polynomial-Time Computation of Exact \(\Phi\)-Equilibria in Polyhedral Games. Neural Information Processing Systems.] and Zhang, Anagnostides, Tewolde, Berker, Farina, Conitzer and Sandholm [ZAT+25[ZAT+25] B. H. Zhang, I. Anagnostides, E. Tewolde, R. E. Berker, G. Farina, V. Conitzer and T. Sandholm. (2025). Learning and Computation of \(\Phi\)-Equilibria at the Frontier of Tractability. Economics and Computation.]. To be clear, this algorithm works in the centralized model, and it’s not compatible with the framework of online learning, although as we shall see similarities do exist between the two approaches.

Our goal is to compute an \(\epsilon\)-\(\Phi\)-equilibrium of a multilinear game, that is, a correlated distribution \(\mu \in \Delta (\mathcal{X}_{1} \times \dots \times \mathcal{X}_{n})\) such that for any player \(i \in [n]\) and deviation function \(\phi_{i} \in \Phi_{i}\) mapping \(\mathcal{X}_{i} \to \mathcal{X}_{i}\),

\[\displaystyle \mathop{\mathbb{E}}\limits_{\left(\boldsymbol{x}_{1}, \dots , \boldsymbol{x}_{n}\right) \sim \mu} u_{i} \left(\boldsymbol{x}_{1}, \dots , \boldsymbol{x}_{n}\right) \ge \mathop{\mathbb{E}}\limits_{\left(\boldsymbol{x}_{1}, \dots , \boldsymbol{x}_{n}\right) \sim \mu} u_{i} \left(\phi_{i} \left(\boldsymbol{x}_{i}\right), \boldsymbol{x}_{-i}\right) - \epsilon .\] (1)

From a more abstract point of view, EAH deals with optimization problems of the form11

We caution that

here corresponds to

in the context of (1), even though we sometimes take

to be the strategy set of a single player in the context of regret minimization.

\[\displaystyle \text{find } \mu \in \Delta \left(\mathcal{X}\right) \text{ such that } \mathop{\mathbb{E}}\limits_{\boldsymbol{x} \sim \mu} \left\langle \boldsymbol{y},G\left(\boldsymbol{x}\right) \right\rangle \ge 0 \quad \forall \boldsymbol{y} \in \mathcal{Y},\] (2)

where \(\mathcal{X} \subseteq \mathbb{R}^{d}, \mathcal{Y} \subseteq \mathbb{R}^{k}\), and \(G : \mathcal{X} \to \mathbb{R}^{k}\) is a function that can be evaluated efficiently. The crux in the optimization problem (2) lies in the fact that \(\mu\) resides in a high-dimensional space; a canonical case to think about is when \(\mathcal{X} = \mathcal{A}_{1} \times \dots \times \mathcal{A}_{n}\) in a normal-form game, so that even describing a distribution \(\mu\) could require specifying \(\prod_{i=1}^{n} |\mathcal{A}_{i}| - 1\) coordinates. We assume that \(\mathcal{Y}\), which corresponds to the set of deviations, admits a separation oracle; under mild geometric assumptions, it’s equivalent to posit merely a membership oracle, which returns whether a point \(\boldsymbol{y} \in \mathbb{R}^{k}\) belongs to \(\mathcal{Y}\) or not. A special case is when \(\mathcal{Y}\) is a polytope described with a polynomial number of constraints, as is the case for swap regret in normal-form games—each player’s set of deviations amounts to the set of stochastic matrices.

The key assumption in the EAH framework [FP24] is the existence of a good-enough-response (GER) oracle, which, given any \(\boldsymbol{y} \in \mathcal{Y}\), returns a point \(\boldsymbol{x} \in \mathcal{X}\) such that \(\left\langle \boldsymbol{y},G(\boldsymbol{x}) \right\rangle \ge 0\).

The EAH algorithm enables solving (2) with just a separation oracle for \(\mathcal{Y}\) and a GER oracle. The basic idea is to consider an \(\epsilon\)-approximate version of the dual of (2),

\[\displaystyle \text{find } \boldsymbol{y} \in \mathcal{Y} \text{ such that } \left\langle \boldsymbol{y},G\left(\boldsymbol{x}\right) \right\rangle \le - \epsilon \quad \forall \boldsymbol{x} \in \mathcal{X}.\] (3)

On account of the fact that a GER oracle exists, (3) is guaranteed to be infeasible. Even so, EAH proceeds by executing the ellipsoid algorithm on that infeasible program—this is where the apt name “ellipsoid against hope” comes from. In every step \(t\) of the algorithm, we have a candidate \(\boldsymbol{y}^{(t)} \in \mathcal{Y}\) and use the GER oracle to produce a point \(x^{(t)} \in \mathcal{X}\) that refutes \(\boldsymbol{y}^{(t)}\); in fact, an entire halfspace in \(\mathbb{R}^{k}\). This goes on until the volume of the ellipsoid has shrank to a small enough amount. It then follows that

\[\displaystyle \forall \boldsymbol{y} \in \mathcal{Y} \exists t \in \left[T\right] \text{ such that } \left\langle \boldsymbol{y},G\left(\boldsymbol{x}^{\left(t\right)}\right) \right\rangle > - \epsilon .\]

Thus,

\[\displaystyle \operatorname*{min}_{\boldsymbol{y} \in \mathcal{Y}} \operatorname*{max}_{\mu \in \Delta \left(\left[T\right]\right)} \sum_{t=1}^{T} \mu^{\left(t\right)} \left\langle \boldsymbol{y},G\left(\boldsymbol{x}^{\left(t\right)}\right) \right\rangle > - \epsilon .\]

By the minimax theorem, we conclude that

\[\displaystyle \operatorname*{max}_{\mu \in \Delta \left(\left[T\right]\right)} \operatorname*{min}_{\boldsymbol{y} \in \mathcal{Y}} \sum_{t=1}^{T} \mu^{\left(t\right)} \left\langle \boldsymbol{y},G\left(\boldsymbol{x}^{\left(t\right)}\right) \right\rangle > - \epsilon .\] (4)

In other words, there is a convex combination of \(\boldsymbol{x}^{(1)}, \dots , \boldsymbol{x}^{(T)}\) that refutes any possible \(y \in \mathcal{Y}\). That convex combination, which is a certificate of dual infeasibility, will thus be an approximate solution to (2). The key point is that the resulting zero-sum game in (4) is much smaller than the one we started with, and can be solved with standard techniques.

Theorem 3.1 ([FP24]) .

Assuming the existence of a separation oracle for \(\mathcal{Y}\) and a GER oracle, EAH runs in time \(\text{poly}(d, k, \operatorname{log}(1/\epsilon ))\) and returns an \(\epsilon\)-approximate solution to (2).

3.2 Application to computing correlated equilibria

Let’s now see to apply this algorithm to solve (1). As before, we assume that each \(\Phi_{i}\) contains linear functions of the form \(\boldsymbol{x}_{i} \mapsto \mathbf{M}_{i} q_{i} (\boldsymbol{x}_{i}) \in \mathcal{X}_{i}\) for some feature map \(q_{i}: \mathcal{X}_{i} \to \mathbb{R}^{k_{i}}\). For these notes, we will assume that the set of valid matrices \(\mathbf{M}_{i}\) form a convex, compact set \(\mathcal{Y}_{i} \subset \mathbb{R}^{d_{i} \times k_{i}}\); this assumption can be relaxed using a similar idea to the semi-separation oracle [ZAT+25]. we will later briefly explain how to relax this assumption. Then a \(\Phi\)-equilibrium is a distribution \(\mu\) such that

\[\displaystyle \sum_{i=1}^{n} \mathop{\mathbb{E}}\limits_{\left(\boldsymbol{x}_{1}, \dots , \boldsymbol{x}_{n}\right) \sim \mu} \left\langle \mathbf{I}_{i} - \mathbf{M}_{i},u_{i} \left(\boldsymbol{x}_{-i}\right) ⊗ q\left(\boldsymbol{x}_{i}\right) \right\rangle \ge - \epsilon \quad \forall i \in \left[n\right], \mathbf{M}_{i} \in \mathcal{Y}_{i},\] (5)

where \(\mathbf{I}_{i}\) is the matrix for which \(\phi_{\mathbf{I}_{i}}\) is the identity map. This indeed adheres to (2).

What’s left is to prove that (5) admits an efficient GER oracle. To do so, let’s consider any \(i \in [n]\) and deviation \(\mathbf{M}_{i} \in \mathcal{Y}_{i}\). Taking \(\mu_{i} \in \Delta (\mathcal{X}_{i})\) to be an \(\epsilon\)-approximate expected fixed point of \(\phi_{\mathbf{M}_{i}}\), we have

\[\displaystyle \mathop{\mathbb{E}}\limits_{\left(\boldsymbol{x}_{1}, \dots , \boldsymbol{x}_{n}\right) \sim \mu} \left\langle \boldsymbol{x}_{i} - \mathbf{M}_{i} q\left(\boldsymbol{x}_{i}\right),u_{i} \left(\boldsymbol{x}_{-i}\right) \right\rangle \ge - \epsilon \cdot \left\Vert u_{i} \left(\boldsymbol{x}_{-i}\right)\right\Vert_{2},\]

But this is not immediately good enough. Recall that our algorithm for expected fixed points, from Chapter 2, Theorem 2.6, has \(\text{poly}(1/\epsilon )\) runtime. Since our goal in this section is to construct algorithms with \(\operatorname{log}(1/\epsilon )\) runtime, we need to do better. Fortunately, there is a \(\text{poly}(d, \operatorname{log}(1/\epsilon ))\)-time algorithm for semi-separation with expected fixed points:

Theorem 3.2 ([ZAT+25]) .

There is a \(\text{poly}(d, \operatorname{log}(1/\epsilon ))\)-time algorithm for semi-separation.

Proof Sketch.

We use EAH. Given \(\phi : \mathcal{X} \to \mathbb{R}^{d}\), consider the EAH problem with \(\mathcal{Y}\) set to the unit \(ℓ_{2}\)-ball, and \(G(\boldsymbol{x}) = \phi (\boldsymbol{x}) - \boldsymbol{x}\). Then, by Theorem 3.1, it suffices to implement a GER oracle. That is, given \(\boldsymbol{y} \in \mathbb{R}^{d}\), we need to find \(\boldsymbol{x}\) with \(\left\langle \boldsymbol{y},\phi (\boldsymbol{x}) - \boldsymbol{x} \right\rangle \ge 0\). But this is easy: we can simply take \(\boldsymbol{x} \in \mathop{\operatorname{arg}\hspace{2.83pt}\operatorname*{min}}\limits_{\hat{\boldsymbol{x}} \in \mathcal{X}} \left\langle \boldsymbol{y},\hat{\boldsymbol{x}} \right\rangle\). Then either \(\phi (\boldsymbol{x}) \notin \mathcal{X}\), in which case we output \(\boldsymbol{x}\) as a certificate of infeasibility of \(\phi\), or \(\phi (\boldsymbol{x}) \in \mathcal{X}\), in which case \(\left\langle \boldsymbol{y},\phi (\boldsymbol{x}) - \boldsymbol{x} \right\rangle \ge 0\) by definition of \(\boldsymbol{x}\).

Finally, if \(\mu_{i} \in \Delta (\mathcal{X}_{i})\) is an expected fixed point of \(\phi_{i}\) for every \(i\), then the product distribution \(\mu_{1} \times \dots \times \mu_{n}\) is a GER solution satisfying (5). We therefore have:

Theorem 3.3 ([FP24; ZAT+25]) .

If for each player \(i \in [n]\) in a multilinear game the set of deviations \(\Phi_{i}\) admits a separation oracle, there is an algorithm polynomial in \(\operatorname{log}(1/\epsilon )\), \(n\), \(d_{1}, \dots , d_{n}\), and \(k_{1}, \dots , k_{n}\) that outputs an \(\epsilon\)-\(\Phi\)-equilibrium.

We conclude this chapter with several remarks about the previous theorem.

  1. Nested EAH. The algorithm that we derive for Theorem 3.3 is a doubly-nested ellipsoid against hope algorithm: the outer EAH algorithm computes the equilibrium itself, and on every iteration, EAH is invoked for each player \(i\) to compute an expected fixed point for \(i\).

  2. Necessity of linearity. The results presented in this document assume, and use, the fact that utilities are multilinear. The multilinearity assumption is used twice:

    1. in the GER oracle, to calculate expected utilities under a product distribution \(\mu = \mu_{1} \times \dots \times \mu_{n}\) (which has support that increases exponentially with the number of players \(n\)); and

    2. to ensure that an expected fixed point solution actually guarantees separation: for example, if instead we had arbitrary concave utilities \(u(\cdot , \boldsymbol{x}_{-i})\), then we would need the condition

      \[\displaystyle \mathop{\mathbb{E}}\limits_{\boldsymbol{x} \sim \mu} \left\langle ∇ u\left(\boldsymbol{x}\right),\phi \left(\boldsymbol{x}\right) - \boldsymbol{x} \right\rangle \ge -\epsilon\]

      for every convex function \(u\). Expected fixed points do not guarantee this.

    The first issue is not circumventable, and is generally a barrier to efficient equilibrium computation in concave games with many players. The second issue is not circumventable in general, as it would imply the ability to compute fixed points of contraction maps. However, it is circumventable in the special case where \(u(\cdot )\) is quadratic concave for each player \(i\); therefore, it is possible to compute \(\Phi\)-equilibria in quadratic games with a constant number of players. The required techniques, however, are beyond the scope of this document. We refer the interested reader to our recent preprint on this topic [ADF+26[ADF+26] I. Anagnostides, C. Daskalakis, G. Farina, N. Golowich, T. Sandholm and B. H. Zhang. (2026). On the Complexity of Correlated Equilibria Beyond Normal-Form Games. arXiv preprint arXiv:2605.17665.].

  3. Non-separable sets \(\Phi\). So far in this section, we have assumed that the sets \(\mathcal{Y}_{i}\) admit efficient separation oracles. However, this is not a necessary assumption. As it turns out, like we did for regret minimization in Chapter 2, Section 2.3, it is sufficient to take \(\mathcal{Y}_{i}\) to be a superset of the valid deviations and rely on the semi-separation oracle to declare when a given matrix \(\mathbf{M}_{i} \in \mathcal{Y}_{i}\) is invalid. Therefore, Theorem 3.3 also generalizes to the case where \(\mathcal{Y}_{i}\) only admits an efficient semi-separation oracle, such as the sets \(\Phi^{q}\) discussed there. For more details, we refer the interested reader to our paper [ZAT+25].

Bibliography for this chapter

[PR08] C. H. Papadimitriou and T. Roughgarden. (2008). Computing correlated equilibria in multi-player games. Journal of the ACM.
[FP24] G. Farina and C. Pipis. (2024). Polynomial-Time Computation of Exact Φ-Equilibria in Polyhedral Games. Neural Information Processing Systems.
[ZAT+25] B. H. Zhang, I. Anagnostides, E. Tewolde, R. E. Berker, G. Farina, V. Conitzer and T. Sandholm. (2025). Learning and Computation of Φ-Equilibria at the Frontier of Tractability. Economics and Computation.
[ADF+26] I. Anagnostides, C. Daskalakis, G. Farina, N. Golowich, T. Sandholm and B. H. Zhang. (2026). On the Complexity of Correlated Equilibria Beyond Normal-Form Games. arXiv preprint arXiv:2605.17665.

Notes

1

We caution that

here corresponds to

in the context of (1), even though we sometimes take

to be the strategy set of a single player in the context of regret minimization.