Mean Field Games and Mean Field Control

22 Jul 2022 23:03

Yet Another Inadequate Placeholder

In physics, a "mean field" approximation is one where one imagines the state of a single particle interacting with the average (mean) state of all other particles, or, by a slight extension, the distribution of states over all other particles. It's often a useful first cut at understanding what's going on in a complex system, and it can become a very good approximation when there are, in fact, all-to-all interactions of about equal strength, or nearly enough, such as interactions structured according to a dense random graph.

A mean field game is one where each agent's pay-off depends on their own action, and on the distribution of actions taken by all the other agents. (That is, Irene doesn't care whether Joey defects and Karl cooperates or vice versa, but just that there's one defector and one cooperator.) The economic motivation is that this is a way of thinking about the situation where there are multiple participants in a single centralized market, e.g., something Walrasian.

In the most widely-considered mathematical models, actions are continuous, and each agent also has a state, also continuous, and states evolve according to some stochastic differential equation that involves the current state, the current action, the population distribution over states, and an independent white noise driver for each agent. Introducing some notation (mostly following the Bensoussan, Frehse and Yam review), the state of agent \( i \) at time \( t \) is \( X^{i}(t) \), and the action taken is \( u^i(t) \). (These both live in some finite-dimensional Euclidean vector space.) The distribution of the states of other agents is \( M^i(t) = \frac{1}{n-1}\sum_{j\neq i}{\delta_{X^j(t)}} \). Now \( X^i \) evolves according to the stochastic differential equation \[ dX^i = g(X^{i}(t), M^{i}(t), u^i(t)) dt + \sigma(X^{i}(t)) dW^i \] where \( W^i \) is a standard Wiener process, IID across the agents.

The expected pay-off to agent \( i \) is given by \[ p^i = \mathbb{E}\left[ \int_{0}^{T}{f(X^i(t), M^i(t), u^i(t)) dt} + h(X^i(T), M^{i}(T)) \right] \] A choice of strategy here amounts to a feedback control rule, i.e., \( u^{i}(t) = v(X^i(t), t) \). (Strategies with, e.g., memory, are certainly possible but complicate notation.) So the reward to the agent depends on its state, its action, and the distribution of what other agents are doing, and the dynamics of each agent depend on its state, the distribution of what other agents do, its action, and noise. Agents interact with the distribution of other agents, not any particular set of agents.

Now notice that when the number of agents \( n \rightarrow\infty \), \( M^i(t) \) and \( M^j(t) \) will become increasingly similar, and this will amount to every agent interacting with the distribution of all agents as a whole, which I'll write \( m(t) \). (Everyone confronts the results of their joint actions as an alien force.) In another Walrasian touch, no one agent's actions matter to the distribution \( m(t) \), i.e., everyone's "market impact" is zero.
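To make the finite-\( n \) system concrete, here is a minimal simulation sketch using an Euler-Maruyama discretization. Everything specific in it is an assumption chosen for illustration, not part of the general model: the state is one-dimensional, the drift is \( g(x, M, u) = u + c(\bar{M} - x) \) with \( \bar{M} \) the mean of \( M \), the noise level \( \sigma \) is constant, and the feedback rule is \( v(x,t) = -x \). The function names (others_mean, max_gap, feedback, simulate) are likewise hypothetical.

```python
import numpy as np

def others_mean(x):
    """Mean of M^i for each agent i: the average state of the other n-1 agents."""
    n = x.shape[0]
    return (x.sum() - x) / (n - 1)

def max_gap(x):
    """Largest difference between the means of M^i and M^j over all pairs i, j."""
    om = others_mean(x)
    return om.max() - om.min()

def feedback(x, t):
    """Illustrative feedback rule v(x, t) = -x (an assumption, not an equilibrium)."""
    return -x

def simulate(n=500, T=1.0, steps=200, attraction=1.0, sigma=0.3, seed=0):
    """Euler-Maruyama simulation of n agents with mean-field coupling.

    Assumed drift: g(x, M, u) = u + attraction * (mean(M) - x); constant sigma.
    Returns an array of shape (steps + 1, n) holding every agent's path.
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = rng.normal(loc=2.0, scale=1.0, size=n)  # arbitrary initial states
    path = np.empty((steps + 1, n))
    path[0] = x
    for k in range(steps):
        t = k * dt
        u = feedback(x, t)
        drift = u + attraction * (others_mean(x) - x)
        # Independent Wiener increments, one per agent
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
        path[k + 1] = x
    return path
```

In this toy model the means of \( M^i \) and \( M^j \) differ by exactly \( |X^i - X^j|/(n-1) \), so max_gap(path[-1]) shrinks like \( 1/n \) when the spread of states stays bounded, which is the sense in which every agent ends up facing the same population distribution.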

One interesting question is to find a strategy which will be a Nash equilibrium, so that if every agent uses that strategy, the population distribution will evolve in such a way that nobody has any incentive to use a different strategy. In symbols, writing \( p(v, m) \) for the expected pay-off to a single agent who uses strategy \( v \) while the population distribution follows \( m \), and \( \mathcal{D}(X(t)) \) for the distribution of \( X(t) \), \( \hat{v} \) and \( m \) together form an equilibrium when \begin{eqnarray} dX & = & g(X(t), m(t), \hat{v}(X(t), t)) dt + \sigma(X(t))dW\\ m(t) & = & \mathcal{D}(X(t)) \end{eqnarray} and finally \[ v \neq \hat{v} \Rightarrow p(\hat{v}, m) \geq p(v, m) \] That is, if infinitely many others acted like you did, none of you would have any reason to want to change.

When this holds, we'll get a rather complicated set of coupled partial differential equations, combining a Hamilton-Jacobi-Bellman equation for optimality and a Fokker-Planck equation for the evolution of the state distribution \( m(t) \). In this limit of infinitely many agents, \( m(t) \) should actually evolve deterministically (so I write it in lower case).
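As a sketch of what that coupled system looks like, under assumptions not stated above: suppose \( m(t) \) has a density \( m(x,t) \), everything is smooth enough to differentiate, and the supremum below is attained. Write \( V(x,t) \) for the value function of a single agent facing the flow \( m \), \( a \) for a dummy action variable, and \( m_0 \) for the initial distribution (all three are notation introduced here for illustration). Then, for the pay-off-maximizing problem above, the standard form is \begin{eqnarray} -\partial_t V(x,t) & = & \sup_{a}\left\{ g(x, m(t), a) \cdot \nabla V(x,t) + f(x, m(t), a) \right\} + \frac{1}{2}\mathrm{tr}\left( \sigma(x)\sigma(x)^{\top} \nabla^2 V(x,t) \right), \quad V(x,T) = h(x, m(T))\\ \partial_t m(x,t) & = & -\nabla \cdot \left( g(x, m(t), \hat{v}(x,t)) m(x,t) \right) + \frac{1}{2}\sum_{r,s}{\partial_{x_r}\partial_{x_s}\left( [\sigma(x)\sigma(x)^{\top}]_{rs} m(x,t) \right)}, \quad m(\cdot, 0) = m_0 \end{eqnarray} where \( \hat{v}(x,t) \) is an action attaining the supremum in the first equation. The first (Hamilton-Jacobi-Bellman) equation runs backward in time from the terminal pay-off, the second (Fokker-Planck) runs forward from the initial distribution, and each needs the other's solution as an input, which is what makes the system awkward.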

The convergence from large-but-finite numbers of agents to the infinite-population limit presents some subtle mathematical issues. Those are actually what may be relevant to a long-simmering project, and why I've been trying to educate myself on this topic.

Economic-theory speculation by someone who has never even taken an econ. class: It feels like this might be relevant to the issue of "representative agents" in macro. Say the state space for a single agent is \( \mathbb{R}^k \) and the action space is \( \mathbb{R}^d \). The single agent interacts with the mean field, but that mean field is a probability measure on \( \mathbb{R}^k \), not a point in \( \mathbb{R}^k \), and of course measures are a much richer space. There will not, in general, be any way to replace, or approximate, the mean field game by an agent interacting with another finite-dimensional agent. The question would be to find conditions under which we could do this, so that we can replace the mean field game by one where individuals play against a finite-dimensional agent which represents the population as a whole (and, e.g., Nash equilibrium strategies for individual agents in the one model remain Nash equilibria in the other model). If the conditions prove not too onerous, this would be a step towards, e.g., giving DSGE models actual, and not merely performative, microfoundations. Now my suspicion, based on e.g. Jackson and Yariv 2017, is that those conditions will be exceedingly onerous, but wouldn't it be nice to know?

See also: Calculating Macroscopic Consequences of Microscopic Interactions; Compartment Models; Control Theory; Interacting Particle Systems; Large Deviations; Learning in Games