(ICLR 2024) Sample-Efficient Multi-Agent RL: An Optimization Perspective
Published in the International Conference on Learning Representations (ICLR), 2024
We study multi-agent reinforcement learning (MARL) for general-sum Markov games (MGs) under general function approximation. To identify a minimal assumption for sample-efficient learning, we introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs. Using this measure, we propose the first unified algorithmic framework that ensures sample efficiency in learning Nash equilibria, coarse correlated equilibria, and correlated equilibria for both model-based and model-free MARL problems with low MADC. We also show that our algorithm achieves sublinear regret comparable to existing works. Moreover, our algorithm requires only an equilibrium-solving oracle and an oracle that solves regularized supervised learning, and thus avoids solving constrained optimization problems with data-dependent constraints (Jin et al., 2020; Wang et al., 2023) or executing sampling procedures with complex multi-objective optimization problems (Foster et al., 2023). Finally, the model-free version of our algorithm is the first provably efficient model-free algorithm for learning the Nash equilibrium of general-sum MGs.
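To make the two-oracle structure concrete, below is a minimal sketch for the one-step (normal-form) special case of a general-sum game; it is an illustrative assumption, not the paper's actual algorithm for Markov games with general function approximation. All names (`equilibrium_oracle`, `regularized_sl_oracle`, the optimism bonus) are hypothetical. Each round, a regularized supervised-learning oracle estimates payoffs from data, and an equilibrium-solving oracle computes an approximate equilibrium of the (optimistically adjusted) estimated game, which the agents then play to collect more data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth general-sum game, unknown to the learner: 2 players, 3 actions each.
TRUE_U = [rng.uniform(0, 1, size=(3, 3)), rng.uniform(0, 1, size=(3, 3))]

def equilibrium_oracle(u1, u2, iters=500, lr=0.5):
    """Stand-in equilibrium-solving oracle (hypothetical): multiplicative-weights
    self-play on the estimated game; its time-averaged play approximates a
    coarse correlated equilibrium since MW is a no-external-regret dynamic."""
    p = np.ones(u1.shape[0]) / u1.shape[0]
    q = np.ones(u2.shape[1]) / u2.shape[1]
    avg_p, avg_q = np.zeros_like(p), np.zeros_like(q)
    for _ in range(iters):
        p = p * np.exp(lr * (u1 @ q)); p /= p.sum()      # player 1 best-responds softly
        q = q * np.exp(lr * (u2.T @ p)); q /= q.sum()    # player 2 best-responds softly
        avg_p += p; avg_q += q
    return avg_p / iters, avg_q / iters

def regularized_sl_oracle(sums, counts, reg=1.0):
    """Stand-in regularized supervised-learning oracle (hypothetical): a
    ridge-style estimate shrinking each empirical mean payoff toward 0.5."""
    return (sums + reg * 0.5) / (counts + reg)

n1, n2 = TRUE_U[0].shape
sums = [np.zeros((n1, n2)), np.zeros((n1, n2))]
counts = np.zeros((n1, n2))

for episode in range(2000):
    # Oracle 1: fit payoff estimates from the data collected so far.
    u_hat = [regularized_sl_oracle(sums[i], counts) for i in range(2)]
    # Optimism bonus (assumption) to encourage exploring rarely played cells.
    bonus = 1.0 / np.sqrt(counts + 1.0)
    # Oracle 2: solve for an approximate equilibrium of the estimated game.
    p, q = equilibrium_oracle(u_hat[0] + bonus, u_hat[1] + bonus)
    a, b = rng.choice(n1, p=p), rng.choice(n2, p=q)
    for i in range(2):  # observe noisy payoffs for the played joint action
        sums[i][a, b] += TRUE_U[i][a, b] + 0.05 * rng.normal()
    counts[a, b] += 1

print("estimated payoffs (player 1):\n",
      regularized_sl_oracle(sums[0], counts).round(2))
```

The sketch only illustrates the abstract's point that each iteration reduces to two tractable calls, an equilibrium solve and a regularized regression, rather than a constrained or multi-objective optimization.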