This lecture will provide an introduction to (non-statistical) online learning and multi-armed bandits.
We will discuss the multiplicative weights algorithm Hedge, and its partial information counterpart EXP3, as well as some applications to learning in games.
Hazan, E. (2019). Introduction to Online Convex Optimization. arXiv:1909.05207v1. [Chapter 6.2]
Sessa, P. G., et al. (2019). No-Regret Learning in Unknown Games with Correlated Payoffs. Available online.