This course is an introduction to reinforcement learning (RL) and sequential decision problems. Students will learn about RL’s agent-environment interface, Markov decision processes, multi-armed bandits, dynamic programming, Monte Carlo methods, temporal-difference learning, and eligibility traces. They will also study the differences between model-free and model-based methods and the trade-off between exploration and exploitation. Although the course aims to give students a solid foundation in tabular solution methods for reinforcement learning, it also provides a high-level overview of RL with function approximation using neural networks. Students will be required to complete several programming assignments and a final project.
Computer Science Applications Elective