On the many dimensions of Dynamic Programming based Reinforcement Learning algorithms by Prof. Bruno Scherrer
Starting from standard Value Iteration and Policy Iteration, I shall describe many dimensions of Dynamic Programming algorithms for solving the Reinforcement Learning problem. I will discuss their sensitivity to errors. I will also explain how some of them connect to fairly recent state-of-the-art algorithms.
Bruno Scherrer has been a researcher at INRIA since 2004. He has contributed to the mathematical analysis of Dynamic Programming algorithms applied to Reinforcement Learning, in particular to approximation schemes.