[PhD Thesis Presentation] - Chris Reinke - The Gamma-Ensemble. Adaptive Reinforcement Learning via Modular Discounting
Reinforcement learning allows artificial agents to learn complex tasks, such as playing Go at an expert level. Still, unlike humans, artificial agents lack the ability to adapt learned behavior to changes in the task or to new objectives, such as capturing as many opponent pieces as possible within a given number of moves instead of simply winning. The Independent Gamma-Ensemble (IGE), a new brain-inspired framework, allows such adaptations. It is composed of several Q-learning modules, each with a different discount factor. The off-policy nature of Q-learning allows the modules to learn several policies in parallel, each representing a different solution to the trade-off between a high reward sum and the time needed to gain it. The IGE adapts to new task conditions by switching between its policies (transfer learning). It can also decode the expected reward sum and the required time for each policy, which allows it to immediately select the most appropriate policy for a new task objective (zero-shot learning). Additionally, this decoding makes it possible to optimize the average reward in discrete MDPs where non-zero reward is only given in goal states. Convergence to the optimal policy can be proven for such MDPs. The modular structure behind the IGE can be combined with many reinforcement learning algorithms and applied to various tasks, making it possible to improve the adaptive abilities of artificial agents in general.
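
As an illustration of the modular idea, the following minimal Python sketch trains several tabular Q-learning modules with different discount factors on the same stream of transitions. The toy corridor task, its rewards, the names `train_ensemble`, `rollout`, and `select`, and all hyperparameters are assumptions made for this example and are not taken from the thesis. In particular, the sketch obtains each policy's reward sum and duration by simulating the greedy policy, which is only a stand-in for the decoding performed by the IGE.

```python
import random

# Hypothetical toy task: a corridor of 7 states. The agent starts in state 2.
# State 0 is a goal with a small reward, state 6 a goal with a large reward.
N_STATES = 7
START = 2
GOALS = {0: 1.0, 6: 5.0}   # goal state -> reward received on entering it
ACTIONS = (-1, +1)         # action 0 moves left, action 1 moves right


def step(state, action):
    """Deterministic transition; reward is non-zero only in goal states."""
    nxt = state + ACTIONS[action]
    return nxt, GOALS.get(nxt, 0.0), nxt in GOALS


def train_ensemble(gammas, episodes=2000, alpha=0.5, seed=0):
    """Train one Q-table per discount factor from the same experience.

    The behavior policy is uniformly random. Because Q-learning is
    off-policy, every module can still learn its own greedy policy.
    """
    rng = random.Random(seed)
    q = {g: [[0.0, 0.0] for _ in range(N_STATES)] for g in gammas}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            action = rng.randrange(2)
            nxt, reward, done = step(state, action)
            for g, table in q.items():
                # Standard Q-learning target with the module's own gamma.
                target = reward if done else reward + g * max(table[nxt])
                table[state][action] += alpha * (target - table[state][action])
            state = nxt
    return q


def rollout(table, max_steps=50):
    """Follow the greedy policy of one module; return (reward sum, steps)."""
    state, total, steps, done = START, 0.0, 0, False
    while not done and steps < max_steps:
        action = 0 if table[state][0] >= table[state][1] else 1
        state, reward, done = step(state, action)
        total += reward
        steps += 1
    return total, steps


def select(outcomes, max_steps):
    """Pick the module with the highest reward sum within a step budget.

    Assumes at least one module finishes within max_steps.
    """
    feasible = {g: o for g, o in outcomes.items() if o[1] <= max_steps}
    return max(feasible, key=lambda g: feasible[g][0])


if __name__ == "__main__":
    q_tables = train_ensemble([0.2, 0.5, 0.9])
    outcomes = {g: rollout(t) for g, t in q_tables.items()}
    for g, (total, steps) in outcomes.items():
        print(f"gamma={g}: reward sum {total}, steps {steps}")
    print("module chosen for a 2-step budget:", select(outcomes, max_steps=2))
    print("module chosen for a 10-step budget:", select(outcomes, max_steps=10))
```

In this toy task, the module with the smallest discount factor should prefer the nearby small reward, while modules with larger discount factors should head for the distant large one. Changing the step budget then changes which module is selected without any further learning.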