[PhD Thesis Public Presentation - Zoom] - Dongqi Han - Toward a Cognitive Neurorobotic Agent That Can Abstract, Infer and Plan: Reinforcement Learning and Active Inference in Hierarchical and Partially Observable Tasks

Date

Tuesday, February 8, 2022 - 09:00 to 10:00

Location

Please join via Zoom

Description

Presenter: Dongqi Han

Supervisor: Prof. Jun Tani

Co-supervisor: Prof. Kenji Doya

Unit: Cognitive Neurorobotics Research Unit

Zoom URL: to be available 48 hours prior to the examination

Title: Toward a Cognitive Neurorobotic Agent That Can Abstract, Infer and Plan: Reinforcement Learning and Active Inference in Hierarchical and Partially Observable Tasks

Abstract:

A central question in cognitive science and artificial intelligence is what underlying mechanisms enable learning to make highly adaptive, cognitive decisions in a variety of challenging tasks. This thesis proposes computational models of decision-making agents, most of which are simulated robots with sensors and motors, and discusses how the proposed models contribute to efficient learning and adaptive decision-making. Rather than the conventional use of recurrent neural networks (RNNs) as function approximators in reinforcement learning (RL), the presented results provide novel insights into how RNN models can be combined with RL to achieve more intelligent decision-making.

First, a novel multiple-level stochastic RNN model is proposed for solving hierarchical control tasks by model-free RL. It is shown that an action hierarchy, characterized by a consistent representation of abstracted sub-goals at the higher level, self-develops during learning in a challenging continuous control task. The emergent action hierarchy is also observed to enable faster relearning when the sub-goals are recomposed.
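
As a rough illustration only (not the thesis architecture), the sketch below shows one way to write a two-timescale stochastic RNN in PyTorch: the higher level updates slowly and emits a sampled latent that acts as a sub-goal for the fast lower level. All module names, layer sizes, and the fixed-interval update rule are assumptions made for this sketch.

```python
# Minimal sketch of a two-level stochastic RNN policy (assumed PyTorch;
# sizes and the fixed update interval are illustrative, not the thesis model).
import torch
import torch.nn as nn

class TwoLevelStochasticRNN(nn.Module):
    """Higher level ticks every `tau` steps and emits a stochastic latent
    (a soft sub-goal); the lower level runs at every step."""

    def __init__(self, obs_dim, act_dim, hid=64, latent=8, tau=10):
        super().__init__()
        self.tau = tau
        self.high = nn.GRUCell(obs_dim, hid)             # slow, abstract level
        self.to_mu = nn.Linear(hid, latent)              # latent mean
        self.to_logvar = nn.Linear(hid, latent)          # latent log-variance
        self.low = nn.GRUCell(obs_dim + latent, hid)     # fast, motor level
        self.policy = nn.Linear(hid, act_dim)

    def forward(self, obs, h_high, h_low, z, t):
        if t % self.tau == 0:                            # slow-timescale tick
            h_high = self.high(obs, h_high)
            mu, logvar = self.to_mu(h_high), self.to_logvar(h_high)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized sample
        h_low = self.low(torch.cat([obs, z], dim=-1), h_low)
        return self.policy(h_low), h_high, h_low, z
```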

Then, the author introduces a variational RNN model for predicting state transitions in control tasks in which the environment state is only partially observable. By predicting new observations, the model learns to represent underlying states of the environment that are important but may not be directly observable. A corresponding algorithm is proposed to facilitate efficient learning in partially observable environments.
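
For readers unfamiliar with variational RNNs, a minimal sketch of one step of such a world model is given below (assuming PyTorch; all dimensions and names are hypothetical). A prior and an approximate posterior over a latent state are learned jointly, and training minimizes the observation prediction error plus the KL divergence between them, so the latent comes to encode unobserved but predictive aspects of the environment.

```python
# Minimal sketch of a variational RNN world model (assumed PyTorch;
# illustrative only, not the thesis implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalRNN(nn.Module):
    def __init__(self, obs_dim, act_dim, hid=128, latent=16):
        super().__init__()
        self.rnn = nn.GRUCell(act_dim + latent, hid)
        self.prior = nn.Linear(hid, 2 * latent)            # p(z_t | h_{t-1})
        self.post = nn.Linear(hid + obs_dim, 2 * latent)   # q(z_t | h_{t-1}, o_t)
        self.decode = nn.Linear(hid + latent, obs_dim)     # predicts o_t

    def step(self, obs, act, h):
        mu_p, logvar_p = self.prior(h).chunk(2, dim=-1)
        mu_q, logvar_q = self.post(torch.cat([h, obs], -1)).chunk(2, -1)
        z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()
        recon = self.decode(torch.cat([h, z], -1))
        # Loss = observation prediction error + KL(q || p) between diagonal Gaussians
        rec_loss = F.mse_loss(recon, obs)
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                    - 1).sum(-1).mean()
        h = self.rnn(torch.cat([act, z], -1), h)           # roll the deterministic state forward
        return h, rec_loss + kl
```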

Finally, the variational RNN model is extended to a predictive-coding model that combines RL and active inference in the same network. It is investigated how RL can be used to explore the environment and avoid punishment, while goal-directed planning is conducted within the framework of active inference. It is shown that, given a goal observation, a trained agent can select near-optimal actions to achieve that goal by backpropagating prediction errors through the variational RNN. The study offers novel insights into how RL and active inference can collaborate in a complementary manner for different purposes.
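
The planning-by-backpropagation idea can be illustrated with a minimal sketch (assuming PyTorch; `model.rollout` and `model.act_dim` are hypothetical helpers standing in for an unrolled, trained variational RNN): actions are treated as free variables and optimized by gradient descent so that the predicted future observation matches the goal observation.

```python
# Minimal sketch of goal-directed planning by error backpropagation
# (assumed PyTorch; `model.rollout` and `model.act_dim` are hypothetical).
import torch

def plan_actions(model, h0, goal_obs, T=20, iters=100, lr=0.1):
    # Actions are free variables to be inferred, not sampled from a policy.
    actions = torch.zeros(T, model.act_dim, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=lr)
    for _ in range(iters):
        pred_obs = model.rollout(h0, actions)          # (T, obs_dim) predicted observations
        loss = ((pred_obs[-1] - goal_obs) ** 2).sum()  # prediction error vs. the goal
        opt.zero_grad()
        loss.backward()                                # backprop the error through the RNN
        opt.step()
    return actions.detach()
```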

All-OIST Category: Intra-Group Category

