Sample-efficient Reinforcement Learning for Real-world Robot Control by Prof. Takamitsu Matsubara
Reinforcement learning (RL) has been applied to a broad range of robot control scenarios. However, its application to real-world robots remains challenging, since collecting sufficient data samples often requires prohibitively long experiments. Developing sample-efficient RL algorithms is therefore of primary importance. In this talk, I will introduce our recent progress in the development of sample-efficient RL algorithms and their application to real robot control problems.
Firstly, I present a sample-efficient deep RL algorithm, Deep P-Network (DPN). It combines the smooth policy update of Dynamic Policy Programming with the automatic feature extraction capability of deep neural networks to enhance sample efficiency and learning stability [Tsurumine et al., RAS 2019]. We applied our method to real robotic cloth manipulation tasks such as flipping a handkerchief and folding a T-shirt. Experimental results suggest that our approach achieves better sample efficiency and learning stability than previous deep RL methods.
Secondly, I present a Gaussian process (GP)-based model-based RL algorithm, sample-efficient probabilistic model predictive control (SPMPC) [Cui et al., IROS 2019]. SPMPC iteratively learns a GP as a dynamical model of the target system, following PILCO, but extends it with a model predictive control scheme to enhance robustness to disturbances. We applied our method to a full-sized boat equipped with a single engine and limited sensors for autonomous driving in a real ocean environment. Experimental results show that the proposed system achieves both robustness to disturbances and sample efficiency.
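The overall loop of GP-based model-predictive control can be sketched as follows. This is a deliberately simplified illustration, not the SPMPC implementation: a 1-D toy system stands in for the boat, the GP uses only the posterior mean (SPMPC propagates uncertainty), and the controller is plain random-shooting MPC. All names and constants are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(x, u):
    """Unknown 1-D system the agent interacts with (illustrative only)."""
    return 0.9 * x + 0.5 * np.sin(u) + 0.05 * rng.normal()

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel between row-wise inputs."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d / ell**2)

class GPModel:
    """Minimal GP regression for one-step dynamics x' = f(x, u)."""
    def __init__(self, noise=1e-2):
        self.noise = noise
    def fit(self, X, y):
        self.X = X
        K = rbf(X, X) + self.noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)
    def predict(self, Xs):
        return rbf(Xs, self.X) @ self.alpha  # posterior mean only

def mpc_action(gp, x, horizon=5, n_cand=64, target=1.0):
    """Random-shooting MPC: sample action sequences, roll out the GP
    mean, and return the first action of the lowest-cost sequence."""
    best_u, best_cost = 0.0, np.inf
    for _ in range(n_cand):
        us = rng.uniform(-2, 2, horizon)
        xs, cost = x, 0.0
        for a in us:
            xs = gp.predict(np.array([[xs, a]]))[0]
            cost += (xs - target) ** 2
        if cost < best_cost:
            best_cost, best_u = cost, us[0]
    return best_u

# Model-based RL loop: act, record the transition, refit the GP model.
X, y = [], []
x = 0.0
for t in range(30):
    if len(X) < 5:
        u = rng.uniform(-2, 2)  # initial random exploration
    else:
        u = mpc_action(gp, x)
    x_next = true_dynamics(x, u)
    X.append([x, u]); y.append(x_next)
    gp = GPModel(); gp.fit(np.array(X), np.array(y))
    x = x_next
print(f"final state: {x:.2f} (target 1.0)")
```

The sample efficiency comes from reusing every observed transition to refit the model, so the controller improves after only tens of interactions rather than the thousands a model-free method might need.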
Takamitsu Matsubara received his Ph.D. in information science from the Nara Institute of Science and Technology, Nara, Japan, in 2007. From 2005 to 2007, he was a research fellow (DC1) of the Japan Society for the Promotion of Science. From 2013 to 2014, he was a visiting researcher at the Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands. He is currently an associate professor and the PI of the Robot Learning Laboratory at the Nara Institute of Science and Technology. He is also a visiting researcher at the ATR Computational Neuroscience Laboratories, Kyoto, Japan, and the National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan. His research interests include machine learning for robot control and perception.