Publications of the Adaptive Systems Group
2017
- Wang, J., Uchibe, E., & Doya, K. (2017). Adaptive Baseline Enhances EM-based Policy Search: Validation in a View-based Positioning Task of a Smartphone Balancer. Frontiers in Neurorobotics, 11:1.
- Elfwing, S., Uchibe, E., & Doya, K. (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. arXiv:1702.03118.
2016
- Elfwing, S., Uchibe, E., & Doya, K. (2016). From free energy to expected energy: Improving energy-based value function approximation in reinforcement learning. Neural Networks, 84: 17-27.
- Wang, J., Uchibe, E., & Doya, K. (2016). EM-based Policy Hyper Parameter Exploration: Application to Standing and Balancing of a Two-wheeled Smartphone Robot. Journal of Artificial Life and Robotics, vol. 21, issue 1, pp. 125-131.
- Uchibe, E. (2016). Forward and inverse reinforcement learning based on linearly solvable Markov decision processes (review article, in Japanese). The Brain & Neural Networks (Journal of the Japanese Neural Network Society), vol. 23, no. 1, pp. 2-13.
- Uchibe, E. (2016). Deep inverse reinforcement learning by logistic regression. In Proc. of the 23rd International Conference on Neural Information Processing (ICONIP), pp. 23-31.
- Huang, Q., Uchibe, E., & Doya, K. (2016). Emergence of communication among reinforcement learning agents under coordination environment. In Proc. of the 6th Joint IEEE International Conference on Developmental Learning and on Epigenetic Robotics.
- Reinke, C., Uchibe, E., & Doya, K. (2016). From Neuroscience to Artificial Intelligence: Maximizing Average Reward in Episodic Reinforcement Learning Tasks with an Ensemble of Q-Learners. In the Third CiNet Conference, Neural mechanisms of decision making: Achievements and new directions, Osaka, poster.
- Reinke, C., Uchibe, E., & Doya, K. (2016). Learning of Stress Adaptive Habits with an Ensemble of Q-Learners. In The 2nd International Workshop on Cognitive Neuroscience Robotics, Osaka, poster.
2015
- Elfwing, S., Uchibe, E., & Doya, K. (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, vol. 64, pp. 29-38.
- Reinke, C., Uchibe, E., & Doya, K. (2015). Maximizing the average reward in episodic reinforcement learning tasks. In Proc. of the IEEE International Conference on Intelligent Informatics and Biomedical Sciences, Okinawa, pp. 420-421.
- Wang, J., Uchibe, E., & Doya, K. (2015). Two-wheeled smartphone robot learns to stand up and balance by EM-based policy hyper parameter exploration. In Proc. of the 20th International Symposium on Artificial Life and Robotics.
- Uchibe, E., & Doya, K. (2015). Inverse Reinforcement Learning with Density Ratio Estimation. The 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making, University of Alberta, Canada, poster.
- Reinke, C., Uchibe, E., & Doya, K. (2015). Gamma-QCL: Learning multiple goals with a gamma submodular reinforcement learning framework. In Winter Workshop on Mechanism of Brain and Mind. (poster presentation).
2014
- Elfwing, S., & Doya, K. (2014). Emergence of polymorphic mating strategies in robot colonies. PLoS ONE, 9(4), e93622.
- Uchibe, E., & Doya, K. (2014). Inverse Reinforcement Learning Using Dynamic Policy Programming. In Proc. of the 4th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, pp. 222-228.
- Uchibe, E., & Doya, K. (2014). Combining learned controllers to achieve new goals based on linearly solvable MDPs. In Proc. of the IEEE International Conference on Robotics and Automation, pp. 5252-5259.
- Kinjo, K., Uchibe, E., & Doya, K. (2014). Robustness of Linearly Solvable Markov Games with inaccurate dynamics model. In Proc. of the 19th International Symposium on Artificial Life and Robotics.
- Wang, J., Uchibe, E., & Doya, K. (2014). Control of Two-Wheeled Balancing and Standing-up Behaviors by an Android Phone Robot. In Proc. of the 32nd Annual Conference of the Robotics Society of Japan, Kyushu Sangyo University.
- Eren Sezener, C., Uchibe, E., & Doya, K. (2014). Obtaining the Reward Functions of Rats by Inverse Reinforcement Learning (in Turkish: Ters Pekiştirmeli Öğrenme ile Farelerin Ödül Fonksiyonunun Elde Edilmesi). In Proc. of the Turkish Autonomous Robots Conference (TORK). [Published in Turkey; an English version is also available.]
- Uchibe, E., & Doya, K. (2014). Inverse reinforcement learning by density ratio estimation (in Japanese). In Proc. of the 32nd Annual Conference of the Robotics Society of Japan, Kyushu Sangyo University.
2013
- Kinjo, K., Uchibe, E., & Doya, K. (2013). Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task. Frontiers in Neurorobotics, 7(7).
- Elfwing, S., Uchibe, E., & Doya, K. (2013). Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces. Frontiers in Neurorobotics, 7:3.
- Sakuma, T., Shimizu, T., Miki, Y., Doya, K., & Uchibe, E. (2013). Computation of Driving Pleasure based on Driver's Learning Process Simulation by Reinforcement Learning. In Proc. of Asia Pacific Automotive Engineering Conference.
- Yoshida, N., Uchibe, E., & Doya, K. (2013). Reinforcement learning with state-dependent discount factor. In Proc. of the 3rd Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (pp. 1-6). IEEE.
- Wang, J., Uchibe, E., & Doya, K. (2013). Standing-up and Balancing Behaviors of Android Phone Robot. IEICE Technical Report, NLP2013-122, pp. 49-54.
- Uchibe, E., & Doya, K. (2013). Implementation of reinforcement learning algorithms using open-source software (in Japanese). IEICE Technical Meeting on Cloud Network Robotics, pp. 1-6.
- Uchibe, E., Ota, S., & Doya, K. (2013). Inverse Reinforcement Learning for Analysis of Human Behaviors. The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making, Princeton, New Jersey, USA, poster.
- Ota, S., Uchibe, E., & Doya, K. (2013). Analysis of human behaviors by inverse reinforcement learning in a pole balancing task. The 3rd International Symposium on Biology of Decision Making, Paris, France, poster.
- Uchibe, E., & Doya, K. (2013). Inverse reinforcement learning by density ratio estimation (in Japanese). The 16th Workshop on Information-Based Induction Sciences (IBIS2013), poster.
2012
- Yoshida, N., Yoshimoto, J., Uchibe, E., & Doya, K. (2012). Development of a smartphone-based robot platform (in Japanese). The 30th Annual Conference of the Robotics Society of Japan.
- Kinjo, K., Uchibe, E., Yoshimoto, J., & Doya, K. (2012). Robot control based on motor-visual dynamics learning and linear Bellman equations (in Japanese). IPSJ SIG Technical Report, Bioinformatics, 2012-BIO-29(4), pp. 1-6.
2011
- Uchibe, E., & Doya, K. (2011). Evolution of rewards and learning mechanisms in Cyber Rodents. In J. L. Krichmar & H. Wagatsuma (Eds.), Neuromorphic and Brain-Based Robotics, chapter 6, pp. 109-128.
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2011). Darwinian embodied evolution of the learning ability for survival. Adaptive Behavior, 19(2), 101-120.
- Kinjo, K., Uchibe, E., Yoshimoto, J., & Doya, K. (2011). Robot control based on linear Bellman equations: system identification and exponentiated value function approximation (in Japanese). IEICE Technical Report, Neurocomputing (NC), 110(461), pp. 107-112.
2010
- Morimura, T., Uchibe, E., Yoshimoto, J., Peters, J., & Doya, K. (2010). Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning. Neural Computation, 22(2), 342-376.
- Elfwing, S., Otsuka, M., Uchibe, E., & Doya, K. (2010). Free-Energy Based Reinforcement Learning for Vision-Based Navigation with High-Dimensional Sensory Inputs. In Proc. of the 17th International Conference on Neural Information Processing (pp. 215-222).
- Kimura, S., Haga, M., Uchibe, E., Yoshimoto, J., & Doya, K. (2010). Effects of environmental dynamics and observation uncertainty on CPG control with sensory feedback (in Japanese). IEICE Technical Report, Neurocomputing (NC), 109(461), pp. 219-224.
2009
- Uchibe, E., & Doya, K. (2009). Constrained Reinforcement Learning from Intrinsic and Extrinsic Rewards. In M. J. Er & Y. Zhou (Eds.), Theory and Novel Applications of Machine Learning. IN-TECH.
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2009). Co-evolution of Rewards and Meta-parameters in Embodied Evolution. In B. Sendhoff, E. Körner, O. Sporns, H. Ritter, & K. Doya (Eds.), Creating Brain-Like Intelligence (pp. 278-302). Springer.
- Morimura, T., Uchibe, E., Yoshimoto, J., & Doya, K. (2009). A Generalized Natural Actor-Critic Algorithm. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 22 (pp. 1312-1320). MIT Press.
- Elfwing, S., Uchibe, E., & Doya, K. (2009). Emergence of Different Mating Strategies in Artificial Embodied Evolution. In Proc. of the 16th International Conference on Neural Information Processing (pp. 638-647).
- Kobayashi, M., Uchibe, E., & Doya, K. (2009). Reinforcement learning with active dimensionality reduction of sensory inputs (in Japanese). IEICE Technical Report, Neurocomputing (NC), 109(53), pp. 19-24.
2008
- Uchibe, E., & Doya, K. (2008). Finding intrinsic rewards by embodied evolution and constrained reinforcement learning. Neural Networks, 21(10), 1447-1455.
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2008). Co-evolution of Shaping Rewards and Meta-Parameters in Reinforcement Learning. Adaptive Behavior, 16(6), 400-412.
- Sato, T., Uchibe, E., & Doya, K. (2008). Learning how, what, and whether to communicate: emergence of protocommunication in reinforcement learning agents. Journal of Artificial Life and Robotics, 12, 70-74.
- Morimura, T., Uchibe, E., Yoshimoto, J., & Doya, K. (2008). Natural policy gradient methods: policy search based on the natural gradient of the average reward (in Japanese). IEICE Transactions on Information and Systems, Vol. J91-D, No. 6, pp. 1515-1527.
- Morimura, T., Uchibe, E., Yoshimoto, J., & Doya, K. (2008). A New Natural Policy Gradient by Stationary Distribution Metric. In Proc. of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (pp. 82-97). Springer Berlin / Heidelberg.
- Morimura, T., Uchibe, E., & Doya, K. (2008). Natural policy gradient with baseline adjustment function for variance reduction. Artificial Life and Robotics.
- Kamioka, T., Uchibe, E., & Doya, K. (2008). Neuroevolution Based on Reusable and Hierarchical Modular Representation. Proc. of INNS-NNN Symposia.
2007
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2007). Evolutionary Development of Hierarchical Learning Structures. IEEE Transactions on Evolutionary Computation, 11(2), 249-264.
- Kamioka, T., Uchibe, E., & Doya, K. (2007). Reinforcement learning for multiple-reward tasks by max-min actor-critic (in Japanese). IEICE Transactions on Information and Systems, Vol. J90-D, No. 9, pp. 2510-2521.
- Sato, T., Uchibe, E., & Doya, K. (2007). Emergence of cooperative behaviors and communication among reinforcement learning agents (in Japanese). IPSJ Transactions on Mathematical Modeling and Its Applications (TOM19), vol. 48, no. SIG19, pp. 55-67.
- Uchibe, E., & Doya, K. (2007). The Cyber Rodent project (review article, in Japanese). The Brain & Neural Networks (Journal of the Japanese Neural Network Society), Vol. 14, No. 4.
- Uchibe, E., & Doya, K. (2007). Constrained reinforcement learning from intrinsic and extrinsic rewards. In Proc. of the IEEE International Conference on Development and Learning, London, UK.
- Uchibe, E., & Doya, K. (2007). Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents. In Proc. of the 14th International Conference on Neural Information Processing (pp. 167-176). Kitakyushu, Japan: Springer.
- Otsuka, M., Uchibe, E., & Doya, K. (2007). Acquisition of behavior-oriented state representations by neighborhood component analysis (in Japanese). IEICE Technical Meeting on Neurocomputing, Tamagawa University.
2006
- Uchibe, E., & Asada, M. (2006). Incremental Coevolution With Competitive and Cooperative Tasks in a Multirobot Environment. Proceedings of the IEEE, 94(7), 1412-1424.
- Kamioka, T., Uchibe, E., & Doya, K. (2006). Multi-objective reinforcement learning using multiple value functions (in Japanese). IEICE Technical Meeting on Neurocomputing, Tamagawa University.
- Uchibe, E., & Doya, K. (2006). Reinforcement learning under constraints given by multiple rewards (in Japanese). IEICE Technical Meeting on Neurocomputing, OIST.
- Brunskill, E., Uchibe, E., & Doya, K. (2006). Adaptive state space construction with reinforcement learning for robots. Poster presentation in Proc. of the IEEE International Conference on Robotics and Automation.
2005
- Doya, K., & Uchibe, E. (2005). The Cyber Rodent Project: Exploration of Adaptive Mechanisms for Self-Preservation and Self-Reproduction. Adaptive Behavior, 13, 149-160.
- Morimura, T., Uchibe, E., & Doya, K. (2005). Utilizing the natural gradient in temporal difference reinforcement learning with eligibility traces. In Proc. of the 2nd International Symposium on Information Geometry and its Application (pp. 256-263).
- Uchibe, E., & Doya, K. (2005). Reinforcement Learning with Multiple Modules: A Framework for Developmental Robot Learning. In Proc. of the 4th International Conference on Development and Learning, pp. 87-92.
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2005). Biologically inspired embodied evolution of survival. In Proc. of the IEEE Congress on Evolutionary Computation, pp. 2210-2216.
2004
- Uchibe, E., & Doya, K. (2004). Hierarchical reinforcement learning under multiple rewards (in Japanese). Journal of the Robotics Society of Japan, Vol. 22, No. 1, pp. 120-129.
- Elfwing, S., Uchibe, E., Doya, K., & Christensen, H. I. (2004). Multi-agent reinforcement learning: using macro actions to learn a mating task. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (Vol. 4, pp. 3164-3169).
- Uchibe, E., & Doya, K. (2004). Competitive-Cooperative-Concurrent Reinforcement Learning with Importance Sampling. In S. Schaal, A. Ijspeert, A. Billard, S. Vijayakumar, J. Hallam, & J.-A. Meyer (Eds.), Proc. of the Eighth International Conference on Simulation of Adaptive Behavior: From Animals to Animats 8 (pp. 287-296). MIT Press, Cambridge, MA.