Publications

List of Publications

Preprints

Han D, Doya K, Li D, Tani J (2024). Synergizing habits and goals with variational Bayes. PsyArXiv. https://doi.org/10.31234/osf.io/v63yj

Keshmiri S, Tomonaga S, Mizutani H, Doya K (2024). Information Dynamics of the Heart and Respiration Rates: a Novel Venue for Digital Phenotyping in Humans. bioRxiv. https://doi.org/10.1101/2024.01.21.576502

Han D, Doya K, Li D, Tani J (2023). Habits and goals in synergy: a variational Bayesian framework for behavior. https://doi.org/10.48550/arXiv.2304.05008

Rahman F, Mikheyev A, Doya K (2020). Emergence of Alternative Reproductive Tactics in Simulated Robot Colonies. Available at SSRN: https://ssrn.com/abstract=3699150

Elfwing S, Uchibe E, Doya K (2018). Unbounded output networks for classification. arXiv:1807.09443. 

Journal Articles

2023

Abekawa N, Doya K, Gomi H (2023). Body and visual instabilities functionally modulate implicit reaching corrections. iScience, 26. https://doi.org/10.1016/j.isci.2022.105751

Blackwell KT, Doya K (2023). Enhancing reinforcement learning models by including direct and indirect pathways improves performance on striatal dependent tasks. PLoS Comput Biol, 19, e1011385. https://doi.org/10.1371/journal.pcbi.1011385

Doya K (2023). Neuroscience and open intelligence. Journal of the Robotics Society of Japan, 41, 616-617 (in Japanese). https://doi.org/10.7210/jrsj.41.616

Lalande F, Doya K (2023). Numerical data imputation for multimodal data sets: A probabilistic nearest-neighbor kernel density approach. Transactions on Machine Learning Research. https://openreview.net/forum?id=KqR3rgooXb

Hata J, Nakae K, Tsukada H, Woodward A, Haga Y, Iida M, Uematsu A, Seki F, Ichinohe N, Gong R, Kaneko T, Yoshimaru D, Watakabe A, Abe H, Tani T, Hamada HT, Gutierrez CE, Skibbe H, Maeda M, Papazian F, Hagiya K, Kishi N, Ishii S, Doya K, Shimogori T, Yamamori T, Tanaka K, Okano HJ, Okano H (2023). Multi-modal brain magnetic resonance imaging database covering marmosets with a wide age range. Scientific Data. https://doi.org/10.1038/s41597-023-02121-2

Kuniyoshi Y, Kuriyama R, Omura S, Gutierrez CE, Sun Z, Feldotto B, Albanese U, Knoll AC, Yamada T, Hirayama T, Morin FO, Igarashi J, Doya K, Yamazaki T (2023). Embodied bidirectional simulation of a spiking cortico-basal ganglia-cerebellar-thalamic brain model and a mouse musculoskeletal body model distributed across computers including the supercomputer Fugaku. Frontiers in Neurorobotics, 17. https://doi.org/10.3389/fnbot.2023.1269848

Skibbe H, Rachmadi MF, Nakae K, Gutierrez CE, Hata J, Tsukada H, Poon C, Schlachter M, Doya K, Majka P, Rosa MGP, Okano H, Yamamori T, Ishii S, Reisert M, Watakabe A (2023). The Brain/MINDS Marmoset Connectivity Resource: An open-access platform for cellular-level tracing and tractography in the primate brain. PLoS Biol, 21, e3002158. https://doi.org/10.1371/journal.pbio.3002158

Toulkeridou E, Gutierrez CE, Baum D, Doya K, Economo EP (2023). Automated segmentation of insect anatomy from micro-CT images using deep learning. Natural Sciences. https://doi.org/10.1002/ntls.20230010

Yamane Y, Ito J, Joana C, Fujita I, Tamura H, Maldonado PE, Doya K, Grun S (2023). Neuronal Population Activity in Macaque Visual Cortices Dynamically Changes through Repeated Fixations in Active Free Viewing. eNeuro, 10. https://doi.org/10.1523/ENEURO.0086-23.2023

Yoshizawa T, Ito M, Doya K (2023). Neuronal representation of a working memory-based decision strategy in the motor and prefrontal cortico-basal ganglia loops. eNeuro, 10, ENEURO.0413-22.2023. https://doi.org/10.1523/ENEURO.0413-22.2023

2022

Doya K, Ema A, Kitano H, Sakagami M, Russell S (2022). Social impact and governance of AI and neurotechnologies. Neural Networks, 152, 542-554. https://doi.org/10.1016/j.neunet.2022.05.012

Feldotto B, Eppler JM, Jimenez-Romero C, Bignamini C, Gutierrez CE, Albanese U, Retamino E, Vorobev V, Zolfaghari V, Upton A, Sun Z, Yamaura H, Heidarinejad M, Klijn W, Morrison A, Cruz F, McMurtrie C, Knoll AC, Igarashi J, Yamazaki T, Doya K, Morin FO (2022). Deploying and Optimizing Embodied Simulations of Large-Scale Spiking Neural Networks on HPC Infrastructure. Front Neuroinform, 16, 884180. https://doi.org/10.3389/fninf.2022.884180

Gutierrez CE, Skibbe H, Musset H, Doya K (2022). A spiking neural network builder for systematic data-to-model workflow. Frontiers in Neuroinformatics, 16, 855765. https://doi.org/10.3389/fninf.2022.855765

Taniguchi T, Yamakawa H, Nagai T, Doya K, Sakagami M, Suzuki M, Nakamura T, Taniguchi A (2022). A whole brain probabilistic generative model: Toward realizing cognitive architectures for developmental robots. Neural Networks, 150, 293-312. https://doi.org/10.1016/j.neunet.2022.02.026

Urakubo H, Watakabe A, Nakae K, Ishii S, Doya K (2022). Connectome: New developments at the micro, meso, and macro levels. Kaibogaku Zasshi (Acta Anatomica Nipponica), 97, 41-44 (in Japanese).

Tsukada H, Doya K (2022). Whole-brain model simulation by integrating neural tracer, structural MRI, and functional MRI data. Seitai no Kagaku, 73, 436-437 (in Japanese).

2021

Doya K (2021). Canonical cortical circuits and the duality of Bayesian inference and optimal control. Current Opinion in Behavioral Sciences, 41, 160-167. https://doi.org/10.1016/j.cobeha.2021.07.003

Doya K, Miyazaki KW, Miyazaki K (2021). Serotonergic modulation of cognitive computations. Current Opinion in Behavioral Sciences, 38, 116-123. https://doi.org/10.1016/j.cobeha.2021.02.003

Girard B, Lienard J, Gutierrez CE, Delord B, Doya K (2021). A biologically constrained spiking neural network model of the primate basal ganglia with overlapping pathways exhibits action selection. Eur J Neurosci, 53, 2254-2277. https://doi.org/10.1111/ejn.14869

Uchibe E, Doya K (2021). Forward and inverse reinforcement learning sharing network weights and hyperparameters. Neural Networks, 144, 138-153. https://doi.org/10.1016/j.neunet.2021.08.017

2020

Abe Y, Takata N, Sakai Y, Hamada H, Hiraoka Y, Aida T, Tanaka K, Bihan DL, Doya K, Tanaka KF (2020). Diffusion functional MRI reveals global brain network functional abnormalities driven by targeted local activity in a neuropsychiatric disease mouse model. Neuroimage, 223, 117318. https://doi.org/10.1016/j.neuroimage.2020.117318

Gutierrez CE, Skibbe H, Nakae K, Tsukada H, Lienard J, Watakabe A, Hata J, Reisert M, Woodward A, Yamaguchi Y, Yamamori T, Okano H, Ishii S, Doya K (2020). Optimization and validation of diffusion MRI-based fiber tracking with neural tracer data as a reference. Sci Rep, 10, 21285. https://doi.org/10.1038/s41598-020-78284-4

Gutierrez CE, Sun Z, Yamaura H, Heidarinejad M, Igarashi J, Yamazaki T, Doya K (2020). Simulation of resting-state neural activity in a loop circuit of the cerebral cortex, basal ganglia, cerebellum, and thalamus using NEST simulator. JNNS2020. 

Han D, Doya K, Tani J (2020). Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks. Neural Networks, 129, 149-162. https://doi.org/10.1016/j.neunet.2020.06.002

Miyazaki K, Miyazaki KW, Sivori G, Yamanaka A, Tanaka KF, Doya K (2020). Serotonergic projections to the orbitofrontal and medial prefrontal cortices differentially modulate waiting for future rewards. Science Advances, 6, eabc7246. https://doi.org/10.1126/sciadv.abc7246

Takahashi H, Yamashita Y, Doya K (2020). AI and neuroscience: Data-driven and theory-driven approaches to neuropsychiatric disorders. Clinical Neuroscience, 38, 1358-1363 (in Japanese).

2019

Doya K, Taniguchi T (2019). Toward evolutionary and developmental intelligence. Current Opinion in Behavioral Sciences, 29, 91-96. http://doi.org/10.1016/j.cobeha.2019.04.006

Doya K, Matsuo Y (2019). Artificial intelligence and brain science: the present and the future. Brain and Nerve, 71, 649-655 (in Japanese). https://doi.org/10.11477/mf.1416201337

Doya K (2019). On receiving the Japanese Neural Network Society Academic Award. The Brain & Neural Networks, 26, 159-164 (in Japanese). https://doi.org/10.3902/jnns.26.159

2018

Elfwing S, Uchibe E, Doya K (2018). Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107, 3-11. https://doi.org/10.1016/j.neunet.2017.12.012

Kazumi K, Hideki H, Yoshihiko F, Charles SD, Manabu H (2018). … with a sensorimotor rhythm-based brain-computer interface in a Parkinson's disease patient. Brain-Computer Interfaces. http://doi.org/10.1080/2326263X.2018.1440781

Magrans de Abril I, Yoshimoto J, Doya K (2018). Connectivity inference from neural recording data: Challenges, mathematical bases and research directions. Neural Networks. http://doi.org/10.1016/j.neunet.2018.02.016

Miyazaki K, Miyazaki KW, Yamanaka A, Tokuda T, Tanaka KF, Doya K (2018). Reward probability and timing uncertainty alter the effect of dorsal raphe serotonin neurons on patience. Nat Commun, 9, 2048. http://doi.org/10.1038/s41467-018-04496-y

Tokuda T, Yoshimoto J, Shimizu Y, Okada G, Takamura M, Okamoto Y, Yamawaki S, Doya K (2018). Identification of depression subtypes and relevant brain regions using a data-driven approach. Sci Rep, 8, 14082. http://doi.org/10.1038/s41598-018-32521-z

Yoshizawa T, Ito M, Doya K (2018). Reward-predictive neural activities in striatal striosome compartments. eNeuro, 5, e0367-17.2018. https://doi.org/10.1523/ENEURO.0367-17.2018

2017

Shouno O, Tachibana Y, Nambu A, Doya K (2017). Computational model of recurrent subthalamo-pallidal circuit for generation of parkinsonian oscillations. Frontiers in Neuroanatomy, 11, 1-15. http://doi.org/10.3389/fnana.2017.00021

Tokuda T, Yoshimoto J, Shimizu Y, Okada G, Takamura M, Okamoto Y, Yamawaki S, Doya K (2017). Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions. PLoS ONE, 12, e0186566. http://doi.org/10.1371/journal.pone.0186566

Uchibe E (2017). Model-free deep inverse reinforcement learning by logistic regression. Neural Processing Letters. http://doi.org/10.1007/s11063-017-9702-7

Wang JX, Uchibe E, Doya K (2017). Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer. Front Neurorobot, 11, 1-15. http://doi.org/10.3389/fnbot.2017.00001

Yoshida K, Shimizu Y, Yoshimoto J, Takamura M, Okada G, Okamoto Y, Yamawaki S, Doya K (2017). Prediction of clinical depression scores and detection of changes in whole-brain using resting-state functional MRI data with partial least squares regression. PLoS ONE. http://doi.org/10.1371/journal.pone.0179638

Yoshida K, Yoshimoto J, Doya K (2017). Sparse kernel canonical correlation analysis for discovery of nonlinear interactions in high-dimensional data. BMC bioinformatics, 18, 1-11. http://doi.org/10.1186/s12859-017-1543-x

2016

Caligiore D, Pezzulo G, Baldassarre G, Bostan AC, Strick PL, Doya K, et al. (2016). Consensus paper: Towards a systems-level view of cerebellar function: The interplay between cerebellum, basal ganglia, and cortex. The Cerebellum. http://doi.org/10.1007/s12311-016-0763-3

Elfwing S, Uchibe E, Doya K (2016). From free energy to value function approximation in reinforcement learning. Neural Networks, 84, 17-27. http://doi.org/10.1016/j.neunet.2016.07.013

Fermin ASR, Yoshida T, Yoshimoto J, Ito M, Tanaka SC, Doya K (2016). Model-based action planning involves cortico-cerebellar and basal ganglia networks. Nature Scientific Reports, 6, 1-14. http://doi.org/10.1038/srep31378

Funamizu A, Kuhn B, Doya K (2016). Neural substrate of dynamic Bayesian inference in the cerebral cortex. Nature Neuroscience, 1-12. http://doi.org/10.1038/nn.4390

Nagai T, Nakamuta S, Kuroda K, Nakauchi S, Nishioka T, Takano T, Zhang X, Tsuboi D, Funahashi Y, Nakano T, Yoshimoto J, Kobayashi K, Uchigashima M, Watanabe M, Miura M, Nishi A, Kobayashi K, Yamada K, Amano M, Kaibuchi K (2016). Phosphoproteomics of the dopamine pathway enables discovery of rap1 activation as a reward signal in vivo. Neuron, 89, 550-65. http://doi.org/10.1016/j.neuron.2015.12.019

Okamoto Y, Okada G, Tanaka S, Miyazaki K, Miyazaki K, Doya K, Yamawaki S (2016). The role of serotonin in waiting for future rewards in depression. International Journal of Neuropsychopharmacology, 19, 33-33.

Shimizu Y, Doya K, Okada G, Okamoto Y, Takamura M, Yamawaki S, Yoshimoto J (2016). Depression severity and related characteristics correlate significantly with activation in brain areas selected through machine learning. International Journal of Neuropsychopharmacology, 19, 135-136.

Uchibe E (2016). Forward and inverse reinforcement learning by linearly solvable Markov decision processes.

Wang J, Uchibe E, Doya K (2016). EM-based policy hyperparameter exploration: Application to standing and balancing of a two-wheeled smartphone robot. Artificial Life and Robotics, 21, 125-131. http://doi.org/10.1007/s10015-015-0260-7

2015

Balleine B, Dezfouli A, Ito M, Doya K (2015). Hierarchical control of goal-directed action in the cortical–basal ganglia network. Current Opinion in Behavioral Sciences, 5, 1-7. http://doi.org/10.1016/j.cobeha.2015.06.001

Elfwing S, Uchibe E, Doya K (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, 64, 29-38. http://doi.org/10.1016/j.neunet.2014.09.006

Funamizu A, Ito M, Doya K, Kanzaki R, Takahashi H (2015). Condition interference in rats performing a choice task with switched variable- and fixed-reward conditions. Frontiers in Neuroscience, 9. http://doi.org/10.3389/fnins.2015.00027

Hahne J, Helias M, Kunkel S, Igarashi J, Bolten M, Frommer A, Diesmann M (2015). A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations. Frontiers in Neuroinformatics, 9. http://doi.org/10.3389/fninf.2015.00022

Ito M, Doya K (2015). Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum. PLoS Computational Biology. http://doi.org/10.1371/journal.pcbi.1004540

Ito M, Doya K (2015). Distinct neural representation in the dorsolateral, dorsomedial, and ventral parts of the striatum during fixed- and free-choice tasks. Journal of Neuroscience, 35, 3499-3514.

Nakano T, Otsuka M, Yoshimoto J, Doya K (2015). A spiking neural network model of model-free reinforcement learning with high-dimensional sensory input and perceptual ambiguity. PLoS ONE, 10, e0115620.

Shimizu Y, Yoshimoto J, Toki S, Takamura M, Yoshimura S, Okamoto Y, Yamawaki S, Doya K (2015). Toward probabilistic diagnosis and understanding of depression based on functional MRI data analysis with logistic group LASSO. PLoS ONE, 10, e0123524. http://doi.org/10.1371/journal.pone.0123524

2014

Elfwing S, Doya K (2014). Emergence of polymorphic mating strategies in robot colonies. PLoS ONE, 9, e93622. http://doi.org/10.1371/journal.pone.0093622

Kunkel S, Schmidt M, Eppler JM, Plesser HE, Masumoto G, Igarashi J, Ishii S, Fukai T, Morrison A, Diesmann M, Helias M (2014). Spiking network simulation code for petascale computers. Frontiers in Neuroinformatics, 8. http://doi.org/10.3389/fninf.2014.00078

Miyazaki KW, Miyazaki K, Tanaka KF, Yamanaka A, Takahashi A, Tabuchi S, Doya K (2014). Optogenetic Activation of Dorsal Raphe Serotonin Neurons Enhances Patience for Future Rewards. Current Biology. http://doi.org/10.1016/j.cub.2014.07.041

Obrochta SP, Yokoyama Y, Moren J, Crowley TJ (2014). Conversion of GISP2-based sediment core age models to the GICC05 extended chronology. Quaternary Geochronology, 20, 1-7. http://doi.org/10.1016/j.quageo.2013.09.001

2013

Elfwing S, Uchibe E, Doya K (2013). Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces. Front Neurorobot, 7, 3. http://doi.org/10.3389/fnbot.2013.00003

Funamizu A, Kanzaki R, Takahashi H (2013). Pre-attentive, context-specific representation of fear memory in the auditory cortex of rat. PLoS ONE, 8, e63655.

Kinjo K, Uchibe E, Doya K (2013). Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task. Front Neurorobot, 7, 7. http://doi.org/10.3389/fnbot.2013.00007

Moren J, Shibata T, Doya K (2013). The mechanism of saccade motor pattern generation investigated by a large-scale spiking neuron model of the superior colliculus. PLoS ONE, 8, e57134. http://doi.org/10.1371/journal.pone.0057134

Nakano T, Yoshimoto J, Doya K (2013). A model-based prediction of the calcium responses in the striatal synaptic spines depending on the timing of cortical and dopaminergic inputs and post-synaptic spikes. Frontiers in Computational Neuroscience, 7, 119. http://doi.org/10.3389/fncom.2013.00119

Yoshimoto J, Ito M, Doya K (2013). Recent progress in reinforcement learning: Decision making in the brain and reinforcement learning. Journal of the Society of Instrument and Control Engineers, 52, 749-754.

2012

Demoto Y, Okada G, Okamoto Y, Kunisato Y, Aoyama S, Onoda K, Munakata A, Nomura M, Tanaka SC, Schweighofer N, Doya K, Yamawaki S (2012). Neural and personality correlates of individual differences related to the effects of acute tryptophan depletion on future reward evaluation. Neuropsychobiology, 65, 55-64. http://doi.org/10.1159/000328990

Funamizu A, Ito M, Doya K, Kanzaki R, Takahashi H (2012). Uncertainty in action-value estimation affects both action choice and learning rate of the choice behaviors of rats. European Journal of Neuroscience, 35, 1180-1189. http://doi.org/10.1111/j.1460-9568.2012.08025.x

Miyazaki KW, Miyazaki K, Doya K (2012). Activation of dorsal raphe serotonin neurons is necessary for waiting for delayed rewards. Journal of Neuroscience, 32, 10451-10457. http://doi.org/10.1523/JNEUROSCI.0915-12.2012

Sugimoto N, Haruno M, Doya K, Kawato M (2012). MOSAIC for Multiple-Reward Environments. Neural Computation, 24, 577-606.

2011

Elfwing S, Uchibe E, Doya K, Christensen HI (2011). Darwinian embodied evolution of the learning ability for survival. Adaptive Behavior. http://doi.org/10.1177/1059712310397633

Ito M, Doya K (2011). Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Current Opinion in Neurobiology, 21. http://doi.org/10.1016/j.conb.2011.04.001

Miyazaki K, Miyazaki KW, Doya K (2011). Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards. Journal of Neuroscience, 31, 469-479. http://doi.org/10.1523/JNEUROSCI.3714-10.2011

Miyazaki KW, Miyazaki K, Doya K (2011). Activation of the central serotonergic system in response to delayed but not omitted rewards. European Journal of Neuroscience, 33, 153-160. http://doi.org/10.1111/j.1460-9568.2010.07480.x.

Pammi VSC, Miyapuram KP, Ahmed, Samejima K, Bapi RS, Doya K (2011). Changing the structure of complex visuo-motor sequences selectively activates the fronto-parietal network. NeuroImage.

Yoshimoto J, Sato M-A, Ishii S (2011). Bayesian normalized Gaussian network and hierarchical model selection method. Intelligent Automation and Soft Computing, 17, 71-94. http://doi.org/10.1080/10798587.2011.10643134

2010

Fermin A, Yoshida T, Ito M, Yoshimoto J, Doya K (2010). Evidence for Model-Based Action Planning in a Sequential Finger Movement Task. Journal of Motor Behavior, 42, 371-379. http://doi.org/10.1080/00222895.2010.526467

Klein M, Kamp H, Palm G, Doya K (2010). A computational neural model of goal-directed utterance selection. Neural Networks. http://doi.org/10.1016/j.neunet.2010.01.003

Morimura T, Uchibe E, Yoshimoto J, Peters J, Doya K (2010). Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning. Neural Computation, 22, 342-376.

Nakano T, Doi T, Yoshimoto J, Doya K (2010). A kinetic model of dopamine and calcium dependent striatal synaptic plasticity. PLoS Computational Biology, 6, e1000670. http://doi.org/10.1371/journal.pcbi.1000670

2009

Fujiwara Y, Yamashita O, Kawawaki D, Doya K, Kawato M, Toyama K, Sato M-a (2009). A hierarchical Bayesian method to resolve an inverse problem of MEG contaminated with eye movement artifacts. NeuroImage, 45, 393-409. http://doi.org/10.1016/j.neuroimage.2008.12.012

Ito M, Doya K (2009). Validation of Decision-Making Models and Analysis of Decision Variables in the Rat Basal Ganglia. Journal of Neuroscience, 29, 9861-9874.

Ito M, Shirao T, Doya K, Sekino Y (2009). Three-dimensional distribution of Fos-positive neurons in the supramammillary nucleus of the rat exposed to novel environment. Neuroscience Research, 64, 397-402. http://doi.org/10.1016/j.neures.2009.04.013

Otsuka M, Yoshimoto J, Doya K (2009). Reward-dependent sensory coding in free-energy-based reinforcement learning. Neural Network World., 19, 597-610.

Tanaka SC, Shishida K, Schweighofer N, Okamoto Y, Yamawaki S, Doya K (2009). Serotonin affects association of aversive outcomes to past actions. Journal of Neuroscience, 29, 15669-15674. http://doi.org/10.1523/JNEUROSCI.2799-09.2009

2008

Elfwing S, Uchibe E, Doya K, Christensen HI (2008). Co-evolution of shaping rewards and meta-parameters in reinforcement learning. Adaptive Behavior, 16, 400-412.

Morimura T, Uchibe E, Yoshimoto J, Doya K (2008). A new natural gradient of average reward for policy search. IEICE Transactions, J91-D, 1515-1527.

Sato T, Uchibe E, Doya K (2008). Emergence of communication and cooperative behavior by

Schweighofer N, Bertin M, Shishida K, Okamoto Y, Tanaka S, Yamawaki S, Doya K (2008). Low-serotonin levels increase delayed reward discounting in humans. Journal of Neuroscience, 28, 4528-4532 (Erratum in 28, 5619).

2007

Bertin M, Schweighofer N, Doya K (2007). Multiple model-based reinforcement learning explains dopamine neuronal activity. Neural Netw, 20, 668-75. http://doi.org/10.1016/j.neunet.2007.04.028

Corrado G, Doya K (2007). Understanding neural coding through the model-based analysis of decision making. Journal of Neuroscience, 27, 8178-8180. http://doi.org/10.1523/Jneurosci.1590-07.2007

Doya K (2007). Reinforcement learning: Computational theory and biological mechanisms. HFSP Journal. https://doi.org/10.2976/1.2732246

Elfwing S, Doya K, Christensen HI (2007). Evolutionary development of hierarchical learning structures. IEEE Transactions on Evolutionary Computations, 11, 249-264.

Kamioka T, Uchibe E, Doya K (2007). Max-Min Actor-Critic for Multiple Reward Reinforcement Learning. IEICE Transactions on Information and Systems, J90-D, 2510-2521.

Morimoto J, Doya K (2007). Reinforcement learning state estimator. Neural Computation, 19, 730-756.

Ogasawara H, Doi T, Doya K, Kawato M (2007). Nitric oxide regulates input specificity of long-term depression and context dependence of cerebellar learning. PLoS Computational Biology, 3, e179.

Samejima K, Doya K (2007). Multiple representations of belief states and action values in corticobasal ganglia loops. Annals of New York Academy of Sciences, 1104, 213-228.

Schweighofer N, Tanaka SC, Doya K (2007). Serotonin and the evaluation of future rewards: Theory, experiments, and possible neural mechanisms. Annals of New York Academy of Sciences, 1104, 289-300.

Tanaka SC, Samejima K, Okada G, Ueda K, Okamoto Y, Yamawaki S, Doya K (2007). Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics. Neural Networks, 19, 1233-1241.

Tanaka SC, Schweighofer N, Asahi S, Shishida K, Okamoto Y, Yamawaki S, Doya K (2007). Serotonin differentially regulates short- and long-term prediction of rewards in the ventral and dorsal striatum. PLoS ONE, 2, e1333.

2006

Bando T, Shibata T, Doya K, Ishii S (2006). Switching particle filters for efficient visual tracking. Robotics and Autonomous Systems, 54, 873-884. http://doi.org/10.1016/j.robot.2006.03.004

Bapi RS, Miyapuram KP, Graydon FX, Doya K (2006). fMRI investigation of cortical and subcortical networks in the learning of abstract and effector-specific representations of motor sequences. NeuroImage, 32, 714-727. http://doi.org/10.1016/j.neuroimage.2006.04.205

Daw ND, Doya K (2006). The computational neurobiology of learning and reward. Current Opinion in Neurobiology, 16, 199-204.

Hirayama J, Yoshimoto J, Ishii S (2006). Balancing plasticity and stability of on-line learning based on hierarchical Bayesian adaptation of forgetting factors. Neurocomputing.

Kawawaki D, Shibata T, Goda N, Doya K, Kawato M (2006). Anterior and superior lateral occipito-temporal cortex responsible for target motion prediction during overt and covert visual pursuit. Neuroscience Research, 54, 112-123.

Matsubara T, Morimoto J, Nakanishi J, Sato MA, Doya K (2006). Learning CPG-based biped locomotion with a policy gradient method. Robotics and Autonomous Systems, 54, 911-920. http://doi.org/10.1016/j.robot.2006.05.012

Schweighofer N, Shishida K, Han CE, Okamoto Y, Tanaka SC, Yamawaki S, Doya K (2006). Humans can adopt optimal discounting strategy under real-time constraints. PLoS Computational Biology, 2, e152.

Sugimoto N, Samejima K, Doya K, Kawato M (2006). Hierarchical reinforcement learning: Temporal abstraction based on MOSAIC model. Transactions of Institute of Electronics, Information and Communication Engineers, J89-D, 1577-1587.

Uchibe E, Asada M (2006). Incremental co-evolution with competitive and cooperative tasks in a multi-robot environment. Proceedings of the IEEE.

2005

Capi G, Doya K (2005). Evolution of neural architecture fitting environmental dynamics. Adaptive Behavior, 13, 53-66.

Capi G, Doya K (2005). Evolution of recurrent neural controllers using an extended parallel genetic algorithm. Robotics and Autonomous Systems, 52, 148-159.

Doya K, Uchibe E (2005). The Cyber Rodent project: Exploration of adaptive mechanisms for self-preservation and self-reproduction. Adaptive Behavior, 13, 149-160.

Morimoto J, Doya K (2005). Robust reinforcement learning. Neural Computation, 17, 335-359. http://doi.org/10.1162/0899766053011528

Nishimura M, Yoshimoto J, Tokita Y, Nakamura Y, Ishii S (2005). Control of real acrobot by learning the switching rule of multiple controllers. IEICE Transactions on Fundamentals.

Samejima K, Ueda Y, Doya K, Kimura M (2005). Representation of action-specific reward values in the striatum. Science, 310, 1337-1340. http://doi.org/10.1126/science.1115270

Yoshimoto J, Doya K, Ishii S (2005). Fundamental theory and application of reinforcement learning. Keisoku to Seigyo: Journal of the Society of Instrument and Control Engineers, 44, 313-318.

Yoshimoto J, Nishimura M, Tokita Y, Ishii S (2005). Acrobot control by learning the switching of multiple controllers. Journal of Artificial Life and Robotics.

Yukinawa N, Yoshimoto J, Oba S, Ishii S (2005). System identification of gene expression time-series based on a linear dynamical system model with variational Bayesian estimation. IPSJ Transactions on Mathematical Modeling and its Applications, 46SIG10, 57-65.

2004

Haruno M, Kuroda T, Doya K, Toyama K, Kimura M, Samejima K, Imamizu H, Kawato M (2004). A neural correlate of reward-based behavioral learning in caudate nucleus: A functional magnetic resonance imaging study of a stochastic decision task. Journal of Neuroscience, 24, 1660-1665.

Hirayama J, Yoshimoto J, Ishii S (2004). Bayesian representation learning in the cortex regulated by acetylcholine. Neural Networks, 17, 1391-1400.

Miyamoto H, Morimoto J, Doya K, Kawato M (2004). Reinforcement learning with via-point representation. Neural Networks, 17, 299-305. http://doi.org/10.1016/j.neunet.2003.11.004

Sato M, Yoshioka T, Kajiwara S, Toyama K, Goda N, Doya K, Kawato M (2004). Hierarchical Bayesian estimation for MEG inverse problem. NeuroImage, 23, 806-826.

Sugimoto N, Samejima K, Doya K, Kawato M (2004). Reinforcement learning and goal estimation by multiple forward and reward models. Transactions of Institute of Electronic, Information and Communication Engineers, J87-D-II, 683-694.

Tanaka S, Doya K, Okada G, Ueda K, Okamoto Y, Yamawaki S (2004). Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neuroscience, 7, 887-893.

Uchibe E, Doya K (2004). Hierarchical reinforcement learning for multiple reward functions. Journal of Robotics Society of Japan, 22, 120-129.

Peer-Reviewed Conference Papers [since 2014]

Sangati E, Sangati F, Slors M, Doya K (2024). The collaborative abilities of ChatGPT agents in a number guessing game. Proceedings of the Joint Symposium of AROB-ISBC-SWARM 2024, https://katja.sangati.me/publication/sangati-2024-do/sangati-2024-do.pdf. 

Huang Q, Doya K (2023). Distributed reinforcement learning for DC open energy systems. ICLR Workshop: Tackling Climate Change with Machine Learning, https://www.climatechange.ai/papers/iclr2023/48. 

Han D, Kozuno T, Luo X, Chen Z-Y, Doya K, Yang Y, Li D (2022). Variational oracle guiding for reinforcement learning. International Conference on Learning Representations (ICLR2022), https://openreview.net/forum?id=pjqqxepwoMy. 

Han D, Doya K, Tani J (2020). Variational recurrent models for solving partially observable control tasks. International Conference on Learning Representations (ICLR 2020), https://iclr.cc/virtual_2020/poster_r1lL4a4tDB.html. 

Hennes D, Morrill D, Omidshafiei S, Munos R, Perolat J, Lanctot M, Gruslys A, Lespiau J-B, Parmas P, Duéñez-Guzmán E, Tuyls K (2020). Neural replicator dynamics: Multiagent learning via hedging policy gradients. Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '20), 492-501. https://doi.org/10.48550/arXiv.1906.00190

Vieillard N, Kozuno T, Scherrer B, Pietquin O, Munos R, Geist M (2020). Leverage the Average: an Analysis of Regularization in RL. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), https://proceedings.neurips.cc/paper/2020/file/8e2c381d4dd04f1c55093f22c59c3a08-Paper.pdf.

Han D, Doya K, Tani J (2019). Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent networks. NeurIPS 2019 Workshop on Context and Compositionality in Biological and Artificial Neural Systems. https://context-composition.github.io/

Kozuno T, Uchibe E, Doya K (2019). Theoretical analysis of efficiency and robustness of softmax and gap-increasing operators in reinforcement learning. 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), 89, 2995-3003, Proceedings of Machine Learning Research (PMLR). 

Parmas P, Sugiyama M (2019). A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme. NeurIPS 2019 Deep Reinforcement Learning Workshop. https://doi.org/10.48550/arXiv.1910.06419

Tokuda T, Yoshimoto J, Shimizu Y, Doya K (2019). Multiple co-clustering with heterogeneous marginal distributions and its application to identify subtypes of depressive disorder. Proceedings of 62nd ISI World Statistics Congress, 2, 323-332. 

Elfwing S, Uchibe E, Doya K (2018). Online meta-learning by parallel algorithm competition. Genetic and Evolutionary Computation Conference (GECCO 2018), 426-433, Kyoto, Japan. https://doi.org/10.1145/3205455.3205486

Parmas P (2018). Total stochastic gradient algorithms and applications in reinforcement learning. Thirty-second Conference on Neural Information Processing Systems (NeurIPS2018). Montreal, Canada.

Parmas P, Rasmussen CE, Peters J, Doya K (2018). PIPPS: Flexible model-based policy search robust to the curse of chaos. 35th International Conference on Machine Learning (ICML 2018), http://proceedings.mlr.press/v80/parmas18a/parmas18a.pdf. 

Uchibe E (2017). Model-free deep inverse reinforcement learning by logistic regression. 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2017). University of Michigan, Ann Arbor, Michigan, USA.

Reinke C, Uchibe E, Doya K (2017). Average Reward Optimization with Multiple Discounting Reinforcement Learners. The 24th International Conference on Neural Information Processing (ICONIP 2017). Guangzhou, China. (Lecture Notes in Computer Science, 10634). http://doi.org/10.1007/978-3-319-70087-8_81

Reinke C, Uchibe E, Doya K (2017). Fast Adaptation of Behavior to Changing Goals with a Gamma Ensemble. 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2017). University of Michigan, Ann Arbor, Michigan, USA.

Huang Q, Uchibe E, Doya K (2016). Emergence of communication among reinforcement learning agents under coordination environment. IEEE ICDL-EPIROB 2016. Cergy-Pontoise, Paris, France.

Yukinawa N, Doya K, Yoshimoto J (2015). A Kinetic Signal Transduction Model for Structural Plasticity of Striatal Medium Spiny Neurons. 3rd Annual Winter q-bio Meeting, Maui, Hawaii, USA.

Uchibe E, Doya K (2015). Inverse Reinforcement Learning with Density Ratio Estimation. The Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2015 (RLDM2015), University of Alberta, Edmonton, Canada.

Tokuda T, Yoshimoto J, Shimizu Y, Doya K (2015). Multiple clustering based on co-clustering views. IJCNN 2015 workshop on Advances in Learning from/with Multiple Learners, Killarney, Ireland.

Uchibe E, Doya K (2014). Combining learned controllers to achieve new goals based on linearly solvable MDPs. IEEE International Conference on Robotics and Automation (ICRA2014), Hong Kong.

PhD Theses

Huang Q (2023). Multi-agent reinforcement learning for distributed solar-battery energy systems. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00002630

Taira M (2022). The role of serotonin neurons in mouse reward-based behaviors. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00002295

Ota S (2021). Intrinsic motivation in creative activity. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001849

Parmas P (2020). Total stochastic gradient algorithms and applications to model-based reinforcement learning. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001064

Rahman F (2020). Identifying the evolutionary conditions for the emergence of alternative reproductive tactics in simulated robot colonies. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001447

Hamada H (2019). Serotonergic control of brain-wide dynamics. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000739

Reinke C (2018). The gamma-ensemble: adaptive reinforcement learning via modular discounting. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000369

Schulze JV (2018). Spatial and modular regularization in effective connectivity inference from neural activity data. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000808

Books and Book Chapters

Doya K (2023). Brain Computation: A Hands-on Guidebook. https://oist.github.io/BrainComputation/. 

Doya K (2023). Introduction to Scientific Computing. https://oist.github.io/iSciComp/. 

Doya K (2023). Computational cognitive models of reinforcement learning. In Sun R (ed.), The Cambridge Handbook of Computational Cognitive Sciences, Cambridge University Press, 739-766. https://doi.org/10.1017/9781108755610.026

Doya K (2023). Reinforcement learning. In Sun R (ed.), The Cambridge Handbook of Computational Cognitive Sciences, Cambridge University Press, 350-370. https://doi.org/10.1017/9781108755610.013

Morén J, Igarashi J, Shouno O, Yoshimoto J, Doya K (2019). Dynamics of basal ganglia and thalamus in parkinsonian tremor. In Cutsuridis V (ed.), Multiscale Models of Brain Disorders, Springer, 13-20. https://doi.org/10.1007/978-3-030-18830-6_2

Doya K (supervising translator) (2019). The Deep Learning Revolution (Japanese edition). Newton Press. (Supervisory translation of The Deep Learning Revolution by Terrence J. Sejnowski, MIT Press, 2018)

Doya K (2019). What do autonomously learning robots dream of? In AI Aesthetics and Art Research Group (ed.), Record of the AI Aesthetics and Art Exhibition, AI Aesthetics and Art Research Group, 118-119 (in Japanese).

Doya K (2019). Looking back on the AI Aesthetics and Art Exhibition at OIST. In AI Aesthetics and Art Research Group (ed.), Record of the AI Aesthetics and Art Exhibition, AI Aesthetics and Art Research Group, 170 (in Japanese).

Doya K, Kimura M (2013). The basal ganglia, reinforcement learning, and the encoding of value. In Glimcher PW, Camerer CF, Fehr E (eds.) Neuroeconomics, Second Edition: Decision Making and the Brain, 321-333. Academic Press, London.

Uchibe E, Doya K (2011). Evolution of rewards and learning mechanisms in cyber rodents. In Krichmar JL, Wagatsuma H (eds.) Neuromorphic and Brain-Based Robots. Cambridge University Press.

Doya K, Kimura M (2009). The basal ganglia and the encoding of value. In Glimcher PW, Camerer CF, Fehr E, Poldrack RA (eds.) Neuroeconomics: Decision Making and the Brain, 407-416. Academic Press, London.

Elfwing S, Uchibe E, Doya K (2009). Co-evolution of rewards and meta-parameters in embodied evolution. In Sendhoff B, Koerner E, Sporns O, Ritter H, Doya K (eds.) Creating Brain-like Intelligence, 278-302. Springer-Verlag, Berlin.

Sendhoff B, Koerner E, Sporns O, Ritter H, Doya K (2009). Creating Brain-like Intelligence. Springer-Verlag, Berlin.

Doya, K. (2007). Invitation to Computational Neuroscience: Towards Understanding the Brain Mechanisms of Learning. Science-Sha (in Japanese).

Doya, K., Ishii, S., Pouget, A., Rao, R. P. N. (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding, MIT Press.

Doya, K., Ishii, S. (2007). A probability primer. In Doya, K., Ishii, S., Pouget, A., Rao, R. P. N. eds. Bayesian Brain: Probabilistic Approaches to Neural Coding, pp. 3-13. MIT Press.

Bissmarck, F., Nakahara, H., Doya, K., Hikosaka, O. (2005). Responding to modalities with different latencies. Advances in Neural Information Processing Systems, 17, MIT Press.

Doya, K., Gomi, H., Sakaguchi, Y., Kawato, M. (2005). Computational Mechanisms of the Brain – Bottom-up and Top-down Dynamics. Asakura Shoten (in Japanese).