Publications
List of Publications
Preprints
Han D, Doya K, Li D, Tani J (2024). Synergizing habits and goals with variational Bayes. PsyArXiv. https://doi.org/10.31234/osf.io/v63yj
Keshmiri S, Tomonaga S, Mizutani H, Doya K (2024). Information Dynamics of the Heart and Respiration Rates: a Novel Venue for Digital Phenotyping in Humans. bioRxiv. https://doi.org/10.1101/2024.01.21.576502
Han D, Doya K, Li D, Tani J (2023). Habits and goals in synergy: a variational Bayesian framework for behavior. https://doi.org/10.48550/arXiv.2304.05008
Rahman F, Mikheyev A, Doya K (2020). Emergence of Alternative Reproductive Tactics in Simulated Robot Colonies. SSRN, https://ssrn.com/abstract=3699150.
Elfwing S, Uchibe E, Doya K (2018). Unbounded output networks for classification. arXiv:1807.09443.
Journal Articles
2023
Abekawa N, Doya K, Gomi H (2023). Body and visual instabilities functionally modulate implicit reaching corrections. iScience, 26. https://doi.org/10.1016/j.isci.2022.105751
Blackwell KT, Doya K (2023). Enhancing reinforcement learning models by including direct and indirect pathways improves performance on striatal dependent tasks. PLoS Comput Biol, 19, e1011385. https://doi.org/10.1371/journal.pcbi.1011385
Doya K (2023). Neuroscience and open intelligence. Journal of the Robotics Society of Japan, 41, 616-617 (in Japanese). https://doi.org/10.7210/jrsj.41.616
Lalande F, Doya K (2023). Numerical data imputation for multimodal data sets: A probabilistic nearest-neighbor kernel density approach. Transactions on Machine Learning Research, https://openreview.net/forum?id=KqR3rgooXb.
Hata J, Nakae K, Tsukada H, Woodward A, Haga Y, Iida M, Uematsu A, Seki F, Ichinohe N, Gong R, Kaneko T, Yoshimaru D, Watakabe A, Abe H, Tani T, Hamda HT, Gutierrez CE, Skibbe H, Maeda M, Papazian F, Hagiya K, Kishi N, Ishii S, Doya K, Shimogori T, Yamamori T, Tanaka K, Okano HJ, Okano H (2023). Multi-modal brain magnetic resonance imaging database covering marmosets with a wide age range. Scientific Data. https://doi.org/10.1038/s41597-023-02121-2
Kuniyoshi Y, Kuriyama R, Omura S, Gutierrez CE, Sun Z, Feldotto B, Albanese U, Knoll AC, Yamada T, Hirayama T, Morin FO, Igarashi J, Doya K, Yamazaki T (2023). Embodied bidirectional simulation of a spiking cortico-basal ganglia-cerebellar-thalamic brain model and a mouse musculoskeletal body model distributed across computers including the supercomputer Fugaku. Frontiers in Neurorobotics, 17. https://doi.org/10.3389/fnbot.2023.1269848
Skibbe H, Rachmadi MF, Nakae K, Gutierrez CE, Hata J, Tsukada H, Poon C, Schlachter M, Doya K, Majka P, Rosa MGP, Okano H, Yamamori T, Ishii S, Reisert M, Watakabe A (2023). The Brain/MINDS Marmoset Connectivity Resource: An open-access platform for cellular-level tracing and tractography in the primate brain. PLoS Biol, 21, e3002158. https://doi.org/10.1371/journal.pbio.3002158
Toulkeridou E, Gutierrez CE, Baum D, Doya K, Economo EP (2023). Automated segmentation of insect anatomy from micro‐CT images using deep learning. Natural Sciences. https://doi.org/10.1002/ntls.20230010
Yamane Y, Ito J, Joana C, Fujita I, Tamura H, Maldonado PE, Doya K, Grün S (2023). Neuronal Population Activity in Macaque Visual Cortices Dynamically Changes through Repeated Fixations in Active Free Viewing. eNeuro, 10. https://doi.org/10.1523/ENEURO.0086-23.2023
Yoshizawa T, Ito M, Doya K (2023). Neuronal representation of a working memory-based decision strategy in the motor and prefrontal cortico-basal ganglia loops. eNeuro, 10, ENEURO.0413-22.2023. https://doi.org/10.1523/ENEURO.0413-22.2023
2022
Doya K, Ema A, Kitano H, Sakagami M, Russell S (2022). Social impact and governance of AI and neurotechnologies. Neural Networks, 152, 542-554. https://doi.org/10.1016/j.neunet.2022.05.012
Feldotto B, Eppler JM, Jimenez-Romero C, Bignamini C, Gutierrez CE, Albanese U, Retamino E, Vorobev V, Zolfaghari V, Upton A, Sun Z, Yamaura H, Heidarinejad M, Klijn W, Morrison A, Cruz F, McMurtrie C, Knoll AC, Igarashi J, Yamazaki T, Doya K, Morin FO (2022). Deploying and Optimizing Embodied Simulations of Large-Scale Spiking Neural Networks on HPC Infrastructure. Front Neuroinform, 16, 884180. https://doi.org/10.3389/fninf.2022.884180
Gutierrez CE, Skibbe H, Musset H, Doya K (2022). A spiking neural network builder for systematic data-to-model workflow. Frontiers in Neuroinformatics, 16, 855765. https://doi.org/10.3389/fninf.2022.855765
Taniguchi T, Yamakawa H, Nagai T, Doya K, Sakagami M, Suzuki M, Nakamura T, Taniguchi A (2022). A whole brain probabilistic generative model: Toward realizing cognitive architectures for developmental robots. Neural Networks, 150, 293-312. https://doi.org/10.1016/j.neunet.2022.02.026
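Urakubo H, Watakabe A, Nakae K, Ishii S, Doya K (2022). Connectome: New developments at the micro, meso, and macro levels. Kaibogaku Zasshi, 97, 41-44 (in Japanese).
Tsukada H, Doya K (2022). Whole-brain model simulation by integrating neural tracer, structural MRI, and functional MRI data. Seitai no Kagaku, 73, 436-437 (in Japanese).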
2021
Doya K (2021). Canonical cortical circuits and the duality of Bayesian inference and optimal control. Current Opinion in Behavioral Sciences, 41, 160-167. https://doi.org/10.1016/j.cobeha.2021.07.003
Doya K, Miyazaki KW, Miyazaki K (2021). Serotonergic modulation of cognitive computations. Current Opinion in Behavioral Sciences, 38, 116-123. https://doi.org/10.1016/j.cobeha.2021.02.003
Girard B, Lienard J, Gutierrez CE, Delord B, Doya K (2021). A biologically constrained spiking neural network model of the primate basal ganglia with overlapping pathways exhibits action selection. Eur J Neurosci, 53, 2254-2277. https://doi.org/10.1111/ejn.14869
Uchibe E, Doya K (2021). Forward and inverse reinforcement learning sharing network weights and hyperparameters. Neural Networks, 144, 138-153. https://doi.org/10.1016/j.neunet.2021.08.017
2020
Abe Y, Takata N, Sakai Y, Hamada H, Hiraoka Y, Aida T, Tanaka K, Le Bihan D, Doya K, Tanaka KF (2020). Diffusion functional MRI reveals global brain network functional abnormalities driven by targeted local activity in a neuropsychiatric disease mouse model. Neuroimage, 223, 117318. https://doi.org/10.1016/j.neuroimage.2020.117318
Gutierrez CE, Skibbe H, Nakae K, Tsukada H, Lienard J, Watakabe A, Hata J, Reisert M, Woodward A, Yamaguchi Y, Yamamori T, Okano H, Ishii S, Doya K (2020). Optimization and validation of diffusion MRI-based fiber tracking with neural tracer data as a reference. Sci Rep, 10, 21285. https://doi.org/10.1038/s41598-020-78284-4
Gutierrez CE, Sun Z, Yamaura H, Heidarinejad M, Igarashi J, Yamazaki T, Doya K (2020). Simulation of resting-state neural activity in a loop circuit of the cerebral cortex, basal ganglia, cerebellum, and thalamus using NEST simulator. JNNS2020.
Han D, Doya K, Tani J (2020). Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks. Neural Networks, 129, 149-162. https://doi.org/10.1016/j.neunet.2020.06.002
Miyazaki K, Miyazaki KW, Sivori G, Yamanaka A, Tanaka KF, Doya K (2020). Serotonergic projections to the orbitofrontal and medial prefrontal cortices differentially modulate waiting for future rewards. Science Advances, 6, eabc7246. https://doi.org/10.1126/sciadv.abc7246
Takahashi H, Yamashita Y, Doya K (2020). AI and brain neuroscience: data-driven and theory-driven approaches to neuropsychiatric disorders. Clinical Neuroscience, 38, 1358-1363 (in Japanese).
2019
Doya K, Taniguchi T (2019). Toward evolutionary and developmental intelligence. Current Opinion in Behavioral Sciences, 29, 91-96. http://doi.org/10.1016/j.cobeha.2019.04.006.
Doya K, Matsuo Y (2019). Artificial intelligence and brain science: the present and the future. Brain and Nerve, 71, 649-655 (in Japanese). https://doi.org/10.11477/mf.1416201337
Doya K (2019). On receiving the Japanese Neural Network Society Academic Award. The Brain & Neural Networks, 26, 159-164 (in Japanese). https://doi.org/10.3902/jnns.26.159
2018
Elfwing S, Uchibe E, Doya K (2018). Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107, 3-11. https://doi.org/10.1016/j.neunet.2017.12.012
Kazumi K, Hideki H, Yoshihiko F, Charles SD, Manabu H, with a sensorimotor rhythm-based brain-computer interface in a Parkinson’s disease patient. Brain-Computer Interfaces. http://doi.org/10.1080/2326263X.2018.1440781
Magrans de Abril I, Yoshimoto J, Doya K (2018). Connectivity inference from neural recording data: Challenges, mathematical bases and research directions. Neural Networks. http://doi.org/10.1016/j.neunet.2018.02.016
Miyazaki K, Miyazaki KW, Yamanaka A, Tokuda T, Tanaka KF, Doya K (2018). Reward probability and timing uncertainty alter the effect of dorsal raphe serotonin neurons on patience. Nat Commun, 9, 2048. http://doi.org/10.1038/s41467-018-04496-y
Tokuda T, Yoshimoto J, Shimizu Y, Okada G, Takamura M, Okamoto Y, Yamawaki S, Doya K (2018). Identification of depression subtypes and relevant brain regions using a data-driven approach. Sci Rep, 8, 14082. http://doi.org/10.1038/s41598-018-32521-z
Yoshizawa T, Ito M, Doya K (2018). Reward-predictive neural activities in striatal striosome compartments. eNeuro, 5, e0367-17.2018. https://doi.org/10.1523/ENEURO.0367-17.2018
2017
Shouno O, Tachibana Y, Nambu A, Doya K (2017). Computational model of recurrent subthalamo-pallidal circuit for generation of parkinsonian oscillations. Frontiers in Neuroanatomy, 11, 1-15. http://doi.org/10.3389/fnana.2017.00021
Tokuda T, Yoshimoto J, Shimizu Y, Okada G, Takamura M, Okamoto Y, Yamawaki S, Doya K (2017). Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions. PLoS ONE, 12, e0186566. http://doi.org/10.1371/journal.pone.0186566
Uchibe E (2017). Model-free deep inverse reinforcement learning by logistic regression. Neural Processing Letters. http://doi.org/10.1007/s11063-017-9702-7
Wang JX, Uchibe E, Doya K (2017). Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer. Front Neurorobot, 11, 1-15. http://doi.org/10.3389/fnbot.2017.00001
Yoshida K, Shimizu Y, Yoshimoto J, Takamura M, Okada G, Okamoto Y, Yamawaki S, Doya K (2017). Prediction of clinical depression scores and detection of changes in whole-brain using resting-state functional MRI data with partial least squares regression. PLoS ONE. http://doi.org/10.1371/journal.pone.0179638
Yoshida K, Yoshimoto J, Doya K (2017). Sparse kernel canonical correlation analysis for discovery of nonlinear interactions in high-dimensional data. BMC bioinformatics, 18, 1-11. http://doi.org/10.1186/s12859-017-1543-x
2016
Caligiore D, Pezzulo G, Baldassarre G, Bostan AC, Strick http://doi.org/10.1007/s12311-016-0763-3
Elfwing S, Uchibe E, Doya K (2016). From free energy to value function approximation in reinforcement learning. Neural Networks, 84, 17-27. http://doi.org/10.1016/j.neunet.2016.07.013
Fermin ASR, Yoshida T, Yoshimoto J, Ito M, Tanaka SC, Doya K (2016). Model-based action planning involves cortico-cerebellar and basal ganglia networks. Scientific Reports, 6, 1-14. http://doi.org/10.1038/srep31378
Funamizu A, Kuhn B, Doya K (2016). Neural substrate of dynamic Bayesian inference in the cerebral cortex. Nature Neuroscience, 1-12. http://doi.org/10.1038/nn.4390
Nagai T, Nakamuta S, Kuroda K, Nakauchi S, Nishioka T, Takano T, Zhang X, Tsuboi D, Funahashi Y, Nakano T, Yoshimoto J, Kobayashi K, Uchigashima M, Watanabe M, Miura M, Nishi A, Kobayashi K, Yamada K, Amano M, Kaibuchi K (2016). Phosphoproteomics of the dopamine pathway enables discovery of rap1 activation as a reward signal in vivo. Neuron, 89, 550-65. http://doi.org/10.1016/j.neuron.2015.12.019
Okamoto Y, Okada G, Tanaka S, Miyazaki K, Miyazaki K, Doya K, Yamawaki S (2016). The role of serotonin in waiting for future rewards in depression. International Journal of Neuropsychopharmacology, 19, 33-33.
Shimizu Y, Doya K, Okada G, Okamoto Y, Takamura M, Yamawaki S, Yoshimoto J (2016). Depression severity and related characteristics correlate significantly with activation in brain areas selected through machine learning. International Journal of Neuropsychopharmacology, 19, 135-136.
Uchibe E (2016). Forward and inverse reinforcement learning by linearly solvable Markov decision
Wang J, Uchibe E, Doya K (2016). EM-based policy hyper131. http://doi.org/10.1007/s10015-015-0260-7
2015
Balleine B, Dezfouli A, Ito M, Doya K (2015). Hierarchical control of goal-directed action in the cortical–basal ganglia network. Current Opinion in Behavioral Sciences, 5, 1-7. http://doi.org/10.1016/j.cobeha.2015.06.001
Elfwing S, Uchibe E, Doya K (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, 64, 29-38. http://doi.org/10.1016/j.neunet.2014.09.006
Funamizu A, Ito M, Doya K, Kanzaki R, Takahashi H (2015). Condition interference in rats performing a choice task with switched variable- and fixed-reward conditions. Frontiers in Neuroscience, 9. http://doi.org/10.3389/fnins.2015.00027
Hahne J, Helias M, Kunkel S, Igarashi J, Bolten M, Frommer A, Diesmann M (2015). A unified framework for spiking and gap-junction interactions in distributed neuronal network simulations. Frontiers in Neuroinformatics, 9. http://doi.org/10.3389/fninf.2015.00022
Ito M, Doya K (2015). Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum. PLoS Computational Biology. http://doi.org/10.1371/journal.pcbi.1004540
Ito M, Doya K (2015). Distinct neural representation in the dorsolateral, dorsomedial, and ventral parts
Nakano T, Otsuka M, Yoshimoto J, Doya K (2015). A spiking neural network model of model-free
Shimizu Y, Yoshimoto J, Toki S, Takamura M, Yoshimura S,Toward probabilistic diagnosis and understanding of depression based on functional MRI data analysis with logistic group LASSO. PLoS ONE, 10, e0123524. http://doi.org/10.1371/journal.pone.0123524
2014
Elfwing S, Doya K (2014). Emergence of polymorphic mating strategies in robot colonies. PLoS ONE, 9, e93622. http://doi.org/10.1371/journal.pone.0093622
Kunkel S, Schmidt M, Eppler JM, Plesser HE, Masumoto G, Igarashi J, Ishii S, Fukai T, Morrison A, Diesmann M, Helias M (2014). Spiking network simulation code for petascale computers. Frontiers in Neuroinformatics, 8. http://doi.org/10.3389/fninf.2014.00078
Miyazaki KW, Miyazaki K, Tanaka KF, Yamanaka A, Takahashi A, Tabuchi S, Doya K (2014). Optogenetic Activation of Dorsal Raphe Serotonin Neurons Enhances Patience for Future Rewards. Current Biology. http://doi.org/10.1016/j.cub.2014.07.041
Obrochta SP, Yokoyama Y, Moren J, Crowley TJ (2014). Conversion of GISP2-based sediment core age models to the GICC05 extended chronology. Quaternary Geochronology, 20, 1-7. http://doi.org/10.1016/j.quageo.2013.09.001
2013
Elfwing S, Uchibe E, Doya K (2013). Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces. Front Neurorobot, 7, 3. http://doi.org/10.3389/fnbot.2013.00003
Funamizu A, Kanzaki R, Takahashi H (2013). Pre-attentive, context-specific representation of fear
Kinjo K, Uchibe E, Doya K (2013). Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task. Front Neurorobot, 7, 7. http://doi.org/10.3389/fnbot.2013.00007
Moren J, Shibata T, Doya K (2013). The mechanism of saccade motor pattern generation investigated by a large-scale spiking neuron model of the superior colliculus. PLoS ONE, 8, e57134. http://doi.org/10.1371/journal.pone.0057134
Nakano T, Yoshimoto J, Doya K (2013). A model-based prediction of the calcium responses in the striatal synaptic spines depending on the timing of cortical and dopaminergic inputs and post-synaptic spikes. Frontiers in Computational Neuroscience, 7, 119. http://doi.org/10.3389/fncom.2013.00119
Yoshimoto J, Ito M, Doya K (2013). Recent progress in reinforcement learning: Decision making in the brain and reinforcement learning. Journal of the Society of Instrument and Control Engineers, 52, 749-754.
2012
Demoto Y, Okada G, Okamoto Y, Kunisato Y, Aoyama S, Onoda K, Munakata A, Nomura M, Tanaka SC, Schweighofer N, Doya K, Yamawaki S (2012). Neural and personality correlates of individual differences related to the effects of acute tryptophan depletion on future reward evaluation. Neuropsychobiology, 65, 55-64. http://doi.org/10.1159/000328990
Funamizu A, Ito M, Doya K, Kanzaki R, Takahashi H (2012).Neuroscience, 35, 1180-1189. http://doi.org/10.1111/j.1460-9568.2012.08025.x.
Miyazaki KW, Miyazaki K, Doya K (2012). Activation of dorsal raphe serotonin neurons is necessary for waiting for delayed rewards. Journal of Neuroscience, 32, 10451-10457. http://doi.org/10.1523/JNEUROSCI.0915-12.2012
Sugimoto N, Haruno M, Doya K, Kawato M (2012). MOSAIC for Multiple-Reward Environments. Neural Computation, 24, 577-606.
2011
Elfwing S, Uchibe E, Doya K, Christensen HI (2011). Darwinian embodied evolution of the learning ability for survival. Adaptive Behavior. http://doi.org/10.1177/1059712310397633
Ito M, Doya K (2011). Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Current Opinion in Neurobiology, 21. http://doi.org/10.1016/j.conb.2011.04.001
Miyazaki K, Miyazaki KW, Doya K (2011). Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards. Journal of Neuroscience, 31, 469-479. http://doi.org/10.1523/JNEUROSCI.3714-10.2011
Miyazaki KW, Miyazaki K, Doya K (2011). Activation of the central serotonergic system in response to delayed but not omitted rewards. European Journal of Neuroscience, 33, 153-160. http://doi.org/10.1111/j.1460-9568.2010.07480.x.
Pammi VSC, Miyapuram KP, Ahmed, Samejima K, Bapi RS, Doya K (2011). Changing the structure of
Yoshimoto J, Sato M-A, Ishii S (2011). Bayesian normalized Gaussian network and hierarchical model selection method. Intelligent Automation and Soft Computing, 17, 71-94. http://doi.org/10.1080/10798587.2011.10643134
2010
Fermin A, Yoshida T, Ito M, Yoshimoto J (2010). Evidence for Model-Based Action Planning in a Sequential Finger Movement Task. Journal of Motor Behavior, 42, 371-379. http://doi.org/10.1080/00222895.2010.526467
Klein M, Kamp H, Palm G, Doya K (2010). A computational neural model of goal-directed utterance selection. Neural Networks. http://doi.org/10.1016/j.neunet.2010.01.003
Morimura T, Uchibe E, Yoshimoto J, Peters J, Doya K (2010). Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning. Neural Computation, 22, 342-376.
Nakano T, Doi T, Yoshimoto J, Doya K (2010). A kinetic model of dopamine and calcium dependent striatal synaptic plasticity. PLoS Computational Biology, 6, e1000670. http://doi.org/10.1371/journal.pcbi.1000670
2009
Fujiwara Y, Yamashita O, Kawawaki D, Doya K, Kawato M, Toyama K, Sato M-a (2009). A hierarchical Bayesian method to resolve an inverse problem of MEG contaminated with eye movement artifacts. NeuroImage, 45, 393-409. http://doi.org/10.1016/j.neuroimage.2008.12.012
Ito M, Doya K (2009). Validation of Decision-Making Models and Analysis of Decision Variables in the
Ito M, Shirao T, Doya K, Sekino Y (2009). supramammillary nucleus of the rat exposed to novel environment. Neuroscience Research, 64, 397-402. http://doi.org/10.1016/j.neures.2009.04.013
Otsuka M, Yoshimoto J, Doya K (2009). Reward-dependent sensory coding in free-energy-based reinforcement learning. Neural Network World, 19, 597-610.
Tanaka SC, Shishida K, Schweighofer N, Okamoto Y, Yamawaki S, Doya K (2009). Serotonin affects association of aversive outcomes to past actions. Journal of Neuroscience, 16, 15669-74. http://doi.org/10.1523/JNEUROSCI.2799-09.2009
2008
Elfwing S, Uchibe E, Doya K, Christensen HI (2008). Co-evolution of shaping rewards and meta-parameters in reinforcement learning. Adaptive Behavior, 16, 400-412.
Morimura T, Uchibe E, Yoshimoto J, Doya K (2008). A new natural gradient of average reward for policy search. IEICE Transactions, J91-D, 1515-1527.
Sato T, Uchibe E, Doya K (2008). Emergence of communication and cooperative behavior by
Schweighofer N, Bertin M, Shishida K, Okamoto Y, Tanaka S, Yamawaki S, Doya K (2008). Low-serotonin levels increase delayed reward discounting in humans. Journal of Neuroscience, 28, 4528-4532 (Erratum in 28, 5619).
2007
Bertin M, Schweighofer N, Doya K (2007). Multiple model-based reinforcement learning explains dopamine neuronal activity. Neural Netw, 20, 668-75. http://doi.org/10.1016/j.neunet.2007.04.028
Corrado G, Doya K (2007). Understanding neural coding through the model-based analysis of decision making. Journal of Neuroscience, 27, 8178-8180. http://doi.org/10.1523/Jneurosci.1590-07.2007
Doya K (2007). Reinforcement learning: Computational theory and biological mechanisms. HFSP Journal, 10.2976/1.2732246.
Elfwing S, Doya K, Christensen HI (2007). Evolutionary development of hierarchical learning structures. IEEE Transactions on Evolutionary Computation, 11, 249-264.
Kamioka T, Uchibe E, Doya K (2007). Max-Min Actor-Critic for Multiple Reward Reinforcement Learning. IEICE Transactions on Information and Systems, J90-D, 2510-2521.
Morimoto J, Doya K (2007). Reinforcement learning state estimator. Neural Computation, 19, 730-756.
Ogasawara H, Doi T, Doya K, Kawato M (2007). Nitric oxide regulates input specificity of long-term depression and context dependence of cerebellar learning. PLoS Computational Biology, 3, e179.
Samejima K, Doya K (2007). Multiple representations of belief states and action values in corticobasal ganglia loops. Annals of the New York Academy of Sciences, 1104, 213-228.
Schweighofer N, Tanaka SC, Doya K (2007). Serotonin and the evaluation of future rewards: Theory, experiments, and possible neural mechanisms. Annals of the New York Academy of Sciences, 1104, 289-300.
Tanaka SC, Samejima K, Okada G, Ueda K, Okamoto Y, Yamawaki S, Doya K (2007). Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics. Neural Networks, 19, 1233-1241.
Tanaka SC, Schweighofer N, Asahi S, Shishida K, Okamoto Y, Yamawaki S, Doya K (2007). Serotonin differentially regulates short- and long-term prediction of rewards in the ventral and dorsal striatum. PLoS ONE, 2, e1333.
2006
Bando T, Shibata T, Doya K, Ishii S (2006). Switching particle filters for efficient visual tracking. Robotics and Autonomous Systems, 54, 873-884. http://doi.org/10.1016/j.robot.2006.03.004
Bapi RS, Miyapuram KP, Graydon FX, Doya K (2006). fMRI investigation of cortical and subcortical networks in the learning of abstract and effector-specific representations of motor sequences. NeuroImage, 32, 714-727. http://doi.org/10.1016/j.neuroimage.2006.04.205
Daw ND, Doya K (2006). The computational neurobiology of learning and reward. Current Opinion in Neurobiology, 16, 199-204.
Hirayama J, Yoshimoto J, Ishii S (2006). Balancing plasticity and stability of on-line learning based on
Kawawaki D, Shibata T, Goda N, Doya K, Kawato M (2006). Anterior and superior lateral occipito-temporal cortex responsible for target motion prediction during overt and covert visual pursuit. Neuroscience Research, 54, 112-123.
Matsubara T, Morimoto J, Nakanishi J, Sato MA, Doya K (2006). Learning CPG-based biped locomotion with a policy gradient method. Robotics and Autonomous Systems, 54, 911-920. http://doi.org/10.1016/j.robot.2006.05.012
Schweighofer N, Shishida K, Han CE, Okamoto Y, Tanaka SC, Yamawaki S, Doya K (2006). Humans can adopt optimal discounting strategy under real-time constraints. PLoS Computational Biology, 2, e152.
Sugimoto N, Samejima K, Doya K, Kawato M (2006). Hierarchical reinforcement learning: Temporal abstraction based on MOSAIC model. Transactions of Institute of Electronics, Information and Communication Engineers, J89-D, 1577-1587.
Uchibe E, Asada M (2006). Incremental co-evolution with competitive and cooperative tasks in a multi-robot environment. Proceedings of the IEEE.
2005
Capi G, Doya K (2005). Evolution of neural architecture fitting environmental dynamics. Adaptive Behavior, 13, 53-66.
Capi G, Doya K (2005). Evolution of recurrent neural controllers using an extended parallel genetic algorithm. Robotics and Autonomous Systems, 52, 148-159.
Doya K, Uchibe E (2005). The Cyber Rodent project: Exploration of adaptive mechanisms for self-preservation and self-reproduction. Adaptive Behavior, 13, 149-160.
Morimoto J, Doya K (2005). Robust reinforcement learning. Neural Computation, 17, 335-359. http://doi.org/10.1162/0899766053011528
Nishimura M, Yoshimoto J, Tokita Y, Nakamura Y, Ishii S (2005). Control of real acrobot by learning the switching rule of multiple controllers. IEICE Transactions on Fundamentals.
Samejima K, Ueda Y, Doya K, Kimura M (2005). Representation of action-specific reward values in the striatum. Science, 310, 1337-1340. http://doi.org/10.1126/science.1115270
Yoshimoto J, Doya K, Ishii S (2005). Fundamental theory and application of reinforcement learning. Keisoku to Seigyo: Journal of the Society of Instrument and Control Engineers, 44, 313-318.
Yoshimoto J, Nishimura M, Tokita Y, Ishii S (2005). Acrobot control by learning the switching of multiple controllers. Journal of Artificial Life and Robotics.
Yukinawa N, Yoshimoto J, Oba S, Ishii S (2005). System identification of gene expression time-series based on a linear dynamical system model with variational Bayesian estimation. IPSJ Transactions on Mathematical Modeling and its Applications, 46SIG10, 57-65.
2004
Haruno M, Kuroda T, Doya K, Toyama K, Kimura M, Samejima K, Imamizu H, Kawato M (2004). A
Hirayama J, Yoshimoto J, Ishii S (2004). Bayesianacetylcholine. Neural Networks, 17, 1391-1400.
Miyamoto H, Morimoto J, Doya K, Kawato M (2004). Reinforcement learning with via-point representation. Neural Networks, 17, 299-305. http://doi.org/10.1016/j.neunet.2003.11.004
Sato M, Yoshioka T, Kajiwara S, Toyama K, Goda N, Doya K, Kawato M (2004). Hierarchical Bayesian estimation for MEG inverse problem. NeuroImage, 23, 806-826.
Sugimoto N, Samejima K, Doya K, Kawato M (2004). Reinforcement learning and goal estimation by multiple forward and reward models. Transactions of Institute of Electronics, Information and Communication Engineers, J87-D-II, 683-694.
Tanaka S, Doya K, Okada G, Ueda K, Okamoto Y, Yamawaki S (2004). Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neuroscience, 7, 887-893.
Uchibe E, Doya K (2004). Hierarchical reinforcement learning for multiple reward functions. Journal of Robotics Society of Japan, 22, 120-129.
Peer-Reviewed Conference Papers [since 2014]
Sangati E, Sangati F, Slors M, Doya K (2024). The collaborative abilities of ChatGPT agents in a number guessing game. Proceedings of the Joint Symposium of AROB-ISBC-SWARM 2024, https://katja.sangati.me/publication/sangati-2024-do/sangati-2024-do.pdf.
Huang Q, Doya K (2023). Distributed reinforcement learning for DC open energy systems. ICLR Workshop: Tackling Climate Change with Machine Learning, https://www.climatechange.ai/papers/iclr2023/48.
Han D, Kozuno T, Luo X, Chen Z-Y, Doya K, Yang Y, Li D (2022). Variational oracle guiding for reinforcement learning. International Conference on Learning Representations (ICLR2022), https://openreview.net/forum?id=pjqqxepwoMy.
Han D, Doya K, Tani J (2020). Variational recurrent models for solving partially observable control tasks. International Conference on Learning Representations (ICLR 2020), https://iclr.cc/virtual_2020/poster_r1lL4a4tDB.html.
Hennes D, Morrill D, Omidshafiei S, Munos R, Perolat J, Lanctot M, Gruslys A, Lespiau J-B, Parmas P, Duéñez-Guzmán E, Tuyls K (2020). Neural replicator dynamics: Multiagent learning via hedging policy gradients. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS '20), 492–501. https://doi.org/10.48550/arXiv.1906.00190
Vieillard N, Kozuno T, Scherrer B, Pietquin O, Munos R, Geist M (2020). Leverage the Average: an Analysis of Regularization in RL. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), https://proceedings.neurips.cc/paper/2020/file/8e2c381d4dd04f1c55093f22c59c3a08-Paper.pdf.
Han D, Doya K, Tani J (2019). Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent networks. NeurIPS 2019 Workshop on Context and Compositionality in Biological and Artificial Neural Systems. https://context-composition.github.io/
Kozuno T, Uchibe E, Doya K (2019). Theoretical analysis of efficiency and robustness of softmax and gap-increasing operators in reinforcement learning. 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), 89, 2995-3003, Proceedings of Machine Learning Research (PMLR).
Parmas P, Sugiyama M (2019). A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme. NeurIPS 2019 Deep Reinforcement Learning Workshop. https://doi.org/10.48550/arXiv.1910.06419
Tokuda T, Yoshimoto J, Shimizu Y, Doya K (2019). Multiple co-clustering with heterogenous marginal distributions and its application to identify subtypes of depressive disorder. Proceedings of 62nd ISI World Statistics Congress, 2, 323-332.
Elfwing S, Uchibe E, Doya K (2018). Online meta-learning by parallel algorithm competition. Genetic and Evolutionary Computation Conference, 426-433, Kyoto, Japan. https://doi.org/10.1145/3205455.3205486
Parmas P (2018). Total stochastic gradient algorithms and applications in reinforcement learning. Thirty-second Conference on Neural Information Processing Systems (NeurIPS2018). Montreal, Canada.
Parmas P, Rasmussen CE, Peters J, Doya K (2018). PIPPS: Flexible model-based policy search robust to the curse of chaos. 35th International Conference on Machine Learning (ICML 2018), http://proceedings.mlr.press/v80/parmas18a/parmas18a.pdf.
Uchibe E (2017). Model-free deep inverse reinforcement learning by logistic regression. 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2017). University of Michigan, Ann Arbor, Michigan, USA.
Reinke C, Uchibe E, Doya K (2017). Average Reward Optimization with Multiple Discounting Reinforcement Learners. The 24th International Conference on Neural Information Processing (ICONIP 2017). Guangzhou, China. (Lecture Notes in Computer Science, 10634). http://doi.org/10.1007/978-3-319-70087-8_81
Reinke C, Uchibe E, Doya K (2017). Fast Adaptation of Behavior to Changing Goals with a Gamma Ensemble. 3rd Multidisciplinary conference on reinforcement learning and decision making (RLDM2017). University of Michigan, Ann Arbor, Michigan, USA.
Huang Q, Uchibe E, Doya K (2016). Emergence of communication among reinforcement learning agents under coordination environment. IEEE ICDL-EPIROB 2016. Cergy-Pontoise, Paris, France.
Yukinawa N, Doya K, Yoshimoto J (2015). A Kinetic Signal Transduction Model for Structural Plasticity of Striatal Medium Spiny Neurons. 3rd Annual Winter q-bio Meeting, Maui, Hawaii, USA.
Uchibe E, Doya K (2015). Inverse Reinforcement Learning with Density Ratio Estimation. The Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2015 (RLDM2015), University of Alberta, Edmonton, Canada.
Tokuda T, Yoshimoto J, Shimizu Y, Doya K (2015). Multiple clustering based on co-clustering views. IJCNN 2015 workshop on Advances in Learning from/with Multiple Learners, Killarney, Ireland.
Uchibe E, Doya K (2014). Combining learned controllers to achieve new goals based on linearly solvable MDPs. IEEE International Conference on Robotics and Automation (ICRA2014), Hong Kong.
PhD Theses
Huang Q (2023). Multi-agent reinforcement learning for distributed solar-battery energy systems. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00002630
Taira M (2022). The role of serotonin neurons in mouse reward-based behaviors. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00002295
Ota S (2021). Intrinsic motivation in creative activity. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001849
Parmas P (2020). Total stochastic gradient algorithms and applications to model-based reinforcement learning. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001064
Rahman F (2020). Identifying the evolutionary conditions for the emergence of alternative reproductive tactics in simulated robot colonies. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. https://doi.org/10.15102/1394.00001447
Hamada H (2019). Serotonergic control of brain-wide dynamics. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000739
Reinke C (2018). The gamma-ensemble: adaptive reinforcement learning via modular discounting. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000369
Schulze JV (2018). Spatial and modular regularization in effective connectivity inference from neural activity data. PhD Thesis, Okinawa Institute of Science and Technology Graduate University. http://doi.org/10.15102/1394.00000808
Books and Book Chapters
Doya K (2023). Brain Computation: A Hands-on Guidebook. https://oist.github.io/BrainComputation/.
Doya K (2023). Introduction to Scientific Computing. https://oist.github.io/iSciComp/.
Doya K (2023). Computational cognitive models of reinforcement learning. Sun R, The Cambridge Handbook of Computational Cognitive Sciences, Cambridge University Press, 10.1017/9781108755610.026, 739 - 766. https://doi.org/10.1017/9781108755610.026
Doya K (2023). Reinforcement learning. Sun R, The Cambridge Handbook of Computational Cognitive Sciences, Cambridge University Press, 10.1017/9781108755610.013, 350-370. https://doi.org/10.1017/9781108755610.013
Morén J, Igarashi J, Shouno O, Yoshimoto J, Doya K (2019). Dynamics of basal ganglia and thalamus in parkinsonian tremor. Cutsuridis V, Multiscale Models of Brain Disorders, Springer, 978-3-030-18830-6_2, 13-20. https://doi.org/978-3-030-18830-6_2
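Doya K, supervising translator (2019). The Deep Learning Revolution (in Japanese). Newton Press. (Supervisory translation of The Deep Learning Revolution by Terrence J. Sejnowski. MIT Press, 2018)
Doya K (2019). What do autonomous learning robots dream of? Artificial Intelligence Art and Aesthetics Research Group, Artificial Intelligence Art and Aesthetics Exhibition: Archive Collection, Artificial Intelligence Art and Aesthetics Research Group, 118-119 (in Japanese).
Doya K (2019). Looking back on the Artificial Intelligence Art and Aesthetics Exhibition in OIST. Artificial Intelligence Art and Aesthetics Research Group, Artificial Intelligence Art and Aesthetics Exhibition: Archive Collection, Artificial Intelligence Art and Aesthetics Research Group, 170 (in Japanese).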
Doya K, Kimura M (2013). The basal ganglia, reinforcement learning, and the encoding of value. In Glimcher PW, Camerer CF, Fehr E (eds.) Neuroeconomics, Second Edition: Decision Making and the Brain, 321-333. Academic Press, London.
Uchibe E, Doya K (2011). Evolution of rewards and learning mechanisms in cyber rodents. In Krichmar JL, Wagatsuma H (eds.) Neuromorphic and Brain-Based Robots. Cambridge University Press.
Doya K, Kimura M (2009). The basal ganglia and the encoding of value. In Glimcher PW, Camerer CF, Fehr E, Poldrack RA (eds.) Neuroeconomics: Decision Making and the Brain, 407-416. Academic Press, London.
Elfwing S, Uchibe E, Doya K (2009). Co-evolution of rewards and meta-parameters in embodied evolution. In Sendhoff B, Koerner E, Sporns O, Ritter H, Doya K (eds.) Creating Brain-like Intelligence, 278-302. Springer-Verlag, Berlin.
Sendhoff B, Koerner E, Sporns O, Ritter H, Doya K (2009). Creating Brain-like Intelligence. Springer-Verlag, Berlin.
Doya, K. (2007). Invitation to Computational Neuroscience: Towards Understanding the Brain Mechanisms of Learning. Science-Sha (in Japanese).
Doya, K., Ishii, S., Pouget, A., Rao, R. P. N. (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding, MIT Press.
Doya, K., Ishii, S. (2007). A probability primer. In Doya, K., Ishii, S., Pouget, A., Rao, R. P. N. eds. Bayesian Brain: Probabilistic Approaches to Neural Coding, pp. 3-13. MIT Press.
Bissmarck, F., Nakahara, H., Doya, K., Hikosaka, O. (2005). Responding to modalities with different latencies. Advances in Neural Information Processing Systems, 17, MIT Press.
Doya, K., Gomi, H., Sakaguchi, Y., Kawato, M. (2005). Computational Mechanisms of the Brain – Bottom-up and Top-down Dynamics. Asakura Shoten (in Japanese).