A Neural Substrate of Prediction and Reward

Wolfram Schultz; Peter Dayan; P. Read Montague

doi:10.1126/science.275.5306.1593

Journal article, Mar 14, 1997

Science Vol. 275 No. 5306 pp. 1593-1599 · American Association for the Advancement of Science (AAAS)

Abstract

The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.
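
The "errors in the predictions" described in the abstract can be illustrated with a temporal-difference (TD) learning rule, one well-known member of the family of adaptive optimizing control methods. The abstract names no specific model, so the sketch below is an illustration under stated assumptions rather than the authors' formulation: the trial layout (cue at step 2, reward at step 6), the learning rate `alpha`, the discount `gamma`, and the function name `run_trial` are all hypothetical choices. The error at each step is delta(t) = r(t) + gamma * V(t+1) - V(t), where V is the learned reward prediction.

```python
# Minimal TD(0) sketch of a reward prediction error (illustrative assumptions only).
# Time within a trial is discretized; the prediction V is stored as one weight
# per time step since cue onset. Before the cue, the prediction is fixed at zero
# because the cue's arrival is assumed to be unpredictable.

def run_trial(w, cue_t, reward_t, n_steps, reward=1.0,
              alpha=0.2, gamma=1.0, learn=True):
    """Run one trial and return the TD error at every time step."""

    def value(t):
        # Prediction is zero before the cue and after the trial ends.
        k = t - cue_t
        return w[k] if 0 <= k < len(w) else 0.0

    delta = [0.0] * n_steps
    for t in range(n_steps):
        r = reward if t == reward_t else 0.0
        # TD error: reward received plus change in predicted future reward.
        delta[t] = r + gamma * value(t + 1) - value(t)
        k = t - cue_t
        if learn and 0 <= k < len(w):
            w[k] += alpha * delta[t]
    return delta


N_STEPS, CUE_T, REWARD_T = 10, 2, 6      # hypothetical trial layout
w = [0.0] * (N_STEPS - CUE_T)            # one weight per post-cue time step

first = run_trial(w, CUE_T, REWARD_T, N_STEPS)            # naive: reward is a surprise
for _ in range(500):                                      # repeated cue-reward pairings
    run_trial(w, CUE_T, REWARD_T, N_STEPS)
trained = run_trial(w, CUE_T, REWARD_T, N_STEPS, learn=False)
omitted = run_trial(w, CUE_T, REWARD_T, N_STEPS, reward=0.0, learn=False)

print("error at reward, first trial:   ", round(first[REWARD_T], 3))
print("error at reward, after training:", round(trained[REWARD_T], 3))
print("error entering cue, trained:    ", round(trained[CUE_T - 1], 3))
print("error at omitted reward:        ", round(omitted[REWARD_T], 3))
```

Under these assumptions the error at the reward step starts large, shrinks toward zero as the cue comes to predict the reward, and turns negative when a predicted reward is withheld. A positive error also appears at the transition into the cue, whose arrival is itself unpredicted.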

Metrics

Citations: 7,476
References: 81

Details

Published: Mar 14, 1997
Vol/Issue: 275(5306)
Pages: 1593-1599

Authors

Wolfram Schultz

W. Schultz is at the Institute of Physiology, University of Fribourg, CH-1700 Fribourg, Switzerland.

Peter Dayan

P. Dayan is in the Department of Brain and Cognitive Sciences, Center for Biological and Computational Learning, E-25 MIT, Cambridge, MA 02139, USA.

P. Read Montague

P. R. Montague is in the Division of Neuroscience, Center for Theoretical Neuroscience, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA.

Cite This Article

Wolfram Schultz, Peter Dayan, P. Read Montague (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593-1599. https://doi.org/10.1126/science.275.5306.1593
