Metrics
Citations: 1,242
References: 14
Details
Published: Jan 01, 2006
Pages: 282-293
Cite This Article
Levente Kocsis, Csaba Szepesvári (2006). Bandit Based Monte-Carlo Planning. Lecture Notes in Computer Science, 282-293. https://doi.org/10.1007/11871842_29