Model-Free Offline Reinforcement Learning

BCQ: Off-Policy Deep Reinforcement Learning without Exploration

BCQ.pdf

BEAR, BRAC.pdf

References

[1] A. Kumar. Conservative Q-Learning for Offline Reinforcement Learning. 2020.

[2] T. Yu. COMBO: Conservative Offline Model-Based Policy Optimization. 2021.

[3] B. Ning. Double Deep Q-Learning for Optimal Execution.
