Explicitly shows the relationships between various deep reinforcement learning techniques.
Intended for learning about and researching DRL.
- Emergence of Locomotion Behaviours in Rich Environments 7 Jul 2017
- Equivalence Between Policy Gradients and Soft Q-Learning
- Trust Region Policy Optimization
- Reinforcement Learning with Deep Energy-Based Policies
- Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
- Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 1 Jun 2017
- Noisy Networks for Exploration 30 Jun 2017 (implementation)
- Count-Based Exploration in Feature Space for Reinforcement Learning 25 Jun 2017
- Count-Based Exploration with Neural Density Models 14 Jun 2017
- UCB and InfoGain Exploration via Q-Ensembles 11 Jun 2017
- Minimax Regret Bounds for Reinforcement Learning 16 Mar 2017
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
- EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
- The Reactor: A Sample-Efficient Actor-Critic Architecture 15 Apr 2017
- Sample Efficient Actor-Critic with Experience Replay
- Reinforcement Learning with Unsupervised Auxiliary Tasks
- Continuous control with deep reinforcement learning
- Robust Imitation of Diverse Behaviors
- Learning human behaviors from motion capture by adversarial imitation
- Connecting Generative Adversarial Networks and Actor-Critic Methods