L11-Unit_8-Part_1_Proximal_Policy_Optimization_(PPO)-G6-Additional_Readings
Original course link: https://huggingface.co/deep-rl-course/unit2/q-learning?fw=pt
Additional Readings
These are optional readings if you want to go deeper.
PPO Explained
- Towards Delivering a Coherent Self-Contained Explanation of Proximal Policy Optimization by Daniel Bick
- What is the way to understand Proximal Policy Optimization Algorithm in RL?
- Foundations of Deep RL Series, L4 TRPO and PPO by Pieter Abbeel
- OpenAI PPO Blogpost
- Spinning Up RL PPO
- Paper Proximal Policy Optimization Algorithms
PPO Implementation Details
- The 37 Implementation Details of Proximal Policy Optimization
- Part 1 of 3 — Proximal Policy Optimization Implementation: 11 Core Implementation Details