F5-Unit_3-Deep_Q_Learning_with_Atari_Games-A0-Introduction
Studying with the Chinese and English side by side works even better!
Original course link: https://huggingface.co/deep-rl-course/unit6/introduction?fw=pt
Deep Q-Learning
![Unit 3 thumbnail]()
In the last unit, we learned our first reinforcement learning algorithm: Q-Learning, implemented it from scratch, and trained it in two environments, FrozenLake-v1 ☃️ and Taxi-v3 🚕.
We got excellent results with this simple algorithm, but these environments were relatively simple because the state space was discrete and small (16 different states for FrozenLake-v1 and 500 for Taxi-v3). For comparison, the state space in Atari games can contain $10^9$ to $10^{11}$ states.
But as we’ll see, producing and updating a Q-table can become ineffective in large state space environments.
So in this unit, we’ll study our first Deep Reinforcement Learning agent: Deep Q-Learning. Instead of using a Q-table, Deep Q-Learning uses a Neural Network that takes a state and approximates Q-values for each action based on that state.
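To make this concrete, here is a minimal sketch of what such a network can look like in PyTorch (an illustrative assumption, not the course's reference implementation): a convolutional Q-network, in the spirit of the classic DQN architecture, that takes a stack of 84×84 grayscale frames and outputs one Q-value per action. The `QNetwork` name, layer sizes, and frame-stacking setup are all assumptions for illustration.

```python
# A minimal sketch (assumption, not the course's code): a Q-network that maps
# a stack of Atari frames to one Q-value per action.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_actions: int, n_frames: int = 4):
        super().__init__()
        # Convolutional layers extract features from the stacked 84x84 grayscale frames.
        self.features = nn.Sequential(
            nn.Conv2d(n_frames, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # Fully connected head outputs one Q-value per action.
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: (batch, n_frames, 84, 84) -> Q-values: (batch, n_actions)
        return self.head(self.features(state))

# Example: pick the greedy action for one dummy 4x84x84 observation.
q_net = QNetwork(n_actions=6)
obs = torch.zeros(1, 4, 84, 84)
action = q_net(obs).argmax(dim=1).item()  # index of the highest Q-value
```

Acting greedily then amounts to taking the argmax over the network's outputs, just as we previously took the argmax over a row of the Q-table.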
And we’ll train it to play Space Invaders and other Atari environments using RL-Zoo, a training framework for RL based on Stable-Baselines3 that provides scripts for training and evaluating agents, tuning hyperparameters, plotting results, and recording videos.
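In practice, RL-Zoo is driven by ready-made scripts and tuned hyperparameter files (a training run looks roughly like `python train.py --algo dqn --env SpaceInvadersNoFrameskip-v4`, though the exact flags depend on the version you install). Since it builds on Stable-Baselines3, the sketch below shows a roughly equivalent direct Stable-Baselines3 call with illustrative, untuned hyperparameters, just to make the moving parts visible; it is not the RL-Zoo script itself.

```python
# A minimal sketch of training DQN on Space Invaders with Stable-Baselines3 directly.
# RL-Zoo wraps this kind of setup in its scripts and tuned hyperparameter files;
# the hyperparameters below are illustrative placeholders, not the tuned values.
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# Build the Atari environment with the standard preprocessing wrappers,
# then stack 4 frames so the agent can perceive motion.
env = make_atari_env("SpaceInvadersNoFrameskip-v4", n_envs=1, seed=0)
env = VecFrameStack(env, n_stack=4)

# CnnPolicy gives us a convolutional Q-network like the one sketched above.
model = DQN(
    "CnnPolicy",
    env,
    buffer_size=100_000,      # replay buffer size (illustrative)
    learning_starts=100_000,  # steps collected before learning begins (illustrative)
    verbose=1,
)
model.learn(total_timesteps=1_000_000)  # far fewer steps than a full Atari run
model.save("dqn_SpaceInvadersNoFrameskip-v4")
```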

So let’s get started! 🚀