F5-Unit_3-Deep_Q_Learning_with_Atari_Games-A0-Introduction
Studying with the Chinese and English side by side works even better!
Original course link: https://huggingface.co/deep-rl-course/unit6/introduction?fw=pt
Deep Q-Learning
![Unit 3 thumbnail]()
In the last unit, we learned our first reinforcement learning algorithm: Q-Learning, implemented it from scratch, and trained it in two environments, FrozenLake-v1 ☃️ and Taxi-v3 🚕.
We got excellent results with this simple algorithm, but these environments were relatively simple because the state space was discrete and small (16 different states for FrozenLake-v1 and 500 for Taxi-v3). For comparison, the state space in Atari games can contain $10^9$ to $10^{11}$ states.
But as we’ll see, producing and updating a Q-table can become ineffective in large state space environments.
So in this unit, we’ll study our first Deep Reinforcement Learning agent: Deep Q-Learning. Instead of using a Q-table, Deep Q-Learning uses a Neural Network that takes a state and approximates Q-values for each action based on that state.
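To make this concrete, here is a minimal sketch of what such a network can look like in PyTorch (an illustrative assumption, not the course's reference implementation): a convolutional Q-network, in the spirit of the classic DQN architecture, that takes a stack of 84×84 grayscale frames and outputs one Q-value per action. The `QNetwork` name, layer sizes, and frame-stacking setup are all assumptions for illustration.

```python
# A minimal sketch (assumption, not the course's code): a Q-network that maps
# a stack of Atari frames to one Q-value per action.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_actions: int, n_frames: int = 4):
        super().__init__()
        # Convolutional layers extract features from the stacked 84x84 grayscale frames.
        self.features = nn.Sequential(
            nn.Conv2d(n_frames, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # Fully connected head outputs one Q-value per action.
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: (batch, n_frames, 84, 84) -> Q-values: (batch, n_actions)
        return self.head(self.features(state))

# Example: pick the greedy action for one dummy 4x84x84 observation.
q_net = QNetwork(n_actions=6)
obs = torch.zeros(1, 4, 84, 84)
action = q_net(obs).argmax(dim=1).item()  # index of the highest Q-value
```

Acting greedily then amounts to taking the argmax over the network's outputs, just as we previously took the argmax over a row of the Q-table.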
And we’ll train it to play Space Invaders and other Atari environments using RL-Zoo, a training framework for RL based on Stable-Baselines3 that provides scripts for training and evaluating agents, tuning hyperparameters, plotting results, and recording videos.
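In practice, RL-Zoo is driven by ready-made scripts and tuned hyperparameter files (a training run looks roughly like `python train.py --algo dqn --env SpaceInvadersNoFrameskip-v4`, though the exact flags depend on the version you install). Since it builds on Stable-Baselines3, the sketch below shows a roughly equivalent direct Stable-Baselines3 call with illustrative, untuned hyperparameters, just to make the moving parts visible; it is not the RL-Zoo script itself.

```python
# A minimal sketch of training DQN on Space Invaders with Stable-Baselines3 directly.
# RL-Zoo wraps this kind of setup in its scripts and tuned hyperparameter files;
# the hyperparameters below are illustrative placeholders, not the tuned values.
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# Build the Atari environment with the standard preprocessing wrappers,
# then stack 4 frames so the agent can perceive motion.
env = make_atari_env("SpaceInvadersNoFrameskip-v4", n_envs=1, seed=0)
env = VecFrameStack(env, n_stack=4)

# CnnPolicy gives us a convolutional Q-network like the one sketched above.
model = DQN(
    "CnnPolicy",
    env,
    buffer_size=100_000,      # replay buffer size (illustrative)
    learning_starts=100_000,  # steps collected before learning begins (illustrative)
    verbose=1,
)
model.learn(total_timesteps=1_000_000)  # far fewer steps than a full Atari run
model.save("dqn_SpaceInvadersNoFrameskip-v4")
```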

So let’s get started! 🚀