浏览代码

resolved comments

/develop-generalizationTraining-TrainerController
Vincent Gao 7 年前
当前提交
69f106a4
共有 2 个文件被更改,包括 2 次插入2 次删除
  1. 2
      docs/ML-Agents-Overview.md
  2. 2
      docs/Training-Imitation-Learning.md

2
docs/ML-Agents-Overview.md


actions performed with the controller (in addition to the agent observations)
will be recorded and sent to the Python API. The imitation learning algorithm
will then use these pairs of observations and actions from the human player
to learn a policy.[Video Link](https://youtu.be/kpb8ZkMBFYs).
to learn a policy. [Video Link](https://youtu.be/kpb8ZkMBFYs).
The [Training with Imitation Learning](Training-Imitation-Learning.md) tutorial covers this
training mode with the **Banana Collector** sample environment.

2
docs/Training-Imitation-Learning.md


# Imitation Learning
It is often more intuitive to simply demonstrate the behavior we want an agent to perform, rather than attempting to have it learn via trial-and-error methods. Consider our [running example](ML-Agents-Overview.md#running-example-training-npc-behaviors) of training a medic NPC : instead of indirectly training a medic with the help of a reward function, we can give the medic real world examples of observations from the game and actions from a game controller to guide the medic's behavior. More specifically, in this mode, the Brain type during training is set to Player and all the actions performed with the controller (in addition to the agent observations) will be recorded and sent to the Python API. The imitation learning algorithm will then use these pairs of observations and actions from the human player to learn a policy.[Video Link](https://youtu.be/kpb8ZkMBFYs).
It is often more intuitive to simply demonstrate the behavior we want an agent to perform, rather than attempting to have it learn via trial-and-error methods. Consider our [running example](ML-Agents-Overview.md#running-example-training-npc-behaviors) of training a medic NPC : instead of indirectly training a medic with the help of a reward function, we can give the medic real world examples of observations from the game and actions from a game controller to guide the medic's behavior. More specifically, in this mode, the Brain type during training is set to Player and all the actions performed with the controller (in addition to the agent observations) will be recorded and sent to the Python API. The imitation learning algorithm will then use these pairs of observations and actions from the human player to learn a policy. [Video Link](https://youtu.be/kpb8ZkMBFYs).
## Using Behavioral Cloning

正在加载...
取消
保存