
[Documentation] Added the LSTM documentation (#387)

* [Documentation] Added the LSTM documentation

* [Documentation] Fix the line breaks

* [Documentation] Modified the doc given feedback

* [Documentation] Improvements based on PR comments

* [Documentation] Removed reference to PPO and BC
/develop-generalizationTraining-TrainerController
GitHub · 7 years ago
Current commit 8e26db97
4 files changed, 969 insertions(+), 0 deletions(-)
  1. docs/ML-Agents-Overview.md (+10)
  2. docs/Readme.md (+1)
  3. docs/Training-LSTM.md (+47)
  4. docs/images/ml-agents-LSTM.png (+911)

docs/ML-Agents-Overview.md (+10)


The [Imitation Learning](Training-Imitation-Learning.md) tutorial covers this training
mode with the **Anti-Graviator** sample environment.
### Recurrent Neural Networks
In some scenarios, agents must learn to remember the past in order to make the
best decision. When an agent has only partial observability of the environment,
keeping track of past observations can help the agent learn. We provide an
implementation of [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory) in
our trainers that enables the agent to store memories to be used in future steps.
The [Training with LSTM](Training-LSTM.md) tutorial covers this feature and
the **Hallway** environment demonstrates its capabilities.
## Flexible Training Scenarios
While the discussion so far has mostly focused on training a single agent, with

docs/Readme.md (+1)


* [Training with Proximal Policy Optimization](Training-PPO.md)
* [Training with Curriculum Learning](Training-Curriculum-Learning.md)
* [Training with Imitation Learning](Training-Imitation-Learning.md)
* [Training with LSTM](Training-LSTM.md)
* [Training on the Cloud with Amazon Web Services](Training-on-Amazon-Web-Service.md)
* [Using TensorBoard to Observe Training](Using-Tensorboard.md)

docs/Training-LSTM.md (+47)


# Using Recurrent Neural Networks in ML-Agents
## What are memories for?
Have you ever entered a room to get something and immediately forgotten
what you were looking for? Don't let that happen to your agents.
It is now possible to give memories to your agents. During training, the
agents will be able to store a vector of floats to be used the next time
they need to make a decision.
![Brain Inspector](images/ml-agents-LSTM.png)
Deciding what the agents should remember in order to solve a task is not
easy to do by hand, but our training algorithms can learn to keep
track of what is important to remember with [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory).
## How to use
When configuring the trainer parameters in the `trainer_config.yaml`
file, add the following parameters to the Brain you want to use.
```
use_recurrent: true
sequence_length: 64
memory_size: 256
```
* `use_recurrent` is a flag that notifies the trainer that you want
to use a Recurrent Neural Network.
* `sequence_length` defines how long the sequences of experiences
must be while training. In order to use an LSTM, training requires
a sequence of experiences instead of single experiences.
* `memory_size` corresponds to the size of the memory the agent
must keep. Note that if this number is too small, the agent will not
be able to remember much. If this number is too large, the neural
network will take longer to train.
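For reference, here is a minimal sketch of how these parameters might sit inside
`trainer_config.yaml`, assuming a Brain named `HallwayBrain` (the Brain name and the
`num_layers` value are illustrative, not prescriptive):
```
HallwayBrain:
    use_recurrent: true
    sequence_length: 64
    memory_size: 256
    num_layers: 1
```
Any parameter not overridden in the Brain's entry is typically taken from the
`default` section of the same file.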
## Limitations
* LSTM does not work well with continuous vector action spaces. Please use
a discrete vector action space for better results.
* Since the memories must be sent back and forth between Python
and Unity, using too large a `memory_size` will slow down training.
* Adding a recurrent layer increases the complexity of the neural
network; it is recommended to decrease `num_layers` when using a recurrent network.
* It is required that `memory_size` be divisible by 4.

docs/images/ml-agents-LSTM.png (new binary image: width 1042, height 384, 216 KiB)