
[Documentation] Added the LSTM documentation (#387)

* [Documentation] Added the LSTM documentation

* [Documentation] Fix the line breaks

* [Documentation] Modified the doc given feedback

* [Documentation] Improvements based on PR comments

* [Documentation] Removed reference to PPO and BC
/develop-generalizationTraining-TrainerController
GitHub · 6 years ago
Current commit
8e26db97
4 files changed, 969 insertions and 0 deletions
  1. docs/ML-Agents-Overview.md (+10)
  2. docs/Readme.md (+1)
  3. docs/Training-LSTM.md (+47)
  4. docs/images/ml-agents-LSTM.png (+911)

docs/ML-Agents-Overview.md (+10)


The [Imitation Learning](Training-Imitation-Learning.md) tutorial covers this training
mode with the **Anti-Graviator** sample environment.
### Recurrent Neural Networks
In some scenarios, agents must learn to remember the past in order to make the
best decision. When an agent only has partial observability of the environment,
keeping track of past observations can help the agent learn. We provide an
implementation of [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory) in
our trainers that enables the agent to store memories to be used in future steps.
The [Training with LSTM](Training-LSTM.md) tutorial covers this feature and
the **Hallway** environment demonstrates its capabilities.
## Flexible Training Scenarios
While the discussion so far has mostly focused on training a single agent, with

docs/Readme.md (+1)


* [Training with Proximal Policy Optimization](Training-PPO.md)
* [Training with Curriculum Learning](Training-Curriculum-Learning.md)
* [Training with Imitation Learning](Training-Imitation-Learning.md)
* [Training with LSTM](Training-LSTM.md)
* [Training on the Cloud with Amazon Web Services](Training-on-Amazon-Web-Service.md)
* [Using TensorBoard to Observe Training](Using-Tensorboard.md)

docs/Training-LSTM.md (+47)


# Using Recurrent Neural Networks in ML-Agents
## What are memories for?
Have you ever entered a room to get something and immediately forgotten
what you were looking for? Don't let that happen to
your agents.
It is now possible to give memories to your agents. During training, the
agents will be able to store a vector of floats to be used the next time
they need to make a decision.
![Brain Inspector](images/ml-agents-LSTM.png)
Deciding what the agents should remember in order to solve a task is not
easy to do by hand, but our training algorithms can learn to keep
track of what is important to remember with [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory).
## How to use
When configuring the trainer parameters in the `trainer_config.yaml`
file, add the following parameters to the Brain you want to use.
```
use_recurrent: true
sequence_length: 64
memory_size: 256
```
* `use_recurrent` is a flag that notifies the trainer that you want
to use a Recurrent Neural Network.
* `sequence_length` defines how long the sequences of experiences
must be while training. In order to use an LSTM, training requires
a sequence of experiences instead of single experiences.
* `memory_size` corresponds to the size of the memory the agent
must keep. Note that if this number is too small, the agent will not
be able to remember much. If this number is too large,
the neural network will take longer to train. A fuller configuration
sketch is shown after this list.
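As a concrete illustration, here is a minimal sketch of how these settings might sit inside a `trainer_config.yaml` entry. The Brain name `HallwayBrain` and the non-recurrent hyperparameters are illustrative placeholders, not values taken from this PR; only `use_recurrent`, `sequence_length`, and `memory_size` are the parameters described above.
```
# Hypothetical trainer_config.yaml entry. The Brain name and the
# non-recurrent settings are placeholders for illustration only.
HallwayBrain:
    trainer: ppo            # assumed trainer type
    num_layers: 1           # keep the network small when adding a recurrent layer
    use_recurrent: true     # enable the LSTM
    sequence_length: 64     # length of each experience sequence used for training
    memory_size: 256        # size of the memory vector; must be divisible by 4
```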
## Limitations
* LSTM does not work well with continuous vector action spaces.
Please use a discrete vector action space for better results.
* Since the memories must be sent back and forth between Python
and Unity, using too large a `memory_size` will slow down training.
* Adding a recurrent layer increases the complexity of the neural
network; it is recommended to decrease `num_layers` when using recurrent networks.
* It is required that `memory_size` be divisible by 4.

docs/images/ml-agents-LSTM.png (+911)

Width: 1042 | Height: 384 | Size: 216 KiB