Regardless of the training method deployed, there are a few model types that
users can train using the ML-Agents Toolkit. This is due to the flexibility in
defining agent observations, which can include vector, ray cast and visual
observations. You can learn more about how to instrument an agent's observation
in the [Designing Agents](Learning-Environment-Design-Agents.md) guide.
The choice of the architecture depends on the visual complexity of the scene and
the available computational resources.
### Learning from Variable Length Observations using Attention
Using the ML-Agents toolkit, it is possible to have agents learn from a
varying number of inputs. To do so, each agent can keep track of a buffer
of vector observations. At each step, the agent will go through all the
elements in the buffer and extract information from each, even though the
elements in the buffer can change at every step. This can be useful in
scenarios in which the agents must keep track of a varying number of
elements throughout the episode. You can learn more about variable length
observations and the BufferSensor
[here](Learning-Environment-Design-Agents.md#variable-length-observations).
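
As a rough illustration of the buffering idea (the class and method names
below are hypothetical, not the toolkit's BufferSensor API), a step's
variable-length observations can be zero-padded to a fixed shape together
with a mask recording which rows are real:

```python
import numpy as np


class ObservationBuffer:
    """Illustrative sketch of a variable-length observation buffer.

    Mirrors the idea behind a BufferSensor: each step, the agent appends one
    fixed-size vector per visible entity; the buffer is zero-padded up to a
    maximum number of entries, and a boolean mask marks the real rows.
    (Names and API here are hypothetical, for illustration only.)
    """

    def __init__(self, max_entries: int, obs_size: int):
        self.max_entries = max_entries
        self.obs_size = obs_size
        self.reset()

    def reset(self) -> None:
        # Called at the start of each step: the set of entities may change.
        self._entries = []

    def append(self, obs: np.ndarray) -> None:
        # Ignore extra entities once the buffer is full.
        assert obs.shape == (self.obs_size,)
        if len(self._entries) < self.max_entries:
            self._entries.append(obs.astype(np.float32))

    def as_tensor(self):
        # Zero-pad to (max_entries, obs_size) and return a mask of real rows.
        out = np.zeros((self.max_entries, self.obs_size), dtype=np.float32)
        mask = np.zeros(self.max_entries, dtype=bool)
        for i, entry in enumerate(self._entries):
            out[i] = entry
            mask[i] = True
        return out, mask
```

Padding to a fixed shape is what lets a single network consume a set whose
size differs from step to step; the mask tells the network which rows to
ignore.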
When variable length observations are utilized, the ML-Agents Toolkit
leverages attention networks to learn from a varying number of entities.
Agents using attention will ignore entities that are deemed not relevant
and pay special attention to entities relevant to the current situation
based on context.
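
The masking trick at the heart of this is simple to sketch. The function
below is a minimal single-head dot-product attention over a padded entity
set, with padded rows masked out before the softmax; it illustrates the
general technique and is not the toolkit's implementation:

```python
import numpy as np


def masked_attention(query, entities, mask):
    """Dot-product attention over a zero-padded set of entities.

    query:    (d,) vector summarizing the agent's current context
    entities: (n, d) padded entity observations
    mask:     (n,) boolean, True for real (non-padding) rows

    Scores for masked-out rows are set to -inf before the softmax, so
    padding rows receive exactly zero weight. (Minimal sketch, not the
    ML-Agents implementation.)
    """
    # Scaled dot-product scores, one per entity.
    scores = entities @ query / np.sqrt(query.shape[0])
    scores = np.where(mask, scores, -np.inf)
    # Softmax; subtracting the max of the real scores keeps exp() stable.
    weights = np.exp(scores - scores[mask].max())
    weights = weights / weights.sum()
    # Weighted sum of entities: a fixed-size context vector.
    return weights @ entities
```

Because masked rows score negative infinity, they contribute zero weight
after the softmax, so the output is unchanged no matter how many padding
rows are appended; this is what allows a fixed network to attend over a
varying number of entities.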
### Memory-enhanced Agents using Recurrent Neural Networks