Merge branch 'master' into develop-coma2-trainer

4 years ago · e37c5a98
--- a/README.md
+++ b/README.md

 ## Features

-- 15+ [example Unity environments](docs/Learning-Environment-Examples.md)
+- 18+ [example Unity environments](docs/Learning-Environment-Examples.md)
-- Built-in support for Imitation Learning through Behavioral Cloning or
-  Generative Adversarial Imitation Learning
+- Built-in support for Imitation Learning through Behavioral Cloning (BC) or
+  Generative Adversarial Imitation Learning (GAIL)
 - Self-play mechanism for training agents in adversarial scenarios
 - Easily definable Curriculum Learning scenarios for complex tasks
 - Train robust agents using environment randomization
--- a/com.unity.ml-agents.extensions/Runtime/Input/InputActuatorComponent.cs
+++ b/com.unity.ml-agents.extensions/Runtime/Input/InputActuatorComponent.cs
    /// <see cref="InputActionActuator"/>s.
    /// </summary>
    [RequireComponent(typeof(PlayerInput), typeof(IInputActionAssetProvider))]
+    [AddComponentMenu("ML Agents/Input Actuator", (int)MenuGroup.Actuators)]
    public class InputActuatorComponent : ActuatorComponent
    {
        InputActionAsset m_InputAsset;
            }

            var inputControlScheme = new InputControlScheme(
-                 mlAgentsControlSchemeName,
+                mlAgentsControlSchemeName,
                deviceRequirements);

            return inputControlScheme;
                    var builder = new InputControlLayout.Builder()
                        .WithName(layoutName)
                        .WithFormat(mlAgentsLayoutFormat);
-                    for(var i = 0; i < defaultMap.actions.Count; i++)
+                    for (var i = 0; i < defaultMap.actions.Count; i++)
                    {
                        var action = defaultMap.actions[i];
                        builder.AddControl(action.name)
-
                }, layoutName);
            }
        }
--- a/com.unity.ml-agents.extensions/Runtime/Match3/Match3ActuatorComponent.cs
+++ b/com.unity.ml-agents.extensions/Runtime/Match3/Match3ActuatorComponent.cs
    /// <summary>
    /// Actuator component for a Match3 game. Generates a Match3Actuator at runtime.
    /// </summary>
+    [AddComponentMenu("ML Agents/Match 3 Actuator", (int)MenuGroup.Actuators)]
    public class Match3ActuatorComponent : ActuatorComponent
    {
        /// <summary>
--- a/com.unity.ml-agents.extensions/Runtime/Match3/Match3SensorComponent.cs
+++ b/com.unity.ml-agents.extensions/Runtime/Match3/Match3SensorComponent.cs
 using Unity.MLAgents.Sensors;
+using UnityEngine;

 namespace Unity.MLAgents.Extensions.Match3
 {
+    [AddComponentMenu("ML Agents/Match 3 Sensor", (int)MenuGroup.Sensors)]
    public class Match3SensorComponent : SensorComponent
    {
        /// <summary>
--- a/com.unity.ml-agents.extensions/Runtime/Sensors/GridSensor.cs
+++ b/com.unity.ml-agents.extensions/Runtime/Sensors/GridSensor.cs
    /// <summary>
    /// Grid-based sensor.
    /// </summary>
+    [AddComponentMenu("ML Agents/Grid Sensor", (int)MenuGroup.Sensors)]
    public class GridSensor : SensorComponent, ISensor, IBuiltInSensor
    {
        /// <summary>
        {
            return BuiltInSensorType.GridSensor;
        }
-

        /// <summary>
        /// GetCompressedObservation - Calls Perceive then puts the data stored on the perception buffer
--- a/com.unity.ml-agents.extensions/Tests/Runtime/Input/Unity.ML-Agents.Extensions.Input.Tests.Runtime.asmdef
+++ b/com.unity.ml-agents.extensions/Tests/Runtime/Input/Unity.ML-Agents.Extensions.Input.Tests.Runtime.asmdef
    "versionDefines": [
        {
            "name": "com.unity.inputsystem",
-            "expression": "1.1.0-preview",
+            "expression": "1.1.0",
            "define": "MLA_INPUT_TESTS"
        }
    ],
--- a/com.unity.ml-agents/Runtime/Constants.cs
+++ b/com.unity.ml-agents/Runtime/Constants.cs
    internal enum MenuGroup
    {
        Default = 0,
-        Sensors = 50
+        Sensors = 50,
+        Actuators = 100
    }
 }
--- a/docs/ML-Agents-Overview.md
+++ b/docs/ML-Agents-Overview.md
 - [Model Types](#model-types)
  - [Learning from Vector Observations](#learning-from-vector-observations)
  - [Learning from Cameras using Convolutional Neural Networks](#learning-from-cameras-using-convolutional-neural-networks)
+  - [Learning from Variable Length Observations using Attention](#learning-from-variable-length-observations-using-attention)
  - [Memory-enhanced Agents using Recurrent Neural Networks](#memory-enhanced-agents-using-recurrent-neural-networks)
 - [Additional Features](#additional-features)
 - [Summary and Next Steps](#summary-and-next-steps)

 Regardless of the training method deployed, there are a few model types that
 users can train using the ML-Agents Toolkit. This is due to the flexibility in
-defining agent observations, which can include vector, ray cast and visual
+defining agent observations, which include vector, ray cast and visual
 observations. You can learn more about how to instrument an agent's observation
 in the [Designing Agents](Learning-Environment-Design-Agents.md) guide.


 The choice of the architecture depends on the visual complexity of the scene and
 the available computational resources.
+
+### Learning from Variable Length Observations using Attention
+
+Using the ML-Agents Toolkit, it is possible to have agents learn from a
+varying number of inputs. To do so, each agent can keep track of a buffer
+of vector observations. At each step, the agent will go through all the
+elements in the buffer and extract information, but the elements
+in the buffer can change at every step.
+This can be useful in scenarios in which the agents must keep track of
+a varying number of elements throughout the episode. For example, in a game
+where an agent must learn to avoid projectiles, the number of projectiles
+can vary.
+
+![Variable Length Observations Illustrated](images/variable-length-observation-illustrated.png)
+
+You can learn more about variable length observations
+[here](Learning-Environment-Design-Agents.md#variable-length-observations).
+When variable length observations are utilized, the ML-Agents Toolkit
+leverages attention networks to learn from a varying number of entities.
+Agents using attention will ignore entities that are deemed not relevant
+and pay special attention to entities relevant to the current situation
+based on context.

 ### Memory-enhanced Agents using Recurrent Neural Networks

--- a/docs/images/variable-length-observation-illustrated.png
+++ b/docs/images/variable-length-observation-illustrated.png
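
Note on the "Learning from Variable Length Observations using Attention" section added to docs/ML-Agents-Overview.md: the idea of pooling a changing number of buffer entries into one fixed-size vector can be illustrated with a small masked attention sketch. This is not the toolkit's implementation. The function name `attention_pool`, the padded-buffer layout, and the boolean mask are assumptions made for illustration, and a real attention network would also apply learned projections to the query and entities before scoring them.

```python
import math


def attention_pool(query, entities, mask):
    """Pool a padded buffer of entity vectors into one fixed-size vector.

    Hypothetical helper for illustration only.
    query    -- list of floats, the agent's own feature vector (length d)
    entities -- list of lists of floats, padded buffer, each row of length d
    mask     -- list of bools, True where the slot holds a real entity
    """
    d = len(query)
    # With an empty buffer there is nothing to attend to.
    if not any(mask):
        return [0.0] * d, [0.0] * len(entities)
    # Scaled dot-product score per slot; padded slots score -inf
    # so that they receive exactly zero weight after the softmax.
    scores = [
        sum(q * e for q, e in zip(query, ent)) / math.sqrt(d)
        if keep else float("-inf")
        for ent, keep in zip(entities, mask)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) is 0.0
    total = sum(exps)
    weights = [x / total for x in exps]
    # Weighted sum of entities: same output size for any number of entities.
    pooled = [
        sum(w * ent[i] for w, ent in zip(weights, entities))
        for i in range(d)
    ]
    return pooled, weights


# Two real projectiles and one padding slot in a buffer of capacity three.
query = [1.0, 0.0]
buffer = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
mask = [True, True, False]
pooled, weights = attention_pool(query, buffer, mask)
```

The output length depends only on the feature size, not on how many slots are filled, which is what lets the number of observed entities change from step to step. Entities that score higher against the query get larger weights, mirroring the "pay special attention to entities relevant to the current situation" wording in the added documentation.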