浏览代码

Merge 'master' into develop-removeactionholder

/develop/removeactionholder-onehot
Ervin Teng 5 年前
当前提交
d57124b4
共有 39 个文件被更改,包括 1114 次插入659 次删除
  1. 6
      com.unity.ml-agents/CHANGELOG.md
  2. 2
      com.unity.ml-agents/Editor/DemonstrationImporter.cs
  3. 29
      com.unity.ml-agents/Runtime/Academy.cs
  4. 135
      com.unity.ml-agents/Runtime/Agent.cs
  5. 2
      com.unity.ml-agents/Runtime/InferenceBrain/TensorProxy.cs
  6. 4
      com.unity.ml-agents/Runtime/Sensor/ISensor.cs
  7. 534
      com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensor.cs
  8. 4
      com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent2D.cs
  9. 4
      com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent3D.cs
  10. 38
      com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponentBase.cs
  11. 6
      com.unity.ml-agents/Runtime/Sensor/WriteAdapter.cs
  12. 50
      com.unity.ml-agents/Tests/Editor/DemonstrationTests.cs
  13. 2
      config/gail_config.yaml
  14. 9
      docs/API-Reference.md
  15. 4
      docs/Migrating.md
  16. 2
      docs/Training-Imitation-Learning.md
  17. 133
      docs/dox-ml-agents.conf
  18. 145
      gym-unity/gym_unity/envs/__init__.py
  19. 77
      gym-unity/gym_unity/tests/test_gym.py
  20. 62
      ml-agents-envs/mlagents_envs/environment.py
  21. 82
      ml-agents/mlagents/trainers/common/nn_policy.py
  22. 6
      ml-agents/mlagents/trainers/common/tf_optimizer.py
  23. 14
      ml-agents/mlagents/trainers/learn.py
  24. 2
      ml-agents/mlagents/trainers/sac/trainer.py
  25. 13
      ml-agents/mlagents/trainers/tests/test_learn.py
  26. 2
      com.unity.ml-agents/Runtime/Demonstrations/Demonstration.cs.meta
  27. 2
      com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs.meta
  28. 14
      com.unity.ml-agents/Runtime/Demonstrations/DemonstrationWriter.cs.meta
  29. 4
      com.unity.ml-agents/Runtime/Demonstrations/Demonstration.cs
  30. 104
      com.unity.ml-agents/Runtime/Demonstrations/DemonstrationWriter.cs
  31. 8
      com.unity.ml-agents/Runtime/Demonstrations.meta
  32. 179
      com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs
  33. 95
      com.unity.ml-agents/Runtime/DemonstrationRecorder.cs
  34. 0
      /com.unity.ml-agents/Runtime/Demonstrations/Demonstration.cs.meta
  35. 0
      /com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs.meta
  36. 0
      /com.unity.ml-agents/Runtime/Demonstrations/DemonstrationWriter.cs.meta
  37. 0
      /com.unity.ml-agents/Runtime/Demonstrations/Demonstration.cs
  38. 0
      /com.unity.ml-agents/Runtime/Demonstrations/DemonstrationWriter.cs

6
com.unity.ml-agents/CHANGELOG.md


- Agent.CollectObservations now takes a VectorSensor argument. It was also overloaded to optionally take an ActionMasker argument. (#3352, #3389)
- Beta support for ONNX export was added. If the `tf2onnx` python package is installed, models will be saved to `.onnx` as well as `.nn` format.
Note that Barracuda 0.6.0 or later is required to import the `.onnx` files properly
- Multi-GPU training and the `--multi-gpu` option has been removed temporarily. (#3345)
### Minor Changes
- Monitor.cs was moved to Examples. (#3372)

- A tutorial on adding custom SideChannels was added (#3391)
- The stepping logic for the Agent and the Academy has been simplified (#3448)
- Update Barracuda to 0.6.0-preview
- The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take an input class and write to an output class.
- The command-line argument used to determine the port that an environment will listen on was changed from `--port` to `--mlagents-port`.
- `DemonstrationRecorder` can now record observations outside of the editor.
- `DemonstrationRecorder` now has an optional path for the demonstrations. This will default to `Application.dataPath` if not set.
- `DemonstrationStore` was changed to accept a `Stream` for its constructor, and was renamed to `DemonstrationWriter`
- The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`
### Bugfixes

2
com.unity.ml-agents/Editor/DemonstrationImporter.cs


var metaDataProto = DemonstrationMetaProto.Parser.ParseDelimitedFrom(reader);
var metaData = metaDataProto.ToDemonstrationMetaData();
reader.Seek(DemonstrationStore.MetaDataBytes + 1, 0);
reader.Seek(DemonstrationWriter.MetaDataBytes + 1, 0);
var brainParamsProto = BrainParametersProto.Parser.ParseDelimitedFrom(reader);
var brainParameters = brainParamsProto.ToBrainParameters();

29
com.unity.ml-agents/Runtime/Academy.cs


{
const string k_ApiVersion = "API-15-dev0";
const int k_EditorTrainingPort = 5004;
internal const string k_portCommandLineFlag = "--mlagents-port";
/// <summary>
/// True if the Academy is initialized, false otherwise.
/// </summary>
/// <summary>
/// The singleton Academy object.
/// </summary>
/// <summary>
/// Collection of float properties (indexed by a string).
/// </summary>
public IFloatProperties FloatProperties;

// Signals to all the listeners that the academy is being destroyed
internal event Action DestroyAction;
// Signals the Agent that a new step is about to start.
// Signals the Agent that a new step is about to start.
// This will mark the Agent as Done if it has reached its maxSteps.
internal event Action AgentIncrementStep;

// Signals to all the agents each time the Academy force resets.
internal event Action AgentForceReset;
// Signals that the Academy has been reset by the training process
/// <summary>
/// Signals that the Academy has been reset by the training process.
/// </summary>
public event Action OnEnvironmentReset;
AcademyFixedUpdateStepper m_FixedUpdateStepper;

/// <summary>
/// Initialize the Academy if it hasn't already been initialized.
/// This method is always safe to call; it will have no effect if the Academy is already initialized.
/// This method is always safe to call; it will have no effect if the Academy is already
/// initialized.
/// </summary>
internal void LazyInitialize()
{

}
/// <summary>
/// Enable stepping of the Academy during the FixedUpdate phase. This is done by creating a temporary
/// GameObject with a MonoBehavior that calls Academy.EnvironmentStep().
/// Enable stepping of the Academy during the FixedUpdate phase. This is done by creating
/// a temporary GameObject with a MonoBehaviour that calls Academy.EnvironmentStep().
/// </summary>
void EnableAutomaticStepping()
{

/// Registers SideChannel to the Academy to send and receive data with Python.
/// If IsCommunicatorOn is false, the SideChannel will not be registered.
/// </summary>
/// <param name="sideChannel"> The side channel to be registered.</param>
/// <param name="channel"> The side channel to be registered.</param>
public void RegisterSideChannel(SideChannel channel)
{
LazyInitialize();

/// Unregisters SideChannel to the Academy. If the side channel was not registered,
/// nothing will happen.
/// </summary>
/// <param name="sideChannel"> The side channel to be unregistered.</param>
/// <param name="channel"> The side channel to be unregistered.</param>
public void UnregisterSideChannel(SideChannel channel)
{
Communicator?.UnregisterSideChannel(channel);

var inputPort = "";
for (var i = 0; i < args.Length; i++)
{
if (args[i] == "--port")
if (args[i] == k_portCommandLineFlag)
{
inputPort = args[i + 1];
}

135
com.unity.ml-agents/Runtime/Agent.cs


using System.Collections.Generic;
using UnityEngine;
using Barracuda;
using UnityEngine.Serialization;
/// observations, actions and current status, that is sent to the Brain.
/// observations, actions and current status.
public struct AgentInfo
internal struct AgentInfo
{
/// <summary>
/// Keeps track of the last vector action taken by the Brain.

public float[] vectorActions;
}
/// Agent Monobehavior class that is attached to a Unity GameObject, making it
/// Agent MonoBehaviour class that is attached to a Unity GameObject, making it
/// user in <see cref="CollectObservations"/>. On the other hand, actions
/// are determined by decisions produced by a Policy. Currently, this
/// class is expected to be extended to implement the desired agent behavior.
/// user in <see cref="Agent.CollectObservations(VectorSensor)"/> or
/// <see cref="Agent.CollectObservations(VectorSensor, ActionMasker)"/>.
/// On the other hand, actions are determined by decisions produced by a Policy.
/// Currently, this class is expected to be extended to implement the desired agent behavior.
/// </summary>
/// <remarks>
/// Simply speaking, an agent roams through an environment and at each step

/// little may have changed between successive steps.
///
/// At any step, an agent may be considered <see cref="m_Done"/>.
/// This could occur due to a variety of reasons:
/// At any step, an agent may be considered done due to a variety of reasons:
/// - The agent reached an end state within its environment.
/// - The agent reached the maximum # of steps (i.e. timed out).
/// - The academy reached the maximum # of steps (forced agent to be done).

BehaviorParameters m_PolicyFactory;
/// This code is here to make the upgrade path for users using maxStep
/// easier. We will hook into the Serialization code and make sure that
/// easier. We will hook into the Serialization code and make sure that
/// agentParameters.maxStep and this.maxStep are in sync.
[Serializable]
internal struct AgentParameters

/// Whether or not the agent requests a decision.
bool m_RequestDecision;
/// Keeps track of the number of steps taken by the agent in this episode.
/// Note that this value is different for each agent, and may not overlap
/// with the step counter in the Academy, since agents reset based on

ActionMasker m_ActionMasker;
/// <summary>
/// Demonstration recorder.
/// Set of DemonstrationWriters that the Agent will write its step information to.
/// If you use a DemonstrationRecorder component, this will automatically register its DemonstrationWriter.
/// You can also add your own DemonstrationWriter by calling
/// DemonstrationRecorder.AddDemonstrationWriterToAgent()
DemonstrationRecorder m_Recorder;
internal ISet<DemonstrationWriter> DemonstrationWriters = new HashSet<DemonstrationWriter>();
/// <summary>
/// List of sensors used to generate observations.

/// </summary>
internal VectorSensor collectObservationsSensor;
/// MonoBehaviour function that is called when the attached GameObject
/// becomes enabled or active.
/// <summary>
/// <inheritdoc cref="OnBeforeSerialize"/>
/// </summary>
// Manages a serialization upgrade issue from v0.13 to v0.14 where maxStep moved
// from AgentParameters (since removed) to Agent
if (maxStep == 0 && maxStep != agentParameters.maxStep && !hasUpgradedFromAgentParameters)
{
maxStep = agentParameters.maxStep;

/// <summary>
/// <inheritdoc cref="OnAfterDeserialize"/>
/// </summary>
// Manages a serialization upgrade issue from v0.13 to v0.14 where maxStep moved
// from AgentParameters (since removed) to Agent
if (maxStep == 0 && maxStep != agentParameters.maxStep && !hasUpgradedFromAgentParameters)
{
maxStep = agentParameters.maxStep;

/// Helper method for the <see cref="OnEnable"/> event, created to
/// facilitate testing.
/// <summary>
/// Initializes the agent. Can be safely called multiple times.
/// </summary>
public void LazyInitialize()
{
if (m_Initialized)

// Grab the "static" properties for the Agent.
m_EpisodeId = EpisodeIdCounter.GetEpisodeId();
m_PolicyFactory = GetComponent<BehaviorParameters>();
m_Recorder = GetComponent<DemonstrationRecorder>();
m_Info = new AgentInfo();
m_Action = new AgentAction();

InitializeSensors();
}
/// Monobehavior function that is called when the attached GameObject
/// becomes disabled or inactive.
DemonstrationWriters.Clear();
// If Academy.Dispose has already been called, we don't need to unregister with it.
// We don't want to even try, because this will lazily create a new Academy!
if (Academy.IsInitialized)

// We request a decision so Python knows the Agent is done immediately
m_Brain?.RequestDecision(m_Info, sensors);
if (m_Recorder != null && m_Recorder.record && Application.isEditor)
// We also have to write any to any DemonstationStores so that they get the "done" flag.
foreach(var demoWriter in DemonstrationWriters)
m_Recorder.WriteExperience(m_Info, sensors);
demoWriter.Record(m_Info, sensors);
}
UpdateRewardStats();

m_Brain = m_PolicyFactory.GeneratePolicy(Heuristic);
}
/// <summary>
/// Returns the current step counter (within the current episode).
/// </summary>
/// <returns>

/// </returns>
public virtual float[] Heuristic()
{
throw new UnityAgentsException(string.Format(
throw new UnityAgentsException(
"{0} GameObject.",
gameObject.name));
$"{gameObject.name} GameObject.");
}
/// <summary>

collectObservationsSensor = new VectorSensor(param.vectorObservationSize);
if (param.numStackedVectorObservations > 1)
{
var stackingSensor = new StackingSensor(collectObservationsSensor, param.numStackedVectorObservations);
var stackingSensor = new StackingSensor(
collectObservationsSensor, param.numStackedVectorObservations);
sensors.Add(stackingSensor);
}
else

// Make sure the names are actually unique
for (var i = 0; i < sensors.Count - 1; i++)
{
Debug.Assert(!sensors[i].GetName().Equals(sensors[i + 1].GetName()), "Sensor names must be unique.");
Debug.Assert(
!sensors[i].GetName().Equals(sensors[i + 1].GetName()),
"Sensor names must be unique.");
}
#endif
}

m_Brain.RequestDecision(m_Info, sensors);
if (m_Recorder != null && m_Recorder.record && Application.isEditor)
// If we have any DemonstrationWriters, write the AgentInfo and sensors to them.
foreach(var demoWriter in DemonstrationWriters)
m_Recorder.WriteExperience(m_Info, sensors);
demoWriter.Record(m_Info, sensors);
for (var i = 0; i < sensors.Count; i++)
foreach (var sensor in sensors)
sensors[i].Update();
sensor.Update();
/// <summary>
/// Collects the vector observations of the agent.

/// <param name="sensor">
/// The vector observations for the agent.
/// </param>
/// <remarks>
/// An agents observation is any environment information that helps
/// the Agent achieve its goal. For example, for a fighting Agent, its

/// Vector observations are added by calling the provided helper methods
/// on the VectorSensor input:
/// - <see cref="AddObservation(int)"/>
/// - <see cref="AddObservation(float)"/>
/// - <see cref="AddObservation(Vector3)"/>
/// - <see cref="AddObservation(Vector2)"/>
/// - <see cref="AddObservation(Quaternion)"/>
/// - <see cref="AddObservation(bool)"/>
/// - <see cref="AddOneHotObservation(int, int)"/>
/// - <see cref="VectorSensor.AddObservation(int)"/>
/// - <see cref="VectorSensor.AddObservation(float)"/>
/// - <see cref="VectorSensor.AddObservation(Vector3)"/>
/// - <see cref="VectorSensor.AddObservation(Vector2)"/>
/// - <see cref="VectorSensor.AddObservation(Quaternion)"/>
/// - <see cref="VectorSensor.AddObservation(bool)"/>
/// - <see cref="VectorSensor.AddObservation(IEnumerable{float})"/>
/// - <see cref="VectorSensor.AddOneHotObservation(int, int)"/>
/// Depending on your environment, any combination of these helpers can
/// be used. They just need to be used in the exact same order each time
/// this method is called and the resulting size of the vector observation

}
/// <summary>
/// Collects the vector observations of the agent.
/// Collects the vector observations of the agent alongside the masked actions.
/// <param name="sensor">
/// The vector observations for the agent.
/// </param>
/// <param name="actionMasker">
/// The masked actions for the agent.
/// </param>
/// <remarks>
/// An agents observation is any environment information that helps
/// the Agent achieve its goal. For example, for a fighting Agent, its

/// Vector observations are added by calling the provided helper methods
/// on the VectorSensor input:
/// - <see cref="AddObservation(int)"/>
/// - <see cref="AddObservation(float)"/>
/// - <see cref="AddObservation(Vector3)"/>
/// - <see cref="AddObservation(Vector2)"/>
/// - <see cref="AddObservation(Quaternion)"/>
/// - <see cref="AddObservation(bool)"/>
/// - <see cref="AddOneHotObservation(int, int)"/>
/// - <see cref="VectorSensor.AddObservation(int)"/>
/// - <see cref="VectorSensor.AddObservation(float)"/>
/// - <see cref="VectorSensor.AddObservation(Vector3)"/>
/// - <see cref="VectorSensor.AddObservation(Vector2)"/>
/// - <see cref="VectorSensor.AddObservation(Quaternion)"/>
/// - <see cref="VectorSensor.AddObservation(bool)"/>
/// - <see cref="VectorSensor.AddObservation(IEnumerable{float})"/>
/// - <see cref="VectorSensor.AddOneHotObservation(int, int)"/>
/// Depending on your environment, any combination of these helpers can
/// be used. They just need to be used in the exact same order each time
/// this method is called and the resulting size of the vector observation

/// When using Discrete Control, you can prevent the Agent from using a certain
/// action by masking it. You can call the following method on the ActionMasker
/// input :
/// - <see cref="SetActionMask(int branch, IEnumerable<int> actionIndices)"/>
/// - <see cref="SetActionMask(int branch, int actionIndex)"/>
/// - <see cref="SetActionMask(IEnumerable<int> actionIndices)"/>
/// - <see cref="SetActionMask(int branch, int actionIndex)"/>
/// - <see cref="ActionMasker.SetActionMask(int)"/>
/// - <see cref="ActionMasker.SetActionMask(int, int)"/>
/// - <see cref="ActionMasker.SetActionMask(int, IEnumerable{int})"/>
/// - <see cref="ActionMasker.SetActionMask(IEnumerable{int})"/>
/// The branch input is the index of the action, actionIndices are the indices of the
/// invalid options for that action.
/// </remarks>

}
/// <summary>
/// Returns the last action that was decided on by the Agent (returns null if no decision has been made)
/// Returns the last action that was decided on by the Agent
/// <returns>
/// The last action that was decided by the Agent (or null if no decision has been made)
/// </returns>
public float[] GetAction()
{
return m_Action.vectorActions;

/// <param name="min"></param>
/// <param name="max"></param>
/// <returns></returns>
protected float ScaleAction(float rawAction, float min, float max)
protected static float ScaleAction(float rawAction, float min, float max)
{
var middle = (min + max) / 2;
var range = (max - min) / 2;

2
com.unity.ml-agents/Runtime/InferenceBrain/TensorProxy.cs


/// allowing the user to specify everything but the data in a graphical way.
/// </summary>
[Serializable]
public class TensorProxy
internal class TensorProxy
{
public enum TensorType
{

4
com.unity.ml-agents/Runtime/Sensor/ISensor.cs


/// Note that this (and GetCompressedObservation) may be called multiple times per agent step, so should not
/// mutate any internal state.
/// </summary>
/// <param name="adapater"></param>
/// <param name="adapter"></param>
int Write(WriteAdapter adapater);
int Write(WriteAdapter adapter);
/// <summary>
/// Return a compressed representation of the observation. For small observations, this should generally not be

534
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensor.cs


namespace MLAgents
{
public class RayPerceptionSensor : ISensor
/// <summary>
/// Determines which dimensions the sensor will perform the casts in.
/// </summary>
public enum RayPerceptionCastType
{
/// Cast in 2 dimensions, using Physics2D.CircleCast or Physics2D.RayCast.
Cast2D,
/// Cast in 3 dimensions, using Physics.SphereCast or Physics.RayCast.
Cast3D,
}
public struct RayPerceptionInput
public enum CastType
{
Cast2D,
Cast3D,
}
/// <summary>
/// Length of the rays to cast. This will be scaled up or down based on the scale of the transform.
/// </summary>
public float rayLength;
/// <summary>
/// List of tags which correspond to object types agent can see.
/// </summary>
public IReadOnlyList<string> detectableTags;
/// <summary>
/// List of angles (in degrees) used to define the rays.
/// 90 degrees is considered "forward" relative to the game object.
/// </summary>
public IReadOnlyList<float> angles;
/// <summary>
/// Starting height offset of ray from center of agent
/// </summary>
public float startOffset;
float[] m_Observations;
int[] m_Shape;
string m_Name;
/// <summary>
/// Ending height offset of ray from center of agent.
/// </summary>
public float endOffset;
float m_RayDistance;
List<string> m_DetectableObjects;
float[] m_Angles;
/// <summary>
/// Radius of the sphere to use for spherecasting.
/// If 0 or less, rays are used instead - this may be faster, especially for complex environments.
/// </summary>
public float castRadius;
float m_StartOffset;
float m_EndOffset;
float m_CastRadius;
CastType m_CastType;
Transform m_Transform;
int m_LayerMask;
/// <summary>
/// Transform of the GameObject.
/// </summary>
public Transform transform;
/// Debug information for the raycast hits. This is used by the RayPerceptionSensorComponent.
/// Whether to perform the casts in 2D or 3D.
public class DebugDisplayInfo
public RayPerceptionCastType castType;
/// <summary>
/// Filtering options for the casts.
/// </summary>
public int layerMask;
/// <summary>
/// Returns the expected number of floats in the output.
/// </summary>
/// <returns></returns>
public int OutputSize()
public struct RayInfo
return (detectableTags.Count + 2) * angles.Count;
}
/// <summary>
/// Get the cast start and end points for the given ray index/
/// </summary>
/// <param name="rayIndex"></param>
/// <returns>A tuple of the start and end positions in world space.</returns>
public (Vector3 StartPositionWorld, Vector3 EndPositionWorld) RayExtents(int rayIndex)
{
var angle = angles[rayIndex];
Vector3 startPositionLocal, endPositionLocal;
if (castType == RayPerceptionCastType.Cast3D)
public Vector3 localStart;
public Vector3 localEnd;
public Vector3 worldStart;
public Vector3 worldEnd;
public bool castHit;
public float hitFraction;
public float castRadius;
startPositionLocal = new Vector3(0, startOffset, 0);
endPositionLocal = PolarToCartesian3D(rayLength, angle);
endPositionLocal.y += endOffset;
public void Reset()
else
m_Frame = Time.frameCount;
// Vector2s here get converted to Vector3s (and back to Vector2s for casting)
startPositionLocal = new Vector2();
endPositionLocal = PolarToCartesian2D(rayLength, angle);
var startPositionWorld = transform.TransformPoint(startPositionLocal);
var endPositionWorld = transform.TransformPoint(endPositionLocal);
return (StartPositionWorld: startPositionWorld, EndPositionWorld: endPositionWorld);
}
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static internal Vector3 PolarToCartesian3D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var z = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector3(x, 0f, z);
}
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static internal Vector2 PolarToCartesian2D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var y = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector2(x, y);
}
}
public class RayPerceptionOutput
{
public struct RayOutput
{
/// "Age" of the results in number of frames. This is used to adjust the alpha when drawing.
/// Whether or not the ray hit anything.
/// </summary>
public bool hasHit;
/// <summary>
/// Whether or not the ray hit an object whose tag is in the input's detectableTags list.
/// </summary>
public bool hitTaggedObject;
/// <summary>
/// The index of the hit object's tag in the detectableTags list, or -1 if there was no hit, or the
/// hit object has a different tag.
/// </summary>
public int hitTagIndex;
/// <summary>
/// Normalized distance to the hit object.
/// </summary>
public float hitFraction;
/// <summary>
/// Writes the ray output information to a subset of the float array. Each element in the rayAngles array
/// determines a sublist of data to the observation. The sublist contains the observation data for a single cast.
/// The list is composed of the following:
/// 1. A one-hot encoding for detectable tags. For example, if detectableTags.Length = n, the
/// first n elements of the sublist will be a one-hot encoding of the detectableTag that was hit, or
/// all zeroes otherwise.
/// 2. The 'numDetectableTags' element of the sublist will be 1 if the ray missed everything, or 0 if it hit
/// something (detectable or not).
/// 3. The 'numDetectableTags+1' element of the sublist will contain the normalized distance to the object
/// hit, or 1.0 if nothing was hit.
public int age
/// <param name="numDetectableTags"></param>
/// <param name="rayIndex"></param>
/// <param name="buffer">Output buffer. The size must be equal to (numDetectableTags+2) * rayOutputs.Length</param>
public void ToFloatArray(int numDetectableTags, int rayIndex, float[] buffer)
get { return Time.frameCount - m_Frame; }
var bufferOffset = (numDetectableTags + 2) * rayIndex;
if (hitTaggedObject)
{
buffer[bufferOffset + hitTagIndex] = 1f;
}
buffer[bufferOffset + numDetectableTags] = hasHit ? 0f : 1f;
buffer[bufferOffset + numDetectableTags + 1] = hitFraction;
}
/// <summary>
/// RayOutput for each ray that was cast.
/// </summary>
public RayOutput[] rayOutputs;
}
public RayInfo[] rayInfos;
/// <summary>
/// Debug information for the raycast hits. This is used by the RayPerceptionSensorComponent.
/// </summary>
internal class DebugDisplayInfo
{
public struct RayInfo
{
public Vector3 worldStart;
public Vector3 worldEnd;
public float castRadius;
public RayPerceptionOutput.RayOutput rayOutput;
}
public void Reset()
{
m_Frame = Time.frameCount;
}
int m_Frame;
/// <summary>
/// "Age" of the results in number of frames. This is used to adjust the alpha when drawing.
/// </summary>
public int age
{
get { return Time.frameCount - m_Frame; }
public RayInfo[] rayInfos;
int m_Frame;
}
public class RayPerceptionSensor : ISensor
{
float[] m_Observations;
int[] m_Shape;
string m_Name;
RayPerceptionInput m_RayPerceptionInput;
public DebugDisplayInfo debugDisplayInfo
internal DebugDisplayInfo debugDisplayInfo
public RayPerceptionSensor(string name, float rayDistance, List<string> detectableObjects, float[] angles,
Transform transform, float startOffset, float endOffset, float castRadius, CastType castType,
int rayLayerMask)
public RayPerceptionSensor(string name, RayPerceptionInput rayInput)
var numObservations = (detectableObjects.Count + 2) * angles.Length;
var numObservations = rayInput.OutputSize();
m_RayPerceptionInput = rayInput;
m_RayDistance = rayDistance;
m_DetectableObjects = detectableObjects;
// TODO - preprocess angles, save ray directions instead?
m_Angles = angles;
m_Transform = transform;
m_StartOffset = startOffset;
m_EndOffset = endOffset;
m_CastRadius = castRadius;
m_CastType = castType;
m_LayerMask = rayLayerMask;
if (Application.isEditor)
{

{
using (TimerStack.Instance.Scoped("RayPerceptionSensor.Perceive"))
{
PerceiveStatic(
m_RayDistance, m_Angles, m_DetectableObjects, m_StartOffset, m_EndOffset,
m_CastRadius, m_Transform, m_CastType, m_Observations, m_LayerMask,
m_DebugDisplayInfo
);
Array.Clear(m_Observations, 0, m_Observations.Length);
var numRays = m_RayPerceptionInput.angles.Count;
var numDetectableTags = m_RayPerceptionInput.detectableTags.Count;
if (m_DebugDisplayInfo != null)
{
// Reset the age information, and resize the buffer if needed.
m_DebugDisplayInfo.Reset();
if (m_DebugDisplayInfo.rayInfos == null || m_DebugDisplayInfo.rayInfos.Length != numRays)
{
m_DebugDisplayInfo.rayInfos = new DebugDisplayInfo.RayInfo[numRays];
}
}
// For each ray, do the casting, and write the information to the observation buffer
for (var rayIndex = 0; rayIndex < numRays; rayIndex++)
{
DebugDisplayInfo.RayInfo debugRay;
var rayOutput = PerceiveSingleRay(m_RayPerceptionInput, rayIndex, out debugRay);
if (m_DebugDisplayInfo != null)
{
m_DebugDisplayInfo.rayInfos[rayIndex] = debugRay;
}
rayOutput.ToFloatArray(numDetectableTags, rayIndex, m_Observations);
}
// Finally, add the observations to the WriteAdapter
adapter.AddRange(m_Observations);
}
return m_Observations.Length;

}
/// <summary>
/// Evaluates a perception vector to be used as part of an observation of an agent.
/// Each element in the rayAngles array determines a sublist of data to the observation.
/// The sublist contains the observation data for a single cast. The list is composed of the following:
/// 1. A one-hot encoding for detectable objects. For example, if detectableObjects.Length = n, the
/// first n elements of the sublist will be a one-hot encoding of the detectableObject that was hit, or
/// all zeroes otherwise.
/// 2. The 'length' element of the sublist will be 1 if the ray missed everything, or 0 if it hit
/// something (detectable or not).
/// 3. The 'length+1' element of the sublist will contain the normalised distance to the object hit, or 1 if
/// nothing was hit.
///
/// Evaluates the raycasts to be used as part of an observation of an agent.
/// <param name="unscaledRayLength"></param>
/// <param name="rayAngles">List of angles (in degrees) used to define the rays. 90 degrees is considered
/// "forward" relative to the game object</param>
/// <param name="detectableObjects">List of tags which correspond to object types agent can see</param>
/// <param name="startOffset">Starting height offset of ray from center of agent.</param>
/// <param name="endOffset">Ending height offset of ray from center of agent.</param>
/// <param name="unscaledCastRadius">Radius of the sphere to use for spherecasting. If 0 or less, rays are used
/// instead - this may be faster, especially for complex environments.</param>
/// <param name="transform">Transform of the GameObject</param>
/// <param name="castType">Whether to perform the casts in 2D or 3D.</param>
/// <param name="perceptionBuffer">Output array of floats. Must be (num rays) * (num tags + 2) in size.</param>
/// <param name="layerMask">Filtering options for the casts</param>
/// <param name="debugInfo">Optional debug information output, only used by RayPerceptionSensor.</param>
/// <param name="input">Input defining the rays that will be cast.</param>
/// <param name="output">Output class that will be written to with raycast results.</param>
public static void PerceiveStatic(float unscaledRayLength,
IReadOnlyList<float> rayAngles, IReadOnlyList<string> detectableObjects,
float startOffset, float endOffset, float unscaledCastRadius,
Transform transform, CastType castType, float[] perceptionBuffer,
int layerMask = Physics.DefaultRaycastLayers,
DebugDisplayInfo debugInfo = null)
public static RayPerceptionOutput PerceiveStatic(RayPerceptionInput input)
Array.Clear(perceptionBuffer, 0, perceptionBuffer.Length);
if (debugInfo != null)
RayPerceptionOutput output = new RayPerceptionOutput();
output.rayOutputs = new RayPerceptionOutput.RayOutput[input.angles.Count];
for (var rayIndex = 0; rayIndex < input.angles.Count; rayIndex++)
debugInfo.Reset();
if (debugInfo.rayInfos == null || debugInfo.rayInfos.Length != rayAngles.Count)
{
debugInfo.rayInfos = new DebugDisplayInfo.RayInfo[rayAngles.Count];
}
DebugDisplayInfo.RayInfo debugRay;
output.rayOutputs[rayIndex] = PerceiveSingleRay(input, rayIndex, out debugRay);
// For each ray sublist stores categorical information on detected object
// along with object distance.
int bufferOffset = 0;
for (var rayIndex = 0; rayIndex < rayAngles.Count; rayIndex++)
{
var angle = rayAngles[rayIndex];
Vector3 startPositionLocal, endPositionLocal;
if (castType == CastType.Cast3D)
{
startPositionLocal = new Vector3(0, startOffset, 0);
endPositionLocal = PolarToCartesian3D(unscaledRayLength, angle);
endPositionLocal.y += endOffset;
}
else
{
// Vector2s here get converted to Vector3s (and back to Vector2s for casting)
startPositionLocal = new Vector2();
endPositionLocal = PolarToCartesian2D(unscaledRayLength, angle);
}
return output;
}
var startPositionWorld = transform.TransformPoint(startPositionLocal);
var endPositionWorld = transform.TransformPoint(endPositionLocal);
/// <summary>
/// Evaluate the raycast results of a single ray from the RayPerceptionInput.
/// </summary>
/// <param name="input"></param>
/// <param name="rayIndex"></param>
/// <param name="debugRayOut"></param>
/// <returns></returns>
static RayPerceptionOutput.RayOutput PerceiveSingleRay(
RayPerceptionInput input,
int rayIndex,
out DebugDisplayInfo.RayInfo debugRayOut
)
{
var unscaledRayLength = input.rayLength;
var unscaledCastRadius = input.castRadius;
var rayDirection = endPositionWorld - startPositionWorld;
// If there is non-unity scale, |rayDirection| will be different from rayLength.
// We want to use this transformed ray length for determining cast length, hit fraction etc.
// We also it to scale up or down the sphere or circle radii
var scaledRayLength = rayDirection.magnitude;
// Avoid 0/0 if unscaledRayLength is 0
var scaledCastRadius = unscaledRayLength > 0 ? unscaledCastRadius * scaledRayLength / unscaledRayLength : unscaledCastRadius;
var extents = input.RayExtents(rayIndex);
var startPositionWorld = extents.StartPositionWorld;
var endPositionWorld = extents.EndPositionWorld;
// Do the cast and assign the hit information for each detectable object.
// sublist[0 ] <- did hit detectableObjects[0]
// ...
// sublist[numObjects-1] <- did hit detectableObjects[numObjects-1]
// sublist[numObjects ] <- 1 if missed else 0
// sublist[numObjects+1] <- hit fraction (or 1 if no hit)
var rayDirection = endPositionWorld - startPositionWorld;
// If there is non-unity scale, |rayDirection| will be different from rayLength.
// We want to use this transformed ray length for determining cast length, hit fraction etc.
// We also it to scale up or down the sphere or circle radii
var scaledRayLength = rayDirection.magnitude;
// Avoid 0/0 if unscaledRayLength is 0
var scaledCastRadius = unscaledRayLength > 0 ?
unscaledCastRadius * scaledRayLength / unscaledRayLength :
unscaledCastRadius;
bool castHit;
float hitFraction;
GameObject hitObject;
// Do the cast and assign the hit information for each detectable tag.
bool castHit;
float hitFraction;
GameObject hitObject;
if (castType == CastType.Cast3D)
if (input.castType == RayPerceptionCastType.Cast3D)
{
RaycastHit rayHit;
if (scaledCastRadius > 0f)
RaycastHit rayHit;
if (scaledCastRadius > 0f)
{
castHit = Physics.SphereCast(startPositionWorld, scaledCastRadius, rayDirection, out rayHit,
scaledRayLength, layerMask);
}
else
{
castHit = Physics.Raycast(startPositionWorld, rayDirection, out rayHit,
scaledRayLength, layerMask);
}
// If scaledRayLength is 0, we still could have a hit with sphere casts (maybe?).
// To avoid 0/0, set the fraction to 0.
hitFraction = castHit ? (scaledRayLength > 0 ? rayHit.distance / scaledRayLength : 0.0f) : 1.0f;
hitObject = castHit ? rayHit.collider.gameObject : null;
castHit = Physics.SphereCast(startPositionWorld, scaledCastRadius, rayDirection, out rayHit,
scaledRayLength, input.layerMask);
RaycastHit2D rayHit;
if (scaledCastRadius > 0f)
{
rayHit = Physics2D.CircleCast(startPositionWorld, scaledCastRadius, rayDirection,
scaledRayLength, layerMask);
}
else
{
rayHit = Physics2D.Raycast(startPositionWorld, rayDirection, scaledRayLength, layerMask);
}
castHit = rayHit;
hitFraction = castHit ? rayHit.fraction : 1.0f;
hitObject = castHit ? rayHit.collider.gameObject : null;
castHit = Physics.Raycast(startPositionWorld, rayDirection, out rayHit,
scaledRayLength, input.layerMask);
if (debugInfo != null)
// If scaledRayLength is 0, we still could have a hit with sphere casts (maybe?).
// To avoid 0/0, set the fraction to 0.
hitFraction = castHit ? (scaledRayLength > 0 ? rayHit.distance / scaledRayLength : 0.0f) : 1.0f;
hitObject = castHit ? rayHit.collider.gameObject : null;
}
else
{
RaycastHit2D rayHit;
if (scaledCastRadius > 0f)
debugInfo.rayInfos[rayIndex].localStart = startPositionLocal;
debugInfo.rayInfos[rayIndex].localEnd = endPositionLocal;
debugInfo.rayInfos[rayIndex].worldStart = startPositionWorld;
debugInfo.rayInfos[rayIndex].worldEnd = endPositionWorld;
debugInfo.rayInfos[rayIndex].castHit = castHit;
debugInfo.rayInfos[rayIndex].hitFraction = hitFraction;
debugInfo.rayInfos[rayIndex].castRadius = scaledCastRadius;
rayHit = Physics2D.CircleCast(startPositionWorld, scaledCastRadius, rayDirection,
scaledRayLength, input.layerMask);
else if (Application.isEditor)
else
// Legacy drawing
Debug.DrawRay(startPositionWorld, rayDirection, Color.black, 0.01f, true);
rayHit = Physics2D.Raycast(startPositionWorld, rayDirection, scaledRayLength, input.layerMask);
if (castHit)
castHit = rayHit;
hitFraction = castHit ? rayHit.fraction : 1.0f;
hitObject = castHit ? rayHit.collider.gameObject : null;
}
var rayOutput = new RayPerceptionOutput.RayOutput
{
hasHit = castHit,
hitFraction = hitFraction,
hitTaggedObject = false,
hitTagIndex = -1
};
if (castHit)
{
// Find the index of the tag of the object that was hit.
for (var i = 0; i < input.detectableTags.Count; i++)
bool hitTaggedObject = false;
for (var i = 0; i < detectableObjects.Count; i++)
if (hitObject.CompareTag(input.detectableTags[i]))
if (hitObject.CompareTag(detectableObjects[i]))
{
perceptionBuffer[bufferOffset + i] = 1;
perceptionBuffer[bufferOffset + detectableObjects.Count + 1] = hitFraction;
hitTaggedObject = true;
break;
}
}
if (!hitTaggedObject)
{
// Something was hit but not on the list. Still set the hit fraction.
perceptionBuffer[bufferOffset + detectableObjects.Count + 1] = hitFraction;
rayOutput.hitTaggedObject = true;
rayOutput.hitTagIndex = i;
break;
else
{
perceptionBuffer[bufferOffset + detectableObjects.Count] = 1f;
// Nothing was hit, so there's full clearance in front of the agent.
perceptionBuffer[bufferOffset + detectableObjects.Count + 1] = 1.0f;
}
}
bufferOffset += detectableObjects.Count + 2;
}
}
debugRayOut.worldStart = startPositionWorld;
debugRayOut.worldEnd = endPositionWorld;
debugRayOut.rayOutput = rayOutput;
debugRayOut.castRadius = scaledCastRadius;
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static Vector3 PolarToCartesian3D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var z = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector3(x, 0f, z);
}
return rayOutput;
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static Vector2 PolarToCartesian2D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var y = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector2(x, y);
}
}
}

4
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent2D.cs


rayLayerMask = Physics2D.DefaultRaycastLayers;
}
public override RayPerceptionSensor.CastType GetCastType()
public override RayPerceptionCastType GetCastType()
return RayPerceptionSensor.CastType.Cast2D;
return RayPerceptionCastType.Cast2D;
}
}
}

4
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent3D.cs


[Tooltip("Ray end is offset up or down by this amount.")]
public float endVerticalOffset;
public override RayPerceptionSensor.CastType GetCastType()
public override RayPerceptionCastType GetCastType()
return RayPerceptionSensor.CastType.Cast3D;
return RayPerceptionCastType.Cast3D;
}
public override float GetStartVerticalOffset()

38
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponentBase.cs


[Header("Debug Gizmos", order = 999)]
public Color rayHitColor = Color.red;
public Color rayMissColor = Color.white;
[Tooltip("Whether to draw the raycasts in the world space of when they happened, or using the Agent's current transform'")]
public bool useWorldPositions = true;
public abstract RayPerceptionSensor.CastType GetCastType();
public abstract RayPerceptionCastType GetCastType();
public virtual float GetStartVerticalOffset()
{

public override ISensor CreateSensor()
{
var rayAngles = GetRayAngles(raysPerDirection, maxRayDegrees);
m_RaySensor = new RayPerceptionSensor(sensorName, rayLength, detectableTags, rayAngles,
transform, GetStartVerticalOffset(), GetEndVerticalOffset(), sphereCastRadius, GetCastType(),
rayLayerMask
);
var rayPerceptionInput = new RayPerceptionInput();
rayPerceptionInput.rayLength = rayLength;
rayPerceptionInput.detectableTags = detectableTags;
rayPerceptionInput.angles = rayAngles;
rayPerceptionInput.startOffset = GetStartVerticalOffset();
rayPerceptionInput.endOffset = GetEndVerticalOffset();
rayPerceptionInput.castRadius = sphereCastRadius;
rayPerceptionInput.transform = transform;
rayPerceptionInput.castType = GetCastType();
rayPerceptionInput.layerMask = rayLayerMask;
m_RaySensor = new RayPerceptionSensor(sensorName, rayPerceptionInput);
if (observationStacks != 1)
{

public override int[] GetObservationShape()
{
var numRays = 2 * raysPerDirection + 1;
var numTags = detectableTags == null ? 0 : detectableTags.Count;
var numTags = detectableTags?.Count ?? 0;
var obsSize = (numTags + 2) * numRays;
var stacks = observationStacks > 1 ? observationStacks : 1;
return new[] { obsSize * stacks };

foreach (var rayInfo in debugInfo.rayInfos)
{
// Either use the original world-space coordinates of the raycast, or transform the agent-local
// coordinates of the rays to the current transform of the agent. If the agent acts every frame,
// these should be the same.
if (!useWorldPositions)
{
startPositionWorld = transform.TransformPoint(rayInfo.localStart);
endPositionWorld = transform.TransformPoint(rayInfo.localEnd);
}
rayDirection *= rayInfo.hitFraction;
rayDirection *= rayInfo.rayOutput.hitFraction;
var lerpT = rayInfo.hitFraction * rayInfo.hitFraction;
var lerpT = rayInfo.rayOutput.hitFraction * rayInfo.rayOutput.hitFraction;
var color = Color.Lerp(rayHitColor, rayMissColor, lerpT);
color.a *= alpha;
Gizmos.color = color;

if (rayInfo.castHit)
if (rayInfo.rayOutput.hasHit)
{
var hitRadius = Mathf.Max(rayInfo.castRadius, .05f);
Gizmos.DrawWireSphere(startPositionWorld + rayDirection, hitRadius);

6
com.unity.ml-agents/Runtime/Sensor/WriteAdapter.cs


TensorShape m_TensorShape;
internal WriteAdapter() { }
/// <summary>
/// Set the adapter to write to an IList at the given channelOffset.
/// </summary>

public void SetTarget(IList<float> data, int[] shape, int offset)
internal void SetTarget(IList<float> data, int[] shape, int offset)
{
m_Data = data;
m_Offset = offset;

/// <param name="tensorProxy">Tensor proxy that will be writtent to.</param>
/// <param name="batchIndex">Batch index in the tensor proxy (i.e. the index of the Agent)</param>
/// <param name="channelOffset">Offset from the start of the channel to write to.</param>
public void SetTarget(TensorProxy tensorProxy, int batchIndex, int channelOffset)
internal void SetTarget(TensorProxy tensorProxy, int batchIndex, int channelOffset)
{
m_Proxy = tensorProxy;
m_Batch = batchIndex;

50
com.unity.ml-agents/Tests/Editor/DemonstrationTests.cs


[TestFixture]
public class DemonstrationTests : MonoBehaviour
{
const string k_DemoDirecory = "Assets/Demonstrations/";
const string k_DemoDirectory = "Assets/Demonstrations/";
const string k_ExtensionType = ".demo";
const string k_DemoName = "Test";

public void TestStoreInitalize()
{
var fileSystem = new MockFileSystem();
var demoStore = new DemonstrationStore(fileSystem);
Assert.IsFalse(fileSystem.Directory.Exists(k_DemoDirecory));
var gameobj = new GameObject("gameObj");
var brainParameters = new BrainParameters
{
vectorObservationSize = 3,
numStackedVectorObservations = 2,
vectorActionDescriptions = new[] { "TestActionA", "TestActionB" },
vectorActionSize = new[] { 2, 2 },
vectorActionSpaceType = SpaceType.Discrete
};
var bp = gameobj.AddComponent<BehaviorParameters>();
bp.brainParameters.vectorObservationSize = 3;
bp.brainParameters.numStackedVectorObservations = 2;
bp.brainParameters.vectorActionDescriptions = new[] { "TestActionA", "TestActionB" };
bp.brainParameters.vectorActionSize = new[] { 2, 2 };
bp.brainParameters.vectorActionSpaceType = SpaceType.Discrete;
demoStore.Initialize(k_DemoName, brainParameters, "TestBrain");
var agent = gameobj.AddComponent<TestAgent>();
Assert.IsTrue(fileSystem.Directory.Exists(k_DemoDirecory));
Assert.IsTrue(fileSystem.FileExists(k_DemoDirecory + k_DemoName + k_ExtensionType));
Assert.IsFalse(fileSystem.Directory.Exists(k_DemoDirectory));
var demoRec = gameobj.AddComponent<DemonstrationRecorder>();
demoRec.record = true;
demoRec.demonstrationName = k_DemoName;
demoRec.demonstrationDirectory = k_DemoDirectory;
var demoWriter = demoRec.LazyInitialize(fileSystem);
Assert.IsTrue(fileSystem.Directory.Exists(k_DemoDirectory));
Assert.IsTrue(fileSystem.FileExists(k_DemoDirectory + k_DemoName + k_ExtensionType));
var agentInfo = new AgentInfo
{

storedVectorActions = new[] { 0f, 1f },
};
demoStore.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
demoStore.Close();
demoWriter.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
demoRec.Close();
// Make sure close can be called multiple times
demoWriter.Close();
demoRec.Close();
// Make sure trying to write after closing doesn't raise an error.
demoWriter.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
}
public class ObservationAgent : TestAgent

agentGo1.AddComponent<DemonstrationRecorder>();
var demoRecorder = agentGo1.GetComponent<DemonstrationRecorder>();
var fileSystem = new MockFileSystem();
demoRecorder.demonstrationDirectory = k_DemoDirectory;
demoRecorder.InitializeDemoStore(fileSystem);
demoRecorder.LazyInitialize(fileSystem);
var agentEnableMethod = typeof(Agent).GetMethod("OnEnable",
BindingFlags.Instance | BindingFlags.NonPublic);

// Read back the demo file and make sure observations were written
var reader = fileSystem.File.OpenRead("Assets/Demonstrations/TestBrain.demo");
reader.Seek(DemonstrationStore.MetaDataBytes + 1, 0);
reader.Seek(DemonstrationWriter.MetaDataBytes + 1, 0);
BrainParametersProto.Parser.ParseDelimitedFrom(reader);
var agentInfoProto = AgentInfoActionPairProto.Parser.ParseDelimitedFrom(reader).AgentInfo;

2
config/gail_config.yaml


strength: 1.0
gamma: 0.99
encoding_size: 128
demo_path: Project/Assets/Demonstrations/PushblockDemo.demo
demo_path: Project/Assets/ML-Agents/Examples/PushBlock/Demos/ExpertPush.demo
Hallway:
use_recurrent: true

9
docs/API-Reference.md


# API Reference
Our developer-facing C# classes (Academy, Agent, Decision and Monitor) have been
documented to be compatible with Doxygen for auto-generating HTML
documentation.
Our developer-facing C# classes have been documented to be compatible with
Doxygen for auto-generating HTML documentation.
To generate the API reference, download Doxygen
and run the following command within the `docs/` directory:

subdirectory to navigate to the API reference home. Note that `html/` is already
included in the repository's `.gitignore` file.
In the near future, we aim to expand our documentation to include all the Unity
C# classes and Python API.
In the near future, we aim to expand our documentation to include the Python
classes.

4
docs/Migrating.md


### Important changes
* The `Agent.CollectObservations()` virtual method now takes as input a `VectorSensor` sensor as argument. The `Agent.AddVectorObs()` methods were removed.
* The `SetActionMask` method must now be called on the optional `ActionMasker` argument of the `CollectObservations` method. (We now consider an action mask as a type of observation)
* The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take an input class and write to an output class.
* The `--multi-gpu` option has been removed temporarily.
* If you call `RayPerceptionSensor.PerceiveStatic()` manually, add your inputs to a `RayPerceptionInput`. To get the previous float array output, use `RayPerceptionOutput.ToFloatArray()`
* Re-import all of your `*.NN` files to work with the updated Barracuda package.
* Replace all calls to `Agent.GetStepCount()` with `Agent.StepCount`

2
docs/Training-Imitation-Learning.md


from a few minutes or a few hours of demonstration data may be necessary to
be useful for imitation learning. When you have recorded enough data, end
the Editor play session, and a `.demo` file will be created in the
`Assets/Demonstrations` folder. This file contains the demonstrations.
`Assets/Demonstrations` folder (by default). This file contains the demonstrations.
Clicking on the file will provide metadata about the demonstration in the
inspector.

133
docs/dox-ml-agents.conf


# Doxyfile 1.8.13
# To generate the C# API documentation, run:
#
#
# doxygen dox-ml-agents.conf
#
# from the ml-agents-docs directory

# title of most generated pages and in a few other places.
# The default value is: My Project.
PROJECT_NAME = "ML-Agents Toolkit"
PROJECT_NAME = "Unity ML-Agents Toolkit"
PROJECT_NUMBER = v0.4
PROJECT_NUMBER =
PROJECT_BRIEF =
PROJECT_BRIEF =
# With the PROJECT_LOGO tag one can specify a logo or an icon that is included
# in the documentation. The maximum height of the logo should not exceed 55

# entered, it will be relative to the location where doxygen was started. If
# left blank the current directory will be used.
OUTPUT_DIRECTORY =
OUTPUT_DIRECTORY =
# If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub-
# directories (in 2 levels) under the output directory of each output format and

# will be relative from the directory where doxygen is started.
# This tag requires that the tag FULL_PATH_NAMES is set to YES.
STRIP_FROM_PATH =
STRIP_FROM_PATH =
# The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of the
# path mentioned in the documentation of a class, which tells the reader which

# using the -I flag.
STRIP_FROM_INC_PATH =
STRIP_FROM_INC_PATH =
# If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter (but
# less readable) file names. This can be useful is your file systems doesn't

# "Side Effects:". You can put \n's in the value part of an alias to insert
# newlines.
ALIASES =
ALIASES =
TCL_SUBST =
TCL_SUBST =
# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources
# only. Doxygen will then generate output that is more tailored for C. For

# Note that for custom extensions you also need to set FILE_PATTERNS otherwise
# the files are not read by doxygen.
EXTENSION_MAPPING =
EXTENSION_MAPPING =
# If the MARKDOWN_SUPPORT tag is enabled then doxygen pre-processes all comments
# according to the Markdown format, which allows for more readable

# sections, marked by \if <section_label> ... \endif and \cond <section_label>
# ... \endcond blocks.
ENABLED_SECTIONS =
ENABLED_SECTIONS =
# The MAX_INITIALIZER_LINES tag determines the maximum number of lines that the
# initial value of a variable or macro / define can have for it to appear in the

# by doxygen. Whatever the program writes to standard output is used as the file
# version. For an example see the documentation.
FILE_VERSION_FILTER =
FILE_VERSION_FILTER =
# The LAYOUT_FILE tag can be used to specify a layout file which will be parsed
# by doxygen. The layout file controls the global structure of the generated

# LATEX_BIB_STYLE. To use this feature you need bibtex and perl available in the
# search path. See also \cite for info how to create references.
CITE_BIB_FILES =
CITE_BIB_FILES =
#---------------------------------------------------------------------------
# Configuration options related to warning and progress messages

# messages should be written. If left blank the output is written to standard
# error (stderr).
WARN_LOGFILE =
WARN_LOGFILE =
#---------------------------------------------------------------------------
# Configuration options related to the input files

# spaces. See also FILE_PATTERNS and EXTENSION_MAPPING
# Note: If this tag is empty the current directory is searched.
INPUT = ../Project/Assets/ML-Agents/Scripts/Academy.cs \
../Project/Assets/ML-Agents/Scripts/Agent.cs \
../Project/Assets/ML-Agents/Scripts/Monitor.cs \
../Project/Assets/ML-Agents/Scripts/Decision.cs
INPUT = ../com.unity.ml-agents/Runtime/
# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses

# Note that relative paths are relative to the directory from which doxygen is
# run.
EXCLUDE =
EXCLUDE =
# The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
# directories that are symbolic links (a Unix file system feature) are excluded

# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories for example use the pattern */test/*
EXCLUDE_PATTERNS =
EXCLUDE_PATTERNS =
# The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names
# (namespaces, classes, functions, etc.) that should be excluded from the

# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories use the pattern */test/*
EXCLUDE_SYMBOLS =
EXCLUDE_SYMBOLS =
EXAMPLE_PATH =
EXAMPLE_PATH =
# If the value of the EXAMPLE_PATH tag contains directories, you can use the
# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp and

EXAMPLE_PATTERNS =
EXAMPLE_PATTERNS =
# If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be
# searched for input files to be used with the \include or \dontinclude commands

# need to set EXTENSION_MAPPING for the extension otherwise the files are not
# properly processed by doxygen.
INPUT_FILTER =
INPUT_FILTER =
# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern
# basis. Doxygen will compare the file name with each pattern and apply the

# need to set EXTENSION_MAPPING for the extension otherwise the files are not
# properly processed by doxygen.
FILTER_PATTERNS =
FILTER_PATTERNS =
# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using
# INPUT_FILTER) will also be used to filter the input files that are used for

# *.ext= (so without naming a filter).
# This tag requires that the tag FILTER_SOURCE_FILES is set to YES.
FILTER_SOURCE_PATTERNS =
FILTER_SOURCE_PATTERNS =
# If the USE_MDFILE_AS_MAINPAGE tag refers to the name of a markdown file that
# is part of the input, its contents will be placed on the main page

# generated with the -Duse-libclang=ON option for CMake.
# The default value is: NO.
CLANG_ASSISTED_PARSING = NO
#CLANG_ASSISTED_PARSING = NO
# If clang assisted parsing is enabled you can provide the compiler with command
# line options that you would normally use when invoking the compiler. Note that

CLANG_OPTIONS =
#CLANG_OPTIONS =
#---------------------------------------------------------------------------
# Configuration options related to the alphabetical class index

# while generating the index headers.
# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.
IGNORE_PREFIX =
IGNORE_PREFIX =