
Merge branch 'master' into master-into-release-0.14.1

/release-0.14.1
Anupam Bhatnagar, 5 years ago
Current commit e04fcd71
214 files changed: 1,910 insertions and 1,056 deletions
  1. .yamato/com.unity.ml-agents-test.yml (11)
  2. .yamato/standalone-build-test.yml (15)
  3. .yamato/training-int-tests.yml (17)
  4. Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs (10)
  5. Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs (8)
  6. Project/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs (4)
  7. Project/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs (6)
  8. Project/Assets/ML-Agents/Examples/Crawler/Scripts/CrawlerAgent.cs (30)
  9. Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs (11)
  10. Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs (14)
  11. Project/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs (4)
  12. Project/Assets/ML-Agents/Examples/Pyramids/Scripts/PyramidAgent.cs (6)
  13. Project/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs (24)
  14. Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ProjectSettingsOverrides.cs (8)
  15. Project/Assets/ML-Agents/Examples/Template/Scripts/TemplateAgent.cs (2)
  16. Project/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs (20)
  17. Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs (30)
  18. Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs (15)
  19. com.unity.ml-agents/CHANGELOG.md (36)
  20. com.unity.ml-agents/Editor/AgentEditor.cs (2)
  21. com.unity.ml-agents/Editor/BehaviorParametersEditor.cs (3)
  22. com.unity.ml-agents/Editor/BrainParametersDrawer.cs (6)
  23. com.unity.ml-agents/Editor/DemonstrationDrawer.cs (2)
  24. com.unity.ml-agents/Editor/DemonstrationImporter.cs (6)
  25. com.unity.ml-agents/LICENSE.md (3)
  26. com.unity.ml-agents/Runtime/Academy.cs (107)
  27. com.unity.ml-agents/Runtime/ActionMasker.cs (55)
  28. com.unity.ml-agents/Runtime/Agent.cs (338)
  29. com.unity.ml-agents/Runtime/DecisionRequester.cs (29)
  30. com.unity.ml-agents/Runtime/Grpc/GrpcExtensions.cs (1)
  31. com.unity.ml-agents/Runtime/Grpc/RpcCommunicator.cs (41)
  32. com.unity.ml-agents/Runtime/ICommunicator.cs (15)
  33. com.unity.ml-agents/Runtime/InferenceBrain/ApplierImpl.cs (2)
  34. com.unity.ml-agents/Runtime/InferenceBrain/BarracudaModelParamLoader.cs (5)
  35. com.unity.ml-agents/Runtime/InferenceBrain/GeneratorImpl.cs (4)
  36. com.unity.ml-agents/Runtime/InferenceBrain/ModelRunner.cs (1)
  37. com.unity.ml-agents/Runtime/InferenceBrain/TensorGenerator.cs (13)
  38. com.unity.ml-agents/Runtime/InferenceBrain/TensorProxy.cs (2)
  39. com.unity.ml-agents/Runtime/InferenceBrain/Utils/Multinomial.cs (2)
  40. com.unity.ml-agents/Runtime/InferenceBrain/Utils/RandomNormal.cs (6)
  41. com.unity.ml-agents/Runtime/Policy/BarracudaPolicy.cs (13)
  42. com.unity.ml-agents/Runtime/Policy/BehaviorParameters.cs (21)
  43. com.unity.ml-agents/Runtime/Policy/BrainParameters.cs (34)
  44. com.unity.ml-agents/Runtime/Policy/HeuristicPolicy.cs (1)
  45. com.unity.ml-agents/Runtime/Policy/IPolicy.cs (1)
  46. com.unity.ml-agents/Runtime/Policy/RemotePolicy.cs (3)
  47. com.unity.ml-agents/Runtime/Sensor/CameraSensor.cs (56)
  48. com.unity.ml-agents/Runtime/Sensor/CameraSensorComponent.cs (39)
  49. com.unity.ml-agents/Runtime/Sensor/ISensor.cs (67)
  50. com.unity.ml-agents/Runtime/Sensor/Observation.cs (4)
  51. com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensor.cs (574)
  52. com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent2D.cs (13)
  53. com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent3D.cs (43)
  54. com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponentBase.cs (281)
  55. com.unity.ml-agents/Runtime/Sensor/RenderTextureSensor.cs (23)
  56. com.unity.ml-agents/Runtime/Sensor/RenderTextureSensorComponent.cs (24)
  57. com.unity.ml-agents/Runtime/Sensor/SensorBase.cs (21)
  58. com.unity.ml-agents/Runtime/Sensor/SensorComponent.cs (19)
  59. com.unity.ml-agents/Runtime/Sensor/SensorShapeValidator.cs (2)
  60. com.unity.ml-agents/Runtime/Sensor/StackingSensor.cs (6)
  61. com.unity.ml-agents/Runtime/Sensor/VectorSensor.cs (30)
  62. com.unity.ml-agents/Runtime/Sensor/WriteAdapter.cs (16)
  63. com.unity.ml-agents/Runtime/SideChannel/EngineConfigurationChannel.cs (14)
  64. com.unity.ml-agents/Runtime/SideChannel/FloatPropertiesChannel.cs (27)
  65. com.unity.ml-agents/Runtime/SideChannel/RawBytesChannel.cs (17)
  66. com.unity.ml-agents/Runtime/SideChannel/SideChannel.cs (29)
  67. com.unity.ml-agents/Runtime/Timer.cs (18)
  68. com.unity.ml-agents/Runtime/UnityAgentsException.cs (9)
  69. com.unity.ml-agents/Runtime/Utilities.cs (84)
  70. com.unity.ml-agents/Tests/Editor/DemonstrationTests.cs (61)
  71. com.unity.ml-agents/Tests/Editor/EditModeTestInternalBrainTensorApplier.cs (1)
  72. com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs (114)
  73. com.unity.ml-agents/Tests/Editor/Sensor/FloatVisualSensorTests.cs (1)
  74. com.unity.ml-agents/Tests/Editor/Sensor/RayPerceptionSensorTests.cs (3)
  75. com.unity.ml-agents/Tests/Editor/Sensor/StackingSensorTests.cs (1)
  76. com.unity.ml-agents/Tests/Editor/Sensor/VectorSensorTests.cs (1)
  77. com.unity.ml-agents/Tests/Editor/Sensor/WriterAdapterTests.cs (2)
  78. com.unity.ml-agents/Tests/Editor/SideChannelTests.cs (20)
  79. com.unity.ml-agents/Tests/Runtime/SerializationTest.cs (2)
  80. com.unity.ml-agents/Tests/Runtime/SerializeAgent.cs (2)
  81. com.unity.ml-agents/package.json (2)
  82. config/sac_trainer_config.yaml (9)
  83. config/trainer_config.yaml (8)
  84. docs/API-Reference.md (9)
  85. docs/Getting-Started-with-Balance-Ball.md (8)
  86. docs/Learning-Environment-Best-Practices.md (4)
  87. docs/Learning-Environment-Create-New.md (20)
  88. docs/Learning-Environment-Design-Agents.md (44)
  89. docs/Learning-Environment-Design.md (8)
  90. docs/Limitations.md (4)
  91. docs/Migrating.md (21)
  92. docs/Profiling-Python.md (3)
  93. docs/Python-API.md (193)
  94. docs/Reward-Signals.md (3)
  95. docs/Training-Generalized-Reinforcement-Learning-Agents.md (2)
  96. docs/Training-Imitation-Learning.md (2)
  97. docs/Training-ML-Agents.md (1)
  98. docs/Training-PPO.md (6)
  99. docs/Training-SAC.md (6)
  100. docs/Training-Self-Play.md (2)

11
.yamato/com.unity.ml-agents-test.yml


dependencies:
- .yamato/com.unity.ml-agents-pack.yml#pack
triggers:
pull_requests:
- targets:
only:
- "master"
- "/release-.*/"
- "/hotfix-.*/"
changes:
only:
- "com.unity.ml-agents/**"
- ".yamato/com.unity.ml-agents-test.yml"
{% endfor %}
{% endfor %}

15
.yamato/standalone-build-test.yml


- pip install pyyaml
- python -u -m ml-agents.tests.yamato.standalone_build_tests
triggers:
pull_requests:
- targets:
only:
- "master"
- "/release-.*/"
- "/hotfix-.*/"
changes:
only:
- "com.unity.ml-agents/**"
- "Project/**"
- ".yamato/standalone-build-test.yml"
except:
- "*.md"
- "com.unity.ml-agents/*.md"
- "com.unity.ml-agents/**/*.md"
{% endfor %}

17
.yamato/training-int-tests.yml


- pip install pyyaml
- python -u -m ml-agents.tests.yamato.training_int_tests
triggers:
pull_requests:
- targets:
only:
- "master"
- "/release-.*/"
- "/hotfix-.*/"
changes:
only:
- "com.unity.ml-agents/**"
- "Project/**"
- "ml-agents/**"
- "ml-agents-envs/**"
- ".yamato/training-int-tests.yml"
except:
- "*.md"
- "com.unity.ml-agents/*.md"
- "com.unity.ml-agents/**/*.md"
artifacts:
unit:
paths:

10
Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs


SetResetParameters();
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(gameObject.transform.rotation.z);
AddVectorObs(gameObject.transform.rotation.x);
AddVectorObs(ball.transform.position - gameObject.transform.position);
AddVectorObs(m_BallRb.velocity);
sensor.AddObservation(gameObject.transform.rotation.z);
sensor.AddObservation(gameObject.transform.rotation.x);
sensor.AddObservation(ball.transform.position - gameObject.transform.position);
sensor.AddObservation(m_BallRb.velocity);
}
public override void AgentAction(float[] vectorAction)

8
Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs


SetResetParameters();
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(gameObject.transform.rotation.z);
AddVectorObs(gameObject.transform.rotation.x);
AddVectorObs((ball.transform.position - gameObject.transform.position));
sensor.AddObservation(gameObject.transform.rotation.z);
sensor.AddObservation(gameObject.transform.rotation.x);
sensor.AddObservation((ball.transform.position - gameObject.transform.position));
}
public override void AgentAction(float[] vectorAction)

4
Project/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs


{
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(m_Position, 20);
sensor.AddOneHotObservation(m_Position, 20);
}
public override void AgentAction(float[] vectorAction)

6
Project/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs


SetResetParameters();
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(gameObject.transform.localPosition);
AddVectorObs(target.transform.localPosition);
sensor.AddObservation(gameObject.transform.localPosition);
sensor.AddObservation(target.transform.localPosition);
}
public override void AgentAction(float[] vectorAction)

30
Project/Assets/ML-Agents/Examples/Crawler/Scripts/CrawlerAgent.cs


/// <summary>
/// Add relevant information on each body part to observations.
/// </summary>
public void CollectObservationBodyPart(BodyPart bp)
public void CollectObservationBodyPart(BodyPart bp, VectorSensor sensor)
AddVectorObs(bp.groundContact.touchingGround ? 1 : 0); // Whether the bp touching the ground
sensor.AddObservation(bp.groundContact.touchingGround ? 1 : 0); // Whether the bp touching the ground
AddVectorObs(velocityRelativeToLookRotationToTarget);
sensor.AddObservation(velocityRelativeToLookRotationToTarget);
AddVectorObs(angularVelocityRelativeToLookRotationToTarget);
sensor.AddObservation(angularVelocityRelativeToLookRotationToTarget);
AddVectorObs(localPosRelToBody);
AddVectorObs(bp.currentXNormalizedRot); // Current x rot
AddVectorObs(bp.currentYNormalizedRot); // Current y rot
AddVectorObs(bp.currentZNormalizedRot); // Current z rot
AddVectorObs(bp.currentStrength / m_JdController.maxJointForceLimit);
sensor.AddObservation(localPosRelToBody);
sensor.AddObservation(bp.currentXNormalizedRot); // Current x rot
sensor.AddObservation(bp.currentYNormalizedRot); // Current y rot
sensor.AddObservation(bp.currentZNormalizedRot); // Current z rot
sensor.AddObservation(bp.currentStrength / m_JdController.maxJointForceLimit);
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
{
m_JdController.GetCurrentJointForces();

RaycastHit hit;
if (Physics.Raycast(body.position, Vector3.down, out hit, 10.0f))
{
AddVectorObs(hit.distance);
sensor.AddObservation(hit.distance);
AddVectorObs(10.0f);
sensor.AddObservation(10.0f);
AddVectorObs(bodyForwardRelativeToLookRotationToTarget);
sensor.AddObservation(bodyForwardRelativeToLookRotationToTarget);
AddVectorObs(bodyUpRelativeToLookRotationToTarget);
sensor.AddObservation(bodyUpRelativeToLookRotationToTarget);
CollectObservationBodyPart(bodyPart);
CollectObservationBodyPart(bodyPart, sensor);
}
}

11
Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs


{
base.InitializeAgent();
m_AgentRb = GetComponent<Rigidbody>();
Monitor.verticalOffset = 1f;
m_MyArea = area.GetComponent<FoodCollectorArea>();
m_FoodCollecterSettings = FindObjectOfType<FoodCollectorSettings>();

public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(localVelocity.x);
AddVectorObs(localVelocity.z);
AddVectorObs(System.Convert.ToInt32(m_Frozen));
AddVectorObs(System.Convert.ToInt32(m_Shoot));
sensor.AddObservation(localVelocity.x);
sensor.AddObservation(localVelocity.z);
sensor.AddObservation(System.Convert.ToInt32(m_Frozen));
sensor.AddObservation(System.Convert.ToInt32(m_Shoot));
}
}

14
Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs


{
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor, ActionMasker actionMasker)
{
// There are no numeric observations to collect as this environment uses visual
// observations.

{
SetMask();
SetMask(actionMasker);
}
}

void SetMask()
void SetMask(ActionMasker actionMasker)
{
// Prevents the agent from picking an action that would make it collide with a wall
var positionX = (int)transform.position.x;

if (positionX == 0)
{
SetActionMask(k_Left);
actionMasker.SetActionMask(k_Left);
SetActionMask(k_Right);
actionMasker.SetActionMask(k_Right);
SetActionMask(k_Down);
actionMasker.SetActionMask(k_Down);
SetActionMask(k_Up);
actionMasker.SetActionMask(k_Up);
}
}
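The GridAgent diff above shows the action-masking migration: `SetActionMask` is no longer called as an inherited `Agent` method but on the `ActionMasker` passed into the new `CollectObservations` overload. A minimal sketch of the migrated pattern, assuming the surrounding fields and action-index constants from this example project:

```csharp
using MLAgents;
using UnityEngine;

public class MaskedGridAgent : Agent
{
    // Branch-0 action indices; the concrete values here are assumptions
    // for illustration, not taken from GridAgent.cs.
    const int k_Left = 3;
    const int k_Right = 4;

    // New signature: the masker arrives as an argument instead of being
    // reached through inherited Agent methods.
    public override void CollectObservations(VectorSensor sensor, ActionMasker actionMasker)
    {
        var positionX = (int)transform.position.x;
        if (positionX == 0)
        {
            // The agent is at the left edge, so moving left is masked out
            // and cannot be chosen at the next decision.
            actionMasker.SetActionMask(k_Left);
        }
    }
}
```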

4
Project/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs


m_GroundMaterial = m_GroundRenderer.material;
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(GetStepCount() / (float)maxStep);
sensor.AddObservation(StepCount / (float)maxStep);
}
}

6
Project/Assets/ML-Agents/Examples/Pyramids/Scripts/PyramidAgent.cs


m_SwitchLogic = areaSwitch.GetComponent<PyramidSwitch>();
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(m_SwitchLogic.GetState());
AddVectorObs(transform.InverseTransformDirection(m_AgentRb.velocity));
sensor.AddObservation(m_SwitchLogic.GetState());
sensor.AddObservation(transform.InverseTransformDirection(m_AgentRb.velocity));
}
}

24
Project/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs


/// We collect the normalized rotations, angular velocities, and velocities of both
/// limbs of the reacher as well as the relative position of the target and hand.
/// </summary>
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(pendulumA.transform.localPosition);
AddVectorObs(pendulumA.transform.rotation);
AddVectorObs(m_RbA.angularVelocity);
AddVectorObs(m_RbA.velocity);
sensor.AddObservation(pendulumA.transform.localPosition);
sensor.AddObservation(pendulumA.transform.rotation);
sensor.AddObservation(m_RbA.angularVelocity);
sensor.AddObservation(m_RbA.velocity);
AddVectorObs(pendulumB.transform.localPosition);
AddVectorObs(pendulumB.transform.rotation);
AddVectorObs(m_RbB.angularVelocity);
AddVectorObs(m_RbB.velocity);
sensor.AddObservation(pendulumB.transform.localPosition);
sensor.AddObservation(pendulumB.transform.rotation);
sensor.AddObservation(m_RbB.angularVelocity);
sensor.AddObservation(m_RbB.velocity);
AddVectorObs(goal.transform.localPosition);
AddVectorObs(hand.transform.localPosition);
sensor.AddObservation(goal.transform.localPosition);
sensor.AddObservation(hand.transform.localPosition);
AddVectorObs(m_GoalSpeed);
sensor.AddObservation(m_GoalSpeed);
}
/// <summary>

8
Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ProjectSettingsOverrides.cs


public class ProjectSettingsOverrides : MonoBehaviour
{
// Original values
float m_OriginalMonitorVerticalOffset;
Vector3 m_OriginalGravity;
float m_OriginalFixedDeltaTime;
float m_OriginalMaximumDeltaTime;

[Tooltip("Increase or decrease the scene gravity. Use ~3x to make things less floaty")]
public float gravityMultiplier = 1.0f;
[Header("Display Settings")]
public float monitorVerticalOffset;
[Header("Advanced physics settings")]
[Tooltip("The interval in seconds at which physics and other fixed frame rate updates (like MonoBehaviour's FixedUpdate) are performed.")]
public float fixedDeltaTime = .02f;

public void Awake()
{
// Save the original values
m_OriginalMonitorVerticalOffset = Monitor.verticalOffset;
m_OriginalGravity = Physics.gravity;
m_OriginalFixedDeltaTime = Time.fixedDeltaTime;
m_OriginalMaximumDeltaTime = Time.maximumDeltaTime;

// Override
Monitor.verticalOffset = monitorVerticalOffset;
Physics.gravity *= gravityMultiplier;
Time.fixedDeltaTime = fixedDeltaTime;
Time.maximumDeltaTime = maximumDeltaTime;

public void OnDestroy()
{
Monitor.verticalOffset = m_OriginalMonitorVerticalOffset;
Physics.gravity = m_OriginalGravity;
Time.fixedDeltaTime = m_OriginalFixedDeltaTime;
Time.maximumDeltaTime = m_OriginalMaximumDeltaTime;

2
Project/Assets/ML-Agents/Examples/Template/Scripts/TemplateAgent.cs


public class TemplateAgent : Agent
{
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
{
}

20
Project/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs


SetResetParameters();
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(m_InvertMult * (transform.position.x - myArea.transform.position.x));
AddVectorObs(transform.position.y - myArea.transform.position.y);
AddVectorObs(m_InvertMult * m_AgentRb.velocity.x);
AddVectorObs(m_AgentRb.velocity.y);
sensor.AddObservation(m_InvertMult * (transform.position.x - myArea.transform.position.x));
sensor.AddObservation(transform.position.y - myArea.transform.position.y);
sensor.AddObservation(m_InvertMult * m_AgentRb.velocity.x);
sensor.AddObservation(m_AgentRb.velocity.y);
AddVectorObs(m_InvertMult * (ball.transform.position.x - myArea.transform.position.x));
AddVectorObs(ball.transform.position.y - myArea.transform.position.y);
AddVectorObs(m_InvertMult * m_BallRb.velocity.x);
AddVectorObs(m_BallRb.velocity.y);
sensor.AddObservation(m_InvertMult * (ball.transform.position.x - myArea.transform.position.x));
sensor.AddObservation(ball.transform.position.y - myArea.transform.position.y);
sensor.AddObservation(m_InvertMult * m_BallRb.velocity.x);
sensor.AddObservation(m_BallRb.velocity.y);
AddVectorObs(m_InvertMult * gameObject.transform.rotation.z);
sensor.AddObservation(m_InvertMult * gameObject.transform.rotation.z);
}
public override void AgentAction(float[] vectorAction)

30
Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs


/// <summary>
/// Add relevant information on each body part to observations.
/// </summary>
public void CollectObservationBodyPart(BodyPart bp)
public void CollectObservationBodyPart(BodyPart bp, VectorSensor sensor)
AddVectorObs(bp.groundContact.touchingGround ? 1 : 0); // Is this bp touching the ground
AddVectorObs(rb.velocity);
AddVectorObs(rb.angularVelocity);
sensor.AddObservation(bp.groundContact.touchingGround ? 1 : 0); // Is this bp touching the ground
sensor.AddObservation(rb.velocity);
sensor.AddObservation(rb.angularVelocity);
AddVectorObs(localPosRelToHips);
sensor.AddObservation(localPosRelToHips);
AddVectorObs(bp.currentXNormalizedRot);
AddVectorObs(bp.currentYNormalizedRot);
AddVectorObs(bp.currentZNormalizedRot);
AddVectorObs(bp.currentStrength / m_JdController.maxJointForceLimit);
sensor.AddObservation(bp.currentXNormalizedRot);
sensor.AddObservation(bp.currentYNormalizedRot);
sensor.AddObservation(bp.currentZNormalizedRot);
sensor.AddObservation(bp.currentStrength / m_JdController.maxJointForceLimit);
}
}

public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(m_DirToTarget.normalized);
AddVectorObs(m_JdController.bodyPartsDict[hips].rb.position);
AddVectorObs(hips.forward);
AddVectorObs(hips.up);
sensor.AddObservation(m_DirToTarget.normalized);
sensor.AddObservation(m_JdController.bodyPartsDict[hips].rb.position);
sensor.AddObservation(hips.forward);
sensor.AddObservation(hips.up);
CollectObservationBodyPart(bodyPart);
CollectObservationBodyPart(bodyPart, sensor);
}
}

15
Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs


}
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(agentPos / 20f);
AddVectorObs(DoGroundCheck(true) ? 1 : 0);
sensor.AddObservation(agentPos / 20f);
sensor.AddObservation(DoGroundCheck(true) ? 1 : 0);
}
/// <summary>

}
/// <summary>
/// Chenges the color of the ground for a moment
/// Changes the color of the ground for a moment.
/// <returns>The Enumerator to be used in a Coroutine</returns>
/// <param name="mat">The material to be swaped.</param>
/// <returns>The Enumerator to be used in a Coroutine.</returns>
/// <param name="mat">The material to be swapped.</param>
/// <param name="time">The time the material will remain.</param>
IEnumerator GoalScoredSwapGroundMaterial(Material mat, float time)
{

/// <param name="config">Config.
/// If 0 : No wall and noWallBrain.
/// If 1: Small wall and smallWallBrain.
/// Other : Tall wall and BigWallBrain. </param>
/// Other : Tall wall and BigWallBrain.
/// </param>
void ConfigureAgent(int config)
{
var localScale = wall.transform.localScale;

36
com.unity.ml-agents/CHANGELOG.md


The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Major Changes
- Agent.CollectObservations now takes a VectorSensor argument. It was also overloaded to optionally take an ActionMasker argument. (#3352, #3389)
- Beta support for ONNX export was added. If the `tf2onnx` python package is installed, models will be saved to `.onnx` as well as `.nn` format.
Note that Barracuda 0.6.0 or later is required to import the `.onnx` files properly
- Multi-GPU training and the `--multi-gpu` option has been removed temporarily. (#3345)
### Minor Changes
- Monitor.cs was moved to Examples. (#3372)
- Automatic stepping for Academy is now controlled from the AutomaticSteppingEnabled property. (#3376)
- The GetEpisodeCount, GetStepCount, and GetTotalStepCount methods of Academy were changed to the EpisodeCount, StepCount, and TotalStepCount properties respectively. (#3376)
- Several classes were changed from public to internal visibility. (#3390)
- Academy.RegisterSideChannel and UnregisterSideChannel methods were added. (#3391)
- A tutorial on adding custom SideChannels was added (#3391)
- The stepping logic for the Agent and the Academy has been simplified (#3448)
- Update Barracuda to 0.6.0-preview
- The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take an input class and write to an output class.
- The checkpoint file suffix was changed from `.cptk` to `.ckpt` (#3470)
- The command-line argument used to determine the port that an environment will listen on was changed from `--port` to `--mlagents-port`.
- `DemonstrationRecorder` can now record observations outside of the editor.
- `DemonstrationRecorder` now has an optional path for the demonstrations. This will default to `Application.dataPath` if not set.
- `DemonstrationStore` was changed to accept a `Stream` for its constructor, and was renamed to `DemonstrationWriter`
- The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`
- `RayPerceptionSensorComponent` and related classes now display the debug gizmos whenever the Agent is selected (not just Play mode).
- Most fields on `RayPerceptionSensorComponent` can now be changed while the editor is in Play mode. The exceptions to this are fields that affect the number of observations.
- Unused static methods from the `Utilities` class (ShiftLeft, ReplaceRange, AddRangeNoAlloc, and GetSensorFloatObservationSize) were removed.
### Bugfixes
- Fixed demonstration recording of experiences when the Agent is done. (#3463)
- Fixed a bug with the rewards of multiple Agents in the gym interface (#3471, #3496)
## [0.14.1-preview] - 2020-02-25
### Bug Fixes

- Fixed demonstration recording of experiences when the Agent is done. (#3463)
- Fixed a bug with the rewards of multiple Agents in the gym interface (#3471, #3496)
## [0.14.0-preview] - 2020-02-13
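The first major change in the changelog (CollectObservations taking a `VectorSensor`, #3352/#3389) is the pattern repeated across every example agent in this merge. A before/after sketch of the migration; the class name and the observed values here are hypothetical, chosen to mirror the Ball3DAgent fragment:

```csharp
using MLAgents;
using UnityEngine;

public class RotationAgent : Agent
{
    // Before (0.14.x) the override took no arguments and used the
    // inherited AddVectorObs helpers:
    //
    //   public override void CollectObservations()
    //   {
    //       AddVectorObs(gameObject.transform.rotation.z);
    //       AddVectorObs(gameObject.transform.rotation.x);
    //   }

    // After this merge, the sensor is passed in explicitly and
    // observations are added through it:
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(gameObject.transform.rotation.z);
        sensor.AddObservation(gameObject.transform.rotation.x);
    }
}
```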

2
com.unity.ml-agents/Editor/AgentEditor.cs


*/
[CustomEditor(typeof(Agent), true)]
[CanEditMultipleObjects]
public class AgentEditor : Editor
internal class AgentEditor : Editor
{
public override void OnInspectorGUI()
{

3
com.unity.ml-agents/Editor/BehaviorParametersEditor.cs


using UnityEngine;
using UnityEditor;
using Barracuda;
using MLAgents.Sensor;
namespace MLAgents
{

[CustomEditor(typeof(BehaviorParameters))]
[CanEditMultipleObjects]
public class BehaviorParametersEditor : Editor
internal class BehaviorParametersEditor : Editor
{
const float k_TimeBetweenModelReloads = 2f;
// Time since the last reload of the model

6
com.unity.ml-agents/Editor/BrainParametersDrawer.cs


/// Inspector.
/// </summary>
[CustomPropertyDrawer(typeof(BrainParameters))]
public class BrainParametersDrawer : PropertyDrawer
internal class BrainParametersDrawer : PropertyDrawer
{
// The height of a line in the Unity Inspectors
const float k_LineHeight = 17f;

}
/// <summary>
/// The Height required to draw the Vector Action parameters
/// The Height required to draw the Vector Action parameters.
/// <returns>The height of the drawer of the Vector Action </returns>
/// <returns>The height of the drawer of the Vector Action.</returns>
static float GetHeightDrawVectorAction(SerializedProperty property)
{
var actionSize = 2 + property.FindPropertyRelative(k_ActionSizePropName).arraySize;

2
com.unity.ml-agents/Editor/DemonstrationDrawer.cs


/// </summary>
[CustomEditor(typeof(Demonstration))]
[CanEditMultipleObjects]
public class DemonstrationEditor : Editor
internal class DemonstrationEditor : Editor
{
SerializedProperty m_BrainParameters;
SerializedProperty m_DemoMetaData;

6
com.unity.ml-agents/Editor/DemonstrationImporter.cs


/// Asset Importer used to parse demonstration files.
/// </summary>
[ScriptedImporter(1, new[] {"demo"})]
public class DemonstrationImporter : ScriptedImporter
internal class DemonstrationImporter : ScriptedImporter
const string k_IconPath = "Assets/ML-Agents/Resources/DemoIcon.png";
const string k_IconPath = "Packages/com.unity.ml-agents/Editor/Icons/DemoIcon.png";
public override void OnImportAsset(AssetImportContext ctx)
{

var metaDataProto = DemonstrationMetaProto.Parser.ParseDelimitedFrom(reader);
var metaData = metaDataProto.ToDemonstrationMetaData();
reader.Seek(DemonstrationStore.MetaDataBytes + 1, 0);
reader.Seek(DemonstrationWriter.MetaDataBytes + 1, 0);
var brainParamsProto = BrainParametersProto.Parser.ParseDelimitedFrom(reader);
var brainParameters = brainParamsProto.ToBrainParameters();

3
com.unity.ml-agents/LICENSE.md


com.unity.ml-agents copyright © 2020 Unity Technologies ApS
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

107
com.unity.ml-agents/Runtime/Academy.cs


"docs/Learning-Environment-Design.md")]
public class Academy : IDisposable
{
const string k_ApiVersion = "API-14";
const string k_ApiVersion = "API-15-dev0";
internal const string k_portCommandLineFlag = "--mlagents-port";
/// <summary>
/// True if the Academy is initialized, false otherwise.
/// </summary>
/// <summary>
/// The singleton Academy object.
/// </summary>
/// <summary>
/// Collection of float properties (indexed by a string).
/// </summary>
public IFloatProperties FloatProperties;

// Signals to all the listeners that the academy is being destroyed
internal event Action DestroyAction;
// Signals the Agent that a new step is about to start.
// This will mark the Agent as Done if it has reached its maxSteps.
internal event Action AgentIncrementStep;
// Signals to all the agents at each environment step along with the
// Academy's maxStepReached, done and stepCount values. The agents rely
// on this event to update their own values of max step reached and done

// Signals to all the agents each time the Academy force resets.
internal event Action AgentForceReset;
// Signals that the Academy has been reset by the training process
/// <summary>
/// Signals that the Academy has been reset by the training process.
/// </summary>
public event Action OnEnvironmentReset;
AcademyFixedUpdateStepper m_FixedUpdateStepper;

{
Application.quitting += Dispose;
LazyInitialization();
LazyInitialize();
/// This method is always safe to call; it will have no effect if the Academy is already initialized.
/// This method is always safe to call; it will have no effect if the Academy is already
/// initialized.
internal void LazyInitialization()
internal void LazyInitialize()
{
if (!m_Initialized)
{

}
/// <summary>
/// Enable stepping of the Academy during the FixedUpdate phase. This is done by creating a temporary
/// GameObject with a MonoBehavior that calls Academy.EnvironmentStep().
/// Enable stepping of the Academy during the FixedUpdate phase. This is done by creating
/// a temporary GameObject with a MonoBehaviour that calls Academy.EnvironmentStep().
public void EnableAutomaticStepping()
void EnableAutomaticStepping()
{
if (m_FixedUpdateStepper != null)
{

// Don't show this object in the hierarchy
m_StepperObject.hideFlags = HideFlags.HideInHierarchy;
m_FixedUpdateStepper = m_StepperObject.AddComponent<AcademyFixedUpdateStepper>();
}
/// <summary>
/// Registers SideChannel to the Academy to send and receive data with Python.
/// If IsCommunicatorOn is false, the SideChannel will not be registered.
/// </summary>
/// <param name="channel"> The side channel to be registered.</param>
public void RegisterSideChannel(SideChannel channel)
{
LazyInitialize();
Communicator?.RegisterSideChannel(channel);
}
/// <summary>
/// Unregisters SideChannel to the Academy. If the side channel was not registered,
/// nothing will happen.
/// </summary>
/// <param name="channel"> The side channel to be unregistered.</param>
public void UnregisterSideChannel(SideChannel channel)
{
Communicator?.UnregisterSideChannel(channel);
}
/// <summary>

public void DisableAutomaticStepping(bool destroyImmediate = false)
void DisableAutomaticStepping()
{
if (m_FixedUpdateStepper == null)
{

m_FixedUpdateStepper = null;
if (destroyImmediate)
if (Application.isEditor)
{
UnityEngine.Object.DestroyImmediate(m_StepperObject);
}

}
/// <summary>
/// Returns whether or not the Academy is automatically stepped during the FixedUpdate phase.
/// Determines whether or not the Academy is automatically stepped during the FixedUpdate phase.
public bool IsAutomaticSteppingEnabled
public bool AutomaticSteppingEnabled
set
{
if (value)
{
EnableAutomaticStepping();
}
else
{
DisableAutomaticStepping();
}
}
}
// Used to read Python-provided environment parameters

var inputPort = "";
for (var i = 0; i < args.Length; i++)
{
if (args[i] == "--port")
if (args[i] == k_portCommandLineFlag)
{
inputPort = args[i + 1];
}

/// <returns>
/// Current episode number.
/// </returns>
public int GetEpisodeCount()
public int EpisodeCount
return m_EpisodeCount;
get { return m_EpisodeCount; }
}
/// <summary>

/// Current step count.
/// </returns>
public int GetStepCount()
public int StepCount
return m_StepCount;
get { return m_StepCount; }
}
/// <summary>

/// Total step count.
/// </returns>
public int GetTotalStepCount()
public int TotalStepCount
return m_TotalStepCount;
get { return m_TotalStepCount; }
}
/// <summary>

AgentSetStatus?.Invoke(m_StepCount);
m_StepCount += 1;
m_TotalStepCount += 1;
AgentIncrementStep?.Invoke();
using (TimerStack.Instance.Scoped("AgentSendState"))
{

{
AgentAct?.Invoke();
}
m_StepCount += 1;
m_TotalStepCount += 1;
}
/// <summary>

/// Creates or retrieves an existing ModelRunner that uses the same
/// NNModel and the InferenceDevice as provided.
/// </summary>
/// <param name="model"> The NNModel the ModelRunner must use </param>
/// <param name="brainParameters"> The brainParameters used to create
/// the ModelRunner </param>
/// <param name="inferenceDevice"> The inference device (CPU or GPU)
/// the ModelRunner will use </param>
/// <returns> The ModelRunner compatible with the input settings</returns>
/// <param name="model">The NNModel the ModelRunner must use.</param>
/// <param name="brainParameters">The brainParameters used to create the ModelRunner.</param>
/// <param name="inferenceDevice">
/// The inference device (CPU or GPU) the ModelRunner will use.
/// </param>
/// <returns> The ModelRunner compatible with the input settings.</returns>
internal ModelRunner GetOrCreateModelRunner(
NNModel model, BrainParameters brainParameters, InferenceDevice inferenceDevice)
{

/// </summary>
public void Dispose()
{
DisableAutomaticStepping(true);
DisableAutomaticStepping();
// Signal to listeners that the academy is being destroyed now
DestroyAction?.Invoke();

55
com.unity.ml-agents/Runtime/ActionMasker.cs


namespace MLAgents
{
internal class ActionMasker
/// <summary>
/// Agents that take discrete actions can explicitly indicate that specific actions
/// are not allowed at a point in time. This enables the agent to indicate that some actions
/// may be illegal (e.g. the King in Chess taking a move to the left if it is already in the
/// left side of the board). This class represents the set of masked actions and provides
/// the utilities for setting and retrieving them.
/// </summary>
public class ActionMasker
{
/// When using discrete control, these are the starting indices of the actions
/// when all the branches are concatenated with each other.

}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the actions passed as argument at the next decision.
/// The actionIndices correspond to the actions the agent will be unable to perform
/// on the branch 0.
/// </summary>
/// <param name="actionIndices">The indices of the masked actions on branch 0.</param>
public void SetActionMask(IEnumerable<int> actionIndices)
{
SetActionMask(0, actionIndices);
}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision for the specified
/// action branch. The actionIndex corresponds to the action the agent will be unable
/// to perform.
/// </summary>
/// <param name="branch">The branch for which the actions will be masked.</param>
/// <param name="actionIndex">The index of the masked action.</param>
public void SetActionMask(int branch, int actionIndex)
{
SetActionMask(branch, new[] { actionIndex });
}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision. The actionIndex
/// corresponds to the action the agent will be unable to perform on branch 0.
/// </summary>
/// <param name="actionIndex">The index of the masked action on branch 0</param>
public void SetActionMask(int actionIndex)
{
SetActionMask(0, new[] { actionIndex });
}
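The doc comments above describe the mask as one flat boolean array covering all discrete branches concatenated together, with per-branch starting indices. A Python sketch of that layout, assuming `True` marks a masked (unavailable) action; the branch sizes below are illustrative:

```python
def branch_offsets(branch_sizes):
    """Starting index of each branch when all branches are concatenated."""
    offsets, total = [], 0
    for size in branch_sizes:
        offsets.append(total)
        total += size
    return offsets, total


def set_action_mask(mask, branch_sizes, branch, action_indices):
    """Mark the given actions of one branch as unavailable (True = masked)."""
    offsets, _ = branch_offsets(branch_sizes)
    for idx in action_indices:
        if idx >= branch_sizes[branch]:
            raise IndexError("action index outside the branch")
        mask[offsets[branch] + idx] = True
    return mask


# Two branches of sizes 3 and 2 -> a flat mask of length 5.
sizes = [3, 2]
_, total = branch_offsets(sizes)
mask = set_action_mask([False] * total, sizes, 1, [0])  # masks flat index 3
```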
/// <summary>
/// able to perform the action passed as argument at the next decision. If no branch is
/// specified, the default branch will be 0. The actionIndex or actionIndices correspond
/// to the action the agent will be unable to perform.
/// able to perform the actions passed as argument at the next decision for the specified
/// action branch. The actionIndices correspond to the action options the agent will
/// be unable to perform.
/// </summary>
/// <param name="branch">The branch for which the actions will be masked</param>
/// <param name="actionIndices">The indices of the masked actions</param>

/// </summary>
/// <returns>A mask for the agent. A boolean array of length equal to the total number of
/// actions.</returns>
public bool[] GetMask()
internal bool[] GetMask()
{
if (m_CurrentMask != null)
{

/// <summary>
/// Resets the current mask for an agent
/// </summary>
public void ResetMask()
internal void ResetMask()
{
if (m_CurrentMask != null)
{

338
com.unity.ml-agents/Runtime/Agent.cs


using System.Collections.Generic;
using UnityEngine;
using Barracuda;
using MLAgents.Sensor;
using UnityEngine.Serialization;
/// observations, actions and current status, that is sent to the Brain.
/// observations, actions and current status.
public struct AgentInfo
internal struct AgentInfo
{
/// <summary>
/// Keeps track of the last vector action taken by the Brain.

public float[] vectorActions;
}
/// Agent Monobehavior class that is attached to a Unity GameObject, making it
/// Agent MonoBehaviour class that is attached to a Unity GameObject, making it
/// user in <see cref="CollectObservations"/>. On the other hand, actions
/// are determined by decisions produced by a Policy. Currently, this
/// class is expected to be extended to implement the desired agent behavior.
/// user in <see cref="Agent.CollectObservations(VectorSensor)"/> or
/// <see cref="Agent.CollectObservations(VectorSensor, ActionMasker)"/>.
/// On the other hand, actions are determined by decisions produced by a Policy.
/// Currently, this class is expected to be extended to implement the desired agent behavior.
/// </summary>
/// <remarks>
/// Simply speaking, an agent roams through an environment and at each step

/// little may have changed between successive steps.
///
/// At any step, an agent may be considered <see cref="m_Done"/>.
/// This could occur due to a variety of reasons:
/// At any step, an agent may be considered done due to a variety of reasons:
/// - The agent reached an end state within its environment.
/// - The agent reached the maximum # of steps (i.e. timed out).
/// - The academy reached the maximum # of steps (forced agent to be done).

BehaviorParameters m_PolicyFactory;
/// This code is here to make the upgrade path for users using maxStep
/// easier. We will hook into the Serialization code and make sure that
/// easier. We will hook into the Serialization code and make sure that
/// agentParameters.maxStep and this.maxStep are in sync.
[Serializable]
internal struct AgentParameters

[SerializeField] [HideInInspector]
[SerializeField][HideInInspector]
[SerializeField] [HideInInspector]
[SerializeField][HideInInspector]
internal bool hasUpgradedFromAgentParameters;
/// <summary>

/// Whether or not the agent requests a decision.
bool m_RequestDecision;
/// Keeps track of the number of steps taken by the agent in this episode.
/// Note that this value is different for each agent, and may not overlap
/// with the step counter in the Academy, since agents reset based on

ActionMasker m_ActionMasker;
/// <summary>
/// Demonstration recorder.
/// Set of DemonstrationWriters that the Agent will write its step information to.
/// If you use a DemonstrationRecorder component, this will automatically register its DemonstrationWriter.
/// You can also add your own DemonstrationWriter by calling
/// DemonstrationRecorder.AddDemonstrationWriterToAgent()
DemonstrationRecorder m_Recorder;
internal ISet<DemonstrationWriter> DemonstrationWriters = new HashSet<DemonstrationWriter>();
/// <summary>
/// List of sensors used to generate observations.

/// </summary>
internal VectorSensor collectObservationsSensor;
/// MonoBehaviour function that is called when the attached GameObject
/// becomes enabled or active.
/// <summary>
/// <inheritdoc cref="OnBeforeSerialize"/>
/// </summary>
// Manages a serialization upgrade issue from v0.13 to v0.14 where maxStep moved
// from AgentParameters (since removed) to Agent
/// <summary>
/// <inheritdoc cref="OnAfterDeserialize"/>
/// </summary>
// Manages a serialization upgrade issue from v0.13 to v0.14 where maxStep moved
// from AgentParameters (since removed) to Agent
if (maxStep == 0 && maxStep != agentParameters.maxStep && !hasUpgradedFromAgentParameters)
{
maxStep = agentParameters.maxStep;

/// Helper method for the <see cref="OnEnable"/> event, created to
/// facilitate testing.
/// <summary>
/// Initializes the agent. Can be safely called multiple times.
/// </summary>
public void LazyInitialize()
{
if (m_Initialized)

// Grab the "static" properties for the Agent.
m_EpisodeId = EpisodeIdCounter.GetEpisodeId();
m_PolicyFactory = GetComponent<BehaviorParameters>();
m_Recorder = GetComponent<DemonstrationRecorder>();
Academy.Instance.AgentIncrementStep += AgentIncrementStep;
Academy.Instance.AgentSendState += SendInfo;
Academy.Instance.DecideAction += DecideAction;
Academy.Instance.AgentAct += AgentStep;

InitializeSensors();
}
/// Monobehavior function that is called when the attached GameObject
/// becomes disabled or inactive.
DemonstrationWriters.Clear();
Academy.Instance.AgentIncrementStep -= AgentIncrementStep;
Academy.Instance.AgentSendState -= SendInfo;
Academy.Instance.DecideAction -= DecideAction;
Academy.Instance.AgentAct -= AgentStep;

// We request a decision so Python knows the Agent is done immediately
m_Brain?.RequestDecision(m_Info, sensors);
if (m_Recorder != null && m_Recorder.record && Application.isEditor)
// We also have to write to any DemonstrationWriters so that they get the "done" flag.
foreach(var demoWriter in DemonstrationWriters)
m_Recorder.WriteExperience(m_Info, sensors);
demoWriter.Record(m_Info, sensors);
}
UpdateRewardStats();

/// Returns the current step counter (within the current episode).
/// </summary>
/// <returns>
/// Current episode number.
/// Current step count.
public int GetStepCount()
public int StepCount
return m_StepCount;
get { return m_StepCount; }
}
/// <summary>

public void SetReward(float reward)
{
#if DEBUG
if (float.IsNaN(reward))
{
throw new ArgumentException("NaN reward passed to SetReward.");
}
Utilities.DebugCheckNanAndInfinity(reward, nameof(reward), nameof(SetReward));
#endif
m_CumulativeReward += (reward - m_Reward);
m_Reward = reward;

public void AddReward(float increment)
{
#if DEBUG
if (float.IsNaN(increment))
{
throw new ArgumentException("NaN reward passed to AddReward.");
}
Utilities.DebugCheckNanAndInfinity(increment, nameof(increment), nameof(AddReward));
#endif
m_Reward += increment;
m_CumulativeReward += increment;
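The two methods above differ in how they touch the episode total: `SetReward` replaces the current step's reward and moves the cumulative sum by the difference, while `AddReward` simply accumulates. A Python sketch of that bookkeeping:

```python
class RewardTracker:
    """Mirrors the reward/cumulative-reward bookkeeping shown above."""

    def __init__(self):
        self.reward = 0.0             # reward for the current step
        self.cumulative_reward = 0.0  # total over the episode

    def set_reward(self, reward):
        # Replace this step's reward; the cumulative total moves by the delta.
        self.cumulative_reward += reward - self.reward
        self.reward = reward

    def add_reward(self, increment):
        self.reward += increment
        self.cumulative_reward += increment
```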

/// </returns>
public virtual float[] Heuristic()
{
throw new UnityAgentsException(string.Format(
throw new UnityAgentsException(
"{0} GameObject.",
gameObject.name));
$"{gameObject.name} GameObject.");
}
/// <summary>

collectObservationsSensor = new VectorSensor(param.vectorObservationSize);
if (param.numStackedVectorObservations > 1)
{
var stackingSensor = new StackingSensor(collectObservationsSensor, param.numStackedVectorObservations);
var stackingSensor = new StackingSensor(
collectObservationsSensor, param.numStackedVectorObservations);
sensors.Add(stackingSensor);
}
else

// Make sure the names are actually unique
for (var i = 0; i < sensors.Count - 1; i++)
{
Debug.Assert(!sensors[i].GetName().Equals(sensors[i + 1].GetName()), "Sensor names must be unique.");
Debug.Assert(
!sensors[i].GetName().Equals(sensors[i + 1].GetName()),
"Sensor names must be unique.");
}
#endif
}

UpdateSensors();
using (TimerStack.Instance.Scoped("CollectObservations"))
{
CollectObservations();
CollectObservations(collectObservationsSensor, m_ActionMasker);
}
m_Info.actionMasks = m_ActionMasker.GetMask();

m_Brain.RequestDecision(m_Info, sensors);
if (m_Recorder != null && m_Recorder.record && Application.isEditor)
// If we have any DemonstrationWriters, write the AgentInfo and sensors to them.
foreach(var demoWriter in DemonstrationWriters)
m_Recorder.WriteExperience(m_Info, sensors);
demoWriter.Record(m_Info, sensors);
for (var i = 0; i < sensors.Count; i++)
foreach (var sensor in sensors)
sensors[i].Update();
sensor.Update();
/// Collects the (vector, visual) observations of the agent.
/// Collects the vector observations of the agent.
/// <param name="sensor">
/// The vector observations for the agent.
/// </param>
/// Simply, an agents observation is any environment information that helps
/// the Agent acheive its goal. For example, for a fighting Agent, its
/// An agent's observation is any environment information that helps
/// the Agent achieve its goal. For example, for a fighting Agent, its
/// Vector observations are added by calling the provided helper methods:
/// - <see cref="AddVectorObs(int)"/>
/// - <see cref="AddVectorObs(float)"/>
/// - <see cref="AddVectorObs(Vector3)"/>
/// - <see cref="AddVectorObs(Vector2)"/>
/// - <see>
/// <cref>AddVectorObs(float[])</cref>
/// </see>
/// - <see>
/// <cref>AddVectorObs(List{float})</cref>
/// </see>
/// - <see cref="AddVectorObs(Quaternion)"/>
/// - <see cref="AddVectorObs(bool)"/>
/// - <see cref="AddVectorObs(int, int)"/>
/// Vector observations are added by calling the provided helper methods
/// on the VectorSensor input:
/// - <see cref="VectorSensor.AddObservation(int)"/>
/// - <see cref="VectorSensor.AddObservation(float)"/>
/// - <see cref="VectorSensor.AddObservation(Vector3)"/>
/// - <see cref="VectorSensor.AddObservation(Vector2)"/>
/// - <see cref="VectorSensor.AddObservation(Quaternion)"/>
/// - <see cref="VectorSensor.AddObservation(bool)"/>
/// - <see cref="VectorSensor.AddObservation(IEnumerable{float})"/>
/// - <see cref="VectorSensor.AddOneHotObservation(int, int)"/>
/// Depending on your environment, any combination of these helpers can
/// be used. They just need to be used in the exact same order each time
/// this method is called and the resulting size of the vector observation

/// </remarks>
public virtual void CollectObservations()
{
}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision. If no branch is
/// specified, the default branch will be 0. The actionIndex or actionIndices correspond
/// to the action the agent will be unable to perform.
/// </summary>
/// <param name="actionIndices">The indices of the masked actions on branch 0</param>
protected void SetActionMask(IEnumerable<int> actionIndices)
{
m_ActionMasker.SetActionMask(0, actionIndices);
}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision. If no branch is
/// specified, the default branch will be 0. The actionIndex or actionIndices correspond
/// to the action the agent will be unable to perform.
/// </summary>
/// <param name="actionIndex">The index of the masked action on branch 0</param>
protected void SetActionMask(int actionIndex)
{
m_ActionMasker.SetActionMask(0, new[] { actionIndex });
}
/// <summary>
/// Sets an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision. If no branch is
/// specified, the default branch will be 0. The actionIndex or actionIndices correspond
/// to the action the agent will be unable to perform.
/// </summary>
/// <param name="branch">The branch for which the actions will be masked</param>
/// <param name="actionIndex">The index of the masked action</param>
protected void SetActionMask(int branch, int actionIndex)
public virtual void CollectObservations(VectorSensor sensor)
m_ActionMasker.SetActionMask(branch, new[] { actionIndex });
/// Modifies an action mask for discrete control agents. When used, the agent will not be
/// able to perform the action passed as argument at the next decision. If no branch is
/// specified, the default branch will be 0. The actionIndex or actionIndices correspond
/// to the action the agent will be unable to perform.
/// Collects the vector observations of the agent alongside the masked actions.
/// The agent observation describes the current environment from the
/// perspective of the agent.
/// <param name="branch">The branch for which the actions will be masked</param>
/// <param name="actionIndices">The indices of the masked actions</param>
protected void SetActionMask(int branch, IEnumerable<int> actionIndices)
{
m_ActionMasker.SetActionMask(branch, actionIndices);
}
/// <summary>
/// Adds a float observation to the vector observations of the agent.
/// Increases the size of the agents vector observation by 1.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(float observation)
{
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds an integer observation to the vector observations of the agent.
/// Increases the size of the agents vector observation by 1.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(int observation)
/// <param name="sensor">
/// The vector observations for the agent.
/// </param>
/// <param name="actionMasker">
/// The masked actions for the agent.
/// </param>
/// <remarks>
/// An agent's observation is any environment information that helps
/// the Agent achieve its goal. For example, for a fighting Agent, its
/// observation could include distances to friends or enemies, or the
/// current level of ammunition at its disposal.
/// Recall that an Agent may attach vector or visual observations.
/// Vector observations are added by calling the provided helper methods
/// on the VectorSensor input:
/// - <see cref="VectorSensor.AddObservation(int)"/>
/// - <see cref="VectorSensor.AddObservation(float)"/>
/// - <see cref="VectorSensor.AddObservation(Vector3)"/>
/// - <see cref="VectorSensor.AddObservation(Vector2)"/>
/// - <see cref="VectorSensor.AddObservation(Quaternion)"/>
/// - <see cref="VectorSensor.AddObservation(bool)"/>
/// - <see cref="VectorSensor.AddObservation(IEnumerable{float})"/>
/// - <see cref="VectorSensor.AddOneHotObservation(int, int)"/>
/// Depending on your environment, any combination of these helpers can
/// be used. They just need to be used in the exact same order each time
/// this method is called and the resulting size of the vector observation
/// needs to match the vectorObservationSize attribute of the linked Brain.
/// Visual observations are implicitly added from the cameras attached to
/// the Agent.
/// When using Discrete Control, you can prevent the Agent from using a certain
/// action by masking it. You can call the following method on the ActionMasker
/// input :
/// - <see cref="ActionMasker.SetActionMask(int)"/>
/// - <see cref="ActionMasker.SetActionMask(int, int)"/>
/// - <see cref="ActionMasker.SetActionMask(int, IEnumerable{int})"/>
/// - <see cref="ActionMasker.SetActionMask(IEnumerable{int})"/>
/// The branch input is the index of the action, actionIndices are the indices of the
/// invalid options for that action.
/// </remarks>
public virtual void CollectObservations(VectorSensor sensor, ActionMasker actionMasker)
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds an Vector3 observation to the vector observations of the agent.
/// Increases the size of the agents vector observation by 3.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(Vector3 observation)
{
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds an Vector2 observation to the vector observations of the agent.
/// Increases the size of the agents vector observation by 2.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(Vector2 observation)
{
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds a collection of float observations to the vector observations of the agent.
/// Increases the size of the agents vector observation by size of the collection.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(IEnumerable<float> observation)
{
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds a quaternion observation to the vector observations of the agent.
/// Increases the size of the agents vector observation by 4.
/// </summary>
/// <param name="observation">Observation.</param>
protected void AddVectorObs(Quaternion observation)
{
collectObservationsSensor.AddObservation(observation);
}
/// <summary>
/// Adds a boolean observation to the vector observation of the agent.
/// Increases the size of the agent's vector observation by 1.
/// </summary>
/// <param name="observation"></param>
protected void AddVectorObs(bool observation)
{
collectObservationsSensor.AddObservation(observation);
}
protected void AddVectorObs(int observation, int range)
{
collectObservationsSensor.AddOneHotObservation(observation, range);
CollectObservations(sensor);
}
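The method above forwards to `AddOneHotObservation(observation, range)`, which encodes an integer as a one-hot vector of length `range`. A minimal Python sketch of that encoding:

```python
def one_hot(observation, value_range):
    """Encode an integer in [0, value_range) as a one-hot float vector."""
    if not 0 <= observation < value_range:
        raise ValueError("observation outside of range")
    return [1.0 if i == observation else 0.0 for i in range(value_range)]
```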
/// <summary>

}
/// <summary>
/// Returns the last action that was decided on by the Agent (returns null if no decision has been made)
/// Returns the last action that was decided on by the Agent
/// <returns>
/// The last action that was decided by the Agent (or null if no decision has been made)
/// </returns>
return m_Action.vectorActions;
return m_Action.vectorActions;
}
/// <summary>

AgentReset();
}
internal void UpdateAgentAction(AgentAction action)
{
m_Action = action;
}
/// <summary>
/// Scales continuous action from [-1, 1] to arbitrary range.
/// </summary>

/// <returns></returns>
protected float ScaleAction(float rawAction, float min, float max)
protected static float ScaleAction(float rawAction, float min, float max)
{
var middle = (min + max) / 2;
var range = (max - min) / 2;

}
}
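`ScaleAction` maps a raw action in [-1, 1] linearly onto [min, max] using the midpoint and half-range computed above. The same arithmetic in Python:

```python
def scale_action(raw_action, lo, hi):
    """Linearly map raw_action from [-1, 1] onto [lo, hi]."""
    middle = (lo + hi) / 2.0
    half_range = (hi - lo) / 2.0
    return raw_action * half_range + middle
```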
void AgentIncrementStep()
{
m_StepCount += 1;
}
if ((m_RequestAction) && (m_Brain != null))
{
m_RequestAction = false;
AgentAction(m_Action.vectorActions);
}
}
else
{
m_StepCount += 1;
}
if ((m_RequestAction) && (m_Brain != null))
{
m_RequestAction = false;
AgentAction(m_Action.vectorActions);
}
}

29
com.unity.ml-agents/Runtime/DecisionRequester.cs


using System;
using System.Runtime.CompilerServices;
using UnityEngine;
namespace MLAgents

[AddComponentMenu("ML Agents/Decision Requester", (int)MenuGroup.Default)]
public class DecisionRequester : MonoBehaviour
{
/// <summary>
/// The frequency with which the agent requests a decision. A DecisionPeriod of 5 means
/// that the Agent will request a decision every 5 Academy steps.
/// </summary>
[Tooltip("The agent will automatically request a decision every X Academy steps.")]
[Tooltip("The frequency with which the agent requests a decision. A DecisionPeriod " +
"of 5 means that the Agent will request a decision every 5 Academy steps.")]
[Tooltip("Whether or not AgentAction will be called on Academy steps that decisions aren't requested. Has no effect if DecisionPeriod is 1.")]
/// <summary>
/// Indicates whether or not the agent will take an action during the Academy steps where
/// it does not request a decision. Has no effect when DecisionPeriod is set to 1.
/// </summary>
[Tooltip("Indicates whether or not the agent will take an action during the Academy " +
"steps where it does not request a decision. Has no effect when DecisionPeriod " +
"is set to 1.")]
[Tooltip("Whether or not Agent decisions should start at a random offset.")]
/// <summary>
/// Whether or not the Agent decisions should start at an offset (different for each agent).
/// This does not affect <see cref="DecisionPeriod"/>. Turning this on will distribute
/// the decision-making computations for all the agents across multiple Academy steps.
/// This can be valuable in scenarios where you have many agents in the scene, particularly
/// during the inference phase.
/// </summary>
[Tooltip("Whether or not Agent decisions should start at an offset.")]
public void Awake()
internal void Awake()
{
m_Offset = offsetStep ? gameObject.GetInstanceID() : 0;
m_Agent = gameObject.GetComponent<Agent>();
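The per-agent offset chosen above staggers when agents request decisions, so the decision-making work is spread across Academy steps instead of landing on the same step for every agent. A sketch of the scheduling effect, under the assumption (not shown in the snippet) that a requester fires when `(step + offset) % period == 0`:

```python
def decision_steps(period, offset, total_steps):
    """Steps at which an agent with the given offset requests a decision."""
    return [s for s in range(total_steps) if (s + offset) % period == 0]


# With period 3, offsets 0..2 split three agents across different steps.
schedules = {off: decision_steps(3, off, 9) for off in range(3)}
```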

1
com.unity.ml-agents/Runtime/Grpc/GrpcExtensions.cs


using System.Linq;
using Google.Protobuf;
using MLAgents.CommunicatorObjects;
using MLAgents.Sensor;
using UnityEngine;
using System.Runtime.CompilerServices;

41
com.unity.ml-agents/Runtime/Grpc/RpcCommunicator.cs


using MLAgents.CommunicatorObjects;
using System.IO;
using Google.Protobuf;
using MLAgents.Sensor;
namespace MLAgents
{

/// The communicator parameters sent at construction
CommunicatorInitParameters m_CommunicatorInitParameters;
Dictionary<int, SideChannel> m_SideChannels = new Dictionary<int, SideChannel>();
Dictionary<Guid, SideChannel> m_SideChannels = new Dictionary<Guid, SideChannel>();
/// <summary>
/// Initializes a new instance of the RPCCommunicator class.

/// <param name="sideChannel"> The side channel to be registered.</param>
public void RegisterSideChannel(SideChannel sideChannel)
{
var channelType = sideChannel.ChannelType();
if (m_SideChannels.ContainsKey(channelType))
var channelId = sideChannel.ChannelId;
if (m_SideChannels.ContainsKey(channelId))
"side channels of the same type.", channelType));
"side channels of the same id.", channelId));
m_SideChannels.Add(channelType, sideChannel);
m_SideChannels.Add(channelId, sideChannel);
}
/// <summary>
/// Unregisters a side channel from the communicator.
/// </summary>
/// <param name="sideChannel"> The side channel to be unregistered.</param>
public void UnregisterSideChannel(SideChannel sideChannel)
{
if (m_SideChannels.ContainsKey(sideChannel.ChannelId))
{
m_SideChannels.Remove(sideChannel.ChannelId);
}
}
/// <summary>

/// <param name="sideChannels"> A dictionary of channel type to channel.</param>
/// <returns></returns>
public static byte[] GetSideChannelMessage(Dictionary<int, SideChannel> sideChannels)
public static byte[] GetSideChannelMessage(Dictionary<Guid, SideChannel> sideChannels)
{
using (var memStream = new MemoryStream())
{

var messageList = sideChannel.MessageQueue;
foreach (var message in messageList)
{
binaryWriter.Write(sideChannel.ChannelType());
binaryWriter.Write(sideChannel.ChannelId.ToByteArray());
binaryWriter.Write(message.Count());
binaryWriter.Write(message);
}
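The writer loop above frames each outgoing message as a 16-byte channel GUID, a 32-bit message length, then the payload; `ProcessSideChannelData` consumes the same layout on the receiving side. A Python round-trip sketch of that framing (C# `Guid.ToByteArray` uses a mixed-endian layout, which Python's `uuid.UUID(bytes_le=...)` matches; the little-endian length matches `BinaryWriter`'s default encoding):

```python
import io
import struct
import uuid


def write_message(stream, channel_id, payload):
    # 16-byte GUID (Guid.ToByteArray layout), int32 length, raw bytes.
    stream.write(channel_id.bytes_le)
    stream.write(struct.pack("<i", len(payload)))
    stream.write(payload)


def read_messages(data):
    """Yield (channel_id, payload) pairs until the buffer is exhausted."""
    stream = io.BytesIO(data)
    while stream.tell() < len(data):
        channel_id = uuid.UUID(bytes_le=stream.read(16))
        (length,) = struct.unpack("<i", stream.read(4))
        yield channel_id, stream.read(length)
```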

/// </summary>
/// <param name="sideChannels">A dictionary of channel type to channel.</param>
/// <param name="dataReceived">The byte array of data received from Python.</param>
public static void ProcessSideChannelData(Dictionary<int, SideChannel> sideChannels, byte[] dataReceived)
public static void ProcessSideChannelData(Dictionary<Guid, SideChannel> sideChannels, byte[] dataReceived)
{
if (dataReceived.Length == 0)
{

{
while (memStream.Position < memStream.Length)
{
int channelType = 0;
Guid channelId = Guid.Empty;
channelType = binaryReader.ReadInt32();
channelId = new Guid(binaryReader.ReadBytes(16));
var messageLength = binaryReader.ReadInt32();
message = binaryReader.ReadBytes(messageLength);
}

"version of MLAgents in Unity is compatible with the Python version. Original error : "
+ ex.Message);
}
if (sideChannels.ContainsKey(channelType))
if (sideChannels.ContainsKey(channelId))
sideChannels[channelType].OnMessageReceived(message);
sideChannels[channelId].OnMessageReceived(message);
"Unknown side channel data received. Channel type "
+ ": {0}", channelType));
"Unknown side channel data received. Channel Id is "
+ ": {0}", channelId));
}
}
}

15
com.unity.ml-agents/Runtime/ICommunicator.cs


using System;
using System.Collections.Generic;
using UnityEngine;
using MLAgents.Sensor;
namespace MLAgents
{

/// <summary>
/// Registers a new Brain to the Communicator.
/// </summary>
/// <param name="name">The name or key uniquely identifying the Brain</param>
/// <param name="brainParameters">The Parameters for the Brain being registered</param>
/// <param name="name">The name or key uniquely identifying the Brain.</param>
/// <param name="brainParameters">The Parameters for the Brain being registered.</param>
void SubscribeBrain(string name, BrainParameters brainParameters);
/// <summary>

/// <summary>
/// Gets the AgentActions based on the batching key.
/// </summary>
/// <param name="key">A key to identify which behavior actions to get</param>
/// <param name="agentId">A key to identify which Agent actions to get</param>
/// <param name="key">A key to identify which behavior actions to get.</param>
/// <param name="agentId">A key to identify which Agent actions to get.</param>
/// <returns></returns>
float[] GetActions(string key, int agentId);

/// </summary>
/// <param name="sideChannel"> The side channel to be registered.</param>
void RegisterSideChannel(SideChannel sideChannel);
/// <summary>
/// Unregisters a side channel from the communicator.
/// </summary>
/// <param name="sideChannel"> The side channel to be unregistered.</param>
void UnregisterSideChannel(SideChannel sideChannel);
}
}

2
com.unity.ml-agents/Runtime/InferenceBrain/ApplierImpl.cs


{
actionValue[j] = tensorProxy.data[agentIndex, j];
}
}
agentIndex++;
}

{
actionVal[j] = actionValues[agentIndex, j];
}
}
agentIndex++;
}

5
com.unity.ml-agents/Runtime/InferenceBrain/BarracudaModelParamLoader.cs


using System.Collections.Generic;
using System.Linq;
using Barracuda;
using MLAgents.Sensor;
using UnityEngine;
namespace MLAgents.InferenceBrain

/// Generates the Tensor inputs that are expected to be present in the Model.
/// </summary>
/// <param name="model">
/// The Barracuda engine model for loading static parameters
/// The Barracuda engine model for loading static parameters.
/// <returns>TensorProxy IEnumerable with the expected Tensor inputs</returns>
/// <returns>TensorProxy IEnumerable with the expected Tensor inputs.</returns>
public static IReadOnlyList<TensorProxy> GetInputTensors(Model model)
{
var tensors = new List<TensorProxy>();

4
com.unity.ml-agents/Runtime/InferenceBrain/GeneratorImpl.cs


using System;
using Barracuda;
using MLAgents.InferenceBrain.Utils;
using MLAgents.Sensor;
using UnityEngine;
namespace MLAgents.InferenceBrain

{
var info = infoSensorPair.agentInfo;
var pastAction = info.storedVectorActions;
if (pastAction != null){
if (pastAction != null)
{
for (var j = 0; j < actionSize; j++)
{
tensorProxy.data[agentIndex, j] = pastAction[j];

1
com.unity.ml-agents/Runtime/InferenceBrain/ModelRunner.cs


using Barracuda;
using UnityEngine.Profiling;
using System;
using MLAgents.Sensor;
namespace MLAgents.InferenceBrain
{

13
com.unity.ml-agents/Runtime/InferenceBrain/TensorGenerator.cs


using System.Collections.Generic;
using Barracuda;
using MLAgents.Sensor;
namespace MLAgents.InferenceBrain
{

/// Modifies the data inside a Tensor according to the information contained in the
/// AgentInfos contained in the current batch.
/// </summary>
/// <param name="tensorProxy"> The tensor the data and shape will be modified</param>
/// <param name="batchSize"> The number of agents present in the current batch</param>
/// <param name="infos"> List of AgentInfos containing the
/// information that will be used to populate the tensor's data</param>
/// <param name="tensorProxy"> The tensor the data and shape will be modified.</param>
/// <param name="batchSize"> The number of agents present in the current batch.</param>
/// <param name="infos">
/// List of AgentInfos containing the information that will be used to populate
/// the tensor's data.
/// </param>
void Generate(
TensorProxy tensorProxy, int batchSize, IEnumerable<AgentInfoSensorsPair> infos);
}

/// Returns a new TensorGenerators object.
/// </summary>
/// <param name="seed"> The seed the Generators will be initialized with.</param>
/// <param name="allocator"> Tensor allocator</param>
/// <param name="allocator"> Tensor allocator.</param>
/// <param name="memories">Dictionary of AgentInfo.id to memory for use in the inference model.</param>
/// <param name="barracudaModel"></param>
public TensorGenerator(

2
com.unity.ml-agents/Runtime/InferenceBrain/TensorProxy.cs


/// allowing the user to specify everything but the data in a graphical way.
/// </summary>
[Serializable]
public class TensorProxy
internal class TensorProxy
{
public enum TensorType
{

2
com.unity.ml-agents/Runtime/InferenceBrain/Utils/Multinomial.cs


/// entry[i] = P(x \le i), NOT P(i - 1 \le x \lt i).
/// (\le stands for less than or equal to while \lt is strictly less than).
/// </summary>
public class Multinomial
internal class Multinomial
{
readonly System.Random m_Random;
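The CDF convention documented above (entry[i] = P(x ≤ i), possibly unnormalized) can be sketched in Python. This is an illustrative mirror of the sampling idea, not the ML-Agents API; the function name and the plain-list CDF are assumptions:

```python
import random

def sample_multinomial(cdf, rng=random):
    """Sample an index from a cumulative distribution.

    cdf[i] holds P(x <= i) scaled by the total mass cdf[-1], matching
    the convention entry[i] = P(x <= i) rather than per-bucket
    probabilities, so no normalization pass is needed before sampling.
    """
    p = rng.random() * cdf[-1]  # scale the uniform draw by the total mass
    for i, threshold in enumerate(cdf):
        if p <= threshold:
            return i
    return len(cdf) - 1  # guard against floating-point edge cases
```

With `cdf = [0.1, 0.6, 1.0]`, index 1 is drawn whenever the uniform sample lands in (0.1, 0.6], i.e. with probability 0.5.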

6
com.unity.ml-agents/Runtime/InferenceBrain/Utils/RandomNormal.cs


/// https://en.wikipedia.org/wiki/Marsaglia_polar_method
/// TODO: worth overriding System.Random instead of aggregating?
/// </summary>
public class RandomNormal
internal class RandomNormal
{
readonly double m_Mean;
readonly double m_Stddev;

double m_SpareUnscaled;
/// <summary>
/// Return the next random double number
/// Return the next random double number.
/// <returns>Next random double number</returns>
/// <returns>Next random double number.</returns>
public double NextDouble()
{
if (m_HasSpare)
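The Marsaglia polar method linked in the RandomNormal doc comment rejects uniform points outside the unit circle, then transforms a surviving point into two independent normals (the C# class caches the spare; this sketch returns just one). A hedged Python illustration, with assumed names:

```python
import math
import random

def random_normal(mean, stddev, rng=random):
    """Draw one N(mean, stddev) sample via the Marsaglia polar method."""
    while True:
        # Two uniforms in [-1, 1), rejected until inside the unit circle.
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        if 0.0 < s < 1.0:
            break
    # Both u * factor and v * factor are standard normals; keep one.
    factor = math.sqrt(-2.0 * math.log(s) / s)
    return mean + stddev * u * factor
```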

13
com.unity.ml-agents/Runtime/Policy/BarracudaPolicy.cs


using System.Collections.Generic;
using MLAgents.InferenceBrain;
using System;
using MLAgents.Sensor;
/// <summary>
/// Where to perform inference.
/// </summary>
/// <summary>
/// CPU inference
/// </summary>
/// <summary>
/// GPU inference
/// </summary>
/// every step. It uses a ModelRunner that is shared accross all
/// every step. It uses a ModelRunner that is shared across all
/// Barracuda Policies that use the same model and inference devices.
/// </summary>
internal class BarracudaPolicy : IPolicy

21
com.unity.ml-agents/Runtime/Policy/BehaviorParameters.cs


/// <summary>
/// The Factory to generate policies.
/// </summary>
///
[AddComponentMenu("ML Agents/Behavior Parameters", (int)MenuGroup.Default)]
public class BehaviorParameters : MonoBehaviour
{

[HideInInspector]
[SerializeField]
string m_BehaviorName = "My Behavior";
/// <summary>
/// The team ID for this behavior.
/// </summary>
[HideInInspector]
[SerializeField]
public int m_TeamID;

[Tooltip("Use all Sensor components attached to child GameObjects of this Agent.")]
bool m_UseChildSensors = true;
/// <summary>
/// The associated <see cref="BrainParameters"/> for this behavior.
/// </summary>
/// <summary>
/// Whether or not to use all the sensor components attached to child GameObjects of the agent.
/// </summary>
/// <summary>
/// The name of this behavior, which is used as a base name. See
/// <see cref="fullyQualifiedBehaviorName"/> for the full name.
/// </summary>
public string behaviorName
{
get { return m_BehaviorName; }

}
}
/// <summary>
/// Updates the model and related details for this behavior.
/// </summary>
/// <param name="newBehaviorName">New name for the behavior.</param>
/// <param name="model">New neural network model for this behavior.</param>
/// <param name="inferenceDevice">New inference device for this behavior.</param>
public void GiveModel(
string newBehaviorName,
NNModel model,

34
com.unity.ml-agents/Runtime/Policy/BrainParameters.cs


namespace MLAgents
{
/// <summary>
/// Whether the action space is discrete or continuous.
/// </summary>
/// <summary>
/// Discrete action space: a fixed number of options are available.
/// </summary>
/// <summary>
/// Continuous action space: each action can take on a float value.
/// </summary>
Continuous
}

public class BrainParameters
{
/// <summary>
/// If continuous : The length of the float vector that represents
/// the state
/// If discrete : The number of possible values the state can take
/// If continuous : The length of the float vector that represents the state.
/// If discrete : The number of possible values the state can take.
/// <summary>
/// Stacking refers to concatenating the observations across multiple frames. This field
/// indicates the number of frames to concatenate across.
/// </summary>
/// If continuous : The length of the float vector that represents
/// the action
/// If discrete : The number of possible values the action can take*/
/// If continuous : The length of the float vector that represents the action.
/// If discrete : The number of possible values the action can take.
/// <summary></summary>The list of strings describing what the actions correpond to */
/// <summary>
/// The list of strings describing what the actions correspond to.
/// </summary>
/// <summary>Defines if the action is discrete or continuous</summary>
/// <summary>
/// Defines if the action is discrete or continuous.
/// </summary>
/// Deep clones the BrainParameter object
/// Deep clones the BrainParameter object.
/// </summary>
/// <returns> A new BrainParameter object with the same values as the original.</returns>
public BrainParameters Clone()
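The stacking described above (concatenating observations across multiple frames) can be sketched as a small Python helper. This is an illustration of the idea only; the zero-padding at episode start and all names here are assumptions, not the ML-Agents implementation:

```python
from collections import deque

def make_stacker(obs_size, num_stacked):
    """Return a function that stacks the last num_stacked observations.

    Older frames are zero-padded until the buffer fills, so the stacked
    vector always has obs_size * num_stacked elements, oldest first.
    """
    frames = deque([[0.0] * obs_size for _ in range(num_stacked)],
                   maxlen=num_stacked)

    def stack(obs):
        frames.append(list(obs))  # drops the oldest frame once full
        flat = []
        for frame in frames:
            flat.extend(frame)
        return flat

    return stack
```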

1
com.unity.ml-agents/Runtime/Policy/HeuristicPolicy.cs


using MLAgents.Sensor;
using System.Collections.Generic;
using System;

1
com.unity.ml-agents/Runtime/Policy/IPolicy.cs


using System;
using System.Collections.Generic;
using MLAgents.Sensor;
namespace MLAgents
{

3
com.unity.ml-agents/Runtime/Policy/RemotePolicy.cs


using UnityEngine;
using System.Collections.Generic;
using MLAgents.Sensor;
using System;
namespace MLAgents

/// </summary>
internal class RemotePolicy : IPolicy
{
int m_AgentId;
string m_FullyQualifiedBehaviorName;

{
m_Communicator?.DecideBatch();
return m_Communicator?.GetActions(m_FullyQualifiedBehaviorName, m_AgentId);
}
public void Dispose()

56
com.unity.ml-agents/Runtime/Sensor/CameraSensor.cs


using System;
namespace MLAgents.Sensor
namespace MLAgents
/// <summary>
/// A sensor that wraps a Camera object to generate visual observations for an agent.
/// </summary>
public class CameraSensor : ISensor
{
Camera m_Camera;

int[] m_Shape;
SensorCompressionType m_CompressionType;
public CameraSensor(Camera camera, int width, int height, bool grayscale, string name,
SensorCompressionType compression)
/// <summary>
/// Creates and returns the camera sensor.
/// </summary>
/// <param name="camera">Camera object to capture images from.</param>
/// <param name="width">The width of the generated visual observation.</param>
/// <param name="height">The height of the generated visual observation.</param>
/// <param name="grayscale">Whether to convert the generated image to grayscale or keep color.</param>
/// <param name="name">The name of the camera sensor.</param>
/// <param name="compression">The compression to apply to the generated image.</param>
public CameraSensor(
Camera camera, int width, int height, bool grayscale, string name, SensorCompressionType compression)
{
m_Camera = camera;
m_Width = width;

m_Shape = new[] { height, width, grayscale ? 1 : 3 };
m_Shape = GenerateShape(width, height, grayscale);
/// <summary>
/// Accessor for the name of the sensor.
/// </summary>
/// <returns>Sensor name.</returns>
/// <summary>
/// Accessor for the size of the sensor data. Will be h x w x 1 for grayscale and
/// h x w x 3 for color.
/// </summary>
/// <returns>Size of each of the three dimensions.</returns>
/// <summary>
/// Generates a compressed image. This can be valuable in speeding up training.
/// </summary>
/// <returns>Compressed image.</returns>
public byte[] GetCompressedObservation()
{
using (TimerStack.Instance.Scoped("CameraSensor.GetCompressedObservation"))

}
}
/// <summary>
/// Writes out the generated, uncompressed image to the provided <see cref="WriteAdapter"/>.
/// </summary>
/// <param name="adapter">Where the observation is written to.</param>
/// <returns></returns>
public int Write(WriteAdapter adapter)
{
using (TimerStack.Instance.Scoped("CameraSensor.WriteToTensor"))

}
}
/// <inheritdoc/>
/// <inheritdoc/>
public SensorCompressionType GetCompressionType()
{
return m_CompressionType;

/// Converts a m_Camera and corresponding resolution to a 2D texture.
/// Renders a Camera instance to a 2D texture at the corresponding resolution.
/// </summary>
/// <returns>The 2D texture.</returns>
/// <param name="obsCamera">Camera.</param>

RenderTexture.active = prevActiveRt;
RenderTexture.ReleaseTemporary(tempRt);
return texture2D;
}
/// <summary>
/// Computes the observation shape for a camera sensor based on the height, width
/// and grayscale flag.
/// </summary>
/// <param name="width">Width of the image captured from the camera.</param>
/// <param name="height">Height of the image captured from the camera.</param>
/// <param name="grayscale">Whether or not to convert the image to grayscale.</param>
/// <returns>The observation shape.</returns>
internal static int[] GenerateShape(int width, int height, bool grayscale)
{
return new[] { height, width, grayscale ? 1 : 3 };
}
}
}
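The GenerateShape helper above encodes the channels-last (height, width, channels) layout that the sensor docs describe: h x w x 1 for grayscale, h x w x 3 for color. A Python mirror of that logic, for illustration only:

```python
def generate_shape(width, height, grayscale):
    """Observation shape for an image sensor, channels-last (H, W, C).

    Mirrors the C# GenerateShape shown above: 1 channel when the image
    is converted to grayscale, 3 channels (RGB) otherwise.
    """
    return [height, width, 1 if grayscale else 3]
```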

39
com.unity.ml-agents/Runtime/Sensor/CameraSensorComponent.cs


using System;
namespace MLAgents.Sensor
namespace MLAgents
/// <summary>
/// A SensorComponent that creates a <see cref="CameraSensor"/>.
/// </summary>
/// <summary>
/// Camera object that provides the data to the sensor.
/// </summary>
/// <summary>
/// Name of the generated <see cref="CameraSensor"/> object.
/// </summary>
/// <summary>
/// Width of the generated image.
/// </summary>
/// <summary>
/// Height of the generated image.
/// </summary>
/// <summary>
/// Whether to generate grayscale images or color.
/// </summary>
/// <summary>
/// The compression type to use for the sensor.
/// </summary>
/// <summary>
/// Creates the <see cref="CameraSensor"/>
/// </summary>
/// <returns>The created <see cref="CameraSensor"/> object for this component.</returns>
/// <summary>
/// Computes the observation shape of the sensor.
/// </summary>
/// <returns>The observation shape of the associated <see cref="CameraSensor"/> object.</returns>
return new[] { height, width, grayscale ? 1 : 3 };
return CameraSensor.GenerateShape(width, height, grayscale);
}
}
}

67
com.unity.ml-agents/Runtime/Sensor/ISensor.cs


namespace MLAgents.Sensor
namespace MLAgents
/// <summary>
/// The compression setting for visual/camera observations.
/// </summary>
/// <summary>
/// No compression. Data is preserved as float arrays.
/// </summary>
/// <summary>
/// PNG format. Data will be stored in binary format.
/// </summary>
/// For custom implementations, it is recommended to SensorBase instead.
/// For custom implementations, it is recommended to use <see cref="SensorBase"/> instead.
/// For example, a sensor that observes the velocity of a rigid body (in 3D) would return new {3}.
/// A sensor that returns an RGB image would return new [] {Width, Height, 3}
/// For example, a sensor that observes the velocity of a rigid body (in 3D) would return
/// new {3}. A sensor that returns an RGB image would return new [] {Height, Width, 3}
/// <returns></returns>
/// <returns>Size of the observations that will be generated.</returns>
/// Write the observation data directly to the WriteAdapter.
/// This is considered an advanced interface; for a simpler approach, use SensorBase and override WriteFloats instead.
/// Note that this (and GetCompressedObservation) may be called multiple times per agent step, so should not
/// mutate any internal state.
/// Write the observation data directly to the <see cref="WriteAdapter"/>.
/// This is considered an advanced interface; for a simpler approach, use
/// <see cref="SensorBase"/> and override <see cref="SensorBase.WriteObservation"/> instead.
/// Note that this (and <see cref="GetCompressedObservation"/>) may
/// be called multiple times per agent step, so should not mutate any internal state.
/// <param name="adapater"></param>
/// <returns>The number of elements written</returns>
int Write(WriteAdapter adapater);
/// <param name="adapter">Where the observations will be written to.</param>
/// <returns>The number of elements written.</returns>
int Write(WriteAdapter adapter);
/// Return a compressed representation of the observation. For small observations, this should generally not be
/// implemented. However, compressing large observations (such as visual results) can significantly improve
/// model training time.
/// Return a compressed representation of the observation. For small observations,
/// this should generally not be implemented. However, compressing large observations
/// (such as visual results) can significantly improve model training time.
/// <returns></returns>
/// <returns>Compressed observation.</returns>
byte[] GetCompressedObservation();
/// <summary>

/// <summary>
/// Return the compression type being used. If no compression is used, return SensorCompressionType.None
/// Return the compression type being used. If no compression is used, return
/// <see cref="SensorCompressionType.None"/>.
/// <returns></returns>
/// <returns>Compression type used by the sensor.</returns>
/// Get the name of the sensor. This is used to ensure deterministic sorting of the sensors on an Agent,
/// so the naming must be consistent across all sensors and agents.
/// Get the name of the sensor. This is used to ensure deterministic sorting of the sensors
/// on an Agent, so the naming must be consistent across all sensors and agents.
/// <returns>The name of the sensor</returns>
/// <returns>The name of the sensor.</returns>
/// <summary>
/// Helper methods to be shared by all classes that implement <see cref="ISensor"/>.
/// </summary>
/// Get the total number of elements in the ISensor's observation (i.e. the product of the shape elements).
/// Get the total number of elements in the ISensor's observation (i.e. the product of the
/// shape elements).
/// </summary>
/// <param name="sensor"></param>
/// <returns></returns>

int count = 1;
for (var i = 0; i < shape.Length; i++)
var count = 1;
foreach (var dim in shape)
count *= shape[i];
count *= dim;
}
return count;

4
com.unity.ml-agents/Runtime/Sensor/Observation.cs


using System;
using UnityEngine;
namespace MLAgents.Sensor
namespace MLAgents
{
internal struct Observation
{

/// <summary>
/// The uncompressed dimensions of the data.
/// E.g. for RGB visual observations, this will be {Width, Height, 3}
/// E.g. for RGB visual observations, this will be {Height, Width, 3}
/// </summary>
public int[] Shape;
}

574
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensor.cs


using System.Collections.Generic;
using UnityEngine;
namespace MLAgents.Sensor
namespace MLAgents
public class RayPerceptionSensor : ISensor
/// <summary>
/// Determines which dimensions the sensor will perform the casts in.
/// </summary>
public enum RayPerceptionCastType
public enum CastType
{
Cast2D,
Cast3D,
}
/// <summary>
/// Cast in 2 dimensions, using Physics2D.CircleCast or Physics2D.RayCast.
/// </summary>
Cast2D,
float[] m_Observations;
int[] m_Shape;
string m_Name;
/// <summary>
/// Cast in 3 dimensions, using Physics.SphereCast or Physics.RayCast.
/// </summary>
Cast3D,
}
float m_RayDistance;
List<string> m_DetectableObjects;
float[] m_Angles;
/// <summary>
/// Contains the elements that define a ray perception sensor.
/// </summary>
public struct RayPerceptionInput
{
/// <summary>
/// Length of the rays to cast. This will be scaled up or down based on the scale of the transform.
/// </summary>
public float rayLength;
float m_StartOffset;
float m_EndOffset;
float m_CastRadius;
CastType m_CastType;
Transform m_Transform;
int m_LayerMask;
/// <summary>
/// List of tags which correspond to object types agent can see.
/// </summary>
public IReadOnlyList<string> detectableTags;
/// Debug information for the raycast hits. This is used by the RayPerceptionSensorComponent.
/// List of angles (in degrees) used to define the rays.
/// 90 degrees is considered "forward" relative to the game object.
public class DebugDisplayInfo
public IReadOnlyList<float> angles;
/// <summary>
/// Starting height offset of ray from center of agent
/// </summary>
public float startOffset;
/// <summary>
/// Ending height offset of ray from center of agent.
/// </summary>
public float endOffset;
/// <summary>
/// Radius of the sphere to use for spherecasting.
/// If 0 or less, rays are used instead - this may be faster, especially for complex environments.
/// </summary>
public float castRadius;
/// <summary>
/// Transform of the GameObject.
/// </summary>
public Transform transform;
/// <summary>
/// Whether to perform the casts in 2D or 3D.
/// </summary>
public RayPerceptionCastType castType;
/// <summary>
/// Filtering options for the casts.
/// </summary>
public int layerMask;
/// <summary>
/// Returns the expected number of floats in the output.
/// </summary>
/// <returns></returns>
public int OutputSize()
public struct RayInfo
return (detectableTags.Count + 2) * angles.Count;
}
/// <summary>
/// Get the cast start and end points for the given ray index.
/// </summary>
/// <param name="rayIndex"></param>
/// <returns>A tuple of the start and end positions in world space.</returns>
public (Vector3 StartPositionWorld, Vector3 EndPositionWorld) RayExtents(int rayIndex)
{
var angle = angles[rayIndex];
Vector3 startPositionLocal, endPositionLocal;
if (castType == RayPerceptionCastType.Cast3D)
public Vector3 localStart;
public Vector3 localEnd;
public Vector3 worldStart;
public Vector3 worldEnd;
public bool castHit;
public float hitFraction;
public float castRadius;
startPositionLocal = new Vector3(0, startOffset, 0);
endPositionLocal = PolarToCartesian3D(rayLength, angle);
endPositionLocal.y += endOffset;
public void Reset()
else
m_Frame = Time.frameCount;
// Vector2s here get converted to Vector3s (and back to Vector2s for casting)
startPositionLocal = new Vector2();
endPositionLocal = PolarToCartesian2D(rayLength, angle);
var startPositionWorld = transform.TransformPoint(startPositionLocal);
var endPositionWorld = transform.TransformPoint(endPositionLocal);
return (StartPositionWorld: startPositionWorld, EndPositionWorld: endPositionWorld);
}
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static internal Vector3 PolarToCartesian3D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var z = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector3(x, 0f, z);
}
/// <summary>
/// Converts polar coordinate to cartesian coordinate.
/// </summary>
static internal Vector2 PolarToCartesian2D(float radius, float angleDegrees)
{
var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
var y = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
return new Vector2(x, y);
}
}
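The PolarToCartesian3D helper above maps a ray angle onto the XZ plane, which is what makes 90 degrees "forward" relative to the game object, as the angles doc comment states. A hedged Python check of that convention (names are illustrative):

```python
import math

def polar_to_cartesian_3d(radius, angle_degrees):
    """Convert a planar polar coordinate to a 3D point on the XZ plane.

    With x = r*cos(theta) and z = r*sin(theta), an angle of 90 degrees
    lands on +Z ("forward"), 0 degrees on +X (the agent's right).
    """
    rad = math.radians(angle_degrees)
    return (radius * math.cos(rad), 0.0, radius * math.sin(rad))
```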
/// <summary>
/// Contains the data generated/produced from a ray perception sensor.
/// </summary>
public class RayPerceptionOutput
{
/// <summary>
/// Contains the data generated from a single ray of a ray perception sensor.
/// </summary>
public struct RayOutput
{
/// "Age" of the results in number of frames. This is used to adjust the alpha when drawing.
/// Whether or not the ray hit anything.
public int age
public bool hasHit;
/// <summary>
/// Whether or not the ray hit an object whose tag is in the input's detectableTags list.
/// </summary>
public bool hitTaggedObject;
/// <summary>
/// The index of the hit object's tag in the detectableTags list, or -1 if there was no hit, or the
/// hit object has a different tag.
/// </summary>
public int hitTagIndex;
/// <summary>
/// Normalized distance to the hit object.
/// </summary>
public float hitFraction;
/// <summary>
/// Writes the ray output information to a subset of the float array. Each element in the rayAngles array
/// determines a sublist of data to the observation. The sublist contains the observation data for a single cast.
/// The list is composed of the following:
/// 1. A one-hot encoding for detectable tags. For example, if detectableTags.Length = n, the
/// first n elements of the sublist will be a one-hot encoding of the detectableTag that was hit, or
/// all zeroes otherwise.
/// 2. The 'numDetectableTags' element of the sublist will be 1 if the ray missed everything, or 0 if it hit
/// something (detectable or not).
/// 3. The 'numDetectableTags+1' element of the sublist will contain the normalized distance to the object
/// hit, or 1.0 if nothing was hit.
/// </summary>
/// <param name="numDetectableTags"></param>
/// <param name="rayIndex"></param>
/// <param name="buffer">Output buffer. The size must be equal to (numDetectableTags+2) * rayOutputs.Length</param>
public void ToFloatArray(int numDetectableTags, int rayIndex, float[] buffer)
get { return Time.frameCount - m_Frame; }
var bufferOffset = (numDetectableTags + 2) * rayIndex;
if (hitTaggedObject)
{
buffer[bufferOffset + hitTagIndex] = 1f;
}
buffer[bufferOffset + numDetectableTags] = hasHit ? 0f : 1f;
buffer[bufferOffset + numDetectableTags + 1] = hitFraction;
}
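The per-ray layout documented for ToFloatArray — a one-hot over detectable tags, then a "missed everything" flag, then the normalized hit distance — can be sketched in Python. This is an illustrative mirror, not the C# API; note the buffer holds (numDetectableTags + 2) floats per ray, matching OutputSize above:

```python
def ray_to_floats(num_tags, ray_index, has_hit, hit_tag_index,
                  hit_fraction, buffer):
    """Write one ray's observation into its (num_tags + 2)-float sublist.

    Layout per ray:
      [0 .. num_tags-1]  one-hot of the detectable tag that was hit
      [num_tags]         1.0 if the ray missed everything, else 0.0
      [num_tags + 1]     normalized hit distance (1.0 when nothing hit)
    """
    offset = (num_tags + 2) * ray_index
    if has_hit and hit_tag_index >= 0:
        buffer[offset + hit_tag_index] = 1.0
    buffer[offset + num_tags] = 0.0 if has_hit else 1.0
    buffer[offset + num_tags + 1] = hit_fraction
```

For example, with 3 tags and 2 rays the buffer has 10 floats; a ray that hits tag 1 at a quarter of its length fills its sublist with [0, 1, 0, 0, 0.25].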
public RayInfo[] rayInfos;
/// <summary>
/// RayOutput for each ray that was cast.
/// </summary>
public RayOutput[] rayOutputs;
}
/// <summary>
/// Debug information for the raycast hits. This is used by the RayPerceptionSensorComponent.
/// </summary>
internal class DebugDisplayInfo
{
public struct RayInfo
{
public Vector3 worldStart;
public Vector3 worldEnd;
public float castRadius;
public RayPerceptionOutput.RayOutput rayOutput;
}
int m_Frame;
public void Reset()
{
m_Frame = Time.frameCount;
/// <summary>
/// "Age" of the results in number of frames. This is used to adjust the alpha when drawing.
/// </summary>
public int age
{
get { return Time.frameCount - m_Frame; }
}
public RayInfo[] rayInfos;
int m_Frame;
}
/// <summary>
/// A sensor implementation that supports ray cast-based observations.
/// </summary>
public class RayPerceptionSensor : ISensor
{
float[] m_Observations;
int[] m_Shape;
string m_Name;
RayPerceptionInput m_RayPerceptionInput;
public DebugDisplayInfo debugDisplayInfo
internal DebugDisplayInfo debugDisplayInfo
public RayPerceptionSensor(string name, float rayDistance, List<string> detectableObjects, float[] angles,
Transform transform, float startOffset, float endOffset, float castRadius, CastType castType,
int rayLayerMask)
/// <summary>
/// Creates the RayPerceptionSensor.
/// </summary>
/// <param name="name">The name of the sensor.</param>
/// <param name="rayInput">The inputs for the sensor.</param>
public RayPerceptionSensor(string name, RayPerceptionInput rayInput)
var numObservations = (detectableObjects.Count + 2) * angles.Length;
var numObservations = rayInput.OutputSize();
m_RayPerceptionInput = rayInput;
m_RayDistance = rayDistance;
m_DetectableObjects = detectableObjects;
// TODO - preprocess angles, save ray directions instead?
m_Angles = angles;
m_Transform = transform;
m_StartOffset = startOffset;
m_EndOffset = endOffset;
m_CastRadius = castRadius;
m_CastType = castType;
m_LayerMask = rayLayerMask;
if (Application.isEditor)
{
m_DebugDisplayInfo = new DebugDisplayInfo();

internal void SetRayPerceptionInput(RayPerceptionInput input)
{
// TODO make sure that number of rays and tags don't change
m_RayPerceptionInput = input;
}
/// <summary>
/// Computes the ray perception observations and saves them to the provided
/// <see cref="WriteAdapter"/>.
/// </summary>
/// <param name="adapter">Where the ray perception observations are written to.</param>
/// <returns></returns>
PerceiveStatic(
m_RayDistance, m_Angles, m_DetectableObjects, m_StartOffset, m_EndOffset,
m_CastRadius, m_Transform, m_CastType, m_Observations, m_LayerMask,
m_DebugDisplayInfo
);
Array.Clear(m_Observations, 0, m_Observations.Length);
var numRays = m_RayPerceptionInput.angles.Count;
var numDetectableTags = m_RayPerceptionInput.detectableTags.Count;
if (m_DebugDisplayInfo != null)
{
// Reset the age information, and resize the buffer if needed.
m_DebugDisplayInfo.Reset();
if (m_DebugDisplayInfo.rayInfos == null || m_DebugDisplayInfo.rayInfos.Length != numRays)
{
m_DebugDisplayInfo.rayInfos = new DebugDisplayInfo.RayInfo[numRays];
}
}
// For each ray, do the casting, and write the information to the observation buffer
for (var rayIndex = 0; rayIndex < numRays; rayIndex++)
{
DebugDisplayInfo.RayInfo debugRay;
var rayOutput = PerceiveSingleRay(m_RayPerceptionInput, rayIndex, out debugRay);
if (m_DebugDisplayInfo != null)
{
m_DebugDisplayInfo.rayInfos[rayIndex] = debugRay;
}
rayOutput.ToFloatArray(numDetectableTags, rayIndex, m_Observations);
}
// Finally, add the observations to the WriteAdapter
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
public virtual SensorCompressionType GetCompressionType()
{
return SensorCompressionType.None;

/// Evaluates a perception vector to be used as part of an observation of an agent.
/// Each element in the rayAngles array determines a sublist of data to the observation.
/// The sublist contains the observation data for a single cast. The list is composed of the following:
/// 1. A one-hot encoding for detectable objects. For example, if detectableObjects.Length = n, the
/// first n elements of the sublist will be a one-hot encoding of the detectableObject that was hit, or
/// all zeroes otherwise.
/// 2. The 'length' element of the sublist will be 1 if the ray missed everything, or 0 if it hit
/// something (detectable or not).
/// 3. The 'length+1' element of the sublist will contain the normalised distance to the object hit, or 1 if
/// nothing was hit.
///
/// Evaluates the raycasts to be used as part of an observation of an agent.
/// <param name="unscaledRayLength"></param>
/// <param name="rayAngles">List of angles (in degrees) used to define the rays. 90 degrees is considered
/// "forward" relative to the game object</param>
/// <param name="detectableObjects">List of tags which correspond to object types agent can see</param>
/// <param name="startOffset">Starting height offset of ray from center of agent.</param>
/// <param name="endOffset">Ending height offset of ray from center of agent.</param>
/// <param name="unscaledCastRadius">Radius of the sphere to use for spherecasting. If 0 or less, rays are used
/// instead - this may be faster, especially for complex environments.</param>
/// <param name="transform">Transform of the GameObject</param>
/// <param name="castType">Whether to perform the casts in 2D or 3D.</param>
/// <param name="perceptionBuffer">Output array of floats. Must be (num rays) * (num tags + 2) in size.</param>
/// <param name="layerMask">Filtering options for the casts</param>
/// <param name="debugInfo">Optional debug information output, only used by RayPerceptionSensor.</param>
///
public static void PerceiveStatic(float unscaledRayLength,
IReadOnlyList<float> rayAngles, IReadOnlyList<string> detectableObjects,
float startOffset, float endOffset, float unscaledCastRadius,
Transform transform, CastType castType, float[] perceptionBuffer,
int layerMask = Physics.DefaultRaycastLayers,
DebugDisplayInfo debugInfo = null)
/// <param name="input">Input defining the rays that will be cast.</param>
/// <returns>Output struct containing the raycast results.</returns>
public static RayPerceptionOutput PerceiveStatic(RayPerceptionInput input)
Array.Clear(perceptionBuffer, 0, perceptionBuffer.Length);
if (debugInfo != null)
RayPerceptionOutput output = new RayPerceptionOutput();
output.rayOutputs = new RayPerceptionOutput.RayOutput[input.angles.Count];
for (var rayIndex = 0; rayIndex < input.angles.Count; rayIndex++)
debugInfo.Reset();
if (debugInfo.rayInfos == null || debugInfo.rayInfos.Length != rayAngles.Count)
{
debugInfo.rayInfos = new DebugDisplayInfo.RayInfo[rayAngles.Count];
}
DebugDisplayInfo.RayInfo debugRay;
output.rayOutputs[rayIndex] = PerceiveSingleRay(input, rayIndex, out debugRay);
// For each ray sublist stores categorical information on detected object
// along with object distance.
int bufferOffset = 0;
for (var rayIndex = 0; rayIndex < rayAngles.Count; rayIndex++)
{
var angle = rayAngles[rayIndex];
Vector3 startPositionLocal, endPositionLocal;
if (castType == CastType.Cast3D)
{
startPositionLocal = new Vector3(0, startOffset, 0);
endPositionLocal = PolarToCartesian3D(unscaledRayLength, angle);
endPositionLocal.y += endOffset;
}
else
{
// Vector2s here get converted to Vector3s (and back to Vector2s for casting)
startPositionLocal = new Vector2();
endPositionLocal = PolarToCartesian2D(unscaledRayLength, angle);
}
return output;
}
var startPositionWorld = transform.TransformPoint(startPositionLocal);
var endPositionWorld = transform.TransformPoint(endPositionLocal);
/// <summary>
/// Evaluate the raycast results of a single ray from the RayPerceptionInput.
/// </summary>
/// <param name="input"></param>
/// <param name="rayIndex"></param>
/// <param name="debugRayOut"></param>
/// <returns></returns>
internal static RayPerceptionOutput.RayOutput PerceiveSingleRay(
RayPerceptionInput input,
int rayIndex,
out DebugDisplayInfo.RayInfo debugRayOut
)
{
var unscaledRayLength = input.rayLength;
var unscaledCastRadius = input.castRadius;
var rayDirection = endPositionWorld - startPositionWorld;
// If there is non-unity scale, |rayDirection| will be different from rayLength.
// We want to use this transformed ray length for determining cast length, hit fraction etc.
// We also use it to scale up or down the sphere or circle radii
var scaledRayLength = rayDirection.magnitude;
// Avoid 0/0 if unscaledRayLength is 0
var scaledCastRadius = unscaledRayLength > 0 ? unscaledCastRadius * scaledRayLength / unscaledRayLength : unscaledCastRadius;
var extents = input.RayExtents(rayIndex);
var startPositionWorld = extents.StartPositionWorld;
var endPositionWorld = extents.EndPositionWorld;
// Do the cast and assign the hit information for each detectable object.
// sublist[0 ] <- did hit detectableObjects[0]
// ...
// sublist[numObjects-1] <- did hit detectableObjects[numObjects-1]
// sublist[numObjects ] <- 1 if missed else 0
// sublist[numObjects+1] <- hit fraction (or 1 if no hit)
var rayDirection = endPositionWorld - startPositionWorld;
// If there is non-unity scale, |rayDirection| will be different from rayLength.
// We want to use this transformed ray length for determining cast length, hit fraction etc.
// We also use it to scale up or down the sphere or circle radii
var scaledRayLength = rayDirection.magnitude;
// Avoid 0/0 if unscaledRayLength is 0
var scaledCastRadius = unscaledRayLength > 0 ?
unscaledCastRadius * scaledRayLength / unscaledRayLength :
unscaledCastRadius;
bool castHit;
float hitFraction;
GameObject hitObject;
// Do the cast and assign the hit information for each detectable tag.
bool castHit;
float hitFraction;
GameObject hitObject;
if (castType == CastType.Cast3D)
if (input.castType == RayPerceptionCastType.Cast3D)
{
    RaycastHit rayHit;
    if (scaledCastRadius > 0f)
    {
        castHit = Physics.SphereCast(startPositionWorld, scaledCastRadius, rayDirection, out rayHit,
            scaledRayLength, input.layerMask);
    }
    else
    {
        castHit = Physics.Raycast(startPositionWorld, rayDirection, out rayHit,
            scaledRayLength, input.layerMask);
    }

    // If scaledRayLength is 0, we still could have a hit with sphere casts (maybe?).
    // To avoid 0/0, set the fraction to 0.
    hitFraction = castHit ? (scaledRayLength > 0 ? rayHit.distance / scaledRayLength : 0.0f) : 1.0f;
    hitObject = castHit ? rayHit.collider.gameObject : null;
}
else
{
    RaycastHit2D rayHit;
    if (scaledCastRadius > 0f)
    {
        rayHit = Physics2D.CircleCast(startPositionWorld, scaledCastRadius, rayDirection,
            scaledRayLength, input.layerMask);
    }
    else
    {
        rayHit = Physics2D.Raycast(startPositionWorld, rayDirection, scaledRayLength, input.layerMask);
    }

    castHit = rayHit;
    hitFraction = castHit ? rayHit.fraction : 1.0f;
    hitObject = castHit ? rayHit.collider.gameObject : null;
}

var rayOutput = new RayPerceptionOutput.RayOutput
{
    hasHit = castHit,
    hitFraction = hitFraction,
    hitTaggedObject = false,
    hitTagIndex = -1
};

if (castHit)
{
    // Find the index of the tag of the object that was hit.
    for (var i = 0; i < input.detectableTags.Count; i++)
    {
        if (hitObject.CompareTag(input.detectableTags[i]))
        {
            rayOutput.hitTaggedObject = true;
            rayOutput.hitTagIndex = i;
            break;
        }
    }
}
debugRayOut.worldStart = startPositionWorld;
debugRayOut.worldEnd = endPositionWorld;
debugRayOut.rayOutput = rayOutput;
debugRayOut.castRadius = scaledCastRadius;

return rayOutput;
}

/// <summary>
/// Converts a polar coordinate to a 3D cartesian coordinate (in the x-z plane).
/// </summary>
static Vector3 PolarToCartesian3D(float radius, float angleDegrees)
{
    var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
    var z = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
    return new Vector3(x, 0f, z);
}

/// <summary>
/// Converts a polar coordinate to a 2D cartesian coordinate.
/// </summary>
static Vector2 PolarToCartesian2D(float radius, float angleDegrees)
{
    var x = radius * Mathf.Cos(Mathf.Deg2Rad * angleDegrees);
    var y = radius * Mathf.Sin(Mathf.Deg2Rad * angleDegrees);
    return new Vector2(x, y);
}
}
}
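For reference, the per-ray `RayOutput` above is later flattened into the observation vector using the buffer layout visible in this diff: one one-hot entry per detectable tag, then a "nothing hit" flag, then the hit fraction. A minimal Python sketch of that encoding (the function name is illustrative, not part of the API):

```python
def encode_ray(hit, hit_tag_index, hit_fraction, num_tags):
    """Encode one ray as (num_tags + 2) floats:
    one-hot over tags, then a miss flag, then the hit fraction."""
    buf = [0.0] * (num_tags + 2)
    if not hit:
        buf[num_tags] = 1.0      # nothing was hit: full clearance ahead
        buf[num_tags + 1] = 1.0  # hit fraction of 1 means "no hit"
    else:
        if 0 <= hit_tag_index < num_tags:
            buf[hit_tag_index] = 1.0  # tagged object was hit
        # Something was hit (tagged or not), so record the hit fraction.
        buf[num_tags + 1] = hit_fraction
    return buf
```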

13
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent2D.cs


using UnityEngine;

namespace MLAgents
{
    /// <summary>
    /// A component for 2D Ray Perception.
    /// </summary>

        /// <summary>
        /// Initializes the raycast sensor component.
        /// </summary>
        public RayPerceptionSensorComponent2D()
        {
            // Set to the 2D defaults (just in case they ever diverge).

        /// <inheritdoc/>
        public override RayPerceptionCastType GetCastType()
        {
            return RayPerceptionCastType.Cast2D;
        }
    }
}

43
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponent3D.cs


using System;
using UnityEngine.Serialization;

namespace MLAgents
{
    /// <summary>
    /// A component for 3D Ray Perception.
    /// </summary>

        [Header("3D Properties", order = 100)]
        [HideInInspector]
        [SerializeField]
        [FormerlySerializedAs("startVerticalOffset")]
        float m_StartVerticalOffset;

        /// <summary>
        /// Ray start is offset up or down by this amount.
        /// </summary>
        public float startVerticalOffset
        {
            get => m_StartVerticalOffset;
            set { m_StartVerticalOffset = value; UpdateSensor(); }
        }

        [HideInInspector]
        [SerializeField]
        [FormerlySerializedAs("endVerticalOffset")]
        float m_EndVerticalOffset;

        /// <summary>
        /// Ray end is offset up or down by this amount.
        /// </summary>
        public float endVerticalOffset
        {
            get => m_EndVerticalOffset;
            set { m_EndVerticalOffset = value; UpdateSensor(); }
        }

        /// <inheritdoc/>
        public override RayPerceptionCastType GetCastType()
        {
            return RayPerceptionCastType.Cast3D;
        }

        /// <inheritdoc/>
        public override float GetEndVerticalOffset()
        {
            return endVerticalOffset;
        }

281
com.unity.ml-agents/Runtime/Sensor/RayPerceptionSensorComponentBase.cs


using System;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Serialization;
namespace MLAgents
{
    /// <summary>
    /// A base class to support sensor components for raycast-based sensors.
    /// </summary>
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("sensorName")]
string m_SensorName = "RayPerceptionSensor";
/// <summary>
/// The name of the Sensor that this component wraps.
/// </summary>
public string sensorName
{
get => m_SensorName;
// Restrict the access on the name, since changing it a runtime doesn't re-sort the Agent sensors.
internal set => m_SensorName = value;
}
[SerializeField]
[FormerlySerializedAs("detectableTags")]
List<string> m_DetectableTags;
/// <summary>
/// List of tags in the scene to compare against.
/// </summary>
public List<string> detectableTags
{
get => m_DetectableTags;
// Note: can't change at runtime
internal set => m_DetectableTags = value;
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("raysPerDirection")]
int m_RaysPerDirection = 3;
/// <summary>
/// Number of rays to the left and right of center.
/// </summary>
public int raysPerDirection
{
get => m_RaysPerDirection;
// Note: can't change at runtime
internal set => m_RaysPerDirection = value;
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("maxRayDegrees")]
[Tooltip("Cone size for rays. Using 90 degrees will cast rays to the left and right. " +
    "Greater than 90 degrees will go backwards.")]
float m_MaxRayDegrees = 70;
/// <summary>
/// Cone size for rays. Using 90 degrees will cast rays to the left and right.
/// Greater than 90 degrees will go backwards.
/// </summary>
public float maxRayDegrees
{
get => m_MaxRayDegrees;
set { m_MaxRayDegrees = value; UpdateSensor(); }
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("sphereCastRadius")]
float m_SphereCastRadius = 0.5f;
/// <summary>
/// Radius of sphere to cast. Set to zero for raycasts.
/// </summary>
public float sphereCastRadius
{
get => m_SphereCastRadius;
set { m_SphereCastRadius = value; UpdateSensor(); }
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("rayLength")]
float m_RayLength = 20f;
/// <summary>
/// Length of the rays to cast.
/// </summary>
public float rayLength
{
get => m_RayLength;
set { m_RayLength = value; UpdateSensor(); }
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("rayLayerMask")]
LayerMask m_RayLayerMask = Physics.DefaultRaycastLayers;
/// <summary>
/// Controls which layers the rays can hit.
/// </summary>
public LayerMask rayLayerMask
{
get => m_RayLayerMask;
set { m_RayLayerMask = value; UpdateSensor(); }
}
[HideInInspector]
[SerializeField]
[FormerlySerializedAs("observationStacks")]
int m_ObservationStacks = 1;
/// <summary>
/// Whether to stack previous observations. Using 1 means no previous observations.
/// </summary>
internal int observationStacks
{
get => m_ObservationStacks;
set => m_ObservationStacks = value; // Note: can't change at runtime
}
/// <summary>
/// Color to code a ray that hits another object.
/// </summary>
[HideInInspector]
[SerializeField]
internal Color rayHitColor = Color.red;

/// <summary>
/// Color to code a ray that avoids or misses all other objects.
/// </summary>
[HideInInspector]
[SerializeField]
internal Color rayMissColor = Color.white;
/// <summary>
/// Returns the <see cref="RayPerceptionCastType"/> for the associated raycast sensor.
/// </summary>
/// <returns></returns>
public abstract RayPerceptionCastType GetCastType();
/// <summary>
/// Returns the amount that the ray start is offset up or down by.
/// </summary>
/// <returns></returns>
/// <summary>
/// Returns the amount that the ray end is offset up or down by.
/// </summary>
/// <returns></returns>
/// <summary>
/// Returns an initialized raycast sensor.
/// </summary>
/// <returns></returns>
var rayPerceptionInput = GetRayPerceptionInput();

m_RaySensor = new RayPerceptionSensor(m_SensorName, rayPerceptionInput);
if (observationStacks != 1)
{

return m_RaySensor;
}
/// <summary>
/// Returns the specific ray angles given the number of rays per direction and the
/// cone size for the rays.
/// </summary>
/// <param name="raysPerDirection">Number of rays to the left and right of center.</param>
/// <param name="maxRayDegrees">
/// Cone size for rays. Using 90 degrees will cast rays to the left and right.
/// Greater than 90 degrees will go backwards.
/// </param>
/// <returns></returns>
public static float[] GetRayAngles(int raysPerDirection, float maxRayDegrees)
{
// Example:

return anglesOut;
}
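The shape of the angle array can be sketched from the doc comment above: a center ray at 90 degrees, with pairs fanning out by `maxRayDegrees / raysPerDirection` on each side. A hedged Python sketch (the exact ordering produced by `GetRayAngles` may differ):

```python
def get_ray_angles(rays_per_direction, max_ray_degrees):
    # Center ray at 90 degrees, then pairs fanning out symmetrically.
    delta = max_ray_degrees / rays_per_direction
    angles = [90.0]
    for i in range(1, rays_per_direction + 1):
        angles.append(90.0 - i * delta)
        angles.append(90.0 + i * delta)
    return angles
```

With `raysPerDirection = 3` and `maxRayDegrees = 90` this yields 7 angles covering the full half-circle, matching the tooltip's note that values above 90 degrees cast rays behind the agent.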
/// <summary>
/// Returns the observation shape for this raycast sensor which depends on the number
/// of tags for detected objects and the number of rays.
/// </summary>
/// <returns></returns>
var numTags = m_DetectableTags?.Count ?? 0;
RayPerceptionInput GetRayPerceptionInput()
{
    var rayAngles = GetRayAngles(raysPerDirection, maxRayDegrees);

    var rayPerceptionInput = new RayPerceptionInput();
    rayPerceptionInput.rayLength = rayLength;
    rayPerceptionInput.detectableTags = detectableTags;
    rayPerceptionInput.angles = rayAngles;
    rayPerceptionInput.startOffset = GetStartVerticalOffset();
    rayPerceptionInput.endOffset = GetEndVerticalOffset();
    rayPerceptionInput.castRadius = sphereCastRadius;
    rayPerceptionInput.transform = transform;
    rayPerceptionInput.castType = GetCastType();
    rayPerceptionInput.layerMask = rayLayerMask;

    return rayPerceptionInput;
}

internal void UpdateSensor()
{
    if (m_RaySensor != null)
    {
        var rayInput = GetRayPerceptionInput();
        m_RaySensor.SetRayPerceptionInput(rayInput);
    }
}
void OnDrawGizmosSelected()
{
    if (m_RaySensor?.debugDisplayInfo?.rayInfos != null)
    {
        // If we have cached debug info from the sensor, draw that.
        // Draw "old" observations in a lighter color.
        // Since the agent may not step every frame, this helps de-emphasize "stale" hit information.
        var alpha = Mathf.Pow(.5f, m_RaySensor.debugDisplayInfo.age);
        foreach (var rayInfo in m_RaySensor.debugDisplayInfo.rayInfos)
        {
            DrawRaycastGizmos(rayInfo, alpha);
        }
    }
    else
    {
        var rayInput = GetRayPerceptionInput();
        for (var rayIndex = 0; rayIndex < rayInput.angles.Count; rayIndex++)
        {
            DebugDisplayInfo.RayInfo debugRay;
            RayPerceptionSensor.PerceiveSingleRay(rayInput, rayIndex, out debugRay);
            DrawRaycastGizmos(debugRay);
        }
    }
}

/// <summary>
/// Draw the debug information from the sensor (if available).
/// </summary>
void DrawRaycastGizmos(DebugDisplayInfo.RayInfo rayInfo, float alpha = 1.0f)
{
    var startPositionWorld = rayInfo.worldStart;
    var endPositionWorld = rayInfo.worldEnd;
    var rayDirection = endPositionWorld - startPositionWorld;
    rayDirection *= rayInfo.rayOutput.hitFraction;

    // hit fraction ^2 will shift "far" hits closer to the hit color
    var lerpT = rayInfo.rayOutput.hitFraction * rayInfo.rayOutput.hitFraction;
    var color = Color.Lerp(rayHitColor, rayMissColor, lerpT);
    color.a *= alpha;
    Gizmos.color = color;
    Gizmos.DrawRay(startPositionWorld, rayDirection);

    // Draw the hit point as a sphere. If using rays to cast (0 radius), use a small sphere.
    if (rayInfo.rayOutput.hasHit)
    {
        var hitRadius = Mathf.Max(rayInfo.castRadius, .05f);
        Gizmos.DrawWireSphere(startPositionWorld + rayDirection, hitRadius);
    }
}
}
}
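Putting the pieces together, the observation size of this component follows from the per-ray encoding: each ray contributes one float per detectable tag plus two (the miss flag and the hit fraction), there are `2 * raysPerDirection + 1` rays, and observation stacking multiplies the total. An illustrative Python sketch (the helper name is not part of the API):

```python
def observation_size(num_tags, rays_per_direction, stacks=1):
    # Each ray: one-hot over tags + miss flag + hit fraction.
    per_ray = num_tags + 2
    # One center ray plus a pair per direction step.
    num_rays = 2 * rays_per_direction + 1
    return per_ray * num_rays * stacks
```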

23
com.unity.ml-agents/Runtime/Sensor/RenderTextureSensor.cs


using System;
namespace MLAgents
/// <summary>
/// Sensor class that wraps a <see cref="RenderTexture"/> instance.
/// </summary>
public class RenderTextureSensor : ISensor
{
RenderTexture m_RenderTexture;

SensorCompressionType m_CompressionType;
/// <summary>
/// Initializes the sensor.
/// </summary>
/// <param name="renderTexture">The <see cref="RenderTexture"/> instance to wrap.</param>
/// <param name="grayscale">Whether to convert it to grayscale or not.</param>
/// <param name="name">Name of the sensor.</param>
/// <param name="compressionType">Compression method for the render texture.</param>
public RenderTextureSensor(
RenderTexture renderTexture, bool grayscale, string name, SensorCompressionType compressionType)
{
m_RenderTexture = renderTexture;
var width = renderTexture != null ? renderTexture.width : 0;

m_CompressionType = compressionType;
}
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
public byte[] GetCompressedObservation()
{
using (TimerStack.Instance.Scoped("RenderTexSensor.GetCompressedObservation"))

}
}
/// <inheritdoc/>
public int Write(WriteAdapter adapter)
{
using (TimerStack.Instance.Scoped("RenderTexSensor.GetCompressedObservation"))

}
}
/// <inheritdoc/>
/// <inheritdoc/>
public SensorCompressionType GetCompressionType()
{
return m_CompressionType;

24
com.unity.ml-agents/Runtime/Sensor/RenderTextureSensorComponent.cs


using System;
namespace MLAgents
/// <summary>
/// Component that wraps a <see cref="RenderTextureSensor"/>.
/// </summary>
/// <summary>
/// The <see cref="RenderTexture"/> instance that the associated
/// <see cref="RenderTextureSensor"/> wraps.
/// </summary>
/// <summary>
/// Name of the sensor.
/// </summary>
/// <summary>
/// Whether the RenderTexture observation should be converted to grayscale or not.
/// </summary>
/// <summary>
/// Compression type for the render texture observation.
/// </summary>
/// <inheritdoc/>
/// <inheritdoc/>
public override int[] GetObservationShape()
{
var width = renderTexture != null ? renderTexture.width : 0;

21
com.unity.ml-agents/Runtime/Sensor/SensorBase.cs


using UnityEngine;
namespace MLAgents
/// <summary>
/// A base sensor that provides a number of default implementations.
/// </summary>
/// Write the observations to the output buffer. The size of the buffer will be the product
/// of the sizes returned by <see cref="GetObservationShape"/>.
/// <inheritdoc/>
/// <inheritdoc/>
/// Default implementation of the Write interface. This creates a temporary array,
/// calls WriteObservation, and then writes the results to the WriteAdapter.
/// <returns>The number of elements written.</returns>
public virtual int Write(WriteAdapter adapter)
{
// TODO reuse buffer for similar agents, don't call GetObservationShape()

return numFloats;
}
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
public virtual SensorCompressionType GetCompressionType()
{
return SensorCompressionType.None;

19
com.unity.ml-agents/Runtime/Sensor/SensorComponent.cs


using System;
namespace MLAgents
/// Editor components for creating Sensors. Generally an ISensor implementation should have a
/// corresponding SensorComponent to create it.
/// </summary>
public abstract class SensorComponent : MonoBehaviour
{

/// <returns>Created ISensor object.</returns>
/// <returns>Shape of the sensor observation.</returns>
/// <summary>
/// Whether the observation is visual or not.
/// </summary>
/// <returns>True if the observation is visual, false otherwise.</returns>
public virtual bool IsVisual()
{
var shape = GetObservationShape();

/// <summary>
/// Whether the observation is vector or not.
/// </summary>
/// <returns>True if the observation is vector, false otherwise.</returns>
public virtual bool IsVector()
{
var shape = GetObservationShape();

2
com.unity.ml-agents/Runtime/Sensor/SensorShapeValidator.cs


using System.Collections.Generic;
using UnityEngine;
namespace MLAgents
{
internal class SensorShapeValidator
{

6
com.unity.ml-agents/Runtime/Sensor/StackingSensor.cs


namespace MLAgents
{
/// <summary>
/// Sensor that wraps around another Sensor to provide temporal stacking.

/// <summary>
/// Initializes the sensor.
/// </summary>
/// <param name="wrapped">The wrapped sensor.</param>
/// <param name="numStackedObservations">Number of stacked observations to keep.</param>
public StackingSensor(ISensor wrapped, int numStackedObservations)
{
// TODO ensure numStackedObservations > 1

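The temporal stacking that `StackingSensor` provides can be sketched with a fixed-length queue: keep the last N observations and concatenate them, with missing history zero-padded. A Python sketch under the assumption that older observations come first (the actual sensor's layout may differ; the class name is illustrative):

```python
from collections import deque


class StackingSketch:
    """Illustrative sketch of temporal observation stacking."""

    def __init__(self, obs_size, num_stacked):
        # Pre-fill with zeros so early steps still produce a full-size output.
        self.stack = deque(
            [[0.0] * obs_size for _ in range(num_stacked)],
            maxlen=num_stacked,
        )

    def update(self, obs):
        # Push the newest observation; the oldest falls off automatically.
        self.stack.append(list(obs))
        # Flatten: oldest frame first, newest frame last.
        return [x for frame in self.stack for x in frame]
```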
30
com.unity.ml-agents/Runtime/Sensor/VectorSensor.cs


using System.Collections.Generic;
using UnityEngine;
namespace MLAgents
/// <summary>
/// A sensor implementation for vector observations.
/// </summary>
// TODO allow setting float[]
/// <summary>
/// Initializes the sensor.
/// </summary>
/// <param name="observationSize">Number of vector observations.</param>
/// <param name="name">Name of the sensor.</param>
public VectorSensor(int observationSize, string name = null)
{
if (name == null)

m_Shape = new[] { observationSize };
}
/// <inheritdoc/>
public int Write(WriteAdapter adapter)
{
var expectedObservations = m_Shape[0];

return expectedObservations;
}
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
public virtual SensorCompressionType GetCompressionType()
{
return SensorCompressionType.None;

void AddFloatObs(float obs)
{
#if DEBUG
Utilities.DebugCheckNanAndInfinity(obs, nameof(obs), nameof(AddFloatObs));
#endif
m_Observations.Add(obs);
}

/// <summary>
/// Adds a boolean observation to the vector observation of the agent.
/// </summary>
/// <param name="observation">Observation.</param>
/// <summary>
/// Adds a one-hot encoding observation.
/// </summary>
/// <param name="observation">The index of this observation.</param>
/// <param name="range">The max index for any observation.</param>
public void AddOneHotObservation(int observation, int range)
{
for (var i = 0; i < range; i++)

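The one-hot encoding documented above writes `range` floats, with a 1 at the observation index and 0 elsewhere. A minimal Python sketch:

```python
def one_hot(observation, rng):
    # 1.0 at the observation index, 0.0 everywhere else.
    return [1.0 if i == observation else 0.0 for i in range(rng)]
```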
16
com.unity.ml-agents/Runtime/Sensor/WriteAdapter.cs


using Barracuda;
using MLAgents.InferenceBrain;
namespace MLAgents
{
/// <summary>
/// Allows sensors to write to both TensorProxy and float arrays/lists.

int m_Batch;
TensorShape m_TensorShape;
internal WriteAdapter() { }
/// <summary>
/// Set the adapter to write to an IList at the given channelOffset.

/// <param name="offset">Offset from the start of the float data to write to.</param>
internal void SetTarget(IList<float> data, int[] shape, int offset)
{
m_Data = data;
m_Offset = offset;

/// <summary>
/// Set the adapter to write to a TensorProxy at the given batch and channel offset.
/// </summary>
/// <param name="tensorProxy">Tensor proxy that will be written to.</param>
/// <param name="batchIndex">Batch index in the tensor proxy (i.e. the index of the Agent).</param>
internal void SetTarget(TensorProxy tensorProxy, int batchIndex, int channelOffset)
{
m_Proxy = tensorProxy;
m_Batch = batchIndex;

/// <summary>
/// 1D write access at a specified index. Use AddRange if possible instead.
/// </summary>
/// <param name="index">Index to write to.</param>
public float this[int index]
{
set

/// Write the range of floats.
/// </summary>
/// <param name="data">The data to write.</param>
/// <param name="writeOffset">Optional write offset.</param>
public void AddRange(IEnumerable<float> data, int writeOffset = 0)
{
if (m_Data != null)

14
com.unity.ml-agents/Runtime/SideChannel/EngineConfigurationChannel.cs


using System.IO;
using System;
/// <summary>
/// Side channel that supports modifying attributes specific to the Unity Engine.
/// </summary>
private const string k_EngineConfigId = "e951342c-4f7e-11ea-b238-784f4387d1f7";

/// <summary>
/// Initializes the side channel.
/// </summary>
public EngineConfigurationChannel()
{
    ChannelId = new Guid(k_EngineConfigId);
}
/// <inheritdoc/>
public override void OnMessageReceived(byte[] data)
{
using (var memStream = new MemoryStream(data))

27
com.unity.ml-agents/Runtime/SideChannel/FloatPropertiesChannel.cs


namespace MLAgents
{
/// <summary>
/// Interface for managing a collection of float properties keyed by a string variable.
/// </summary>
public interface IFloatProperties
{
/// <summary>

IList<string> ListProperties();
}
/// <summary>
/// Side channel that is comprised of a collection of float variables, represented by
/// <see cref="IFloatProperties"/>.
/// </summary>
private const string k_FloatPropertiesDefaultId = "60ccf7d0-4f7e-11ea-b238-784f4387d1f7";

/// <summary>
/// Initializes the side channel with the provided channel ID.
/// </summary>
/// <param name="channelId">ID for the side channel.</param>
public FloatPropertiesChannel(Guid channelId = default(Guid))
{
    if (channelId == default(Guid))
    {
        ChannelId = new Guid(k_FloatPropertiesDefaultId);
    }
    else
    {
        ChannelId = channelId;
    }
}
/// <inheritdoc/>
public override void OnMessageReceived(byte[] data)
{
var kv = DeserializeMessage(data);

}
}
/// <inheritdoc/>
public void SetProperty(string key, float value)
{
m_FloatProperties[key] = value;

}
}
/// <inheritdoc/>
public float GetPropertyWithDefault(string key, float defaultValue)
{
if (m_FloatProperties.ContainsKey(key))

}
}
/// <inheritdoc/>
/// <inheritdoc/>
public IList<string> ListProperties()
{
return new List<string>(m_FloatProperties.Keys);

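The property accessors in this channel behave like a dictionary with a fallback default. An illustrative Python sketch (the class and method names here are hypothetical, not the mlagents_envs API):

```python
class FloatPropertiesSketch:
    """Sketch of the float-properties side channel's dictionary behavior."""

    def __init__(self):
        self._props = {}

    def set_property(self, key, value):
        self._props[key] = float(value)

    def get_property_with_default(self, key, default):
        # Return the stored value if present, otherwise the caller's default.
        return self._props.get(key, default)

    def list_properties(self):
        return list(self._props.keys())
```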
17
com.unity.ml-agents/Runtime/SideChannel/RawBytesChannel.cs


using System.Collections.Generic;
using System;
/// <summary>
/// Side channel for managing raw bytes of data. It is up to the clients of this side channel
/// to interpret the messages.
/// </summary>
/// <summary>
/// RawBytesChannel provides a way to exchange raw byte arrays between Unity and Python.

public RawBytesChannel(Guid channelId)
{
    ChannelId = channelId;
}
/// <inheritdoc/>
public override void OnMessageReceived(byte[] data)
{
m_MessagesReceived.Add(data);

29
com.unity.ml-agents/Runtime/SideChannel/SideChannel.cs


using System.Collections.Generic;
using System;
public enum SideChannelType
{
// Invalid side channel
Invalid = 0,
// Reserved for the FloatPropertiesChannel.
FloatProperties = 1,
// Reserved for the EngineConfigurationChannel.
EngineSettings = 2,
// Raw bytes channels should start here to avoid conflicting with other Unity ones.
RawBytesChannelStart = 1000,
// custom side channels should start here to avoid conflicting with Unity ones.
UserSideChannelStart = 2000,
}
/// <summary>
/// Side channels provide an alternative mechanism of sending/receiving data from Unity
/// to Python that is outside of the traditional machine learning loop. ML-Agents provides
/// some specific implementations of side channels, but users can create their own.
/// </summary>
internal List<byte[]> MessageQueue = new List<byte[]>();

/// <summary>
/// The unique identifier of the SideChannel.
/// </summary>
public Guid ChannelId
{
    get;
    protected set;
}
/// <summary>
/// Is called by the communicator every time a message is received from Python by the SideChannel.

18
com.unity.ml-agents/Runtime/Timer.cs


static double s_TicksToSeconds = 1e-7; // 100 ns per tick
/// <summary>
/// Full name of the node. This is the node's parent's full name concatenated with this
/// node's name.
/// </summary>
string m_FullName;

Reset();
}
/// <summary>
/// Resets the timer stack and the root node.
/// </summary>
/// <param name="name">Name of the root node.</param>
public void Reset(string name = "root")
{
m_Stack = new Stack<TimerNode>();

/// <summary>
/// The singleton <see cref="TimerStack"/> instance.
/// </summary>
public static TimerStack Instance
{
get { return k_Instance; }

get { return m_RootNode; }
}
/// <summary>
/// Whether or not new timers and gauges can be added.
/// </summary>
public bool Recording
{
get { return m_Recording; }

/// <summary>
/// Updates the referenced gauge in the root node with the provided value.
/// </summary>
/// <param name="name">The name of the Gauge to modify.</param>
/// <param name="value">The value to update the Gauge with.</param>
public void SetGauge(string name, float value)
{
if (!Recording)

9
com.unity.ml-agents/Runtime/UnityAgentsException.cs


namespace MLAgents
{
/// <summary>
/// </summary>
/// <summary>
/// </summary>
/// <param name="message">The exception message</param>
/// <summary>
/// </summary>
/// <param name="info">Data for serializing/de-serializing</param>
/// <param name="context">Describes the source and destination of the serialized stream</param>
protected UnityAgentsException(
System.Runtime.Serialization.SerializationInfo info,
System.Runtime.Serialization.StreamingContext context)

84
com.unity.ml-agents/Runtime/Utilities.cs


using System;
using System.Collections.Generic;
using MLAgents.Sensor;
namespace MLAgents
{

/// being stored in the tensor.
/// </param>
/// <returns>The number of floats written</returns>
internal static int TextureToTensorProxy(
Texture2D texture,
WriteAdapter adapter,
bool grayScale)

/// Input array whose elements will be cumulatively added
/// </param>
/// <returns> The cumulative sum of the input array.</returns>
internal static int[] CumSum(int[] input)
{
var runningSum = 0;
var result = new int[input.Length + 1];

return result;
}
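`CumSum` returns an array one element longer than the input, where entry `i` holds the sum of the first `i` inputs (so entry 0 is always 0). A Python sketch mirroring that contract:

```python
def cum_sum(values):
    # result[i] is the sum of values[:i]; one extra leading element.
    result = [0] * (len(values) + 1)
    running = 0
    for i, v in enumerate(values):
        running += v
        result[i + 1] = running
    return result
```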
#if DEBUG
internal static void DebugCheckNanAndInfinity(float value, string valueCategory, string caller)
{
    if (float.IsNaN(value))
    {
        throw new ArgumentException($"NaN {valueCategory} passed to {caller}.");
    }

    if (float.IsInfinity(value))
    {
        throw new ArgumentException($"Infinity {valueCategory} passed to {caller}.");
    }
}
#endif
}
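The DEBUG-only NaN/Infinity guard in `Utilities` can be sketched in Python as:

```python
import math


def check_nan_and_infinity(value, value_category, caller):
    # Reject NaN and infinite values before they enter the observation vector.
    if math.isnan(value):
        raise ValueError(f"NaN {value_category} passed to {caller}.")
    if math.isinf(value):
        raise ValueError(f"Infinity {value_category} passed to {caller}.")
```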

61
com.unity.ml-agents/Tests/Editor/DemonstrationTests.cs


using System.IO.Abstractions.TestingHelpers;
using System.Reflection;
using MLAgents.CommunicatorObjects;
using MLAgents.Sensor;
namespace MLAgents.Tests
{

const string k_DemoDirecory = "Assets/Demonstrations/";
const string k_DemoDirectory = "Assets/Demonstrations/";
const string k_ExtensionType = ".demo";
const string k_DemoName = "Test";

}
[Test]
public void TestStoreInitalize()
public void TestStoreInitialize()
var demoStore = new DemonstrationStore(fileSystem);
var gameobj = new GameObject("gameObj");
var bp = gameobj.AddComponent<BehaviorParameters>();
bp.brainParameters.vectorObservationSize = 3;
bp.brainParameters.numStackedVectorObservations = 2;
bp.brainParameters.vectorActionDescriptions = new[] { "TestActionA", "TestActionB" };
bp.brainParameters.vectorActionSize = new[] { 2, 2 };
bp.brainParameters.vectorActionSpaceType = SpaceType.Discrete;
Assert.IsFalse(fileSystem.Directory.Exists(k_DemoDirecory));
var agent = gameobj.AddComponent<TestAgent>();
var brainParameters = new BrainParameters
{
vectorObservationSize = 3,
numStackedVectorObservations = 2,
vectorActionDescriptions = new[] { "TestActionA", "TestActionB" },
vectorActionSize = new[] { 2, 2 },
vectorActionSpaceType = SpaceType.Discrete
};
Assert.IsFalse(fileSystem.Directory.Exists(k_DemoDirectory));
demoStore.Initialize(k_DemoName, brainParameters, "TestBrain");
var demoRec = gameobj.AddComponent<DemonstrationRecorder>();
demoRec.record = true;
demoRec.demonstrationName = k_DemoName;
demoRec.demonstrationDirectory = k_DemoDirectory;
var demoWriter = demoRec.LazyInitialize(fileSystem);
Assert.IsTrue(fileSystem.Directory.Exists(k_DemoDirecory));
Assert.IsTrue(fileSystem.FileExists(k_DemoDirecory + k_DemoName + k_ExtensionType));
Assert.IsTrue(fileSystem.Directory.Exists(k_DemoDirectory));
Assert.IsTrue(fileSystem.FileExists(k_DemoDirectory + k_DemoName + k_ExtensionType));
var agentInfo = new AgentInfo
{

storedVectorActions = new[] { 0f, 1f },
};
demoStore.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
demoStore.Close();
demoWriter.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
demoRec.Close();
// Make sure close can be called multiple times
demoWriter.Close();
demoRec.Close();
// Make sure trying to write after closing doesn't raise an error.
demoWriter.Record(agentInfo, new System.Collections.Generic.List<ISensor>());
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(1f);
AddVectorObs(2f);
AddVectorObs(3f);
sensor.AddObservation(1f);
sensor.AddObservation(2f);
sensor.AddObservation(3f);
}
}

agentGo1.AddComponent<DemonstrationRecorder>();
var demoRecorder = agentGo1.GetComponent<DemonstrationRecorder>();
var fileSystem = new MockFileSystem();
demoRecorder.demonstrationDirectory = k_DemoDirectory;
demoRecorder.InitializeDemoStore(fileSystem);
demoRecorder.LazyInitialize(fileSystem);
var agentEnableMethod = typeof(Agent).GetMethod("OnEnable",
BindingFlags.Instance | BindingFlags.NonPublic);

// Read back the demo file and make sure observations were written
var reader = fileSystem.File.OpenRead("Assets/Demonstrations/TestBrain.demo");
reader.Seek(DemonstrationStore.MetaDataBytes + 1, 0);
reader.Seek(DemonstrationWriter.MetaDataBytes + 1, 0);
BrainParametersProto.Parser.ParseDelimitedFrom(reader);
var agentInfoProto = AgentInfoActionPairProto.Parser.ParseDelimitedFrom(reader).AgentInfo;

1
com.unity.ml-agents/Tests/Editor/EditModeTestInternalBrainTensorApplier.cs


{
class TestAgent : Agent
{
}
[Test]

114
com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs


using UnityEngine;
using NUnit.Framework;
using System.Reflection;
using MLAgents.Sensor;
public void RequestDecision(AgentInfo info, List<ISensor> sensors) { }
public void RequestDecision(AgentInfo info, List<ISensor> sensors) {}
public void Dispose() { }
public void Dispose() {}
}
public class TestAgent : Agent

sensors.Add(sensor1);
}
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(0f);
sensor.AddObservation(0f);
}
public override void AgentAction(float[] vectorAction)

public override void AgentReset()
{
agentResetCalls += 1;
collectObservationsCallsSinceLastReset = 0;
agentActionCallsSinceLastReset = 0;

return sensorName;
}
public void Update() { }
public void Update() {}
}
[TestFixture]

{
var aca = Academy.Instance;
Assert.AreNotEqual(null, aca);
Assert.AreEqual(0, aca.GetEpisodeCount());
Assert.AreEqual(0, aca.GetStepCount());
Assert.AreEqual(0, aca.GetTotalStepCount());
Assert.AreEqual(0, aca.EpisodeCount);
Assert.AreEqual(0, aca.StepCount);
Assert.AreEqual(0, aca.TotalStepCount);
}
[Test]

Assert.AreEqual(true, Academy.IsInitialized);
// Check that init is idempotent
aca.LazyInitialization();
aca.LazyInitialization();
aca.LazyInitialize();
aca.LazyInitialize();
Assert.AreEqual(0, aca.GetEpisodeCount());
Assert.AreEqual(0, aca.GetStepCount());
Assert.AreEqual(0, aca.GetTotalStepCount());
Assert.AreEqual(0, aca.EpisodeCount);
Assert.AreEqual(0, aca.StepCount);
Assert.AreEqual(0, aca.TotalStepCount);
Assert.AreNotEqual(null, aca.FloatProperties);
// Check that Dispose is idempotent

var numberReset = 0;
for (var i = 0; i < 10; i++)
{
Assert.AreEqual(numberReset, aca.GetEpisodeCount());
Assert.AreEqual(i, aca.GetStepCount());
Assert.AreEqual(numberReset, aca.EpisodeCount);
Assert.AreEqual(i, aca.StepCount);
// The reset happens at the beginning of the first step
if (i == 0)

public void TestAcademyAutostep()
{
var aca = Academy.Instance;
Assert.IsTrue(aca.IsAutomaticSteppingEnabled);
aca.DisableAutomaticStepping(true);
Assert.IsFalse(aca.IsAutomaticSteppingEnabled);
aca.EnableAutomaticStepping();
Assert.IsTrue(aca.IsAutomaticSteppingEnabled);
Assert.IsTrue(aca.AutomaticSteppingEnabled);
aca.AutomaticSteppingEnabled = false;
Assert.IsFalse(aca.AutomaticSteppingEnabled);
aca.AutomaticSteppingEnabled = true;
Assert.IsTrue(aca.AutomaticSteppingEnabled);
}
[Test]

var stepsSinceReset = 0;
for (var i = 0; i < 50; i++)
{
Assert.AreEqual(stepsSinceReset, aca.GetStepCount());
Assert.AreEqual(numberReset, aca.GetEpisodeCount());
Assert.AreEqual(i, aca.GetTotalStepCount());
Assert.AreEqual(stepsSinceReset, aca.StepCount);
Assert.AreEqual(numberReset, aca.EpisodeCount);
Assert.AreEqual(i, aca.TotalStepCount);
// Academy resets at the first step
if (i == 0)
{

var agent2StepSinceReset = 0;
for (var i = 0; i < 5000; i++)
{
Assert.AreEqual(acaStepsSinceReset, aca.GetStepCount());
Assert.AreEqual(numberAcaReset, aca.GetEpisodeCount());
Assert.AreEqual(acaStepsSinceReset, aca.StepCount);
Assert.AreEqual(numberAcaReset, aca.EpisodeCount);
Assert.AreEqual(i, aca.GetTotalStepCount());
Assert.AreEqual(i, aca.TotalStepCount);
Assert.AreEqual(agent2StepSinceReset, agent2.GetStepCount());
Assert.AreEqual(agent2StepSinceReset, agent2.StepCount);
Assert.AreEqual(numberAgent1Reset, agent1.agentResetCalls);
Assert.AreEqual(numberAgent2Reset, agent2.agentResetCalls);

agent1.LazyInitialize();
agent2.SetPolicy(new TestPolicy());
var j = 0;
for (var i = 0; i < 500; i++)
var expectedAgent1ActionSinceReset = 0;
for (var i = 0; i < 50; i++)
if (i % 21 == 0)
{
j = 0;
}
else
{
j++;
expectedAgent1ActionSinceReset += 1;
if (expectedAgent1ActionSinceReset == agent1.maxStep || i == 0) {
expectedAgent1ActionSinceReset = 0;
Assert.LessOrEqual(Mathf.Abs(j * 10.1f - agent1.GetCumulativeReward()), 0.05f);
Assert.LessOrEqual(Mathf.Abs(expectedAgent1ActionSinceReset * 10.1f - agent1.GetCumulativeReward()), 0.05f);
Assert.LessOrEqual(Mathf.Abs(i * 0.1f - agent2.GetCumulativeReward()), 0.05f);
agent1.AddReward(10f);

decisionRequester.DecisionPeriod = 1;
decisionRequester.Awake();
var maxStep = 6;
const int maxStep = 6;
var expectedAgentStepCount = 0;
var expectedResets = 0;
var expectedAgentAction = 0;
var expectedAgentActionSinceReset = 0;
var expectedCollectObsCalls = 0;
var expectedCollectObsCallsSinceReset = 0;
// We expect resets to occur when there are maxSteps actions since the last reset (and on the first step)
var expectReset = agent1.agentActionCallsSinceLastReset == maxStep || (i == 0);
var previousNumResets = agent1.agentResetCalls;
aca.EnvironmentStep();
// Agent should observe and act on each Academy step
expectedAgentAction += 1;
expectedAgentActionSinceReset += 1;
expectedCollectObsCalls += 1;
expectedCollectObsCallsSinceReset += 1;
expectedAgentStepCount += 1;
if (expectReset)
// If the next step will put the agent at maxSteps, we expect it to reset
if (agent1.StepCount == maxStep - 1 || (i == 0))
Assert.AreEqual(previousNumResets + 1, agent1.agentResetCalls);
expectedResets += 1;
else
if (agent1.StepCount == maxStep - 1)
Assert.AreEqual(previousNumResets, agent1.agentResetCalls);
expectedAgentActionSinceReset = 0;
expectedCollectObsCallsSinceReset = 0;
expectedAgentStepCount = 0;
aca.EnvironmentStep();
Assert.AreEqual(expectedAgentStepCount, agent1.StepCount);
Assert.AreEqual(expectedResets, agent1.agentResetCalls);
Assert.AreEqual(expectedAgentAction, agent1.agentActionCalls);
Assert.AreEqual(expectedAgentActionSinceReset, agent1.agentActionCallsSinceLastReset);
Assert.AreEqual(expectedCollectObsCalls, agent1.collectObservationsCalls);
Assert.AreEqual(expectedCollectObsCallsSinceReset, agent1.collectObservationsCallsSinceLastReset);
}
}
}

1
com.unity.ml-agents/Tests/Editor/Sensor/FloatVisualSensorTests.cs


using NUnit.Framework;
using UnityEngine;
using MLAgents.Sensor;
namespace MLAgents.Tests
{

3
com.unity.ml-agents/Tests/Editor/Sensor/RayPerceptionSensorTests.cs


using System.Collections.Generic;
using NUnit.Framework;
using UnityEngine;
using MLAgents.Sensor;
namespace MLAgents.Tests
{

SetupScene();
var obj = new GameObject("agent");
var perception = obj.AddComponent<RayPerceptionSensorComponent3D>();
obj.transform.localScale = new Vector3(2, 2,2 );
obj.transform.localScale = new Vector3(2, 2, 2);
perception.raysPerDirection = 0;
perception.maxRayDegrees = 45;

1
com.unity.ml-agents/Tests/Editor/Sensor/StackingSensorTests.cs


using NUnit.Framework;
using UnityEngine;
using MLAgents.Sensor;
namespace MLAgents.Tests
{

1
com.unity.ml-agents/Tests/Editor/Sensor/VectorSensorTests.cs


using NUnit.Framework;
using UnityEngine;
using MLAgents.Sensor;
namespace MLAgents.Tests
{

2
com.unity.ml-agents/Tests/Editor/Sensor/WriterAdapterTests.cs


using NUnit.Framework;
using UnityEngine;
using MLAgents.Sensor;
using Barracuda;
using MLAgents.InferenceBrain;

20
com.unity.ml-agents/Tests/Editor/SideChannelTests.cs


{
public List<int> messagesReceived = new List<int>();
public override int ChannelType() { return -1; }
public TestSideChannel() {
ChannelId = new Guid("6afa2c06-4f82-11ea-b238-784f4387d1f7");
}
public override void OnMessageReceived(byte[] data)
{

{
var intSender = new TestSideChannel();
var intReceiver = new TestSideChannel();
var dictSender = new Dictionary<int, SideChannel> { { intSender.ChannelType(), intSender } };
var dictReceiver = new Dictionary<int, SideChannel> { { intReceiver.ChannelType(), intReceiver } };
var dictSender = new Dictionary<Guid, SideChannel> { { intSender.ChannelId, intSender } };
var dictReceiver = new Dictionary<Guid, SideChannel> { { intReceiver.ChannelId, intReceiver } };
intSender.SendInt(4);
intSender.SendInt(5);

var str1 = "Test string";
var str2 = "Test string, second";
var strSender = new RawBytesChannel();
var strReceiver = new RawBytesChannel();
var dictSender = new Dictionary<int, SideChannel> { { strSender.ChannelType(), strSender } };
var dictReceiver = new Dictionary<int, SideChannel> { { strReceiver.ChannelType(), strReceiver } };
var strSender = new RawBytesChannel(new Guid("9a5b8954-4f82-11ea-b238-784f4387d1f7"));
var strReceiver = new RawBytesChannel(new Guid("9a5b8954-4f82-11ea-b238-784f4387d1f7"));
var dictSender = new Dictionary<Guid, SideChannel> { { strSender.ChannelId, strSender } };
var dictReceiver = new Dictionary<Guid, SideChannel> { { strReceiver.ChannelId, strReceiver } };
strSender.SendRawBytes(Encoding.ASCII.GetBytes(str1));
strSender.SendRawBytes(Encoding.ASCII.GetBytes(str2));

var propA = new FloatPropertiesChannel();
var propB = new FloatPropertiesChannel();
var dictReceiver = new Dictionary<int, SideChannel> { { propA.ChannelType(), propA } };
var dictSender = new Dictionary<int, SideChannel> { { propB.ChannelType(), propB } };
var dictReceiver = new Dictionary<Guid, SideChannel> { { propA.ChannelId, propA } };
var dictSender = new Dictionary<Guid, SideChannel> { { propB.ChannelId, propB } };
propA.RegisterCallback(k1, f => { wasCalled++; });
var tmp = propB.GetPropertyWithDefault(k2, 3.0f);

2
com.unity.ml-agents/Tests/Runtime/SerializationTest.cs


// using System.Collections;
// using System.Collections;
// using NUnit.Framework;
// #if UNITY_EDITOR
// using UnityEditor.SceneManagement;

2
com.unity.ml-agents/Tests/Runtime/SerializeAgent.cs


using System.Collections;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using MLAgents;

2
com.unity.ml-agents/package.json


"unity": "2018.4",
"description": "Add interactivity to your game with Machine Learning Agents trained using Deep Reinforcement Learning.",
"dependencies": {
"com.unity.barracuda": "0.5.0-preview"
"com.unity.barracuda": "0.6.0-preview"
}
}

9
config/sac_trainer_config.yaml


learning_rate: 3.0e-4
learning_rate_schedule: constant
max_steps: 5.0e5
memory_size: 256
memory_size: 128
normalize: false
num_update: 1
train_interval: 1

sequence_length: 32
num_layers: 2
hidden_units: 128
memory_size: 256
memory_size: 128
init_entcoef: 0.1
max_steps: 1.0e7
summary_freq: 10000

sequence_length: 32
num_layers: 1
hidden_units: 128
memory_size: 256
memory_size: 128
summary_freq: 10000
time_horizon: 64
use_recurrent: true

num_layers: 1
hidden_units: 128
memory_size: 256
memory_size: 128
gamma: 0.99
buffer_size: 1024
batch_size: 64

8
config/trainer_config.yaml


learning_rate: 3.0e-4
learning_rate_schedule: linear
max_steps: 5.0e5
memory_size: 256
memory_size: 128
normalize: false
num_epoch: 3
num_layers: 2

sequence_length: 64
num_layers: 2
hidden_units: 128
memory_size: 256
memory_size: 128
beta: 1.0e-2
num_epoch: 3
buffer_size: 1024

sequence_length: 64
num_layers: 1
hidden_units: 128
memory_size: 256
memory_size: 128
beta: 1.0e-2
num_epoch: 3
buffer_size: 1024

sequence_length: 32
num_layers: 1
hidden_units: 128
memory_size: 256
memory_size: 128
beta: 1.0e-2
num_epoch: 3
buffer_size: 1024

9
docs/API-Reference.md


# API Reference
Our developer-facing C# classes (Academy, Agent, Decision and Monitor) have been
documented to be compatible with Doxygen for auto-generating HTML
documentation.
Our developer-facing C# classes have been documented to be compatible with
Doxygen for auto-generating HTML documentation.
To generate the API reference, download Doxygen
and run the following command within the `docs/` directory:

subdirectory to navigate to the API reference home. Note that `html/` is already
included in the repository's `.gitignore` file.
In the near future, we aim to expand our documentation to include all the Unity
C# classes and Python API.
In the near future, we aim to expand our documentation to include the Python
classes.

8
docs/Getting-Started-with-Balance-Ball.md


agent cube and ball. The function randomizes the reset values so that the
training generalizes to more than a specific starting position and agent cube
attitude.
* agent.CollectObservations() — Called every simulation step. Responsible for
* agent.CollectObservations(VectorSensor sensor) — Called every simulation step. Responsible for
space with a state size of 8, the `CollectObservations()` must call
`AddVectorObs` such that vector size adds up to 8.
space with a state size of 8, the `CollectObservations(VectorSensor sensor)` must call
`VectorSensor.AddObservation()` such that vector size adds up to 8.
* agent.AgentAction() — Called every simulation step. Receives the action chosen
by the Agent. The vector action spaces result in a
small change in the agent cube's rotation at each step. The `AgentAction()` function

vector containing the Agent's observations contains eight elements: the `x` and
`z` components of the agent cube's rotation and the `x`, `y`, and `z` components
of the ball's relative position and velocity. (The observation values are
defined in the Agent's `CollectObservations()` function.)
defined in the Agent's `CollectObservations(VectorSensor sensor)` method.)
#### Behavior Parameters : Vector Action Space

4
docs/Learning-Environment-Best-Practices.md


* Besides encoding non-numeric values, all inputs should be normalized to be in
the range 0 to +1 (or -1 to 1). For example, the `x` position information of
an agent where the maximum possible value is `maxValue` should be recorded as
`AddVectorObs(transform.position.x / maxValue);` rather than
`AddVectorObs(transform.position.x);`. See the equation below for one approach
`VectorSensor.AddObservation(transform.position.x / maxValue);` rather than
`VectorSensor.AddObservation(transform.position.x);`. See the equation below for one approach
of normalization.
* Positional information of relevant GameObjects should be encoded in relative
coordinates wherever possible. This is often relative to the agent position.
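The min-max normalization recommended above can be sketched in Python (a hypothetical helper, not part of the ML-Agents API; the same arithmetic applies before calling `VectorSensor.AddObservation`):

```python
def normalize(value: float, min_value: float, max_value: float) -> float:
    """Map a raw observation into the [0, 1] range.

    One common approach: shift by the minimum, divide by the range,
    and clamp in case the raw value drifts outside the expected bounds.
    """
    if max_value == min_value:
        return 0.0  # degenerate range; avoid division by zero
    normalized = (value - min_value) / (max_value - min_value)
    return max(0.0, min(1.0, normalized))
```

For a symmetric [-1, 1] range, rescale the result with `2 * normalized - 1`.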

20
docs/Learning-Environment-Create-New.md


}
```
Next, let's implement the `Agent.CollectObservations()` method.
Next, let's implement the `Agent.CollectObservations(VectorSensor sensor)` method.
### Observing the Environment

* Position of the target.
```csharp
AddVectorObs(Target.position);
sensor.AddObservation(Target.position);
AddVectorObs(this.transform.position);
sensor.AddObservation(this.transform.position);
```
* The velocity of the Agent. This helps the Agent learn to control its speed so

// Agent velocity
AddVectorObs(rBody.velocity.x);
AddVectorObs(rBody.velocity.z);
sensor.AddObservation(rBody.velocity.x);
sensor.AddObservation(rBody.velocity.z);
```
In total, the state observation contains 8 values and we need to use the

public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(Target.position);
AddVectorObs(this.transform.position);
sensor.AddObservation(Target.position);
sensor.AddObservation(this.transform.position);
AddVectorObs(rBody.velocity.x);
AddVectorObs(rBody.velocity.z);
sensor.AddObservation(rBody.velocity.x);
sensor.AddObservation(rBody.velocity.z);
}
```

44
docs/Learning-Environment-Design-Agents.md


* **Visual Observations** — one or more camera images and/or render textures.
When you use vector observations for an Agent, implement the
`Agent.CollectObservations()` method to create the feature vector. When you use
`Agent.CollectObservations(VectorSensor sensor)` method to create the feature vector. When you use
You do not need to implement the `CollectObservations()` method when your Agent
You do not need to implement the `CollectObservations(VectorSensor sensor)` method when your Agent
uses visual observations (unless it also uses vector observations).
### Vector Observation Space: Feature Vectors

class calls the `CollectObservations()` method of each Agent. Your
implementation of this function must call `AddVectorObs` to add vector
class calls the `CollectObservations(VectorSensor sensor)` method of each Agent. Your
implementation of this function must call `VectorSensor.AddObservation` to add vector
observations.
The observation must include all the information an agent needs to accomplish

public GameObject ball;
private List<float> state = new List<float>();
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs(gameObject.transform.rotation.z);
AddVectorObs(gameObject.transform.rotation.x);
AddVectorObs((ball.transform.position.x - gameObject.transform.position.x));
AddVectorObs((ball.transform.position.y - gameObject.transform.position.y));
AddVectorObs((ball.transform.position.z - gameObject.transform.position.z));
AddVectorObs(ball.transform.GetComponent<Rigidbody>().velocity.x);
AddVectorObs(ball.transform.GetComponent<Rigidbody>().velocity.y);
AddVectorObs(ball.transform.GetComponent<Rigidbody>().velocity.z);
sensor.AddObservation(gameObject.transform.rotation.z);
sensor.AddObservation(gameObject.transform.rotation.x);
sensor.AddObservation((ball.transform.position.x - gameObject.transform.position.x));
sensor.AddObservation((ball.transform.position.y - gameObject.transform.position.y));
sensor.AddObservation((ball.transform.position.z - gameObject.transform.position.z));
sensor.AddObservation(ball.transform.GetComponent<Rigidbody>().velocity.x);
sensor.AddObservation(ball.transform.GetComponent<Rigidbody>().velocity.y);
sensor.AddObservation(ball.transform.GetComponent<Rigidbody>().velocity.z);
}
```

The observation feature vector is a list of floating point numbers, which means
you must convert any other data types to a float or a list of floats.
The `AddVectorObs` method provides a number of overloads for adding common types
The `VectorSensor.AddObservation` method provides a number of overloads for adding common types
of data to your observation vector. You can add Integers and booleans directly to
the observation vector, as well as some common Unity data types such as `Vector2`,
`Vector3`, and `Quaternion`.

```csharp
enum CarriedItems { Sword, Shield, Bow, LastItem }
private List<float> state = new List<float>();
public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs((int)currentItem == ci ? 1.0f : 0.0f);
sensor.AddObservation((int)currentItem == ci ? 1.0f : 0.0f);
`AddVectorObs` also provides a two-argument version as a shortcut for _one-hot_
`VectorSensor.AddObservation` also provides a two-argument version as a shortcut for _one-hot_
style observations. The following example is identical to the previous one.
```csharp

public override void CollectObservations()
public override void CollectObservations(VectorSensor sensor)
AddVectorObs((int)currentItem, NUM_ITEM_TYPES);
sensor.AddOneHotObservation((int)currentItem, NUM_ITEM_TYPES);
}
```

neural network, the Agent will be unable to perform the specified action. Note
that when the Agent is controlled by its Heuristic, the Agent will
still be able to decide to perform the masked action. In order to mask an
action, call the method `SetActionMask` within the `CollectObservation` method :
action, call the `SetActionMask` method on the optional `ActionMasker` argument of the `CollectObservations` method:
SetActionMask(branch, actionIndices)
public override void CollectObservations(VectorSensor sensor, ActionMasker actionMasker){
actionMasker.SetActionMask(branch, actionIndices)
}
```
Where:

8
docs/Learning-Environment-Design.md


1. Calls your Academy's `OnEnvironmentReset` delegate.
2. Calls the `AgentReset()` function for each Agent in the scene.
3. Calls the `CollectObservations()` function for each Agent in the scene.
3. Calls the `CollectObservations(VectorSensor sensor)` function for each Agent in the scene.
4. Uses each Agent's Policy to decide on the Agent's next action.
5. Calls the `AgentAction()` function for each Agent in the scene, passing in
the action chosen by the Agent's Policy. (This function is not called if the

To create a training environment, extend the Agent class to
implement the above methods. The `Agent.CollectObservations()` and
implement the above methods. The `Agent.CollectObservations(VectorSensor sensor)` and
`Agent.AgentAction()` functions are required; the other methods are optional —
whether you need to implement them or not depends on your specific scenario.

have appropriate `Behavior Parameters`.
To create an Agent, extend the Agent class and implement the essential
`CollectObservations()` and `AgentAction()` methods:
`CollectObservations(VectorSensor sensor)` and `AgentAction()` methods:
* `CollectObservations()` — Collects the Agent's observation of its environment.
* `CollectObservations(VectorSensor sensor)` — Collects the Agent's observation of its environment.
* `AgentAction()` — Carries out the action chosen by the Agent's Policy and
assigns a reward to the current state.

4
docs/Limitations.md


`Academy.Instance.DisableAutomaticStepping()`, and then calling
`Academy.Instance.EnvironmentStep()`
### Unity Inference Engine Models
Currently, only models created with our trainers are supported for running
ML-Agents with a neural network behavior.
## Python API
### Python version

21
docs/Migrating.md


# Migrating
## Migrating from 0.14 to latest
### Important changes
* The `Agent.CollectObservations()` virtual method now takes as input a `VectorSensor` sensor as argument. The `Agent.AddVectorObs()` methods were removed.
* The `SetActionMask` method must now be called on the optional `ActionMasker` argument of the `CollectObservations` method. (We now consider an action mask as a type of observation)
* The `Monitor` class has been moved to the Examples Project. (It was prone to errors during testing)
* The `MLAgents.Sensor` namespace has been removed. All sensors now belong to the `MLAgents` namespace.
* The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take an input class and write to an output class.
* The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`
* The `--multi-gpu` option has been removed temporarily.
### Steps to Migrate
* Replace your Agent's implementation of `CollectObservations()` with `CollectObservations(VectorSensor sensor)`. In addition, replace all calls to `AddVectorObs()` with `sensor.AddObservation()` or `sensor.AddOneHotObservation()` on the `VectorSensor` passed as argument.
* Replace your calls to `SetActionMask` on your Agent to `ActionMasker.SetActionMask` in `CollectObservations`
* If you call `RayPerceptionSensor.PerceiveStatic()` manually, add your inputs to a `RayPerceptionInput`. To get the previous float array output, use `RayPerceptionOutput.ToFloatArray()`
* Re-import all of your `*.NN` files to work with the updated Barracuda package.
* Replace all calls to `Agent.GetStepCount()` with `Agent.StepCount`
## Migrating from 0.13 to 0.14
### Important changes

* Move the AcademyStep code to MonoBehaviour.FixedUpdate
* Move the OnDestroy code to MonoBehaviour.OnDestroy.
* Move the AcademyReset code to a new method and add it to the Academy.OnEnvironmentReset action.
* Multiply `max_steps` and `summary_steps` in your `trainer_config.yaml` by the number of Agents in the scene.
* Multiply `max_steps` and `summary_freq` in your `trainer_config.yaml` by the number of Agents in the scene.
* Combine curriculum configs into a single file. See [the WallJump curricula](../config/curricula/wall_jump.yaml) for an example of the new curriculum config format.
A tool like https://www.json2yaml.com may be useful to help with the conversion.
* If you have a model trained which uses RayPerceptionSensor and has non-1.0 scale in the Agent's transform, it must be retrained.

3
docs/Profiling-Python.md


## Output
By default, at the end of training, timers are collected and written in json format to
`{summaries_dir}/{run_id}_timers.json`. The output consists of node objects with the following keys:
* name (string): The name of the block of code.
* children (list): A list of child nodes.
* children (dictionary): A dictionary of child nodes, keyed by the node name.
* is_parallel (bool): Indicates that the block of code was executed in multiple threads or processes (see below). This
is optional and defaults to false.
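As a sketch, the timer tree described above can be walked recursively. The function below is an illustration, not part of the trainer code, and the example data is made up; the field names follow the key list above:

```python
def walk_timers(node: dict, name: str = "root", depth: int = 0) -> list:
    """Flatten a timer tree into (depth, name, is_parallel) tuples.

    'children' is a dictionary of child nodes keyed by the node name,
    and 'is_parallel' is optional, defaulting to False.
    """
    rows = [(depth, name, node.get("is_parallel", False))]
    for child_name, child in node.get("children", {}).items():
        rows.extend(walk_timers(child, child_name, depth + 1))
    return rows

# Made-up example mirroring the structure written to {run_id}_timers.json.
timers = {
    "name": "root",
    "children": {
        "env_step": {"name": "env_step", "children": {}},
        "update_policy": {"name": "update_policy", "is_parallel": True, "children": {}},
    },
}
```

Calling `walk_timers(timers)` yields one row per node, which is convenient for printing an indented profile report.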

193
docs/Python-API.md


channel = EngineConfigurationChannel()
env = UnityEnvironment(base_port = 5004, side_channels = [channel])
env = UnityEnvironment(base_port = UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels = [channel])
channel.set_configuration_parameters(time_scale = 2.0)

`FloatPropertiesChannel` has three methods:
* `set_property` Sets a property in the Unity Environment.
  * key: The string identifier of the property.
  * value: The float value of the property.
* `get_property` Gets a property in the Unity Environment. Returns None if the property does not exist.
  * key: The string identifier of the property.
* `list_properties` Returns a list of all the string identifiers of the properties
```python

channel = FloatPropertiesChannel()
env = UnityEnvironment(base_port = 5004, side_channels = [channel])
env = UnityEnvironment(base_port = UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels = [channel])
channel.set_property("parameter_1", 2.0)

var sharedProperties = Academy.Instance.FloatProperties;
float property1 = sharedProperties.GetPropertyWithDefault("parameter_1", 0.0f);
```
#### [Advanced] Create your own SideChannel
You can create your own `SideChannel` in C# and Python and use it to communicate data between the two.
##### Unity side
The side channel will have to implement the `SideChannel` abstract class and the following method.
* `OnMessageReceived(byte[] data)` : You must implement this method to specify what the side channel will be doing
with the data received from Python. The data is a `byte[]` argument.
The side channel must also assign a `ChannelId` property in the constructor. The `ChannelId` is a Guid
(or UUID in Python) used to uniquely identify a side channel. This Guid must be the same on C# and Python.
There can only be one side channel of a certain id during communication.
To send a byte array from C# to Python, call the `base.QueueMessageToSend(data)` method inside the side channel.
The `data` argument must be a `byte[]`.
To register a side channel on the Unity side, call `Academy.Instance.RegisterSideChannel` with the side channel
as only argument.
##### Python side
The side channel will have to implement the `SideChannel` abstract class. You must implement :
* `on_message_received(self, data: bytes) -> None` : You must implement this method to specify what the
side channel will be doing with the data received from Unity. The data is a `byte[]` argument.
The side channel must also assign a `channel_id` property in the constructor. The `channel_id` is a UUID
(referred to in C# as Guid) used to uniquely identify a side channel. This number must be the same on C# and
Python. There can only be one side channel of a certain id during communication.
To assign the `channel_id` call the abstract class constructor with the appropriate `channel_id` as follows:
```python
super().__init__(my_channel_id)
```
To send a byte array from Python to C#, call the `super().queue_message_to_send(bytes_data)` method inside the
side channel. The `bytes_data` argument must be a `bytes` object.
To register a side channel on the Python side, pass the side channel as argument when creating the
`UnityEnvironment` object. One of the arguments of the constructor (`side_channels`) is a list of side channels.
##### Example implementation
Here is a simple implementation of a Side Channel that will exchange strings between C# and Python
(encoded as ascii).
On the C# side:
Here is an implementation of a `StringLogSideChannel` that will listen to the `UnityEngine.Debug.LogError` calls in
the game:
```csharp
using UnityEngine;
using MLAgents;
using System.Text;
using System;
public class StringLogSideChannel : SideChannel
{
public StringLogSideChannel()
{
ChannelId = new Guid("621f0a70-4f87-11ea-a6bf-784f4387d1f7");
}
public override void OnMessageReceived(byte[] data)
{
var receivedString = Encoding.ASCII.GetString(data);
Debug.Log("From Python : " + receivedString);
}
public void SendDebugStatementToPython(string logString, string stackTrace, LogType type)
{
if (type == LogType.Error)
{
var stringToSend = type.ToString() + ": " + logString + "\n" + stackTrace;
var encodedString = Encoding.ASCII.GetBytes(stringToSend);
base.QueueMessageToSend(encodedString);
}
}
}
```
We also need to register this side channel with the Academy and subscribe it to the `Application.logMessageReceived` event,
so we write a simple MonoBehaviour for this. (Do not forget to attach it to a GameObject in the scene.)
```csharp
using UnityEngine;
using MLAgents;

public class RegisterStringLogSideChannel : MonoBehaviour
{
    StringLogSideChannel stringChannel;

    public void Awake()
    {
        // We create the Side Channel
        stringChannel = new StringLogSideChannel();

        // When a Debug.Log message is created, we send it to the stringChannel
        Application.logMessageReceived += stringChannel.SendDebugStatementToPython;

        // Register the side channel with the Academy
        Academy.Instance.RegisterSideChannel(stringChannel);
    }

    public void OnDestroy()
    {
        // De-register the Debug.Log callback
        Application.logMessageReceived -= stringChannel.SendDebugStatementToPython;
        if (Academy.IsInitialized)
        {
            Academy.Instance.UnregisterSideChannel(stringChannel);
        }
    }

    public void Update()
    {
        // Optional: if the space bar is pressed, raise an error!
        if (Input.GetKeyDown(KeyCode.Space))
        {
            Debug.LogError("This is a fake error. Space bar was pressed in Unity.");
        }
    }
}
```
And here is the script on the Python side. This script creates a new side channel type (`StringLogChannel`) and
launches a `UnityEnvironment` with that side channel.
```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.side_channel import SideChannel
import uuid


# Create the StringLogChannel class
class StringLogChannel(SideChannel):
    def __init__(self) -> None:
        super().__init__(uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7"))

    def on_message_received(self, data: bytes) -> None:
        """
        Note: We must implement this method of the SideChannel interface to
        receive messages from Unity.
        """
        # We simply print the data received, interpreted as ASCII
        print(data.decode("ascii"))

    def send_string(self, data: str) -> None:
        # Convert the string to ASCII bytes
        bytes_data = data.encode("ascii")
        # We call this method to queue the data we want to send
        super().queue_message_to_send(bytes_data)


# Create the channel
string_log = StringLogChannel()

# We start the communication with the Unity Editor and pass the string_log side channel as input
env = UnityEnvironment(base_port=UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels=[string_log])
env.reset()
string_log.send_string("The environment was reset")

group_name = env.get_agent_groups()[0]  # Get the first group_name
for i in range(1000):
    step_data = env.get_step_result(group_name)
    n_agents = step_data.n_agents()  # Get the number of agents
    # We send a string to Unity with the number of agents at each step
    string_log.send_string(
        "Step " + str(i) + " occurred with " + str(n_agents) + " agents."
    )
    env.step()  # Move the simulation forward

env.close()
```
Now, if you run this script and press `Play` in the Unity Editor when prompted, the Console in the Unity Editor will
display a message at every Python step. Additionally, if you press the space bar in the Unity Editor, a message will
appear in the terminal running the Python script.
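Because the channel's encode/decode logic is independent of Unity, you can exercise it without launching the Editor. The sketch below substitutes a minimal stand-in for the `SideChannel` base class; the `FakeSideChannel` name and its `outgoing` buffer are assumptions made for this sketch, not part of the mlagents_envs API.

```python
import uuid


class FakeSideChannel:
    """Minimal stand-in for the mlagents_envs SideChannel base (hypothetical)."""

    def __init__(self, channel_id):
        self.channel_id = channel_id
        self.outgoing = []  # messages queued for Unity

    def queue_message_to_send(self, data):
        self.outgoing.append(data)


class StringLogChannel(FakeSideChannel):
    def __init__(self):
        super().__init__(uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7"))
        self.received = []  # strings decoded from Unity messages

    def on_message_received(self, data):
        self.received.append(data.decode("ascii"))

    def send_string(self, data):
        super().queue_message_to_send(data.encode("ascii"))


channel = StringLogChannel()
channel.send_string("The environment was reset")  # queued as ASCII bytes
channel.on_message_received(b"Error: fake error")  # decoded back to a string
```

Swapping `FakeSideChannel` back to the real `SideChannel` base leaves `StringLogChannel` unchanged.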

3  docs/Reward-Signals.md

#### Demo Path
`demo_path` is the path to your `.demo` file or directory of `.demo` files. See the [imitation learning guide](Training-Imitation-Learning.md).
#### (Optional) Encoding Size

2  docs/Training-Generalized-Reinforcement-Learning-Agents.md

* `sampler-type-sub-arguments` - Specify the sub-arguments depending on the `sampler-type`.
In the example above, this would correspond to the `intervals`
under the `sampler-type` `"multirange_uniform"` for the `Reset Parameter` called `gravity`.
The key name should match the name of the corresponding argument in the sampler definition.
(See below)

2  docs/Training-Imitation-Learning.md

from a few minutes or a few hours of demonstration data may be necessary to
be useful for imitation learning. When you have recorded enough data, end
the Editor play session, and a `.demo` file will be created in the
`Assets/Demonstrations` folder (by default). This file contains the demonstrations.
Clicking on the file will provide metadata about the demonstration in the
inspector.

1  docs/Training-ML-Agents.md

[here](https://docs.unity3d.com/Manual/CommandLineArguments.html) for more
details.
* `--debug`: Specify this option to enable debug-level logging for some parts of the code.
* `--multi-gpu`: Setting this flag enables the use of multiple GPUs (if available) during training.
* `--cpu`: Forces training using CPU only.
* Engine Configuration:
  * `--width`: The width of the executable window of the environment(s) in pixels

6  docs/Training-PPO.md

### Memory Size
`memory_size` corresponds to the size of the array of floating point numbers
used to store the hidden state of the recurrent neural network of the policy. This value must
be a multiple of 2, and should scale with the amount of information you expect
the agent will need to remember in order to successfully complete the task.

Typical Range: `32` - `256`
## (Optional) Behavioral Cloning Using Demonstrations

6  docs/Training-SAC.md

### Memory Size
`memory_size` corresponds to the size of the array of floating point numbers
used to store the hidden state of the recurrent neural network in the policy.
This value must be a multiple of 2, and should scale with the amount of information you expect
the agent will need to remember in order to successfully complete the task.

Typical Range: `32` - `256`
### (Optional) Save Replay Buffer

2  docs/Training-Self-Play.md

### ELO
In adversarial games, the cumulative environment reward may not be a meaningful metric by which to track learning progress. This is because cumulative reward is entirely dependent on the skill of the opponent. An agent at a particular skill level will get more or less reward against a worse or better agent, respectively.
We provide an implementation of the ELO rating system, a method for calculating the relative skill level between two players from a given population in a zero-sum game. For more information on ELO, please see [the ELO wiki](https://en.wikipedia.org/wiki/Elo_rating_system).
In a proper training run, the ELO of the agent should steadily increase. The absolute value of the ELO is less important than the change in ELO over training iterations.
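To make the zero-sum bookkeeping concrete, here is a minimal sketch of a single Elo update (the `k=16.0` factor is an assumption chosen for illustration, not necessarily the value ML-Agents uses):

```python
def elo_update(rating_a, rating_b, score_a, k=16.0):
    """One Elo update; score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
    # Expected score of A under the logistic Elo model (400-point scale)
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    # Zero-sum: whatever A gains, B loses, so total rating is conserved
    return rating_a + delta, rating_b - delta
```

With equal ratings, a win moves each player by `k / 2`; a draw against a stronger opponent still gains rating, which is why the change in ELO over training matters more than its absolute value.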

Some files were not shown because too many files changed in this diff.
