
Develop side channels: migrate reset parameters (#2990)

* [WIP] Side Channel initial layout

* Working prototype for raw bytes

* Fixed a formatting mistake

* Added some errors and some unit tests in C#

* Added the side channel for the Engine Configuration. (#2958)

* Added the side channel for the Engine Configuration.

Note that this change does not require modifying many files:
 - Add a sender in Python
 - Add a receiver in C#
 - Subscribe the receiver to the communicator (a one-liner in the Academy)
 - Add the side channel to the Python UnityEnvironment (not represented here)

Adding the side channel to the environment would look like this:

```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.raw_bytes_channel import RawBytesChannel
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfigurationChannel

channel0 = RawBytesChannel()
channel1 = EngineConfigurationChannel()
```
Branch: /develop/tanhsquash
GitHub · 5 years ago
Current commit: 8ec5ab62
70 files changed: 429 insertions(+), 1,630 deletions(-)
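The mechanism described in the commit message (a sender in Python, a receiver in C#, and a communicator that routes messages to subscribed channels) can be sketched as a minimal, self-contained Python mock. All class and function names below are illustrative only, not the actual mlagents implementation:

```python
from typing import Dict, List, Tuple


class SideChannel:
    """Minimal mock of a side channel: queues outgoing messages and
    accepts incoming ones. Illustrative only -- not the mlagents API."""

    def __init__(self, channel_id: int) -> None:
        self.channel_id = channel_id
        # Outgoing payloads, drained by the communicator on the next exchange.
        self.message_queue: List[bytes] = []

    def queue_message_to_send(self, data: bytes) -> None:
        self.message_queue.append(data)

    def on_message_received(self, data: bytes) -> None:
        raise NotImplementedError


class RawBytesChannel(SideChannel):
    """Receiver that stores every payload verbatim."""

    def __init__(self, channel_id: int) -> None:
        super().__init__(channel_id)
        self.received: List[bytes] = []

    def on_message_received(self, data: bytes) -> None:
        self.received.append(data)


def route_messages(subscribed: Dict[int, SideChannel],
                   messages: List[Tuple[int, bytes]]) -> None:
    """Dispatch (channel_id, payload) pairs to subscribed receivers, the way
    the communicator delivers side-channel data alongside each exchange."""
    for channel_id, payload in messages:
        if channel_id in subscribed:
            subscribed[channel_id].on_message_received(payload)


channel = RawBytesChannel(channel_id=0)
route_messages({0: channel}, [(0, b"hello"), (1, b"ignored")])
print(channel.received)  # [b'hello']
```

The key design point mirrored here is that data rides alongside the regular RL messages and is dispatched purely by channel id, which is why subscribing a receiver is a one-liner on the C# side.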
  1. UnitySDK/Assets/ML-Agents/Editor/Tests/DemonstrationTests.cs (1 change)
  2. UnitySDK/Assets/ML-Agents/Editor/Tests/EditModeTestInternalBrainTensorGenerator.cs (1 change)
  3. UnitySDK/Assets/ML-Agents/Editor/Tests/MLAgentsEditModeTest.cs (8 changes)
  4. UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAcademy.cs (7 changes)
  5. UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs (8 changes)
  6. UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs (8 changes)
  7. UnitySDK/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs (2 changes)
  8. UnitySDK/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs (6 changes)
  9. UnitySDK/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs (5 changes)
  10. UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAcademy.cs (12 changes)
  11. UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs (4 changes)
  12. UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs (14 changes)
  13. UnitySDK/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs (13 changes)
  14. UnitySDK/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAcademy.cs (3 changes)
  15. UnitySDK/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs (9 changes)
  16. UnitySDK/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerAcademy.cs (4 changes)
  17. UnitySDK/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerFieldArea.cs (2 changes)
  18. UnitySDK/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAcademy.cs (5 changes)
  19. UnitySDK/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs (8 changes)
  20. UnitySDK/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAcademy.cs (6 changes)
  21. UnitySDK/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs (14 changes)
  22. UnitySDK/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs (11 changes)
  23. UnitySDK/Assets/ML-Agents/Scripts/Academy.cs (200 changes)
  24. UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/UnityRlInitializationOutput.cs (52 changes)
  25. UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/UnityRlInput.cs (92 changes)
  26. UnitySDK/Assets/ML-Agents/Scripts/Grpc/GrpcExtensions.cs (23 changes)
  27. UnitySDK/Assets/ML-Agents/Scripts/Grpc/RpcCommunicator.cs (22 changes)
  28. UnitySDK/Assets/ML-Agents/Scripts/ICommunicator.cs (27 changes)
  29. UnitySDK/Assets/ML-Agents/Scripts/SideChannel/EngineConfigurationChannel.cs (2 changes)
  30. docs/Basic-Guide.md (38 changes)
  31. docs/Learning-Environment-Design-Academy.md (12 changes)
  32. docs/Learning-Environment-Examples.md (28 changes)
  33. docs/Learning-Environment-Executable.md (18 changes)
  34. docs/Migrating.md (13 changes)
  35. docs/Python-API.md (81 changes)
  36. docs/Training-Curriculum-Learning.md (5 changes)
  37. docs/Training-Generalized-Reinforcement-Learning-Agents.md (4 changes)
  38. docs/Training-ML-Agents.md (17 changes)
  39. ml-agents-envs/mlagents/envs/base_unity_environment.py (14 changes)
  40. ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.py (17 changes)
  41. ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.pyi (14 changes)
  42. ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_input_pb2.py (36 changes)
  43. ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_input_pb2.pyi (16 changes)
  44. ml-agents-envs/mlagents/envs/env_manager.py (11 changes)
  45. ml-agents-envs/mlagents/envs/environment.py (66 changes)
  46. ml-agents-envs/mlagents/envs/side_channel/engine_configuration_channel.py (25 changes)
  47. ml-agents-envs/mlagents/envs/simple_env_manager.py (29 changes)
  48. ml-agents-envs/mlagents/envs/subprocess_env_manager.py (71 changes)
  49. ml-agents-envs/mlagents/envs/tests/test_subprocess_env_manager.py (35 changes)
  50. ml-agents/mlagents/trainers/curriculum.py (9 changes)
  51. ml-agents/mlagents/trainers/learn.py (70 changes)
  52. ml-agents/mlagents/trainers/meta_curriculum.py (8 changes)
  53. ml-agents/mlagents/trainers/tests/test_curriculum.py (8 changes)
  54. ml-agents/mlagents/trainers/tests/test_learn.py (4 changes)
  55. ml-agents/mlagents/trainers/tests/test_meta_curriculum.py (9 changes)
  56. ml-agents/mlagents/trainers/tests/test_simple_rl.py (4 changes)
  57. ml-agents/mlagents/trainers/tests/test_trainer_controller.py (2 changes)
  58. ml-agents/mlagents/trainers/trainer_controller.py (4 changes)
  59. protobuf-definitions/proto/mlagents/envs/communicator_objects/unity_rl_initialization_output.proto (3 changes)
  60. protobuf-definitions/proto/mlagents/envs/communicator_objects/unity_rl_input.proto (5 changes)
  61. UnitySDK/Assets/ML-Agents/Editor/ResetParameterDrawer.cs (179 changes)
  62. UnitySDK/Assets/ML-Agents/Editor/ResetParameterDrawer.cs.meta (12 changes)
  63. UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/EnvironmentParameters.cs (207 changes)
  64. UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/EnvironmentParameters.cs.meta (11 changes)
  65. UnitySDK/Assets/ML-Agents/Scripts/ResetParameters.cs (61 changes)
  66. UnitySDK/Assets/ML-Agents/Scripts/ResetParameters.cs.meta (12 changes)
  67. docs/images/academy.png (116 changes)
  68. ml-agents-envs/mlagents/envs/communicator_objects/environment_parameters_pb2.py (130 changes)
  69. ml-agents-envs/mlagents/envs/communicator_objects/environment_parameters_pb2.pyi (75 changes)
  70. protobuf-definitions/proto/mlagents/envs/communicator_objects/environment_parameters.proto (11 changes)

UnitySDK/Assets/ML-Agents/Editor/Tests/DemonstrationTests.cs (1 change)

```diff
 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var academyInitializeMethod = typeof(Academy).GetMethod("InitializeEnvironment",
 BindingFlags.Instance | BindingFlags.NonPublic);
```

UnitySDK/Assets/ML-Agents/Editor/Tests/EditModeTestInternalBrainTensorGenerator.cs (1 change)

```diff
 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var goA = new GameObject("goA");
 var bpA = goA.AddComponent<BehaviorParameters>();
```

UnitySDK/Assets/ML-Agents/Editor/Tests/MLAgentsEditModeTest.cs (8 changes)

```diff
 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 Assert.AreEqual(0, aca.initializeAcademyCalls);
 Assert.AreEqual(0, aca.GetStepCount());
 Assert.AreEqual(0, aca.GetEpisodeCount());

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 Assert.AreEqual(false, agent1.IsDone());
 Assert.AreEqual(false, agent2.IsDone());

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var academyInitializeMethod = typeof(Academy).GetMethod("InitializeEnvironment",
 BindingFlags.Instance | BindingFlags.NonPublic);
 academyInitializeMethod?.Invoke(aca, new object[] { });

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var agentEnableMethod = typeof(Agent).GetMethod(

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var academyInitializeMethod = typeof(Academy).GetMethod(
 "InitializeEnvironment", BindingFlags.Instance | BindingFlags.NonPublic);
 academyInitializeMethod?.Invoke(aca, new object[] { });

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var agentEnableMethod = typeof(Agent).GetMethod(

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var agentEnableMethod = typeof(Agent).GetMethod(

 var acaGo = new GameObject("TestAcademy");
 acaGo.AddComponent<TestAcademy>();
 var aca = acaGo.GetComponent<TestAcademy>();
-aca.resetParameters = new ResetParameters();
 var agentEnableMethod = typeof(Agent).GetMethod(
```

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAcademy.cs (7 changes)

```diff
 public class Ball3DAcademy : Academy
 {
-public override void AcademyReset()
+public override void InitializeAcademy()
-Physics.gravity = new Vector3(0, -resetParameters["gravity"], 0);
+FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
 public override void AcademyStep()
 {
 }
 }
```

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs (8 changes)

```diff
 [Header("Specific to Ball3D")]
 public GameObject ball;
 Rigidbody m_BallRb;
-ResetParameters m_ResetParams;
+IFloatProperties m_ResetParams;
-m_ResetParams = academy.resetParameters;
+m_ResetParams = academy.FloatProperties;
 SetResetParameters();
 }

 public void SetBall()
 {
 //Set the attributes of the ball by fetching the information from the academy
-m_BallRb.mass = m_ResetParams["mass"];
-var scale = m_ResetParams["scale"];
+m_BallRb.mass = m_ResetParams.GetPropertyWithDefault("mass", 1.0f);
+var scale = m_ResetParams.GetPropertyWithDefault("scale", 1.0f);
 ball.transform.localScale = new Vector3(scale, scale, scale);
 }
```

UnitySDK/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs (8 changes)

```diff
 [Header("Specific to Ball3DHard")]
 public GameObject ball;
 Rigidbody m_BallRb;
-ResetParameters m_ResetParams;
+IFloatProperties m_ResetParams;
-m_ResetParams = academy.resetParameters;
+m_ResetParams = academy.FloatProperties;
 SetResetParameters();
 }

 public void SetBall()
 {
 //Set the attributes of the ball by fetching the information from the academy
-m_BallRb.mass = m_ResetParams["mass"];
-var scale = m_ResetParams["scale"];
+m_BallRb.mass = m_ResetParams.GetPropertyWithDefault("mass", 1.0f);
+var scale = m_ResetParams.GetPropertyWithDefault("scale", 1.0f);
 ball.transform.localScale = new Vector3(scale, scale, scale);
 }
```

UnitySDK/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs (2 changes)

```diff
 void WaitTimeInference()
 {
-if (!m_Academy.GetIsInference())
+if (!m_Academy.IsCommunicatorOn)
 {
 RequestDecision();
 }
```

UnitySDK/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs (6 changes)

```diff
 int m_NumberJumps = 20;
 int m_JumpLeft = 20;
-ResetParameters m_ResetParams;
+IFloatProperties m_ResetParams;
 public override void InitializeAgent()
 {

 var academy = FindObjectOfType<Academy>();
-m_ResetParams = academy.resetParameters;
+m_ResetParams = academy.FloatProperties;
 SetResetParameters();
 }

 public void SetTargetScale()
 {
-var targetScale = m_ResetParams["target_scale"];
+var targetScale = m_ResetParams.GetPropertyWithDefault("target_scale", 1.0f);
 target.transform.localScale = new Vector3(targetScale, targetScale, targetScale);
 }
```

UnitySDK/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs (5 changes)

```diff
 public void SetLaserLengths()
 {
-m_LaserLength = m_MyAcademy.resetParameters.TryGetValue("laser_length", out m_LaserLength) ? m_LaserLength : 1.0f;
+m_LaserLength = m_MyAcademy.FloatProperties.GetPropertyWithDefault("laser_length", 1.0f);
-float agentScale;
-agentScale = m_MyAcademy.resetParameters.TryGetValue("agent_scale", out agentScale) ? agentScale : 1.0f;
+float agentScale = m_MyAcademy.FloatProperties.GetPropertyWithDefault("agent_scale", 1.0f);
 gameObject.transform.localScale = new Vector3(agentScale, agentScale, agentScale);
 }
```

UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAcademy.cs (12 changes)

```diff
 {
 public Camera MainCamera;
-public override void AcademyReset()
+public override void InitializeAcademy()
-MainCamera.transform.position = new Vector3(-((int)resetParameters["gridSize"] - 1) / 2f,
-(int)resetParameters["gridSize"] * 1.25f,
--((int)resetParameters["gridSize"] - 1) / 2f);
-MainCamera.orthographicSize = ((int)resetParameters["gridSize"] + 5f) / 2f;
+FloatProperties.RegisterCallback("gridSize", f =>
+{
+MainCamera.transform.position = new Vector3(-(f - 1) / 2f, f * 1.25f, -(f - 1) / 2f);
+MainCamera.orthographicSize = (f + 5f) / 2f;
+});
 }
 }
```
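The RegisterCallback / GetPropertyWithDefault pattern that replaces the old `resetParameters["key"]` lookups throughout these example scripts can be sketched in Python as a small property store. This is a hypothetical illustration of the pattern, not the actual C# or Python mlagents class:

```python
from typing import Callable, Dict, List


class FloatProperties:
    """Sketch of a float-properties store with change callbacks, mirroring
    the RegisterCallback / GetPropertyWithDefault pattern in the diff."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._callbacks: Dict[str, List[Callable[[float], None]]] = {}

    def register_callback(self, key: str, fn: Callable[[float], None]) -> None:
        # The callback fires whenever the trainer pushes a new value for `key`,
        # so environments no longer need to poll at reset time.
        self._callbacks.setdefault(key, []).append(fn)

    def set_property(self, key: str, value: float) -> None:
        self._values[key] = value
        for fn in self._callbacks.get(key, []):
            fn(value)

    def get_property_with_default(self, key: str, default: float) -> float:
        # Unlike a bare dict lookup, a missing key is not an error.
        return self._values.get(key, default)


props = FloatProperties()
seen: List[float] = []
props.register_callback("gridSize", seen.append)
props.set_property("gridSize", 7.0)
print(props.get_property_with_default("gridSize", 5.0))  # 7.0
print(props.get_property_with_default("numGoals", 1.0))  # 1.0 (default)
```

The default argument is what makes the migration safe: scenes that never receive a value from the trainer keep working with the fallbacks seen in the diffs (e.g. `"gridSize", 5f`).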

UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs (4 changes)

```diff
 // Prevents the agent from picking an action that would make it collide with a wall
 var positionX = (int)transform.position.x;
 var positionZ = (int)transform.position.z;
-var maxPosition = (int)m_Academy.resetParameters["gridSize"] - 1;
+var maxPosition = (int)m_Academy.FloatProperties.GetPropertyWithDefault("gridSize", 5f) - 1;
 if (positionX == 0)
 {

 renderCamera.Render();
 }
-if (!m_Academy.GetIsInference())
+if (!m_Academy.IsCommunicatorOn)
 {
 RequestDecision();
 }
```

UnitySDK/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs (14 changes)

```diff
 public GameObject trueAgent;
-ResetParameters m_ResetParameters;
+IFloatProperties m_ResetParameters;
 Camera m_AgentCam;

 public void Awake()
 {
-m_ResetParameters = FindObjectOfType<Academy>().resetParameters;
+m_ResetParameters = FindObjectOfType<Academy>().FloatProperties;
 m_Objects = new[] { goalPref, pitPref };

 public void SetEnvironment()
 {
-transform.position = m_InitialPosition * (m_ResetParameters["gridSize"] + 1);
+transform.position = m_InitialPosition * (m_ResetParameters.GetPropertyWithDefault("gridSize", 5f) + 1);
-for (var i = 0; i < (int)m_ResetParameters["numObstacles"]; i++)
+for (var i = 0; i < (int)m_ResetParameters.GetPropertyWithDefault("numObstacles", 1); i++)
-for (var i = 0; i < (int)m_ResetParameters["numGoals"]; i++)
+for (var i = 0; i < (int)m_ResetParameters.GetPropertyWithDefault("numGoals", 1f); i++)
-var gridSize = (int)m_ResetParameters["gridSize"];
+var gridSize = (int)m_ResetParameters.GetPropertyWithDefault("gridSize", 5f);
 m_Plane.transform.localScale = new Vector3(gridSize / 10.0f, 1f, gridSize / 10.0f);
 m_Plane.transform.localPosition = new Vector3((gridSize - 1) / 2f, -0.5f, (gridSize - 1) / 2f);
 m_Sn.transform.localScale = new Vector3(1, 1, gridSize + 2);

 public void AreaReset()
 {
-var gridSize = (int)m_ResetParameters["gridSize"];
+var gridSize = (int)m_ResetParameters.GetPropertyWithDefault("gridSize", 5f);
 foreach (var actor in actorObjs)
 {
 DestroyImmediate(actor);
```

UnitySDK/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs (13 changes)

```diff
 public void SetGroundMaterialFriction()
 {
-var resetParams = m_Academy.resetParameters;
+var resetParams = m_Academy.FloatProperties;
-groundCollider.material.dynamicFriction = resetParams["dynamic_friction"];
-groundCollider.material.staticFriction = resetParams["static_friction"];
+groundCollider.material.dynamicFriction = resetParams.GetPropertyWithDefault("dynamic_friction", 0);
+groundCollider.material.staticFriction = resetParams.GetPropertyWithDefault("static_friction", 0);

-var resetParams = m_Academy.resetParameters;
+var resetParams = m_Academy.FloatProperties;
+var scale = resetParams.GetPropertyWithDefault("block_scale", 2);
-m_BlockRb.transform.localScale = new Vector3(resetParams["block_scale"], 0.75f, resetParams["block_scale"]);
+m_BlockRb.transform.localScale = new Vector3(scale, 0.75f, scale);
-m_BlockRb.drag = resetParams["block_drag"];
+m_BlockRb.drag = resetParams.GetPropertyWithDefault("block_drag", 0.5f);
 }
 public void SetResetParameters()
```

UnitySDK/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAcademy.cs (3 changes)

```diff
 {
 public override void AcademyReset()
 {
-Physics.gravity = new Vector3(0, -resetParameters["gravity"], 0);
+FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
 }
 public override void AcademyStep()
```

UnitySDK/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs (9 changes)

```diff
 public void SetResetParameters()
 {
-m_GoalSize = m_MyAcademy.resetParameters["goal_size"];
-m_GoalSpeed = Random.Range(-1f, 1f) * m_MyAcademy.resetParameters["goal_speed"];
-m_Deviation = m_MyAcademy.resetParameters["deviation"];
-m_DeviationFreq = m_MyAcademy.resetParameters["deviation_freq"];
+var fp = m_MyAcademy.FloatProperties;
+m_GoalSize = fp.GetPropertyWithDefault("goal_size", 5);
+m_GoalSpeed = Random.Range(-1f, 1f) * fp.GetPropertyWithDefault("goal_speed", 1);
+m_Deviation = fp.GetPropertyWithDefault("deviation", 0);
+m_DeviationFreq = fp.GetPropertyWithDefault("deviation_freq", 0);
 }
 }
```

UnitySDK/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerAcademy.cs (4 changes)

```diff
 Physics.gravity *= gravityMultiplier; //for soccer a multiplier of 3 looks good
 }
-public override void AcademyReset()
+public override void InitializeAcademy()
-Physics.gravity = new Vector3(0, -resetParameters["gravity"], 0);
+FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
 }
 public override void AcademyStep()
```

UnitySDK/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerFieldArea.cs (2 changes)

```diff
 ballRb.velocity = Vector3.zero;
 ballRb.angularVelocity = Vector3.zero;
-var ballScale = m_Academy.resetParameters["ball_scale"];
+var ballScale = m_Academy.FloatProperties.GetPropertyWithDefault("ball_scale", 0.015f);
 ballRb.transform.localScale = new Vector3(ballScale, ballScale, ballScale);
 }
 }
```

UnitySDK/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAcademy.cs (5 changes)

```diff
 public class TennisAcademy : Academy
 {
-public override void AcademyReset()
+public override void InitializeAcademy()
-Physics.gravity = new Vector3(0, -resetParameters["gravity"], 0);
+FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
 }
 public override void AcademyStep()
```

UnitySDK/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs (8 changes)

```diff
 Rigidbody m_AgentRb;
 Rigidbody m_BallRb;
 float m_InvertMult;
-ResetParameters m_ResetParams;
+IFloatProperties m_ResetParams;
 // Looks for the scoreboard based on the name of the gameObjects.
 // Do not modify the names of the Score GameObjects

 var canvas = GameObject.Find(k_CanvasName);
 GameObject scoreBoard;
 var academy = FindObjectOfType<Academy>();
-m_ResetParams = academy.resetParameters;
+m_ResetParams = academy.FloatProperties;
 if (invertX)
 {
 scoreBoard = canvas.transform.Find(k_ScoreBoardBName).gameObject;

 public void SetRacket()
 {
-angle = m_ResetParams["angle"];
+angle = m_ResetParams.GetPropertyWithDefault("angle", 55);
 gameObject.transform.eulerAngles = new Vector3(
 gameObject.transform.eulerAngles.x,
 gameObject.transform.eulerAngles.y,

 public void SetBall()
 {
-scale = m_ResetParams["scale"];
+scale = m_ResetParams.GetPropertyWithDefault("scale", 1);
 ball.transform.localScale = new Vector3(scale, scale, scale);
 }
```

UnitySDK/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAcademy.cs (6 changes)

```diff
 Physics.defaultSolverVelocityIterations = 12;
 Time.fixedDeltaTime = 0.01333f; //(75fps). default is .2 (60fps)
 Time.maximumDeltaTime = .15f; // Default is .33
 }
 public override void AcademyReset()
 {
-Physics.gravity = new Vector3(0, -resetParameters["gravity"], 0);
+FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
 }
 public override void AcademyStep()
```

UnitySDK/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs (14 changes)

```diff
 public class WalkerAgent : Agent
 {
-[Header("Specific to Walker")][Header("Target To Walk Towards")][Space(10)]
+[Header("Specific to Walker")]
+[Header("Target To Walk Towards")]
+[Space(10)]
 public Transform target;
 Vector3 m_DirToTarget;

 Rigidbody m_ChestRb;
 Rigidbody m_SpineRb;
-ResetParameters m_ResetParams;
+IFloatProperties m_ResetParams;
 public override void InitializeAgent()
 {

 m_SpineRb = spine.GetComponent<Rigidbody>();
 var academy = FindObjectOfType<WalkerAcademy>();
-m_ResetParams = academy.resetParameters;
+m_ResetParams = academy.FloatProperties;
 SetResetParameters();
 }

 public void SetTorsoMass()
 {
-m_ChestRb.mass = m_ResetParams["chest_mass"];
-m_SpineRb.mass = m_ResetParams["spine_mass"];
-m_HipsRb.mass = m_ResetParams["hip_mass"];
+m_ChestRb.mass = m_ResetParams.GetPropertyWithDefault("chest_mass", 8);
+m_SpineRb.mass = m_ResetParams.GetPropertyWithDefault("spine_mass", 10);
+m_HipsRb.mass = m_ResetParams.GetPropertyWithDefault("hip_mass", 15);
 }
 public void SetResetParameters()
```

UnitySDK/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs (11 changes)

```diff
 {
 localScale = new Vector3(
 localScale.x,
-m_Academy.resetParameters["no_wall_height"],
+m_Academy.FloatProperties.GetPropertyWithDefault("no_wall_height", 0),
 localScale.z);
 wall.transform.localScale = localScale;
 GiveModel("SmallWallJump", noWallBrain);

 localScale = new Vector3(
 localScale.x,
-m_Academy.resetParameters["small_wall_height"],
+m_Academy.FloatProperties.GetPropertyWithDefault("small_wall_height", 4),
 localScale.z);
 wall.transform.localScale = localScale;
 GiveModel("SmallWallJump", smallWallBrain);

-var height =
-m_Academy.resetParameters["big_wall_min_height"] +
-Random.value * (m_Academy.resetParameters["big_wall_max_height"] -
-m_Academy.resetParameters["big_wall_min_height"]);
+var min = m_Academy.FloatProperties.GetPropertyWithDefault("big_wall_min_height", 8);
+var max = m_Academy.FloatProperties.GetPropertyWithDefault("big_wall_max_height", 8);
+var height = min + Random.value * (max - min);
 localScale = new Vector3(
 localScale.x,
 height,
```

UnitySDK/Assets/ML-Agents/Scripts/Academy.cs (200 changes)

```diff
 namespace MLAgents
 {
 /// <summary>
 /// Wraps the environment-level parameters that are provided within the
 /// Editor. These parameters can be provided for training and inference
 /// modes separately and represent screen resolution, rendering quality and
 /// frame rate.
 /// </summary>
 [System.Serializable]
 public class EnvironmentConfiguration
 {
 [Tooltip("Width of the environment window in pixels.")]
 public int width;
 [Tooltip("Height of the environment window in pixels.")]
 public int height;
 [Tooltip("Rendering quality of environment. (Higher is better quality.)")]
 [Range(0, 5)]
 public int qualityLevel;
 [Tooltip("Speed at which environment is run. (Higher is faster.)")]
 [Range(1f, 100f)]
 public float timeScale;
 [Tooltip("Frames per second (FPS) engine attempts to maintain.")]
 public int targetFrameRate;
 /// Initializes a new instance of the
 /// <see cref="EnvironmentConfiguration"/> class.
 /// <param name="width">Width of environment window (pixels).</param>
 /// <param name="height">Height of environment window (pixels).</param>
 /// <param name="qualityLevel">
 /// Rendering quality of environment. Ranges from 0 to 5, with higher.
 /// </param>
 /// <param name="timeScale">
 /// Speed at which environment is run. Ranges from 1 to 100, with higher
 /// values representing faster speed.
 /// </param>
 /// <param name="targetFrameRate">
 /// Target frame rate (per second) that the engine tries to maintain.
 /// </param>
 public EnvironmentConfiguration(
 int width, int height, int qualityLevel,
 float timeScale, int targetFrameRate)
 {
 this.width = width;
 this.height = height;
 this.qualityLevel = qualityLevel;
 this.timeScale = timeScale;
 this.targetFrameRate = targetFrameRate;
 }
 }
 /// <summary>
 /// An Academy is where Agent objects go to train their behaviors.

 /// Used to restore original value when deriving Academy modifies it
 float m_OriginalMaximumDeltaTime;
 // Fields provided in the Inspector
 [FormerlySerializedAs("trainingConfiguration")]
 [SerializeField]
 [Tooltip("The engine-level settings which correspond to rendering " +
 "quality and engine speed during Training.")]
 EnvironmentConfiguration m_TrainingConfiguration =
 new EnvironmentConfiguration(80, 80, 1, 100.0f, -1);
 [FormerlySerializedAs("inferenceConfiguration")]
 [SerializeField]
 [Tooltip("The engine-level settings which correspond to rendering " +
 "quality and engine speed during Inference.")]
 EnvironmentConfiguration m_InferenceConfiguration =
 new EnvironmentConfiguration(1280, 720, 5, 1.0f, 60);
-/// <summary>
-/// Contains a mapping from parameter names to float values. They are
-/// used in <see cref="AcademyReset"/> and <see cref="AcademyStep"/>
-/// to modify elements in the environment at reset time.
-/// </summary>
-/// <remarks>
-/// Default reset parameters are specified in the academy Editor, and can
-/// be modified when training by passing a config
-/// dictionary at reset.
-/// </remarks>
-[SerializeField]
-[Tooltip("List of custom parameters that can be changed in the " +
-"environment when it resets.")]
-public ResetParameters resetParameters;
-public CommunicatorObjects.CustomResetParametersProto customResetParameters;
 // Fields not provided in the Inspector.

 get { return Communicator != null; }
 }
 /// If true, the Academy will use inference settings. This field is
 /// initialized in <see cref="Awake"/> depending on the presence
 /// or absence of a communicator. Furthermore, it can be modified during
 /// training via <see cref="SetIsInference"/>.
 bool m_IsInference = true;
 /// The number of episodes completed by the environment. Incremented
 /// each time the environment is reset.
 int m_EpisodeCount;

 /// The number of total number of steps completed during the whole simulation. Incremented
 /// each time a step is taken in the environment.
 int m_TotalStepCount;
 /// Flag that indicates whether the inference/training mode of the
 /// environment was switched by the training process. This impacts the
 /// engine settings at the next environment step.
 bool m_ModeSwitched;
 /// Pointer to the communicator currently in use by the Academy.
 public ICommunicator Communicator;

 m_OriginalFixedDeltaTime = Time.fixedDeltaTime;
 m_OriginalMaximumDeltaTime = Time.maximumDeltaTime;
 InitializeAcademy();
 InitializeAcademy();
 // Try to launch the communicator by using the arguments passed at launch
 try

 if (Communicator != null)
 {
 Communicator.RegisterSideChannel(new EngineConfigurationChannel());
+Communicator.RegisterSideChannel(floatProperties);
 // We try to exchange the first message with Python. If this fails, it means
 // no Python Process is ready to train the environment. In this case, the
 // environment must use Inference.

 {
 version = k_ApiVersion,
 name = gameObject.name,
-environmentResetParameters = new EnvironmentResetParameters
-{
-resetParameters = resetParameters,
-customResetParameters = customResetParameters
-}
 });
 Random.InitState(unityRLInitParameters.seed);
 }

 {
 Communicator.QuitCommandReceived += OnQuitCommandReceived;
 Communicator.ResetCommandReceived += OnResetCommand;
 Communicator.RLInputReceived += OnRLInputReceived;
 Communicator.RegisterSideChannel(new EngineConfigurationChannel());
+Communicator.RegisterSideChannel(floatProperties);
 }
 }

 SetIsInference(!IsCommunicatorOn);
 DecideAction += () => { };
 DestroyAction += () => { };
 AgentSetStatus += i => { };

 AgentForceReset += () => { };
 ConfigureEnvironment();
 }
 static void OnQuitCommandReceived()

 Application.Quit();
 }
-void OnResetCommand(EnvironmentResetParameters newResetParameters)
+void OnResetCommand()
-UpdateResetParameters(newResetParameters);
 void OnRLInputReceived(UnityRLInputParameters inputParams)
 {
 m_IsInference = !inputParams.isTraining;
 }
-void UpdateResetParameters(EnvironmentResetParameters newResetParameters)
-{
-if (newResetParameters.resetParameters != null)
-{
-foreach (var kv in newResetParameters.resetParameters)
-{
-resetParameters[kv.Key] = kv.Value;
-}
-}
-customResetParameters = newResetParameters.customResetParameters;
-}
 /// <summary>
 /// Configures the environment settings depending on the training/inference
 /// mode and the corresponding parameters passed in the Editor.
 /// </summary>
 void ConfigureEnvironment()
 {
 if (m_IsInference)
 {
 ConfigureEnvironmentHelper(m_InferenceConfiguration);
 Monitor.SetActive(true);
 }
 else
 {
 ConfigureEnvironmentHelper(m_TrainingConfiguration);
 Monitor.SetActive(false);
 }
 }
 /// <summary>
 /// Helper method for initializing the environment based on the provided
 /// configuration.
 /// </summary>
 /// <param name="config">
 /// Environment configuration (specified in the Editor).
 /// </param>
 static void ConfigureEnvironmentHelper(EnvironmentConfiguration config)
 {
 Screen.SetResolution(config.width, config.height, false);
 QualitySettings.SetQualityLevel(config.qualityLevel, true);
 Time.timeScale = config.timeScale;
 Time.captureFramerate = 60;
 Application.targetFrameRate = config.targetFrameRate;
 }
 /// <summary>
 /// Initializes the academy and environment. Called during the waking-up
 /// phase of the environment before any of the scene objects/agents have

 {
 }
 /// <summary>
 /// Returns the <see cref="m_IsInference"/> flag.
 /// </summary>
 /// <returns>
 /// <c>true</c>, if current mode is inference, <c>false</c> if training.
 /// </returns>
 public bool GetIsInference()
 {
 return m_IsInference;
 }
 /// <summary>
 /// Sets the <see cref="m_IsInference"/> flag to the provided value. If
 /// the new flag differs from the current flag value, this signals that
 /// the environment configuration needs to be updated.
 /// </summary>
 /// <param name="isInference">
 /// Environment mode, if true then inference, otherwise training.
 /// </param>
 public void SetIsInference(bool isInference)
 {
 if (m_IsInference != isInference)
 {
 m_IsInference = isInference;
 // This signals to the academy that at the next environment step
 // the engine configurations need updating to the respective mode
 // (i.e. training vs inference) configuration.
 m_ModeSwitched = true;
 }
 }
 /// <summary>
 /// Returns the current episode counter.

 /// </summary>
 void EnvironmentStep()
 {
 if (m_ModeSwitched)
 {
 ConfigureEnvironment();
 m_ModeSwitched = false;
 }
 if (!m_FirstAcademyReset)
 {
 ForcedFullReset();
```
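The EngineConfigurationChannel registered in Academy.cs carries the five engine settings shown above (width, height, quality level, time scale, target frame rate). The channel's actual wire format is not shown in this diff; as a hedged sketch of the idea, a payload could pack the fields like this, using the training defaults from the Academy.cs excerpt (80, 80, 1, 100.0, -1) as sample values:

```python
import struct
from typing import Tuple

# Hypothetical encoding: four int32s and one float32, little-endian.
# Only an illustration -- not the real mlagents message layout.
_FORMAT = "<iiifi"  # width, height, quality_level, time_scale, target_frame_rate


def encode_engine_config(width: int, height: int, quality_level: int,
                         time_scale: float, target_frame_rate: int) -> bytes:
    """Pack the five engine settings into a fixed-size byte payload."""
    return struct.pack(_FORMAT, width, height, quality_level,
                       time_scale, target_frame_rate)


def decode_engine_config(data: bytes) -> Tuple[int, int, int, float, int]:
    """Unpack a payload produced by encode_engine_config."""
    return struct.unpack(_FORMAT, data)


payload = encode_engine_config(80, 80, 1, 100.0, -1)
print(decode_engine_config(payload))  # (80, 80, 1, 100.0, -1)
```

A fixed binary layout like this is one way a Python sender and a C# receiver can agree on a message without touching the gRPC protobuf definitions, which is the point of moving engine configuration into a side channel.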

UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/UnityRlInitializationOutput.cs (52 changes)

```diff
 "CkdtbGFnZW50cy9lbnZzL2NvbW11bmljYXRvcl9vYmplY3RzL3VuaXR5X3Js",
 "X2luaXRpYWxpemF0aW9uX291dHB1dC5wcm90bxIUY29tbXVuaWNhdG9yX29i",
 "amVjdHMaOW1sYWdlbnRzL2VudnMvY29tbXVuaWNhdG9yX29iamVjdHMvYnJh",
-"aW5fcGFyYW1ldGVycy5wcm90bxo/bWxhZ2VudHMvZW52cy9jb21tdW5pY2F0",
-"b3Jfb2JqZWN0cy9lbnZpcm9ubWVudF9wYXJhbWV0ZXJzLnByb3RvIusBCiBV",
-"bml0eVJMSW5pdGlhbGl6YXRpb25PdXRwdXRQcm90bxIMCgRuYW1lGAEgASgJ",
-"Eg8KB3ZlcnNpb24YAiABKAkSEAoIbG9nX3BhdGgYAyABKAkSRAoQYnJhaW5f",
-"cGFyYW1ldGVycxgFIAMoCzIqLmNvbW11bmljYXRvcl9vYmplY3RzLkJyYWlu",
-"UGFyYW1ldGVyc1Byb3RvElAKFmVudmlyb25tZW50X3BhcmFtZXRlcnMYBiAB",
-"KAsyMC5jb21tdW5pY2F0b3Jfb2JqZWN0cy5FbnZpcm9ubWVudFBhcmFtZXRl",
-"cnNQcm90b0IfqgIcTUxBZ2VudHMuQ29tbXVuaWNhdG9yT2JqZWN0c2IGcHJv",
-"dG8z"));
+"aW5fcGFyYW1ldGVycy5wcm90byKfAQogVW5pdHlSTEluaXRpYWxpemF0aW9u",
+"T3V0cHV0UHJvdG8SDAoEbmFtZRgBIAEoCRIPCgd2ZXJzaW9uGAIgASgJEhAK",
+"CGxvZ19wYXRoGAMgASgJEkQKEGJyYWluX3BhcmFtZXRlcnMYBSADKAsyKi5j",
+"b21tdW5pY2F0b3Jfb2JqZWN0cy5CcmFpblBhcmFtZXRlcnNQcm90b0oECAYQ",
+"B0IfqgIcTUxBZ2VudHMuQ29tbXVuaWNhdG9yT2JqZWN0c2IGcHJvdG8z"));
-new pbr::FileDescriptor[] { global::MLAgents.CommunicatorObjects.BrainParametersReflection.Descriptor, global::MLAgents.CommunicatorObjects.EnvironmentParametersReflection.Descriptor, },
+new pbr::FileDescriptor[] { global::MLAgents.CommunicatorObjects.BrainParametersReflection.Descriptor, },
-new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInitializationOutputProto), global::MLAgents.CommunicatorObjects.UnityRLInitializationOutputProto.Parser, new[]{ "Name", "Version", "LogPath", "BrainParameters", "EnvironmentParameters" }, null, null, null)
+new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInitializationOutputProto), global::MLAgents.CommunicatorObjects.UnityRLInitializationOutputProto.Parser, new[]{ "Name", "Version", "LogPath", "BrainParameters" }, null, null, null)
 }));
 }
 #endregion

 version_ = other.version_;
 logPath_ = other.logPath_;
 brainParameters_ = other.brainParameters_.Clone();
-EnvironmentParameters = other.environmentParameters_ != null ? other.EnvironmentParameters.Clone() : null;
 _unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
 }

 get { return brainParameters_; }
 }
-/// <summary>Field number for the "environment_parameters" field.</summary>
-public const int EnvironmentParametersFieldNumber = 6;
-private global::MLAgents.CommunicatorObjects.EnvironmentParametersProto environmentParameters_;
-[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
-public global::MLAgents.CommunicatorObjects.EnvironmentParametersProto EnvironmentParameters {
-get { return environmentParameters_; }
-set {
-environmentParameters_ = value;
-}
-}
 [global::System.Diagnostics.DebuggerNonUserCodeAttribute]
 public override bool Equals(object other) {
 return Equals(other as UnityRLInitializationOutputProto);

 if (Version != other.Version) return false;
 if (LogPath != other.LogPath) return false;
 if(!brainParameters_.Equals(other.brainParameters_)) return false;
-if (!object.Equals(EnvironmentParameters, other.EnvironmentParameters)) return false;
 return Equals(_unknownFields, other._unknownFields);
 }

 if (Version.Length != 0) hash ^= Version.GetHashCode();
 if (LogPath.Length != 0) hash ^= LogPath.GetHashCode();
 hash ^= brainParameters_.GetHashCode();
-if (environmentParameters_ != null) hash ^= EnvironmentParameters.GetHashCode();
 if (_unknownFields != null) {
 hash ^= _unknownFields.GetHashCode();
 }

 output.WriteString(LogPath);
 }
 brainParameters_.WriteTo(output, _repeated_brainParameters_codec);
-if (environmentParameters_ != null) {
-output.WriteRawTag(50);
-output.WriteMessage(EnvironmentParameters);
-}
 if (_unknownFields != null) {
 _unknownFields.WriteTo(output);
 }

 size += 1 + pb::CodedOutputStream.ComputeStringSize(LogPath);
 }
 size += brainParameters_.CalculateSize(_repeated_brainParameters_codec);
-if (environmentParameters_ != null) {
```
size += 1 + pb::CodedOutputStream.ComputeMessageSize(EnvironmentParameters);
}
if (_unknownFields != null) {
size += _unknownFields.CalculateSize();
}

LogPath = other.LogPath;
}
brainParameters_.Add(other.brainParameters_);
if (other.environmentParameters_ != null) {
if (environmentParameters_ == null) {
environmentParameters_ = new global::MLAgents.CommunicatorObjects.EnvironmentParametersProto();
}
EnvironmentParameters.MergeFrom(other.EnvironmentParameters);
}
_unknownFields = pb::UnknownFieldSet.MergeFrom(_unknownFields, other._unknownFields);
}

}
case 42: {
brainParameters_.AddEntriesFrom(input, _repeated_brainParameters_codec);
break;
}
case 50: {
if (environmentParameters_ == null) {
environmentParameters_ = new global::MLAgents.CommunicatorObjects.EnvironmentParametersProto();
}
input.ReadMessage(environmentParameters_);
break;
}
}

UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/UnityRlInput.cs


"CjdtbGFnZW50cy9lbnZzL2NvbW11bmljYXRvcl9vYmplY3RzL3VuaXR5X3Js",
"X2lucHV0LnByb3RvEhRjb21tdW5pY2F0b3Jfb2JqZWN0cxo1bWxhZ2VudHMv",
"ZW52cy9jb21tdW5pY2F0b3Jfb2JqZWN0cy9hZ2VudF9hY3Rpb24ucHJvdG8a",
"P21sYWdlbnRzL2VudnMvY29tbXVuaWNhdG9yX29iamVjdHMvZW52aXJvbm1l",
"bnRfcGFyYW1ldGVycy5wcm90bxowbWxhZ2VudHMvZW52cy9jb21tdW5pY2F0",
"b3Jfb2JqZWN0cy9jb21tYW5kLnByb3RvItkDChFVbml0eVJMSW5wdXRQcm90",
"bxJQCg1hZ2VudF9hY3Rpb25zGAEgAygLMjkuY29tbXVuaWNhdG9yX29iamVj",
"dHMuVW5pdHlSTElucHV0UHJvdG8uQWdlbnRBY3Rpb25zRW50cnkSUAoWZW52",
"aXJvbm1lbnRfcGFyYW1ldGVycxgCIAEoCzIwLmNvbW11bmljYXRvcl9vYmpl",
"Y3RzLkVudmlyb25tZW50UGFyYW1ldGVyc1Byb3RvEhMKC2lzX3RyYWluaW5n",
"GAMgASgIEjMKB2NvbW1hbmQYBCABKA4yIi5jb21tdW5pY2F0b3Jfb2JqZWN0",
"cy5Db21tYW5kUHJvdG8SFAoMc2lkZV9jaGFubmVsGAUgASgMGk0KFExpc3RB",
"Z2VudEFjdGlvblByb3RvEjUKBXZhbHVlGAEgAygLMiYuY29tbXVuaWNhdG9y",
"X29iamVjdHMuQWdlbnRBY3Rpb25Qcm90bxpxChFBZ2VudEFjdGlvbnNFbnRy",
"eRILCgNrZXkYASABKAkSSwoFdmFsdWUYAiABKAsyPC5jb21tdW5pY2F0b3Jf",
"b2JqZWN0cy5Vbml0eVJMSW5wdXRQcm90by5MaXN0QWdlbnRBY3Rpb25Qcm90",
"bzoCOAFCH6oCHE1MQWdlbnRzLkNvbW11bmljYXRvck9iamVjdHNiBnByb3Rv",
"Mw=="));
"MG1sYWdlbnRzL2VudnMvY29tbXVuaWNhdG9yX29iamVjdHMvY29tbWFuZC5w",
"cm90byL+AgoRVW5pdHlSTElucHV0UHJvdG8SUAoNYWdlbnRfYWN0aW9ucxgB",
"IAMoCzI5LmNvbW11bmljYXRvcl9vYmplY3RzLlVuaXR5UkxJbnB1dFByb3Rv",
"LkFnZW50QWN0aW9uc0VudHJ5EjMKB2NvbW1hbmQYBCABKA4yIi5jb21tdW5p",
"Y2F0b3Jfb2JqZWN0cy5Db21tYW5kUHJvdG8SFAoMc2lkZV9jaGFubmVsGAUg",
"ASgMGk0KFExpc3RBZ2VudEFjdGlvblByb3RvEjUKBXZhbHVlGAEgAygLMiYu",
"Y29tbXVuaWNhdG9yX29iamVjdHMuQWdlbnRBY3Rpb25Qcm90bxpxChFBZ2Vu",
"dEFjdGlvbnNFbnRyeRILCgNrZXkYASABKAkSSwoFdmFsdWUYAiABKAsyPC5j",
"b21tdW5pY2F0b3Jfb2JqZWN0cy5Vbml0eVJMSW5wdXRQcm90by5MaXN0QWdl",
"bnRBY3Rpb25Qcm90bzoCOAFKBAgCEANKBAgDEARCH6oCHE1MQWdlbnRzLkNv",
"bW11bmljYXRvck9iamVjdHNiBnByb3RvMw=="));
new pbr::FileDescriptor[] { global::MLAgents.CommunicatorObjects.AgentActionReflection.Descriptor, global::MLAgents.CommunicatorObjects.EnvironmentParametersReflection.Descriptor, global::MLAgents.CommunicatorObjects.CommandReflection.Descriptor, },
new pbr::FileDescriptor[] { global::MLAgents.CommunicatorObjects.AgentActionReflection.Descriptor, global::MLAgents.CommunicatorObjects.CommandReflection.Descriptor, },
new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInputProto), global::MLAgents.CommunicatorObjects.UnityRLInputProto.Parser, new[]{ "AgentActions", "EnvironmentParameters", "IsTraining", "Command", "SideChannel" }, null, null, new pbr::GeneratedClrTypeInfo[] { new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInputProto.Types.ListAgentActionProto), global::MLAgents.CommunicatorObjects.UnityRLInputProto.Types.ListAgentActionProto.Parser, new[]{ "Value" }, null, null, null),
new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInputProto), global::MLAgents.CommunicatorObjects.UnityRLInputProto.Parser, new[]{ "AgentActions", "Command", "SideChannel" }, null, null, new pbr::GeneratedClrTypeInfo[] { new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.UnityRLInputProto.Types.ListAgentActionProto), global::MLAgents.CommunicatorObjects.UnityRLInputProto.Types.ListAgentActionProto.Parser, new[]{ "Value" }, null, null, null),
null, })
}));
}

[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public UnityRLInputProto(UnityRLInputProto other) : this() {
agentActions_ = other.agentActions_.Clone();
EnvironmentParameters = other.environmentParameters_ != null ? other.EnvironmentParameters.Clone() : null;
isTraining_ = other.isTraining_;
command_ = other.command_;
sideChannel_ = other.sideChannel_;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);

get { return agentActions_; }
}
/// <summary>Field number for the "environment_parameters" field.</summary>
public const int EnvironmentParametersFieldNumber = 2;
private global::MLAgents.CommunicatorObjects.EnvironmentParametersProto environmentParameters_;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public global::MLAgents.CommunicatorObjects.EnvironmentParametersProto EnvironmentParameters {
get { return environmentParameters_; }
set {
environmentParameters_ = value;
}
}
/// <summary>Field number for the "is_training" field.</summary>
public const int IsTrainingFieldNumber = 3;
private bool isTraining_;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public bool IsTraining {
get { return isTraining_; }
set {
isTraining_ = value;
}
}
/// <summary>Field number for the "command" field.</summary>
public const int CommandFieldNumber = 4;
private global::MLAgents.CommunicatorObjects.CommandProto command_ = 0;

return true;
}
if (!AgentActions.Equals(other.AgentActions)) return false;
if (!object.Equals(EnvironmentParameters, other.EnvironmentParameters)) return false;
if (IsTraining != other.IsTraining) return false;
if (Command != other.Command) return false;
if (SideChannel != other.SideChannel) return false;
return Equals(_unknownFields, other._unknownFields);

public override int GetHashCode() {
int hash = 1;
hash ^= AgentActions.GetHashCode();
if (environmentParameters_ != null) hash ^= EnvironmentParameters.GetHashCode();
if (IsTraining != false) hash ^= IsTraining.GetHashCode();
if (Command != 0) hash ^= Command.GetHashCode();
if (SideChannel.Length != 0) hash ^= SideChannel.GetHashCode();
if (_unknownFields != null) {

[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public void WriteTo(pb::CodedOutputStream output) {
agentActions_.WriteTo(output, _map_agentActions_codec);
if (environmentParameters_ != null) {
output.WriteRawTag(18);
output.WriteMessage(EnvironmentParameters);
}
if (IsTraining != false) {
output.WriteRawTag(24);
output.WriteBool(IsTraining);
}
if (Command != 0) {
output.WriteRawTag(32);
output.WriteEnum((int) Command);

public int CalculateSize() {
int size = 0;
size += agentActions_.CalculateSize(_map_agentActions_codec);
if (environmentParameters_ != null) {
size += 1 + pb::CodedOutputStream.ComputeMessageSize(EnvironmentParameters);
}
if (IsTraining != false) {
size += 1 + 1;
}
if (Command != 0) {
size += 1 + pb::CodedOutputStream.ComputeEnumSize((int) Command);
}

return;
}
agentActions_.Add(other.agentActions_);
if (other.environmentParameters_ != null) {
if (environmentParameters_ == null) {
environmentParameters_ = new global::MLAgents.CommunicatorObjects.EnvironmentParametersProto();
}
EnvironmentParameters.MergeFrom(other.EnvironmentParameters);
}
if (other.IsTraining != false) {
IsTraining = other.IsTraining;
}
if (other.Command != 0) {
Command = other.Command;
}

break;
case 10: {
agentActions_.AddEntriesFrom(input, _map_agentActions_codec);
break;
}
case 18: {
if (environmentParameters_ == null) {
environmentParameters_ = new global::MLAgents.CommunicatorObjects.EnvironmentParametersProto();
}
input.ReadMessage(environmentParameters_);
break;
}
case 24: {
IsTraining = input.ReadBool();
break;
}
case 32: {

UnitySDK/Assets/ML-Agents/Scripts/Grpc/GrpcExtensions.cs


return bp;
}
/// <summary>
/// Convert a MapField to ResetParameters.
/// </summary>
/// <param name="floatParams">The mapping of strings to floats from a protobuf MapField.</param>
/// <returns>A new ResetParameters instance.</returns>
public static ResetParameters ToResetParameters(this MapField<string, float> floatParams)
{
return new ResetParameters(floatParams);
}
/// <summary>
/// Convert an EnvironmentParametersProto protobuf object to an EnvironmentResetParameters struct.
/// </summary>
/// <param name="epp">The instance of the EnvironmentParametersProto object.</param>
/// <returns>A new EnvironmentResetParameters struct.</returns>
public static EnvironmentResetParameters ToEnvironmentResetParameters(this EnvironmentParametersProto epp)
{
return new EnvironmentResetParameters
{
resetParameters = epp.FloatParameters?.ToResetParameters(),
customResetParameters = epp.CustomResetParameters
};
}
public static UnityRLInitParameters ToUnityRLInitParameters(this UnityRLInitializationInputProto inputProto)
{

UnitySDK/Assets/ML-Agents/Scripts/Grpc/RpcCommunicator.cs


{
public event QuitCommandHandler QuitCommandReceived;
public event ResetCommandHandler ResetCommandReceived;
public event RLInputReceivedHandler RLInputReceived;
/// If true, the communication is active.
bool m_IsOpen;

Version = initParameters.version
};
academyParameters.EnvironmentParameters = new EnvironmentParametersProto();
var resetParameters = initParameters.environmentResetParameters.resetParameters;
foreach (var key in resetParameters.Keys)
{
academyParameters.EnvironmentParameters.FloatParameters.Add(key, resetParameters[key]);
}
UnityInputProto input;
UnityInputProto initializationInput;
try

void UpdateEnvironmentWithInput(UnityRLInputProto rlInput)
{
SendRLInputReceivedEvent(rlInput.IsTraining);
SendCommandEvent(rlInput.Command, rlInput.EnvironmentParameters);
SendCommandEvent(rlInput.Command);
}
UnityInputProto Initialize(UnityOutputProto unityOutput,

#region Sending Events
void SendCommandEvent(CommandProto command, EnvironmentParametersProto environmentParametersProto)
void SendCommandEvent(CommandProto command)
{
switch (command)
{

}
case CommandProto.Reset:
{
ResetCommandReceived?.Invoke(environmentParametersProto.ToEnvironmentResetParameters());
ResetCommandReceived?.Invoke();
return;
}
default:

}
}
void SendRLInputReceivedEvent(bool isTraining)
{
RLInputReceived?.Invoke(new UnityRLInputParameters { isTraining = isTraining });
}
#endregion

UnitySDK/Assets/ML-Agents/Scripts/ICommunicator.cs


namespace MLAgents
{
public struct EnvironmentResetParameters
{
/// <summary>
/// Mapping of string : float which defines which parameters can be
/// reset from python.
/// </summary>
public ResetParameters resetParameters;
/// <summary>
/// The protobuf for custom reset parameters.
/// NOTE: This is the last remaining relic of gRPC protocol
/// that is left in our code. We need to decide how to handle this
/// moving forward.
/// </summary>
public CustomResetParametersProto customResetParameters;
}
public struct CommunicatorInitParameters
{
/// <summary>

/// The version of the Unity SDK.
/// </summary>
public string version;
/// <summary>
/// The set of environment parameters defined by the user that will be sent to the communicator.
/// </summary>
public EnvironmentResetParameters environmentResetParameters;
}
public struct UnityRLInitParameters
{

/// Delegate for handling reset parameter updates sent from the communicator.
/// </summary>
/// <param name="resetParams"></param>
public delegate void ResetCommandHandler(EnvironmentResetParameters resetParams);
public delegate void ResetCommandHandler();
/// <summary>
/// Delegate to handle UnityRLInputParameters updates from the communicator.

/// Reset command sent back from the communicator.
/// </summary>
event ResetCommandHandler ResetCommandReceived;
/// <summary>
/// Unity RL Input was received by the communicator.
/// </summary>
event RLInputReceivedHandler RLInputReceived;
/// <summary>
/// Sends the academy parameters through the Communicator.

UnitySDK/Assets/ML-Agents/Scripts/SideChannel/EngineConfigurationChannel.cs


var timeScale = binaryReader.ReadSingle();
var targetFrameRate = binaryReader.ReadInt32();
timeScale = Mathf.Clamp(timeScale, 1, 100);
Screen.SetResolution(width, height, false);
QualitySettings.SetQualityLevel(qualityLevel, true);
Time.timeScale = timeScale;
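On the wire, the engine-configuration message is just the fields read above packed back-to-back. A hedged Python sketch of a matching sender follows; the preceding `width`, `height`, and `qualityLevel` reads are assumed from the truncated snippet to be `Int32`, and C#'s `BinaryReader` reads little-endian:

```python
import struct

def encode_engine_config(width: int, height: int, quality_level: int,
                         time_scale: float, target_frame_rate: int) -> bytes:
    # Three Int32s, one Single (float32), one Int32 -- mirroring the
    # BinaryReader.ReadInt32()/ReadSingle() calls on the C# side.
    return struct.pack("<iiifi", width, height, quality_level,
                       time_scale, target_frame_rate)

payload = encode_engine_config(84, 84, 5, 20.0, -1)
```

The real `EngineConfigurationChannel` in Python performs this serialization internally; the sketch only illustrates the byte layout.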

docs/Basic-Guide.md


page](Learning-Environment-Executable.md) for instructions on how to build and
use an executable.
```console
ml-agents$ mlagents-learn config/trainer_config.yaml --run-id=first-run --train
▄▄▄▓▓▓▓
╓▓▓▓▓▓▓█▓▓▓▓▓
,▄▄▄m▀▀▀' ,▓▓▓▀▓▓▄ ▓▓▓ ▓▓▌
▄▓▓▓▀' ▄▓▓▀ ▓▓▓ ▄▄ ▄▄ ,▄▄ ▄▄▄▄ ,▄▄ ▄▓▓▌▄ ▄▄▄ ,▄▄
▄▓▓▓▀ ▄▓▓▀ ▐▓▓▌ ▓▓▌ ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌ ╒▓▓▌
▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓ ▓▀ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▄ ▓▓▌
▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄ ▓▓ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▐▓▓
^█▓▓▓ ▀▓▓▄ ▐▓▓▌ ▓▓▓▓▄▓▓▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▓▄ ▓▓▓▓`
'▀▓▓▓▄ ^▓▓▓ ▓▓▓ └▀▀▀▀ ▀▀ ^▀▀ `▀▀ `▀▀ '▀▀ ▐▓▓▌
▀▀▀▀▓▄▄▄ ▓▓▓▓▓▓, ▓▓▓▓▀
`▀█▓▓▓▓▓▓▓▓▓▌
¬`▀▀▀█▓
INFO:mlagents.learn:{'--curriculum': 'None',
'--docker-target-name': 'Empty',
'--env': 'None',
'--help': False,
'--keep-checkpoints': '5',
'--lesson': '0',
'--load': False,
'--no-graphics': False,
'--num-runs': '1',
'--run-id': 'first-run',
'--save-freq': '50000',
'--seed': '-1',
'--slow': False,
'--train': True,
'--worker-id': '0',
'<trainer-config-path>': 'config/trainer_config.yaml'}
INFO:mlagents.envs:Start training by pressing the Play button in the Unity Editor.
```
**Note**: If you're using Anaconda, don't forget to activate the ml-agents
environment first.

INFO:mlagents.envs:
'Ball3DAcademy' started successfully!
Unity Academy name: Ball3DAcademy
Reset Parameters : {}
INFO:mlagents.envs:Connected new brain:
Unity brain name: 3DBallLearning

docs/Learning-Environment-Design-Academy.md


you want to add elements to the environment at random intervals, you can put the
logic for creating them in the `AcademyStep()` function.
## Academy Properties
![Academy Inspector](images/academy.png)
* `Configuration` - The engine-level settings which correspond to rendering
quality and engine speed.
* `Width` - Width of the environment window in pixels.
* `Height` - Height of the environment window in pixels.
* `Quality Level` - Rendering quality of environment. (Higher is better)
* `Time Scale` - Speed at which environment is run. (Higher is faster)
* `Target Frame Rate` - FPS engine attempts to maintain.
* `Reset Parameters` - List of custom parameters that can be changed in the
environment on reset.

docs/Learning-Environment-Examples.md


* Vector Action space: (Discrete) Two possible actions (Move left, move
right).
* Visual Observations: None
* Reset Parameters: None
* Float Properties: None
* Benchmark Mean Reward: 0.94
## [3DBall: 3D Balance Ball](https://youtu.be/dheeCO29-EI)

* Vector Action space: (Continuous) Size of 2, with one value corresponding to
X-rotation, and the other to Z-rotation.
* Visual Observations: None.
* Reset Parameters: Three
* Float Properties: Three
* scale: Specifies the scale of the ball in the 3 dimensions (equal across the three dimensions)
* Default: 1
* Recommended Minimum: 0.2

using the `Mask Actions` checkbox within the `trueAgent` GameObject).
The trained model file provided was generated with action masking turned on.
* Visual Observations: One corresponding to top-down view of GridWorld.
* Reset Parameters: Three, corresponding to grid size, number of obstacles, and
* Float Properties: Three, corresponding to grid size, number of obstacles, and
number of goals.
* Benchmark Mean Reward: 0.8

* Vector Action space: (Continuous) Size of 2, corresponding to movement
toward net or away from net, and jumping.
* Visual Observations: None
* Reset Parameters: Three
* Float Properties: Three
* angle: Angle of the racket from the vertical (Y) axis.
* Default: 55
* Recommended Minimum: 35

`VisualPushBlock` scene. __The visual observation version of
this environment does not train with the provided default
training parameters.__
* Reset Parameters: Four
* Float Properties: Four
* block_scale: Scale of the block along the x and z dimensions
* Default: 2
* Recommended Minimum: 0.5

* Side Motion (3 possible actions: Left, Right, No Action)
* Jump (2 possible actions: Jump, No Action)
* Visual Observations: None
* Reset Parameters: Four
* Float Properties: Four
* Benchmark Mean Reward (Big & Small Wall): 0.8
## [Reacher](https://youtu.be/2N9EoF6pQyE)

* Vector Action space: (Continuous) Size of 4, corresponding to torque
applicable to two joints.
* Visual Observations: None.
* Reset Parameters: Five
* Float Properties: Five
* goal_size: radius of the goal zone
* Default: 5
* Recommended Minimum: 1

* Vector Action space: (Continuous) Size of 20, corresponding to target
rotations for joints.
* Visual Observations: None
* Reset Parameters: None
* Float Properties: None
* Benchmark Mean Reward for `CrawlerStaticTarget`: 2000
* Benchmark Mean Reward for `CrawlerDynamicTarget`: 400

`VisualFoodCollector` scene. __The visual observation version of
this environment does not train with the provided default
training parameters.__
* Reset Parameters: Two
* Float Properties: Two
* laser_length: Length of the laser used by the agent
* Default: 1
* Recommended Minimum: 0.2

`VisualHallway` scene. __The visual observation version of
this environment does not train with the provided default
training parameters.__
* Reset Parameters: None
* Float Properties: None
* Benchmark Mean Reward: 0.7
* To speed up training, you can enable curiosity by adding `use_curiosity: true` in `config/trainer_config.yaml`

* Vector Action space: (Continuous) 3 corresponding to agent force applied for
the jump.
* Visual Observations: None
* Reset Parameters: Two
* Float Properties: Two
* target_scale: The scale of the green cube in the 3 dimensions
* Default: 150
* Recommended Minimum: 50

as well as rotation.
* Goalie: 4 actions corresponding to forward, backward, sideways movement.
* Visual Observations: None
* Reset Parameters: Two
* Float Properties: Two
* ball_scale: Specifies the scale of the ball in the 3 dimensions (equal across the three dimensions)
* Default: 7.5
* Recommended minimum: 4

* Vector Action space: (Continuous) Size of 39, corresponding to target
rotations applicable to the joints.
* Visual Observations: None
* Reset Parameters: Four
* Float Properties: Four
* gravity: Magnitude of gravity
* Default: 9.81
* Recommended Minimum:

`VisualPyramids` scene. __The visual observation version of
this environment does not train with the provided default
training parameters.__
* Reset Parameters: None
* Float Properties: None
* Benchmark Mean Reward: 1.75

docs/Learning-Environment-Executable.md


`▀█▓▓▓▓▓▓▓▓▓▌
¬`▀▀▀█▓
INFO:mlagents.learn:{'--curriculum': 'None',
'--docker-target-name': 'Empty',
'--env': '3DBall',
'--help': False,
'--keep-checkpoints': '5',
'--lesson': '0',
'--load': False,
'--no-graphics': False,
'--num-runs': '1',
'--run-id': 'firstRun',
'--save-freq': '50000',
'--seed': '-1',
'--slow': False,
'--train': True,
'--worker-id': '0',
'<trainer-config-path>': 'config/trainer_config.yaml'}
```
**Note**: If you're using Anaconda, don't forget to activate the ml-agents

INFO:mlagents.envs:
'Ball3DAcademy' started successfully!
Unity Academy name: Ball3DAcademy
Reset Parameters : {}
INFO:mlagents.envs:Connected new brain:
Unity brain name: Ball3DLearning

docs/Migrating.md


# Migrating
## Migrating from master to develop
### Important changes
* `CustomResetParameters` are now removed.
* `reset()` on the Low-Level Python API no longer takes a `train_mode` argument. To modify the performance/speed of the engine, you must use an `EngineConfigurationChannel`.
* `reset()` on the Low-Level Python API no longer takes a `config` argument. `UnityEnvironment` no longer has a `reset_parameters` field. To modify float properties in the environment, you must use a `FloatPropertiesChannel`. For more information, refer to the [Low Level Python API documentation](Python-API.md)
* The Academy no longer has a `Training Configuration` nor `Inference Configuration` field in the inspector. To modify the configuration from the Low-Level Python API, use an `EngineConfigurationChannel`. To modify it during training, use the new command line arguments `--width`, `--height`, `--quality-level`, `--time-scale` and `--target-frame-rate` in `mlagents-learn`.
* The Academy no longer has a `Default Reset Parameters` field in the inspector. The Academy class no longer has a `ResetParameters` field. To access shared float properties with Python, use the new `FloatProperties` field on the Academy.
### Steps to Migrate
* If you had a custom `Training Configuration` in the Academy inspector, you will need to pass your custom configuration at every training run using the new command line arguments `--width`, `--height`, `--quality-level`, `--time-scale` and `--target-frame-rate`.
* If you were using `--slow` in `mlagents-learn`, you will need to pass your old `Inference Configuration` of the Academy inspector with the new command line arguments `--width`, `--height`, `--quality-level`, `--time-scale` and `--target-frame-rate` instead.
## Migrating from ML-Agents toolkit v0.11.0 to v0.12.0
### Important Changes

docs/Python-API.md


- **Print : `print(str(env))`**
Prints all parameters relevant to the loaded environment and the
Brains.
- **Reset : `env.reset(train_mode=True, config=None)`**
- **Reset : `env.reset()`**
- `train_mode` indicates whether to run the environment in train (`True`) or
test (`False`) mode.
- `config` is an optional dictionary of configuration flags specific to the
environment. For generic environments, `config` can be ignored. `config` is
a dictionary of strings to floats where the keys are the names of the
`resetParameters` and the values are their corresponding float values.
Define the reset parameters on the Academy Inspector window in the Unity
Editor.
- **Step : `env.step(action)`**
Sends a step signal to the environment using the actions. For each Brain :
- `action` can be one dimensional arrays or two dimensional arrays if you have

- **Close : `env.close()`**
Sends a shutdown signal to the environment and closes the communication
socket.
### Modifying the environment from Python
The Environment can be modified by using side channels to send data to the
environment. When creating the environment, pass a list of side channels as
`side_channels` argument to the constructor.
__Note__: A side channel will only send/receive messages when `env.step` is
called.
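Conceptually, each side channel queues its outgoing messages, and the environment flushes the queue while building the next step request. A minimal illustrative sketch of that buffering, with hypothetical names:

```python
class BufferedChannelSketch:
    """Illustrative only: shows why data moves only on step()/reset()."""

    def __init__(self) -> None:
        self._outgoing: list = []

    def queue_message_to_send(self, data: bytes) -> None:
        # Nothing crosses the process boundary here; the message just waits.
        self._outgoing.append(data)

    def flush(self) -> list:
        # The environment would call this while assembling the next step request.
        pending, self._outgoing = self._outgoing, []
        return pending
```

This is why a property set in Python is only visible in C# after the next `env.step` (or `env.reset`) call.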
#### EngineConfigurationChannel
An `EngineConfigurationChannel` allows you to modify the time scale and graphics quality of the Unity engine.
`EngineConfigurationChannel` has two methods:
* `set_configuration_parameters` with arguments:
  * width: Defines the width of the display. Default 80.
  * height: Defines the height of the display. Default 80.
  * quality_level: Defines the quality level of the simulation. Default 1.
  * time_scale: Defines the multiplier for the delta time in the simulation. If set to a higher value, time will pass faster in the simulation but the physics may break. Default 20.
  * target_frame_rate: Instructs the simulation to try to render at a specified frame rate. Default -1.
* `set_configuration` with a single argument, `config`, an `EngineConfig` NamedTuple object.
For example:
```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
channel = EngineConfigurationChannel()
env = UnityEnvironment(base_port = 5004, side_channels = [channel])
channel.set_configuration_parameters(time_scale = 2.0)
i = env.reset()
...
```
#### FloatPropertiesChannel
A `FloatPropertiesChannel` allows you to get and set float properties
in the environment. Call `get_property` and `set_property` on the
side channel to read and write properties.
`FloatPropertiesChannel` has three methods:
* `set_property`: Sets a property in the Unity Environment.
  * key: The string identifier of the property.
  * value: The float value of the property.
* `get_property`: Gets a property from the Unity Environment. Returns None if the property was not found.
  * key: The string identifier of the property.
* `list_properties`: Returns a list of all the string identifiers of the properties.
```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.float_properties_channel import FloatPropertiesChannel
channel = FloatPropertiesChannel()
env = UnityEnvironment(base_port = 5004, side_channels = [channel])
channel.set_property("parameter_1", 2.0)
i = env.reset()
...
```
Once a property has been modified in Python, you can access it in C# after the next call to `step` as follows:
```csharp
var academy = FindObjectOfType<Academy>();
var sharedProperties = academy.FloatProperties;
float property1 = sharedProperties.GetPropertyWithDefault("parameter_1", 0.0f);
```
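The get/set semantics above amount to a shared string-to-float mapping; a hedged dictionary-backed stand-in is sketched below (the real channel additionally serializes every write to the C# side, which is what makes the property visible to `Academy.FloatProperties`):

```python
class FloatPropertiesSketch:
    """Stand-in for FloatPropertiesChannel's get/set semantics."""

    def __init__(self) -> None:
        self._properties: dict = {}

    def set_property(self, key: str, value: float) -> None:
        self._properties[key] = value  # the real channel also notifies Unity

    def get_property(self, key: str):
        return self._properties.get(key)  # None when the key was never set

    def list_properties(self) -> list:
        return list(self._properties.keys())
```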
## mlagents-learn

docs/Training-Curriculum-Learning.md


In order to define a curriculum, the first step is to decide which parameters of
the environment will vary. In the case of the Wall Jump environment, what varies
is the height of the wall. We define this as a `Reset Parameter` in the Academy
object of our scene, and by doing so it becomes adjustable via the Python API.
is the height of the wall. We define this as a `Shared Float Property` that
can be accessed in `Academy.FloatProperties`, and by doing so it becomes
adjustable via the Python API.
Rather than adjusting it by hand, we will create a JSON file which
describes the structure of the curriculum. Within it, we can specify which
points in the training process our wall height will change, either based on the

docs/Training-Generalized-Reinforcement-Learning-Agents.md


## Introducing Generalization Using Reset Parameters
To enable variations in the environments, we implemented `Reset Parameters`. We
To enable variations in the environments, we implemented `Reset Parameters`.
`Reset Parameters` are `Academy.FloatProperties` that are used only when
resetting the environment. We
also included different sampling methods and the ability to create new kinds of
sampling methods for each `Reset Parameter`. In the 3D ball environment example displayed
in the figure above, the reset parameters are `gravity`, `ball_mass` and `ball_scale`.

docs/Training-ML-Agents.md


will use the port `(base_port + worker_id)`, where the `worker_id` is sequential IDs
given to each instance from 0 to `num_envs - 1`. Default is 5005. __Note:__ When
training using the Editor rather than an executable, the base port will be ignored.
* `--slow`: Specify this option to run the Unity environment at normal, game
speed. The `--slow` mode uses the **Time Scale** and **Target Frame Rate**
specified in the Academy's **Inference Configuration**. By default, training
runs using the speeds specified in your Academy's **Training Configuration**.
See
[Academy Properties](Learning-Environment-Design-Academy.md#academy-properties).
* `--train`: Specifies whether to train model or only run in inference mode.
When training, **always** use the `--train` option.
* `--load`: If set, the training code loads an already trained model to

* `--debug`: Specify this option to enable debug-level logging for some parts of the code.
* `--multi-gpu`: Setting this flag enables the use of multiple GPU's (if available) during training.
* `--cpu`: Forces training using CPU only.
* Engine Configuration :
* `--width` : The width of the executable window of the environment(s) in pixels
(ignored for editor training). (Default 84)
* `--height` : The height of the executable window of the environment(s) in pixels
(ignored for editor training). (Default 84)
* `--quality-level` : The quality level of the environment(s). Equivalent to
calling `QualitySettings.SetQualityLevel` in Unity. (Default 5)
* `--time-scale` : The time scale of the Unity environment(s). Equivalent to setting
`Time.timeScale` in Unity. (Default 20.0, maximum 100.0)
* `--target-frame-rate` : The target frame rate of the Unity environment(s).
Equivalent to setting `Application.targetFrameRate` in Unity. (Default: -1)
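These five flags map one-to-one onto the engine configuration that `mlagents-learn` forwards through an `EngineConfigurationChannel`. A hedged sketch of that mapping using the defaults listed above (the NamedTuple and parsing function here are illustrative, not the trainer's actual code):

```python
from typing import NamedTuple

class EngineConfigSketch(NamedTuple):
    # Defaults mirror the mlagents-learn command line defaults above.
    width: int = 84
    height: int = 84
    quality_level: int = 5
    time_scale: float = 20.0
    target_frame_rate: int = -1

def engine_config_from_cli(args: dict) -> EngineConfigSketch:
    """args: raw option strings, e.g. {'--time-scale': '2.0'}."""
    defaults = EngineConfigSketch()
    return EngineConfigSketch(
        width=int(args.get("--width", defaults.width)),
        height=int(args.get("--height", defaults.height)),
        quality_level=int(args.get("--quality-level", defaults.quality_level)),
        time_scale=float(args.get("--time-scale", defaults.time_scale)),
        target_frame_rate=int(args.get("--target-frame-rate", defaults.target_frame_rate)),
    )
```

Note that these trainer defaults (84x84, quality 5) differ from the `set_configuration_parameters` defaults (80x80, quality 1) used when you drive the channel yourself from the Low-Level Python API.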
### Training Config File

ml-agents-envs/mlagents/envs/base_unity_environment.py


from abc import ABC, abstractmethod
from typing import Dict, Optional, Any
from typing import Dict, Optional
from mlagents.envs.brain import AllBrainInfo, BrainParameters

pass
@abstractmethod
def reset(
self,
config: Optional[Dict] = None,
train_mode: bool = True,
custom_reset_parameters: Any = None,
) -> AllBrainInfo:
def reset(self) -> AllBrainInfo:
pass
@property
@abstractmethod
def reset_parameters(self) -> Dict[str, float]:
pass
@abstractmethod

ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.py


from mlagents.envs.communicator_objects import brain_parameters_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_brain__parameters__pb2
from mlagents.envs.communicator_objects import environment_parameters_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2
DESCRIPTOR = _descriptor.FileDescriptor(

serialized_pb=_b('\nGmlagents/envs/communicator_objects/unity_rl_initialization_output.proto\x12\x14\x63ommunicator_objects\x1a\x39mlagents/envs/communicator_objects/brain_parameters.proto\x1a?mlagents/envs/communicator_objects/environment_parameters.proto\"\xeb\x01\n UnityRLInitializationOutputProto\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x0f\n\x07version\x18\x02 \x01(\t\x12\x10\n\x08log_path\x18\x03 \x01(\t\x12\x44\n\x10\x62rain_parameters\x18\x05 \x03(\x0b\x32*.communicator_objects.BrainParametersProto\x12P\n\x16\x65nvironment_parameters\x18\x06 \x01(\x0b\x32\x30.communicator_objects.EnvironmentParametersProtoB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
serialized_pb=_b('\nGmlagents/envs/communicator_objects/unity_rl_initialization_output.proto\x12\x14\x63ommunicator_objects\x1a\x39mlagents/envs/communicator_objects/brain_parameters.proto\"\x9f\x01\n UnityRLInitializationOutputProto\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x0f\n\x07version\x18\x02 \x01(\t\x12\x10\n\x08log_path\x18\x03 \x01(\t\x12\x44\n\x10\x62rain_parameters\x18\x05 \x03(\x0b\x32*.communicator_objects.BrainParametersProtoJ\x04\x08\x06\x10\x07\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
dependencies=[mlagents_dot_envs_dot_communicator__objects_dot_brain__parameters__pb2.DESCRIPTOR,mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2.DESCRIPTOR,])
dependencies=[mlagents_dot_envs_dot_communicator__objects_dot_brain__parameters__pb2.DESCRIPTOR,])

message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='environment_parameters', full_name='communicator_objects.UnityRLInitializationOutputProto.environment_parameters', index=4,
number=6, type=11, cpp_type=10, label=1,
has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
],
extensions=[
],

extension_ranges=[],
oneofs=[
],
serialized_start=222,
serialized_end=457,
serialized_start=157,
serialized_end=316,
_UNITYRLINITIALIZATIONOUTPUTPROTO.fields_by_name['environment_parameters'].message_type = mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2._ENVIRONMENTPARAMETERSPROTO
DESCRIPTOR.message_types_by_name['UnityRLInitializationOutputProto'] = _UNITYRLINITIALIZATIONOUTPUTPROTO
_sym_db.RegisterFileDescriptor(DESCRIPTOR)

14
ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.pyi


BrainParametersProto as mlagents___envs___communicator_objects___brain_parameters_pb2___BrainParametersProto,
)
from mlagents.envs.communicator_objects.environment_parameters_pb2 import (
EnvironmentParametersProto as mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto,
)
from typing import (
Iterable as typing___Iterable,
Optional as typing___Optional,

@property
def brain_parameters(self) -> google___protobuf___internal___containers___RepeatedCompositeFieldContainer[mlagents___envs___communicator_objects___brain_parameters_pb2___BrainParametersProto]: ...
@property
def environment_parameters(self) -> mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto: ...
def __init__(self,
*,
name : typing___Optional[typing___Text] = None,

environment_parameters : typing___Optional[mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto] = None,
) -> None: ...
@classmethod
def FromString(cls, s: builtin___bytes) -> UnityRLInitializationOutputProto: ...

def HasField(self, field_name: typing_extensions___Literal[u"environment_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"brain_parameters",u"environment_parameters",u"log_path",u"name",u"version"]) -> None: ...
def ClearField(self, field_name: typing_extensions___Literal[u"brain_parameters",u"log_path",u"name",u"version"]) -> None: ...
def HasField(self, field_name: typing_extensions___Literal[u"environment_parameters",b"environment_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"brain_parameters",b"brain_parameters",u"environment_parameters",b"environment_parameters",u"log_path",b"log_path",u"name",b"name",u"version",b"version"]) -> None: ...
def ClearField(self, field_name: typing_extensions___Literal[u"brain_parameters",b"brain_parameters",u"log_path",b"log_path",u"name",b"name",u"version",b"version"]) -> None: ...

36
ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_input_pb2.py


from mlagents.envs.communicator_objects import agent_action_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_agent__action__pb2
from mlagents.envs.communicator_objects import environment_parameters_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2
from mlagents.envs.communicator_objects import command_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_command__pb2

syntax='proto3',
serialized_pb=_b('\n7mlagents/envs/communicator_objects/unity_rl_input.proto\x12\x14\x63ommunicator_objects\x1a\x35mlagents/envs/communicator_objects/agent_action.proto\x1a?mlagents/envs/communicator_objects/environment_parameters.proto\x1a\x30mlagents/envs/communicator_objects/command.proto\"\xd9\x03\n\x11UnityRLInputProto\x12P\n\ragent_actions\x18\x01 \x03(\x0b\x32\x39.communicator_objects.UnityRLInputProto.AgentActionsEntry\x12P\n\x16\x65nvironment_parameters\x18\x02 \x01(\x0b\x32\x30.communicator_objects.EnvironmentParametersProto\x12\x13\n\x0bis_training\x18\x03 \x01(\x08\x12\x33\n\x07\x63ommand\x18\x04 \x01(\x0e\x32\".communicator_objects.CommandProto\x12\x14\n\x0cside_channel\x18\x05 \x01(\x0c\x1aM\n\x14ListAgentActionProto\x12\x35\n\x05value\x18\x01 \x03(\x0b\x32&.communicator_objects.AgentActionProto\x1aq\n\x11\x41gentActionsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12K\n\x05value\x18\x02 \x01(\x0b\x32<.communicator_objects.UnityRLInputProto.ListAgentActionProto:\x02\x38\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
serialized_pb=_b('\n7mlagents/envs/communicator_objects/unity_rl_input.proto\x12\x14\x63ommunicator_objects\x1a\x35mlagents/envs/communicator_objects/agent_action.proto\x1a\x30mlagents/envs/communicator_objects/command.proto\"\xfe\x02\n\x11UnityRLInputProto\x12P\n\ragent_actions\x18\x01 \x03(\x0b\x32\x39.communicator_objects.UnityRLInputProto.AgentActionsEntry\x12\x33\n\x07\x63ommand\x18\x04 \x01(\x0e\x32\".communicator_objects.CommandProto\x12\x14\n\x0cside_channel\x18\x05 \x01(\x0c\x1aM\n\x14ListAgentActionProto\x12\x35\n\x05value\x18\x01 \x03(\x0b\x32&.communicator_objects.AgentActionProto\x1aq\n\x11\x41gentActionsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12K\n\x05value\x18\x02 \x01(\x0b\x32<.communicator_objects.UnityRLInputProto.ListAgentActionProto:\x02\x38\x01J\x04\x08\x02\x10\x03J\x04\x08\x03\x10\x04\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
dependencies=[mlagents_dot_envs_dot_communicator__objects_dot_agent__action__pb2.DESCRIPTOR,mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2.DESCRIPTOR,mlagents_dot_envs_dot_communicator__objects_dot_command__pb2.DESCRIPTOR,])
dependencies=[mlagents_dot_envs_dot_communicator__objects_dot_agent__action__pb2.DESCRIPTOR,mlagents_dot_envs_dot_communicator__objects_dot_command__pb2.DESCRIPTOR,])

extension_ranges=[],
oneofs=[
],
serialized_start=533,
serialized_end=610,
serialized_start=365,
serialized_end=442,
)
_UNITYRLINPUTPROTO_AGENTACTIONSENTRY = _descriptor.Descriptor(

extension_ranges=[],
oneofs=[
],
serialized_start=612,
serialized_end=725,
serialized_start=444,
serialized_end=557,
)
_UNITYRLINPUTPROTO = _descriptor.Descriptor(

is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='environment_parameters', full_name='communicator_objects.UnityRLInputProto.environment_parameters', index=1,
number=2, type=11, cpp_type=10, label=1,
has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='is_training', full_name='communicator_objects.UnityRLInputProto.is_training', index=2,
number=3, type=8, cpp_type=7, label=1,
has_default_value=False, default_value=False,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='command', full_name='communicator_objects.UnityRLInputProto.command', index=3,
name='command', full_name='communicator_objects.UnityRLInputProto.command', index=1,
number=4, type=14, cpp_type=8, label=1,
has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,

name='side_channel', full_name='communicator_objects.UnityRLInputProto.side_channel', index=4,
name='side_channel', full_name='communicator_objects.UnityRLInputProto.side_channel', index=2,
number=5, type=12, cpp_type=9, label=1,
has_default_value=False, default_value=_b(""),
message_type=None, enum_type=None, containing_type=None,

extension_ranges=[],
oneofs=[
],
serialized_start=252,
serialized_end=725,
serialized_start=187,
serialized_end=569,
)
_UNITYRLINPUTPROTO_LISTAGENTACTIONPROTO.fields_by_name['value'].message_type = mlagents_dot_envs_dot_communicator__objects_dot_agent__action__pb2._AGENTACTIONPROTO

_UNITYRLINPUTPROTO.fields_by_name['agent_actions'].message_type = _UNITYRLINPUTPROTO_AGENTACTIONSENTRY
_UNITYRLINPUTPROTO.fields_by_name['environment_parameters'].message_type = mlagents_dot_envs_dot_communicator__objects_dot_environment__parameters__pb2._ENVIRONMENTPARAMETERSPROTO
_UNITYRLINPUTPROTO.fields_by_name['command'].enum_type = mlagents_dot_envs_dot_communicator__objects_dot_command__pb2._COMMANDPROTO
DESCRIPTOR.message_types_by_name['UnityRLInputProto'] = _UNITYRLINPUTPROTO
_sym_db.RegisterFileDescriptor(DESCRIPTOR)

16
ml-agents-envs/mlagents/envs/communicator_objects/unity_rl_input_pb2.pyi


CommandProto as mlagents___envs___communicator_objects___command_pb2___CommandProto,
)
from mlagents.envs.communicator_objects.environment_parameters_pb2 import (
EnvironmentParametersProto as mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto,
)
from typing import (
Iterable as typing___Iterable,
Mapping as typing___Mapping,

def HasField(self, field_name: typing_extensions___Literal[u"value",b"value"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"key",b"key",u"value",b"value"]) -> None: ...
is_training = ... # type: builtin___bool
command = ... # type: mlagents___envs___communicator_objects___command_pb2___CommandProto
side_channel = ... # type: builtin___bytes

@property
def environment_parameters(self) -> mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto: ...
environment_parameters : typing___Optional[mlagents___envs___communicator_objects___environment_parameters_pb2___EnvironmentParametersProto] = None,
is_training : typing___Optional[builtin___bool] = None,
command : typing___Optional[mlagents___envs___communicator_objects___command_pb2___CommandProto] = None,
side_channel : typing___Optional[builtin___bytes] = None,
) -> None: ...

def CopyFrom(self, other_msg: google___protobuf___message___Message) -> None: ...
if sys.version_info >= (3,):
def HasField(self, field_name: typing_extensions___Literal[u"environment_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"agent_actions",u"command",u"environment_parameters",u"is_training",u"side_channel"]) -> None: ...
def ClearField(self, field_name: typing_extensions___Literal[u"agent_actions",u"command",u"side_channel"]) -> None: ...
def HasField(self, field_name: typing_extensions___Literal[u"environment_parameters",b"environment_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"agent_actions",b"agent_actions",u"command",b"command",u"environment_parameters",b"environment_parameters",u"is_training",b"is_training",u"side_channel",b"side_channel"]) -> None: ...
def ClearField(self, field_name: typing_extensions___Literal[u"agent_actions",b"agent_actions",u"command",b"command",u"side_channel",b"side_channel"]) -> None: ...

11
ml-agents-envs/mlagents/envs/env_manager.py


from abc import ABC, abstractmethod
from typing import Any, List, Dict, NamedTuple, Optional
from typing import List, Dict, NamedTuple, Optional
from mlagents.envs.brain import AllBrainInfo, BrainParameters
from mlagents.envs.policy import Policy
from mlagents.envs.action_info import ActionInfo

pass
@abstractmethod
def reset(
self,
config: Dict = None,
train_mode: bool = True,
custom_reset_parameters: Any = None,
) -> List[EnvironmentStep]:
def reset(self, config: Dict = None) -> List[EnvironmentStep]:
pass
@property

@property
@abstractmethod
def reset_parameters(self) -> Dict[str, float]:
def get_properties(self) -> Dict[str, float]:
pass
@abstractmethod

66
ml-agents-envs/mlagents/envs/environment.py


from mlagents.envs.communicator_objects.unity_rl_input_pb2 import UnityRLInputProto
from mlagents.envs.communicator_objects.unity_rl_output_pb2 import UnityRLOutputProto
from mlagents.envs.communicator_objects.agent_action_pb2 import AgentActionProto
from mlagents.envs.communicator_objects.environment_parameters_pb2 import (
EnvironmentParametersProto,
)
from mlagents.envs.communicator_objects.unity_output_pb2 import UnityOutputProto
from mlagents.envs.communicator_objects.unity_rl_initialization_input_pb2 import (
UnityRLInitializationInputProto,

self._external_brain_names: List[str] = []
self._num_external_brains = 0
self._update_brain_parameters(aca_output)
self._resetParameters = dict(aca_params.environment_parameters.float_parameters)
logger.info(
"\n'{0}' started successfully!\n{1}".format(self._academy_name, str(self))
)

external_brains[brain_name] = self.brains[brain_name]
return external_brains
@property
def reset_parameters(self):
return self._resetParameters
def executable_launcher(self, file_name, docker_training, no_graphics, args):
cwd = os.getcwd()
file_name = (

)
def __str__(self):
reset_params_str = (
"\n\t\t".join(
[
str(k) + " -> " + str(self._resetParameters[k])
for k in self._resetParameters
]
)
if self._resetParameters
else "{}"
)
return f"""Unity Academy name: {self._academy_name}
Reset Parameters : {reset_params_str}"""
return """Unity Academy name: {0}""".format(self._academy_name)
def reset(
self,
config: Dict = None,
train_mode: bool = True,
custom_reset_parameters: Any = None,
) -> AllBrainInfo:
def reset(self) -> AllBrainInfo:
if config is None:
config = self._resetParameters
elif config:
logger.info(
"Academy reset with parameters: {0}".format(
", ".join([str(x) + " -> " + str(config[x]) for x in config])
)
)
for k in config:
if (k in self._resetParameters) and (isinstance(config[k], (int, float))):
self._resetParameters[k] = config[k]
elif not isinstance(config[k], (int, float)):
raise UnityEnvironmentException(
"The value for parameter '{0}'' must be an Integer or a Float.".format(
k
)
)
else:
raise UnityEnvironmentException(
"The parameter '{0}' is not a valid parameter.".format(k)
)
outputs = self.communicator.exchange(
self._generate_reset_input(train_mode, config, custom_reset_parameters)
)
outputs = self.communicator.exchange(self._generate_reset_input())
if outputs is None:
raise UnityCommunicationException("Communicator has stopped.")
self._update_brain_parameters(outputs)

rl_in.side_channel = bytes(self._generate_side_channel_data(self.side_channels))
return self.wrap_unity_input(rl_in)
def _generate_reset_input(
self, training: bool, config: Dict, custom_reset_parameters: Any
) -> UnityInputProto:
def _generate_reset_input(self) -> UnityInputProto:
rl_in.is_training = training
rl_in.environment_parameters.CopyFrom(EnvironmentParametersProto())
for key in config:
rl_in.environment_parameters.float_parameters[key] = config[key]
if custom_reset_parameters is not None:
rl_in.environment_parameters.custom_reset_parameters.CopyFrom(
custom_reset_parameters
)
rl_in.command = 1
rl_in.side_channel = bytes(self._generate_side_channel_data(self.side_channels))
return self.wrap_unity_input(rl_in)
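The simplified `_generate_reset_input` above no longer copies an `EnvironmentParametersProto` into the reset message; everything now travels as raw bytes in `rl_in.side_channel`. The exact wire format is internal to `_generate_side_channel_data`, but the idea can be sketched with a hypothetical `[channel_id][length][payload]` framing (the framing and the `generate_side_channel_data` helper here are assumptions for illustration, not the library's actual implementation):

```python
import struct
from typing import Dict, List

def generate_side_channel_data(queues: Dict[int, List[bytes]]) -> bytearray:
    """Hypothetical framing: for each channel, emit every queued message as
    a little-endian int32 channel id, an int32 payload length, then the payload."""
    result = bytearray()
    for channel_id, messages in queues.items():
        for payload in messages:
            result += struct.pack("<ii", channel_id, len(payload))
            result += payload
    return result

# One channel (id 3) with a single 2-byte message queued:
data = generate_side_channel_data({3: [b"\x01\x02"]})
# data = 4-byte id + 4-byte length + payload = 10 bytes total
```

The receiver walks the buffer the same way in reverse: read an id and a length, slice out the payload, and dispatch it to the channel registered under that id.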

25
ml-agents-envs/mlagents/envs/side_channel/engine_configuration_channel.py


from mlagents.envs.side_channel.side_channel import SideChannel, SideChannelType
from mlagents.envs.exception import UnityCommunicationException
import struct
from typing import NamedTuple
class EngineConfig(NamedTuple):
width: int
height: int
quality_level: int
time_scale: float
target_frame_rate: int
@staticmethod
def default_config():
return EngineConfig(80, 80, 1, 20.0, -1)
class EngineConfigurationChannel(SideChannel):

+ "this should not have happened."
)
def set_configuration(
def set_configuration_parameters(
self,
width: int = 80,
height: int = 80,

:param quality_level: Defines the quality level of the simulation.
Default 1.
:param time_scale: Defines the multiplier for the deltatime in the
simulation. If set to a higher value, time will pass faaster in the
simulation. If set to a higher value, time will pass faster in the
simulation but the physics might break. Default 20.
:param target_frame_rate: Instructs simulation to try to render at a
specified frame rate. Default -1.

data += struct.pack("<f", time_scale)
data += struct.pack("<i", target_frame_rate)
super().queue_message_to_send(data)
def set_configuration(self, config: EngineConfig) -> None:
"""
Sets the engine configuration. Takes as input an EngineConfig.
"""
data = bytearray()
data += struct.pack("<iiifi", *config)
super().queue_message_to_send(data)
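The new `set_configuration(config)` overload serializes all five engine settings in one `struct.pack("<iiifi", *config)` call: four little-endian int32s and one float32, in field order. A self-contained sketch (the `EngineConfig` tuple mirrors the one defined in this file; the round-trip below is only an illustration of the byte layout):

```python
import struct
from typing import NamedTuple

class EngineConfig(NamedTuple):
    width: int
    height: int
    quality_level: int
    time_scale: float
    target_frame_rate: int

    @staticmethod
    def default_config() -> "EngineConfig":
        return EngineConfig(80, 80, 1, 20.0, -1)

config = EngineConfig.default_config()
# Pack the whole config at once, as set_configuration does:
# "<iiifi" = little-endian int32, int32, int32, float32, int32.
data = struct.pack("<iiifi", *config)
# 5 fields x 4 bytes each:
assert len(data) == 20

# The C# receiver reads the fields back in the same order:
unpacked = EngineConfig(*struct.unpack("<iiifi", data))
```

Because `EngineConfig` is a `NamedTuple`, `*config` expands the fields positionally, so the pack format string and the field declaration order must stay in sync.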

29
ml-agents-envs/mlagents/envs/simple_env_manager.py


from typing import Any, Dict, List
from typing import Dict, List
from mlagents.envs.base_unity_environment import BaseUnityEnvironment
from mlagents.envs.env_manager import EnvManager, EnvironmentStep

from mlagents.envs.side_channel.float_properties_channel import FloatPropertiesChannel
class SimpleEnvManager(EnvManager):

"""
def __init__(self, env: BaseUnityEnvironment):
def __init__(
self, env: BaseUnityEnvironment, float_prop_channel: FloatPropertiesChannel
):
self.shared_float_properties = float_prop_channel
self.env = env
self.previous_step: EnvironmentStep = EnvironmentStep(None, {}, None)
self.previous_all_action_info: Dict[str, ActionInfo] = {}

return [step_info]
def reset(
self,
config: Dict[str, float] = None,
train_mode: bool = True,
custom_reset_parameters: Any = None,
self, config: Dict[str, float] = None
all_brain_info = self.env.reset(
config=config,
train_mode=train_mode,
custom_reset_parameters=custom_reset_parameters,
)
if config is not None:
for k, v in config.items():
self.shared_float_properties.set_property(k, v)
all_brain_info = self.env.reset()
self.previous_step = EnvironmentStep(None, all_brain_info, None)
return [self.previous_step]

@property
def reset_parameters(self) -> Dict[str, float]:
return self.env.reset_parameters
def get_properties(self) -> Dict[str, float]:
reset_params = {}
for k in self.shared_float_properties.list_properties():
reset_params[k] = self.shared_float_properties.get_property(k)
return reset_params
def close(self):
self.env.close()
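The `SimpleEnvManager` changes above replace the old `reset(config, ...)` plumbing with a `FloatPropertiesChannel`: `reset` pushes each parameter through `set_property`, and `get_properties` rebuilds the dict via `list_properties`/`get_property`. A minimal stand-in for the channel (method names taken from this diff; the class body is an assumed in-memory sketch, not the real implementation) shows the round trip:

```python
from typing import Dict, List

class FloatPropertiesStub:
    """In-memory stand-in for FloatPropertiesChannel's property store."""
    def __init__(self) -> None:
        self._props: Dict[str, float] = {}

    def set_property(self, key: str, value: float) -> None:
        self._props[key] = value

    def get_property(self, key: str) -> float:
        return self._props[key]

    def list_properties(self) -> List[str]:
        return list(self._props.keys())

channel = FloatPropertiesStub()

# reset(config): forward each reset parameter through the channel
# before calling env.reset() with no arguments.
for k, v in {"gravity": -9.8, "scale": 2.0}.items():
    channel.set_property(k, v)

# get_properties(): reassemble the dict from the channel, as the manager does.
reset_params = {k: channel.get_property(k) for k in channel.list_properties()}
```

The point of the migration is that reset parameters are no longer part of the reset RPC itself; they are ordinary side-channel traffic that the environment picks up on its next step or reset.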

71
ml-agents-envs/mlagents/envs/subprocess_env_manager.py


)
from mlagents.envs.brain import AllBrainInfo, BrainParameters
from mlagents.envs.action_info import ActionInfo
from mlagents.envs.side_channel.float_properties_channel import FloatPropertiesChannel
from mlagents.envs.side_channel.engine_configuration_channel import (
EngineConfigurationChannel,
EngineConfig,
)
from mlagents.envs.side_channel.side_channel import SideChannel
logger = logging.getLogger("mlagents.envs")

def worker(
parent_conn: Connection, step_queue: Queue, pickled_env_factory: str, worker_id: int
parent_conn: Connection,
step_queue: Queue,
pickled_env_factory: str,
worker_id: int,
engine_configuration: EngineConfig,
env_factory: Callable[[int], UnityEnvironment] = cloudpickle.loads(
pickled_env_factory
env_factory: Callable[
[int, List[SideChannel]], UnityEnvironment
] = cloudpickle.loads(pickled_env_factory)
shared_float_properties = FloatPropertiesChannel()
engine_configuration_channel = EngineConfigurationChannel()
engine_configuration_channel.set_configuration(engine_configuration)
env: BaseUnityEnvironment = env_factory(
worker_id, [shared_float_properties, engine_configuration_channel]
env: BaseUnityEnvironment = env_factory(worker_id)
def _send_response(cmd_name, payload):
parent_conn.send(EnvironmentResponse(cmd_name, worker_id, payload))

reset_timers()
elif cmd.name == "external_brains":
_send_response("external_brains", env.external_brains)
elif cmd.name == "reset_parameters":
_send_response("reset_parameters", env.reset_parameters)
elif cmd.name == "get_properties":
reset_params = {}
for k in shared_float_properties.list_properties():
reset_params[k] = shared_float_properties.get_property(k)
_send_response("get_properties", reset_params)
all_brain_info = env.reset(
cmd.payload[0], cmd.payload[1], cmd.payload[2]
)
for k, v in cmd.payload.items():
shared_float_properties.set_property(k, v)
all_brain_info = env.reset()
_send_response("reset", all_brain_info)
elif cmd.name == "close":
break

class SubprocessEnvManager(EnvManager):
def __init__(
self, env_factory: Callable[[int], BaseUnityEnvironment], n_env: int = 1
self,
env_factory: Callable[[int, List[SideChannel]], BaseUnityEnvironment],
engine_configuration: EngineConfig,
n_env: int = 1,
):
super().__init__()
self.env_workers: List[UnityEnvWorker] = []

self.create_worker(worker_idx, self.step_queue, env_factory)
self.create_worker(
worker_idx, self.step_queue, env_factory, engine_configuration
)
)
@staticmethod

env_factory: Callable[[int], BaseUnityEnvironment],
env_factory: Callable[[int, List[SideChannel]], BaseUnityEnvironment],
engine_configuration: EngineConfig,
) -> UnityEnvWorker:
parent_conn, child_conn = Pipe()

child_process = Process(
target=worker, args=(child_conn, step_queue, pickled_env_factory, worker_id)
target=worker,
args=(
child_conn,
step_queue,
pickled_env_factory,
worker_id,
engine_configuration,
),
)
child_process.start()
return UnityEnvWorker(child_process, worker_id, parent_conn)

step_infos = self._postprocess_steps(worker_steps)
return step_infos
def reset(
self,
config: Optional[Dict] = None,
train_mode: bool = True,
custom_reset_parameters: Any = None,
) -> List[EnvironmentStep]:
def reset(self, config: Optional[Dict] = None) -> List[EnvironmentStep]:
while any(ew.waiting for ew in self.env_workers):
if not self.step_queue.empty():
step = self.step_queue.get_nowait()

ew.send("reset", (config, train_mode, custom_reset_parameters))
ew.send("reset", config)
# Next (synchronously) collect the reset observations from each worker in sequence
for ew in self.env_workers:
ew.previous_step = EnvironmentStep(None, ew.recv().payload, None)

return self.env_workers[0].recv().payload
@property
def reset_parameters(self) -> Dict[str, float]:
self.env_workers[0].send("reset_parameters")
def get_properties(self) -> Dict[str, float]:
self.env_workers[0].send("get_properties")
return self.env_workers[0].recv().payload
def close(self) -> None:
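The `subprocess_env_manager.py` changes rewire the worker start-up: each subprocess now builds its own side channels, applies the `EngineConfig` once before creating the environment, and passes the channel list into the two-argument `env_factory`. A hypothetical sketch of that wiring (the stub classes and `make_env` factory are stand-ins; only the call shape matches the diff):

```python
from typing import Callable, List

class SideChannelStub:
    """Stand-in for SideChannel; only the wiring matters here."""
    pass

class EngineChannelStub(SideChannelStub):
    """Stand-in for EngineConfigurationChannel."""
    def __init__(self) -> None:
        self.config = None

    def set_configuration(self, config) -> None:
        self.config = config

def make_env(worker_id: int, side_channels: List[SideChannelStub]) -> dict:
    # Stand-in env_factory; the real one returns a UnityEnvironment.
    return {"worker_id": worker_id, "channels": side_channels}

def start_worker(worker_id: int, engine_configuration, env_factory: Callable) -> dict:
    shared_float_properties = SideChannelStub()
    engine_channel = EngineChannelStub()
    # Engine settings are queued once, before the environment exists,
    # so they reach Unity with the first exchange.
    engine_channel.set_configuration(engine_configuration)
    return env_factory(worker_id, [shared_float_properties, engine_channel])

env = start_worker(0, (84, 84, 5, 20.0, -1), make_env)
```

This is also why the worker's `"reset"` command payload is now a plain dict of floats: the parameters are forwarded through `shared_float_properties` rather than threaded through the reset call.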

35
ml-agents-envs/mlagents/envs/tests/test_subprocess_env_manager.py


StepResponse,
)
from mlagents.envs.base_unity_environment import BaseUnityEnvironment
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfig
def mock_env_factory(worker_id):

class SubprocessEnvManagerTest(unittest.TestCase):
def test_environments_are_created(self):
SubprocessEnvManager.create_worker = MagicMock()
env = SubprocessEnvManager(mock_env_factory, 2)
env = SubprocessEnvManager(mock_env_factory, EngineConfig.default_config(), 2)
mock.call(0, env.step_queue, mock_env_factory),
mock.call(1, env.step_queue, mock_env_factory),
mock.call(
0, env.step_queue, mock_env_factory, EngineConfig.default_config()
),
mock.call(
1, env.step_queue, mock_env_factory, EngineConfig.default_config()
),
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory: MockEnvWorker(
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory, engine_c: MockEnvWorker(
manager = SubprocessEnvManager(mock_env_factory, 1)
manager = SubprocessEnvManager(
mock_env_factory, EngineConfig.default_config(), 1
)
manager.reset(params, False)
manager.env_workers[0].send.assert_called_with("reset", (params, False, None))
manager.reset(params)
manager.env_workers[0].send.assert_called_with("reset", (params))
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory: MockEnvWorker(
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory, engine_c: MockEnvWorker(
manager = SubprocessEnvManager(mock_env_factory, 4)
manager = SubprocessEnvManager(
mock_env_factory, EngineConfig.default_config(), 4
)
env.send.assert_called_with("reset", (params, True, None))
env.send.assert_called_with("reset", (params))
env.recv.assert_called()
# Check that the "last steps" are set to the value returned for each step
self.assertEqual(

def test_step_takes_steps_for_all_non_waiting_envs(self):
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory: MockEnvWorker(
SubprocessEnvManager.create_worker = lambda em, worker_id, step_queue, env_factory, engine_c: MockEnvWorker(
manager = SubprocessEnvManager(mock_env_factory, 3)
manager = SubprocessEnvManager(
mock_env_factory, EngineConfig.default_config(), 3
)
manager.step_queue = Mock()
manager.step_queue.get_nowait.side_effect = [
EnvironmentResponse("step", 0, StepResponse(0, None)),

9
ml-agents/mlagents/trainers/curriculum.py


class Curriculum(object):
def __init__(self, location, default_reset_parameters):
def __init__(self, location):
:param default_reset_parameters: Set of reset parameters for
environment.
"""
self.max_lesson_num = 0
self.measure = None

parameters = self.data["parameters"]
for key in parameters:
if key not in default_reset_parameters:
raise CurriculumConfigError(
"The parameter {0} in Curriculum {1} is not present in "
"the Environment".format(key, location)
)
if len(parameters[key]) != self.max_lesson_num + 1:
raise CurriculumConfigError(
"The parameter {0} in Curriculum {1} must have {2} values "

70
ml-agents/mlagents/trainers/learn.py


from mlagents.envs.exception import SamplerException
from mlagents.envs.base_unity_environment import BaseUnityEnvironment
from mlagents.envs.subprocess_env_manager import SubprocessEnvManager
from mlagents.envs.side_channel.side_channel import SideChannel
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfig
class CommandLineOptions(NamedTuple):

num_envs: int
curriculum_folder: Optional[str]
lesson: int
slow: bool
no_graphics: bool
multi_gpu: bool # ?
trainer_config_path: str

cpu: bool
@property
def fast_simulation(self) -> bool:
return not self.slow
width: int
height: int
quality_level: int
time_scale: float
target_frame_rate: int
@staticmethod
def from_argparse(args: Any) -> "CommandLineOptions":

"--seed", default=-1, type=int, help="Random seed used for training"
)
parser.add_argument(
"--slow", action="store_true", help="Whether to run the game at training speed"
)
parser.add_argument(
"--train",
default=False,
dest="train_model",

parser.add_argument("--version", action="version", version=get_version_string())
eng_conf = parser.add_argument_group(title="Engine Configuration")
eng_conf.add_argument(
"--width",
default=84,
type=int,
help="The width of the executable window of the environment(s)",
)
eng_conf.add_argument(
"--height",
default=84,
type=int,
help="The height of the executable window of the environment(s)",
)
eng_conf.add_argument(
"--quality-level",
default=5,
type=int,
help="The quality level of the environment(s)",
)
eng_conf.add_argument(
"--time-scale",
default=20,
type=float,
help="The time scale of the Unity environment(s)",
)
eng_conf.add_argument(
"--target-frame-rate",
default=-1,
type=int,
help="The target frame rate of the Unity environment(s)",
)
args = parser.parse_args(argv)
return CommandLineOptions.from_argparse(args)

port,
options.env_args,
)
env = SubprocessEnvManager(env_factory, options.num_envs)
engine_config = EngineConfig(
options.width,
options.height,
options.quality_level,
options.time_scale,
options.target_frame_rate,
)
env = SubprocessEnvManager(env_factory, engine_config, options.num_envs)
options.sampler_file_path, env.reset_parameters, run_seed
options.sampler_file_path, run_seed
)
trainer_factory = TrainerFactory(
trainer_config,

maybe_meta_curriculum,
options.train_model,
run_seed,
options.fast_simulation,
sampler_manager,
resampling_interval,
)

tc.start_learning(env)
def create_sampler_manager(sampler_file_path, env_reset_params, run_seed=None):
def create_sampler_manager(sampler_file_path, run_seed=None):
sampler_config = None
resample_interval = None
if sampler_file_path is not None:

return None
else:
meta_curriculum = MetaCurriculum(curriculum_folder, env.reset_parameters)
meta_curriculum = MetaCurriculum(curriculum_folder)
# TODO: Should be able to start learning at different lesson numbers
# for each curriculum.
meta_curriculum.set_all_curriculums_to_lesson_num(lesson)

seed: Optional[int],
start_port: int,
env_args: Optional[List[str]],
) -> Callable[[int], BaseUnityEnvironment]:
) -> Callable[[int, List[SideChannel]], BaseUnityEnvironment]:
if env_path is not None:
# Strip out executable extensions if passed
env_path = (

seed_count = 10000
seed_pool = [np.random.randint(0, seed_count) for _ in range(seed_count)]
def create_unity_environment(worker_id: int) -> UnityEnvironment:
def create_unity_environment(
worker_id: int, side_channels: List[SideChannel]
) -> UnityEnvironment:
env_seed = seed
if not env_seed:
env_seed = seed_pool[worker_id % len(seed_pool)]

no_graphics=no_graphics,
base_port=start_port,
args=env_args,
side_channels=side_channels,
)
return create_unity_environment
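`learn.py` now collects the five engine settings from a dedicated "Engine Configuration" argument group and folds them into an `EngineConfig` that is handed to `SubprocessEnvManager`. A self-contained sketch of that path (argument names and defaults taken from this diff; the `EngineConfig` tuple is redeclared here so the snippet runs on its own):

```python
import argparse
from typing import NamedTuple

class EngineConfig(NamedTuple):
    width: int
    height: int
    quality_level: int
    time_scale: float
    target_frame_rate: int

parser = argparse.ArgumentParser()
eng_conf = parser.add_argument_group(title="Engine Configuration")
eng_conf.add_argument("--width", default=84, type=int)
eng_conf.add_argument("--height", default=84, type=int)
eng_conf.add_argument("--quality-level", default=5, type=int)
eng_conf.add_argument("--time-scale", default=20, type=float)
eng_conf.add_argument("--target-frame-rate", default=-1, type=int)

# Defaults apply unless overridden on the command line;
# argparse maps "--time-scale" to args.time_scale, etc.
args = parser.parse_args(["--time-scale=10"])
engine_config = EngineConfig(
    args.width,
    args.height,
    args.quality_level,
    args.time_scale,
    args.target_frame_rate,
)
```

These flags supersede the old boolean `--slow`/`fast_simulation` switch: instead of one coarse speed toggle, the time scale, resolution, quality level, and frame rate are each configurable and delivered over the engine-configuration side channel.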

8
ml-agents/mlagents/trainers/meta_curriculum.py


"""Contains the MetaCurriculum class."""
import os
from typing import Any, Dict, Set
from typing import Dict, Set
from mlagents.trainers.curriculum import Curriculum
from mlagents.trainers.exception import MetaCurriculumError

particular brain in the environment.
"""
def __init__(
self, curriculum_folder: str, default_reset_parameters: Dict[str, Any]
):
def __init__(self, curriculum_folder: str):
"""Initializes a MetaCurriculum object.
Args:

curriculum_filepath = os.path.join(
curriculum_folder, curriculum_filename
)
curriculum = Curriculum(curriculum_filepath, default_reset_parameters)
curriculum = Curriculum(curriculum_filepath)
config_keys: Set[str] = set(curriculum.get_config().keys())
# Check if any two curriculums use the same reset params.

8
ml-agents/mlagents/trainers/tests/test_curriculum.py


@patch("builtins.open", new_callable=mock_open, read_data=dummy_curriculum_json_str)
def test_init_curriculum_happy_path(mock_file, location, default_reset_parameters):
curriculum = Curriculum(location, default_reset_parameters)
curriculum = Curriculum(location)
assert curriculum._brain_name == "TestBrain"
assert curriculum.lesson_num == 0

mock_file, location, default_reset_parameters
):
with pytest.raises(CurriculumConfigError):
Curriculum(location, default_reset_parameters)
Curriculum(location)
curriculum = Curriculum(location, default_reset_parameters)
curriculum = Curriculum(location)
assert curriculum.lesson_num == 0
curriculum.lesson_num = 1

@patch("builtins.open", new_callable=mock_open, read_data=dummy_curriculum_json_str)
def test_get_config(mock_file):
curriculum = Curriculum("TestBrain.json", {"param1": 1, "param2": 1, "param3": 1})
curriculum = Curriculum("TestBrain.json")
assert curriculum.get_config() == {"param1": 0.7, "param2": 100, "param3": 0.2}
curriculum.lesson_num = 2

4
ml-agents/mlagents/trainers/tests/test_learn.py


None,
False,
0,
True,
sampler_manager_mock.return_value,
None,
)

assert opt.run_id == "ppo"
assert opt.save_freq == 50000
assert opt.seed == -1
assert opt.fast_simulation is True
assert opt.train_model is False
assert opt.base_port == 5005
assert opt.num_envs == 1

"--num-runs=3",
"--save-freq=123456",
"--seed=7890",
"--slow",
"--train",
"--base-port=4004",
"--num-envs=2",

assert opt.run_id == "myawesomerun"
assert opt.save_freq == 123456
assert opt.seed == 7890
assert opt.fast_simulation is False
assert opt.train_model is True
assert opt.base_port == 4004
assert opt.num_envs == 2

9
ml-agents/mlagents/trainers/tests/test_meta_curriculum.py


def test_init_meta_curriculum_happy_path(
listdir, mock_curriculum_init, mock_curriculum_get_config, default_reset_parameters
):
meta_curriculum = MetaCurriculum("test/", default_reset_parameters)
meta_curriculum = MetaCurriculum("test/")
assert len(meta_curriculum.brains_to_curriculums) == 2

calls = [
call("test/Brain1.json", default_reset_parameters),
call("test/Brain2.json", default_reset_parameters),
]
calls = [call("test/Brain1.json"), call("test/Brain2.json")]
mock_curriculum_init.assert_has_calls(calls)

with pytest.raises(MetaCurriculumError):
MetaCurriculum("test/", default_reset_parameters)
MetaCurriculum("test/")
@patch("mlagents.trainers.curriculum.Curriculum")

4
ml-agents/mlagents/trainers/tests/test_simple_rl.py


)
from mlagents.envs.simple_env_manager import SimpleEnvManager
from mlagents.envs.sampler_class import SamplerManager
from mlagents.envs.side_channel.float_properties_channel import FloatPropertiesChannel
BRAIN_NAME = __name__

seed = 1337
trainer_config = yaml.safe_load(config)
env_manager = SimpleEnvManager(env)
env_manager = SimpleEnvManager(env, FloatPropertiesChannel())
trainer_factory = TrainerFactory(
trainer_config=trainer_config,
summaries_dir=dir,

meta_curriculum=None,
train=True,
training_seed=seed,
fast_simulation=True,
sampler_manager=SamplerManager(None),
resampling_interval=None,
save_freq=save_freq,
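test_simple_rl.py now hands SimpleEnvManager a FloatPropertiesChannel, replacing reset-time parameter dictionaries with a side channel of named floats. A simplified stand-in for that channel (the real class lives in mlagents.envs.side_channel.float_properties_channel and also serializes each change into side-channel messages; the method names here mirror its API but the internals are a sketch):

```python
# Simplified stand-in for a float-properties side channel: a keyed store
# of floats shared between trainer and environment. The real channel
# additionally encodes each update as a side-channel message; that
# transport layer is omitted here.
class FloatPropertiesStub:
    def __init__(self):
        self._properties = {}

    def set_property(self, key, value):
        self._properties[key] = float(value)

    def get_property(self, key):
        # Returns None for properties never set, like dict.get
        return self._properties.get(key)

    def get_property_dict_copy(self):
        return dict(self._properties)


channel = FloatPropertiesStub()
channel.set_property("gravity", -9.8)
assert channel.get_property("gravity") == -9.8
assert channel.get_property_dict_copy() == {"gravity": -9.8}
```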

2
ml-agents/mlagents/trainers/tests/test_trainer_controller.py


meta_curriculum=None,
train=True,
training_seed=99,
fast_simulation=True,
sampler_manager=SamplerManager({}),
resampling_interval=None,
)

meta_curriculum=None,
train=True,
training_seed=seed,
fast_simulation=True,
sampler_manager=SamplerManager({}),
resampling_interval=None,
)

4
ml-agents/mlagents/trainers/trainer_controller.py


meta_curriculum: Optional[MetaCurriculum],
train: bool,
training_seed: int,
fast_simulation: bool,
sampler_manager: SamplerManager,
resampling_interval: Optional[int],
):

self.trainer_metrics: Dict[str, TrainerMetrics] = {}
self.meta_curriculum = meta_curriculum
self.training_start_time = time()
self.fast_simulation = fast_simulation
self.sampler_manager = sampler_manager
self.resampling_interval = resampling_interval
np.random.seed(training_seed)

self.meta_curriculum.get_config() if self.meta_curriculum else {}
)
sampled_reset_param.update(new_meta_curriculum_config)
return env.reset(train_mode=self.fast_simulation, config=sampled_reset_param)
return env.reset(config=sampled_reset_param)
def _should_save_model(self, global_step: int) -> bool:
return (
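The reset path above merges the meta-curriculum's config into the sampled reset parameters before calling env.reset(config=...), so curriculum values take precedence on shared keys. The merge semantics, sketched as a pure function (the real method updates sampled_reset_param in place rather than copying):

```python
# Sketch of the reset-config merge in trainer_controller: sampled reset
# parameters are updated with the meta-curriculum config, so curriculum
# values win when both define the same key.
def build_reset_config(sampled_reset_param, meta_curriculum_config):
    merged = dict(sampled_reset_param)     # copy; don't mutate the input
    merged.update(meta_curriculum_config)  # curriculum wins on conflicts
    return merged


config = build_reset_config({"speed": 1.0, "size": 2.0}, {"size": 5.0})
assert config == {"speed": 1.0, "size": 5.0}
```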

3
protobuf-definitions/proto/mlagents/envs/communicator_objects/unity_rl_initialization_output.proto


syntax = "proto3";
import "mlagents/envs/communicator_objects/brain_parameters.proto";
import "mlagents/envs/communicator_objects/environment_parameters.proto";
option csharp_namespace = "MLAgents.CommunicatorObjects";
package communicator_objects;

string version = 2;
string log_path = 3;
repeated BrainParametersProto brain_parameters = 5;
EnvironmentParametersProto environment_parameters = 6;
reserved 6; //environment parameters
}

5
protobuf-definitions/proto/mlagents/envs/communicator_objects/unity_rl_input.proto


syntax = "proto3";
import "mlagents/envs/communicator_objects/agent_action.proto";
import "mlagents/envs/communicator_objects/environment_parameters.proto";
import "mlagents/envs/communicator_objects/command.proto";
option csharp_namespace = "MLAgents.CommunicatorObjects";

repeated AgentActionProto value = 1;
}
map<string, ListAgentActionProto> agent_actions = 1;
EnvironmentParametersProto environment_parameters = 2;
bool is_training = 3;
reserved 2; //deprecated environment proto
reserved 3; //deprecated is_training
CommandProto command = 4;
bytes side_channel = 5;
}
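The new `bytes side_channel = 5;` field carries all side-channel traffic as a single opaque blob, replacing the reserved environment-parameters and is_training fields. One plausible framing for such a blob, shown purely for illustration (the authoritative wire format is defined in the C# and Python SideChannel implementations, not in this proto): each message is a channel id, a payload length, then the payload, concatenated back to back.

```python
import struct

# Illustrative framing for concatenated side-channel messages:
# [int32 channel_id][int32 payload_len][payload bytes], repeated.
# This mirrors the idea behind `bytes side_channel = 5;` but is NOT
# the authoritative ML-Agents wire format.
def pack_messages(messages):
    out = bytearray()
    for channel_id, payload in messages:
        out += struct.pack("<ii", channel_id, len(payload))
        out += payload
    return bytes(out)

def unpack_messages(blob):
    messages, offset = [], 0
    while offset < len(blob):
        channel_id, length = struct.unpack_from("<ii", blob, offset)
        offset += 8
        messages.append((channel_id, blob[offset:offset + length]))
        offset += length
    return messages


msgs = [(1, b"\x00\x01"), (2, b"hello")]
assert unpack_messages(pack_messages(msgs)) == msgs
```

Because the blob is opaque to the RPC layer, new channels can be added on either side without touching the proto definition again, which is the point of the migration.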

179
UnitySDK/Assets/ML-Agents/Editor/ResetParameterDrawer.cs


using UnityEngine;
using UnityEditor;
using System;
using System.Linq;
using UnityEditor.SceneManagement;
using UnityEngine.SceneManagement;
namespace MLAgents
{
/// <summary>
/// PropertyDrawer for ResetParameters. Defines how ResetParameters are displayed in the
/// Inspector.
/// </summary>
[CustomPropertyDrawer(typeof(ResetParameters))]
public class ResetParameterDrawer : PropertyDrawer
{
ResetParameters m_Parameters;
// The height of a line in the Unity Inspectors
const float k_LineHeight = 17f;
// This is the prefix for the key when you add a reset parameter
const string k_NewKeyPrefix = "Param-";
/// <summary>
/// Computes the height of the Drawer depending on the property it is showing
/// </summary>
/// <param name="property">The property that is being drawn.</param>
/// <param name="label">The label of the property being drawn.</param>
/// <returns>The vertical space needed to draw the property.</returns>
public override float GetPropertyHeight(SerializedProperty property, GUIContent label)
{
LazyInitializeParameters(property);
return (m_Parameters.Count + 2) * k_LineHeight;
}
/// <inheritdoc />
public override void OnGUI(Rect position, SerializedProperty property, GUIContent label)
{
LazyInitializeParameters(property);
position.height = k_LineHeight;
EditorGUI.LabelField(position, label);
position.y += k_LineHeight;
var width = position.width / 2 - 24;
var keyRect = new Rect(position.x + 20, position.y, width, position.height);
var valueRect = new Rect(position.x + width + 30, position.y, width, position.height);
DrawAddRemoveButtons(keyRect, valueRect);
EditorGUI.BeginProperty(position, label, property);
foreach (var parameter in m_Parameters)
{
var key = parameter.Key;
var value = parameter.Value;
keyRect.y += k_LineHeight;
valueRect.y += k_LineHeight;
EditorGUI.BeginChangeCheck();
var newKey = EditorGUI.TextField(keyRect, key);
if (EditorGUI.EndChangeCheck())
{
MarkSceneAsDirty();
try
{
m_Parameters.Remove(key);
m_Parameters.Add(newKey, value);
}
catch (Exception e)
{
Debug.Log(e.Message);
}
break;
}
EditorGUI.BeginChangeCheck();
value = EditorGUI.FloatField(valueRect, value);
if (EditorGUI.EndChangeCheck())
{
MarkSceneAsDirty();
m_Parameters[key] = value;
break;
}
}
EditorGUI.EndProperty();
}
/// <summary>
/// Draws the Add and Remove buttons.
/// </summary>
/// <param name="addRect">The rectangle for the Add New button.</param>
/// <param name="removeRect">The rectangle for the Remove Last button.</param>
void DrawAddRemoveButtons(Rect addRect, Rect removeRect)
{
// This is the Add button
if (m_Parameters.Count == 0)
{
addRect.width *= 2;
}
if (GUI.Button(addRect,
new GUIContent("Add New", "Add a new item to the default reset parameters"),
EditorStyles.miniButton))
{
MarkSceneAsDirty();
AddParameter();
}
// If there are no items in the ResetParameters, Hide the Remove button
if (m_Parameters.Count == 0)
{
return;
}
// This is the Remove button
if (GUI.Button(removeRect,
new GUIContent(
"Remove Last", "Remove the last item from the default reset parameters"),
EditorStyles.miniButton))
{
MarkSceneAsDirty();
RemoveLastParameter();
}
}
/// <summary>
/// Signals that the property has been modified and requires the scene to be saved for
/// the changes to persist. Only works when the Editor is not playing.
/// </summary>
static void MarkSceneAsDirty()
{
if (!EditorApplication.isPlaying)
{
EditorSceneManager.MarkSceneDirty(SceneManager.GetActiveScene());
}
}
/// <summary>
/// Ensures that the state of the Drawer is synchronized with the property.
/// </summary>
/// <param name="property">The SerializedProperty of the ResetParameters
/// to make the custom GUI for.</param>
void LazyInitializeParameters(SerializedProperty property)
{
if (m_Parameters != null)
{
return;
}
var target = property.serializedObject.targetObject;
m_Parameters = fieldInfo.GetValue(target) as ResetParameters;
if (m_Parameters == null)
{
m_Parameters = new ResetParameters();
fieldInfo.SetValue(target, m_Parameters);
}
}
/// <summary>
/// Removes the last ResetParameter from the ResetParameters
/// </summary>
void RemoveLastParameter()
{
if (m_Parameters.Count > 0)
{
var key = m_Parameters.Keys.ToList()[m_Parameters.Count - 1];
m_Parameters.Remove(key);
}
}
/// <summary>
/// Adds a new ResetParameter to the ResetParameters with a default name.
/// </summary>
void AddParameter()
{
var key = k_NewKeyPrefix + m_Parameters.Count;
var value = default(float);
try
{
m_Parameters.Add(key, value);
}
catch (Exception e)
{
Debug.Log(e.Message);
}
}
}
}

12
UnitySDK/Assets/ML-Agents/Editor/ResetParameterDrawer.cs.meta


fileFormatVersion: 2
guid: 740b9a60fe38f476ab020dcf91f3f94a
timeCreated: 1517291065
licenseType: Free
MonoImporter:
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

207
UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/EnvironmentParameters.cs


// <auto-generated>
// Generated by the protocol buffer compiler. DO NOT EDIT!
// source: mlagents/envs/communicator_objects/environment_parameters.proto
// </auto-generated>
#pragma warning disable 1591, 0612, 3021
#region Designer generated code
using pb = global::Google.Protobuf;
using pbc = global::Google.Protobuf.Collections;
using pbr = global::Google.Protobuf.Reflection;
using scg = global::System.Collections.Generic;
namespace MLAgents.CommunicatorObjects {
/// <summary>Holder for reflection information generated from mlagents/envs/communicator_objects/environment_parameters.proto</summary>
public static partial class EnvironmentParametersReflection {
#region Descriptor
/// <summary>File descriptor for mlagents/envs/communicator_objects/environment_parameters.proto</summary>
public static pbr::FileDescriptor Descriptor {
get { return descriptor; }
}
private static pbr::FileDescriptor descriptor;
static EnvironmentParametersReflection() {
byte[] descriptorData = global::System.Convert.FromBase64String(
string.Concat(
"Cj9tbGFnZW50cy9lbnZzL2NvbW11bmljYXRvcl9vYmplY3RzL2Vudmlyb25t",
"ZW50X3BhcmFtZXRlcnMucHJvdG8SFGNvbW11bmljYXRvcl9vYmplY3RzGkBt",
"bGFnZW50cy9lbnZzL2NvbW11bmljYXRvcl9vYmplY3RzL2N1c3RvbV9yZXNl",
"dF9wYXJhbWV0ZXJzLnByb3RvIogCChpFbnZpcm9ubWVudFBhcmFtZXRlcnNQ",
"cm90bxJfChBmbG9hdF9wYXJhbWV0ZXJzGAEgAygLMkUuY29tbXVuaWNhdG9y",
"X29iamVjdHMuRW52aXJvbm1lbnRQYXJhbWV0ZXJzUHJvdG8uRmxvYXRQYXJh",
"bWV0ZXJzRW50cnkSUQoXY3VzdG9tX3Jlc2V0X3BhcmFtZXRlcnMYAiABKAsy",
"MC5jb21tdW5pY2F0b3Jfb2JqZWN0cy5DdXN0b21SZXNldFBhcmFtZXRlcnNQ",
"cm90bxo2ChRGbG9hdFBhcmFtZXRlcnNFbnRyeRILCgNrZXkYASABKAkSDQoF",
"dmFsdWUYAiABKAI6AjgBQh+qAhxNTEFnZW50cy5Db21tdW5pY2F0b3JPYmpl",
"Y3RzYgZwcm90bzM="));
descriptor = pbr::FileDescriptor.FromGeneratedCode(descriptorData,
new pbr::FileDescriptor[] { global::MLAgents.CommunicatorObjects.CustomResetParametersReflection.Descriptor, },
new pbr::GeneratedClrTypeInfo(null, new pbr::GeneratedClrTypeInfo[] {
new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.EnvironmentParametersProto), global::MLAgents.CommunicatorObjects.EnvironmentParametersProto.Parser, new[]{ "FloatParameters", "CustomResetParameters" }, null, null, new pbr::GeneratedClrTypeInfo[] { null, })
}));
}
#endregion
}
#region Messages
public sealed partial class EnvironmentParametersProto : pb::IMessage<EnvironmentParametersProto> {
private static readonly pb::MessageParser<EnvironmentParametersProto> _parser = new pb::MessageParser<EnvironmentParametersProto>(() => new EnvironmentParametersProto());
private pb::UnknownFieldSet _unknownFields;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public static pb::MessageParser<EnvironmentParametersProto> Parser { get { return _parser; } }
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public static pbr::MessageDescriptor Descriptor {
get { return global::MLAgents.CommunicatorObjects.EnvironmentParametersReflection.Descriptor.MessageTypes[0]; }
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
pbr::MessageDescriptor pb::IMessage.Descriptor {
get { return Descriptor; }
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public EnvironmentParametersProto() {
OnConstruction();
}
partial void OnConstruction();
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public EnvironmentParametersProto(EnvironmentParametersProto other) : this() {
floatParameters_ = other.floatParameters_.Clone();
CustomResetParameters = other.customResetParameters_ != null ? other.CustomResetParameters.Clone() : null;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public EnvironmentParametersProto Clone() {
return new EnvironmentParametersProto(this);
}
/// <summary>Field number for the "float_parameters" field.</summary>
public const int FloatParametersFieldNumber = 1;
private static readonly pbc::MapField<string, float>.Codec _map_floatParameters_codec
= new pbc::MapField<string, float>.Codec(pb::FieldCodec.ForString(10), pb::FieldCodec.ForFloat(21), 10);
private readonly pbc::MapField<string, float> floatParameters_ = new pbc::MapField<string, float>();
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public pbc::MapField<string, float> FloatParameters {
get { return floatParameters_; }
}
/// <summary>Field number for the "custom_reset_parameters" field.</summary>
public const int CustomResetParametersFieldNumber = 2;
private global::MLAgents.CommunicatorObjects.CustomResetParametersProto customResetParameters_;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public global::MLAgents.CommunicatorObjects.CustomResetParametersProto CustomResetParameters {
get { return customResetParameters_; }
set {
customResetParameters_ = value;
}
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public override bool Equals(object other) {
return Equals(other as EnvironmentParametersProto);
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public bool Equals(EnvironmentParametersProto other) {
if (ReferenceEquals(other, null)) {
return false;
}
if (ReferenceEquals(other, this)) {
return true;
}
if (!FloatParameters.Equals(other.FloatParameters)) return false;
if (!object.Equals(CustomResetParameters, other.CustomResetParameters)) return false;
return Equals(_unknownFields, other._unknownFields);
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public override int GetHashCode() {
int hash = 1;
hash ^= FloatParameters.GetHashCode();
if (customResetParameters_ != null) hash ^= CustomResetParameters.GetHashCode();
if (_unknownFields != null) {
hash ^= _unknownFields.GetHashCode();
}
return hash;
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public override string ToString() {
return pb::JsonFormatter.ToDiagnosticString(this);
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public void WriteTo(pb::CodedOutputStream output) {
floatParameters_.WriteTo(output, _map_floatParameters_codec);
if (customResetParameters_ != null) {
output.WriteRawTag(18);
output.WriteMessage(CustomResetParameters);
}
if (_unknownFields != null) {
_unknownFields.WriteTo(output);
}
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public int CalculateSize() {
int size = 0;
size += floatParameters_.CalculateSize(_map_floatParameters_codec);
if (customResetParameters_ != null) {
size += 1 + pb::CodedOutputStream.ComputeMessageSize(CustomResetParameters);
}
if (_unknownFields != null) {
size += _unknownFields.CalculateSize();
}
return size;
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public void MergeFrom(EnvironmentParametersProto other) {
if (other == null) {
return;
}
floatParameters_.Add(other.floatParameters_);
if (other.customResetParameters_ != null) {
if (customResetParameters_ == null) {
customResetParameters_ = new global::MLAgents.CommunicatorObjects.CustomResetParametersProto();
}
CustomResetParameters.MergeFrom(other.CustomResetParameters);
}
_unknownFields = pb::UnknownFieldSet.MergeFrom(_unknownFields, other._unknownFields);
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public void MergeFrom(pb::CodedInputStream input) {
uint tag;
while ((tag = input.ReadTag()) != 0) {
switch(tag) {
default:
_unknownFields = pb::UnknownFieldSet.MergeFieldFrom(_unknownFields, input);
break;
case 10: {
floatParameters_.AddEntriesFrom(input, _map_floatParameters_codec);
break;
}
case 18: {
if (customResetParameters_ == null) {
customResetParameters_ = new global::MLAgents.CommunicatorObjects.CustomResetParametersProto();
}
input.ReadMessage(customResetParameters_);
break;
}
}
}
}
}
#endregion
}
#endregion Designer generated code

11
UnitySDK/Assets/ML-Agents/Scripts/Grpc/CommunicatorObjects/EnvironmentParameters.cs.meta


fileFormatVersion: 2
guid: 8b4c58a64d6a94f579774322ef683b17
MonoImporter:
externalObjects: {}
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

61
UnitySDK/Assets/ML-Agents/Scripts/ResetParameters.cs


using System;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Serialization;
namespace MLAgents
{
[Serializable]
public class ResetParameters : Dictionary<string, float>, ISerializationCallbackReceiver
{
[Serializable]
public struct ResetParameter
{
public string key;
public float value;
}
public ResetParameters() {}
public ResetParameters(IDictionary<string, float> dict) : base(dict)
{
UpdateResetParameters();
}
void UpdateResetParameters()
{
m_ResetParameters.Clear();
foreach (var pair in this)
{
m_ResetParameters.Add(new ResetParameter { key = pair.Key, value = pair.Value });
}
}
[FormerlySerializedAs("resetParameters")]
[SerializeField]
List<ResetParameter> m_ResetParameters = new List<ResetParameter>();
public void OnBeforeSerialize()
{
UpdateResetParameters();
}
public void OnAfterDeserialize()
{
Clear();
for (var i = 0; i < m_ResetParameters.Count; i++)
{
if (ContainsKey(m_ResetParameters[i].key))
{
Debug.LogError("The ResetParameters contains the same key twice");
}
else
{
Add(m_ResetParameters[i].key, m_ResetParameters[i].value);
}
}
}
}
}
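ResetParameters round-trips its dictionary through a serializable key/value list (Unity cannot serialize dictionaries directly), and OnAfterDeserialize keeps only the first occurrence of a duplicated key while logging an error. The same round-trip logic, sketched in Python for clarity:

```python
# Python sketch of the ResetParameters serialize/deserialize round trip:
# the dictionary flattens to a list of (key, value) pairs, and rebuilding
# it keeps only the first occurrence of any duplicated key.
def to_pairs(params):
    # Mirrors UpdateResetParameters / OnBeforeSerialize
    return [(k, v) for k, v in params.items()]

def from_pairs(pairs):
    # Mirrors OnAfterDeserialize; the C# version logs an error on duplicates
    params = {}
    duplicates = []
    for key, value in pairs:
        if key in params:
            duplicates.append(key)
        else:
            params[key] = value
    return params, duplicates


params, dups = from_pairs([("mass", 1.0), ("scale", 3.0), ("mass", 2.0)])
assert params == {"mass": 1.0, "scale": 3.0}
assert dups == ["mass"]
```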

12
UnitySDK/Assets/ML-Agents/Scripts/ResetParameters.cs.meta


fileFormatVersion: 2
guid: af19281a4c1dd47518ac7501c45eca9f
timeCreated: 1517261137
licenseType: Free
MonoImporter:
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

116
docs/images/academy.png

Before | After

130
ml-agents-envs/mlagents/envs/communicator_objects/environment_parameters_pb2.py


# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: mlagents/envs/communicator_objects/environment_parameters.proto
import sys
_b=sys.version_info[0]<3 and (lambda x:x) or (lambda x:x.encode('latin1'))
from google.protobuf import descriptor as _descriptor
from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()
from mlagents.envs.communicator_objects import custom_reset_parameters_pb2 as mlagents_dot_envs_dot_communicator__objects_dot_custom__reset__parameters__pb2
DESCRIPTOR = _descriptor.FileDescriptor(
name='mlagents/envs/communicator_objects/environment_parameters.proto',
package='communicator_objects',
syntax='proto3',
serialized_pb=_b('\n?mlagents/envs/communicator_objects/environment_parameters.proto\x12\x14\x63ommunicator_objects\x1a@mlagents/envs/communicator_objects/custom_reset_parameters.proto\"\x88\x02\n\x1a\x45nvironmentParametersProto\x12_\n\x10\x66loat_parameters\x18\x01 \x03(\x0b\x32\x45.communicator_objects.EnvironmentParametersProto.FloatParametersEntry\x12Q\n\x17\x63ustom_reset_parameters\x18\x02 \x01(\x0b\x32\x30.communicator_objects.CustomResetParametersProto\x1a\x36\n\x14\x46loatParametersEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\x02:\x02\x38\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[mlagents_dot_envs_dot_communicator__objects_dot_custom__reset__parameters__pb2.DESCRIPTOR,])
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY = _descriptor.Descriptor(
name='FloatParametersEntry',
full_name='communicator_objects.EnvironmentParametersProto.FloatParametersEntry',
filename=None,
file=DESCRIPTOR,
containing_type=None,
fields=[
_descriptor.FieldDescriptor(
name='key', full_name='communicator_objects.EnvironmentParametersProto.FloatParametersEntry.key', index=0,
number=1, type=9, cpp_type=9, label=1,
has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='value', full_name='communicator_objects.EnvironmentParametersProto.FloatParametersEntry.value', index=1,
number=2, type=2, cpp_type=6, label=1,
has_default_value=False, default_value=float(0),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
],
extensions=[
],
nested_types=[],
enum_types=[
],
options=_descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001')),
is_extendable=False,
syntax='proto3',
extension_ranges=[],
oneofs=[
],
serialized_start=366,
serialized_end=420,
)
_ENVIRONMENTPARAMETERSPROTO = _descriptor.Descriptor(
name='EnvironmentParametersProto',
full_name='communicator_objects.EnvironmentParametersProto',
filename=None,
file=DESCRIPTOR,
containing_type=None,
fields=[
_descriptor.FieldDescriptor(
name='float_parameters', full_name='communicator_objects.EnvironmentParametersProto.float_parameters', index=0,
number=1, type=11, cpp_type=10, label=3,
has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='custom_reset_parameters', full_name='communicator_objects.EnvironmentParametersProto.custom_reset_parameters', index=1,
number=2, type=11, cpp_type=10, label=1,
has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
],
extensions=[
],
nested_types=[_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY, ],
enum_types=[
],
options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],
oneofs=[
],
serialized_start=156,
serialized_end=420,
)
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY.containing_type = _ENVIRONMENTPARAMETERSPROTO
_ENVIRONMENTPARAMETERSPROTO.fields_by_name['float_parameters'].message_type = _ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY
_ENVIRONMENTPARAMETERSPROTO.fields_by_name['custom_reset_parameters'].message_type = mlagents_dot_envs_dot_communicator__objects_dot_custom__reset__parameters__pb2._CUSTOMRESETPARAMETERSPROTO
DESCRIPTOR.message_types_by_name['EnvironmentParametersProto'] = _ENVIRONMENTPARAMETERSPROTO
_sym_db.RegisterFileDescriptor(DESCRIPTOR)
EnvironmentParametersProto = _reflection.GeneratedProtocolMessageType('EnvironmentParametersProto', (_message.Message,), dict(
FloatParametersEntry = _reflection.GeneratedProtocolMessageType('FloatParametersEntry', (_message.Message,), dict(
DESCRIPTOR = _ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY,
__module__ = 'mlagents.envs.communicator_objects.environment_parameters_pb2'
# @@protoc_insertion_point(class_scope:communicator_objects.EnvironmentParametersProto.FloatParametersEntry)
))
,
DESCRIPTOR = _ENVIRONMENTPARAMETERSPROTO,
__module__ = 'mlagents.envs.communicator_objects.environment_parameters_pb2'
# @@protoc_insertion_point(class_scope:communicator_objects.EnvironmentParametersProto)
))
_sym_db.RegisterMessage(EnvironmentParametersProto)
_sym_db.RegisterMessage(EnvironmentParametersProto.FloatParametersEntry)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY.has_options = True
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY._options = _descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001'))
# @@protoc_insertion_point(module_scope)

75
ml-agents-envs/mlagents/envs/communicator_objects/environment_parameters_pb2.pyi


# @generated by generate_proto_mypy_stubs.py. Do not edit!
import sys
from google.protobuf.descriptor import (
Descriptor as google___protobuf___descriptor___Descriptor,
)
from google.protobuf.message import (
Message as google___protobuf___message___Message,
)
from mlagents.envs.communicator_objects.custom_reset_parameters_pb2 import (
CustomResetParametersProto as mlagents___envs___communicator_objects___custom_reset_parameters_pb2___CustomResetParametersProto,
)
from typing import (
Mapping as typing___Mapping,
MutableMapping as typing___MutableMapping,
Optional as typing___Optional,
Text as typing___Text,
)
from typing_extensions import (
Literal as typing_extensions___Literal,
)
builtin___bool = bool
builtin___bytes = bytes
builtin___float = float
builtin___int = int
class EnvironmentParametersProto(google___protobuf___message___Message):
DESCRIPTOR: google___protobuf___descriptor___Descriptor = ...
class FloatParametersEntry(google___protobuf___message___Message):
DESCRIPTOR: google___protobuf___descriptor___Descriptor = ...
key = ... # type: typing___Text
value = ... # type: builtin___float
def __init__(self,
*,
key : typing___Optional[typing___Text] = None,
value : typing___Optional[builtin___float] = None,
) -> None: ...
@classmethod
def FromString(cls, s: builtin___bytes) -> EnvironmentParametersProto.FloatParametersEntry: ...
def MergeFrom(self, other_msg: google___protobuf___message___Message) -> None: ...
def CopyFrom(self, other_msg: google___protobuf___message___Message) -> None: ...
if sys.version_info >= (3,):
def ClearField(self, field_name: typing_extensions___Literal[u"key",u"value"]) -> None: ...
else:
def ClearField(self, field_name: typing_extensions___Literal[u"key",b"key",u"value",b"value"]) -> None: ...
@property
def float_parameters(self) -> typing___MutableMapping[typing___Text, builtin___float]: ...
@property
def custom_reset_parameters(self) -> mlagents___envs___communicator_objects___custom_reset_parameters_pb2___CustomResetParametersProto: ...
def __init__(self,
*,
float_parameters : typing___Optional[typing___Mapping[typing___Text, builtin___float]] = None,
custom_reset_parameters : typing___Optional[mlagents___envs___communicator_objects___custom_reset_parameters_pb2___CustomResetParametersProto] = None,
) -> None: ...
@classmethod
def FromString(cls, s: builtin___bytes) -> EnvironmentParametersProto: ...
def MergeFrom(self, other_msg: google___protobuf___message___Message) -> None: ...
def CopyFrom(self, other_msg: google___protobuf___message___Message) -> None: ...
if sys.version_info >= (3,):
def HasField(self, field_name: typing_extensions___Literal[u"custom_reset_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"custom_reset_parameters",u"float_parameters"]) -> None: ...
else:
def HasField(self, field_name: typing_extensions___Literal[u"custom_reset_parameters",b"custom_reset_parameters"]) -> builtin___bool: ...
def ClearField(self, field_name: typing_extensions___Literal[u"custom_reset_parameters",b"custom_reset_parameters",u"float_parameters",b"float_parameters"]) -> None: ...

11
protobuf-definitions/proto/mlagents/envs/communicator_objects/environment_parameters.proto


syntax = "proto3";
import "mlagents/envs/communicator_objects/custom_reset_parameters.proto";
option csharp_namespace = "MLAgents.CommunicatorObjects";
package communicator_objects;
message EnvironmentParametersProto {
map<string, float> float_parameters = 1;
CustomResetParametersProto custom_reset_parameters = 2;
}