
Merge branch 'master' into develop-sac-apex

/develop/sac-apex
Ervin Teng, 5 years ago
Current commit
06fa3d39
113 files changed, with 1991 insertions and 966 deletions
  1. 2
      .pylintrc
  2. 4
      .yamato/standalone-build-test.yml
  3. 7
      .yamato/training-int-tests.yml
  4. 8
      Dockerfile
  5. 30
      Project/Assets/ML-Agents/Editor/Tests/StandaloneBuildTest.cs
  6. 2
      Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs
  7. 2
      Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs
  8. 2
      Project/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs
  9. 5
      Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs
  10. 12
      Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorSettings.cs
  11. 18
      Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity
  12. 3
      Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs
  13. 2
      Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs
  14. 3
      Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridSettings.cs
  15. 5
      Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs
  16. 3
      Project/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs
  17. 3
      Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ProjectSettingsOverrides.cs
  18. 3
      Project/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerFieldArea.cs
  19. 2
      Project/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs
  20. 2
      Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs
  21. 13
      Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs
  22. 4
      README.md
  23. 22
      com.unity.ml-agents/CHANGELOG.md
  24. 43
      com.unity.ml-agents/Runtime/Academy.cs
  25. 3
      com.unity.ml-agents/Runtime/Agent.cs
  26. 13
      com.unity.ml-agents/Runtime/Communicator/ICommunicator.cs
  27. 184
      com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs
  28. 5
      com.unity.ml-agents/Runtime/Policies/HeuristicPolicy.cs
  29. 20
      com.unity.ml-agents/Runtime/Sensors/StackingSensor.cs
  30. 7
      com.unity.ml-agents/Runtime/SideChannels/EngineConfigurationChannel.cs
  31. 2
      com.unity.ml-agents/Runtime/SideChannels/SideChannel.cs
  32. 8
      com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs
  33. 30
      com.unity.ml-agents/Tests/Editor/PublicAPI/PublicApiValidation.cs
  34. 16
      com.unity.ml-agents/Tests/Editor/SideChannelTests.cs
  35. 6
      config/trainer_config.yaml
  36. 10
      docs/Custom-SideChannels.md
  37. 23
      docs/Getting-Started.md
  38. 3
      docs/Installation.md
  39. 7
      docs/Learning-Environment-Create-New.md
  40. 1
      docs/Learning-Environment-Design-Agents.md
  41. 1
      docs/Learning-Environment-Examples.md
  42. 8
      docs/Learning-Environment-Executable.md
  43. 17
      docs/Migrating.md
  44. 11
      docs/Python-API.md
  45. 1
      docs/Readme.md
  46. 4
      docs/Training-Curriculum-Learning.md
  47. 6
      docs/Training-Environment-Parameter-Randomization.md
  48. 38
      docs/Training-ML-Agents.md
  49. 98
      docs/Training-Self-Play.md
  50. 9
      docs/Using-Docker.md
  51. 7
      docs/Using-Tensorboard.md
  52. 2
      gym-unity/README.md
  53. 34
      gym-unity/gym_unity/envs/__init__.py
  54. 48
      gym-unity/gym_unity/tests/test_gym.py
  55. 2
      ml-agents-envs/mlagents_envs/communicator.py
  56. 117
      ml-agents-envs/mlagents_envs/environment.py
  57. 3
      ml-agents-envs/mlagents_envs/mock_communicator.py
  58. 3
      ml-agents-envs/mlagents_envs/rpc_communicator.py
  59. 4
      ml-agents-envs/mlagents_envs/side_channel/outgoing_message.py
  60. 4
      ml-agents-envs/mlagents_envs/side_channel/side_channel.py
  61. 23
      ml-agents-envs/mlagents_envs/tests/test_envs.py
  62. 5
      ml-agents/mlagents/model_serialization.py
  63. 52
      ml-agents/mlagents/trainers/agent_processor.py
  64. 52
      ml-agents/mlagents/trainers/behavior_id_utils.py
  65. 5
      ml-agents/mlagents/trainers/components/reward_signals/__init__.py
  66. 4
      ml-agents/mlagents/trainers/curriculum.py
  67. 38
      ml-agents/mlagents/trainers/distributions.py
  68. 15
      ml-agents/mlagents/trainers/env_manager.py
  69. 439
      ml-agents/mlagents/trainers/ghost/trainer.py
  70. 145
      ml-agents/mlagents/trainers/learn.py
  71. 4
      ml-agents/mlagents/trainers/meta_curriculum.py
  72. 1
      ml-agents/mlagents/trainers/policy/nn_policy.py
  73. 34
      ml-agents/mlagents/trainers/policy/tf_policy.py
  74. 11
      ml-agents/mlagents/trainers/ppo/trainer.py
  75. 5
      ml-agents/mlagents/trainers/sac/optimizer.py
  76. 11
      ml-agents/mlagents/trainers/sac/trainer.py
  77. 6
      ml-agents/mlagents/trainers/simple_env_manager.py
  78. 47
      ml-agents/mlagents/trainers/stats.py
  79. 124
      ml-agents/mlagents/trainers/subprocess_env_manager.py
  80. 80
      ml-agents/mlagents/trainers/tests/simple_test_envs.py
  81. 43
      ml-agents/mlagents/trainers/tests/test_agent_processor.py
  82. 10
      ml-agents/mlagents/trainers/tests/test_distributions.py
  83. 36
      ml-agents/mlagents/trainers/tests/test_ghost.py
  84. 48
      ml-agents/mlagents/trainers/tests/test_learn.py
  85. 4
      ml-agents/mlagents/trainers/tests/test_meta_curriculum.py
  86. 129
      ml-agents/mlagents/trainers/tests/test_simple_rl.py
  87. 30
      ml-agents/mlagents/trainers/tests/test_stats.py
  88. 55
      ml-agents/mlagents/trainers/tests/test_subprocess_env_manager.py
  89. 22
      ml-agents/mlagents/trainers/tests/test_trainer_util.py
  90. 10
      ml-agents/mlagents/trainers/trainer/trainer.py
  91. 25
      ml-agents/mlagents/trainers/trainer_controller.py
  92. 41
      ml-agents/mlagents/trainers/trainer_util.py
  93. 2
      ml-agents/setup.py
  94. 67
      ml-agents/tests/yamato/training_int_tests.py
  95. 69
      ml-agents/tests/yamato/yamato_utils.py
  96. 1
      setup.cfg
  97. 32
      .yamato/gym-interface-test.yml
  98. 32
      .yamato/python-ll-api-test.yml
  99. 234
      com.unity.ml-agents/Runtime/SideChannels/SideChannelUtils.cs
  100. 11
      com.unity.ml-agents/Runtime/SideChannels/SideChannelUtils.cs.meta

2
.pylintrc


# Appears to be https://github.com/PyCQA/pylint/issues/2981
W0201,
# Using the global statement
W0603,

4
.yamato/standalone-build-test.yml


- "*.md"
- "com.unity.ml-agents/*.md"
- "com.unity.ml-agents/**/*.md"
artifacts:
standalonebuild:
paths:
- "Project/testPlayer*/**"
{% endfor %}

7
.yamato/training-int-tests.yml


commands:
- pip install pyyaml
- python -u -m ml-agents.tests.yamato.training_int_tests
# Backwards-compatibility tests.
# If we make a breaking change to the communication protocol, these will need
# to be disabled until the next release.
- python -u -m ml-agents.tests.yamato.training_int_tests --python=0.15.0
- python -u -m ml-agents.tests.yamato.training_int_tests --csharp=0.15.0
dependencies:
- .yamato/standalone-build-test.yml#test_mac_standalone_{{ editor.version }}
triggers:
cancel_old_ci: true
changes:

8
Dockerfile


WORKDIR /ml-agents
RUN pip install -e .
# port 5005 is the port used in in Editor training.
EXPOSE 5005
# Port 5004 is the port used in Editor training.
# Environments will start from port 5005,
# so allow enough ports for several environments.
EXPOSE 5004-5050
ENTRYPOINT ["mlagents-learn"]
ENTRYPOINT ["xvfb-run", "--auto-servernum", "--server-args='-screen 0 640x480x24'", "mlagents-learn"]

30
Project/Assets/ML-Agents/Editor/Tests/StandaloneBuildTest.cs


{
public class StandaloneBuildTest
{
const string k_outputCommandLineFlag = "--mlagents-build-output-path";
const string k_sceneCommandLineFlag = "--mlagents-build-scene-path";
string[] scenes = { "Assets/ML-Agents/Examples/3DBall/Scenes/3DBall.unity" };
var buildResult = BuildPipeline.BuildPlayer(scenes, "testPlayer", BuildTarget.StandaloneOSX, BuildOptions.None);
// Read commandline arguments for options
var outputPath = "testPlayer";
var scenePath = "Assets/ML-Agents/Examples/3DBall/Scenes/3DBall.unity";
var args = Environment.GetCommandLineArgs();
for (var i = 0; i < args.Length - 1; i++)
{
if (args[i] == k_outputCommandLineFlag)
{
outputPath = args[i + 1];
Debug.Log($"Overriding output path to {outputPath}");
}
else if (args[i] == k_sceneCommandLineFlag)
{
scenePath = args[i + 1];
}
}
string[] scenes = { scenePath };
var buildResult = BuildPipeline.BuildPlayer(
scenes,
outputPath,
BuildTarget.StandaloneOSX,
BuildOptions.None
);
var isOk = buildResult.summary.result == BuildResult.Succeeded;
var error = "";
foreach (var stepInfo in buildResult.steps)
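
The new build test scans the command-line arguments for a flag and takes the element that follows it as the value. A minimal Python sketch of the same pattern is shown here; the flag names and defaults are taken from the hunk, while `parse_build_args` is a hypothetical helper and not part of the repository.

```python
from typing import List, Tuple

OUTPUT_FLAG = "--mlagents-build-output-path"
SCENE_FLAG = "--mlagents-build-scene-path"


def parse_build_args(args: List[str]) -> Tuple[str, str]:
    """Return (output_path, scene_path), falling back to the defaults in the diff."""
    output_path = "testPlayer"
    scene_path = "Assets/ML-Agents/Examples/3DBall/Scenes/3DBall.unity"
    # Stop one short of the end so args[i + 1] is always a valid index.
    for i in range(len(args) - 1):
        if args[i] == OUTPUT_FLAG:
            output_path = args[i + 1]
        elif args[i] == SCENE_FLAG:
            scene_path = args[i + 1]
    return output_path, scene_path
```

A flag given as the very last argument has no value after it, so the loop bound leaves the default in place rather than reading past the end of the list.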

2
Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs


public override void Initialize()
{
m_BallRb = ball.GetComponent<Rigidbody>();
m_ResetParams = Academy.Instance.FloatProperties;
m_ResetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
SetResetParameters();
}

2
Project/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DHardAgent.cs


public override void Initialize()
{
m_BallRb = ball.GetComponent<Rigidbody>();
m_ResetParams = Academy.Instance.FloatProperties;
m_ResetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
SetResetParameters();
}

2
Project/Assets/ML-Agents/Examples/Bouncer/Scripts/BouncerAgent.cs


m_Rb = gameObject.GetComponent<Rigidbody>();
m_LookDir = Vector3.zero;
m_ResetParams = Academy.Instance.FloatProperties;
m_ResetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
SetResetParameters();
}

5
Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs


using UnityEngine;
using MLAgents;
using MLAgents.Sensors;
using MLAgents.SideChannels;
public class FoodCollectorAgent : Agent
{

public void SetLaserLengths()
{
m_LaserLength = Academy.Instance.FloatProperties.GetPropertyWithDefault("laser_length", 1.0f);
m_LaserLength = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().GetPropertyWithDefault("laser_length", 1.0f);
float agentScale = Academy.Instance.FloatProperties.GetPropertyWithDefault("agent_scale", 1.0f);
float agentScale = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().GetPropertyWithDefault("agent_scale", 1.0f);
gameObject.transform.localScale = new Vector3(agentScale, agentScale, agentScale);
}

12
Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorSettings.cs


using UnityEngine;
using UnityEngine.UI;
using MLAgents;
using MLAgents.SideChannels;
public class FoodCollectorSettings : MonoBehaviour
{

public int totalScore;
public Text scoreText;
StatsSideChannel m_statsSideChannel;
m_statsSideChannel = SideChannelUtils.GetSideChannel<StatsSideChannel>();
}
public void EnvironmentReset()

public void Update()
{
scoreText.text = $"Score: {totalScore}";
// Send stats via SideChannel so that they'll appear in TensorBoard.
// These values get averaged every summary_frequency steps, so we don't
// need to send every Update() call.
if ((Time.frameCount % 100)== 0)
{
m_statsSideChannel?.AddStat("TotalScore", totalScore);
}
}
}
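
The comment in `Update()` says the reported values are averaged over each summary window, which is why sending one sample every 100 frames is enough. The following Python sketch illustrates that accumulate-then-average idea. `StatAverager` is a hypothetical class written for this illustration; it is not the trainer's actual stats code, and the score values are made up.

```python
from collections import defaultdict
from typing import Dict, List


class StatAverager:
    """Hypothetical accumulator: collects values per key, averages on flush."""

    def __init__(self) -> None:
        self._values: Dict[str, List[float]] = defaultdict(list)

    def add_stat(self, key: str, value: float) -> None:
        self._values[key].append(value)

    def flush(self) -> Dict[str, float]:
        """Return the mean of each key and clear the buffer for the next window."""
        means = {k: sum(v) / len(v) for k, v in self._values.items() if v}
        self._values.clear()
        return means


averager = StatAverager()
for frame in range(1, 301):
    total_score = frame // 10  # stand-in for the environment's running score
    if frame % 100 == 0:       # same throttle as the C# Update() method
        averager.add_stat("TotalScore", total_score)
summary = averager.flush()
```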

18
Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity


m_ReflectionIntensity: 1
m_CustomReflection: {fileID: 0}
m_Sun: {fileID: 0}
m_IndirectSpecularColor: {r: 0.44971162, g: 0.49977726, b: 0.5756362, a: 1}
m_IndirectSpecularColor: {r: 0.44971168, g: 0.4997775, b: 0.57563686, a: 1}
m_UseRadianceAmbientProbe: 0
--- !u!157 &3
LightmapSettings:

m_Father: {fileID: 363761400}
m_RootOrder: 1
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
m_AnchorMin: {x: 0.5, y: 0.5}
m_AnchorMax: {x: 0.5, y: 0.5}
m_AnchoredPosition: {x: -369.5, y: -62.2}
m_SizeDelta: {x: 160, y: 55.6}
m_AnchorMin: {x: 0, y: 1}
m_AnchorMax: {x: 0, y: 1}
m_AnchoredPosition: {x: 150, y: -230}
m_SizeDelta: {x: 160, y: 55.599976}
m_Pivot: {x: 0.5, y: 0.5}
--- !u!114 &918893360
MonoBehaviour:

m_Calls: []
m_FontData:
m_Font: {fileID: 10102, guid: 0000000000000000e000000000000000, type: 0}
m_FontSize: 20
m_FontSize: 22
m_FontStyle: 0
m_BestFit: 0
m_MinSize: 2

m_Father: {fileID: 363761400}
m_RootOrder: 2
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
m_AnchorMin: {x: 0.5, y: 0.5}
m_AnchorMax: {x: 0.5, y: 0.5}
m_AnchoredPosition: {x: -369.5, y: -197}
m_AnchorMin: {x: 0, y: 1}
m_AnchorMax: {x: 0, y: 1}
m_AnchoredPosition: {x: 150, y: -128}
m_SizeDelta: {x: 200, y: 152}
m_Pivot: {x: 0.5, y: 0.5}
--- !u!114 &1305247361

3
Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridAgent.cs


using MLAgents;
using MLAgents.Sensors;
using UnityEngine.Serialization;
using MLAgents.SideChannels;
public class GridAgent : Agent
{

// Prevents the agent from picking an action that would make it collide with a wall
var positionX = (int)transform.position.x;
var positionZ = (int)transform.position.z;
var maxPosition = (int)Academy.Instance.FloatProperties.GetPropertyWithDefault("gridSize", 5f) - 1;
var maxPosition = (int)SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().GetPropertyWithDefault("gridSize", 5f) - 1;
if (positionX == 0)
{

2
Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs


public void Start()
{
m_ResetParameters = Academy.Instance.FloatProperties;
m_ResetParameters = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
m_Objects = new[] { goalPref, pitPref };

3
Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridSettings.cs


using UnityEngine;
using MLAgents;
using MLAgents.SideChannels;
public class GridSettings : MonoBehaviour
{

{
Academy.Instance.FloatProperties.RegisterCallback("gridSize", f =>
SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().RegisterCallback("gridSize", f =>
{
MainCamera.transform.position = new Vector3(-(f - 1) / 2f, f * 1.25f, -(f - 1) / 2f);
MainCamera.orthographicSize = (f + 5f) / 2f;

5
Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs


using System.Collections;
using UnityEngine;
using MLAgents;
using MLAgents.SideChannels;
public class PushAgentBasic : Agent
{

public void SetGroundMaterialFriction()
{
var resetParams = Academy.Instance.FloatProperties;
var resetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
var groundCollider = ground.GetComponent<Collider>();

public void SetBlockProperties()
{
var resetParams = Academy.Instance.FloatProperties;
var resetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
var scale = resetParams.GetPropertyWithDefault("block_scale", 2);
//Set the scale of the block

3
Project/Assets/ML-Agents/Examples/Reacher/Scripts/ReacherAgent.cs


using UnityEngine;
using MLAgents;
using MLAgents.Sensors;
using MLAgents.SideChannels;
public class ReacherAgent : Agent
{

public void SetResetParameters()
{
var fp = Academy.Instance.FloatProperties;
var fp = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
m_GoalSize = fp.GetPropertyWithDefault("goal_size", 5);
m_GoalSpeed = Random.Range(-1f, 1f) * fp.GetPropertyWithDefault("goal_speed", 1);
m_Deviation = fp.GetPropertyWithDefault("deviation", 0);

3
Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ProjectSettingsOverrides.cs


using UnityEngine;
using MLAgents;
using MLAgents.SideChannels;
namespace MLAgentsExamples
{

Physics.defaultSolverIterations = solverIterations;
Physics.defaultSolverVelocityIterations = solverVelocityIterations;
Academy.Instance.FloatProperties.RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().RegisterCallback("gravity", f => { Physics.gravity = new Vector3(0, -f, 0); });
}
public void OnDestroy()

3
Project/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerFieldArea.cs


using System.Collections;
using System.Collections.Generic;
using MLAgents;
using MLAgents.SideChannels;
using UnityEngine;
using UnityEngine.Serialization;

ballRb.velocity = Vector3.zero;
ballRb.angularVelocity = Vector3.zero;
var ballScale = Academy.Instance.FloatProperties.GetPropertyWithDefault("ball_scale", 0.015f);
var ballScale = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>().GetPropertyWithDefault("ball_scale", 0.015f);
ballRb.transform.localScale = new Vector3(ballScale, ballScale, ballScale);
}
}

2
Project/Assets/ML-Agents/Examples/Tennis/Scripts/TennisAgent.cs


m_BallRb = ball.GetComponent<Rigidbody>();
var canvas = GameObject.Find(k_CanvasName);
GameObject scoreBoard;
m_ResetParams = Academy.Instance.FloatProperties;
m_ResetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
if (invertX)
{
scoreBoard = canvas.transform.Find(k_ScoreBoardBName).gameObject;

2
Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs


m_ChestRb = chest.GetComponent<Rigidbody>();
m_SpineRb = spine.GetComponent<Rigidbody>();
m_ResetParams = Academy.Instance.FloatProperties;
m_ResetParams = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
SetResetParameters();
}

13
Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs


using MLAgents;
using Barracuda;
using MLAgents.Sensors;
using MLAgents.SideChannels;
public class WallJumpAgent : Agent
{

Vector3 m_JumpTargetPos;
Vector3 m_JumpStartingPos;
FloatPropertiesChannel m_FloatProperties;
public override void Initialize()
{
m_WallJumpSettings = FindObjectOfType<WallJumpSettings>();

m_GroundMaterial = m_GroundRenderer.material;
spawnArea.SetActive(false);
m_FloatProperties = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
}
// Begin the jump sequence

{
localScale = new Vector3(
localScale.x,
Academy.Instance.FloatProperties.GetPropertyWithDefault("no_wall_height", 0),
m_FloatProperties.GetPropertyWithDefault("no_wall_height", 0),
localScale.z);
wall.transform.localScale = localScale;
SetModel("SmallWallJump", noWallBrain);

localScale = new Vector3(
localScale.x,
Academy.Instance.FloatProperties.GetPropertyWithDefault("small_wall_height", 4),
m_FloatProperties.GetPropertyWithDefault("small_wall_height", 4),
localScale.z);
wall.transform.localScale = localScale;
SetModel("SmallWallJump", smallWallBrain);

var min = Academy.Instance.FloatProperties.GetPropertyWithDefault("big_wall_min_height", 8);
var max = Academy.Instance.FloatProperties.GetPropertyWithDefault("big_wall_max_height", 8);
var min = m_FloatProperties.GetPropertyWithDefault("big_wall_min_height", 8);
var max = m_FloatProperties.GetPropertyWithDefault("big_wall_max_height", 8);
var height = min + Random.value * (max - min);
localScale = new Vector3(
localScale.x,

4
README.md


* Train using concurrent Unity environment instances
## Releases & Documentation
**Our latest, stable release is 0.15.0. Click
**Our latest, stable release is 0.15.1. Click
get started with the latest release of ML-Agents.**
The table below lists all our releases, including our `master` branch which is under active

| **Version** | **Release Date** | **Source** | **Documentation** | **Download** |
|:-------:|:------:|:-------------:|:-------:|:------------:|
| **master (unstable)** | -- | [source](https://github.com/Unity-Technologies/ml-agents/tree/master) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/master/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/master.zip) |
| **0.15.1** | **March 30, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.1.zip)** |
| **0.15.0** | **March 18, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.0.zip)** |
| **0.14.1** | February 26, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.14.1.zip) |
| **0.14.0** | February 13, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.14.0) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.14.0/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.14.0.zip) |

22
com.unity.ml-agents/CHANGELOG.md


## [Unreleased]
### Major Changes
- The `--load` and `--train` command-line flags have been deprecated. Training now happens by default, and
use `--resume` to resume training instead. (#3705)
- The Jupyter notebooks have been removed from the repository.
- Introduced the `SideChannelUtils` to register, unregister and access side channels.
- `Academy.FloatProperties` was removed, please use `SideChannelUtils.GetSideChannel<FloatPropertiesChannel>()` instead.
- Raise the wall in CrawlerStatic scene to prevent Agent from falling off. (#3650)
- Added a feature to allow sending stats from C# environments to TensorBoard (and other python StatsWriters). To do this from your code, use `SideChannelUtils.GetSideChannel<StatsSideChannel>().AddStat(key, value)` (#3660)
- The way that UnityEnvironment decides the port was changed. If no port is specified, the behavior will depend on the `file_name` parameter. If it is `None`, 5004 (the editor port) will be used; otherwise 5005 (the base environment port) will be used.
- Fixed an issue where exceptions from environments provided a returncode of 0. (#3680)
- Running `mlagents-learn` with the same `--run-id` twice will no longer overwrite the existing files. (#3705)
- `StackingSensor` was changed from `internal` visibility to `public`
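
The port rule described in the entry above can be sketched in Python as follows. This is an illustration of the stated behavior only: `choose_base_port` and the constant names are hypothetical, and the actual `UnityEnvironment` code may differ in detail.

```python
from typing import Optional

EDITOR_PORT = 5004    # port used when connecting to the Unity Editor
BASE_ENV_PORT = 5005  # base port for built environment executables


def choose_base_port(file_name: Optional[str], base_port: Optional[int] = None) -> int:
    """Pick the port per the changelog rule when the caller gave none."""
    if base_port is not None:
        return base_port  # an explicit port always wins
    return EDITOR_PORT if file_name is None else BASE_ENV_PORT
```

Under this rule, `choose_base_port(None)` selects the Editor port and `choose_base_port("3DBall")` selects the base environment port.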
## [0.15.1-preview] - 2020-03-30
### Bug Fixes
- Raise the wall in CrawlerStatic scene to prevent Agent from falling off. (#3650)
- Fixed an issue where specifying `vis_encode_type` was required only for SAC. (#3677)
- Fixed the reported entropy values for continuous actions (#3684)
- Fixed an issue where switching models using `SetModel()` during training would use an excessive amount of memory. (#3664)
- Environment subprocesses now close immediately on timeout or wrong API version. (#3679)
- Fixed an issue in the gym wrapper that would raise an exception if an Agent called EndEpisode multiple times in the same step. (#3700)
- Fixed an issue where logging output was not visible; logging levels are now set consistently. (#3703)
## [0.15.0-preview] - 2020-03-18
### Major Changes

43
com.unity.ml-agents/Runtime/Academy.cs


/// </summary>
public static Academy Instance { get { return s_Lazy.Value; } }
/// <summary>
/// Collection of float properties (indexed by a string).
/// </summary>
public FloatPropertiesChannel FloatProperties;
// Fields not provided in the Inspector.
/// <summary>

}
/// <summary>
/// Registers SideChannel to the Academy to send and receive data with Python.
/// If IsCommunicatorOn is false, the SideChannel will not be registered.
/// </summary>
/// <param name="channel"> The side channel to be registered.</param>
public void RegisterSideChannel(SideChannel channel)
{
LazyInitialize();
Communicator?.RegisterSideChannel(channel);
}
/// <summary>
/// Unregisters SideChannel to the Academy. If the side channel was not registered,
/// nothing will happen.
/// </summary>
/// <param name="channel"> The side channel to be unregistered.</param>
public void UnregisterSideChannel(SideChannel channel)
{
Communicator?.UnregisterSideChannel(channel);
}
/// <summary>
/// Disable stepping of the Academy during the FixedUpdate phase. If this is called, the Academy must be
/// stepped manually by the user by calling Academy.EnvironmentStep().
/// </summary>

{
EnableAutomaticStepping();
var floatProperties = new FloatPropertiesChannel();
FloatProperties = floatProperties;
SideChannelUtils.RegisterSideChannel(new EngineConfigurationChannel());
SideChannelUtils.RegisterSideChannel(new FloatPropertiesChannel());
SideChannelUtils.RegisterSideChannel(new StatsSideChannel());
// Try to launch the communicator by using the arguments passed at launch
var port = ReadPortFromArgs();

if (Communicator != null)
{
Communicator.RegisterSideChannel(new EngineConfigurationChannel());
Communicator.RegisterSideChannel(floatProperties);
// We try to exchange the first message with Python. If this fails, it means
// no Python Process is ready to train the environment. In this case, the
//environment must use Inference.

DecideAction?.Invoke();
}
// If the communicator is not on, we need to clear the SideChannel sending queue
if (!IsCommunicatorOn)
{
SideChannelUtils.GetSideChannelMessage();
}
using (TimerStack.Instance.Scoped("AgentAct"))
{
AgentAct?.Invoke();

Communicator?.Dispose();
Communicator = null;
SideChannelUtils.UnregisterAllSideChannels();
if (m_ModelRunners != null)
{

// TODO - Pass worker ID or some other identifier,
// so that multiple envs won't overwrite each others stats.
TimerStack.Instance.SaveJsonTimers();
FloatProperties = null;
m_Initialized = false;
// Reset the Lazy instance

3
com.unity.ml-agents/Runtime/Agent.cs


void NotifyAgentDone(DoneReason doneReason)
{
m_Info.episodeId = m_EpisodeId;
m_Info.reward = m_Reward;
m_Info.done = true;
m_Info.maxStepReached = doneReason == DoneReason.MaxStepReached;

// If everything is the same, don't make any changes.
return;
}
NotifyAgentDone(DoneReason.Disabled);
m_PolicyFactory.model = model;
m_PolicyFactory.inferenceDevice = inferenceDevice;
m_PolicyFactory.behaviorName = behaviorName;

13
com.unity.ml-agents/Runtime/Communicator/ICommunicator.cs


/// <param name="agentId">A key to identify which Agent actions to get.</param>
/// <returns></returns>
float[] GetActions(string key, int agentId);
/// <summary>
/// Registers a side channel to the communicator. The side channel will exchange
/// messages with its Python equivalent.
/// </summary>
/// <param name="sideChannel"> The side channel to be registered.</param>
void RegisterSideChannel(SideChannel sideChannel);
/// <summary>
/// Unregisters a side channel from the communicator.
/// </summary>
/// <param name="sideChannel"> The side channel to be unregistered.</param>
void UnregisterSideChannel(SideChannel sideChannel);
}
}

184
com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs


using MLAgents.Sensors;
using MLAgents.Policies;
using MLAgents.SideChannels;
using System.IO;
using Google.Protobuf;
namespace MLAgents

#endif
/// The communicator parameters sent at construction
CommunicatorInitParameters m_CommunicatorInitParameters;
Dictionary<Guid, SideChannel> m_SideChannels = new Dictionary<Guid, SideChannel>();
/// <summary>
/// Initializes a new instance of the RPCCommunicator class.

void UpdateEnvironmentWithInput(UnityRLInputProto rlInput)
{
ProcessSideChannelData(m_SideChannels, rlInput.SideChannel.ToArray());
SideChannelUtils.ProcessSideChannelData(rlInput.SideChannel.ToArray());
SendCommandEvent(rlInput.Command);
}

message.RlInitializationOutput = tempUnityRlInitializationOutput;
}
byte[] messageAggregated = GetSideChannelMessage(m_SideChannels);
byte[] messageAggregated = SideChannelUtils.GetSideChannelMessage();
message.RlOutput.SideChannel = ByteString.CopyFrom(messageAggregated);
var input = Exchange(message);

{
if (m_CurrentUnityRlOutput.AgentInfos.ContainsKey(behaviorName))
{
if (output == null)
if (m_CurrentUnityRlOutput.AgentInfos[behaviorName].CalculateSize() > 0)
output = new UnityRLInitializationOutputProto();
}
// Only send the BrainParameters if there is a non empty list of
// AgentInfos ready to be sent.
// This is to ensure that The Python side will always have a first
// observation when receiving the BrainParameters
if (output == null)
{
output = new UnityRLInitializationOutputProto();
}
var brainParameters = m_UnsentBrainKeys[behaviorName];
output.BrainParameters.Add(brainParameters.ToProto(behaviorName, true));
var brainParameters = m_UnsentBrainKeys[behaviorName];
output.BrainParameters.Add(brainParameters.ToProto(behaviorName, true));
}
}
}

{
m_SentBrainKeys.Add(brainProto.BrainName);
m_UnsentBrainKeys.Remove(brainProto.BrainName);
}
}
#endregion
#region Handling side channels
/// <summary>
/// Registers a side channel to the communicator. The side channel will exchange
/// messages with its Python equivalent.
/// </summary>
/// <param name="sideChannel"> The side channel to be registered.</param>
public void RegisterSideChannel(SideChannel sideChannel)
{
var channelId = sideChannel.ChannelId;
if (m_SideChannels.ContainsKey(channelId))
{
throw new UnityAgentsException(string.Format(
"A side channel with type index {0} is already registered. You cannot register multiple " +
"side channels of the same id.", channelId));
}
// Process any messages that we've already received for this channel ID.
var numMessages = m_CachedMessages.Count;
for (int i = 0; i < numMessages; i++)
{
var cachedMessage = m_CachedMessages.Dequeue();
if (channelId == cachedMessage.ChannelId)
{
using (var incomingMsg = new IncomingMessage(cachedMessage.Message))
{
sideChannel.OnMessageReceived(incomingMsg);
}
}
else
{
m_CachedMessages.Enqueue(cachedMessage);
}
}
m_SideChannels.Add(channelId, sideChannel);
}
/// <summary>
/// Unregisters a side channel from the communicator.
/// </summary>
/// <param name="sideChannel"> The side channel to be unregistered.</param>
public void UnregisterSideChannel(SideChannel sideChannel)
{
if (m_SideChannels.ContainsKey(sideChannel.ChannelId))
{
m_SideChannels.Remove(sideChannel.ChannelId);
}
}
/// <summary>
/// Grabs the messages that the registered side channels will send to Python at the current step
/// into a single byte array.
/// </summary>
/// <param name="sideChannels"> A dictionary of channel type to channel.</param>
/// <returns></returns>
public static byte[] GetSideChannelMessage(Dictionary<Guid, SideChannel> sideChannels)
{
using (var memStream = new MemoryStream())
{
using (var binaryWriter = new BinaryWriter(memStream))
{
foreach (var sideChannel in sideChannels.Values)
{
var messageList = sideChannel.MessageQueue;
foreach (var message in messageList)
{
binaryWriter.Write(sideChannel.ChannelId.ToByteArray());
binaryWriter.Write(message.Count());
binaryWriter.Write(message);
}
sideChannel.MessageQueue.Clear();
}
return memStream.ToArray();
}
}
}
private struct CachedSideChannelMessage
{
public Guid ChannelId;
public byte[] Message;
}
private static Queue<CachedSideChannelMessage> m_CachedMessages = new Queue<CachedSideChannelMessage>();
/// <summary>
/// Separates the data received from Python into individual messages for each registered side channel.
/// </summary>
/// <param name="sideChannels">A dictionary of channel type to channel.</param>
/// <param name="dataReceived">The byte array of data received from Python.</param>
public static void ProcessSideChannelData(Dictionary<Guid, SideChannel> sideChannels, byte[] dataReceived)
{
while (m_CachedMessages.Count != 0)
{
var cachedMessage = m_CachedMessages.Dequeue();
if (sideChannels.ContainsKey(cachedMessage.ChannelId))
{
using (var incomingMsg = new IncomingMessage(cachedMessage.Message))
{
sideChannels[cachedMessage.ChannelId].OnMessageReceived(incomingMsg);
}
}
else
{
Debug.Log(string.Format(
"Unknown side channel data received. Channel Id is "
+ ": {0}", cachedMessage.ChannelId));
}
}
if (dataReceived.Length == 0)
{
return;
}
using (var memStream = new MemoryStream(dataReceived))
{
using (var binaryReader = new BinaryReader(memStream))
{
while (memStream.Position < memStream.Length)
{
Guid channelId = Guid.Empty;
byte[] message = null;
try
{
channelId = new Guid(binaryReader.ReadBytes(16));
var messageLength = binaryReader.ReadInt32();
message = binaryReader.ReadBytes(messageLength);
}
catch (Exception ex)
{
throw new UnityAgentsException(
"There was a problem reading a message in a SideChannel. Please make sure the " +
"version of MLAgents in Unity is compatible with the Python version. Original error : "
+ ex.Message);
}
if (sideChannels.ContainsKey(channelId))
{
using (var incomingMsg = new IncomingMessage(message))
{
sideChannels[channelId].OnMessageReceived(incomingMsg);
}
}
else
{
// Don't recognize this ID, but cache it in case the SideChannel that can handle
// it is registered before the next call to ProcessSideChannelData.
m_CachedMessages.Enqueue(new CachedSideChannelMessage
{
ChannelId = channelId,
Message = message
});
}
}
}
}
}

5
com.unity.ml-agents/Runtime/Policies/HeuristicPolicy.cs


public void RequestDecision(AgentInfo info, List<ISensor> sensors)
{
StepSensors(sensors);
m_LastDecision = m_Heuristic.Invoke();
if (!info.done)
{
m_LastDecision = m_Heuristic.Invoke();
}
}
/// <inheritdoc />

20
com.unity.ml-agents/Runtime/Sensors/StackingSensor.cs


/// For example, 4 stacked sets of observations would be output like
/// | t = now - 3 | t = now - 2 | t = now - 1 | t = now |
/// Internally, a circular buffer of arrays is used. The m_CurrentIndex represents the most recent observation.
///
/// Currently, compressed and multidimensional observations are not supported.
internal class StackingSensor : ISensor
public class StackingSensor : ISensor
{
/// <summary>
/// The wrapped sensor.

WriteAdapter m_LocalAdapter = new WriteAdapter();
/// <summary>
///
/// Initializes the sensor.
/// </summary>
/// <param name="wrapped">The wrapped sensor.</param>
/// <param name="numStackedObservations">Number of stacked observations to keep.</param>

m_Name = $"StackingSensor_size{numStackedObservations}_{wrapped.GetName()}";
if (wrapped.GetCompressionType() != SensorCompressionType.None)
{
throw new UnityAgentsException("StackingSensor doesn't support compressed observations.");
}
if (shape.Length != 1)
{
throw new UnityAgentsException("Only 1-D observations are supported by StackingSensor");
}
m_Shape = new int[shape.Length];
m_UnstackedObservationSize = wrapped.ObservationSize();

}
}
/// <inheritdoc/>
public int Write(WriteAdapter adapter)
{
// First, call the wrapped sensor's write method. Make sure to use our own adapter, not the passed one.

m_CurrentIndex = (m_CurrentIndex + 1) % m_NumStackedObservations;
}
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
/// <inheritdoc/>
public virtual SensorCompressionType GetCompressionType()
{
return SensorCompressionType.None;

7
com.unity.ml-agents/Runtime/SideChannels/EngineConfigurationChannel.cs


/// </summary>
public class EngineConfigurationChannel : SideChannel
{
private const string k_EngineConfigId = "e951342c-4f7e-11ea-b238-784f4387d1f7";
const string k_EngineConfigId = "e951342c-4f7e-11ea-b238-784f4387d1f7";
/// Initializes the side channel.
/// Initializes the side channel. The constructor is internal because only one instance is
/// supported at a time, and is created by the Academy.
public EngineConfigurationChannel()
internal EngineConfigurationChannel()
{
ChannelId = new Guid(k_EngineConfigId);
}

2
com.unity.ml-agents/Runtime/SideChannels/SideChannel.cs


using System.Collections.Generic;
using System;
using System.IO;
using System.Text;
namespace MLAgents.SideChannels
{

8
com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs


using System.Collections.Generic;
using MLAgents.Sensors;
using MLAgents.Policies;
using MLAgents.SideChannels;
namespace MLAgents.Tests
{

Assert.AreEqual(0, aca.EpisodeCount);
Assert.AreEqual(0, aca.StepCount);
Assert.AreEqual(0, aca.TotalStepCount);
Assert.AreNotEqual(null, aca.FloatProperties);
Assert.AreNotEqual(null, SideChannelUtils.GetSideChannel<FloatPropertiesChannel>());
// Check that Dispose is idempotent
aca.Dispose();

[Test]
public void TestAcademyDispose()
{
var floatProperties1 = Academy.Instance.FloatProperties;
var floatProperties1 = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
var floatProperties2 = Academy.Instance.FloatProperties;
Academy.Instance.LazyInitialize();
var floatProperties2 = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
Academy.Instance.Dispose();
Assert.AreNotEqual(floatProperties1, floatProperties2);

30
com.unity.ml-agents/Tests/Editor/PublicAPI/PublicApiValidation.cs


}
}
// Simple SensorComponent that sets up a StackingSensor
class StackingComponent : SensorComponent
{
public SensorComponent wrappedComponent;
public int numStacks;
public override ISensor CreateSensor()
{
var wrappedSensor = wrappedComponent.CreateSensor();
return new StackingSensor(wrappedSensor, numStacks);
}
public override int[] GetObservationShape()
{
int[] shape = (int[]) wrappedComponent.GetObservationShape().Clone();
for (var i = 0; i < shape.Length; i++)
{
shape[i] *= numStacks;
}
return shape;
}
}
[Test]
public void CheckSetupAgent()

sensorComponent.sensorName = "ray3d";
sensorComponent.detectableTags = new List<string> { "Player", "Respawn" };
sensorComponent.raysPerDirection = 3;
// Make a StackingSensor that wraps the RayPerceptionSensorComponent3D
// This isn't necessarily practical, just to ensure that it can be done
var wrappingSensorComponent = gameObject.AddComponent<StackingComponent>();
wrappingSensorComponent.wrappedComponent = sensorComponent;
wrappingSensorComponent.numStacks = 3;
// ISensor isn't set up yet.
Assert.IsNull(sensorComponent.raySensor);

16
com.unity.ml-agents/Tests/Editor/SideChannelTests.cs


intSender.SendInt(5);
intSender.SendInt(6);
byte[] fakeData = RpcCommunicator.GetSideChannelMessage(dictSender);
RpcCommunicator.ProcessSideChannelData(dictReceiver, fakeData);
byte[] fakeData = SideChannelUtils.GetSideChannelMessage(dictSender);
SideChannelUtils.ProcessSideChannelData(dictReceiver, fakeData);
Assert.AreEqual(intReceiver.messagesReceived[0], 4);
Assert.AreEqual(intReceiver.messagesReceived[1], 5);

strSender.SendRawBytes(Encoding.ASCII.GetBytes(str1));
strSender.SendRawBytes(Encoding.ASCII.GetBytes(str2));
byte[] fakeData = RpcCommunicator.GetSideChannelMessage(dictSender);
RpcCommunicator.ProcessSideChannelData(dictReceiver, fakeData);
byte[] fakeData = SideChannelUtils.GetSideChannelMessage(dictSender);
SideChannelUtils.ProcessSideChannelData(dictReceiver, fakeData);
var messages = strReceiver.GetAndClearReceivedMessages();

tmp = propB.GetPropertyWithDefault(k2, 3.0f);
Assert.AreEqual(tmp, 1.0f);
byte[] fakeData = RpcCommunicator.GetSideChannelMessage(dictSender);
RpcCommunicator.ProcessSideChannelData(dictReceiver, fakeData);
byte[] fakeData = SideChannelUtils.GetSideChannelMessage(dictSender);
SideChannelUtils.ProcessSideChannelData(dictReceiver, fakeData);
tmp = propA.GetPropertyWithDefault(k2, 3.0f);
Assert.AreEqual(tmp, 1.0f);

Assert.AreEqual(wasCalled, 0);
fakeData = RpcCommunicator.GetSideChannelMessage(dictSender);
RpcCommunicator.ProcessSideChannelData(dictReceiver, fakeData);
fakeData = SideChannelUtils.GetSideChannelMessage(dictSender);
SideChannelUtils.ProcessSideChannelData(dictReceiver, fakeData);
Assert.AreEqual(wasCalled, 1);
var keysA = propA.ListProperties();

6
config/trainer_config.yaml


time_horizon: 1000
self_play:
window: 10
play_against_current_self_ratio: 0.5
play_against_latest_model_ratio: 0.5
team_change: 100000
Soccer:
normalize: false

num_layers: 2
self_play:
window: 10
play_against_current_self_ratio: 0.5
play_against_latest_model_ratio: 0.5
team_change: 100000
CrawlerStatic:
normalize: true

10
docs/Custom-SideChannels.md


`base.QueueMessageToSend(msg)` method inside the side channel, and call the
`OutgoingMessage.Dispose()` method.
To register a side channel on the Unity side, call `Academy.Instance.RegisterSideChannel` with the side channel
To register a side channel on the Unity side, call `SideChannelUtils.RegisterSideChannel` with the side channel
as the only argument.
### Python side

// When a Debug.Log message is created, we send it to the stringChannel
Application.logMessageReceived += stringChannel.SendDebugStatementToPython;
// The channel must be registered with the Academy
Academy.Instance.RegisterSideChannel(stringChannel);
// The channel must be registered with the SideChannelUtils class
SideChannelUtils.RegisterSideChannel(stringChannel);
}
public void OnDestroy()

if (Academy.IsInitialized){
Academy.Instance.UnregisterSideChannel(stringChannel);
SideChannelUtils.UnregisterSideChannel(stringChannel);
}
}

string_log = StringLogChannel()
# We start the communication with the Unity Editor and pass the string_log side channel as input
env = UnityEnvironment(base_port=UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels=[string_log])
env = UnityEnvironment(side_channels=[string_log])
env.reset()
string_log.send_string("The environment was reset")

23
docs/Getting-Started.md


Depending on your version of Unity, it may be necessary to change the **Scripting Runtime Version** of your project. This can be done as follows:
1. Launch Unity
2. On the Projects dialog, choose the **Open** option at the top of the window.
1. Launch Unity Hub
2. On the Projects dialog, choose the **Add** option at the top of the window.
3. Using the file dialog that opens, locate the `Project` folder
within the ML-Agents toolkit project and click **Open**.
4. Go to **Edit** > **Project Settings** > **Player**

2. Navigate to the folder where you cloned the ML-Agents toolkit repository.
**Note**: If you followed the default [installation](Installation.md), then
you should be able to run `mlagents-learn` from any directory.
3. Run `mlagents-learn <trainer-config-path> --run-id=<run-identifier> --train`
3. Run `mlagents-learn <trainer-config-path> --run-id=<run-identifier>`
training runs
- `--train` tells `mlagents-learn` to run a training session (rather
than inference)
training runs. Make sure to use one that hasn't been used already!
mlagents-learn config/trainer_config.yaml --run-id=firstRun --train
mlagents-learn config/trainer_config.yaml --run-id=firstRun
```
5. When the message _"Start training by pressing the Play button in the Unity

**Note**: If you're using Anaconda, don't forget to activate the ml-agents
environment first.
The `--train` flag tells the ML-Agents toolkit to run in training mode.
The `--time-scale=100` flag sets the `Time.timeScale` value in Unity.
**Note**: You can train using an executable rather than the Editor. To do so,

command-line prompt. If you close the window manually, the `.nn` file
containing the trained model is not exported into the ml-agents folder.
You can press Ctrl+C to stop the training, and your trained model will be at
`models/<run-identifier>/<behavior_name>.nn` where
If you've quit the training early using Ctrl+C and want to resume training, run the
same command again, appending the `--resume` flag:
```sh
mlagents-learn config/trainer_config.yaml --run-id=firstRun --resume
```
Your trained model will be at `models/<run-identifier>/<behavior_name>.nn` where
`<behavior_name>` is the `Behavior Name` of the agents corresponding to the model.
(**Note:** There is a known bug on Windows that causes the saving of the model to
fail when you terminate the training early; it's recommended to wait until Step

3
docs/Installation.md


By installing the `mlagents` package, the dependencies listed in the
[setup.py file](../ml-agents/setup.py) are also installed. These include
[TensorFlow](Background-TensorFlow.md) (Requires a CPU w/ AVX support) and
[Jupyter](Background-Jupyter.md).
[TensorFlow](Background-TensorFlow.md) (Requires a CPU w/ AVX support).
#### Advanced: Installing for Development

7
docs/Learning-Environment-Create-New.md


includes a convenient Monitor class that you can use to easily display Agent
status information in the Game window.
One additional test you can perform is to first ensure that your environment and
the Python API work as expected using the `notebooks/getting-started.ipynb`
[Jupyter notebook](Background-Jupyter.md). Within the notebook, be sure to set
`env_name` to the name of the environment file you specify when building this
environment.
## Training the Environment

To train in the editor, run the following Python command from a Terminal or Console
window before pressing play:
mlagents-learn config/config.yaml --run-id=RollerBall-1 --train
mlagents-learn config/config.yaml --run-id=RollerBall-1
(where `config.yaml` is a copy of `trainer_config.yaml` that you have edited
to change the `batch_size` and `buffer_size` hyperparameters for your trainer.)

1
docs/Learning-Environment-Design-Agents.md


```csharp
normalizedValue = (currentValue - minValue)/(maxValue - minValue)
```
:warning: For vectors, you should apply the above formula to each component (x, y, and z). Note that this is *not* the same as using the `Vector3.normalized` property or `Vector3.Normalize()` method in Unity (and similar for `Vector2`).
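
As a rough illustration of the difference described above, the following Python sketch applies the min-max formula to each component and contrasts it with scaling a vector to unit length, which is what `Vector3.normalized` does. The function names and the example ranges are hypothetical, chosen only for this example.

```python
import math
from typing import Sequence, Tuple


def min_max_normalize(value: float, min_value: float, max_value: float) -> float:
    """Map value from [min_value, max_value] to [0, 1]."""
    return (value - min_value) / (max_value - min_value)


def normalize_components(
    vector: Sequence[float],
    min_values: Sequence[float],
    max_values: Sequence[float],
) -> Tuple[float, ...]:
    """Apply the min-max formula to each component independently."""
    return tuple(
        min_max_normalize(v, lo, hi)
        for v, lo, hi in zip(vector, min_values, max_values)
    )


def unit_vector(vector: Sequence[float]) -> Tuple[float, ...]:
    """Scale the vector to length 1, keeping only its direction."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


# A position inside an area that spans -10..10 on every axis (assumed range).
position = (5.0, 0.0, -5.0)
print(normalize_components(position, (-10.0,) * 3, (10.0,) * 3))  # (0.75, 0.5, 0.25)
print(unit_vector(position))  # direction only; the distance from the origin is lost
```

Per-component normalization preserves where the agent is within the known range, while the unit vector discards magnitude.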
Rotations and angles should also be normalized. For angles between 0 and 360
degrees, you can use the following formulas:

1
docs/Learning-Environment-Examples.md


* Goal:
* Get the ball into the opponent's goal while preventing
the ball from entering own goal.
* Goalie:
* Agents: The environment contains four agents, with the same
Behavior Parameters : Soccer.
* Agent Reward Function (dependent):

8
docs/Learning-Environment-Executable.md


followed the default [installation](Installation.md), then navigate to the
`ml-agents/` folder.
3. Run
`mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier> --train`
`mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier>`
Where:
* `<trainer-config-file>` is the file path of the trainer configuration yaml
* `<env_name>` is the name and path to the executable you exported from Unity

* And the `--train` tells `mlagents-learn` to run a training session (rather
than inference)
mlagents-learn ../config/trainer_config.yaml --env=3DBall --run-id=firstRun --train
mlagents-learn ../config/trainer_config.yaml --env=3DBall --run-id=firstRun
ml-agents$ mlagents-learn config/trainer_config.yaml --env=3DBall --run-id=first-run --train
ml-agents$ mlagents-learn config/trainer_config.yaml --env=3DBall --run-id=first-run
▄▄▄▓▓▓▓

17
docs/Migrating.md


## Migrating from 0.15 to latest
### Important changes
* The `--load` and `--train` command-line flags have been deprecated and replaced with `--resume` and `--inference`.
* Running with the same `--run-id` twice will now throw an error.
* The `play_against_current_self_ratio` self-play trainer hyperparameter has been renamed to `play_against_latest_model_ratio`.
* Replace the `--load` flag with `--resume` when calling `mlagents-learn`, and don't use the `--train` flag as training
will happen by default. To run without training, use `--inference`.
* To force-overwrite files from a pre-existing run, add the `--force` command-line flag.
* The Jupyter notebooks have been removed from the repository.
* `Academy.FloatProperties` was removed.
* `Academy.RegisterSideChannel` and `Academy.UnregisterSideChannel` were removed.
### Steps to Migrate
* Replace `Academy.FloatProperties` with `SideChannelUtils.GetSideChannel<FloatPropertiesChannel>()`.
* Replace `Academy.RegisterSideChannel` with `SideChannelUtils.RegisterSideChannel`.
* Replace `Academy.UnregisterSideChannel` with `SideChannelUtils.UnregisterSideChannel`.
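
The `play_against_current_self_ratio` rename listed under "Important changes" can be applied to an already-loaded trainer configuration with a small helper. The sketch below is only an illustration: it assumes the configuration has been parsed into nested dictionaries keyed by behavior name, and `rename_self_play_ratio` is a hypothetical function, not part of `mlagents`.

```python
from typing import Any, Dict

OLD_KEY = "play_against_current_self_ratio"
NEW_KEY = "play_against_latest_model_ratio"


def rename_self_play_ratio(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a trainer config with the self-play key renamed."""
    migrated: Dict[str, Any] = {}
    for behavior, settings in config.items():
        if not isinstance(settings, dict):
            # Leave anything that is not a per-behavior mapping untouched.
            migrated[behavior] = settings
            continue
        settings = dict(settings)
        self_play = settings.get("self_play")
        if isinstance(self_play, dict) and OLD_KEY in self_play:
            self_play = dict(self_play)
            self_play[NEW_KEY] = self_play.pop(OLD_KEY)
            settings["self_play"] = self_play
        migrated[behavior] = settings
    return migrated


old_config = {
    "Soccer": {
        "normalize": False,
        "self_play": {"window": 10, OLD_KEY: 0.5, "team_change": 100000},
    },
    "CrawlerStatic": {"normalize": True},
}
print(rename_self_play_ratio(old_config)["Soccer"]["self_play"])
```

The helper copies each level it modifies, so the original dictionary is left unchanged.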
## Migrating from 0.14 to 0.15

* The interface for SideChannels was changed:
* In C#, `OnMessageReceived` now takes an `IncomingMessage` argument, and `QueueMessageToSend` takes an `OutgoingMessage` argument.
* In Python, `on_message_received` now takes an `IncomingMessage` argument, and `queue_message_to_send` takes an `OutgoingMessage` argument.
* Automatic stepping for Academy is now controlled from the AutomaticSteppingEnabled property.
### Steps to Migrate
* Add the `using MLAgents.Sensors;` in addition to `using MLAgents;` on top of your Agent's script.

* We strongly recommend replacing the following methods with their new equivalent as they will be removed in a later release:
* `InitializeAgent()` to `Initialize()`
* `AgentAction()` to `OnActionReceived()`
* `AgentReset()` to `OnEpsiodeBegin()`
* `AgentReset()` to `OnEpisodeBegin()`
* Replace calls to Academy.EnableAutomaticStepping()/DisableAutomaticStepping() with Academy.AutomaticSteppingEnabled = true/false.
## Migrating from 0.13 to 0.14

11
docs/Python-API.md


The ML-Agents Toolkit Low Level API is a Python API for controlling the simulation
loop of an environment or game built with Unity. This API is used by the
training algorithms inside the ML-Agents Toolkit, but you can also write your own
Python programs using this API. Go [here](../notebooks/getting-started.ipynb)
for a Jupyter Notebook walking through the functionality of the API.
Python programs using this API.
The key objects in the Python API include: