Comparing commits

...
This merge request contains changes that conflict with the target branch.
/test_requirements.txt
/.pre-commit-config.yaml
/DevProject/Packages/manifest.json
/utils/validate_versions.py
/utils/make_readme_table.py
/.yamato/gym-interface-test.yml
/.yamato/protobuf-generation-test.yml
/.yamato/training-int-tests.yml
/.yamato/python-ll-api-test.yml
/.yamato/standalone-build-test.yml
/.yamato/com.unity.ml-agents-test.yml
/gym-unity/setup.py
/gym-unity/gym_unity/envs/__init__.py
/gym-unity/gym_unity/__init__.py
/Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs
/Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs
/Project/Assets/ML-Agents/Examples/Soccer/Scripts/AgentSoccer.cs
/com.unity.ml-agents/package.json
/com.unity.ml-agents/Documentation~/com.unity.ml-agents.md
/com.unity.ml-agents/Editor/DemonstrationImporter.cs
/com.unity.ml-agents/Editor/BrainParametersDrawer.cs
/com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs
/com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs
/com.unity.ml-agents/Runtime/Inference/TensorProxy.cs
/com.unity.ml-agents/Runtime/Inference/BarracudaModelParamLoader.cs
/com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs
/com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs
/com.unity.ml-agents/Runtime/Academy.cs
/com.unity.ml-agents/Runtime/Agent.cs
/com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs
/com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensorComponentBase.cs
/com.unity.ml-agents/Runtime/Sensors/SensorShapeValidator.cs
/com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensor.cs
/com.unity.ml-agents/CHANGELOG.md
/ml-agents-envs/setup.py
/ml-agents-envs/mlagents_envs/communicator.py
/ml-agents-envs/mlagents_envs/rpc_communicator.py
/ml-agents-envs/mlagents_envs/exception.py
/ml-agents-envs/mlagents_envs/side_channel/environment_parameters_channel.py
/ml-agents-envs/mlagents_envs/base_env.py
/ml-agents-envs/mlagents_envs/environment.py
/ml-agents-envs/mlagents_envs/tests/test_side_channel.py
/ml-agents-envs/mlagents_envs/__init__.py
/docs/Using-Tensorboard.md
/docs/Learning-Environment-Create-New.md
/docs/Training-ML-Agents.md
/docs/Installation-Anaconda-Windows.md
/docs/Installation.md
/ml-agents/tests/yamato/check_coverage_percent.py
/ml-agents/tests/yamato/scripts/run_gym.py
/ml-agents/tests/yamato/scripts/run_llapi.py
/ml-agents/tests/yamato/yamato_utils.py
/ml-agents/setup.py
/ml-agents/mlagents/trainers/trainer_controller.py
/ml-agents/mlagents/trainers/stats.py
/ml-agents/mlagents/trainers/subprocess_env_manager.py
/ml-agents/mlagents/trainers/ghost/trainer.py
/ml-agents/mlagents/trainers/ppo/trainer.py
/ml-agents/mlagents/trainers/sac/trainer.py
/ml-agents/mlagents/trainers/trainer/trainer.py
/ml-agents/mlagents/trainers/trainer/rl_trainer.py
/ml-agents/mlagents/trainers/buffer.py
/ml-agents/mlagents/trainers/__init__.py
/README.md
/com.unity.ml-agents/Tests/Editor/Sensor/RayPerceptionSensorTests.cs
/com.unity.ml-agents/Tests/Editor/Sensor/SensorShapeValidatorTests.cs
/com.unity.ml-agents/Tests/Editor/TensorUtilsTest.cs
/com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs
/.github/workflows
/docs/Versioning.md
/com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs.meta
/.circleci/config.yml
/com.unity.ml-agents/Runtime/DiscreteActionMasker.cs
/ml-agents/mlagents/trainers/components/bc/model.py
/ml-agents/mlagents/trainers/components/bc/module.py
/ml-agents/mlagents/trainers/components/reward_signals/__init__.py
/ml-agents/mlagents/trainers/components/reward_signals/curiosity/model.py
/ml-agents/mlagents/trainers/components/reward_signals/gail/model.py
/ml-agents/mlagents/trainers/components/reward_signals/reward_signal_factory.py
/ml-agents/mlagents/trainers/curriculum.py
/ml-agents/mlagents/trainers/models.py
/ml-agents/mlagents/trainers/policy/tf_policy.py
/ml-agents/mlagents/trainers/ppo/optimizer.py
/ml-agents/mlagents/trainers/sac/network.py
/ml-agents/mlagents/trainers/sac/optimizer.py
/ml-agents/mlagents/trainers/sampler_class.py
/ml-agents/mlagents/trainers/trainer_util.py
/ml-agents/mlagents/trainers/tests/test_nn_policy.py
/ml-agents/mlagents/trainers/tests/test_simple_rl.py
/ml-agents/mlagents/trainers/tests/test_ghost.py

1 commit

Author  SHA1  Message  Date
Ruo-Ping Dong  2ca79207  [bug-fix] Don't load non-wrapped policy (#4593)  4 years ago
Showing 96 changed files with 1000 additions and 479 deletions. Per-file totals of changed lines appear in parentheses:
  1. .circleci/config.yml (201)
  2. .pre-commit-config.yaml (35)
  3. test_requirements.txt (2)
  4. gym-unity/gym_unity/__init__.py (4)
  5. gym-unity/gym_unity/envs/__init__.py (13)
  6. gym-unity/setup.py (4)
  7. Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs (3)
  8. Project/Assets/ML-Agents/Examples/Soccer/Scripts/AgentSoccer.cs (1)
  9. Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs (1)
  10. ml-agents-envs/mlagents_envs/__init__.py (4)
  11. ml-agents-envs/mlagents_envs/base_env.py (8)
  12. ml-agents-envs/mlagents_envs/communicator.py (2)
  13. ml-agents-envs/mlagents_envs/environment.py (24)
  14. ml-agents-envs/mlagents_envs/exception.py (2)
  15. ml-agents-envs/mlagents_envs/rpc_communicator.py (2)
  16. ml-agents-envs/mlagents_envs/side_channel/environment_parameters_channel.py (2)
  17. ml-agents-envs/mlagents_envs/side_channel/float_properties_channel.py (2)
  18. ml-agents-envs/mlagents_envs/tests/test_side_channel.py (4)
  19. ml-agents-envs/setup.py (2)
  20. .yamato/gym-interface-test.yml (24)
  21. .yamato/protobuf-generation-test.yml (5)
  22. .yamato/training-int-tests.yml (2)
  23. .yamato/python-ll-api-test.yml (25)
  24. .yamato/standalone-build-test.yml (2)
  25. .yamato/com.unity.ml-agents-test.yml (12)
  26. README.md (43)
  27. docs/Using-Tensorboard.md (8)
  28. docs/Learning-Environment-Create-New.md (1)
  29. docs/Training-ML-Agents.md (11)
  30. docs/Installation-Anaconda-Windows.md (4)
  31. docs/Installation.md (2)
  32. utils/make_readme_table.py (1)
  33. utils/validate_versions.py (36)
  34. DevProject/Packages/manifest.json (2)
  35. com.unity.ml-agents/Documentation~/com.unity.ml-agents.md (14)
  36. com.unity.ml-agents/Tests/Editor/PublicAPI/Unity.ML-Agents.Editor.Tests.PublicAPI.asmdef (2)
  37. com.unity.ml-agents/Tests/Editor/TensorUtilsTest.cs (74)
  38. com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs (13)
  39. com.unity.ml-agents/Tests/Editor/Sensor/SensorShapeValidatorTests.cs (8)
  40. com.unity.ml-agents/Tests/Editor/Sensor/RayPerceptionSensorTests.cs (89)
  41. com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs (16)
  42. com.unity.ml-agents/Editor/BrainParametersDrawer.cs (20)
  43. com.unity.ml-agents/Editor/DemonstrationImporter.cs (4)
  44. com.unity.ml-agents/Runtime/Inference/BarracudaModelParamLoader.cs (6)
  45. com.unity.ml-agents/Runtime/Inference/TensorProxy.cs (23)
  46. com.unity.ml-agents/Runtime/Sensors/SensorShapeValidator.cs (2)
  47. com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensorComponentBase.cs (5)
  48. com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensor.cs (19)
  49. com.unity.ml-agents/Runtime/Academy.cs (6)
  50. com.unity.ml-agents/Runtime/Agent.cs (28)
  51. com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs (18)
  52. com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs (39)
  53. com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs (2)
  54. com.unity.ml-agents/Runtime/DiscreteActionMasker.cs (2)
  55. com.unity.ml-agents/package.json (8)
  56. com.unity.ml-agents/CHANGELOG.md (58)
  57. ml-agents/setup.py (5)
  58. ml-agents/tests/yamato/check_coverage_percent.py (9)
  59. ml-agents/tests/yamato/scripts/run_gym.py (4)
  60. ml-agents/tests/yamato/scripts/run_llapi.py (2)
  61. ml-agents/tests/yamato/yamato_utils.py (5)
  62. ml-agents/mlagents/trainers/__init__.py (4)
  63. ml-agents/mlagents/trainers/subprocess_env_manager.py (2)
  64. ml-agents/mlagents/trainers/buffer.py (4)
  65. ml-agents/mlagents/trainers/components/bc/model.py (2)
  66. ml-agents/mlagents/trainers/components/bc/module.py (2)
  67. ml-agents/mlagents/trainers/components/reward_signals/__init__.py (2)
  68. ml-agents/mlagents/trainers/components/reward_signals/curiosity/model.py (6)
  69. ml-agents/mlagents/trainers/components/reward_signals/gail/model.py (6)
  70. ml-agents/mlagents/trainers/components/reward_signals/reward_signal_factory.py (4)
  71. ml-agents/mlagents/trainers/curriculum.py (12)
  72. ml-agents/mlagents/trainers/models.py (6)
  73. ml-agents/mlagents/trainers/policy/tf_policy.py (13)
  74. ml-agents/mlagents/trainers/ppo/optimizer.py (12)
  75. ml-agents/mlagents/trainers/ppo/trainer.py (20)
  76. ml-agents/mlagents/trainers/sac/network.py (6)
  77. ml-agents/mlagents/trainers/sac/optimizer.py (4)
  78. ml-agents/mlagents/trainers/sac/trainer.py (14)
  79. ml-agents/mlagents/trainers/sampler_class.py (2)
  80. ml-agents/mlagents/trainers/stats.py (12)
  81. ml-agents/mlagents/trainers/trainer/rl_trainer.py (2)
  82. ml-agents/mlagents/trainers/trainer/trainer.py (4)
  83. ml-agents/mlagents/trainers/trainer_controller.py (2)
  84. ml-agents/mlagents/trainers/trainer_util.py (2)
  85. ml-agents/mlagents/trainers/ghost/controller.py (4)
  86. ml-agents/mlagents/trainers/ghost/trainer.py (18)
  87. ml-agents/mlagents/trainers/tests/test_nn_policy.py (4)
  88. ml-agents/mlagents/trainers/tests/test_simple_rl.py (2)
  89. ml-agents/mlagents/trainers/tests/test_ghost.py (64)
  90. docs/Versioning.md (95)
  91. com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs (37)
  92. com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs.meta (11)
  93. .github/workflows/nightly.yml (19)
  94. .github/workflows/pre-commit.yml (41)
  95. .github/workflows/pytest.yml (66)

.circleci/config.yml (201)


- image: circleci/python:3.8.2
jobs:
build_python:
parameters:
executor:
type: executor
pyversion:
type: string
description: python version being used (currently only affects caching).
pip_constraints:
type: string
description: Constraints file that is passed to "pip install". We constrain to older versions of libraries for older python runtimes, in order to help ensure compatibility.
enforce_onnx_conversion:
type: integer
default: 0
description: Whether to raise an exception if ONNX models couldn't be saved.
executor: << parameters.executor >>
working_directory: ~/repo
# Run additional numpy checks on unit tests
environment:
TEST_ENFORCE_NUMPY_FLOAT32: 1
TEST_ENFORCE_ONNX_CONVERSION: << parameters.enforce_onnx_conversion >>
steps:
- checkout
- run:
# Combine all the python dependencies into one file so that we can use that for the cache checksum
name: Combine pip dependencies for caching
command: cat ml-agents/setup.py ml-agents-envs/setup.py gym-unity/setup.py test_requirements.txt << parameters.pip_constraints >> > python_deps.txt
- restore_cache:
keys:
# Parameterize the cache so that different python versions can get different versions of the packages
- v1-dependencies-py<< parameters.pyversion >>-{{ checksum "python_deps.txt" }}
- run:
name: Install Dependencies
command: |
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install --upgrade setuptools
pip install --progress-bar=off -e ./ml-agents-envs -c << parameters.pip_constraints >>
pip install --progress-bar=off -e ./ml-agents -c << parameters.pip_constraints >>
pip install --progress-bar=off -r test_requirements.txt -c << parameters.pip_constraints >>
pip install --progress-bar=off -e ./gym-unity -c << parameters.pip_constraints >>
- save_cache:
paths:
- ./venv
key: v1-dependencies-py<< parameters.pyversion >>-{{ checksum "python_deps.txt" }}
- run:
name: Run Tests for ml-agents and gym_unity
# This also dumps the installed pip packages to a file, so we can see what versions are actually being used.
command: |
. venv/bin/activate
mkdir test-reports
pip freeze > test-reports/pip_versions.txt
pytest -n 2 --cov=ml-agents --cov=ml-agents-envs --cov=gym-unity --cov-report html --junitxml=test-reports/junit.xml -p no:warnings
- run:
name: Verify there are no hidden/missing metafiles.
# Renaming files or deleting files can leave metafiles behind that make Unity very unhappy.
command: |
. venv/bin/activate
python utils/validate_meta_files.py
- store_test_results:
path: test-reports
- store_artifacts:
path: test-reports
destination: test-reports
- store_artifacts:
path: htmlcov
destination: htmlcov
pre-commit:
docker:
- image: circleci/python:3.7.3
working_directory: ~/repo/
steps:
- checkout
- run:
name: Combine precommit config and python versions for caching
command: |
cat .pre-commit-config.yaml > pre-commit-deps.txt
python -VV >> pre-commit-deps.txt
- restore_cache:
keys:
- v1-precommit-deps-{{ checksum "pre-commit-deps.txt" }}
- run:
name: Install Dependencies
command: |
# Need ruby for search-and-replace
sudo apt-get update
sudo apt-get install ruby-full
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install --upgrade setuptools
pip install pre-commit
# Install the hooks now so that they'll be cached
pre-commit install-hooks
- save_cache:
paths:
- ~/.cache/pre-commit
- ./venv
key: v1-precommit-deps-{{ checksum "pre-commit-deps.txt" }}
- run:
name: Check Code Style using pre-commit
command: |
. venv/bin/activate
pre-commit run --show-diff-on-failure --all-files
markdown_link_check:
parameters:
precommit_command:
type: string
description: precommit hook to run
default: markdown-link-check
docker:
- image: circleci/node:12.6.0
working_directory: ~/repo
steps:
- checkout
- restore_cache:
keys:
- v1-node-dependencies-{{ checksum ".pre-commit-config.yaml" }}
# fallback to using the latest cache if no exact match is found
- v1-node-dependencies-
- run:
name: Install Dependencies
command: |
sudo apt-get install python3-venv
python3 -m venv venv
. venv/bin/activate
pip install pre-commit
- run: sudo npm install -g markdown-link-check
- save_cache:
paths:
- ./venv
key: v1-node-dependencies-{{ checksum ".pre-commit-config.yaml" }}
- run:
name: Run markdown-link-check via precommit
command: |
. venv/bin/activate
pre-commit run --hook-stage manual << parameters.precommit_command >> --all-files
deploy:
parameters:
directory:

version: 2
workflow:
jobs:
- build_python:
name: python_3.6.1
executor: python361
pyversion: 3.6.1
# Test python 3.6 with the oldest supported versions
pip_constraints: test_constraints_min_version.txt
- build_python:
name: python_3.7.3
executor: python373
pyversion: 3.7.3
# Test python 3.7 with the newest supported versions
pip_constraints: test_constraints_max_tf1_version.txt
# Make sure ONNX conversion passes here (recent version of tensorflow 1.x)
enforce_onnx_conversion: 1
- build_python:
name: python_3.7.3+tf2
executor: python373
pyversion: 3.7.3
# Test python 3.7 with the newest supported versions
pip_constraints: test_constraints_max_tf2_version.txt
- build_python:
name: python_3.8.2+tf2.2
executor: python382
pyversion: 3.8.2
# Test python 3.8 with the newest edge versions
pip_constraints: test_constraints_max_tf2_version.txt
- markdown_link_check
- pre-commit
# The first deploy jobs are the "real" ones that upload to pypi
- deploy:
name: deploy ml-agents-envs

only: /^release_[0-9]+_test[0-9]+$/
branches:
ignore: /.*/
nightly:
triggers:
- schedule:
cron: "0 0 * * *"
filters:
branches:
only:
- develop
jobs:
- markdown_link_check:
name: markdown-link-check full
precommit_command: markdown-link-check-full
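
The build_python job above derives its dependency-cache key by concatenating every file that can affect the installed packages and checksumming the result, so editing any setup.py, test_requirements.txt, or constraints file invalidates the cache. A minimal local sketch of that idea (CircleCI's `{{ checksum ... }}` uses its own hashing; sha256 here is only for illustration, and the file list is assumed):

```python
import hashlib

# Files that can change what `pip install` produces (assumed paths).
DEP_FILES = [
    "ml-agents/setup.py",
    "ml-agents-envs/setup.py",
    "gym-unity/setup.py",
    "test_requirements.txt",
]

def cache_key(pyversion: str) -> str:
    # Hash the concatenated contents, mirroring `cat ... > python_deps.txt`
    # followed by v1-dependencies-py<version>-{{ checksum "python_deps.txt" }}.
    digest = hashlib.sha256()
    for path in DEP_FILES:
        with open(path, "rb") as f:
            digest.update(f.read())
    return f"v1-dependencies-py{pyversion}-{digest.hexdigest()[:12]}"

print(cache_key("3.7.3"))
```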

.pre-commit-config.yaml (35)


files: "gym-unity/.*"
args: [--ignore-missing-imports, --disallow-incomplete-defs]
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.1
hooks:
- id: flake8
exclude: >
(?x)^(
.*_pb2.py|
.*_pb2_grpc.py
)$
# flake8-tidy-imports is used for banned-modules, not actually tidying
additional_dependencies: [flake8-comprehensions==3.2.2, flake8-tidy-imports==4.1.0, flake8-bugbear==20.1.4]
- repo: https://github.com/asottile/pyupgrade
rev: v2.7.0
hooks:
- id: pyupgrade
args: [--py3-plus, --py36-plus]
exclude: >
(?x)^(
.*barracuda.py|
.*_pb2.py|
.*_pb2_grpc.py
)$
rev: v2.4.0
rev: v2.5.0
hooks:
- id: mixed-line-ending
exclude: >

.*.meta
)$
args: [--fix=lf]
- id: flake8
exclude: >
(?x)^(
.*_pb2.py|
.*_pb2_grpc.py
)$
# flake8-tidy-imports is used for banned-modules, not actually tidying
additional_dependencies: [flake8-comprehensions==3.1.4, flake8-tidy-imports==4.0.0, flake8-bugbear==20.1.2]
- id: trailing-whitespace
name: trailing-whitespace-markdown
types: [markdown]

test_requirements.txt (2)


# Test-only dependencies should go here, not in setup.py
pytest>4.0.0,<6.0.0
pytest-cov==2.6.1
pytest-xdist
pytest-xdist==1.34.0
# onnx doesn't currently have a wheel for 3.8
tf2onnx>=1.5.5;python_version<'3.8'

gym-unity/gym_unity/__init__.py (4)


# Version of the library that will be used to upload to pypi
__version__ = "0.16.0"
__version__ = "0.16.1"
__release_tag__ = "release_1"
__release_tag__ = "release_2"

gym-unity/gym_unity/envs/__init__.py (13)


self._env.step()
self.visual_obs = None
self._n_agents = -1
# Save the step result from the last time all Agents requested decisions.
self._previous_decision_step: DecisionSteps = None

self._env.step()
decision_step, terminal_step = self._env.get_steps(self.name)
self._check_agents(max(len(decision_step), len(terminal_step)))
if len(terminal_step) != 0:
# The agent is done
self.game_over = True

logger.warning("Could not seed environment %s", self.name)
return
def _check_agents(self, n_agents: int) -> None:
if self._n_agents > 1:
@staticmethod
def _check_agents(n_agents: int) -> None:
if n_agents > 1:
"There can only be one Agent in the environment but {n_agents} were detected."
f"There can only be one Agent in the environment but {n_agents} were detected."
)
@property

@property
def observation_space(self):
return self._observation_space
@property
def number_agents(self):
return self._n_agents
class ActionFlattener:
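
The `_check_agents` change above is the actual bug fix in this file: the old error message used a `{n_agents}` placeholder but lacked the `f` prefix, so the literal text `{n_agents}` would have been raised instead of the count. A quick illustration:

```python
n_agents = 3
# Missing f prefix: the braces are never interpolated.
print("There can only be one Agent in the environment but {n_agents} were detected.")
# With the f prefix added by this commit:
print(f"There can only be one Agent in the environment but {n_agents} were detected.")
```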

gym-unity/setup.py (4)


tag = os.getenv("CIRCLE_TAG")
if tag != EXPECTED_TAG:
info = "Git tag: {0} does not match the expected tag of this app: {1}".format(
info = "Git tag: {} does not match the expected tag of this app: {}".format(
tag, EXPECTED_TAG
)
sys.exit(info)

author_email="ML-Agents@unity3d.com",
url="https://github.com/Unity-Technologies/ml-agents",
packages=find_packages(),
install_requires=["gym", "mlagents_envs=={}".format(VERSION)],
install_requires=["gym", f"mlagents_envs=={VERSION}"],
cmdclass={"verify": VerifyVersionCommand},
)

Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs (3)


public override void Heuristic(float[] actionsOut)
{
actionsOut[0] = 0f;
actionsOut[1] = 0f;
actionsOut[2] = 0f;
if (Input.GetKey(KeyCode.D))
{
actionsOut[2] = 2f;

Project/Assets/ML-Agents/Examples/Soccer/Scripts/AgentSoccer.cs (1)


public override void Heuristic(float[] actionsOut)
{
Array.Clear(actionsOut, 0, actionsOut.Length);
//forward
if (Input.GetKey(KeyCode.W))
{

Project/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs (1)


public override void Heuristic(float[] actionsOut)
{
System.Array.Clear(actionsOut, 0, actionsOut.Length);
if (Input.GetKey(KeyCode.D))
{
actionsOut[1] = 2f;

ml-agents-envs/mlagents_envs/__init__.py (4)


# Version of the library that will be used to upload to pypi
__version__ = "0.16.0"
__version__ = "0.16.1"
__release_tag__ = "release_1"
__release_tag__ = "release_2"

ml-agents-envs/mlagents_envs/base_env.py (8)


:returns: The DecisionStep
"""
if agent_id not in self.agent_id_to_index:
raise KeyError(
"agent_id {} is not present in the DecisionSteps".format(agent_id)
)
raise KeyError(f"agent_id {agent_id} is not present in the DecisionSteps")
agent_index = self._agent_id_to_index[agent_id] # type: ignore
agent_obs = []
for batched_obs in self.obs:

specific agent
"""
if agent_id not in self.agent_id_to_index:
raise KeyError(
"agent_id {} is not present in the TerminalSteps".format(agent_id)
)
raise KeyError(f"agent_id {agent_id} is not present in the TerminalSteps")
agent_index = self._agent_id_to_index[agent_id] # type: ignore
agent_obs = []
for batched_obs in self.obs:

ml-agents-envs/mlagents_envs/communicator.py (2)


from mlagents_envs.communicator_objects.unity_input_pb2 import UnityInputProto
class Communicator(object):
class Communicator:
def __init__(self, worker_id=0, base_port=5005):
"""
Python side of the communication. Must be used in pair with the right Unity Communicator equivalent.

ml-agents-envs/mlagents_envs/environment.py (24)


for _sc in side_channels:
if _sc.channel_id in self.side_channels:
raise UnityEnvironmentException(
"There cannot be two side channels with the same channel id {0}.".format(
"There cannot be two side channels with the same channel id {}.".format(
_sc.channel_id
)
)

.replace(".x86", "")
)
true_filename = os.path.basename(os.path.normpath(env_path))
logger.debug("The true file name is {}".format(true_filename))
logger.debug(f"The true file name is {true_filename}")
if not (glob.glob(env_path) or glob.glob(env_path + ".*")):
return None

f"Couldn't launch the {file_name} environment. Provided filename does not match any environments."
)
else:
logger.debug("This is the launch string {}".format(launch_string))
logger.debug(f"This is the launch string {launch_string}")
# Launch Unity environment
subprocess_args = [launch_string]
if no_graphics:

def _assert_behavior_exists(self, behavior_name: str) -> None:
if behavior_name not in self._env_specs:
raise UnityActionException(
"The group {0} does not correspond to an existing agent group "
"The group {} does not correspond to an existing agent group "
"in the environment".format(behavior_name)
)

expected_shape = (len(self._env_state[behavior_name][0]), spec.action_size)
if action.shape != expected_shape:
raise UnityActionException(
"The behavior {0} needs an input of dimension {1} but received input of dimension {2}".format(
behavior_name, expected_shape, action.shape
)
"The behavior {} needs an input of dimension {} for "
"(<number of agents>, <action size>) but received input of "
"dimension {}".format(behavior_name, expected_shape, action.shape)
)
if action.dtype != expected_type:
action = action.astype(expected_type)

expected_shape = (spec.action_size,)
if action.shape != expected_shape:
raise UnityActionException(
f"The Agent {0} with BehaviorName {1} needs an input of dimension "
f"{2} but received input of dimension {3}".format(
agent_id, behavior_name, expected_shape, action.shape
)
f"The Agent {agent_id} with BehaviorName {behavior_name} needs an input of dimension "
f"{expected_shape} but received input of dimension {action.shape}"
)
expected_type = np.float32 if spec.is_action_continuous() else np.int32
if action.dtype != expected_type:

)
if len(message_data) != message_len:
raise UnityEnvironmentException(
"The message received by the side channel {0} was "
"The message received by the side channel {} was "
"unexpectedly short. Make sure your Unity Environment "
"sending side channel data properly.".format(channel_id)
)

else:
logger.warning(
"Unknown side channel data received. Channel type "
": {0}.".format(channel_id)
": {}.".format(channel_id)
)
@staticmethod

ml-agents-envs/mlagents_envs/exception.py (2)


def __init__(self, worker_id):
message = self.MESSAGE_TEMPLATE.format(str(worker_id))
super(UnityWorkerInUseException, self).__init__(message)
super().__init__(message)

ml-agents-envs/mlagents_envs/rpc_communicator.py (2)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
s.bind(("localhost", port))
except socket.error:
except OSError:
raise UnityWorkerInUseException(self.worker_id)
finally:
s.close()
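
`socket.error` has been an alias of `OSError` since Python 3.3, so the hunk above is a pure modernization with identical behavior. A self-contained sketch of the same port probe, under a hypothetical `port_in_use` helper name:

```python
import socket

def port_in_use(port: int, host: str = "localhost") -> bool:
    # Try to bind; failure means some other process owns the port.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
    except OSError:  # socket.error is just an alias of OSError in Python 3
        return True
    finally:
        s.close()
    return False

print(port_in_use(5005))
```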

ml-agents-envs/mlagents_envs/side_channel/environment_parameters_channel.py (2)


FLOAT = 0
def __init__(self) -> None:
channel_id = uuid.UUID(("534c891e-810f-11ea-a9d0-822485860400"))
channel_id = uuid.UUID("534c891e-810f-11ea-a9d0-822485860400")
super().__init__(channel_id)
def on_message_received(self, msg: IncomingMessage) -> None:

ml-agents-envs/mlagents_envs/side_channel/float_properties_channel.py (2)


def __init__(self, channel_id: uuid.UUID = None) -> None:
self._float_properties: Dict[str, float] = {}
if channel_id is None:
channel_id = uuid.UUID(("60ccf7d0-4f7e-11ea-b238-784f4387d1f7"))
channel_id = uuid.UUID("60ccf7d0-4f7e-11ea-b238-784f4387d1f7")
super().__init__(channel_id)
def on_message_received(self, msg: IncomingMessage) -> None:

ml-agents-envs/mlagents_envs/tests/test_side_channel.py (4)


sender = RawBytesChannel(guid)
receiver = RawBytesChannel(guid)
sender.send_raw_data("foo".encode("ascii"))
sender.send_raw_data("bar".encode("ascii"))
sender.send_raw_data(b"foo")
sender.send_raw_data(b"bar")
data = UnityEnvironment._generate_side_channel_data({sender.channel_id: sender})
UnityEnvironment._parse_side_channel_message({receiver.channel_id: receiver}, data)
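
The test tweak above swaps `str.encode` calls for bytes literals; the two spellings produce identical values:

```python
assert "foo".encode("ascii") == b"foo"
assert "bar".encode("ascii") == b"bar"
```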

ml-agents-envs/setup.py (2)


tag = os.getenv("CIRCLE_TAG")
if tag != EXPECTED_TAG:
info = "Git tag: {0} does not match the expected tag of this app: {1}".format(
info = "Git tag: {} does not match the expected tag of this app: {}".format(
tag, EXPECTED_TAG
)
sys.exit(info)
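
Both setup.py hunks in this diff touch the same `VerifyVersionCommand` guard, which aborts a PyPI upload when the CI tag disagrees with the package version. A minimal standalone sketch of the check, with the tag value assumed for this release:

```python
import os
import sys

EXPECTED_TAG = "release_2"  # assumed; the real value lives in setup.py

def verify_circle_tag() -> None:
    tag = os.getenv("CIRCLE_TAG")
    if tag != EXPECTED_TAG:
        info = "Git tag: {} does not match the expected tag of this app: {}".format(
            tag, EXPECTED_TAG
        )
        sys.exit(info)

verify_circle_tag()
```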

.yamato/gym-interface-test.yml (24)


variables:
UNITY_VERSION: {{ editor.version }}
commands:
- pip install pyyaml
- pip install pyyaml --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
- python -u -m ml-agents.tests.yamato.setup_venv
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_gym.py --env=artifacts/testPlayer-Basic
dependencies:

changes:
only:
- "com.unity.ml-agents/**"
- "Project/**"
- "ml-agents/**"
- "ml-agents-envs/**"
- ".yamato/gym-interface-test.yml"
except:
- "*.md"
- "com.unity.ml-agents/*.md"
- "com.unity.ml-agents/**/*.md"
expression: |
(pull_request.target eq "master" OR
pull_request.target match "release.+") AND
NOT pull_request.draft AND
(pull_request.changes.any match "com.unity.ml-agents/**" OR
pull_request.changes.any match "Project/**" OR
pull_request.changes.any match "ml-agents/**" OR
pull_request.changes.any match "ml-agents-envs/**" OR
pull_request.changes.any match "gym-unity/**" OR
pull_request.changes.any match ".yamato/gym-interface-test.yml") AND
NOT pull_request.changes.all match "**/*.md"
{% endfor %}

.yamato/protobuf-generation-test.yml (5)


nuget install Grpc.Tools -Version $GRPC_VERSION -OutputDirectory protobuf-definitions/
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install grpcio-tools==1.13.0 --progress-bar=off
pip install mypy-protobuf==1.16.0 --progress-bar=off
pip install --upgrade pip --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
pip install grpcio==1.28.1 grpcio-tools==1.13.0 protobuf==3.11.3 six==1.14.0 mypy-protobuf==1.16.0 --progress-bar=off --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
cd protobuf-definitions
chmod +x Grpc.Tools.$GRPC_VERSION/tools/macosx_x64/protoc
chmod +x Grpc.Tools.$GRPC_VERSION/tools/macosx_x64/grpc_csharp_plugin

.yamato/training-int-tests.yml (2)


variables:
UNITY_VERSION: {{ editor.version }}
commands:
- pip install pyyaml
- pip install pyyaml --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
- python -u -m ml-agents.tests.yamato.training_int_tests
# Backwards-compatibility tests.
# If we make a breaking change to the communication protocol, these will need

.yamato/python-ll-api-test.yml (25)


variables:
UNITY_VERSION: {{ editor.version }}
commands:
- pip install pyyaml
- pip install pyyaml --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_llapi.py
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_llapi.py
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_llapi.py --env=artifacts/testPlayer-Basic
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_llapi.py --env=artifacts/testPlayer-WallJump
- ./venv/bin/python ml-agents/tests/yamato/scripts/run_llapi.py --env=artifacts/testPlayer-Bouncer

cancel_old_ci: true
changes:
only:
- "com.unity.ml-agents/**"
- "Project/**"
- "ml-agents/**"
- "ml-agents-envs/**"
- ".yamato/python-ll-api-test.yml"
except:
- "*.md"
- "com.unity.ml-agents/*.md"
- "com.unity.ml-agents/**/*.md"
expression: |
(pull_request.target eq "master" OR
pull_request.target match "release.+") AND
NOT pull_request.draft AND
(pull_request.changes.any match "com.unity.ml-agents/**" OR
pull_request.changes.any match "Project/**" OR
pull_request.changes.any match "ml-agents/**" OR
pull_request.changes.any match "ml-agents-envs/**" OR
pull_request.changes.any match ".yamato/python-ll-api-test.yml") AND
NOT pull_request.changes.all match "**/*.md"
{% endfor %}

.yamato/standalone-build-test.yml (2)


variables:
UNITY_VERSION: {{ editor.version }}
commands:
- pip install pyyaml
- pip install pyyaml --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple
- python -u -m ml-agents.tests.yamato.standalone_build_tests
- python -u -m ml-agents.tests.yamato.standalone_build_tests --scene=Assets/ML-Agents/Examples/Basic/Scenes/Basic.unity
- python -u -m ml-agents.tests.yamato.standalone_build_tests --scene=Assets/ML-Agents/Examples/Bouncer/Scenes/Bouncer.unity

.yamato/com.unity.ml-agents-test.yml (12)


commands:
- npm install upm-ci-utils@stable -g --registry https://artifactory.prd.cds.internal.unity3d.com/artifactory/api/npm/upm-npm
- upm-ci package test -u {{ editor.version }} --package-path com.unity.ml-agents {{ editor.coverageOptions }}
- python ml-agents/tests/yamato/check_coverage_percent.py upm-ci~/test-results/ {{ editor.minCoveragePct }}
- python3 ml-agents/tests/yamato/check_coverage_percent.py upm-ci~/test-results/ {{ editor.minCoveragePct }}
artifacts:
logs:
paths:

image: {{ platform.image }}
flavor: {{ platform.flavor}}
commands:
- python -m pip install unity-downloader-cli --extra-index-url https://artifactory.eu-cph-1.unityops.net/api/pypi/common-python/simple
- unity-downloader-cli -u trunk -c editor --wait --fast
- python3 -m pip install unity-downloader-cli --index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple --upgrade
- unity-downloader-cli -u {{ editor.version }} -c editor --wait --fast
{% if platform.name == "win" %}
- upm-ci package test -u "C:\build\output\Unity-Technologies\ml-agents\.Editor" --package-path com.unity.ml-agents {{ editor.coverageOptions }}
{% else %}
- python ml-agents/tests/yamato/check_coverage_percent.py upm-ci~/test-results/ {{ editor.minCoveragePct }}
{% endif %}
- python3 ml-agents/tests/yamato/check_coverage_percent.py upm-ci~/test-results/ {{ editor.minCoveragePct }}
artifacts:
logs:
paths:

README.md (43)


# Unity ML-Agents Toolkit
[![docs badge](https://img.shields.io/badge/docs-reference-blue.svg)](https://github.com/Unity-Technologies/ml-agents/tree/release_1_docs/docs/)
[![docs badge](https://img.shields.io/badge/docs-reference-blue.svg)](https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs/docs/)
[![license badge](https://img.shields.io/badge/license-Apache--2.0-green.svg)](LICENSE)

## Releases & Documentation
**Our latest, stable release is `Release 1`. Click [here](docs/Readme.md) to
get started with the latest release of ML-Agents.**
**Our latest, stable release is `Release 2`. Click
[here](https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs/docs/Readme.md)
to get started with the latest release of ML-Agents.**
The table below lists all our releases, including our `master` branch which is under active
development and may be unstable. A few helpful guidelines:
* The docs links in the table below include installation and usage instructions specific to each
release. Remember to always use the documentation that corresponds to the release version you're
using.
* See the [GitHub releases](https://github.com/Unity-Technologies/ml-agents/releases) for more
details of the changes between versions.
* If you have used an earlier version of the ML-Agents Toolkit, we strongly recommend our
[guide on migrating from earlier versions](docs/Migrating.md).
The table below lists all our releases, including our `master` branch which is
under active development and may be unstable. A few helpful guidelines:
- The [Versioning page](docs/Versioning.md) overviews how we manage our GitHub
releases and the versioning process for each of the ML-Agents components.
- The [Releases page](https://github.com/Unity-Technologies/ml-agents/releases)
contains details of the changes between releases.
- The [Migration page](docs/Migrating.md) contains details on how to upgrade
from earlier releases of the ML-Agents Toolkit.
- The **Documentation** links in the table below include installation and usage
instructions specific to each release. Remember to always use the
documentation that corresponds to the release version you're using.
| **Release 1** | **April 30, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/release_1)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/release_1/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/release_1.zip)** |
| **Release 2** | **May 19, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/release_2)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/release_2/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/release_2.zip)** |
| **Release 1** | April 30, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/release_1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/release_1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/release_1.zip) |
| **0.15.1** | March 30, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.1.zip) |
| **0.15.0** | March 18, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.0.zip) |
| **0.14.1** | February 26, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.14.1.zip) |

| **0.12.1** | December 11, 2019 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.12.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.12.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.12.1.zip) |
| **0.12.0** | December 2, 2019 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.12.0) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.12.0/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.12.0.zip) |
## Citation
If you are a researcher interested in a discussion of Unity as an AI platform,

If you use Unity or the ML-Agents Toolkit to conduct research, we ask that you
cite the following paper as a reference:
Juliani, A., Berges, V., Vckay, E., Gao, Y., Henry, H., Mattar, M., Lange, D.
(2018). Unity: A General Platform for Intelligent Agents. _arXiv preprint
arXiv:1809.02627._ https://github.com/Unity-Technologies/ml-agents.
Juliani, A., Berges, V., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C.,
Gao, Y., Henry, H., Mattar, M., Lange, D. (2020). Unity: A General Platform for
Intelligent Agents. _arXiv preprint
[arXiv:1809.02627](https://arxiv.org/abs/1809.02627)._
https://github.com/Unity-Technologies/ml-agents.
- (May 12, 2020)
[Announcing ML-Agents Unity Package v1.0!](https://blogs.unity3d.com/2020/05/12/announcing-ml-agents-unity-package-v1-0/)
- (February 28, 2020)
[Training intelligent adversaries using self-play with ML-Agents](https://blogs.unity3d.com/2020/02/28/training-intelligent-adversaries-using-self-play-with-ml-agents/)
- (November 11, 2019)

docs/Using-Tensorboard.md (8)


the --port option.
**Note:** If you don't assign a `run-id` identifier, `mlagents-learn` uses the
default string, "ppo". All the statistics will be saved to the same sub-folder
and displayed as one session in TensorBoard. After a few runs, the displays can
become difficult to interpret in this situation. You can delete the folders
under the `summaries` directory to clear out old statistics.
default string, "ppo". You can delete the folders under the `results` directory
to clear out old statistics.
On the left side of the TensorBoard window, you can select which of the training
runs you want to display. You can select multiple run-ids to compare statistics.

```csharp
var statsRecorder = Academy.Instance.StatsRecorder;
statsSideChannel.Add("MyMetric", 1.0);
statsRecorder.Add("MyMetric", 1.0);
```

docs/Learning-Environment-Create-New.md (1)


learning_rate: 3.0e-4
learning_rate_schedule: linear
max_steps: 5.0e4
memory_size: 128
normalize: false
num_epoch: 3
num_layers: 2

docs/Training-ML-Agents.md (11)


normalize: false
num_layers: 2
time_horizon: 64
summary_freq: 10000
init_path: null
# PPO-specific configs
beta: 5.0e-3

batch_size: 512
num_epoch: 3
samples_per_update: 0
init_path:
reward_signals:
# environment reward

strength: 0.02
gamma: 0.99
encoding_size: 256
learning_rate: 3e-4
learning_rate: 3.0e-4
# GAIL
gail:

demo_path: Project/Assets/ML-Agents/Examples/Pyramids/Demos/ExpertPyramid.demo
learning_rate: 3e-4
learning_rate: 3.0e-4
use_actions: false
use_vail: false

`interval_2_max`], ...]
- **sub-arguments** - `intervals`
The implementation of the samplers can be found at
`ml-agents-envs/mlagents_envs/sampler_class.py`.
The implementation of the samplers can be found in the
[sampler_class.py file](../ml-agents/mlagents/trainers/sampler_class.py).
#### Defining a New Sampler Type
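
For readers landing on the "Defining a New Sampler Type" heading, a generic sketch of an intervals-style sampler may help. This is a hypothetical standalone class, not the exact API in sampler_class.py:

```python
import random

class MultiRangeUniformSampler:
    """Hypothetical sampler: pick one interval, then sample uniformly within it.

    `intervals` has the form [[min_1, max_1], [min_2, max_2], ...], matching
    the `intervals` sub-argument documented above.
    """

    def __init__(self, intervals):
        self.intervals = intervals

    def sample_parameter(self) -> float:
        low, high = random.choice(self.intervals)
        return random.uniform(low, high)

sampler = MultiRangeUniformSampler([[0.1, 0.5], [2.0, 3.0]])
print(sampler.sample_parameter())
```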

docs/Installation-Anaconda-Windows.md (4)


connected to the Internet and then type in the Anaconda Prompt:
```console
pip install mlagents
pip install mlagents==0.16.1
```
This will complete the installation of all the required Python packages to run

this, you can try:
```console
pip install mlagents --no-cache-dir
pip install mlagents==0.16.1 --no-cache-dir
```
The `--no-cache-dir` flag tells pip to disable the cache.

docs/Installation.md (2)


run from the command line:
```sh
pip3 install mlagents
pip3 install mlagents==0.16.1
```
Note that this will install `mlagents` from PyPi, _not_ from the cloned

utils/make_readme_table.py (1)


ReleaseInfo.from_simple_tag("0.15.0", "March 18, 2020"),
ReleaseInfo.from_simple_tag("0.15.1", "March 30, 2020"),
ReleaseInfo("release_1", "1.0.0", "0.16.0", "April 30, 2020"),
ReleaseInfo("release_2", "1.0.1", "0.16.1", "May 19, 2020"),
]
MAX_DAYS = 150 # do not print releases older than this many days

utils/validate_versions.py (36)


def extract_version_string(filename):
with open(filename) as f:
for l in f.readlines():
if l.startswith(VERSION_LINE_START):
return l.replace(VERSION_LINE_START, "").strip()
for line in f.readlines():
if line.startswith(VERSION_LINE_START):
return line.replace(VERSION_LINE_START, "").strip()
return None

def set_package_version(new_version: str) -> None:
with open(UNITY_PACKAGE_JSON_PATH, "r") as f:
with open(UNITY_PACKAGE_JSON_PATH) as f:
package_json = json.load(f)
if "version" in package_json:
package_json["version"] = new_version

f.writelines(lines)
def print_release_tag_commands(
python_version: str, csharp_version: str, release_tag: str
):
python_tag = f"python-packages_{python_version}"
csharp_tag = f"com.unity.ml-agents_{csharp_version}"
docs_tag = f"{release_tag}_docs"
print(
f"""
###
Use these commands to create the tags after the release:
###
git checkout {release_tag}
git tag -f latest_release
git push -f origin latest_release
git tag -f {docs_tag}
git push -f origin {docs_tag}
git tag {python_tag}
git push -f origin {python_tag}
git tag {csharp_tag}
git push -f origin {csharp_tag}
"""
)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--python-version", default=None)

if args.csharp_version:
print(f"Updating C# package to version {args.csharp_version}")
set_version(args.python_version, args.csharp_version, args.release_tag)
if args.release_tag is not None:
print_release_tag_commands(
args.python_version, args.csharp_version, args.release_tag
)
else:
ok = check_versions()
return_code = 0 if ok else 1

DevProject/Packages/manifest.json (2)


"com.unity.ide.rider": "1.1.4",
"com.unity.ide.vscode": "1.1.4",
"com.unity.ml-agents": "file:../../com.unity.ml-agents",
"com.unity.multiplayer-hlapi": "1.0.4",
"com.unity.multiplayer-hlapi": "1.0.6",
"com.unity.package-manager-doctools": "1.1.1-preview.3",
"com.unity.package-validation-suite": "0.7.15-preview",
"com.unity.purchasing": "2.0.6",

com.unity.ml-agents/Documentation~/com.unity.ml-agents.md (14)


Manager documentation].
To install the companion Python package to enable training behaviors, follow the
[installation instructions] on our [GitHub repository].
[installation instructions] on our [GitHub repository]. It is strongly recommended that you
use the Python package that corresponds to this release (version 0.16.1) for the best experience;
versions between 0.16.1 and 0.20.0 are supported.
## Requirements

the documentation, you can check out our [GitHub Repository], which also includes
a number of ways to [connect with us] including our [ML-Agents Forum].
[unity ML-Agents Toolkit]: https://github.com/Unity-Technologies/ml-agents
[unity ML-Agents Toolkit]: https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs
[installation instructions]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Installation.md
[github repository]: https://github.com/Unity-Technologies/ml-agents
[python package]: https://github.com/Unity-Technologies/ml-agents
[installation instructions]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Installation.md
[github repository]: https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs
[python package]: https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs
[connect with us]: https://github.com/Unity-Technologies/ml-agents#community-and-feedback
[connect with us]: https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs#community-and-feedback
[ml-agents forum]: https://forum.unity.com/forums/ml-agents.453/

com.unity.ml-agents/Tests/Editor/PublicAPI/Unity.ML-Agents.Editor.Tests.PublicAPI.asmdef (2)


"references": [
"Unity.ML-Agents.Editor",
"Unity.ML-Agents",
"Barracuda",
"Unity.Barracuda",
"Unity.ML-Agents.CommunicatorObjects"
],
"optionalUnityReferences": [

com.unity.ml-agents/Tests/Editor/TensorUtilsTest.cs (74)


{
public class TensorUtilsTest
{
[TestCase(4, TestName = "TestResizeTensor_4D")]
[TestCase(8, TestName = "TestResizeTensor_8D")]
public void TestResizeTensor(int dimension)
{
if (dimension == 8)
{
// Barracuda 1.0.x doesn't support 8D tensors
// Barracuda 1.1.x does but it initially broke ML-Agents support
// Unfortunately, the PackageInfo methods don't exist in earlier versions of the editor,
// so just skip that variant of the test then.
// It's unlikely, but possible that we'll upgrade to a newer dependency of Barracuda,
// in which case we should make sure this test is run then.
#if UNITY_2019_3_OR_NEWER
var packageInfo = UnityEditor.PackageManager.PackageInfo.FindForAssembly(typeof(Tensor).Assembly);
Assert.AreEqual("com.unity.barracuda", packageInfo.name);
var barracuda8DSupport = new Version(1, 1, 0);
var strippedBarracudaVersion = packageInfo.version.Replace("-preview", "");
var version = new Version(strippedBarracudaVersion);
if (version <= barracuda8DSupport)
{
return;
}
#else
return;
#endif
}
var alloc = new TensorCachingAllocator();
var height = 64;
var width = 84;
var channels = 3;
// Set shape to {1, ..., height, width, channels}
// For 8D, the ... are all 1's
var shape = new long[dimension];
for (var i = 0; i < dimension; i++)
{
shape[i] = 1;
}
shape[dimension - 3] = height;
shape[dimension - 2] = width;
shape[dimension - 1] = channels;
var intShape = new int[dimension];
for (var i = 0; i < dimension; i++)
{
intShape[i] = (int)shape[i];
}
var tensorProxy = new TensorProxy
{
valueType = TensorProxy.TensorType.Integer,
data = new Tensor(intShape),
shape = shape,
};
// These should be invariant after the resize.
Assert.AreEqual(height, tensorProxy.data.shape.height);
Assert.AreEqual(width, tensorProxy.data.shape.width);
Assert.AreEqual(channels, tensorProxy.data.shape.channels);
TensorUtils.ResizeTensor(tensorProxy, 42, alloc);
Assert.AreEqual(height, tensorProxy.shape[dimension - 3]);
Assert.AreEqual(width, tensorProxy.shape[dimension - 2]);
Assert.AreEqual(channels, tensorProxy.shape[dimension - 1]);
Assert.AreEqual(height, tensorProxy.data.shape.height);
Assert.AreEqual(width, tensorProxy.data.shape.width);
Assert.AreEqual(channels, tensorProxy.data.shape.channels);
alloc.Dispose();
}
[Test]
public void RandomNormalTestTensorInt()
{

com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs (13)


{
public Action OnRequestDecision;
ObservationWriter m_ObsWriter = new ObservationWriter();
public void RequestDecision(AgentInfo info, List<ISensor> sensors) {
foreach(var sensor in sensors){
public void RequestDecision(AgentInfo info, List<ISensor> sensors)
{
foreach (var sensor in sensors)
{
sensor.GetObservationProto(m_ObsWriter);
}
OnRequestDecision?.Invoke();

agent1.SetPolicy(policy);
StackingSensor sensor = null;
foreach(ISensor s in agent1.sensors){
if (s is StackingSensor){
foreach (ISensor s in agent1.sensors)
{
if (s is StackingSensor)
{
sensor = s as StackingSensor;
}
}

{
agent1.RequestDecision();
aca.EnvironmentStep();
}
policy.OnRequestDecision = () => SensorTestHelper.CompareObservation(sensor, new[] {18f, 19f, 21f});

com.unity.ml-agents/Tests/Editor/Sensor/SensorShapeValidatorTests.cs (8)


validator.ValidateSensors(sensorList1);
var sensorList2 = new List<ISensor>() { new DummySensor(1), new DummySensor(2, 3), new DummySensor(4, 5, 7) };
LogAssert.Expect(LogType.Assert, "Sensor sizes much match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes must match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes much match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes must match.");
validator.ValidateSensors(sensorList1);
}

var sensorList2 = new List<ISensor>() { new DummySensor(1), new DummySensor(9) };
LogAssert.Expect(LogType.Assert, "Number of Sensors must match. 3 != 2");
LogAssert.Expect(LogType.Assert, "Sensor dimensions must match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes much match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes must match.");
validator.ValidateSensors(sensorList2);
// Add the sensors in the other order

LogAssert.Expect(LogType.Assert, "Sensor dimensions must match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes much match.");
LogAssert.Expect(LogType.Assert, "Sensor sizes must match.");
validator.ValidateSensors(sensorList1);
}
}

com.unity.ml-agents/Tests/Editor/Sensor/RayPerceptionSensorTests.cs (89)


using System.Collections.Generic;
using NUnit.Framework;
using UnityEngine;
using UnityEngine.TestTools;
using Unity.MLAgents.Sensors;
namespace Unity.MLAgents.Tests

// hit fraction is arbitrary but should be finite in [0,1]
Assert.GreaterOrEqual(outputBuffer[2], 0.0f);
Assert.LessOrEqual(outputBuffer[2], 1.0f);
}
}
[Test]
public void TestStaticPerceive()
{
SetupScene();
var obj = new GameObject("agent");
var perception = obj.AddComponent<RayPerceptionSensorComponent3D>();
perception.RaysPerDirection = 0; // single ray
perception.MaxRayDegrees = 45;
perception.RayLength = 20;
perception.DetectableTags = new List<string>();
perception.DetectableTags.Add(k_CubeTag);
perception.DetectableTags.Add(k_SphereTag);
var radii = new[] { 0f, .5f };
foreach (var castRadius in radii)
{
perception.SphereCastRadius = castRadius;
var castInput = perception.GetRayPerceptionInput();
var castOutput = RayPerceptionSensor.Perceive(castInput);
Assert.AreEqual(1, castOutput.RayOutputs.Length);
// Expected to hit the cube
Assert.AreEqual(0, castOutput.RayOutputs[0].HitTagIndex);
}
}
[Test]
public void TestStaticPerceiveInvalidTags()
{
SetupScene();
var obj = new GameObject("agent");
var perception = obj.AddComponent<RayPerceptionSensorComponent3D>();
perception.RaysPerDirection = 0; // single ray
perception.MaxRayDegrees = 45;
perception.RayLength = 20;
perception.DetectableTags = new List<string>();
perception.DetectableTags.Add("Bad tag");
perception.DetectableTags.Add(null);
perception.DetectableTags.Add("");
perception.DetectableTags.Add(k_CubeTag);
var radii = new[] { 0f, .5f };
foreach (var castRadius in radii)
{
perception.SphereCastRadius = castRadius;
var castInput = perception.GetRayPerceptionInput();
// There's no clean way that I can find to check for a defined tag without
// logging an error.
LogAssert.Expect(LogType.Error, "Tag: Bad tag is not defined.");
var castOutput = RayPerceptionSensor.Perceive(castInput);
Assert.AreEqual(1, castOutput.RayOutputs.Length);
// Expected to hit the cube
Assert.AreEqual(3, castOutput.RayOutputs[0].HitTagIndex);
}
}
[Test]
public void TestStaticPerceiveNoTags()
{
SetupScene();
var obj = new GameObject("agent");
var perception = obj.AddComponent<RayPerceptionSensorComponent3D>();
perception.RaysPerDirection = 0; // single ray
perception.MaxRayDegrees = 45;
perception.RayLength = 20;
perception.DetectableTags = null;
var radii = new[] { 0f, .5f };
foreach (var castRadius in radii)
{
perception.SphereCastRadius = castRadius;
var castInput = perception.GetRayPerceptionInput();
var castOutput = RayPerceptionSensor.Perceive(castInput);
Assert.AreEqual(1, castOutput.RayOutputs.Length);
// Expected to hit the cube
Assert.AreEqual(-1, castOutput.RayOutputs[0].HitTagIndex);
}
}
}

com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs (16)


pythonPackageVerStr));
}
[Test]
public void TestCheckPythonPackageVersionIsCompatible()
{
Assert.IsFalse(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.13.37")); // too low
Assert.IsFalse(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.42.0")); // too high
// These are fine
Assert.IsTrue(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.16.1"));
Assert.IsTrue(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.17.17"));
Assert.IsTrue(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.20.0"));
// "dev" string or otherwise unparseable
Assert.IsFalse(RpcCommunicator.CheckPythonPackageVersionIsCompatible("0.17.0-dev0"));
Assert.IsFalse(RpcCommunicator.CheckPythonPackageVersionIsCompatible("oh point seventeen point oh"));
}
}
}
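
The assertions above pin down the compatibility window the communicator enforces: plain `major.minor.patch` strings roughly between 0.16.1 and 0.20.x pass, while anything lower, higher, suffixed with `-dev`, or unparseable fails. A hedged Python twin of that logic, with bounds inferred from the test cases rather than from the C# source:

```python
# Hypothetical Python twin of CheckPythonPackageVersionIsCompatible; the
# bounds come from the test expectations above, not the real implementation.
LOWEST = (0, 16, 1)
HIGHEST = (0, 20)  # 0.20.x is still accepted, 0.42.0 is not

def is_compatible(version_str: str) -> bool:
    try:
        parts = tuple(int(p) for p in version_str.split("."))
    except ValueError:  # "0.17.0-dev0" and non-numeric strings are rejected
        return False
    if len(parts) != 3:
        return False
    return LOWEST <= parts and parts[:2] <= HIGHEST

for v in ["0.13.37", "0.42.0", "0.16.1", "0.17.17", "0.20.0", "0.17.0-dev0"]:
    print(v, is_compatible(v))
```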

com.unity.ml-agents/Editor/BrainParametersDrawer.cs (20)


static void DrawContinuousVectorAction(Rect position, SerializedProperty property)
{
var vecActionSize = property.FindPropertyRelative(k_ActionSizePropName);
vecActionSize.arraySize = 1;
// This check is here due to:
// https://fogbugz.unity3d.com/f/cases/1246524/
// If this case has been resolved, please remove this if condition.
if (vecActionSize.arraySize != 1)
{
vecActionSize.arraySize = 1;
}
var continuousActionSize =
vecActionSize.GetArrayElementAtIndex(0);
EditorGUI.PropertyField(

static void DrawDiscreteVectorAction(Rect position, SerializedProperty property)
{
var vecActionSize = property.FindPropertyRelative(k_ActionSizePropName);
vecActionSize.arraySize = EditorGUI.IntField(
var newSize = EditorGUI.IntField(
// This check is here due to:
// https://fogbugz.unity3d.com/f/cases/1246524/
// If this case has been resolved, please remove this if condition.
if (newSize != vecActionSize.arraySize)
{
vecActionSize.arraySize = newSize;
}
position.y += k_LineHeight;
position.x += 20;
position.width -= 20;

com.unity.ml-agents/Editor/DemonstrationImporter.cs (4)


using Unity.MLAgents.CommunicatorObjects;
using UnityEditor;
using UnityEngine;
#if UNITY_2020_2_OR_NEWER
using UnityEditor.AssetImporters;
#else
#endif
using Unity.MLAgents.Demonstrations;
namespace Unity.MLAgents.Editor

com.unity.ml-agents/Runtime/Inference/BarracudaModelParamLoader.cs (6)


var heightBp = shape[0];
var widthBp = shape[1];
var pixelBp = shape[2];
var heightT = tensorProxy.shape[1];
var widthT = tensorProxy.shape[2];
var pixelT = tensorProxy.shape[3];
var heightT = tensorProxy.Height;
var widthT = tensorProxy.Width;
var pixelT = tensorProxy.Channels;
if ((widthBp != widthT) || (heightBp != heightT) || (pixelBp != pixelT))
{
return $"The visual Observation of the model does not match. " +

com.unity.ml-agents/Runtime/Inference/TensorProxy.cs (23)


public Type DataType => k_TypeMap[valueType];
public long[] shape;
public Tensor data;
public long Height
{
get { return shape.Length == 4 ? shape[1] : shape[5]; }
}
public long Width
{
get { return shape.Length == 4 ? shape[2] : shape[6]; }
}
public long Channels
{
get { return shape.Length == 4 ? shape[3] : shape[7]; }
}
}
internal static class TensorUtils

tensor.data?.Dispose();
tensor.shape[0] = batch;
if (tensor.shape.Length == 4)
if (tensor.shape.Length == 4 || tensor.shape.Length == 8)
(int)tensor.shape[1],
(int)tensor.shape[2],
(int)tensor.shape[3]));
(int)tensor.Height,
(int)tensor.Width,
(int)tensor.Channels));
}
else
{
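
The new `Height`, `Width`, and `Channels` properties above hide the rank-dependent indexing: a 4D shape is (batch, H, W, C), while the 8D layout pads the middle with 1s so H, W, and C land at indices 5, 6, and 7 (see the companion TensorUtilsTest change). The same lookup in Python, as a sanity check of the index arithmetic:

```python
def hwc(shape):
    """Return (height, width, channels) for a 4D or 8D NHWC-style shape."""
    if len(shape) == 4:   # (batch, H, W, C)
        return shape[1], shape[2], shape[3]
    if len(shape) == 8:   # (batch, 1, 1, 1, 1, H, W, C)
        return shape[5], shape[6], shape[7]
    raise ValueError(f"Unexpected rank {len(shape)}")

print(hwc([1, 64, 84, 3]))               # (64, 84, 3)
print(hwc([1, 1, 1, 1, 1, 64, 84, 3]))   # (64, 84, 3)
```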

com.unity.ml-agents/Runtime/Sensors/SensorShapeValidator.cs (2)


Debug.Assert(cachedShape.Length == sensorShape.Length, "Sensor dimensions must match.");
for (var j = 0; j < Mathf.Min(cachedShape.Length, sensorShape.Length); j++)
{
Debug.Assert(cachedShape[j] == sensorShape[j], "Sensor sizes much match.");
Debug.Assert(cachedShape[j] == sensorShape[j], "Sensor sizes must match.");
}
}
}
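
Besides correcting the "much match" typo, this validator compares a newly seen sensor's shape against the cached shape, asserting equal rank and equal per-dimension sizes. A compact Python rendering of the same check, under a hypothetical `validate_shapes` name:

```python
def validate_shapes(cached_shape, sensor_shape):
    assert len(cached_shape) == len(sensor_shape), "Sensor dimensions must match."
    for cached, actual in zip(cached_shape, sensor_shape):
        assert cached == actual, "Sensor sizes must match."

validate_shapes((20, 22, 3), (20, 22, 3))    # passes silently
# validate_shapes((20, 22, 3), (20, 23, 3))  # AssertionError: Sensor sizes must match.
```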

com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensorComponentBase.cs (5)


else
{
var rayInput = GetRayPerceptionInput();
// We don't actually need the tags here, since they don't affect the display of the rays.
// Additionally, the user might be in the middle of typing the tag name when this is called,
// and there's no way to turn off the "Tag ... is not defined" error logs.
// So just don't use any tags here.
rayInput.DetectableTags = null;
for (var rayIndex = 0; rayIndex < rayInput.Angles.Count; rayIndex++)
{
DebugDisplayInfo.RayInfo debugRay;

com.unity.ml-agents/Runtime/Sensors/RayPerceptionSensor.cs (19)


if (castHit)
{
// Find the index of the tag of the object that was hit.
for (var i = 0; i < input.DetectableTags.Count; i++)
var numTags = input.DetectableTags?.Count ?? 0;
for (var i = 0; i < numTags; i++)
if (hitObject.CompareTag(input.DetectableTags[i]))
var tagsEqual = false;
try
{
var tag = input.DetectableTags[i];
if (!string.IsNullOrEmpty(tag))
{
tagsEqual = hitObject.CompareTag(tag);
}
}
catch (UnityException)
{
// If the tag is null, empty, or not a valid tag, just ignore it.
}
if (tagsEqual)
{
rayOutput.HitTaggedObject = true;
rayOutput.HitTagIndex = i;

com.unity.ml-agents/Runtime/Academy.cs (6)


* API. For more information on each of these entities, in addition to how to
* set-up a learning environment and train the behavior of characters in a
* Unity scene, please browse our documentation pages on GitHub:
* https://github.com/Unity-Technologies/ml-agents/tree/release_1_docs/docs/
* https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs/docs/
*/
namespace Unity.MLAgents

/// fall back to inference or heuristic decisions. (You can also set agents to always use
/// inference or heuristics.)
/// </remarks>
[HelpURL("https://github.com/Unity-Technologies/ml-agents/tree/release_1_docs/" +
[HelpURL("https://github.com/Unity-Technologies/ml-agents/tree/release_2_verified_docs/" +
"docs/Learning-Environment-Design.md")]
public class Academy : IDisposable
{

/// Unity package version of com.unity.ml-agents.
/// This must match the version string in package.json and is checked in a unit test.
/// </summary>
internal const string k_PackageVersion = "1.0.0-preview";
internal const string k_PackageVersion = "1.0.5";
const int k_EditorTrainingPort = 5004;

com.unity.ml-agents/Runtime/Agent.cs (28)


/// [OnDisable()]: https://docs.unity3d.com/ScriptReference/MonoBehaviour.OnDisable.html]
/// [OnBeforeSerialize()]: https://docs.unity3d.com/ScriptReference/MonoBehaviour.OnBeforeSerialize.html
/// [OnAfterSerialize()]: https://docs.unity3d.com/ScriptReference/MonoBehaviour.OnAfterSerialize.html
/// [Agents]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md
/// [Reinforcement Learning in Unity]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design.md
/// [Agents]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md
/// [Reinforcement Learning in Unity]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design.md
/// [Unity ML-Agents Toolkit manual]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Readme.md
/// [Unity ML-Agents Toolkit manual]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Readme.md
[HelpURL("https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/" +
[HelpURL("https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/" +
"docs/Learning-Environment-Design-Agents.md")]
[Serializable]
[RequireComponent(typeof(BehaviorParameters))]

/// for information about mixing reward signals from curiosity and Generative Adversarial
/// Imitation Learning (GAIL) with rewards supplied through this method.
///
/// [Agents - Rewards]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#rewards
/// [Reward Signals]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/ML-Agents-Overview.md#a-quick-note-on-reward-signals
/// [Agents - Rewards]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#rewards
/// [Reward Signals]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/ML-Agents-Overview.md#a-quick-note-on-reward-signals
/// </remarks>
/// <param name="reward">The new value of the reward.</param>
public void SetReward(float reward)

/// for information about mixing reward signals from curiosity and Generative Adversarial
/// Imitation Learning (GAIL) with rewards supplied through this method.
///
/// [Agents - Rewards]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#rewards
/// [Reward Signals]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/ML-Agents-Overview.md#a-quick-note-on-reward-signals
/// [Agents - Rewards]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#rewards
/// [Reward Signals]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/ML-Agents-Overview.md#a-quick-note-on-reward-signals
///</remarks>
/// <param name="increment">Incremental reward value.</param>
public void AddReward(float increment)

///
/// Your heuristic implementation can use any decision making logic you specify. Assign decision
/// values to the float[] array, <paramref name="actionsOut"/>, passed to your function as a parameter.
/// The same array will be reused between steps. It is up to the user to initialize
/// the values on each call, for example by calling `Array.Clear(actionsOut, 0, actionsOut.Length);`.
/// Add values to the array at the same indexes as they are used in your
/// <seealso cref="OnActionReceived(float[])"/> function, which receives this array and
/// implements the corresponding agent behavior. See [Actions] for more information

/// implementing a simple heuristic function can aid in debugging agent actions and interactions
/// with its environment.
///
/// [Demonstration Recorder]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#recording-demonstrations
/// [Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#actions
/// [Demonstration Recorder]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#recording-demonstrations
/// [Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#actions
/// [GameObject]: https://docs.unity3d.com/Manual/GameObjects.html
/// </remarks>
/// <example>

/// For more information about observations, see [Observations and Sensors].
///
/// [GameObject]: https://docs.unity3d.com/Manual/GameObjects.html
/// [Observations and Sensors]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#observations-and-sensors
/// [Observations and Sensors]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#observations-and-sensors
/// </remarks>
public virtual void CollectObservations(VectorSensor sensor)
{

///
/// See [Agents - Actions] for more information on masking actions.
///
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#actions
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#actions
/// </remarks>
/// <seealso cref="OnActionReceived(float[])"/>
public virtual void CollectDiscreteActionMasks(DiscreteActionMasker actionMasker)

///
/// For more information about implementing agent actions see [Agents - Actions].
///
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#actions
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#actions
/// </remarks>
/// <param name="vectorAction">
/// An array containing the action vector. The length of the array is specified

18
com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs


{
var agentInfoProto = ai.ToAgentInfoProto();
var agentActionProto = new AgentActionProto
var agentActionProto = new AgentActionProto();
if(ai.storedVectorActions != null)
VectorActions = { ai.storedVectorActions }
};
agentActionProto.VectorActions.AddRange(ai.storedVectorActions);
}
return new AgentInfoActionPairProto
{

var brainParametersProto = new BrainParametersProto
{
VectorActionSize = { bp.VectorActionSize },
VectorActionSpaceType =
(SpaceTypeProto)bp.VectorActionSpaceType,
VectorActionSpaceType = (SpaceTypeProto) bp.VectorActionSpaceType,
brainParametersProto.VectorActionDescriptions.AddRange(bp.VectorActionDescriptions);
if(bp.VectorActionDescriptions != null)
{
brainParametersProto.VectorActionDescriptions.AddRange(bp.VectorActionDescriptions);
}
return brainParametersProto;
}

/// </summary>
public static DemonstrationMetaProto ToProto(this DemonstrationMetaData dm)
{
var demonstrationName = dm.demonstrationName ?? "";
var demoProto = new DemonstrationMetaProto
{
ApiVersion = DemonstrationMetaData.ApiVersion,

DemonstrationName = dm.demonstrationName
DemonstrationName = demonstrationName
};
return demoProto;
}

39
com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs


/// Responsible for communication with External using gRPC.
internal class RpcCommunicator : ICommunicator
{
// The python package version must be >= s_MinSupportedPythonPackageVersion
// and <= s_MaxSupportedPythonPackageVersion.
static Version s_MinSupportedPythonPackageVersion = new Version("0.16.1");
static Version s_MaxSupportedPythonPackageVersion = new Version("0.20.0");
public event QuitCommandHandler QuitCommandReceived;
public event ResetCommandHandler ResetCommandReceived;

return true;
}
internal static bool CheckPythonPackageVersionIsCompatible(string pythonLibraryVersion)
{
Version pythonVersion;
try
{
pythonVersion = new Version(pythonLibraryVersion);
}
catch
{
// Unparseable - this also catches things like "0.20.0-dev0" which we don't want to support
return false;
}
if (pythonVersion < s_MinSupportedPythonPackageVersion ||
pythonVersion > s_MaxSupportedPythonPackageVersion)
{
return false;
}
return true;
}
/// <summary>
/// Sends the initialization parameters through the Communicator.
/// Is used by the academy to send initialization parameters to the communicator.

}
throw new UnityAgentsException("ICommunicator.Initialize() failed.");
}
var packageVersionSupported = CheckPythonPackageVersionIsCompatible(pythonPackageVersion);
if (!packageVersionSupported)
{
Debug.LogWarningFormat(
"Python package version ({0}) is out of the supported range or not from an official release. " +
"It is strongly recommended that you use a Python package between {1} and {2}. " +
"Training will proceed, but the output format may be different.",
pythonPackageVersion,
s_MinSupportedPythonPackageVersion,
s_MaxSupportedPythonPackageVersion
);
}
}
catch

2
com.unity.ml-agents/Runtime/Demonstrations/DemonstrationRecorder.cs


/// See [Imitation Learning - Recording Demonstrations] for more information.
///
/// [GameObject]: https://docs.unity3d.com/Manual/GameObjects.html
/// [Imitation Learning - Recording Demonstrations]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs//Learning-Environment-Design-Agents.md#recording-demonstrations
/// [Imitation Learning - Recording Demonstrations]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs//Learning-Environment-Design-Agents.md#recording-demonstrations
/// </remarks>
[RequireComponent(typeof(Agent))]
[AddComponentMenu("ML Agents/Demonstration Recorder", (int)MenuGroup.Default)]

2
com.unity.ml-agents/Runtime/DiscreteActionMasker.cs


///
/// See [Agents - Actions] for more information on masking actions.
///
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Learning-Environment-Design-Agents.md#actions
/// [Agents - Actions]: https://github.com/Unity-Technologies/ml-agents/blob/release_2_verified_docs/docs/Learning-Environment-Design-Agents.md#actions
/// </remarks>
/// <param name="branch">The branch for which the actions will be masked.</param>
/// <param name="actionIndices">The indices of the masked actions.</param>

8
com.unity.ml-agents/package.json


{
"name": "com.unity.ml-agents",
"displayName": "ML Agents",
"version": "1.0.0-preview",
"version": "1.0.5",
"com.unity.barracuda": "0.7.0-preview"
"com.unity.barracuda": "1.0.3",
"com.unity.modules.imageconversion": "1.0.0",
"com.unity.modules.jsonserialize": "1.0.0",
"com.unity.modules.physics": "1.0.0",
"com.unity.modules.physics2d": "1.0.0"
}
}

58
com.unity.ml-agents/CHANGELOG.md


and this project adheres to
[Semantic Versioning](http://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Minor Changes
#### com.unity.ml-agents (C#)
- Update documentation with recommended version of Python trainer. (#4535)
- Log a warning if a version of the Python trainer is used that is newer than expected. (#4535)
### Bug Fixes
#### com.unity.ml-agents (C#)
- Fixed a bug with visual observations using .onnx model files and newer versions of Barracuda (1.1.0 or later). (#4533)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- Fixed an issue where runs could not be resumed when using TensorFlow and Ghost Training. (#4593)
## [1.0.5] - 2020-09-23
### Minor Changes
#### com.unity.ml-agents (C#)
- Update Barracuda to 1.0.3. (#4506)
## [1.0.4] - 2020-08-19
### Minor Changes
#### com.unity.ml-agents (C#)
- Update Barracuda to 1.0.2. (#4385)
- Explicitly call out dependencies in package.json.
## [1.0.3] - 2020-07-07
### Minor Changes
#### com.unity.ml-agents (C#)
- Update Barracuda to 1.0.1. (#4187)
### Bug Fixes
#### com.unity.ml-agents (C#)
- Fixed an issue where RayPerceptionSensor would raise an exception when the
list of tags was empty, or a tag in the list was invalid (unknown, null, or
empty string). (#4155)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- Fixed issue with FoodCollector, Soccer, and WallJump when playing with keyboard. (#4147, #4174)
## [1.0.2] - 2020-06-04
### Minor Changes
#### com.unity.ml-agents (C#)
- Remove 'preview' tag.
## [1.0.2-preview] - 2020-05-19
### Bug Fixes
#### com.unity.ml-agents (C#)
- Fix missing .meta file
## [1.0.1-preview] - 2020-05-19
### Bug Fixes
#### com.unity.ml-agents (C#)
- A bug that would cause the editor to go into a loop when a prefab was selected was fixed. (#3949)
- BrainParameters.ToProto() no longer throws an exception if none of the fields have been set. (#3930)
- The Barracuda dependency was upgraded to 0.7.1-preview. (#3977)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- An issue was fixed where using `--initialize-from` would resume from the past step count. (#3962)
- The gym wrapper error for the wrong number of agents now fires more consistently, and more details
were added to the error message when the input dimension is wrong. (#3963)
## [1.0.0-preview] - 2020-05-06
### Major Changes

5
ml-agents/setup.py


from io import open
import os
import sys

tag = os.getenv("CIRCLE_TAG")
if tag != EXPECTED_TAG:
info = "Git tag: {0} does not match the expected tag of this app: {1}".format(
info = "Git tag: {} does not match the expected tag of this app: {}".format(
tag, EXPECTED_TAG
)
sys.exit(info)

# Test-only dependencies should go in test_requirements.txt, not here.
"grpcio>=1.11.0",
"h5py>=2.9.0",
"mlagents_envs=={}".format(VERSION),
f"mlagents_envs=={VERSION}",
"numpy>=1.13.3,<2.0",
"Pillow>=4.2.1",
"protobuf>=3.6",

9
ml-agents/tests/yamato/check_coverage_percent.py


from __future__ import print_function
import sys
import os

summary_xml = os.path.join(dirpath, SUMMARY_XML_FILENAME)
break
if not summary_xml:
print("Couldn't find {} in root directory".format(SUMMARY_XML_FILENAME))
print(f"Couldn't find {SUMMARY_XML_FILENAME} in root directory")
sys.exit(1)
with open(summary_xml) as f:

for l in lines:
if "Linecoverage" in l:
pct = l.replace("<Linecoverage>", "").replace("</Linecoverage>", "")
for line in lines:
if >"Linecoverage" in line:
pct = line.replace("<Linecoverage>", "").replace("</Linecoverage>", "")
pct = float(pct)
if pct < min_percentage:
print(

4
ml-agents/tests/yamato/scripts/run_gym.py


if len(env.observation_space.shape) == 1:
# Examine the initial vector observation
print("Agent observations look like: \n{}".format(initial_observations))
print(f"Agent observations look like: \n{initial_observations}")
for _episode in range(10):
env.reset()

actions = env.action_space.sample()
obs, reward, done, _ = env.step(actions)
episode_rewards += reward
print("Total reward this episode: {}".format(episode_rewards))
print(f"Total reward this episode: {episode_rewards}")
finally:
env.close()

2
ml-agents/tests/yamato/scripts/run_llapi.py


if tracked_agent in terminal_steps:
episode_rewards += terminal_steps[tracked_agent].reward
done = True
print("Total reward this episode: {}".format(episode_rewards))
print(f"Total reward this episode: {episode_rewards}")
finally:
env.close()

5
ml-agents/tests/yamato/yamato_utils.py


if extra_packages:
pip_commands += extra_packages
for cmd in pip_commands:
pip_index_url = "--index-url https://artifactory.prd.it.unity3d.com/artifactory/api/pypi/pypi/simple"
f"source {venv_path}/bin/activate; python -m pip install -q {cmd}",
f"source {venv_path}/bin/activate; python -m pip install -q {cmd} {pip_index_url}",
shell=True,
)
return venv_path

subprocess.check_call("git reset HEAD .", shell=True)
subprocess.check_call("git checkout -- .", shell=True)
# Ensure the cache isn't polluted with old compiled assemblies.
subprocess.check_call(f"rm -rf Project/Library", shell=True)
subprocess.check_call("rm -rf Project/Library", shell=True)
def override_config_file(src_path, dest_path, **kwargs):

4
ml-agents/mlagents/trainers/__init__.py


# Version of the library that will be used to upload to pypi
__version__ = "0.16.0"
__version__ = "0.16.1"
__release_tag__ = "release_1"
__release_tag__ = "release_2"

2
ml-agents/mlagents/trainers/subprocess_env_manager.py


return self.env_workers[0].recv().payload
def close(self) -> None:
logger.debug(f"SubprocessEnvManager closing.")
logger.debug("SubprocessEnvManager closing.")
self.step_queue.close()
self.step_queue.join_thread()
for env_worker in self.env_workers:

4
ml-agents/mlagents/trainers/buffer.py


super().__init__()
def __str__(self):
return ", ".join(["'{0}' : {1}".format(k, str(self[k])) for k in self.keys()])
return ", ".join(["'{}' : {}".format(k, str(self[k])) for k in self.keys()])
def reset_agent(self) -> None:
"""

key_list = list(self.keys())
if not self.check_length(key_list):
raise BufferException(
"The length of the fields {0} were not of same length".format(key_list)
f"The length of the fields {key_list} were not of same length"
)
for field_key in key_list:
target_buffer[field_key].extend(

2
ml-agents/mlagents/trainers/components/bc/model.py


from mlagents.trainers.policy.tf_policy import TFPolicy
class BCModel(object):
class BCModel:
def __init__(
self, policy: TFPolicy, learning_rate: float = 3e-4, anneal_steps: int = 0
):

2
ml-agents/mlagents/trainers/components/bc/module.py


for k in param_keys:
if k not in config_dict:
raise UnityTrainerException(
"The required pre-training hyper-parameter {0} was not defined. Please check your \
"The required pre-training hyper-parameter {} was not defined. Please check your \
trainer YAML file.".format(
k
)

2
ml-agents/mlagents/trainers/components/reward_signals/__init__.py


for k in param_keys:
if k not in config_dict:
raise UnityTrainerException(
"The hyper-parameter {0} could not be found for {1}.".format(
"The hyper-parameter {} could not be found for {}.".format(
k, cls.__name__
)
)

6
ml-agents/mlagents/trainers/components/reward_signals/curiosity/model.py


from mlagents.trainers.policy.tf_policy import TFPolicy
class CuriosityModel(object):
class CuriosityModel:
def __init__(
self, policy: TFPolicy, encoding_size: int = 128, learning_rate: float = 3e-4
):

self.encoding_size,
ModelUtils.swish,
1,
"curiosity_stream_{}_visual_obs_encoder".format(i),
f"curiosity_stream_{i}_visual_obs_encoder",
False,
)

ModelUtils.swish,
1,
"curiosity_stream_{}_visual_obs_encoder".format(i),
f"curiosity_stream_{i}_visual_obs_encoder",
True,
)
visual_encoders.append(encoded_visual)

6
ml-agents/mlagents/trainers/components/reward_signals/gail/model.py


EPSILON = 1e-7
class GAILModel(object):
class GAILModel:
def __init__(
self,
policy: TFPolicy,

self.encoding_size,
ModelUtils.swish,
1,
"gail_stream_{}_visual_obs_encoder".format(i),
f"gail_stream_{i}_visual_obs_encoder",
False,
)

ModelUtils.swish,
1,
"gail_stream_{}_visual_obs_encoder".format(i),
f"gail_stream_{i}_visual_obs_encoder",
True,
)
visual_policy_encoders.append(encoded_policy_visual)

4
ml-agents/mlagents/trainers/components/reward_signals/reward_signal_factory.py


"""
rcls = NAME_TO_CLASS.get(name)
if not rcls:
raise UnityTrainerException("Unknown reward signal type {0}".format(name))
raise UnityTrainerException(f"Unknown reward signal type {name}")
"Unknown parameters given for reward signal {0}".format(name)
f"Unknown parameters given for reward signal {name}"
)
return class_inst

12
ml-agents/mlagents/trainers/curriculum.py


for key in parameters:
config[key] = parameters[key][self.lesson_num]
logger.info(
"{0} lesson changed. Now in lesson {1}: {2}".format(
"{} lesson changed. Now in lesson {}: {}".format(
self.brain_name,
self.lesson_num,
", ".join([str(x) + " -> " + str(config[x]) for x in config]),

try:
with open(config_path) as data_file:
return Curriculum._load_curriculum(data_file)
except IOError:
raise CurriculumLoadingError(
"The file {0} could not be found.".format(config_path)
)
except OSError:
raise CurriculumLoadingError(f"The file {config_path} could not be found.")
raise CurriculumLoadingError(
"There was an error decoding {}".format(config_path)
)
raise CurriculumLoadingError(f"There was an error decoding {config_path}")
@staticmethod
def _load_curriculum(fp: TextIO) -> Dict:

6
ml-agents/mlagents/trainers/models.py


)
else:
raise UnityTrainerException(
"The learning rate schedule {} is invalid.".format(lr_schedule)
f"The learning rate schedule {lr_schedule} is invalid."
)
return learning_rate

h_size,
activation=activation,
reuse=reuse,
name="hidden_{}".format(i),
name=f"hidden_{i}",
kernel_initializer=tf.initializers.variance_scaling(1.0),
)
return hidden

"""
value_heads = {}
for name in stream_names:
value = tf.layers.dense(hidden_input, 1, name="{}_value".format(name))
value = tf.layers.dense(hidden_input, 1, name=f"{name}_value")
value_heads[name] = value
value = tf.reduce_mean(list(value_heads.values()), 0)
return value_heads, value

13
ml-agents/mlagents/trainers/policy/tf_policy.py


self.sequence_length = trainer_parameters["sequence_length"]
if self.m_size == 0:
raise UnityPolicyException(
"The memory size for brain {0} is 0 even "
"The memory size for brain {} is 0 even "
"The memory size for brain {0} is {1} "
"The memory size for brain {} is {} "
"but it must be divisible by 2.".format(
brain.brain_name, self.m_size
)

ckpt = tf.train.get_checkpoint_state(model_path)
if ckpt is None:
raise UnityPolicyException(
"The model {0} could not be loaded. Make "
"The model {} could not be loaded. Make "
"sure you specified the right "
"--run-id and that the previous run you are loading from had the same "
"behavior names.".format(model_path)

except tf.errors.NotFoundError:
raise UnityPolicyException(
"The model {0} was found but could not be loaded. Make "
"The model {} was found but could not be loaded. Make "
"sure the model is from the same version of ML-Agents, has the same behavior parameters, "
"and is using the same trainer configuration as the current run.".format(
model_path

self._set_step(0)
logger.info(
"Starting training from step 0 and saving to {}.".format(
self.model_path

logger.info(
"Resuming training from step {}.".format(self.get_current_step())
)
logger.info(f"Resuming training from step {self.get_current_step()}.")
def initialize_or_load(self):
# If there is an initialize path, load from that. Else, load from the set model path.

12
ml-agents/mlagents/trainers/ppo/optimizer.py


self.old_values = {}
for name in value_heads.keys():
returns_holder = tf.placeholder(
shape=[None], dtype=tf.float32, name="{}_returns".format(name)
shape=[None], dtype=tf.float32, name=f"{name}_returns"
shape=[None], dtype=tf.float32, name="{}_value_estimate".format(name)
shape=[None], dtype=tf.float32, name=f"{name}_value_estimate"
)
self.returns_holders[name] = returns_holder
self.old_values[name] = old_value

self.all_old_log_probs: mini_batch["action_probs"],
}
for name in self.reward_signals:
feed_dict[self.returns_holders[name]] = mini_batch[
"{}_returns".format(name)
]
feed_dict[self.old_values[name]] = mini_batch[
"{}_value_estimates".format(name)
]
feed_dict[self.returns_holders[name]] = mini_batch[f"{name}_returns"]
feed_dict[self.old_values[name]] = mini_batch[f"{name}_value_estimates"]
if self.policy.output_pre is not None and "actions_pre" in mini_batch:
feed_dict[self.policy.output_pre] = mini_batch["actions_pre"]

20
ml-agents/mlagents/trainers/ppo/trainer.py


:param seed: The seed the model will be initialized with
:param run_id: The identifier of the current run
"""
super(PPOTrainer, self).__init__(
super().__init__(
brain_name, trainer_parameters, training, run_id, reward_buff_cap
)
self.param_keys = [

trajectory.done_reached and not trajectory.max_step_reached,
)
for name, v in value_estimates.items():
agent_buffer_trajectory["{}_value_estimates".format(name)].extend(v)
agent_buffer_trajectory[f"{name}_value_estimates"].extend(v)
self._stats_reporter.add_stat(
self.optimizer.reward_signals[name].value_name, np.mean(v)
)

evaluate_result = reward_signal.evaluate_batch(
agent_buffer_trajectory
).scaled_reward
agent_buffer_trajectory["{}_rewards".format(name)].extend(evaluate_result)
agent_buffer_trajectory[f"{name}_rewards"].extend(evaluate_result)
# Report the reward signals
self.collected_rewards[name][agent_id] += np.sum(evaluate_result)

for name in self.optimizer.reward_signals:
bootstrap_value = value_next[name]
local_rewards = agent_buffer_trajectory[
"{}_rewards".format(name)
].get_batch()
local_rewards = agent_buffer_trajectory[f"{name}_rewards"].get_batch()
"{}_value_estimates".format(name)
f"{name}_value_estimates"
].get_batch()
local_advantage = get_gae(
rewards=local_rewards,

)
local_return = local_advantage + local_value_estimates
# This is later used as the target for the different value estimates
agent_buffer_trajectory["{}_returns".format(name)].set(local_return)
agent_buffer_trajectory["{}_advantage".format(name)].set(local_advantage)
agent_buffer_trajectory[f"{name}_returns"].set(local_return)
agent_buffer_trajectory[f"{name}_advantage"].set(local_advantage)
tmp_advantages.append(local_advantage)
tmp_returns.append(local_return)

self.update_buffer.shuffle(sequence_length=self.policy.sequence_length)
buffer = self.update_buffer
max_num_batch = buffer_length // batch_size
for l in range(0, max_num_batch * batch_size, batch_size):
for i in range(0, max_num_batch * batch_size, batch_size):
buffer.make_mini_batch(l, l + batch_size), n_sequences
buffer.make_mini_batch(i, i + batch_size), n_sequences
)
for stat_name, value in update_stats.items():
batch_update_stats[stat_name].append(value)

6
ml-agents/mlagents/trainers/sac/network.py


"""
self.value_heads = {}
for name in stream_names:
value = tf.layers.dense(hidden_input, 1, name="{}_value".format(name))
value = tf.layers.dense(hidden_input, 1, name=f"{name}_value")
self.value_heads[name] = value
self.value = tf.reduce_mean(list(self.value_heads.values()), 0)

q1_heads = {}
for name in stream_names:
_q1 = tf.layers.dense(q1_hidden, num_outputs, name="{}_q1".format(name))
_q1 = tf.layers.dense(q1_hidden, num_outputs, name=f"{name}_q1")
q1_heads[name] = _q1
q1 = tf.reduce_mean(list(q1_heads.values()), axis=0)

q2_heads = {}
for name in stream_names:
_q2 = tf.layers.dense(q2_hidden, num_outputs, name="{}_q2".format(name))
_q2 = tf.layers.dense(q2_hidden, num_outputs, name=f"{name}_q2")
q2_heads[name] = _q2
q2 = tf.reduce_mean(list(q2_heads.values()), axis=0)

4
ml-agents/mlagents/trainers/sac/optimizer.py


)
rewards_holder = tf.placeholder(
shape=[None], dtype=tf.float32, name="{}_rewards".format(name)
shape=[None], dtype=tf.float32, name=f"{name}_rewards"
)
self.rewards_holders[name] = rewards_holder

self.policy.mask_input: batch["masks"] * burn_in_mask,
}
for name in self.reward_signals:
feed_dict[self.rewards_holders[name]] = batch["{}_rewards".format(name)]
feed_dict[self.rewards_holders[name]] = batch[f"{name}_rewards"]
if self.policy.use_continuous_act:
feed_dict[self.policy_network.external_action_in] = batch["actions"]

14
ml-agents/mlagents/trainers/sac/trainer.py


filename = os.path.join(
self.trainer_parameters["model_path"], "last_replay_buffer.hdf5"
)
logger.info("Saving Experience Replay Buffer to {}".format(filename))
logger.info(f"Saving Experience Replay Buffer to {filename}")
with open(filename, "wb") as file_object:
self.update_buffer.save_to_file(file_object)

filename = os.path.join(
self.trainer_parameters["model_path"], "last_replay_buffer.hdf5"
)
logger.info("Loading Experience Replay Buffer from {}".format(filename))
logger.info(f"Loading Experience Replay Buffer from {filename}")
with open(filename, "rb+") as file_object:
self.update_buffer.load_from_file(file_object)
logger.info(

batch_update_stats: Dict[str, list] = defaultdict(list)
while self.step / self.update_steps > self.steps_per_update:
logger.debug("Updating SAC policy at step {}".format(self.step))
logger.debug(f"Updating SAC policy at step {self.step}")
buffer = self.update_buffer
if (
self.update_buffer.num_experiences

)
# Get rewards for each reward
for name, signal in self.optimizer.reward_signals.items():
sampled_minibatch[
"{}_rewards".format(name)
] = signal.evaluate_batch(sampled_minibatch).scaled_reward
sampled_minibatch[f"{name}_rewards"] = signal.evaluate_batch(
sampled_minibatch
).scaled_reward
update_stats = self.optimizer.update(sampled_minibatch, n_sequences)
for stat_name, value in update_stats.items():

# Get minibatches for reward signal update if needed
reward_signal_minibatches = {}
for name, signal in self.optimizer.reward_signals.items():
logger.debug("Updating {} at step {}".format(name, self.step))
logger.debug(f"Updating {name} at step {self.step}")
# Some signals don't need a minibatch to be sampled - so we don't!
if signal.update_dict:
reward_signal_minibatches[name] = buffer.sample_mini_batch(

2
ml-agents/mlagents/trainers/sampler_class.py


for param_name, cur_param_dict in self.reset_param_dict.items():
if "sampler-type" not in cur_param_dict:
raise SamplerException(
"'sampler_type' argument hasn't been supplied for the {0} parameter".format(
"'sampler_type' argument hasn't been supplied for the {} parameter".format(
param_name
)
)

12
ml-agents/mlagents/trainers/stats.py


)
if self.self_play and "Self-play/ELO" in values:
elo_stats = values["Self-play/ELO"]
logger.info("{} ELO: {:0.3f}. ".format(category, elo_stats.mean))
logger.info(f"{category} ELO: {elo_stats.mean:0.3f}. ")
else:
logger.info(
"{}: Step: {}. No episode was completed since last summary. {}".format(

) -> None:
if property_type == StatsPropertyType.HYPERPARAMETERS:
logger.info(
"""Hyperparameters for behavior name {0}: \n{1}""".format(
"""Hyperparameters for behavior name {}: \n{}""".format(
category, self._dict_to_str(value, 0)
)
)

[
"\t"
+ " " * num_tabs
+ "{0}:\t{1}".format(
+ "{}:\t{}".format(
x, self._dict_to_str(param_dict[x], num_tabs + 1)
)
for x in param_dict

self._maybe_create_summary_writer(category)
for key, value in values.items():
summary = tf.Summary()
summary.value.add(tag="{}".format(key), simple_value=value.mean)
summary.value.add(tag=f"{key}", simple_value=value.mean)
self.summary_writers[category].add_summary(summary, step)
self.summary_writers[category].flush()

for file_name in os.listdir(directory_name):
if file_name.startswith("events.out"):
logger.warning(
"{} was left over from a previous run. Deleting.".format(file_name)
f"{file_name} was left over from a previous run. Deleting."
)
full_fname = os.path.join(directory_name, file_name)
try:

s_op = tf.summary.text(
name,
tf.convert_to_tensor(
([[str(x), str(input_dict[x])] for x in input_dict])
[[str(x), str(input_dict[x])] for x in input_dict]
),
)
s = sess.run(s_op)

2
ml-agents/mlagents/trainers/trainer/rl_trainer.py


"""
def __init__(self, *args, **kwargs):
super(RLTrainer, self).__init__(*args, **kwargs)
super().__init__(*args, **kwargs)
# Make sure we have at least one reward_signal
if not self.trainer_parameters["reward_signals"]:
raise UnityTrainerException(

4
ml-agents/mlagents/trainers/trainer/trainer.py


for k in self.param_keys:
if k not in self.trainer_parameters:
raise UnityTrainerException(
"The hyper-parameter {0} could not be found for the {1} trainer of "
"brain {2}.".format(k, self.__class__, self.brain_name)
"The hyper-parameter {} could not be found for the {} trainer of "
"brain {}.".format(k, self.__class__, self.brain_name)
)
@property

2
ml-agents/mlagents/trainers/trainer_controller.py


from mlagents.trainers.agent_processor import AgentManager
class TrainerController(object):
class TrainerController:
def __init__(
self,
trainer_factory: TrainerFactory,

2
ml-agents/mlagents/trainers/trainer_util.py


try:
with open(config_path) as data_file:
return _load_config(data_file)
except IOError:
except OSError:
abs_path = os.path.abspath(config_path)
raise TrainerConfigError(f"Config file could not be found at {abs_path}.")
except UnicodeDecodeError:

4
ml-agents/mlagents/trainers/ghost/controller.py


"""
self._queue.append(self._learning_team)
self._learning_team = self._queue.popleft()
logger.debug(
"Learning team {} swapped on step {}".format(self._learning_team, step)
)
logger.debug(f"Learning team {self._learning_team} swapped on step {step}")
self._changed_training_team = True
# Adapted from https://github.com/Unity-Technologies/ml-agents/pull/1975 and

18
ml-agents/mlagents/trainers/ghost/trainer.py


:param run_id: The identifier of the current run
"""
super(GhostTrainer, self).__init__(
super().__init__(
brain_name, trainer_parameters, training, run_id, reward_buff_cap
)

@property
def reward_buffer(self) -> Deque[float]:
"""
Returns the reward buffer. The reward buffer contains the cumulative
rewards of the most recent episodes completed by agents using this
trainer.
:return: the reward buffer.
"""
Returns the reward buffer. The reward buffer contains the cumulative
rewards of the most recent episodes completed by agents using this
trainer.
:return: the reward buffer.
"""
return self.trainer.reward_buffer
@property

"""
policy = self.trainer.create_policy(parsed_behavior_id, brain_parameters)
policy.create_tf_graph()
policy.initialize_or_load()
policy.init_load_weights()
team_id = parsed_behavior_id.team_id
self.controller.subscribe_team_id(team_id, self)

self._save_snapshot() # Need to save after trainer initializes policy
self._learning_team = self.controller.get_learning_team
self.wrapped_trainer_team = team_id
else:
# Load the weights of the ghost policy from the wrapped one
policy.load_weights(
self.trainer.get_policy(parsed_behavior_id).get_weights()
)
return policy
def add_policy(

4
ml-agents/mlagents/trainers/tests/test_nn_policy.py


trainer_params["model_path"] = path1
policy = create_policy_mock(trainer_params)
policy.initialize_or_load()
policy._set_step(2000)
policy.save_model(2000)
assert len(os.listdir(tmp_path)) > 0

policy2.initialize_or_load()
_compare_two_policies(policy, policy2)
assert policy2.get_current_step() == 2000
# Try initialize from path 1
trainer_params["model_path"] = path2

_compare_two_policies(policy2, policy3)
# Assert that the steps are 0.
assert policy3.get_current_step() == 0
def _compare_two_policies(policy1: NNPolicy, policy2: NNPolicy) -> None:

2
ml-agents/mlagents/trainers/tests/test_simple_rl.py


def default_reward_processor(rewards, last_n_rewards=5):
rewards_to_use = rewards[-last_n_rewards:]
# For debugging tests
print("Last {} rewards:".format(last_n_rewards), rewards_to_use)
print(f"Last {last_n_rewards} rewards:", rewards_to_use)
return np.array(rewards[-last_n_rewards:], dtype=np.float32).mean()

64
ml-agents/mlagents/trainers/tests/test_ghost.py


np.testing.assert_array_equal(w, lw)
def test_resume(dummy_config, tmp_path):
brain_params_team0 = BrainParameters(
brain_name="test_brain?team=0",
vector_observation_space_size=1,
camera_resolutions=[],
vector_action_space_size=[2],
vector_action_descriptions=[],
vector_action_space_type=0,
)
brain_name = BehaviorIdentifiers.from_name_behavior_id(
brain_params_team0.brain_name
).brain_name
brain_params_team1 = BrainParameters(
brain_name="test_brain?team=1",
vector_observation_space_size=1,
camera_resolutions=[],
vector_action_space_size=[2],
vector_action_descriptions=[],
vector_action_space_type=0,
)
tmp_path = tmp_path.as_posix()
ppo_trainer = PPOTrainer(brain_name, 0, dummy_config, True, False, 0, tmp_path)
controller = GhostController(100)
trainer = GhostTrainer(
ppo_trainer, brain_name, controller, 0, dummy_config, True, tmp_path
)
parsed_behavior_id0 = BehaviorIdentifiers.from_name_behavior_id(
brain_params_team0.brain_name
)
policy = trainer.create_policy(parsed_behavior_id0, brain_params_team0)
trainer.add_policy(parsed_behavior_id0, policy)
parsed_behavior_id1 = BehaviorIdentifiers.from_name_behavior_id(
brain_params_team1.brain_name
)
policy = trainer.create_policy(parsed_behavior_id1, brain_params_team1)
trainer.add_policy(parsed_behavior_id1, policy)
trainer.save_model(parsed_behavior_id0.behavior_id)
# Make a new trainer, check that the policies are the same
ppo_trainer2 = PPOTrainer(brain_name, 0, dummy_config, True, True, 0, tmp_path)
trainer2 = GhostTrainer(
ppo_trainer2, brain_name, controller, 0, dummy_config, True, tmp_path
)
policy = trainer2.create_policy(parsed_behavior_id0, brain_params_team0)
trainer2.add_policy(parsed_behavior_id0, policy)
policy = trainer2.create_policy(parsed_behavior_id1, brain_params_team1)
trainer2.add_policy(parsed_behavior_id1, policy)
trainer1_policy = trainer.get_policy(parsed_behavior_id1.behavior_id)
trainer2_policy = trainer2.get_policy(parsed_behavior_id1.behavior_id)
weights = trainer1_policy.get_weights()
weights2 = trainer2_policy.get_weights()
for w, lw in zip(weights, weights2):
np.testing.assert_array_equal(w, lw)
def test_process_trajectory(dummy_config):
brain_params_team0 = BrainParameters(
brain_name="test_brain?team=0",

95
docs/Versioning.md


# ML-Agents Versioning
## Context
As the ML-Agents project evolves into a more mature product, we want to clearly communicate
the process we use to version our packages and the data that flows into, through, and out of them.
Our project now has four packages (1 Unity, 3 Python) along with artifacts that are produced as
well as consumed. This document covers the versioning for these packages and artifacts.
## GitHub Releases
Up until now, all packages were in lockstep in terms of versioning. As a result, the GitHub releases
were tagged with the version of all those packages (e.g. v0.15.0, v0.15.1) and labeled accordingly.
With the decoupling of package versions, we now need to revisit our GitHub release tagging.
The proposal is that we move towards integer release numbering for our repo, and each such
release will call out the specific version upgrades of each package. For instance, with
[the April 30th release](https://github.com/Unity-Technologies/ml-agents/releases/tag/release_1),
we will have:
- GitHub Release 1 (branch name: *release_1_branch*)
- com.unity.ml-agents release 1.0.0
- ml-agents release 0.16.0
- ml-agents-envs release 0.16.0
- gym-unity release 0.16.0
Our release cadence will not be affected by these versioning changes. We will keep having
monthly releases to fix bugs and release new features.
## Packages
All of the software packages and their generated artifacts will be versioned; automation
tools will not be versioned.
### Unity package
Package name: com.unity.ml-agents
- Versioned following [Semantic Versioning Guidelines](https://www.semver.org)
- This package consumes an artifact of the training process: the `.nn` file. These files
are integer versioned and currently at version 2. The com.unity.ml-agents package
will need to support the version of `.nn` files which existed at its 1.0.0 release.
For example, consider that com.unity.ml-agents is at version 1.0.0 and the NN files
are at version 2. If the NN files change to version 3, the next release of
com.unity.ml-agents, version 1.1.0, guarantees it will be able to read both formats.
If the NN files were then to change to version 4 and com.unity.ml-agents to
version 2.0.0, support for NN versions 2 and 3 could be dropped in 2.0.0
(see the sketch after this list).
- This package produces one kind of artifact, `.demo` files. These files are integer
versioned: their version increments by 1 at each change. The
com.unity.ml-agents package must be backward compatible with version changes
that occur between minor versions.
- To summarize, the artifacts produced and consumed by com.unity.ml-agents are guaranteed
to be supported for 1.x.x versions of com.unity.ml-agents. We intend to provide stability
for our users by moving to a 1.0.0 release of com.unity.ml-agents.
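
To make this guarantee concrete, here is a purely hypothetical sketch in Python (the table and the `can_load` helper are invented for illustration; the package exposes no such API) of the support matrix the list above describes:

```python
# Hypothetical support matrix illustrating the policy above; keys are
# com.unity.ml-agents versions, values are the .nn format versions that
# release is guaranteed to read. Not shipped code.
SUPPORTED_NN_VERSIONS = {
    "1.0.0": {2},     # .nn files were at version 2 when 1.0.0 shipped
    "1.1.0": {2, 3},  # a 1.x release keeps reading version 2 as well
    "2.0.0": {4},     # a major bump may drop support for older formats
}

def can_load(package_version: str, nn_version: int) -> bool:
    return nn_version in SUPPORTED_NN_VERSIONS.get(package_version, set())

assert can_load("1.1.0", 2) and can_load("1.1.0", 3)
assert not can_load("2.0.0", 2)
```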
### Python Packages
Package names: ml-agents / ml-agents-envs / gym-unity
- The python packages remain in "Beta." This means that breaking changes to the public
API of the python packages can be made without a major version bump.
Historically, the python and C# packages were in version lockstep. This is no longer
the case. The python packages will remain in lockstep with each other for now, while the
C# package will follow its own versioning as is appropriate. However, the python package
versions may diverge in the future.
- While the python packages will remain in Beta for now, we acknowledge that the most
heavily used portion of our python interface is the `mlagents-learn` CLI and strive
to make this part of our API backward compatible. We are actively working on this and
expect to have a stable CLI in the next few weeks.
## Communicator
Packages which communicate: com.unity.ml-agents / ml-agents-envs
Another entity of the ML-Agents Toolkit that requires versioning is the communication layer
between C# and Python, which will also follow semantic versioning. This guarantees a level of
backward compatibility between different versions of C# and Python packages which communicate.
Any Communicator version 1.x.x of the Unity package should be compatible with any 1.x.x
Communicator Version in Python.
An RLCapabilities struct keeps track of which features exist. This struct is passed from C# to
Python, and another from Python to C#. With this feature level granularity, we can notify users
more specifically about feature limitations based on what's available in both C# and Python.
These notifications will be logged to the python terminal, or to the Unity Editor Console.
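
As a minimal sketch, assuming plain `major.minor.patch` version strings (this helper is illustrative and not part of the toolkit), the compatibility rule above reduces to comparing major versions:

```python
# Two communicator versions are considered compatible when their major
# versions match; illustrative only, not the toolkit's implementation.
def communicator_compatible(unity_version: str, python_version: str) -> bool:
    return unity_version.split(".")[0] == python_version.split(".")[0]

assert communicator_compatible("1.0.0", "1.3.2")
assert not communicator_compatible("1.0.0", "2.0.0")
```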
## Side Channels
The communicator is what manages data transfer between Unity and Python for the core
training loop. Side Channels are another means of data transfer between Unity and Python.
Side Channels are not versioned, but each is designed to remain backward
compatible. As of today, we provide 4 side channels:
- FloatProperties: shared float data between Unity - Python (bidirectional)
- RawBytes: raw data that can be sent Unity - Python (bidirectional)
- EngineConfig: a set of numeric fields in a pre-defined order sent from Python to Unity
- Stats: (name, value, agg) messages sent from Unity to Python
Aside from the specific implementations of side channels we provide (and use ourselves),
the Side Channel interface is made available for users to create their own custom side
channels. As such, we guarantee that the built-in SideChannel interface between Unity and
Python is backward compatible in packages that share the same major version.
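
For example, a custom channel on the Python side subclasses `SideChannel` from `mlagents_envs` and implements `on_message_received`. The sketch below follows the pattern from the toolkit's custom side channel documentation; the channel UUID is arbitrary (it only has to match the ID used by the corresponding C# channel):

```python
import uuid
from mlagents_envs.side_channel.side_channel import (
    SideChannel,
    IncomingMessage,
    OutgoingMessage,
)

class StringLogChannel(SideChannel):
    """Example custom channel exchanging UTF-8 strings with Unity."""

    def __init__(self) -> None:
        # The UUID must match the channel ID registered on the C# side.
        super().__init__(uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7"))

    def on_message_received(self, msg: IncomingMessage) -> None:
        # Invoked by the communicator when Unity sends data on this channel.
        print(msg.read_string())

    def send_string(self, data: str) -> None:
        msg = OutgoingMessage()
        msg.write_string(data)
        super().queue_message_to_send(msg)
```

The channel is then passed to the environment at construction time, e.g. `UnityEnvironment(side_channels=[StringLogChannel()])`, which is what keeps custom channels decoupled from the core training loop.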

37
com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs


using NUnit.Framework;
using UnityEngine;
using Unity.MLAgents.Policies;
using Unity.MLAgents.Demonstrations;
using Unity.MLAgents.Sensors;
namespace Unity.MLAgents.Tests
{
[TestFixture]
public class GrpcExtensionsTests
{
[Test]
public void TestDefaultBrainParametersToProto()
{
// Should be able to convert a default instance to proto.
var brain = new BrainParameters();
brain.ToProto("foo", false);
}
[Test]
public void TestDefaultAgentInfoToProto()
{
// Should be able to convert a default instance to proto.
var agentInfo = new AgentInfo();
agentInfo.ToInfoActionPairProto();
agentInfo.ToAgentInfoProto();
}
[Test]
public void TestDefaultDemonstrationMetaDataToProto()
{
// Should be able to convert a default instance to proto.
var demoMetaData = new DemonstrationMetaData();
demoMetaData.ToProto();
}
}
}

11
com.unity.ml-agents/Tests/Editor/Communicator/GrpcExtensionsTests.cs.meta


fileFormatVersion: 2
guid: 7aa28d0e370064c18bb8a913417ad21d
MonoImporter:
externalObjects: {}
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

19
.github/workflows/nightly.yml


name: nightly
on:
schedule:
- cron: '0 7 * * *' # run at 7 AM UTC == midnight PST
jobs:
markdown-link-check-full:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v1
- uses: actions/setup-node@v2-beta
with:
node-version: '12'
- run: sudo npm install -g markdown-link-check
- uses: pre-commit/action@v2.0.0
with:
extra_args: --hook-stage manual markdown-link-check-full --all-files

41
.github/workflows/pre-commit.yml


name: pre-commit
on:
pull_request:
push:
branches: [master]
jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v1
- uses: actions/setup-ruby@v1
with:
ruby-version: '2.6'
- uses: actions/setup-dotnet@v1
with:
dotnet-version: '3.1.x'
- run: dotnet tool install -g dotnet-format --version 4.1.131201
- uses: pre-commit/action@v2.0.0
markdown-link-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v1
- uses: actions/setup-node@v2-beta
with:
node-version: '12'
- run: sudo npm install -g markdown-link-check
- uses: pre-commit/action@v2.0.0
with:
extra_args: --hook-stage manual markdown-link-check --all-files
validate-meta-files:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v1
- run: python utils/validate_meta_files.py

66
.github/workflows/pytest.yml


name: pytest
on:
pull_request:
paths: # This action will only run if the PR modifies a file in one of these directories
- 'ml-agents/**'
- 'ml-agents-envs/**'
- 'gym-unity/**'
- 'test_constraints*.txt'
- 'test_requirements.txt'
push:
branches: [master]
jobs:
pytest:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.6.x, 3.7.x, 3.8.x]
include:
- python-version: 3.6.x
pip_constraints: test_constraints_min_version.txt
- python-version: 3.7.x
pip_constraints: test_constraints_max_tf1_version.txt
- python-version: 3.8.x
pip_constraints: test_constraints_max_tf2_version.txt
steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Cache pip
uses: actions/cache@v2
with:
# This path is specific to Ubuntu
path: ~/.cache/pip
# Look to see if there is a cache hit for the corresponding requirements file
key: ${{ runner.os }}-pip-${{ hashFiles('ml-agents/setup.py', 'ml-agents-envs/setup.py', 'gym-unity/setup.py', 'test_requirements.txt', matrix.pip_constraints) }}
restore-keys: |
${{ runner.os }}-pip-
${{ runner.os }}-
- name: Display Python version
run: python -c "import sys; print(sys.version)"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools
python -m pip install --progress-bar=off -e ./ml-agents-envs -c ${{ matrix.pip_constraints }}
python -m pip install --progress-bar=off -e ./ml-agents -c ${{ matrix.pip_constraints }}
python -m pip install --progress-bar=off -r test_requirements.txt -c ${{ matrix.pip_constraints }}
python -m pip install --progress-bar=off -e ./gym-unity -c ${{ matrix.pip_constraints }}
- name: Save python dependencies
run: pip freeze > pip_versions-${{ matrix.python-version }}.txt
- name: Run pytest
run: pytest --cov=ml-agents --cov=ml-agents-envs --cov=gym-unity --cov-report html --junitxml=junit/test-results-${{ matrix.python-version }}.xml -p no:warnings
- name: Upload pytest test results
uses: actions/upload-artifact@v2
with:
name: artifacts-${{ matrix.python-version }}
path: |
htmlcov
pip_versions-${{ matrix.python-version }}.txt
junit/test-results-${{ matrix.python-version }}.xml
# Use always() to always run this step to publish test results when there are test failures
if: ${{ always() }}