浏览代码

Merge branch 'master' into develop-critic-optimizer

/develop/critic-op-lstm-currentmem
Andrew Cohen 4 年前
当前提交
dc8e8494
共有 69 个文件被更改,包括 5409 次插入248 次删除
  1. 3
      .github/workflows/pytest.yml
  2. 1
      .yamato/com.unity.ml-agents-performance.yml
  3. 4
      .yamato/com.unity.ml-agents-test.yml
  4. 4
      .yamato/compressed-sensor-test.yml
  5. 4
      .yamato/gym-interface-test.yml
  6. 4
      .yamato/python-ll-api-test.yml
  7. 15
      .yamato/test_versions.metafile
  8. 2
      Project/Assets/ML-Agents/Editor/Tests/StandaloneBuildTest.cs
  9. 29
      Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ModelOverrider.cs
  10. 1
      Project/ProjectSettings/TagManager.asset
  11. 18
      com.unity.ml-agents/CHANGELOG.md
  12. 16
      com.unity.ml-agents/Editor/BehaviorParametersEditor.cs
  13. 55
      com.unity.ml-agents/Runtime/Academy.cs
  14. 4
      com.unity.ml-agents/Runtime/Agent.cs
  15. 15
      com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs
  16. 5
      com.unity.ml-agents/Runtime/Communicator/ICommunicator.cs
  17. 147
      com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs
  18. 5
      com.unity.ml-agents/Runtime/Communicator/UnityRLCapabilities.cs
  19. 40
      com.unity.ml-agents/Runtime/Grpc/CommunicatorObjects/Capabilities.cs
  20. 70
      com.unity.ml-agents/Runtime/Inference/BarracudaModelParamLoader.cs
  21. 6
      com.unity.ml-agents/Runtime/Inference/ModelRunner.cs
  22. 21
      com.unity.ml-agents/Runtime/Sensors/BufferSensor.cs
  23. 14
      com.unity.ml-agents/Runtime/Sensors/BufferSensorComponent.cs
  24. 17
      com.unity.ml-agents/Runtime/Sensors/CameraSensor.cs
  25. 2
      com.unity.ml-agents/Runtime/Sensors/IDimensionPropertiesSensor.cs
  26. 14
      com.unity.ml-agents/Runtime/SideChannels/SideChannel.cs
  27. 19
      com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs
  28. 30
      com.unity.ml-agents/Tests/Editor/ParameterLoaderTest.cs
  29. 2
      docs/Installation.md
  30. 51
      docs/Learning-Environment-Design-Agents.md
  31. 27
      docs/Learning-Environment-Examples.md
  32. 2
      docs/Training-Configuration-File.md
  33. 2
      gym-unity/README.md
  34. 11
      ml-agents-envs/mlagents_envs/communicator_objects/capabilities_pb2.py
  35. 6
      ml-agents-envs/mlagents_envs/communicator_objects/capabilities_pb2.pyi
  36. 4
      ml-agents-envs/mlagents_envs/environment.py
  37. 16
      ml-agents-envs/mlagents_envs/rpc_utils.py
  38. 14
      ml-agents-envs/mlagents_envs/tests/test_rpc_utils.py
  39. 40
      ml-agents/mlagents/trainers/ghost/trainer.py
  40. 9
      ml-agents/mlagents/trainers/tests/torch/test_ghost.py
  41. 49
      ml-agents/mlagents/trainers/torch/attention.py
  42. 29
      ml-agents/tests/yamato/training_int_tests.py
  43. 3
      protobuf-definitions/proto/mlagents_envs/communicator_objects/capabilities.proto
  44. 8
      Project/Assets/ML-Agents/Examples/Sorter.meta
  45. 105
      config/ppo/Sorter_curriculum.yaml
  46. 1001
      docs/images/sorter.png
  47. 8
      Project/Assets/ML-Agents/Examples/Sorter/Meshes.meta
  48. 63
      Project/Assets/ML-Agents/Examples/Sorter/Meshes/ArenaWalls.fbx
  49. 247
      Project/Assets/ML-Agents/Examples/Sorter/Meshes/ArenaWalls.fbx.meta
  50. 8
      Project/Assets/ML-Agents/Examples/Sorter/Prefabs.meta
  51. 8
      Project/Assets/ML-Agents/Examples/Sorter/Scenes.meta
  52. 8
      Project/Assets/ML-Agents/Examples/Sorter/Scripts.meta
  53. 8
      Project/Assets/ML-Agents/Examples/Sorter/TFModels.meta
  54. 15
      Project/Assets/ML-Agents/Examples/Sorter/TFModels/Sorter.onnx.meta
  55. 1001
      Project/Assets/ML-Agents/Examples/Sorter/TFModels/Sorter.onnx
  56. 7
      Project/Assets/ML-Agents/Examples/Sorter/Prefabs/Area.prefab.meta
  57. 1001
      Project/Assets/ML-Agents/Examples/Sorter/Prefabs/Area.prefab
  58. 11
      Project/Assets/ML-Agents/Examples/Sorter/Scripts/SorterAgent.cs.meta
  59. 11
      Project/Assets/ML-Agents/Examples/Sorter/Scripts/NumberTile.cs.meta
  60. 34
      Project/Assets/ML-Agents/Examples/Sorter/Scripts/NumberTile.cs
  61. 273
      Project/Assets/ML-Agents/Examples/Sorter/Scripts/SorterAgent.cs
  62. 9
      Project/Assets/ML-Agents/Examples/Sorter/Scenes/Sorter.unity.meta
  63. 1001
      Project/Assets/ML-Agents/Examples/Sorter/Scenes/Sorter.unity

3
.github/workflows/pytest.yml


run: python -c "import sys; print(sys.version)"
- name: Install dependencies
run: |
# pin pip to workaround https://github.com/pypa/pip/issues/9180
python -m pip install pip==20.2
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools
python -m pip install --progress-bar=off -e ./ml-agents-envs
python -m pip install --progress-bar=off -e ./ml-agents

1
.yamato/com.unity.ml-agents-performance.yml


test_editors:
- version: 2019.4
- version: 2020.1
- version: 2020.2
---
{% for editor in test_editors %}

4
.yamato/com.unity.ml-agents-test.yml


enableCodeCoverage: !!bool true
testProject: DevProject
enableNoDefaultPackages: !!bool true
- version: 2020.1
enableCodeCoverage: !!bool true
testProject: DevProject
enableNoDefaultPackages: !!bool true
- version: 2020.2
enableCodeCoverage: !!bool true
testProject: DevProject

4
.yamato/compressed-sensor-test.yml


- .yamato/standalone-build-test.yml#test_linux_standalone_{{ editor.version }}
triggers:
cancel_old_ci: true
{% if editor.extra_test == "sensor" %}
expression: |
(pull_request.target eq "master" OR
pull_request.target match "release.+") AND

pull_request.changes.any match "Project/**" OR
pull_request.changes.any match "ml-agents/**" OR
pull_request.changes.any match "ml-agents/tests/yamato/**" OR
{% endif %}
{% endfor %}

4
.yamato/gym-interface-test.yml


- .yamato/standalone-build-test.yml#test_linux_standalone_{{ editor.version }}
triggers:
cancel_old_ci: true
{% if editor.extra_test == "gym" %}
expression: |
(pull_request.target eq "master" OR
pull_request.target match "release.+") AND

pull_request.changes.any match "ml-agents/**" OR
pull_request.changes.any match "ml-agents/tests/yamato/**" OR
{% endif %}
{% endfor %}

4
.yamato/python-ll-api-test.yml


- .yamato/standalone-build-test.yml#test_linux_standalone_{{ editor.version }}
triggers:
cancel_old_ci: true
{% if editor.extra_test == "llapi" %}
expression: |
(pull_request.target eq "master" OR
pull_request.target match "release.+") AND

pull_request.changes.any match "ml-agents/**" OR
pull_request.changes.any match "ml-agents/tests/yamato/**" OR
{% endif %}
{% endfor %}

15
.yamato/test_versions.metafile


# List of editor versions for standalone-build-test and its dependencies.
# csharp_backcompat_version is used in training-int-tests to determine the
# older package version to run the backwards compat tests against.
# We always run training-int-tests for all versions of the editor
# For each "other" test, we only run it against a single version of the
# editor to reduce the number of yamato jobs
csharp_backcompat_version: 1.0.0
extra_test: llapi
csharp_backcompat_version: 1.0.0
- version: 2020.1
csharp_backcompat_version: 1.0.0
extra_test: gym
# 2020.2 moved the AssetImporters namespace
# but we didn't handle this until 1.2.0
csharp_backcompat_version: 1.2.0
extra_test: sensor

2
Project/Assets/ML-Agents/Editor/Tests/StandaloneBuildTest.cs


scenes,
outputPath,
buildTarget,
BuildOptions.None
BuildOptions.Development
);
var isOk = buildResult.summary.result == BuildResult.Succeeded;
var error = "";

29
Project/Assets/ML-Agents/Examples/SharedAssets/Scripts/ModelOverrider.cs


const string k_CommandLineModelOverrideDirectoryFlag = "--mlagents-override-model-directory";
const string k_CommandLineModelOverrideExtensionFlag = "--mlagents-override-model-extension";
const string k_CommandLineQuitAfterEpisodesFlag = "--mlagents-quit-after-episodes";
const string k_CommandLineQuitAfterSeconds = "--mlagents-quit-after-seconds";
const string k_CommandLineQuitOnLoadFailure = "--mlagents-quit-on-load-failure";
// The attached Agent

// Max episodes to run. Only used if > 0
// Will default to 1 if override models are specified, otherwise 0.
int m_MaxEpisodes;
// Deadline - exit if the time exceeds this
DateTime m_Deadline = DateTime.MaxValue;
int m_NumSteps;
int m_PreviousNumSteps;

void GetAssetPathFromCommandLine()
{
var maxEpisodes = 0;
var timeoutSeconds = 0;
string[] commandLineArgsOverride = null;
if (!string.IsNullOrEmpty(debugCommandLineOverride) && Application.isEditor)
{

{
Int32.TryParse(args[i + 1], out maxEpisodes);
}
else if (args[i] == k_CommandLineQuitAfterSeconds && i < args.Length - 1)
{
Int32.TryParse(args[i + 1], out timeoutSeconds);
}
else if (args[i] == k_CommandLineQuitOnLoadFailure)
{
m_QuitOnLoadFailure = true;

m_MaxEpisodes = maxEpisodes > 0 ? maxEpisodes : 1;
Debug.Log($"setting m_MaxEpisodes to {maxEpisodes}");
}
if (timeoutSeconds > 0)
{
m_Deadline = DateTime.Now + TimeSpan.FromSeconds(timeoutSeconds);
Debug.Log($"setting deadline to {timeoutSeconds} from now.");
}
}
void OnEnable()

EditorApplication.isPlaying = false;
#endif
}
else if (DateTime.Now >= m_Deadline)
{
Debug.Log(
$"Deadline exceeded. " +
$"{TotalCompletedEpisodes}/{m_MaxEpisodes} episodes and " +
$"{TotalNumSteps}/{m_MaxEpisodes * m_Agent.MaxStep} steps completed. Exiting.");
Application.Quit(0);
#if UNITY_EDITOR
EditorApplication.isPlaying = false;
#endif
}
m_NumSteps++;
}

1
Project/ProjectSettings/TagManager.asset


- symbol_O_Goal
- purpleAgent
- purpleGoal
- tile
layers:
- Default
- TransparentFX

18
com.unity.ml-agents/CHANGELOG.md


and this project adheres to
[Semantic Versioning](http://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Major Changes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
### Minor Changes
#### com.unity.ml-agents / com.unity.ml-agents.extensions (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
## [Unreleased]
### Bug Fixes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
## [1.8.0-preview] - 2021-02-17
### Major Changes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)

reduced the amount of memory allocated by approximately 25%. (#4887)
- Removed several memory allocations that happened during inference with discrete actions. (#4922)
- Properly catch permission errors when writing timer files. (#4921)
- Unexpected exceptions during training initialization and shutdown are now logged. If you see
"noisy" logs, please let us know! (#4930, #4935)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- Fixed a bug that would cause an exception when `RunOptions` was deserialized via `pickle`. (#4842)

while waiting for a connection, and raises a better error message if it crashes. (#4880)
- Passing a `-logfile` option in the `--env-args` option to `mlagents-learn` is
no longer overwritten. (#4880)
- The `load_weights` function was being called unnecessarily often in the Ghost Trainer leading to training slowdowns. (#4934)
## [1.7.2-preview] - 2020-12-22

16
com.unity.ml-agents/Editor/BehaviorParametersEditor.cs


// Grab the sensor components, since we need them to determine the observation sizes.
// TODO make these methods of BehaviorParameters
SensorComponent[] sensorComponents;
if (behaviorParameters.UseChildSensors)
{
sensorComponents = behaviorParameters.GetComponentsInChildren<SensorComponent>();
}
else
{
sensorComponents = behaviorParameters.GetComponents<SensorComponent>();
}
var agent = behaviorParameters.gameObject.GetComponent<Agent>();
agent.sensors = new List<ISensor>();
agent.InitializeSensors();
var sensors = agent.sensors.ToArray();
ActuatorComponent[] actuatorComponents;
if (behaviorParameters.UseChildActuators)

// Get the total size of the sensors generated by ObservableAttributes.
// If there are any errors (e.g. unsupported type, write-only properties), display them too.
int observableAttributeSensorTotalSize = 0;
var agent = behaviorParameters.GetComponent<Agent>();
if (agent != null && behaviorParameters.ObservableAttributeHandling != ObservableAttributeOptions.Ignore)
{
List<string> observableErrors = new List<string>();

if (brainParameters != null)
{
var failedChecks = Inference.BarracudaModelParamLoader.CheckModel(
barracudaModel, brainParameters, sensorComponents, actuatorComponents,
barracudaModel, brainParameters, sensors, actuatorComponents,
observableAttributeSensorTotalSize, behaviorParameters.BehaviorType
);
foreach (var check in failedChecks)

55
com.unity.ml-agents/Runtime/Academy.cs


/// <term>1.4.0</term>
/// <description>Support training analytics sent from python trainer to the editor.</description>
/// </item>
/// <item>
/// <term>1.5.0</term>
/// <description>Support variable length observation training.</description>
/// </item>
const string k_ApiVersion = "1.4.0";
const string k_ApiVersion = "1.5.0";
/// <summary>
/// Unity package version of com.unity.ml-agents.

{
// We try to exchange the first message with Python. If this fails, it means
// no Python Process is ready to train the environment. In this case, the
//environment must use Inference.
// environment must use Inference.
bool initSuccessful = false;
var communicatorInitParams = new CommunicatorInitParameters
{
unityCommunicationVersion = k_ApiVersion,
unityPackageVersion = k_PackageVersion,
name = "AcademySingleton",
CSharpCapabilities = new UnityRLCapabilities()
};
var unityRlInitParameters = Communicator.Initialize(
new CommunicatorInitParameters
{
unityCommunicationVersion = k_ApiVersion,
unityPackageVersion = k_PackageVersion,
name = "AcademySingleton",
CSharpCapabilities = new UnityRLCapabilities()
});
UnityEngine.Random.InitState(unityRlInitParameters.seed);
// We might have inference-only Agents, so set the seed for them too.
m_InferenceSeed = unityRlInitParameters.seed;
TrainerCapabilities = unityRlInitParameters.TrainerCapabilities;
TrainerCapabilities.WarnOnPythonMissingBaseRLCapabilities();
initSuccessful = Communicator.Initialize(
communicatorInitParams,
out var unityRlInitParameters
);
if (initSuccessful)
{
UnityEngine.Random.InitState(unityRlInitParameters.seed);
// We might have inference-only Agents, so set the seed for them too.
m_InferenceSeed = unityRlInitParameters.seed;
TrainerCapabilities = unityRlInitParameters.TrainerCapabilities;
TrainerCapabilities.WarnOnPythonMissingBaseRLCapabilities();
}
else
{
Debug.Log($"Couldn't connect to trainer on port {port} using API version {k_ApiVersion}. Will perform inference instead.");
Communicator = null;
}
catch
catch (Exception ex)
Debug.Log($"" +
$"Couldn't connect to trainer on port {port} using API version {k_ApiVersion}. " +
"Will perform inference instead."
);
Debug.Log($"Unexpected exception when trying to initialize communication: {ex}\nWill perform inference instead.");
if (Communicator != null)
{
Communicator.QuitCommandReceived += OnQuitCommandReceived;

4
com.unity.ml-agents/Runtime/Agent.cs


/// </summary>
internal void InitializeSensors()
{
if (m_PolicyFactory == null)
{
m_PolicyFactory = GetComponent<BehaviorParameters>();
}
if (m_PolicyFactory.ObservableAttributeHandling != ObservableAttributeOptions.Ignore)
{
var excludeInherited =

15
com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs


{
observationProto.DimensionProperties.Add((int)dimensionProperties[i]);
}
// Checking trainer compatibility with variable length observations
if (dimensionProperties.Length == 2)
{
if (dimensionProperties[0] == DimensionProperty.VariableSize &&
dimensionProperties[1] == DimensionProperty.None)
{
var trainerCanHandleVarLenObs = Academy.Instance.TrainerCapabilities == null || Academy.Instance.TrainerCapabilities.VariableLengthObservation;
if (!trainerCanHandleVarLenObs)
{
throw new UnityAgentsException("Variable Length Observations are not supported by the trainer");
}
}
}
}
observationProto.Shape.AddRange(shape);

CompressedChannelMapping = proto.CompressedChannelMapping,
HybridActions = proto.HybridActions,
TrainingAnalytics = proto.TrainingAnalytics,
VariableLengthObservation = proto.VariableLengthObservation,
};
}

CompressedChannelMapping = rlCaps.CompressedChannelMapping,
HybridActions = rlCaps.HybridActions,
TrainingAnalytics = rlCaps.TrainingAnalytics,
VariableLengthObservation = rlCaps.VariableLengthObservation,
};
}

5
com.unity.ml-agents/Runtime/Communicator/ICommunicator.cs


/// Sends the academy parameters through the Communicator.
/// Is used by the academy to send the AcademyParameters to the communicator.
/// </summary>
/// <returns>The External Initialization Parameters received.</returns>
/// <returns>Whether the connection was successful.</returns>
UnityRLInitParameters Initialize(CommunicatorInitParameters initParameters);
/// <param name="initParametersOut">The External Initialization Parameters received</param>
bool Initialize(CommunicatorInitParameters initParameters, out UnityRLInitParameters initParametersOut);
/// <summary>
/// Registers a new Brain to the Communicator.

147
com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs


internal static bool CheckCommunicationVersionsAreCompatible(
string unityCommunicationVersion,
string pythonApiVersion,
string pythonLibraryVersion)
string pythonApiVersion
)
{
var unityVersion = new Version(unityCommunicationVersion);
var pythonVersion = new Version(pythonApiVersion);

/// Sends the initialization parameters through the Communicator.
/// Is used by the academy to send initialization parameters to the communicator.
/// </summary>
/// <returns>The External Initialization Parameters received.</returns>
/// <returns>Whether the connection was successful.</returns>
public UnityRLInitParameters Initialize(CommunicatorInitParameters initParameters)
/// <param name="initParametersOut">The External Initialization Parameters received.</param>
public bool Initialize(CommunicatorInitParameters initParameters, out UnityRLInitParameters initParametersOut)
{
var academyParameters = new UnityRLInitializationOutputProto
{

{
RlInitializationOutput = academyParameters
},
out input);
var pythonPackageVersion = initializationInput.RlInitializationInput.PackageVersion;
var pythonCommunicationVersion = initializationInput.RlInitializationInput.CommunicationVersion;
var unityCommunicationVersion = initParameters.unityCommunicationVersion;
TrainingAnalytics.SetTrainerInformation(pythonPackageVersion, pythonCommunicationVersion);
out input
);
}
catch (Exception ex)
{
if (ex is RpcException rpcException)
{
var communicationIsCompatible = CheckCommunicationVersionsAreCompatible(unityCommunicationVersion,
pythonCommunicationVersion,
pythonPackageVersion);
// Initialization succeeded part-way. The most likely cause is a mismatch between the communicator
// API strings, so log an explicit warning if that's the case.
if (initializationInput != null && input == null)
{
if (!communicationIsCompatible)
switch (rpcException.Status.StatusCode)
Debug.LogWarningFormat(
"Communication protocol between python ({0}) and Unity ({1}) have different " +
"versions which make them incompatible. Python library version: {2}.",
pythonCommunicationVersion, initParameters.unityCommunicationVersion,
pythonPackageVersion
);
case StatusCode.Unavailable:
// This is the common case where there's no trainer to connect to.
break;
case StatusCode.DeadlineExceeded:
// We don't currently set a deadline for connection, but likely will in the future.
break;
default:
Debug.Log($"Unexpected gRPC exception when trying to initialize communication: {rpcException}");
break;
else
{
Debug.LogWarningFormat(
"Unknown communication error between Python. Python communication protocol: {0}, " +
"Python library version: {1}.",
pythonCommunicationVersion,
pythonPackageVersion
);
}
throw new UnityAgentsException("ICommunicator.Initialize() failed.");
else
{
Debug.Log($"Unexpected exception when trying to initialize communication: {ex}");
}
initParametersOut = new UnityRLInitParameters();
return false;
catch
var pythonPackageVersion = initializationInput.RlInitializationInput.PackageVersion;
var pythonCommunicationVersion = initializationInput.RlInitializationInput.CommunicationVersion;
TrainingAnalytics.SetTrainerInformation(pythonPackageVersion, pythonCommunicationVersion);
var communicationIsCompatible = CheckCommunicationVersionsAreCompatible(
initParameters.unityCommunicationVersion,
pythonCommunicationVersion
);
// Initialization succeeded part-way. The most likely cause is a mismatch between the communicator
// API strings, so log an explicit warning if that's the case.
if (initializationInput != null && input == null)
var exceptionMessage = "The Communicator was unable to connect. Please make sure the External " +
"process is ready to accept communication with Unity.";
// Check for common error condition and add details to the exception message.
var httpProxy = Environment.GetEnvironmentVariable("HTTP_PROXY");
var httpsProxy = Environment.GetEnvironmentVariable("HTTPS_PROXY");
if (httpProxy != null || httpsProxy != null)
if (!communicationIsCompatible)
{
Debug.LogWarningFormat(
"Communication protocol between python ({0}) and Unity ({1}) have different " +
"versions which make them incompatible. Python library version: {2}.",
pythonCommunicationVersion, initParameters.unityCommunicationVersion,
pythonPackageVersion
);
}
else
exceptionMessage += " Try removing HTTP_PROXY and HTTPS_PROXY from the" +
"environment variables and try again.";
Debug.LogWarningFormat(
"Unknown communication error between Python. Python communication protocol: {0}, " +
"Python library version: {1}.",
pythonCommunicationVersion,
pythonPackageVersion
);
throw new UnityAgentsException(exceptionMessage);
initParametersOut = new UnityRLInitParameters();
return false;
return initializationInput.RlInitializationInput.ToUnityRLInitParameters();
initParametersOut = initializationInput.RlInitializationInput.ToUnityRLInitParameters();
return true;
}
/// <summary>

SendCommandEvent(rlInput.Command);
}
UnityInputProto Initialize(UnityOutputProto unityOutput,
out UnityInputProto unityInput)
UnityInputProto Initialize(UnityOutputProto unityOutput, out UnityInputProto unityInput)
{
#if UNITY_EDITOR || UNITY_STANDALONE_WIN || UNITY_STANDALONE_OSX || UNITY_STANDALONE_LINUX
m_IsOpen = true;

}
return result.UnityInput;
#else
throw new UnityAgentsException(
"You cannot perform training on this platform.");
throw new UnityAgentsException("You cannot perform training on this platform.");
#endif
}

{
return null;
}
try
{
var message = m_Client.Exchange(WrapMessage(unityOutput, 200));

QuitCommandReceived?.Invoke();
return message.UnityInput;
}
catch
catch (Exception ex)
if (ex is RpcException rpcException)
{
// Log more verbose errors if they're something the user can possibly do something about.
switch (rpcException.Status.StatusCode)
{
case StatusCode.Unavailable:
// This can happen when python disconnects. Ignore it to avoid noisy logs.
break;
case StatusCode.ResourceExhausted:
// This happens is the message body is too large. There's no way to
// gracefully handle this, but at least we can show the message and the
// user can try to reduce the number of agents or observation sizes.
Debug.LogError($"GRPC Exception: {rpcException.Message}. Disconnecting from trainer.");
break;
default:
// Other unknown errors. Log at INFO level.
Debug.Log($"GRPC Exception: {rpcException.Message}. Disconnecting from trainer.");
break;
}
}
else
{
// Fall-through for other error types
Debug.LogError($"Communication Exception: {ex.Message}. Disconnecting from trainer.");
}
m_IsOpen = false;
QuitCommandReceived?.Invoke();
return null;

5
com.unity.ml-agents/Runtime/Communicator/UnityRLCapabilities.cs


public bool CompressedChannelMapping;
public bool HybridActions;
public bool TrainingAnalytics;
public bool VariableLengthObservation;
/// <summary>
/// A class holding the capabilities flags for Reinforcement Learning across C# and the Trainer codebase. This

bool concatenatedPngObservations = true,
bool compressedChannelMapping = true,
bool hybridActions = true,
bool trainingAnalytics = true)
bool trainingAnalytics = true,
bool variableLengthObservation = true)
{
BaseRLCapabilities = baseRlCapabilities;
ConcatenatedPngObservations = concatenatedPngObservations;

VariableLengthObservation = variableLengthObservation;
}
/// <summary>

40
com.unity.ml-agents/Runtime/Grpc/CommunicatorObjects/Capabilities.cs


byte[] descriptorData = global::System.Convert.FromBase64String(
string.Concat(
"CjVtbGFnZW50c19lbnZzL2NvbW11bmljYXRvcl9vYmplY3RzL2NhcGFiaWxp",
"dGllcy5wcm90bxIUY29tbXVuaWNhdG9yX29iamVjdHMirwEKGFVuaXR5UkxD",
"dGllcy5wcm90bxIUY29tbXVuaWNhdG9yX29iamVjdHMi0gEKGFVuaXR5UkxD",
"ASgIEhkKEXRyYWluaW5nQW5hbHl0aWNzGAUgASgIQiWqAiJVbml0eS5NTEFn",
"ZW50cy5Db21tdW5pY2F0b3JPYmplY3RzYgZwcm90bzM="));
"ASgIEhkKEXRyYWluaW5nQW5hbHl0aWNzGAUgASgIEiEKGXZhcmlhYmxlTGVu",
"Z3RoT2JzZXJ2YXRpb24YBiABKAhCJaoCIlVuaXR5Lk1MQWdlbnRzLkNvbW11",
"bmljYXRvck9iamVjdHNiBnByb3RvMw=="));
new pbr::GeneratedClrTypeInfo(typeof(global::Unity.MLAgents.CommunicatorObjects.UnityRLCapabilitiesProto), global::Unity.MLAgents.CommunicatorObjects.UnityRLCapabilitiesProto.Parser, new[]{ "BaseRLCapabilities", "ConcatenatedPngObservations", "CompressedChannelMapping", "HybridActions", "TrainingAnalytics" }, null, null, null)
new pbr::GeneratedClrTypeInfo(typeof(global::Unity.MLAgents.CommunicatorObjects.UnityRLCapabilitiesProto), global::Unity.MLAgents.CommunicatorObjects.UnityRLCapabilitiesProto.Parser, new[]{ "BaseRLCapabilities", "ConcatenatedPngObservations", "CompressedChannelMapping", "HybridActions", "TrainingAnalytics", "VariableLengthObservation" }, null, null, null)
}));
}
#endregion

compressedChannelMapping_ = other.compressedChannelMapping_;
hybridActions_ = other.hybridActions_;
trainingAnalytics_ = other.trainingAnalytics_;
variableLengthObservation_ = other.variableLengthObservation_;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}

}
}
/// <summary>Field number for the "variableLengthObservation" field.</summary>
public const int VariableLengthObservationFieldNumber = 6;
private bool variableLengthObservation_;
/// <summary>
/// Support for variable length observations of rank 2
/// </summary>
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public bool VariableLengthObservation {
get { return variableLengthObservation_; }
set {
variableLengthObservation_ = value;
}
}
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public override bool Equals(object other) {
return Equals(other as UnityRLCapabilitiesProto);

if (CompressedChannelMapping != other.CompressedChannelMapping) return false;
if (HybridActions != other.HybridActions) return false;
if (TrainingAnalytics != other.TrainingAnalytics) return false;
if (VariableLengthObservation != other.VariableLengthObservation) return false;
return Equals(_unknownFields, other._unknownFields);
}

if (CompressedChannelMapping != false) hash ^= CompressedChannelMapping.GetHashCode();
if (HybridActions != false) hash ^= HybridActions.GetHashCode();
if (TrainingAnalytics != false) hash ^= TrainingAnalytics.GetHashCode();
if (VariableLengthObservation != false) hash ^= VariableLengthObservation.GetHashCode();
if (_unknownFields != null) {
hash ^= _unknownFields.GetHashCode();
}

if (TrainingAnalytics != false) {
output.WriteRawTag(40);
output.WriteBool(TrainingAnalytics);
}
if (VariableLengthObservation != false) {
output.WriteRawTag(48);
output.WriteBool(VariableLengthObservation);
}
if (_unknownFields != null) {
_unknownFields.WriteTo(output);

if (TrainingAnalytics != false) {
size += 1 + 1;
}
if (VariableLengthObservation != false) {
size += 1 + 1;
}
if (_unknownFields != null) {
size += _unknownFields.CalculateSize();
}

if (other.TrainingAnalytics != false) {
TrainingAnalytics = other.TrainingAnalytics;
}
if (other.VariableLengthObservation != false) {
VariableLengthObservation = other.VariableLengthObservation;
}
_unknownFields = pb::UnknownFieldSet.MergeFrom(_unknownFields, other._unknownFields);
}

}
case 40: {
TrainingAnalytics = input.ReadBool();
break;
}
case 48: {
VariableLengthObservation = input.ReadBool();
break;
}
}

70
com.unity.ml-agents/Runtime/Inference/BarracudaModelParamLoader.cs


/// <param name="brainParameters">
/// The BrainParameters that are used verify the compatibility with the InferenceEngine
/// </param>
/// <param name="sensorComponents">Attached sensor components</param>
/// <param name="sensors">Attached sensor components</param>
SensorComponent[] sensorComponents, ActuatorComponent[] actuatorComponents,
ISensor[] sensors, ActuatorComponent[] actuatorComponents,
int observableAttributeTotalSize = 0,
BehaviorType behaviorType = BehaviorType.Default)
{

}
failedModelChecks.AddRange(
CheckInputTensorPresence(model, brainParameters, memorySize, sensorComponents)
CheckInputTensorPresence(model, brainParameters, memorySize, sensors)
CheckInputTensorShape(model, brainParameters, sensorComponents, observableAttributeTotalSize)
CheckInputTensorShape(model, brainParameters, sensors, observableAttributeTotalSize)
);
failedModelChecks.AddRange(
CheckOutputTensorShape(model, brainParameters, actuatorComponents)

/// <param name="memory">
/// The memory size that the model is expecting.
/// </param>
/// <param name="sensorComponents">Array of attached sensor components</param>
/// <param name="sensors">Array of attached sensor components</param>
/// <returns>
/// A IEnumerable of string corresponding to the failed input presence checks.
/// </returns>

int memory,
SensorComponent[] sensorComponents
ISensor[] sensors
)
{
var failedModelChecks = new List<string>();

// If there are not enough Visual Observation Input compared to what the
// sensors expect.
var visObsIndex = 0;
for (var sensorIndex = 0; sensorIndex < sensorComponents.Length; sensorIndex++)
for (var sensorIndex = 0; sensorIndex < sensors.Length; sensorIndex++)
var sensor = sensorComponents[sensorIndex];
var sensor = sensors[sensorIndex];
if (sensor.GetObservationShape().Length == 3)
{
if (!tensorsNames.Contains(

/// Checks that the shape of the visual observation input placeholder is the same as the corresponding sensor.
/// </summary>
/// <param name="tensorProxy">The tensor that is expected by the model</param>
/// <param name="sensorComponent">The sensor that produces the visual observation.</param>
/// <param name="sensor">The sensor that produces the visual observation.</param>
TensorProxy tensorProxy, SensorComponent sensorComponent)
TensorProxy tensorProxy, ISensor sensor)
var shape = sensorComponent.GetObservationShape();
var shape = sensor.GetObservationShape();
var heightBp = shape[0];
var widthBp = shape[1];
var pixelBp = shape[2];

/// Checks that the shape of the rank 2 observation input placeholder is the same as the corresponding sensor.
/// </summary>
/// <param name="tensorProxy">The tensor that is expected by the model</param>
/// <param name="sensorComponent">The sensor that produces the visual observation.</param>
/// <param name="sensor">The sensor that produces the visual observation.</param>
TensorProxy tensorProxy, SensorComponent sensorComponent)
TensorProxy tensorProxy, ISensor sensor)
var shape = sensorComponent.GetObservationShape();
var shape = sensor.GetObservationShape();
var dim1Bp = shape[0];
var dim2Bp = shape[1];
var dim1T = tensorProxy.Channels;

/// <param name="brainParameters">
/// The BrainParameters that are used verify the compatibility with the InferenceEngine
/// </param>
/// <param name="sensorComponents">Attached sensors</param>
/// <param name="sensors">Attached sensors</param>
Model model, BrainParameters brainParameters, SensorComponent[] sensorComponents,
Model model, BrainParameters brainParameters, ISensor[] sensors,
new Dictionary<string, Func<BrainParameters, TensorProxy, SensorComponent[], int, string>>()
new Dictionary<string, Func<BrainParameters, TensorProxy, ISensor[], int, string>>()
{
{TensorNames.VectorObservationPlaceholder, CheckVectorObsShape},
{TensorNames.PreviousActionPlaceholder, CheckPreviousActionShape},

}
var visObsIndex = 0;
for (var sensorIndex = 0; sensorIndex < sensorComponents.Length; sensorIndex++)
for (var sensorIndex = 0; sensorIndex < sensors.Length; sensorIndex++)
var sensorComponent = sensorComponents[sensorIndex];
if (sensorComponent.GetObservationShape().Length == 3)
var sens = sensors[sensorIndex];
if (sens.GetObservationShape().Length == 3)
(bp, tensor, scs, i) => CheckVisualObsShape(tensor, sensorComponent);
(bp, tensor, scs, i) => CheckVisualObsShape(tensor, sens);
if (sensorComponent.GetObservationShape().Length == 2)
if (sens.GetObservationShape().Length == 2)
(bp, tensor, scs, i) => CheckRankTwoObsShape(tensor, sensorComponent);
(bp, tensor, scs, i) => CheckRankTwoObsShape(tensor, sens);
}
}

else
{
var tester = tensorTester[tensor.name];
var error = tester.Invoke(brainParameters, tensor, sensorComponents, observableAttributeTotalSize);
var error = tester.Invoke(brainParameters, tensor, sensors, observableAttributeTotalSize);
if (error != null)
{
failedModelChecks.Add(error);

/// The BrainParameters that are used verify the compatibility with the InferenceEngine
/// </param>
/// <param name="tensorProxy">The tensor that is expected by the model</param>
/// <param name="sensorComponents">Array of attached sensor components</param>
/// <param name="sensors">Array of attached sensor components</param>
/// <param name="observableAttributeTotalSize">Sum of the sizes of all ObservableAttributes.</param>
/// <returns>
/// If the Check failed, returns a string containing information about why the

BrainParameters brainParameters, TensorProxy tensorProxy, SensorComponent[] sensorComponents,
BrainParameters brainParameters, TensorProxy tensorProxy, ISensor[] sensors,
int observableAttributeTotalSize)
{
var vecObsSizeBp = brainParameters.VectorObservationSize;

var totalVectorSensorSize = 0;
foreach (var sensorComp in sensorComponents)
foreach (var sens in sensors)
if (sensorComp.GetObservationShape().Length == 1)
if ((sens.GetObservationShape().Length == 1))
totalVectorSensorSize += sensorComp.GetObservationShape()[0];
totalVectorSensorSize += sens.GetObservationShape()[0];
totalVectorSensorSize += observableAttributeTotalSize;
if (vecObsSizeBp * numStackedVector + totalVectorSensorSize != totalVecObsSizeT)
if (totalVectorSensorSize != totalVecObsSizeT)
foreach (var sensorComp in sensorComponents)
foreach (var sensorComp in sensors)
{
if (sensorComp.GetObservationShape().Length == 1)
{

$"but received: \n" +
$"Vector observations: {vecObsSizeBp} x {numStackedVector}\n" +
$"Total [Observable] attributes: {observableAttributeTotalSize}\n" +
$"SensorComponent sizes: {sensorSizes}.";
$"Sensor sizes: {sensorSizes}.";
}
return null;
}

/// The BrainParameters that are used verify the compatibility with the InferenceEngine
/// </param>
/// <param name="tensorProxy"> The tensor that is expected by the model</param>
/// <param name="sensorComponents">Array of attached sensor components (unused).</param>
/// <param name="sensors">Array of attached sensor components (unused).</param>
SensorComponent[] sensorComponents, int observableAttributeTotalSize)
ISensor[] sensors, int observableAttributeTotalSize)
{
var numberActionsBp = brainParameters.ActionSpec.NumDiscreteActions;
var numberActionsT = tensorProxy.shape[tensorProxy.shape.Length - 1];

6
com.unity.ml-agents/Runtime/Inference/ModelRunner.cs


SensorShapeValidator m_SensorShapeValidator = new SensorShapeValidator();
bool m_VisualObservationsInitialized;
bool m_ObservationsInitialized;
/// <summary>
/// Initializes the Brain with the Model that it will use when selecting actions for

{
return;
}
if (!m_VisualObservationsInitialized)
if (!m_ObservationsInitialized)
m_VisualObservationsInitialized = true;
m_ObservationsInitialized = true;
}
Profiler.BeginSample("ModelRunner.DecideAction");

21
com.unity.ml-agents/Runtime/Sensors/BufferSensor.cs


namespace Unity.MLAgents.Sensors
{
internal class BufferSensor : ISensor, IDimensionPropertiesSensor, IBuiltInSensor
/// <summary>
/// A Sensor that allows to observe a variable number of entities.
/// </summary>
public class BufferSensor : ISensor, IDimensionPropertiesSensor, IBuiltInSensor
static DimensionProperty[] s_DimensionProperties = new DimensionProperty[]{
DimensionProperty.VariableSize,
DimensionProperty.None
};
public BufferSensor(int maxNumberObs, int obsSize)
{
m_MaxNumObs = maxNumberObs;

/// <inheritdoc/>
public DimensionProperty[] GetDimensionProperties()
{
return new DimensionProperty[]{
DimensionProperty.VariableSize,
DimensionProperty.None
};
return s_DimensionProperties;
}
/// <summary>

/// <param name="obs"> The float array observation</param>
public void AppendObservation(float[] obs)
{
if (obs.Length != m_ObsSize)
{
throw new UnityAgentsException(
"The BufferSensor was expecting an observation of size " +
$"{m_ObsSize} but received {obs.Length} observations instead."
);
}
if (m_CurrentNumObservables >= m_MaxNumObs)
{
return;

14
com.unity.ml-agents/Runtime/Sensors/BufferSensorComponent.cs


{
/// <summary>
/// A component for BufferSensor.
/// A SensorComponent that creates a <see cref="BufferSensor"/>.
internal class BufferSensorComponent : SensorComponent
public class BufferSensorComponent : SensorComponent
/// <summary>
/// This is how many floats each entities will be represented with. This number
/// is fixed and all entities must have the same representation.
/// </summary>
/// <summary>
/// This is the maximum number of entities the `BufferSensor` will be able to
/// collect.
/// </summary>
private BufferSensor m_Sensor;
/// <inheritdoc/>

17
com.unity.ml-agents/Runtime/Sensors/CameraSensor.cs


/// <summary>
/// A sensor that wraps a Camera object to generate visual observations for an agent.
/// </summary>
public class CameraSensor : ISensor, IBuiltInSensor
public class CameraSensor : ISensor, IBuiltInSensor, IDimensionPropertiesSensor
{
Camera m_Camera;
int m_Width;

int[] m_Shape;
SensorCompressionType m_CompressionType;
static DimensionProperty[] s_DimensionProperties = new DimensionProperty[] {
DimensionProperty.TranslationalEquivariance,
DimensionProperty.TranslationalEquivariance,
DimensionProperty.None };
/// <summary>
/// The Camera used for rendering the sensor observations.

public int[] GetObservationShape()
{
return m_Shape;
}
/// <summary>
/// Accessor for the dimension properties of a camera sensor. A camera sensor
/// Has translational equivariance along width and hight and no property along
/// the channels dimension.
/// </summary>
/// <returns></returns>
public DimensionProperty[] GetDimensionProperties()
{
return s_DimensionProperties;
}
/// <summary>

2
com.unity.ml-agents/Runtime/Sensors/IDimensionPropertiesSensor.cs


/// The Dimension property flags of the observations
/// </summary>
[System.Flags]
internal enum DimensionProperty
public enum DimensionProperty
{
/// <summary>
/// No properties specified.

14
com.unity.ml-agents/Runtime/SideChannels/SideChannel.cs


using System.Collections.Generic;
using System;
using UnityEngine;
namespace Unity.MLAgents.SideChannels
{

internal void ProcessMessage(byte[] msg)
{
using (var incomingMsg = new IncomingMessage(msg))
try
{
using (var incomingMsg = new IncomingMessage(msg))
{
OnMessageReceived(incomingMsg);
}
}
catch (Exception ex)
OnMessageReceived(incomingMsg);
// Catch all errors in the sidechannel processing, so that a single
// bad SideChannel implementation doesn't take everything down with it.
Debug.LogError($"Error processing SideChannel message: {ex}.\nThe message will be skipped.");
}
}

19
com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs


{
var unityVerStr = "1.0.0";
var pythonVerStr = "1.0.0";
var pythonPackageVerStr = "0.16.0";
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
pythonVerStr,
pythonPackageVerStr));
pythonVerStr));
}
}

30
com.unity.ml-agents/Tests/Editor/ParameterLoaderTest.cs


var errors = BarracudaModelParamLoader.CheckModel(
model, validBrainParameters,
new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]
new ISensor[] { new VectorSensor(8), sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]
);
Assert.AreEqual(0, errors.Count()); // There should not be any errors
}

var errors = BarracudaModelParamLoader.CheckModel(
model, validBrainParameters,
new SensorComponent[] { sensor_21_20_3 }, new ActuatorComponent[0]
new ISensor[] { sensor_21_20_3.CreateSensor() }, new ActuatorComponent[0]
);
Assert.AreEqual(0, errors.Count()); // There should not be any errors
}

var errors = BarracudaModelParamLoader.CheckModel(
model, validBrainParameters,
new SensorComponent[] { }, new ActuatorComponent[0]
new ISensor[] { new VectorSensor(validBrainParameters.VectorObservationSize) }, new ActuatorComponent[0]
);
Assert.AreEqual(0, errors.Count()); // There should not be any errors
}

brainParameters.VectorObservationSize = 9; // Invalid observation
var errors = BarracudaModelParamLoader.CheckModel(
model, brainParameters,
new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]
new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]
);
Assert.Greater(errors.Count(), 0);

model, brainParameters,
new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]
new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]
);
Assert.Greater(errors.Count(), 0);
}

var brainParameters = GetDiscrete1vis0vec_2_3action_recurrModelBrainParameters();
brainParameters.VectorObservationSize = 1; // Invalid observation
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3 }, new ActuatorComponent[0]);
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor() }, new ActuatorComponent[0]);
Assert.Greater(errors.Count(), 0);
}

brainParameters.VectorObservationSize = 9; // Invalid observation
var errors = BarracudaModelParamLoader.CheckModel(
model, brainParameters,
new SensorComponent[] { }, new ActuatorComponent[0]
new ISensor[] { }, new ActuatorComponent[0]
);
Assert.Greater(errors.Count(), 0);

model, brainParameters,
new SensorComponent[] { }, new ActuatorComponent[0]
new ISensor[] { }, new ActuatorComponent[0]
);
Assert.Greater(errors.Count(), 0);
}

var brainParameters = GetContinuous2vis8vec2actionBrainParameters();
brainParameters.ActionSpec = ActionSpec.MakeContinuous(3); // Invalid action
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]);
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]);
Assert.Greater(errors.Count(), 0);
}

var brainParameters = GetDiscrete1vis0vec_2_3action_recurrModelBrainParameters();
brainParameters.ActionSpec = ActionSpec.MakeDiscrete(3, 3); // Invalid action
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3 }, new ActuatorComponent[0]);
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor() }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3 }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor() }, new ActuatorComponent[0]);
Assert.Greater(errors.Count(), 0);
}

var brainParameters = GetHybridBrainParameters();
brainParameters.ActionSpec = new ActionSpec(3, new[] { 3 }); // Invalid discrete action size
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]);
var errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]);
errors = BarracudaModelParamLoader.CheckModel(model, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]);
Assert.Greater(errors.Count(), 0);
}

var brainParameters = GetContinuous2vis8vec2actionBrainParameters();
var errors = BarracudaModelParamLoader.CheckModel(null, brainParameters, new SensorComponent[] { sensor_21_20_3, sensor_20_22_3 }, new ActuatorComponent[0]);
var errors = BarracudaModelParamLoader.CheckModel(null, brainParameters, new ISensor[] { sensor_21_20_3.CreateSensor(), sensor_20_22_3.CreateSensor() }, new ActuatorComponent[0]);
Assert.Greater(errors.Count(), 0);
}
}

2
docs/Installation.md


installing ML-Agents. Activate your virtual environment and run from the command line:
```sh
pip3 install torch==1.7.0 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install torch~=1.7.1 -f https://download.pytorch.org/whl/torch_stable.html
```
Note that on Windows, you may also need Microsoft's

51
docs/Learning-Environment-Design-Agents.md


- [Visual Observation Summary & Best Practices](#visual-observation-summary--best-practices)
- [Raycast Observations](#raycast-observations)
- [RayCast Observation Summary & Best Practices](#raycast-observation-summary--best-practices)
- [Variable Length Observations](#variable-length-observations)
- [Variable Length Observation Summary & Best Practices](#variable-length-observation-summary--best-practices)
- [Actions and Actuators](#actions-and-actuators)
- [Continuous Actions](#continuous-actions)
- [Discrete Actions](#discrete-actions)

for the agent that doesn't require a fully rendered image to convey.
- Use as few rays and tags as necessary to solve the problem in order to improve
learning stability and agent performance.
### Variable Length Observations
It is possible for agents to collect observations from a varying number of
GameObjects by using a `BufferSensor`.
You can add a `BufferSensor` to your Agent by adding a `BufferSensorComponent` to
its GameObject.
The `BufferSensor` can be useful in situations in which the Agent must pay
attention to a varying number of entities (for example, a varying number of
enemies or projectiles).
On the trainer side, the `BufferSensor`
is processed using an attention module. More information about attention
mechanisms can be found [here](https://arxiv.org/abs/1706.03762). Training or
doing inference with variable length observations can be slower than using
a flat vector observation. However, attention mechanisms enable solving
problems that require comparative reasoning between entities in a scene
such as our [Sorter environment](Learning-Environment-Examples.md#sorter).
Note that even though the `BufferSensor` can process a variable number of
entities, you still need to define a maximum number of entities. This is
because our network architecture requires to know what the shape of the
observations will be. If fewer entities are observed than the maximum, the
observation will be padded with zeros and the trainer will ignore
the padded observations. Note that attention layers are invariant to
the order of the entities, so there is no need to properly "order" the
entities before feeding them into the `BufferSensor`.
The the `BufferSensorComponent` Editor inspector have two arguments:
- `Observation Size` : This is how many floats each entities will be
represented with. This number is fixed and all entities must
have the same representation. For example, if the entities you want to
put into the `BufferSensor` have for relevant information position and
speed, then the `Observation Size` should be 6 floats.
- `Maximum Number of Entities` : This is the maximum number of entities
the `BufferSensor` will be able to collect.
To add an entity's observations to a `BufferSensorComponent`, you need
to call `BufferSensorComponent.AppendObservation()`
with a float array of size `Observation Size` as argument.
__Note__: Currently, the observations put into the `BufferSensor` are
not normalized, you will need to normalize your observations manually
between -1 and 1.
#### Variable Length Observation Summary & Best Practices
- Attach `BufferSensorComponent` to use.
- Call `BufferSensorComponent.AppendObservation()` to add the observations
of an entity to the `BufferSensor`.
- Normalize the entities observations before feeding them into the `BufferSensor`.
## Actions and Actuators

27
docs/Learning-Environment-Examples.md


- 37.6 for vector observations
- 34.2 for simple heuristic (pick a random valid move)
- 37.0 for greedy heuristic (pick the highest-scoring valid move)
## Sorter
![Sorter](images/sorter.png)
- Set-up: The Agent is in a circular room with numbered tiles. The values of the
tiles are random between 1 and 20. The tiles present in the room are randomized
at each episode. When the Agent visits a tile, it turns green.
- Goal: Visit all the tiles in ascending order.
- Agents: The environment contains a single Agent
- Agent Reward Function:
- -.0002 Existential penalty.
- +1 For visiting the right tile
- -1 For visiting the wrong tile
- BehaviorParameters:
- Vector Observations : 4 : 2 floats for Position and 2 floats for orientation
- Variable Length Observations : Between 1 and 20 entities (one for each tile)
each with 22 observations, the first 20 are one hot encoding of the value of the tile,
the 21st and 22nd represent the position of the tile relative to the Agent and the 23rd
is `1` if the tile was visited and `0` otherwise.
- Actions: 3 discrete branched actions corresponding to forward, backward,
sideways movement, as well as rotation.
- Float Properties: One
- num_tiles: The maximum number of tiles to sample.
- Default: 2
- Recommended Minimum: 1
- Recommended Maximum: 20
- Benchmark Mean Reward: Depends on the number of tiles.

2
docs/Training-Configuration-File.md


- LSTM does not work well with continuous actions. Please use
discrete actions for better results.
- Since the memories must be sent back and forth between Python and Unity, using
too large `memory_size` will slow down training.
- Adding a recurrent layer increases the complexity of the neural network, it is
recommended to decrease `num_layers` when using recurrent.
- It is required that `memory_size` be divisible by 2.

2
gym-unity/README.md


def main():
unity_env = UnityEnvironment("./envs/GridWorld")
env = UnityToGymWrapper(unity_env, 0, uint8_visual=True)
logger.configure('./logs') # Çhange to log in a different directory
logger.configure('./logs') # Change to log in a different directory
act = deepq.learn(