
merging master and addressing comments

/develop/rm-rf-new-models
vincentpierre 4 years ago
Current commit
c3699de8
68 files changed, with 4175 insertions and 4349 deletions
  1. 1001
      Project/Assets/ML-Agents/Examples/FoodCollector/Demos/ExpertFood.demo
  2. 62
      Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/FoodCollectorArea.prefab
  3. 52
      Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/GridFoodCollectorArea.prefab
  4. 32
      Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/VisualFoodCollectorArea.prefab
  5. 72
      Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs
  6. 18
      Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity
  7. 3
      Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs
  8. 6
      README.md
  9. 2
      com.unity.ml-agents.extensions/package.json
  10. 23
      com.unity.ml-agents/CHANGELOG.md
  11. 4
      com.unity.ml-agents/Documentation~/com.unity.ml-agents.md
  12. 6
      com.unity.ml-agents/Editor/BrainParametersDrawer.cs
  13. 14
      com.unity.ml-agents/Editor/DemonstrationDrawer.cs
  14. 2
      com.unity.ml-agents/Runtime/Academy.cs
  15. 2
      com.unity.ml-agents/Runtime/Actuators/IActionReceiver.cs
  16. 14
      com.unity.ml-agents/Runtime/Agent.cs
  17. 5
      com.unity.ml-agents/Runtime/Agent.deprecated.cs
  18. 36
      com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs
  19. 12
      com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs
  20. 10
      com.unity.ml-agents/Runtime/Inference/ModelRunner.cs
  21. 24
      com.unity.ml-agents/Runtime/Policies/BarracudaPolicy.cs
  22. 5
      com.unity.ml-agents/Runtime/Policies/BehaviorParameters.cs
  23. 23
      com.unity.ml-agents/Runtime/Policies/BrainParameters.cs
  24. 4
      com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs
  25. 8
      com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs
  26. 2
      com.unity.ml-agents/package.json
  27. 117
      docs/Learning-Environment-Create-New.md
  28. 9
      docs/Learning-Environment-Examples.md
  29. 255
      docs/images/3dball_learning_brain.png
  30. 999
      docs/images/roller-ball-agent.png
  31. 268
      docs/images/team_id.png
  32. 2
      gym-unity/gym_unity/__init__.py
  33. 2
      ml-agents-envs/mlagents_envs/__init__.py
  34. 15
      ml-agents-envs/mlagents_envs/environment.py
  35. 2
      ml-agents/mlagents/trainers/__init__.py
  36. 30
      ml-agents/mlagents/trainers/demo_loader.py
  37. 4
      ml-agents/mlagents/trainers/policy/policy.py
  38. 8
      ml-agents/mlagents/trainers/tests/torch/saver/test_saver.py
  39. 12
      ml-agents/mlagents/trainers/tests/torch/test_policy.py
  40. 2
      ml-agents/mlagents/trainers/tests/torch/test_reward_providers/test_gail.py
  41. 9
      ml-agents/mlagents/trainers/torch/components/bc/module.py
  42. 10
      ml-agents/mlagents/trainers/torch/components/reward_providers/curiosity_reward_provider.py
  43. 10
      ml-agents/mlagents/trainers/torch/components/reward_providers/gail_reward_provider.py
  44. 8
      ml-agents/mlagents/trainers/torch/components/reward_providers/rnd_reward_provider.py
  45. 8
      ml-agents/mlagents/trainers/torch/distributions.py
  46. 26
      ml-agents/mlagents/trainers/torch/networks.py
  47. 467
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.onnx
  48. 14
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.onnx.meta
  49. 1001
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/GridFoodCollector.onnx
  50. 14
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/GridFoodCollector.onnx.meta
  51. 1001
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/VisualFoodCollector.onnx
  52. 14
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/VisualFoodCollector.onnx.meta
  53. 3
      com.unity.ml-agents/Runtime/Analytics.meta
  54. 3
      com.unity.ml-agents/Tests/Editor/Analytics.meta
  55. 89
      com.unity.ml-agents/Runtime/Analytics/Events.cs
  56. 3
      com.unity.ml-agents/Runtime/Analytics/Events.cs.meta
  57. 263
      com.unity.ml-agents/Runtime/Analytics/InferenceAnalytics.cs
  58. 3
      com.unity.ml-agents/Runtime/Analytics/InferenceAnalytics.cs.meta
  59. 68
      com.unity.ml-agents/Tests/Editor/Analytics/InferenceAnalyticsTests.cs
  60. 3
      com.unity.ml-agents/Tests/Editor/Analytics/InferenceAnalyticsTests.cs.meta
  61. 11
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.nn.meta
  62. 305
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.nn
  63. 1001
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/GridFoodCollector.nn
  64. 11
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/GridFoodCollector.nn.meta
  65. 1001
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/VisualFoodCollector.nn
  66. 11
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/VisualFoodCollector.nn.meta

1001
Project/Assets/ML-Agents/Examples/FoodCollector/Demos/ExpertFood.demo
File diff is too large to display

62
Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/FoodCollectorArea.prefab


m_BrainParameters:
VectorObservationSize: 4
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: 36ab3e93020504f48858d0856f939685, type: 3}
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 3210b528a2bc44a86bd6bd1d571070f8, type: 3}
m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &114176228333253036
MonoBehaviour:

myLaser: {fileID: 1081721624670010}
contribute: 1
useVectorObs: 1
useVectorFrozenFlag: 0
--- !u!114 &114725457980523372
MonoBehaviour:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 4
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: 36ab3e93020504f48858d0856f939685, type: 3}
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 3210b528a2bc44a86bd6bd1d571070f8, type: 3}
m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &114711827726849508
MonoBehaviour:

myLaser: {fileID: 1941433838307300}
contribute: 0
useVectorObs: 1
useVectorFrozenFlag: 0
--- !u!114 &114443152683847924
MonoBehaviour:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 4
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: 36ab3e93020504f48858d0856f939685, type: 3}
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 3210b528a2bc44a86bd6bd1d571070f8, type: 3}
m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &114542632553128056
MonoBehaviour:

myLaser: {fileID: 1421240237750412}
contribute: 0
useVectorObs: 1
useVectorFrozenFlag: 0
--- !u!114 &114986980423924774
MonoBehaviour:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 4
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: 36ab3e93020504f48858d0856f939685, type: 3}
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 3210b528a2bc44a86bd6bd1d571070f8, type: 3}
m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &114189751434580810
MonoBehaviour:

myLaser: {fileID: 1617924810425504}
contribute: 0
useVectorObs: 1
useVectorFrozenFlag: 0
--- !u!114 &114644889237473510
MonoBehaviour:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 4
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: 36ab3e93020504f48858d0856f939685, type: 3}
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 3210b528a2bc44a86bd6bd1d571070f8, type: 3}
m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &114235147148547996
MonoBehaviour:

myLaser: {fileID: 1045923826166930}
contribute: 0
useVectorObs: 1
useVectorFrozenFlag: 0
--- !u!114 &114276061479012222
MonoBehaviour:
m_ObjectHideFlags: 0

m_PrefabInstance: {fileID: 0}
m_PrefabAsset: {fileID: 0}
m_GameObject: {fileID: 1819751139121548}
m_LocalRotation: {x: -0, y: -0, z: -0, w: 1}
m_LocalRotation: {x: 0, y: 0, z: 0, w: 1}
m_LocalPosition: {x: 0, y: 12.3, z: 0}
m_LocalScale: {x: 1, y: 1, z: 1}
m_Children:

52
Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/GridFoodCollectorArea.prefab


m_BrainParameters:
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: 699f852e79b5ba642871514fb1fb9843, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 75910f45f20be49b18e2b95879a217b2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: GridFoodCollector

myLaser: {fileID: 1081721624670010}
contribute: 0
useVectorObs: 0
useVectorFrozenFlag: 0
--- !u!114 &8297075921230369060
MonoBehaviour:
m_ObjectHideFlags: 0

- {r: 0, g: 0, b: 0, a: 0}
GizmoYOffset: 0
ShowGizmos: 0
CompressionType: 1
--- !u!1 &1482701732800114
GameObject:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: 699f852e79b5ba642871514fb1fb9843, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 75910f45f20be49b18e2b95879a217b2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: GridFoodCollector

myLaser: {fileID: 1941433838307300}
contribute: 0
useVectorObs: 0
useVectorFrozenFlag: 0
--- !u!114 &259154752087955944
MonoBehaviour:
m_ObjectHideFlags: 0

- {r: 0, g: 0, b: 0, a: 0}
GizmoYOffset: 0
ShowGizmos: 0
CompressionType: 1
--- !u!1 &1528397385587768
GameObject:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: 699f852e79b5ba642871514fb1fb9843, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 75910f45f20be49b18e2b95879a217b2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: GridFoodCollector

myLaser: {fileID: 1421240237750412}
contribute: 0
useVectorObs: 0
useVectorFrozenFlag: 0
--- !u!114 &5519119940433428255
MonoBehaviour:
m_ObjectHideFlags: 0

- {r: 0, g: 0, b: 0, a: 0}
GizmoYOffset: 0
ShowGizmos: 0
CompressionType: 1
--- !u!1 &1617924810425504
GameObject:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: 699f852e79b5ba642871514fb1fb9843, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 75910f45f20be49b18e2b95879a217b2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: GridFoodCollector

myLaser: {fileID: 1617924810425504}
contribute: 0
useVectorObs: 0
useVectorFrozenFlag: 0
--- !u!114 &5884750436653390196
MonoBehaviour:
m_ObjectHideFlags: 0

- {r: 0, g: 0, b: 0, a: 0}
GizmoYOffset: 0
ShowGizmos: 0
CompressionType: 1
--- !u!1 &1688105343773098
GameObject:
m_ObjectHideFlags: 0

m_BrainParameters:
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: 699f852e79b5ba642871514fb1fb9843, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: 75910f45f20be49b18e2b95879a217b2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: GridFoodCollector

myLaser: {fileID: 1045923826166930}
contribute: 0
useVectorObs: 0
useVectorFrozenFlag: 0
--- !u!114 &4768752321433982785
MonoBehaviour:
m_ObjectHideFlags: 0

- {r: 0, g: 0, b: 0, a: 0}
GizmoYOffset: 0
ShowGizmos: 0
CompressionType: 1
--- !u!1 &1729825611722018
GameObject:
m_ObjectHideFlags: 0

m_PrefabInstance: {fileID: 0}
m_PrefabAsset: {fileID: 0}
m_GameObject: {fileID: 1819751139121548}
m_LocalRotation: {x: -0, y: -0, z: -0, w: 1}
m_LocalRotation: {x: 0, y: 0, z: 0, w: 1}
m_LocalPosition: {x: 0, y: 12.3, z: 0}
m_LocalScale: {x: 1, y: 1, z: 1}
m_Children:

32
Project/Assets/ML-Agents/Examples/FoodCollector/Prefabs/VisualFoodCollectorArea.prefab


m_BrainParameters:
VectorObservationSize: 1
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: c3b1eb0bcf06b4c0488599c7ab806de7, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: ec4b31b5d66ca4e51ae3ac41945facb2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: VisualFoodCollector

m_BrainParameters:
VectorObservationSize: 1
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: c3b1eb0bcf06b4c0488599c7ab806de7, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: ec4b31b5d66ca4e51ae3ac41945facb2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: VisualFoodCollector

m_BrainParameters:
VectorObservationSize: 1
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: c3b1eb0bcf06b4c0488599c7ab806de7, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: ec4b31b5d66ca4e51ae3ac41945facb2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: VisualFoodCollector

m_BrainParameters:
VectorObservationSize: 1
NumStackedVectorObservations: 1
VectorActionSize: 03000000030000000300000002000000
m_ActionSpec:
m_NumContinuousActions: 3
BranchSizes: 02000000
VectorActionSize:
m_Model: {fileID: 11400000, guid: c3b1eb0bcf06b4c0488599c7ab806de7, type: 3}
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: ec4b31b5d66ca4e51ae3ac41945facb2, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0
m_BehaviorName: VisualFoodCollector

72
Project/Assets/ML-Agents/Examples/FoodCollector/Scripts/FoodCollectorAgent.cs


return new Color32(r, g, b, 255);
}
public void MoveAgent(ActionSegment<int> act)
public void MoveAgent(ActionBuffers actionBuffers)
{
m_Shoot = false;

var dirToGo = Vector3.zero;
var rotateDir = Vector3.zero;
var continuousActions = actionBuffers.ContinuousActions;
var discreteActions = actionBuffers.DiscreteActions;
var shootCommand = false;
var forwardAxis = (int)act[0];
var rightAxis = (int)act[1];
var rotateAxis = (int)act[2];
var shootAxis = (int)act[3];
switch (forwardAxis)
{
case 1:
dirToGo = transform.forward;
break;
case 2:
dirToGo = -transform.forward;
break;
}
var forward = Mathf.Clamp(continuousActions[0], -1f, 1f);
var right = Mathf.Clamp(continuousActions[1], -1f, 1f);
var rotate = Mathf.Clamp(continuousActions[2], -1f, 1f);
switch (rightAxis)
{
case 1:
dirToGo = transform.right;
break;
case 2:
dirToGo = -transform.right;
break;
}
dirToGo = transform.forward * forward;
dirToGo += transform.right * right;
rotateDir = -transform.up * rotate;
switch (rotateAxis)
{
case 1:
rotateDir = -transform.up;
break;
case 2:
rotateDir = transform.up;
break;
}
switch (shootAxis)
{
case 1:
shootCommand = true;
break;
}
var shootCommand = (int)discreteActions[0] > 0;
if (shootCommand)
{
m_Shoot = true;

public override void OnActionReceived(ActionBuffers actionBuffers)
{
MoveAgent(actionBuffers.DiscreteActions);
MoveAgent(actionBuffers);
var discreteActionsOut = actionsOut.DiscreteActions;
discreteActionsOut[0] = 0;
discreteActionsOut[1] = 0;
discreteActionsOut[2] = 0;
var continuousActionsOut = actionsOut.ContinuousActions;
continuousActionsOut[0] = 0;
continuousActionsOut[1] = 0;
continuousActionsOut[2] = 0;
discreteActionsOut[2] = 2;
continuousActionsOut[2] = 1;
discreteActionsOut[0] = 1;
continuousActionsOut[0] = 1;
discreteActionsOut[2] = 1;
continuousActionsOut[2] = -1;
discreteActionsOut[0] = 2;
continuousActionsOut[0] = -1;
discreteActionsOut[3] = Input.GetKey(KeyCode.Space) ? 1 : 0;
var discreteActionsOut = actionsOut.DiscreteActions;
discreteActionsOut[0] = Input.GetKey(KeyCode.Space) ? 1 : 0;
}
public override void OnEpisodeBegin()
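The hunk above replaces the all-discrete MoveAgent with a hybrid version: three continuous actions for forward/side/rotation movement and one discrete branch for shooting. Below is a minimal, self-contained sketch of an agent consuming that kind of hybrid ActionBuffers; the class name and the moveSpeed/turnSpeed values are illustrative, not from this PR.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public class HybridFoodAgentSketch : Agent
{
    public float moveSpeed = 2f;    // illustrative tuning values, not from this PR
    public float turnSpeed = 180f;
    Rigidbody m_AgentRb;

    public override void Initialize()
    {
        m_AgentRb = GetComponent<Rigidbody>();
    }

    public override void OnActionReceived(ActionBuffers actionBuffers)
    {
        // Continuous branch: forward, strafe and rotation, each clamped to [-1, 1].
        var continuousActions = actionBuffers.ContinuousActions;
        var forward = Mathf.Clamp(continuousActions[0], -1f, 1f);
        var right = Mathf.Clamp(continuousActions[1], -1f, 1f);
        var rotate = Mathf.Clamp(continuousActions[2], -1f, 1f);

        var dirToGo = transform.forward * forward + transform.right * right;
        var rotateDir = -transform.up * rotate;
        transform.Rotate(rotateDir, Time.fixedDeltaTime * turnSpeed);
        m_AgentRb.AddForce(dirToGo * moveSpeed, ForceMode.VelocityChange);

        // Discrete branch: a single branch of size 2, used as a shoot / no-shoot flag.
        var shoot = actionBuffers.DiscreteActions[0] > 0;
        if (shoot)
        {
            // fire the laser here
        }
    }
}
```

The matching Behavior Parameters would declare 3 continuous actions and one discrete branch of size 2, which is what the prefab diffs above encode.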

18
Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity


m_ReflectionIntensity: 1
m_CustomReflection: {fileID: 0}
m_Sun: {fileID: 0}
m_IndirectSpecularColor: {r: 0.44971168, g: 0.4997775, b: 0.57563686, a: 1}
m_IndirectSpecularColor: {r: 0.44971228, g: 0.49977815, b: 0.57563734, a: 1}
m_UseRadianceAmbientProbe: 0
--- !u!157 &3
LightmapSettings:

agentParameters:
maxStep: 100
hasUpgradedFromAgentParameters: 1
maxStep: 100
MaxStep: 100
area: {fileID: 1795599557}
timeBetweenDecisionsAtInference: 0.15
renderCamera: {fileID: 797520692}

m_Name:
m_EditorClassIdentifier:
m_BrainParameters:
vectorObservationSize: 0
numStackedVectorObservations: 1
vectorActionSize: 05000000
vectorActionDescriptions: []
vectorActionSpaceType: 0
VectorObservationSize: 0
NumStackedVectorObservations: 1
VectorActionSize: 05000000
VectorActionDescriptions: []
VectorActionSpaceType: 0
m_Model: {fileID: 11400000, guid: a812f1ce7763a4a0c912717f3594fe20, type: 3}
m_InferenceDevice: 0
m_BehaviorType: 0

m_UseChildActuators: 1
m_ObservableAttributeHandling: 0
--- !u!114 &125487791
MonoBehaviour:
m_ObjectHideFlags: 0

m_RenderTexture: {fileID: 8400000, guid: 114608d5384404f89bff4b6f88432958, type: 2}
m_SensorName: RenderTextureSensor
m_Grayscale: 0
m_ObservationStacks: 1
m_Compression: 1
--- !u!1 &260425459
GameObject:

trueAgent: {fileID: 125487785}
goalPref: {fileID: 1508142483324970, guid: 1ec4e4e96e7514d45b7ebc3ba5a9a481, type: 3}
pitPref: {fileID: 1811317785436014, guid: d13ee2db77b3a4dcc8664d2fe2a0f219, type: 3}
numberOfObstacles: 1
--- !u!4 &1795599558
Transform:
m_ObjectHideFlags: 0

3
Project/Assets/ML-Agents/Examples/GridWorld/Scripts/GridArea.cs


public GameObject goalPref;
public GameObject pitPref;
GameObject[] m_Objects;
public int numberOfObstacles = 1;
GameObject m_Plane;
GameObject m_Sn;

transform.position = m_InitialPosition * (m_ResetParams.GetWithDefault("gridSize", 5f) + 1);
var playersList = new List<int>();
for (var i = 0; i < (int)m_ResetParams.GetWithDefault("numObstacles", 1); i++)
for (var i = 0; i < (int)m_ResetParams.GetWithDefault("numObstacles", numberOfObstacles); i++)
{
playersList.Add(1);
}
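The loop above now falls back to the new numberOfObstacles field when the trainer does not override the "numObstacles" environment parameter. A hedged sketch of that pattern in isolation follows; ObstacleSpawnerSketch is a hypothetical component, not part of this PR.

```csharp
using UnityEngine;
using Unity.MLAgents;

public class ObstacleSpawnerSketch : MonoBehaviour
{
    // Inspector default used when the trainer does not set the "numObstacles"
    // environment parameter; mirrors the new numberOfObstacles field on GridArea.
    public int numberOfObstacles = 1;

    void Start()
    {
        // EnvironmentParameters lets the Python side override values per reset.
        var resetParams = Academy.Instance.EnvironmentParameters;
        var numObstacles = (int)resetParams.GetWithDefault("numObstacles", numberOfObstacles);
        Debug.Log($"Spawning {numObstacles} obstacles");
    }
}
```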

6
README.md


For any other questions or feedback, connect directly with the ML-Agents team at
ml-agents@unity3d.com.
## Privacy
In order to improve the developer experience for Unity ML-Agents Toolkit, we have added in-editor analytics.
Please refer to "Information that is passively collected by Unity" in the
[Unity Privacy Policy](https://unity3d.com/legal/privacy-policy).
## License
[Apache License 2.0](LICENSE)

2
com.unity.ml-agents.extensions/package.json


"unity": "2018.4",
"description": "A source-only package for new features based on ML-Agents",
"dependencies": {
"com.unity.ml-agents": "1.6.0-preview"
"com.unity.ml-agents": "1.7.0-preview"
}
}

23
com.unity.ml-agents/CHANGELOG.md


#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- TensorFlow trainers have been removed, please use the Torch trainers instead. (#4707)
### Minor Changes
#### com.unity.ml-agents / com.unity.ml-agents.extensions (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
### Bug Fixes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
## [1.7.0-preview] - 2020-12-21
### Major Changes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- PyTorch trainers now support training agents with both continuous and discrete action spaces. (#4702)
The `.onnx` models generated by the trainers of this release are incompatible with versions of Barracuda before `1.2.1-preview`. If you upgrade the trainers, you must upgrade the version of the Barracuda package as well (which can be done by upgrading the `com.unity.ml-agents` package).
### Minor Changes

- In order to improve the developer experience for Unity ML-Agents Toolkit, we have added in-editor analytics.
Please refer to "Information that is passively collected by Unity" in the
[Unity Privacy Policy](https://unity3d.com/legal/privacy-policy). (#4677)
- The FoodCollector example environment now uses continuous actions for moving and
discrete actions for shooting. (#4746)
- `ActionSpec.validate_action()` now enforces that `UnityEnvironment.set_action_for_agent()` receives a 1D `np.array`.
- `ActionSpec.validate_action()` now enforces that `UnityEnvironment.set_action_for_agent()` receives a 1D `np.array`. (#4691)
- Removed noisy warnings about API minor version mismatches in both the C# and python code. (#4688)
#### ml-agents / ml-agents-envs / gym-unity (Python)

4
com.unity.ml-agents/Documentation~/com.unity.ml-agents.md


the documentation, you can checkout our [GitHub Repository], which also includes
a number of ways to [connect with us] including our [ML-Agents Forum].
In order to improve the developer experience for Unity ML-Agents Toolkit, we have added in-editor analytics.
Please refer to "Information that is passively collected by Unity" in the
[Unity Privacy Policy](https://unity3d.com/legal/privacy-policy).
[unity ML-Agents Toolkit]: https://github.com/Unity-Technologies/ml-agents
[unity inference engine]: https://docs.unity3d.com/Packages/com.unity.barracuda@latest/index.html
[package manager documentation]: https://docs.unity3d.com/Manual/upm-ui-install.html

6
com.unity.ml-agents/Editor/BrainParametersDrawer.cs


/// to make the custom GUI for.</param>
static void DrawVectorAction(Rect position, SerializedProperty property)
{
EditorGUI.LabelField(position, "Vector Action");
EditorGUI.LabelField(position, "Actions");
position.y += k_LineHeight;
EditorGUI.indentLevel++;
var actionSpecProperty = property.FindPropertyRelative(k_ActionSpecName);

EditorGUI.PropertyField(
position,
continuousActionSize,
new GUIContent("Continuous Action Size", "Length of continuous action vector."));
new GUIContent("Continuous Actions", "Number of continuous actions."));
}
/// <summary>

{
var branchSizes = property.FindPropertyRelative(k_DiscreteBranchSizeName);
var newSize = EditorGUI.IntField(
position, "Discrete Branch Size", branchSizes.arraySize);
position, "Discrete Branches", branchSizes.arraySize);
// This check is here due to:
// https://fogbugz.unity3d.com/f/cases/1246524/

14
com.unity.ml-agents/Editor/DemonstrationDrawer.cs


using System.Text;
using UnityEditor;
using Unity.MLAgents.Demonstrations;
using Unity.MLAgents.Policies;
namespace Unity.MLAgents.Editor

const string k_NumberStepsName = "numberSteps";
const string k_NumberEpisodesName = "numberEpisodes";
const string k_MeanRewardName = "meanReward";
const string k_ActionSpecName = "ActionSpec";
const string k_ActionSpecName = "m_ActionSpec";
const string k_NumDiscreteActionsName = "m_NumDiscreteActions";
const string k_NumDiscreteActionsName = "BranchSizes";
const string k_ShapeName = "shape";

var actSpecProperty = property.FindPropertyRelative(k_ActionSpecName);
var continuousSizeProperty = actSpecProperty.FindPropertyRelative(k_NumContinuousActionsName);
var discreteSizeProperty = actSpecProperty.FindPropertyRelative(k_NumDiscreteActionsName);
var continuousSizeLabel =
continuousSizeProperty.displayName + ": " + continuousSizeProperty.intValue;
var discreteSizeLabel = discreteSizeProperty.displayName + ": " +
discreteSizeProperty.intValue;
var continuousSizeLabel = "Continuous Actions: " + continuousSizeProperty.intValue;
var discreteSizeLabel = "Discrete Action Branches: ";
discreteSizeLabel += discreteSizeProperty == null ? "[]" : BuildIntArrayLabel(discreteSizeProperty);
EditorGUILayout.LabelField(continuousSizeLabel);
EditorGUILayout.LabelField(discreteSizeLabel);
}

2
com.unity.ml-agents/Runtime/Academy.cs


/// Unity package version of com.unity.ml-agents.
/// This must match the version string in package.json and is checked in a unit test.
/// </summary>
internal const string k_PackageVersion = "1.6.0-preview";
internal const string k_PackageVersion = "1.7.0-preview";
const int k_EditorTrainingPort = 5004;

2
com.unity.ml-agents/Runtime/Actuators/IActionReceiver.cs


/// <param name="destination">A float array to pack actions into whose length is greater than or
/// equal to the addition of the Lengths of this objects <see cref="ContinuousActions"/> and
/// <see cref="DiscreteActions"/> segments.</param>
/// [Obsolete("PackActions has been deprecated.")]
[Obsolete("PackActions has been deprecated.")]
public void PackActions(in float[] destination)
{
Debug.Assert(destination.Length >= ContinuousActions.Length + DiscreteActions.Length,

14
com.unity.ml-agents/Runtime/Agent.cs


/// <seealso cref="IActionReceiver.OnActionReceived"/>
public virtual void Heuristic(in ActionBuffers actionsOut)
{
// For backward compatibility
// Disable deprecation warnings so we can call the legacy overload.
#pragma warning disable CS0618
// The default implementation of Heuristic calls the
// obsolete version for backward compatibility
switch (m_PolicyFactory.BrainParameters.VectorActionSpaceType)
{
case SpaceType.Continuous:

actionsOut.ContinuousActions.Clear();
break;
}
#pragma warning restore CS0618
}
/// <summary>

{
m_ActionMasker = new DiscreteActionMasker(actionMask);
}
// Disable deprecation warnings so we can call the legacy overload.
#pragma warning disable CS0618
#pragma warning restore CS0618
}
/// <summary>

{
m_LegacyActionCache = Array.ConvertAll(actions.DiscreteActions.Array, x => (float)x);
}
// Disable deprecation warnings so we can call the legacy overload.
#pragma warning disable CS0618
#pragma warning restore CS0618
}
/// <summary>

5
com.unity.ml-agents/Runtime/Agent.deprecated.cs


/// Deprecated, use <see cref="WriteDiscreteActionMask"/> instead.
/// </summary>
/// <param name="actionMasker"></param>
[Obsolete("CollectDiscreteActionMasks has been deprecated, please use WriteDiscreteActionMask.")]
public virtual void CollectDiscreteActionMasks(DiscreteActionMasker actionMasker)
{
}

/// </summary>
/// <param name="actionsOut"></param>
[Obsolete("The float[] version of Heuristic has been deprecated, please use the ActionBuffers version instead.")]
public virtual void Heuristic(float[] actionsOut)
{
Debug.LogWarning("Heuristic method called but not implemented. Returning placeholder actions.");

/// Deprecated, use <see cref="OnActionReceived(ActionBuffers)"/> instead.
/// </summary>
/// <param name="vectorAction"></param>
[Obsolete("The float[] version of OnActionReceived has been deprecated, please use the ActionBuffers version instead.")]
public virtual void OnActionReceived(float[] vectorAction) { }
/// <summary>

/// The last action that was decided by the Agent (or null if no decision has been made).
/// </returns>
/// <seealso cref="OnActionReceived(ActionBuffers)"/>
// [Obsolete("GetAction has been deprecated, please use GetStoredActionBuffers, Or GetStoredDiscreteActions.")]
[Obsolete("GetAction has been deprecated, please use GetStoredActionBuffers instead.")]
public float[] GetAction()
{
var storedAction = m_Info.storedVectorActions;
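These deprecations steer user code from the float[] overloads toward the ActionBuffers ones. A minimal sketch of the replacement pattern is below; KeyboardAgentSketch and the key bindings are illustrative, not from this PR.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public class KeyboardAgentSketch : Agent
{
    // Replaces the deprecated Heuristic(float[] actionsOut) overload.
    public override void Heuristic(in ActionBuffers actionsOut)
    {
        var continuousOut = actionsOut.ContinuousActions;
        continuousOut[0] = Input.GetAxis("Horizontal"); // assumes the default Unity input axes
        continuousOut[1] = Input.GetAxis("Vertical");

        var discreteOut = actionsOut.DiscreteActions;
        discreteOut[0] = Input.GetKey(KeyCode.Space) ? 1 : 0;
    }

    // Replaces the deprecated OnActionReceived(float[] vectorAction) overload.
    public override void OnActionReceived(ActionBuffers actionBuffers)
    {
        // read actionBuffers.ContinuousActions / actionBuffers.DiscreteActions here
    }
}
```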

36
com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs


/// <param name="isTraining">Whether or not the Brain is training.</param>
public static BrainParametersProto ToProto(this BrainParameters bp, string name, bool isTraining)
{
// Disable deprecation warnings so we can set legacy fields
#pragma warning disable CS0618
var brainParametersProto = new BrainParametersProto
{
VectorActionSpaceTypeDeprecated = (SpaceTypeProto)bp.VectorActionSpaceType,

{
brainParametersProto.VectorActionDescriptionsDeprecated.AddRange(bp.VectorActionDescriptions);
}
#pragma warning restore CS0618
return brainParametersProto;
}

/// <returns>A BrainParameters struct.</returns>
public static BrainParameters ToBrainParameters(this BrainParametersProto bpp)
{
ActionSpec actionSpec;
if (bpp.ActionSpec == null)
{
var spaceType = (SpaceType)bpp.VectorActionSpaceTypeDeprecated;
if (spaceType == SpaceType.Continuous)
{
actionSpec = ActionSpec.MakeContinuous(bpp.VectorActionSizeDeprecated.ToArray()[0]);
}
else
{
actionSpec = ActionSpec.MakeDiscrete(bpp.VectorActionSizeDeprecated.ToArray());
}
}
else
{
actionSpec = ToActionSpec(bpp.ActionSpec);
}
ActionSpec = ToActionSpec(bpp.ActionSpec),
ActionSpec = actionSpec,
};
return bp;
}

{
if (!s_HaveWarnedTrainerCapabilitiesMultiPng)
{
Debug.LogWarning($"Attached trainer doesn't support multiple PNGs. Switching to uncompressed observations for sensor {sensor.GetName()}.");
Debug.LogWarning(
$"Attached trainer doesn't support multiple PNGs. Switching to uncompressed observations for sensor {sensor.GetName()}. " +
"Please find the versions that work best together from our release page: " +
"https://github.com/Unity-Technologies/ml-agents/releases"
);
s_HaveWarnedTrainerCapabilitiesMultiPng = true;
}
compressionType = SensorCompressionType.None;

{
if (!s_HaveWarnedTrainerCapabilitiesMapping)
{
Debug.LogWarning($"The sensor {sensor.GetName()} is using non-trivial mapping and " +
Debug.LogWarning(
$"The sensor {sensor.GetName()} is using non-trivial mapping and " +
"Switching to uncompressed observations.");
"Switching to uncompressed observations. " +
"Please find the versions that work best together from our release page: " +
"https://github.com/Unity-Technologies/ml-agents/releases"
);
s_HaveWarnedTrainerCapabilitiesMapping = true;
}
compressionType = SensorCompressionType.None;

12
com.unity.ml-agents/Runtime/Communicator/RpcCommunicator.cs


}
else if (unityVersion.Minor != pythonVersion.Minor)
{
// Even if we initialize, we still want to check to make sure that we inform users of minor version
// changes. This will surface any features that may not work due to minor version incompatibilities.
Debug.LogWarningFormat(
"WARNING: The communication API versions between Unity and python differ at the minor version level. " +
"Python API: {0}, Unity API: {1} Python Library Version: {2} .\n" +
"This means that some features may not work unless you upgrade the package with the lower version." +
"Please find the versions that work best together from our release page.\n" +
"https://github.com/Unity-Technologies/ml-agents/releases",
pythonApiVersion, unityCommunicationVersion, pythonLibraryVersion
);
// If a feature is used in Unity but not supported in the trainer,
// we will warn at the point it's used. Don't warn here to avoid noise.
}
return true;
}

10
com.unity.ml-agents/Runtime/Inference/ModelRunner.cs


actionSpec, seed, m_TensorAllocator, m_Memories, barracudaModel);
}
public InferenceDevice InferenceDevice
{
get { return m_InferenceDevice; }
}
public NNModel Model
{
get { return m_Model; }
}
static Dictionary<string, Tensor> PrepareBarracudaInputs(IEnumerable<TensorProxy> infInputs)
{
var inputs = new Dictionary<string, Tensor>();

24
com.unity.ml-agents/Runtime/Policies/BarracudaPolicy.cs


List<int[]> m_SensorShapes;
ActionSpec m_ActionSpec;
private string m_BehaviorName;
/// <summary>
/// Whether or not we've tried to send analytics for this model. We only ever try to send once per policy,
/// and do additional deduplication in the analytics code.
/// </summary>
private bool m_AnalyticsSent;
InferenceDevice inferenceDevice)
InferenceDevice inferenceDevice,
string behaviorName
)
m_BehaviorName = behaviorName;
m_ActionSpec = actionSpec;
}

if (!m_AnalyticsSent)
{
m_AnalyticsSent = true;
Analytics.InferenceAnalytics.InferenceModelSet(
m_ModelRunner.Model,
m_BehaviorName,
m_ModelRunner.InferenceDevice,
sensors,
m_ActionSpec
);
}
m_AgentId = info.episodeId;
m_ModelRunner?.PutObservations(info, sensors);
}

5
com.unity.ml-agents/Runtime/Policies/BehaviorParameters.cs


"Either assign a model, or change to a different Behavior Type."
);
}
return new BarracudaPolicy(actionSpec, m_Model, m_InferenceDevice);
return new BarracudaPolicy(actionSpec, m_Model, m_InferenceDevice, m_BehaviorName);
}
case BehaviorType.Default:
if (Academy.Instance.IsCommunicatorOn)

if (m_Model != null)
{
return new BarracudaPolicy(actionSpec, m_Model, m_InferenceDevice);
return new BarracudaPolicy(actionSpec, m_Model, m_InferenceDevice, m_BehaviorName);
}
else
{

}
agent.ReloadPolicy();
}
}
}

23
com.unity.ml-agents/Runtime/Policies/BrainParameters.cs


/// the action.
/// For the discrete action space: the number of branches in the action space.
/// </value>
/// [Obsolete("VectorActionSize has been deprecated, please use ActionSpec instead.")]
[Obsolete("VectorActionSize has been deprecated, please use ActionSpec instead.")]
[FormerlySerializedAs("vectorActionSize")]
public int[] VectorActionSize = new[] { 1 };

/// <summary>
/// (Deprecated) Defines if the action is discrete or continuous.
/// </summary>
/// [Obsolete("VectorActionSpaceType has been deprecated, please use ActionSpec instead.")]
[Obsolete("VectorActionSpaceType has been deprecated, please use ActionSpec instead.")]
[FormerlySerializedAs("vectorActionSpaceType")]
public SpaceType VectorActionSpaceType = SpaceType.Discrete;

/// <summary>
/// (Deprecated) The number of actions specified by this Brain.
/// </summary>
/// [Obsolete("NumActions has been deprecated, please use ActionSpec instead.")]
[Obsolete("NumActions has been deprecated, please use ActionSpec instead.")]
public int NumActions
{
get

/// <returns> A new BrainParameter object with the same values as the original.</returns>
public BrainParameters Clone()
{
// Disable deprecation warnings so we can read/write the old fields.
#pragma warning disable CS0618
return new BrainParameters
{
VectorObservationSize = VectorObservationSize,

VectorActionSize = (int[])VectorActionSize.Clone(),
VectorActionSpaceType = VectorActionSpaceType,
};
#pragma warning restore CS0618
}
/// <summary>

{
if (!hasUpgradedBrainParametersWithActionSpec)
// Disable deprecation warnings so we can read the old fields.
#pragma warning disable CS0618
if (!hasUpgradedBrainParametersWithActionSpec
&& m_ActionSpec.NumContinuousActions == 0
&& m_ActionSpec.BranchSizes == null)
{
if (VectorActionSpaceType == SpaceType.Continuous)
{

m_ActionSpec.NumContinuousActions = 0;
m_ActionSpec.BranchSizes = (int[])VectorActionSize.Clone();
}
hasUpgradedBrainParametersWithActionSpec = true;
hasUpgradedBrainParametersWithActionSpec = true;
#pragma warning restore CS0618
}
/// <summary>

{
// Disable deprecation warnings so we can read the old fields.
#pragma warning disable CS0618
if (m_ActionSpec.NumContinuousActions == 0)
{
VectorActionSize = (int[])ActionSpec.BranchSizes.Clone();

{
VectorActionSize = null;
}
#pragma warning restore CS0618
}
/// <summary>
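The upgrade code above migrates the deprecated VectorActionSize / VectorActionSpaceType fields into m_ActionSpec. As a hedged illustration of the corresponding user-facing move, here is a sketch; BrainParametersMigrationSketch is hypothetical, and in practice the action layout is normally edited in the Behavior Parameters inspector rather than from code.

```csharp
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Policies;

static class BrainParametersMigrationSketch
{
    public static BrainParameters MakeExample()
    {
        var bp = new BrainParameters();

        // Previously the action layout lived in the deprecated fields
        // VectorActionSize / VectorActionSpaceType; they now exist only so the
        // upgrade path above can migrate old serialized assets.

        // The supported way is to describe the space with ActionSpec:
        bp.ActionSpec = ActionSpec.MakeDiscrete(3, 3, 3, 2); // four discrete branches
        // or, for a purely continuous space:
        // bp.ActionSpec = ActionSpec.MakeContinuous(3);

        return bp;
    }
}
```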

4
com.unity.ml-agents/Tests/Editor/Communicator/RpcCommunicatorTests.cs


Assert.IsTrue(RpcCommunicator.CheckCommunicationVersionsAreCompatible(unityVerStr,
pythonVerStr,
pythonPackageVerStr));
// Ensure that a warning was printed.
LogAssert.Expect(LogType.Warning, new Regex("(.\\s)+"));
LogAssert.NoUnexpectedReceived();
unityVerStr = "2.0.0";
Assert.IsFalse(RpcCommunicator.CheckCommunicationVersionsAreCompatible(unityVerStr,

8
com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs


Assert.AreEqual(numSteps, agent1.sensor1.numWriteCalls);
Assert.AreEqual(numSteps, agent1.sensor2.numCompressedCalls);
// Disable deprecation warnings so we can read/write the old fields.
#pragma warning disable CS0618
Assert.AreEqual(
agent1.collectObservationsCallsForEpisode,
agent1.GetStoredActionBuffers().ContinuousActions[0]
);
#pragma warning restore CS0618
}
}

2
com.unity.ml-agents/package.json


{
"name": "com.unity.ml-agents",
"displayName": "ML Agents",
"version": "1.6.0-preview",
"version": "1.7.0-preview",
"unity": "2018.4",
"description": "Use state-of-the-art machine learning to create intelligent character behaviors in any Unity environment (games, robotics, film, etc.).",
"dependencies": {

117
docs/Learning-Environment-Create-New.md


### Create the Floor Plane
1. Right click in Hierarchy window, select 3D Object > Plane.
1. Name the GameObject "Floor."
1. Name the GameObject "Floor".
1. Select the Floor Plane to view its properties in the Inspector window.
1. Set Transform to Position = `(0, 0, 0)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.

### Add the Target Cube
1. Right click in Hierarchy window, select 3D Object > Cube.
1. Name the GameObject "Target"
1. Name the GameObject "Target".
1. Select the Target Cube to view its properties in the Inspector window.
1. Set Transform to Position = `(3, 0.5, 3)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.

### Add the Agent Sphere
1. Right click in Hierarchy window, select 3D Object > Sphere.
1. Name the GameObject "RollerAgent"
1. Name the GameObject "RollerAgent".
1. Select the RollerAgent Sphere to view its properties in the Inspector window.
1. Set Transform to Position = `(0, 0.5, 0)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.

<p align="left">
<img src="images/roller-ball-agent.png"
alt="The Agent GameObject in the Inspector window"
width="400" border="10" />
</p>
### Group into Training Area
Note that the screenshot above includes the `Roller Agent` script, which we will
create in the next section. However, before we do that, we'll first group the
floor, target and agent under a single, empty, GameObject. This will simplify
Group the floor, target and agent under a single, empty, GameObject. This will simplify
<p align="left">
<img src="images/roller-ball-hierarchy.png"
alt="The Hierarchy window"
width="250" border="10" />
</p>
To do so:

1. Drag the Floor, Target, and RollerAgent GameObjects in the Hierarchy into the
TrainingArea GameObject.
<p align="left">
<img src="images/roller-ball-hierarchy.png"
alt="The Hierarchy window"
width="250" border="10" />
</p>
To create the Agent:
To create the Agent Script:
1. Select the RollerAgent GameObject to view it in the Inspector window.
1. Click **Add Component**.

1. In the Unity Project window, double-click the `RollerAgent` script to open it
in your code editor.
1. In the editor, add the `using Unity.MLAgents;` and
`using Unity.MLAgents.Sensors;` statements and then change the base class from
`MonoBehaviour` to `Agent`.
1. Delete the `Update()` method, but we will use the `Start()` function, so
leave it alone for now.
1. Import ML-Agent package by adding
```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;
```
then change the base class from `MonoBehaviour` to `Agent`.
1. Delete `Update()` since we are not using it, but keep `Start()`.
So far, these are the basic steps that you would use to add ML-Agents to any
Unity project. Next, we will add the logic that will let our Agent learn to roll

the Agent (Sphere) attempts to solve the task. Each episode lasts until the
Agent solves the task (i.e. reaches the cube), fails (rolls off the platform)
or times out (takes too long to solve or fail at the task). At the start of each
episode, the `OnEpisodeBegin()` method is called to set-up the environment for a
episode, `OnEpisodeBegin()` is called to set-up the environment for a
In this example, each time the Agent (Sphere) reaches its target (Cube), its
episode ends and the method moves the target (Cube) to a new random location. In
addition, if the Agent rolls off the platform, the `OnEpisodeBegin()` method
puts it back onto the floor.
In this example, each time the Agent (Sphere) reaches its target (Cube), the
episode ends and the target (Cube) is moved to a new random location; and if
the Agent rolls off the platform, it will be put back onto the floor.
These are all handled in `OnEpisodeBegin()`.
To move the target (Cube), we need a reference to its Transform (which stores a
GameObject's position, orientation and scale in the 3D world). To get this

public Transform Target;
public override void OnEpisodeBegin()
{
// If the Agent fell, zero its momentum
// If the Agent fell, zero its momentum
this.rBody.angularVelocity = Vector3.zero;
this.rBody.velocity = Vector3.zero;
this.transform.localPosition = new Vector3( 0, 0.5f, 0);
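The hunk above is cut off; for context, here is a sketch of the full OnEpisodeBegin from the RollerBall tutorial this page documents. The target spawn range follows the standard example and may differ slightly in this revision.

```csharp
using UnityEngine;
using Unity.MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    public Transform Target;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()
    {
        // If the Agent fell, zero its momentum and put it back on the floor.
        if (this.transform.localPosition.y < 0)
        {
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            this.transform.localPosition = new Vector3(0, 0.5f, 0);
        }

        // Move the target to a new random spot on the plane.
        Target.localPosition = new Vector3(Random.value * 8 - 4,
                                           0.5f,
                                           Random.value * 8 - 4);
    }
}
```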

#### Actions
To solve the task of moving towards the target, the Agent (Sphere) needs to be
able to move in the `x` and `z` directions. As such, we will provide 2 actions
to the agent. The first determines the force applied along the x-axis; the
able to move in the `x` and `z` directions. As such, the agent needs 2 actions:
the first determines the force applied along the x-axis; and the
to move in three dimensions, then we would need a third action.
to move in three dimensions, then we would need a third action.)
component, `rBody`, using the `Rigidbody.AddForce` function:
component `rBody`, using `Rigidbody.AddForce()`:
```csharp
Vector3 controlSignal = Vector3.zero;

#### Rewards
Reinforcement learning requires rewards. Assign rewards in the
`OnActionReceived()` function. The learning algorithm uses the rewards assigned
to the Agent during the simulation and learning process to determine whether it
Reinforcement learning requires rewards to signal which decisions are good and
which are bad. The learning algorithm uses the rewards to determine whether it
The RollerAgent calculates the distance to detect when it reaches the target.
When it does, the code calls the `Agent.SetReward()` method to assign a reward
of 1.0 and marks the agent as finished by calling the `EndEpisode()` method on
Rewards are assigned in `OnActionReceived()`. The RollerAgent
calculates the distance to detect when it reaches the target.
When it does, the code calls `Agent.SetReward()` to assign a reward
of 1.0 and marks the agent as finished by calling `EndEpisode()` on
the Agent.
```csharp

#### OnActionReceived()
With the action and reward logic outlined above, the final version of the
`OnActionReceived()` function looks like:
With the action and reward logic outlined above, the final version of
`OnActionReceived()` looks like:
```csharp
public float forceMultiplier = 10;

}
```
Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
public, you can set the value from the Inspector window.
Note the `forceMultiplier` class variable is defined before the method definition.
Since `forceMultiplier` is public, you can set the value from the Inspector window.
## Final Editor Setup
## Final Agent Setup in Editor
Now, that all the GameObjects and ML-Agent components are in place, it is time
to connect everything together in the Unity Editor. This involves changing some
of the Agent Component's properties so that they are compatible with our Agent
code.
Now that all the GameObjects and ML-Agent components are in place, it is time
to connect everything together in the Unity Editor. This involves adding and
setting some of the Agent Component's properties so that they are compatible
with our Agent script.
1. Add the `Decision Requester` script with the Add Component button from the
RollerAgent Inspector.
1. Change **Decision Period** to `10`. For more information on decisions, see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions)
1. Add the `Behavior Parameters` script with the Add Component button from the
RollerAgent Inspector.
1. Modify the Behavior Parameters of the Agent :
- `Behavior Name` to _RollerBall_
1. Add a `Decision Requester` script with the **Add Component** button.
Set the **Decision Period** to `10`. For more information on decisions,
see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions)
1. Add a `Behavior Parameters` script with the **Add Component** button.
Set the Behavior Parameters of the Agent to the following:
- `Behavior Name`: _RollerBall_
- `Vector Action` > `Space Type` = **Continuous**
- `Vector Action` > `Space Size` = 2
- `Actions` > `Continuous Actions` = 2
In the inspector, the `RollerAgent` should look like this now:
<p align="left">
<img src="images/roller-ball-agent.png"
alt="The Agent GameObject in the Inspector window"
width="400" border="5" />
</p>
Now you are ready to test the environment before training.
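For reference, a hedged reconstruction of the final OnActionReceived the tutorial refers to, continuing the RollerAgent sketch above (fields are repeated so the snippet stands alone; the 1.42 distance threshold follows the standard RollerBall example and is not shown in this excerpt):

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    public Transform Target;
    public float forceMultiplier = 10;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void OnActionReceived(ActionBuffers actionBuffers)
    {
        // Actions, size = 2: force applied along x and z.
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = actionBuffers.ContinuousActions[0];
        controlSignal.z = actionBuffers.ContinuousActions[1];
        rBody.AddForce(controlSignal * forceMultiplier);

        // Reached the target: reward 1.0 and end the episode.
        float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);
        if (distanceToTarget < 1.42f)
        {
            SetReward(1.0f);
            EndEpisode();
        }
        // Fell off the platform: end the episode without a reward.
        else if (this.transform.localPosition.y < 0)
        {
            EndEpisode();
        }
    }
}
```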

9
docs/Learning-Environment-Examples.md


agent is frozen and/or shot its laser (2), plus ray-based perception of
objects around agent's forward direction (49; 7 raycast angles with 7
measurements for each).
- Actions: 4 discrete action branches:
- Forward Motion (3 possible actions: Forward, Backwards, No Action)
- Side Motion (3 possible actions: Left, Right, No Action)
- Rotation (3 possible actions: Rotate Left, Rotate Right, No Action)
- Laser (2 possible actions: Laser, No Action)
- Actions:
- 3 continuous actions correspond to Forward Motion, Side Motion and Rotation
- 1 discrete action branch for Laser with 2 possible actions corresponding to
Shoot Laser or No Action
- Visual Observations (Optional): First-person camera per-agent, plus one vector
flag representing the frozen state of the agent. This scene uses a combination
of vector and visual observations and the training will not succeed without

255
docs/images/3dball_learning_brain.png

Before / After
Width: 499 | Height: 384 | Size: 43 KiB

999
docs/images/roller-ball-agent.png
File diff is too large to display

268
docs/images/team_id.png

Before / After
Width: 439 | Height: 249 | Size: 27 KiB

2
gym-unity/gym_unity/__init__.py


# Version of the library that will be used to upload to pypi
__version__ = "0.23.0.dev0"
__version__ = "0.24.0.dev0"
# Git tag that will be checked to determine whether to trigger upload to pypi
__release_tag__ = None

2
ml-agents-envs/mlagents_envs/__init__.py


# Version of the library that will be used to upload to pypi
__version__ = "0.23.0.dev0"
__version__ = "0.24.0.dev0"
# Git tag that will be checked to determine whether to trigger upload to pypi
__release_tag__ = None

15
ml-agents-envs/mlagents_envs/environment.py


elif unity_communicator_version.version[0] != api_version.version[0]:
# Major versions mismatch.
return False
elif unity_communicator_version.version[1] != api_version.version[1]:
# Non-beta minor versions mismatch. Log a warning but allow execution to continue.
logger.warning(
f"WARNING: The communication API versions between Unity and python differ at the minor version level. "
f"Python API: {python_api_version}, Unity API: {unity_communicator_version}.\n"
f"This means that some features may not work unless you upgrade the package with the lower version."
f"Please find the versions that work best together from our release page.\n"
"https://github.com/Unity-Technologies/ml-agents/releases"
)
# Major versions match, so either:
# 1) The versions are identical, in which case there's no compatibility issues
# 2) The Unity version is newer, in which case we'll warn or fail on the Unity side if trying to use
# unsupported features
# 3) The trainer version is newer, in which case new trainer features might be available but unused by C#
# In any of the cases, there's no reason to warn about mismatch here.
logger.info(
f"Connected to Unity environment with package version {unity_package_version} "
f"and communication version {unity_com_ver}"

2
ml-agents/mlagents/trainers/__init__.py


# Version of the library that will be used to upload to pypi
__version__ = "0.23.0.dev0"
__version__ = "0.24.0.dev0"
# Git tag that will be checked to determine whether to trigger upload to pypi
__release_tag__ = None

30
ml-agents/mlagents/trainers/demo_loader.py


demo_raw_buffer["rewards"].append(next_reward)
for i, obs in enumerate(current_obs):
demo_raw_buffer[ObsUtil.get_name_at(i)].append(obs)
# TODO: update the demonstraction files and read from the new proto format
if behavior_spec.action_spec.continuous_size > 0:
demo_raw_buffer["continuous_action"].append(
current_pair_info.action_info.vector_actions_deprecated
)
if behavior_spec.action_spec.discrete_size > 0:
demo_raw_buffer["discrete_action"].append(
current_pair_info.action_info.vector_actions_deprecated
)
if (
len(current_pair_info.action_info.continuous_actions) == 0
and len(current_pair_info.action_info.discrete_actions) == 0
):
if behavior_spec.action_spec.continuous_size > 0:
demo_raw_buffer["continuous_action"].append(
current_pair_info.action_info.vector_actions_deprecated
)