Compare commits

...
This merge request contains changes that conflict with the target branch:
/com.unity.ml-agents/package.json
/com.unity.ml-agents/CHANGELOG.md
/ml-agents/mlagents/trainers/settings.py
/ml-agents/mlagents/trainers/sac/optimizer_torch.py

9 commits

Author         SHA1      Message                                                       Date
vincentpierre  4bde393e  Got the walker to walk different based on diversity setting  3 years ago
vincentpierre  5985959d  Got 2 modes on Wlker I think                                  3 years ago
vincentpierre  4289a6dd  _                                                             3 years ago
vincentpierre  8450b154  -                                                             3 years ago
vincentpierre  b4f30613  Adding a variational version                                  3 years ago
vincentpierre  7c74c967  _                                                             4 years ago
vincentpierre  47fa1682  -                                                             4 years ago
vincentpierre  bf8acbb0  -                                                             4 years ago
vincentpierre  8da21669  Adding some changes                                           4 years ago
36 files changed, with 9090 insertions and 89 deletions
  1. config/ppo/Pyramids.yaml (13 changes)
  2. config/ppo/Walker.yaml (10 changes)
  3. config/ppo/GridWorld.yaml (2 changes)
  4. config/sac/Pyramids.yaml (17 changes)
  5. config/sac/Walker.yaml (5 changes)
  6. com.unity.ml-agents/CHANGELOG.md (2 changes)
  7. com.unity.ml-agents/package.json (2 changes)
  8. Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockArea.prefab (37 changes)
  9. Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs (17 changes)
  10. Project/Assets/ML-Agents/Examples/Pyramids/Prefabs/AreaPB.prefab (79 changes)
  11. Project/Assets/ML-Agents/Examples/Pyramids/Scripts/PyramidAgent.cs (11 changes)
  12. Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs (17 changes)
  13. Project/Assets/ML-Agents/Examples/Walker/Prefabs/Platforms/Platform.prefab (70 changes)
  14. Project/Assets/ML-Agents/Examples/Walker/Prefabs/Ragdoll/WalkerRagdoll.prefab (111 changes)
  15. Project/Assets/ML-Agents/Examples/Walker/Scenes/Walker.unity (207 changes)
  16. ml-agents/mlagents/trainers/settings.py (8 changes)
  17. ml-agents/mlagents/trainers/torch/components/reward_providers/__init__.py (3 changes)
  18. ml-agents/mlagents/trainers/torch/components/reward_providers/reward_provider_factory.py (4 changes)
  19. ml-agents/mlagents/trainers/sac/optimizer_torch.py (311 changes)
  20. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-10M.onnx (1001 changes)
  21. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-10M.onnx.meta (14 changes)
  22. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-no-extrinsic.onnx (1001 changes)
  23. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-no-extrinsic.onnx.meta (14 changes)
  24. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r02-bigger.onnx (1001 changes)
  25. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r02-bigger.onnx.meta (14 changes)
  26. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r05-bigger.onnx (1001 changes)
  27. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r05-bigger.onnx.meta (14 changes)
  28. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-extrinsic-log-diverse.onnx (1001 changes)
  29. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-extrinsic-log-diverse.onnx.meta (14 changes)
  30. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-new-reward-1.onnx (1001 changes)
  31. Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-new-reward-1.onnx.meta (14 changes)
  32. Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty.onnx (1001 changes)
  33. Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty.onnx.meta (14 changes)
  34. Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty-01strength.onnx (1001 changes)
  35. Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty-01strength.onnx.meta (14 changes)
  36. ml-agents/mlagents/trainers/torch/components/reward_providers/diverse_reward_provider.py (133 changes)

config/ppo/Pyramids.yaml (13 changes)


hidden_units: 512
num_layers: 2
vis_encode_type: simple
goal_conditioning_type: none
curiosity:
# curiosity:
# gamma: 0.99
# strength: 0.02
# network_settings:
# hidden_units: 256
# learning_rate: 0.0003
diverse:
strength: 0.02
network_settings:
hidden_units: 256
strength: 0.1
learning_rate: 0.0003
keep_checkpoints: 5
max_steps: 10000000
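
For context: ML-Agents composes the return from all entries under reward_signals, each scaled by its strength (and discounted with its own gamma), so this hunk adds a diversity bonus on top of the extrinsic reward; schematically, with the strength shown above:

    R_t = r_t^{\text{extrinsic}} + s_{\text{diverse}} \, r_t^{\text{diverse}}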

config/ppo/Walker.yaml (10 changes)


hidden_units: 512
num_layers: 3
vis_encode_type: simple
goal_conditioning_type: none
diverse:
gamma: 0.99
strength: 0.1
network_settings:
normalize: false
hidden_units: 512
num_layers: 3
vis_encode_type: simple
goal_conditioning_type: none
keep_checkpoints: 5
max_steps: 30000000
time_horizon: 1000

config/ppo/GridWorld.yaml (2 changes)


normalize: false
hidden_units: 128
num_layers: 1
vis_encode_type: simple
vis_encode_type: fully_connected
reward_signals:
extrinsic:
gamma: 0.9

config/sac/Pyramids.yaml (17 changes)


hidden_units: 512
num_layers: 3
vis_encode_type: simple
goal_conditioning_type: none
gail:
gamma: 0.99
strength: 0.01
learning_rate: 0.0003
use_actions: true
use_vail: false
demo_path: Project/Assets/ML-Agents/Examples/Pyramids/Demos/ExpertPyramid.demo
# gail:
# gamma: 0.99
# strength: 0.01
# learning_rate: 0.0003
# use_actions: true
# use_vail: false
# demo_path: Project/Assets/ML-Agents/Examples/Pyramids/Demos/ExpertPyramid.demo
max_steps: 3000000
max_steps: 30000000
time_horizon: 128
summary_freq: 30000

config/sac/Walker.yaml (5 changes)


hidden_units: 256
num_layers: 3
vis_encode_type: simple
goal_conditioning_type: none
strength: 1.0
strength: 0.1
max_steps: 15000000
max_steps: 150000000
time_horizon: 1000
summary_freq: 30000

com.unity.ml-agents/CHANGELOG.md (2 changes)


sizes and will need to be retrained. (#5181)
- The `AbstractBoard` class for integration with Match-3 games was changed to make it easier to support boards with
different sizes using the same model. For a summary of the interface changes, please see the Migration Guide. (#5189)
- Updated the Barracuda package to version `1.3.3-preview`(#5236)
- Updated the Barracuda package to version `1.4.0-preview`(#5236)
- `GridSensor` has been refactored and moved to main package, with changes to both sensor interfaces and behaviors.
Existing GridSensors created by the extension package will not work in the newer version. Previously trained models will
need to be retrained. Please see the Migration Guide for more details. (#5256)

com.unity.ml-agents/package.json (2 changes)


"unity": "2019.4",
"description": "Use state-of-the-art machine learning to create intelligent character behaviors in any Unity environment (games, robotics, film, etc.).",
"dependencies": {
"com.unity.barracuda": "1.3.3-preview",
"com.unity.barracuda": "1.4.0-preview",
"com.unity.modules.imageconversion": "1.0.0",
"com.unity.modules.jsonserialize": "1.0.0"
}

Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockArea.prefab (37 changes)


m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

- component: {fileID: 114451319691753174}
- component: {fileID: 8964598783836598940}
- component: {fileID: 4081319787948195948}
- component: {fileID: 3604832710400578108}
m_Layer: 0
m_Name: Agent
m_TagString: agent

m_Name:
m_EditorClassIdentifier:
debugCommandLineOverride:
--- !u!114 &3604832710400578108
MonoBehaviour:
m_ObjectHideFlags: 0
m_CorrespondingSourceObject: {fileID: 0}
m_PrefabInstance: {fileID: 0}
m_PrefabAsset: {fileID: 0}
m_GameObject: {fileID: 1489716781518988}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 38b7cc1f5819445aa85e9a9b054552dc, type: 3}
m_Name:
m_EditorClassIdentifier:
m_SensorName: VectorSensor
m_ObservationSize: 8
m_ObservationType: 1
--- !u!1 &1500989011945850
GameObject:
m_ObjectHideFlags: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_ClearFlags: 2
m_BackGroundColor: {r: 0.46666667, g: 0.5647059, b: 0.60784316, a: 1}
m_projectionMatrixMode: 1
m_GateFitMode: 2
m_FOVAxisMode: 0
m_GateFitMode: 2
m_FocalLength: 50
m_NormalizedViewPortRect:
serializedVersion: 2

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentBasic.cs (17 changes)


using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
public class PushAgentBasic : Agent
{

        m_ResetParams = Academy.Instance.EnvironmentParameters;
        SetResetParameters();
        GetComponent<VectorSensorComponent>().CreateSensors();
        m_DiversitySettingSensor = GetComponent<VectorSensorComponent>();
    }
    VectorSensorComponent m_DiversitySettingSensor;
    public int m_DiversitySetting = 0;
    /// <summary>
    /// Loop over body parts to add them to observation.
    /// </summary>
    public override void CollectObservations(VectorSensor sensor)
    {
        m_DiversitySettingSensor.GetSensor().Reset();
        m_DiversitySettingSensor.GetSensor().AddOneHotObservation(m_DiversitySetting, 8);
    }
    /// <summary>

        m_AgentRb.angularVelocity = Vector3.zero;
        SetResetParameters();
        m_DiversitySetting = Random.Range(0, 8);
    }
    public void SetGroundMaterialFriction()
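
A note that applies equally to the PyramidAgent and WalkerAgent changes below: each agent now samples m_DiversitySetting in [0, 8) when it resets, and republishes it every step as an 8-wide one-hot through a VectorSensorComponent (the prefabs set m_ObservationType: 1, i.e. a goal-type observation, which the trainer code further down filters as ObservationType.GOAL_SIGNAL). A minimal Python sketch of what AddOneHotObservation(m_DiversitySetting, 8) writes, for illustration only:

    import numpy as np

    def one_hot_diversity(setting: int, size: int = 8) -> np.ndarray:
        # Mirrors VectorSensor.AddOneHotObservation(setting, 8) on the C# side:
        # `size` floats, all zero except a 1.0 at index `setting`.
        vec = np.zeros(size, dtype=np.float32)
        vec[setting] = 1.0
        return vec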

Project/Assets/ML-Agents/Examples/Pyramids/Prefabs/AreaPB.prefab (79 changes)


m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

- component: {fileID: 5767481171805996936}
- component: {fileID: 4725417187860315718}
- component: {fileID: 6474351450651730614}
- component: {fileID: 5328107234309071792}
m_Layer: 0
m_Name: Agent
m_TagString: agent

m_Name:
m_EditorClassIdentifier:
debugCommandLineOverride:
--- !u!114 &5328107234309071792
MonoBehaviour:
m_ObjectHideFlags: 0
m_CorrespondingSourceObject: {fileID: 0}
m_PrefabInstance: {fileID: 0}
m_PrefabAsset: {fileID: 0}
m_GameObject: {fileID: 1131043459059966}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 38b7cc1f5819445aa85e9a9b054552dc, type: 3}
m_Name:
m_EditorClassIdentifier:
m_SensorName: VectorSensor
m_ObservationSize: 8
m_ObservationType: 1
--- !u!1 &1148882946833254
GameObject:
m_ObjectHideFlags: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_ClearFlags: 2
m_BackGroundColor: {r: 0.46666667, g: 0.5647059, b: 0.60784316, a: 1}
m_projectionMatrixMode: 1
m_GateFitMode: 2
m_FOVAxisMode: 0
m_GateFitMode: 2
m_FocalLength: 50
m_NormalizedViewPortRect:
serializedVersion: 2

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

Project/Assets/ML-Agents/Examples/Pyramids/Scripts/PyramidAgent.cs (11 changes)


    public GameObject areaSwitch;
    public bool useVectorObs;
    VectorSensorComponent m_DiversitySettingSensor;
    public int m_DiversitySetting = 0;
        GetComponent<VectorSensorComponent>().CreateSensors();
        m_DiversitySettingSensor = GetComponent<VectorSensorComponent>();
    }
    public override void CollectObservations(VectorSensor sensor)

            sensor.AddObservation(m_SwitchLogic.GetState());
            sensor.AddObservation(transform.InverseTransformDirection(m_AgentRb.velocity));
        }
        m_DiversitySettingSensor.GetSensor().Reset();
        m_DiversitySettingSensor.GetSensor().AddOneHotObservation(m_DiversitySetting, 8);
    }
    public void MoveAgent(ActionSegment<int> act)

        m_MyArea.CreateStonePyramid(1, items[6]);
        m_MyArea.CreateStonePyramid(1, items[7]);
        m_MyArea.CreateStonePyramid(1, items[8]);
        m_DiversitySetting = Random.Range(0, 8);
    }
    void OnCollisionEnter(Collision collision)

Project/Assets/ML-Agents/Examples/Walker/Scripts/WalkerAgent.cs (17 changes)


        m_ResetParams = Academy.Instance.EnvironmentParameters;
        SetResetParameters();
        GetComponent<VectorSensorComponent>().CreateSensors();
        m_DiversitySettingSensor = GetComponent<VectorSensorComponent>();
    VectorSensorComponent m_DiversitySettingSensor;
    public int m_DiversitySetting = 0;
    /// <summary>
    /// Loop over body parts and reset them to initial conditions.

            randomizeWalkSpeedEachEpisode ? Random.Range(0.1f, m_maxWalkingSpeed) : MTargetWalkingSpeed;
        SetResetParameters();
        m_DiversitySetting = Random.Range(0, 8);
    }
    /// <summary>

    /// </summary>
    public override void CollectObservations(VectorSensor sensor)
    {
        m_DiversitySettingSensor.GetSensor().Reset();
        m_DiversitySettingSensor.GetSensor().AddOneHotObservation(m_DiversitySetting, 8);
        var cubeForward = m_OrientationCube.transform.forward;
        //velocity we want to match

    public void TouchedTarget()
    {
        AddReward(1f);
        //Set our goal walking speed
        MTargetWalkingSpeed =
            randomizeWalkSpeedEachEpisode ? Random.Range(0.1f, m_maxWalkingSpeed) : MTargetWalkingSpeed;
        SetResetParameters();
        m_DiversitySetting = Random.Range(0, 8);
    }
    public void SetTorsoMass()

Project/Assets/ML-Agents/Examples/Walker/Prefabs/Platforms/Platform.prefab (70 changes)


m_Modifications:
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_RootOrder
value: 0
objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_LocalPosition.x
value: 0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_LocalRotation.x
value: -0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_RootOrder
value: 0
objectReference: {fileID: 0}
- target: {fileID: 6902107422946006027, guid: f0d7741d9e06247f6843b921a206b978,
type: 3}
propertyPath: m_LocalEulerAnglesHint.x
value: 0
objectReference: {fileID: 0}

m_Modifications:
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_RootOrder
value: 2
objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_LocalPosition.x
value: 0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_LocalRotation.x
value: 0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_RootOrder
value: 2
objectReference: {fileID: 0}
- target: {fileID: 3839136118347789758, guid: 46734abd0de454192b407379c6a4ab8d,
type: 3}
propertyPath: m_LocalEulerAnglesHint.x
value: 0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 895268871377934297, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: a043eafd232b84def8999a7288cc791b,
type: 3}
- target: {fileID: 895268871377934297, guid: 765582efd9dda46ed98564603316353f,
type: 3}
- target: {fileID: 895268871377934297, guid: 765582efd9dda46ed98564603316353f,
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
propertyPath: m_Model
value:
objectReference: {fileID: 11400000, guid: 205590a7f0a844b24b82b7f8355a1529,
type: 3}
propertyPath: m_RootOrder
value: 3
objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_LocalPosition.x

objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_LocalRotation.x
value: 0
objectReference: {fileID: 0}

type: 3}
propertyPath: m_LocalRotation.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}
propertyPath: m_RootOrder
value: 3
objectReference: {fileID: 0}
- target: {fileID: 895268871377934298, guid: 765582efd9dda46ed98564603316353f,
type: 3}

Project/Assets/ML-Agents/Examples/Walker/Prefabs/Ragdoll/WalkerRagdoll.prefab (111 changes)


m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!1 &895268871289741235

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

- component: {fileID: 895268871377934303}
- component: {fileID: 895268871377934302}
- component: {fileID: 895268871377934301}
- component: {fileID: 5678764055635236588}
m_Layer: 0
m_Name: WalkerRagdoll
m_TagString: Untagged

VectorActionDescriptions: []
VectorActionSpaceType: 1
hasUpgradedBrainParametersWithActionSpec: 1
m_Model: {fileID: 11400000, guid: f598eaeeef9f94691989a2cfaaafb565, type: 3}
m_Model: {fileID: 5022602860645237092, guid: a043eafd232b84def8999a7288cc791b, type: 3}
m_InferenceDevice: 2
m_BehaviorType: 0
m_BehaviorName: WalkerDynamic

armR: {fileID: 7933235355057813930}
forearmR: {fileID: 7933235353195701980}
handR: {fileID: 7933235354616748502}
m_DiversitySetting: 0
--- !u!114 &895268871377934303
MonoBehaviour:
m_ObjectHideFlags: 0

m_Name:
m_EditorClassIdentifier:
debugCommandLineOverride:
--- !u!114 &5678764055635236588
MonoBehaviour:
m_ObjectHideFlags: 0
m_CorrespondingSourceObject: {fileID: 0}
m_PrefabInstance: {fileID: 0}
m_PrefabAsset: {fileID: 0}
m_GameObject: {fileID: 895268871377934275}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 38b7cc1f5819445aa85e9a9b054552dc, type: 3}
m_Name:
m_EditorClassIdentifier:
m_SensorName: DiversitySetting
m_ObservationSize: 8
m_ObservationType: 1
--- !u!1 &895268871382313704
GameObject:
m_ObjectHideFlags: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RayTracingMode: 2
m_RenderingLayerMask: 1
m_RendererPriority: 0
m_Materials:

m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_ReceiveGI: 1
m_PreserveUVs: 0
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235353030744117

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!135 &7933235353041637845

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235353195701957

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235353228551178

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235353240438144

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235353713167634

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!135 &7933235354074184676

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!135 &7933235354616748520

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235354652902042

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235354845945040

m_EditorClassIdentifier:
agent: {fileID: 0}
agentDoneOnGroundContact: 1
penalizeGroundContact: 1
penalizeGroundContact: 0
groundContactPenalty: -1
touchingGround: 0
--- !u!136 &7933235355057813907

objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_RootOrder
value: 2
objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_LocalPosition.x
value: 0
objectReference: {fileID: 0}

objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_LocalRotation.x
value: 0
objectReference: {fileID: 0}

type: 3}
propertyPath: m_LocalRotation.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_LocalRotation.w
value: 1
objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}
propertyPath: m_RootOrder
value: 2
objectReference: {fileID: 0}
- target: {fileID: 2591864627249999504, guid: 72f745913c5a34df5aaadd5c1f0024cb,
type: 3}

Project/Assets/ML-Agents/Examples/Walker/Scenes/Walker.unity (207 changes)


m_ReflectionIntensity: 1
m_CustomReflection: {fileID: 0}
m_Sun: {fileID: 0}
m_IndirectSpecularColor: {r: 0.4497121, g: 0.49977785, b: 0.57563704, a: 1}
m_IndirectSpecularColor: {r: 0.44971168, g: 0.4997775, b: 0.57563686, a: 1}
m_UseRadianceAmbientProbe: 0
--- !u!157 &3
LightmapSettings:

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (4)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &193531851

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (2)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &476292838

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (6)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1 &781961355

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (9)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &1062792380

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (8)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &1071024415

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (3)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &1237883783

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1 &1392866527

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (5)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &1481808307

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 1076680649171575083, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_ConnectedAnchor.x

propertyPath: m_Name
value: PlatformDynamicTarget (7)
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_IsActive
value: 1
objectReference: {fileID: 0}
- target: {fileID: 6718791046026642300, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_RootOrder

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}
--- !u!1001 &2709362470382547191

m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications:
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_TargetWalkingSpeed
value: 5
objectReference: {fileID: 0}
- target: {fileID: 443215535079112744, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: randomizeWalkSpeedEachEpisode
value: 0
objectReference: {fileID: 0}
- target: {fileID: 6713178126238440196, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Name

propertyPath: m_LocalEulerAnglesHint.z
value: 0
objectReference: {fileID: 0}
- target: {fileID: 7819659174506835736, guid: 84359146bf7af47e58c229d877e801d7,
type: 3}
propertyPath: m_Model
value:
objectReference: {fileID: 5022602860645237092, guid: 371ea8d3fe4bf4cde9da72b71d914c52,
type: 3}
m_RemovedComponents: []
m_SourcePrefab: {fileID: 100100000, guid: 84359146bf7af47e58c229d877e801d7, type: 3}

ml-agents/mlagents/trainers/settings.py (8 changes)


    GAIL: str = "gail"
    CURIOSITY: str = "curiosity"
    RND: str = "rnd"
    DIVERSE: str = "diverse"

    def to_settings(self) -> type:
        _mapping = {

            RewardSignalType.RND: RNDSettings,
            RewardSignalType.DIVERSE: DiverseSettings,
        }
        return _mapping[self]

@attr.s(auto_attribs=True)
class RNDSettings(RewardSignalSettings):
    learning_rate: float = 1e-4
    encoding_size: Optional[int] = None


@attr.s(auto_attribs=True)
class DiverseSettings(RewardSignalSettings):
    learning_rate: float = 1e-4
    encoding_size: Optional[int] = None
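
With these additions, a `diverse` block under reward_signals resolves to DiverseSettings through the same enum lookup the other signals use; a small sketch under that assumption (all names are from the diff above):

    from mlagents.trainers.settings import DiverseSettings, RewardSignalType

    # The new enum member resolves to its settings class via to_settings().
    assert RewardSignalType.DIVERSE.to_settings() is DiverseSettings
    # DiverseSettings inherits gamma/strength defaults from RewardSignalSettings
    # and adds the learning_rate / encoding_size defaults shown above.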

ml-agents/mlagents/trainers/torch/components/reward_providers/__init__.py (3 changes)


from mlagents.trainers.torch.components.reward_providers.rnd_reward_provider import (  # noqa F401
    RNDRewardProvider,
)
from mlagents.trainers.torch.components.reward_providers.diverse_reward_provider import (  # noqa F401
    DiverseRewardProvider,
)
from mlagents.trainers.torch.components.reward_providers.reward_provider_factory import (  # noqa F401
    create_reward_provider,
)

ml-agents/mlagents/trainers/torch/components/reward_providers/reward_provider_factory.py (4 changes)


from mlagents.trainers.torch.components.reward_providers.rnd_reward_provider import (
    RNDRewardProvider,
)
from mlagents.trainers.torch.components.reward_providers.diverse_reward_provider import (
    DiverseRewardProvider,
)
from mlagents_envs.base_env import BehaviorSpec

    RewardSignalType.GAIL: GAILRewardProvider,
    RewardSignalType.RND: RNDRewardProvider,
    RewardSignalType.DIVERSE: DiverseRewardProvider,
}
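
The factory change is just a registration: the type-to-class mapping gains a DIVERSE entry, so the trainer can build the new provider the same way it builds GAIL or RND providers. A hedged usage sketch (the exact create_reward_provider signature is assumed, not shown in this diff):

    # Hypothetical call, assuming create_reward_provider(name, specs, settings):
    provider = create_reward_provider(
        RewardSignalType.DIVERSE,  # new enum value from settings.py
        behavior_spec,             # BehaviorSpec of the behavior being trained
        DiverseSettings(),         # parsed from the YAML `diverse:` block
    )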

ml-agents/mlagents/trainers/sac/optimizer_torch.py (311 changes)


logger = get_logger(__name__)
from mlagents.trainers.torch.action_flattener import ActionFlattener
from mlagents_envs.base_env import ObservationType
from mlagents.trainers.torch.networks import NetworkBody
from mlagents_envs.base_env import BehaviorSpec
from mlagents.trainers.torch.layers import linear_layer, Initialization


class DiverseNetworkVariational(torch.nn.Module):
    EPSILON = 1e-10
    STRENGTH = 0.1  # 1.0
    # gradient_penalty_weight = 10.0
    z_size = 128
    alpha = 0.0005
    mutual_information = 10000  # 0.5
    EPSILON = 1e-7  # NOTE: redefined; this value overrides the 1e-10 above
    initial_beta = 0.0

    def __init__(self, specs: BehaviorSpec, settings) -> None:
        super().__init__()
        self._use_actions = True
        sigma_start = 0.5
        print(
            "VARIATIONAL : Settings : strength:",
            self.STRENGTH,
            " use_actions:",
            self._use_actions,
            " mutual_information : ",
            self.mutual_information,
            "Sigma_Start : ",
            sigma_start,
        )
        # state_encoder_settings = settings
        state_encoder_settings = NetworkSettings(normalize=True, num_layers=1)
        if state_encoder_settings.memory is not None:
            state_encoder_settings.memory = None
            logger.warning(
                "memory was specified in network_settings but is not supported. It is being ignored."
            )
        self._action_flattener = ActionFlattener(specs.action_spec)
        # Split the observation specs: everything except the goal signal feeds
        # the encoder; the goal signal itself is the prediction target.
        new_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        diverse_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type == ObservationType.GOAL_SIGNAL
        ][0]
        print(" > ", new_spec, "\n\n\n", " >> ", diverse_spec)
        self._all_obs_specs = specs.observation_specs
        self.diverse_size = diverse_spec.shape[0]
        if self._use_actions:
            self._encoder = NetworkBody(
                new_spec, state_encoder_settings, self._action_flattener.flattened_size
            )
        else:
            self._encoder = NetworkBody(new_spec, state_encoder_settings)
        # Learned per-dimension sigma of the latent; assumes hidden_units == z_size.
        self._z_sigma = torch.nn.Parameter(
            sigma_start * torch.ones((self.z_size), dtype=torch.float),
            requires_grad=True,
        )
        # self._z_mu_layer = linear_layer(
        #     state_encoder_settings.hidden_units,
        #     self.z_size,
        #     kernel_init=Initialization.KaimingHeNormal,
        #     kernel_gain=0.1,
        # )
        self._beta = torch.nn.Parameter(
            torch.tensor(self.initial_beta, dtype=torch.float), requires_grad=False
        )
        self._last_layer = torch.nn.Linear(self.z_size, self.diverse_size)
        self._diverse_index = -1
        self._max_index = len(specs.observation_specs)
        for i, spec in enumerate(specs.observation_specs):
            if spec.observation_type == ObservationType.GOAL_SIGNAL:
                self._diverse_index = i

    def predict(
        self, obs_input, action_input, detach_action=False, var_noise=True
    ) -> torch.Tensor:
        # Drop the goal-signal observation before encoding
        tensor_obs = [
            obs
            for obs, spec in zip(obs_input, self._all_obs_specs)
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        if self._use_actions:
            action = self._action_flattener.forward(action_input).reshape(
                -1, self._action_flattener.flattened_size
            )
            if detach_action:
                action = action.detach()
            hidden, _ = self._encoder.forward(tensor_obs, action)
        else:
            hidden, _ = self._encoder.forward(tensor_obs)
        # add a VAE (like in VAIL ?)
        # z_mu = self._z_mu_layer(hidden)
        z_mu = hidden  # self._z_mu_layer(hidden)
        # var_noise is a bool: multiplying sigma by False zeroes out the noise
        hidden = torch.normal(z_mu, self._z_sigma * var_noise)
        prediction = torch.softmax(self._last_layer(hidden), dim=1)
        return prediction, z_mu

    def copy_normalization(self, thing):
        self._encoder.processors[0].copy_normalization(thing.processors[1])

    def rewards(
        self, obs_input, action_input, detach_action=False, var_noise=True
    ) -> torch.Tensor:
        truth = obs_input[self._diverse_index]
        prediction, _ = self.predict(obs_input, action_input, detach_action, var_noise)
        rewards = torch.log(
            torch.sum((prediction * truth), dim=1) + self.EPSILON
        )  # - np.log(1 / self.diverse_size)  # Center around 0
        return rewards

    def loss(
        self, obs_input, action_input, masks, detach_action=True, var_noise=True
    ) -> torch.Tensor:
        # print( ">>> ",obs_input[self._diverse_index][0],self.predict(obs_input, action_input, detach_action)[0], self.predict([x*0 for x in obs_input], action_input, detach_action * 0)[0] )
        base_loss = -ModelUtils.masked_mean(
            self.rewards(obs_input, action_input, detach_action, var_noise), masks
        )
        _, mu = self.predict(obs_input, action_input, detach_action, var_noise)
        kl_loss = ModelUtils.masked_mean(
            -torch.sum(
                1 + (self._z_sigma ** 2).log() - 0.5 * mu ** 2
                # - 0.5 * mu_expert ** 2
                - (self._z_sigma ** 2),
                dim=1,
            ),
            masks,
        )
        vail_loss = self._beta * (kl_loss - self.mutual_information)
        # Dual gradient ascent on beta, clamped at zero
        with torch.no_grad():
            self._beta.data = torch.max(
                self._beta + self.alpha * (kl_loss - self.mutual_information),
                torch.tensor(0.0),
            )
        total_loss = base_loss + vail_loss
        return total_loss, base_loss, kl_loss, vail_loss, self._beta
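The loss above is a VAIL-style information bottleneck on the goal discriminator. In standard notation, with encoder distribution N(mu, sigma) and mutual-information budget I_c (mutual_information above), the objective and the dual update on beta are:

\mathcal{L} = -\,\mathbb{E}\left[\log q_\phi(z_{\mathrm{goal}} \mid s, a)\right]
    + \beta \left( D_{\mathrm{KL}}\!\left[\mathcal{N}(\mu,\sigma) \,\|\, \mathcal{N}(0, I)\right] - I_c \right),
\qquad
\beta \leftarrow \max\!\left(0,\; \beta + \alpha \left(D_{\mathrm{KL}} - I_c\right)\right),
\qquad
D_{\mathrm{KL}} = -\tfrac{1}{2} \sum_i \left(1 + \log \sigma_i^2 - \mu_i^2 - \sigma_i^2\right)

Note that the kl_loss in the code drops the overall 1/2 and halves only the mu^2 term, so it tracks this closed-form KL only up to constant factors.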
class DiverseNetwork(torch.nn.Module):
    EPSILON = 1e-10
    STRENGTH = 1.0

    def __init__(self, specs: BehaviorSpec, settings) -> None:
        super().__init__()
        self._use_actions = True
        print("Settings : strength:", self.STRENGTH, " use_actions:", self._use_actions)
        # state_encoder_settings = settings
        state_encoder_settings = NetworkSettings(True)  # normalize=True
        if state_encoder_settings.memory is not None:
            state_encoder_settings.memory = None
            logger.warning(
                "memory was specified in network_settings but is not supported. It is being ignored."
            )
        self._action_flattener = ActionFlattener(specs.action_spec)
        new_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        diverse_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type == ObservationType.GOAL_SIGNAL
        ][0]
        print(" > ", new_spec, "\n\n\n", " >> ", diverse_spec)
        self._all_obs_specs = specs.observation_specs
        self.diverse_size = diverse_spec.shape[0]
        if self._use_actions:
            self._encoder = NetworkBody(
                new_spec, state_encoder_settings, self._action_flattener.flattened_size
            )
        else:
            self._encoder = NetworkBody(new_spec, state_encoder_settings)
        self._last_layer = torch.nn.Linear(
            state_encoder_settings.hidden_units, self.diverse_size
        )
        self._diverse_index = -1
        self._max_index = len(specs.observation_specs)
        for i, spec in enumerate(specs.observation_specs):
            if spec.observation_type == ObservationType.GOAL_SIGNAL:
                self._diverse_index = i

    def predict(self, obs_input, action_input, detach_action=False) -> torch.Tensor:
        # Drop the goal-signal observation before encoding
        tensor_obs = [
            obs
            for obs, spec in zip(obs_input, self._all_obs_specs)
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        if self._use_actions:
            action = self._action_flattener.forward(action_input).reshape(
                -1, self._action_flattener.flattened_size
            )
            if detach_action:
                action = action.detach()
            hidden, _ = self._encoder.forward(tensor_obs, action)
        else:
            hidden, _ = self._encoder.forward(tensor_obs)
        # add a VAE (like in VAIL ?)
        prediction = torch.softmax(self._last_layer(hidden), dim=1)
        return prediction

    def copy_normalization(self, thing):
        self._encoder.processors[0].copy_normalization(thing.processors[1])

    def rewards(
        self, obs_input, action_input, detach_action=False, var_noise=False
    ) -> torch.Tensor:
        truth = obs_input[self._diverse_index]
        prediction = self.predict(obs_input, action_input, detach_action)
        rewards = torch.log(torch.sum((prediction * truth), dim=1) + self.EPSILON)
        return rewards

    def loss(self, obs_input, action_input, masks, detach_action=True) -> torch.Tensor:
        # print( ">>> ",obs_input[self._diverse_index][0],self.predict(obs_input, action_input, detach_action)[0], self.predict([x*0 for x in obs_input], action_input, detach_action * 0)[0] )
        return -ModelUtils.masked_mean(
            self.rewards(obs_input, action_input, detach_action), masks
        )
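To make the reward term concrete: prediction is a softmax over the goal settings and truth is the one-hot goal observation, so sum(prediction * truth, dim=1) is simply the probability the discriminator assigns to the true goal. A tiny self-contained example with made-up sizes (batch of 4, 3 goal settings):

import torch

EPSILON = 1e-10
prediction = torch.softmax(torch.randn(4, 3), dim=1)   # q(z | s, a), one row per sample
truth = torch.nn.functional.one_hot(
    torch.tensor([0, 2, 1, 2]), num_classes=3
).float()                                              # one-hot goal signal per sample
picked = torch.sum(prediction * truth, dim=1)          # probability of the true goal
rewards = torch.log(picked + EPSILON)                  # log-likelihood reward, <= 0
print(rewards.shape)                                   # torch.Size([4])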
class TorchSACOptimizer(TorchOptimizer):
    class PolicyValueNetwork(nn.Module):

            self._critic.parameters()
        )
+       # self._mede_network = DiverseNetwork(
+       self._mede_network = DiverseNetworkVariational(
+           self.policy.behavior_spec, self.policy.network_settings
+       )
+       self._mede_optimizer = torch.optim.Adam(
+           list(self._mede_network.parameters()), lr=hyperparameters.learning_rate
+       )
        logger.debug("value_vars")
        for param in value_params:
            logger.debug(param.shape)

        q1p_out: Dict[str, torch.Tensor],
        q2p_out: Dict[str, torch.Tensor],
        loss_masks: torch.Tensor,
+       obs,
+       act,
    ) -> torch.Tensor:
        min_policy_qs = {}
        with torch.no_grad():

        if self._action_spec.discrete_size <= 0:
            for name in values.keys():
                with torch.no_grad():
-                   v_backup = min_policy_qs[name] - torch.sum(
-                       _cont_ent_coef * log_probs.continuous_tensor, dim=1
+                   v_backup = (
+                       min_policy_qs[name]
+                       - _cont_ent_coef * torch.sum(log_probs.continuous_tensor, dim=1)
+                       + self._mede_network.STRENGTH
+                       * self._mede_network.rewards(obs, act, var_noise=False)
                    )
                value_loss = 0.5 * ModelUtils.masked_mean(
                    torch.nn.functional.mse_loss(values[name], v_backup), loss_masks
                )

            for name in values.keys():
                with torch.no_grad():
-                   v_backup = min_policy_qs[name] - torch.mean(
-                       branched_ent_bonus, axis=0
+                   v_backup = (
+                       min_policy_qs[name]
+                       - torch.mean(branched_ent_bonus, axis=0)
+                       + self._mede_network.STRENGTH
+                       * self._mede_network.rewards(obs, act)
                    )
+                   print("The discrete case is much more complicated than that")
            # Add continuous entropy bonus to minimum Q
            if self._action_spec.continuous_size > 0:
                v_backup += torch.sum(
        log_probs: ActionLogProbs,
        q1p_outs: Dict[str, torch.Tensor],
        loss_masks: torch.Tensor,
+       obs,
+       act,
    ) -> torch.Tensor:
        _cont_ent_coef, _disc_ent_coef = (
            self._log_ent_coef.continuous,

        all_mean_q1 = mean_q1
        if self._action_spec.continuous_size > 0:
            cont_log_probs = log_probs.continuous_tensor
-           batch_policy_loss += torch.mean(
-               _cont_ent_coef * cont_log_probs - all_mean_q1.unsqueeze(1), dim=1
-           )
+           batch_policy_loss += (
+               _cont_ent_coef * torch.sum(cont_log_probs, dim=1)
+               - all_mean_q1.unsqueeze(1)
+           )
+           batch_policy_loss += -self._mede_network.STRENGTH * self._mede_network.rewards(
+               obs, act, var_noise=False
+           )
        policy_loss = ModelUtils.masked_mean(batch_policy_loss, loss_masks)
        return policy_loss
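Folding the diversity term into the actor update, the objective being minimized is effectively (lambda = STRENGTH; a paraphrase of the code above, not an official ML-Agents objective):

\mathcal{L}_\pi = \mathbb{E}\left[\alpha \sum_i \log \pi(a_i \mid s) \;-\; \bar{Q}_1(s, a) \;-\; \lambda \, \log q_\phi(z_{\mathrm{goal}} \mid s, a)\right]

i.e. alongside the usual entropy-regularized Q term, the agent is paid directly for acting so that the discriminator can tell which goal setting it was given.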

        if self._action_spec.continuous_size > 0:
            with torch.no_grad():
                cont_log_probs = log_probs.continuous_tensor
-               target_current_diff = torch.sum(
-                   cont_log_probs + self.target_entropy.continuous, dim=1
+               target_current_diff = (
+                   torch.sum(cont_log_probs, dim=1)
+                   + self.target_entropy.continuous
                )
+               # print(self.target_entropy.continuous, cont_log_probs, torch.sum(
+               #     cont_log_probs, dim=1) + self.target_entropy.continuous)
            # We update all the _cont_ent_coef as one block
            entropy_loss += -1 * ModelUtils.masked_mean(
                _cont_ent_coef * target_current_diff, loss_masks

        self.target_network.network_body.copy_normalization(
            self.policy.actor.network_body
        )
+       self._mede_network.copy_normalization(self.policy.actor.network_body)
        self._critic.network_body.copy_normalization(self.policy.actor.network_body)
        sampled_actions, log_probs, _, _, = self.policy.actor.get_action_and_stats(
            current_obs,

            q1_stream, q2_stream, target_values, dones, rewards, masks
        )
        value_loss = self.sac_value_loss(
-           log_probs, value_estimates, q1p_out, q2p_out, masks
+           log_probs,
+           value_estimates,
+           q1p_out,
+           q2p_out,
+           masks,
+           current_obs,
+           sampled_actions,
        )
-       policy_loss = self.sac_policy_loss(log_probs, q1p_out, masks)
+       policy_loss = self.sac_policy_loss(
+           log_probs, q1p_out, masks, current_obs, sampled_actions
+       )
        entropy_loss = self.sac_entropy_loss(log_probs, masks)
        total_value_loss = q1_loss + q2_loss

        entropy_loss.backward()
        self.entropy_optimizer.step()
+       mede_loss, base_loss, kl_loss, vail_loss, beta = self._mede_network.loss(
+           current_obs, sampled_actions, masks
+       )
+       # mede_loss = self._mede_network.loss(current_obs, sampled_actions, masks)
+       ModelUtils.update_learning_rate(self._mede_optimizer, decay_lr)
+       self._mede_optimizer.zero_grad()
+       mede_loss.backward()
+       self._mede_optimizer.step()
        # Update target network
        ModelUtils.soft_update(self._critic, self.target_network, self.tau)
        update_stats = {

                torch.exp(self._log_ent_coef.continuous)
            ).item(),
            "Policy/Learning Rate": decay_lr,
            "Policy/Entropy Loss": entropy_loss.item(),
+           "Policy/MEDE Loss": mede_loss.item(),
+           "Policy/MEDE Base": base_loss.item(),
+           "Policy/MEDE Variational": vail_loss.item(),
+           "Policy/MEDE KL": kl_loss.item(),
+           "Policy/MEDE beta": beta.item(),
        }
        return update_stats

"Optimizer:policy_optimizer": self.policy_optimizer,
"Optimizer:value_optimizer": self.value_optimizer,
"Optimizer:entropy_optimizer": self.entropy_optimizer,
"Optimizer:mede_optimizer": self._mede_optimizer,
"Optimizer:mede_network": self._mede_network,
}
for reward_provider in self.reward_signals.values():
modules.update(reward_provider.get_modules())

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-10M.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-10M.onnx.meta


fileFormatVersion: 2
guid: c4c1c9de2772f48e8b0cc2cdd62ce8c5
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-no-extrinsic.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-no-extrinsic.onnx.meta


fileFormatVersion: 2
guid: e97d22662d00b43ed999531846572c36
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r02-bigger.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r02-bigger.onnx.meta


fileFormatVersion: 2
guid: 7c800225c636c4e299f1decdfa9b9029
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r05-bigger.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-diverse-r05-bigger.onnx.meta


fileFormatVersion: 2
guid: 55f976c29f6ff4d2bbe6dff72645410a
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-extrinsic-log-diverse.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-extrinsic-log-diverse.onnx.meta


fileFormatVersion: 2
guid: a9ceaa0e80ae34d229e1a2277d3388e5
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-new-reward-1.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/Walker-new-reward-1.onnx.meta


fileFormatVersion: 2
guid: 29312cd291f9d4f2d91c7272a2a14324
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty.onnx.meta


fileFormatVersion: 2
guid: 7ce5ceea4d8ff4d4a837a510deed0b0e
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty-01strength.onnx
Diff content is too large to display

14
Project/Assets/ML-Agents/Examples/Walker/TFModels/mede-walker-crazy-mutual-10000-nogound-penalty-01strength.onnx.meta


fileFormatVersion: 2
guid: 371ea8d3fe4bf4cde9da72b71d914c52
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: 683b6cb6d0a474744822c888b46772c9, type: 3}
  optimizeModel: 1
  forceArbitraryBatchSize: 1
  treatErrorsAsWarnings: 0
  importMode: 1

133
ml-agents/mlagents/trainers/torch/components/reward_providers/diverse_reward_provider.py


import numpy as np
from typing import Dict
from mlagents.torch_utils import torch
from mlagents_envs.base_env import ObservationType
from mlagents.trainers.buffer import AgentBuffer
from mlagents.trainers.torch.components.reward_providers.base_reward_provider import (
    BaseRewardProvider,
)
from mlagents.trainers.settings import DiverseSettings
from mlagents.trainers.torch.action_flattener import ActionFlattener
from mlagents.trainers.torch.agent_action import AgentAction
from mlagents_envs.base_env import BehaviorSpec
from mlagents_envs import logging_util
from mlagents.trainers.torch.utils import ModelUtils
from mlagents.trainers.torch.networks import NetworkBody
from mlagents.trainers.trajectory import ObsUtil

logger = logging_util.get_logger(__name__)


class DiverseRewardProvider(BaseRewardProvider):
    # From https://arxiv.org/pdf/1802.06070.pdf
    def __init__(self, specs: BehaviorSpec, settings: DiverseSettings) -> None:
        super().__init__(specs, settings)
        self._ignore_done = False  # Tried with False. Bias for staying alive.
        self._use_actions = False
        self._network = DiverseNetwork(specs, settings, self._use_actions)
        self.optimizer = torch.optim.SGD(
            self._network.parameters(), lr=settings.learning_rate
        )
        self._diverse_index = -1
        self._max_index = len(specs.observation_specs)
        for i, spec in enumerate(specs.observation_specs):
            if spec.observation_type == ObservationType.GOAL_SIGNAL:
                self._diverse_index = i

    def evaluate(self, mini_batch: AgentBuffer) -> np.ndarray:
        with torch.no_grad():
            prediction = self._network(mini_batch)
            truth = ModelUtils.list_to_tensor(
                ObsUtil.from_buffer(mini_batch, self._max_index)[self._diverse_index]
            )
            # print(prediction[0,:], truth[0,:], torch.log(torch.sum((prediction * truth), dim=1) + 1e-10)[0], (torch.log(torch.sum((prediction * truth), dim=1))- np.log(1 / self._network.diverse_size))[0])
            # Center the log-likelihood at chance level, -log(1 / |Z|)
            rewards = torch.log(
                torch.sum((prediction * truth), dim=1) + 1e-10
            ) - np.log(1 / self._network.diverse_size)
        return rewards.detach().cpu().numpy()

    def update(self, mini_batch: AgentBuffer) -> Dict[str, np.ndarray]:
        all_loss = 0
        for _ in range(1):  # single gradient step per update call
            prediction = self._network(mini_batch)
            truth = ModelUtils.list_to_tensor(
                ObsUtil.from_buffer(mini_batch, self._max_index)[self._diverse_index]
            )
            # loss = torch.mean(
            #     torch.sum(-torch.log(prediction + 1e-10) * truth, dim=1), dim=0
            # )
            loss = -torch.mean(torch.log(torch.sum((prediction * truth), dim=1)))
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
            all_loss += loss.item()
        return {"Losses/DIVERSE Loss": all_loss}

    def get_modules(self):
        return {f"Module:{self.name}": self._network}


class DiverseNetwork(torch.nn.Module):
    EPSILON = 1e-10

    def __init__(
        self, specs: BehaviorSpec, settings: DiverseSettings, use_actions: bool
    ) -> None:
        super().__init__()
        self._use_actions = use_actions
        state_encoder_settings = settings.network_settings
        if state_encoder_settings.memory is not None:
            state_encoder_settings.memory = None
            logger.warning(
                "memory was specified in network_settings but is not supported. It is being ignored."
            )
        self._action_flattener = ActionFlattener(specs.action_spec)
        new_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        diverse_spec = [
            spec
            for spec in specs.observation_specs
            if spec.observation_type == ObservationType.GOAL_SIGNAL
        ][0]
        print(" > ", new_spec, "\n\n\n", " >> ", diverse_spec)
        self._all_obs_specs = specs.observation_specs
        self.diverse_size = diverse_spec.shape[0]
        if self._use_actions:
            self._encoder = NetworkBody(
                new_spec, state_encoder_settings, self._action_flattener.flattened_size
            )
        else:
            self._encoder = NetworkBody(new_spec, state_encoder_settings)
        self._last_layer = torch.nn.Linear(
            state_encoder_settings.hidden_units, self.diverse_size
        )

    def forward(self, mini_batch: AgentBuffer) -> torch.Tensor:
        n_obs = len(self._encoder.processors) + 1
        np_obs = ObsUtil.from_buffer_next(mini_batch, n_obs)
        # Convert to tensors, dropping the goal-signal observation
        tensor_obs = [
            ModelUtils.list_to_tensor(obs)
            for obs, spec in zip(np_obs, self._all_obs_specs)
            if spec.observation_type != ObservationType.GOAL_SIGNAL
        ]
        if self._use_actions:
            action = self._action_flattener.forward(AgentAction.from_buffer(mini_batch))
            hidden, _ = self._encoder.forward(tensor_obs, action)
        else:
            hidden, _ = self._encoder.forward(tensor_obs)
        self._encoder.update_normalization(mini_batch)
        prediction = torch.softmax(self._last_layer(hidden), dim=1)
        return prediction
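Unlike the copy embedded in the SAC optimizer, evaluate centers the log-likelihood at chance level over the |Z| goal settings (and forward predicts from the *next* observation, via ObsUtil.from_buffer_next):

r_t = \log q_\phi(z \mid s_{t+1}) - \log \tfrac{1}{|Z|} = \log\left(|Z| \, q_\phi(z \mid s_{t+1})\right)

so the reward is positive exactly when the discriminator beats a uniform guess at the agent's current goal setting.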