Reduce negative reward

/develop/zombieteammanager/killfirst
Ervin Teng 3 years ago
Current commit
45804de9
2 files changed, 14 insertions and 2 deletions
  1. 14
      Project/Assets/ML-Agents/Examples/PushBlock/Scenes/2ZombieVs3AgentsPushBlock.unity
  2. 2
      Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentCollab.cs

14
Project/Assets/ML-Agents/Examples/PushBlock/Scenes/2ZombieVs3AgentsPushBlock.unity


 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!1 &263418408
 GameObject:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &265967282
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &344861064
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &557898664
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &607937517
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &742698648
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &816508913
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &1504325949
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &1614947134
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &1914183271
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
+frozen: 0
 useVectorObs: 1
 --- !u!114 &2029153700
 MonoBehaviour:

 maxStep: 0
 hasUpgradedFromAgentParameters: 1
 MaxStep: 5000
-useVectorObs: 1
+frozen: 0
+useVectorObs: 0
 --- !u!1 &1500989827241484
 GameObject:
 m_ObjectHideFlags: 0

2
Project/Assets/ML-Agents/Examples/PushBlock/Scripts/PushAgentCollab.cs


 MoveAgent(actionBuffers.DiscreteActions);
 // Penalty given each step to encourage agent to finish task quickly.
-AddReward(-1f / MaxStep);
+AddReward(-0.1f / MaxStep);
 }
 public override void Heuristic(in ActionBuffers actionsOut)
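
A rough check of what this change does to the reward scale. The per-step penalty drops from -1/5000 = -0.0002 to -0.1/5000 = -0.00002, so an agent that uses every step of an episode accumulates about -0.1 instead of about -1. The sketch below is a standalone illustration, not code from the repository. It assumes the `MaxStep: 5000` value in the scene file above is the one this agent runs with, and the names `StepPenaltyDemo` and `EpisodePenalty` are hypothetical.

```csharp
using System;

public static class StepPenaltyDemo
{
    // Total penalty for an episode that runs all maxStep steps,
    // given a per-step reward of -scale / maxStep (the shape of the
    // AddReward call in the hunk above).
    public static float EpisodePenalty(float scale, int maxStep)
    {
        float perStep = -scale / maxStep;
        return perStep * maxStep;
    }

    public static void Main()
    {
        const int maxStep = 5000; // assumed from the scene file in this commit

        // scale = 1f is the old penalty, scale = 0.1f is the new one.
        Console.WriteLine($"before: {EpisodePenalty(1f, maxStep)}");
        Console.WriteLine($"after:  {EpisodePenalty(0.1f, maxStep)}");
    }
}
```

With the smaller scale, the time penalty for a full-length episode is an order of magnitude smaller relative to whatever task rewards the environment hands out.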
