
timestep penalty on loss only

/asymm-envs
Andrew Cohen, 4 years ago
Commit a6e6e63e
1 file changed, with 4 insertions and 4 deletions
  1. Project/Assets/ML-Agents/Examples/Tennis/Scripts/HitWall.cs (8 lines changed)



 void AgentAWins()
 {
-    m_AgentA.SetReward(1 + m_AgentA.timePenalty);
-    m_AgentB.SetReward(-1);
+    m_AgentA.SetReward(1);
+    m_AgentB.SetReward(-1 - m_AgentB.timePenalty);
     m_AgentA.score += 1;
     Reset();

 {
-    m_AgentA.SetReward(-1);
-    m_AgentB.SetReward(1 + m_AgentB.timePenalty);
+    m_AgentA.SetReward(-1 - m_AgentA.timePenalty);
+    m_AgentB.SetReward(1);
     m_AgentB.score += 1;
     Reset();
