[Addresses #842] (#849)

If an agent is done immediately after spawning, its stats are empty, because at least 2 successive experiences are needed to create them.
Specifying a default value of 0 prevents the resulting error.
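
The change relies on `dict.get(key, default)`, which returns the default instead of raising `KeyError` when the key is missing. A minimal sketch of that pattern follows; the helper name `record_episode_end` and the sample agent ids are hypothetical, not taken from the trainers, and plain dicts are assumed for the per-agent counters.

```python
def record_episode_end(stats, cumulative_rewards, episode_steps, agent_id):
    """Append the final stats for agent_id, using 0 if nothing was recorded."""
    # .get(agent_id, 0) avoids a KeyError for an agent that finished
    # before any reward or step count was stored for it.
    stats['cumulative_reward'].append(cumulative_rewards.get(agent_id, 0))
    stats['episode_length'].append(episode_steps.get(agent_id, 0))
    # Reset the counters for the next episode.
    cumulative_rewards[agent_id] = 0
    episode_steps[agent_id] = 0


stats = {'cumulative_reward': [], 'episode_length': []}
cumulative_rewards = {'agent-a': 3.5}
episode_steps = {'agent-a': 12}

# 'agent-a' has recorded experiences; 'agent-b' finished right after spawning.
record_episode_end(stats, cumulative_rewards, episode_steps, 'agent-a')
record_episode_end(stats, cumulative_rewards, episode_steps, 'agent-b')
```

With direct indexing (`cumulative_rewards[agent_id]`), the second call would raise `KeyError` because `'agent-b'` has no entry yet.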
/develop-generalizationTraining-TrainerController
GitHub 6 years ago
Current commit
0f65e272
2 files changed, 10 insertions and 5 deletions
  1. python/unitytrainers/bc/trainer.py (6 lines changed)
  2. python/unitytrainers/ppo/trainer.py (9 lines changed)

6
python/unitytrainers/bc/trainer.py


 for l in range(len(info_student.agents)):
     if info_student.local_done[l]:
         agent_id = info_student.agents[l]
-        self.stats['cumulative_reward'].append(self.cumulative_rewards[agent_id])
-        self.stats['episode_length'].append(self.episode_steps[agent_id])
+        self.stats['cumulative_reward'].append(
+            self.cumulative_rewards.get(agent_id, 0))
+        self.stats['episode_length'].append(
+            self.episode_steps.get(agent_id, 0))
         self.cumulative_rewards[agent_id] = 0
         self.episode_steps[agent_id] = 0

9
python/unitytrainers/ppo/trainer.py


         self.training_buffer[agent_id].reset_agent()
         if info.local_done[l]:
-            self.stats['cumulative_reward'].append(self.cumulative_rewards[agent_id])
-            self.stats['episode_length'].append(self.episode_steps[agent_id])
+            self.stats['cumulative_reward'].append(
+                self.cumulative_rewards.get(agent_id, 0))
+            self.stats['episode_length'].append(
+                self.episode_steps.get(agent_id, 0))
-            self.stats['intrinsic_reward'].append(self.intrinsic_rewards[agent_id])
+            self.stats['intrinsic_reward'].append(
+                self.intrinsic_rewards.get(agent_id, 0))
             self.intrinsic_rewards[agent_id] = 0
     def end_episode(self):