[Fix] Must take mean of entropy to avoid errors when number of agents changes during training (#407)

/develop-generalizationTraining-TrainerController
GitHub, 7 years ago
Current commit
5bdef358
1 file changed, with 1 insertion and 1 deletion
python/unitytrainers/ppo/trainer.py (2 changes)


 values = self.sess.run(run_list, feed_dict=feed_dict)
 run_out = dict(zip(run_list, values))
 self.stats['value_estimate'].append(run_out[self.model.value].mean())
-self.stats['entropy'].append(run_out[self.model.entropy])
+self.stats['entropy'].append(run_out[self.model.entropy].mean())
 self.stats['learning_rate'].append(run_out[self.model.learning_rate])
 if self.use_recurrent:
     return (run_out[self.model.output],
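
The reason for the change can be illustrated with a minimal sketch. It assumes that the entropy output holds one value per agent and that the collected stats are averaged later. Neither detail is shown in this diff. Plain Python lists stand in for the arrays returned by `sess.run`, and `record_entropy` is a hypothetical helper, not a function in `trainer.py`.

```python
from statistics import fmean

def record_entropy(stats, per_agent_entropy):
    """Append one scalar per step, however many agents acted (hypothetical helper)."""
    stats["entropy"].append(fmean(per_agent_entropy))

# Without the per-step mean, the stored entries have different lengths
# once the agent count changes, and averaging them directly fails.
ragged = [[0.9, 1.1, 1.0], [0.8, 1.2]]  # 3 agents, then 2 agents
try:
    fmean(ragged)
    ragged_fails = False
except TypeError:
    ragged_fails = True

# With the per-step mean, every entry is a scalar and averaging is safe.
stats = {"entropy": []}
for step_entropy in ragged:
    record_entropy(stats, step_entropy)

summary = fmean(stats["entropy"])
```

Reducing at append time keeps every entry in `stats['entropy']` the same shape, which matches how `value_estimate` is already handled on the line above the change.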
