Fixed value estimate bug

/develop-newnormalization
Ervin Teng, 5 years ago
Current commit
9e0ef912
1 file changed, with 1 insertion and 1 deletion
  1. ml-agents/mlagents/trainers/ppo/trainer.py (2 lines changed)

2
ml-agents/mlagents/trainers/ppo/trainer.py


 value_next = self.policy.get_value_estimates(
     trajectory.next_obs,
-    trajectory.done_reached and not trajectory.done_reached,
+    trajectory.done_reached and not trajectory.max_step_reached,
 )
 # Evaluate all reward functions
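
For illustration, the two variants of the second argument in the hunk above can be compared over every flag combination. This is a minimal sketch, not code from ml-agents: `FakeTrajectory`, `old_flag`, and `new_flag` are hypothetical names, and it assumes both fields are plain booleans. Given the commit summary (1 insertion, 1 deletion), the `done_reached and not done_reached` form is presumably the removed line.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class FakeTrajectory:
    # Hypothetical stand-in holding only the two flags used in the hunk.
    done_reached: bool
    max_step_reached: bool


def old_flag(t: FakeTrajectory) -> bool:
    # Variant that tests done_reached against its own negation.
    return t.done_reached and not t.done_reached


def new_flag(t: FakeTrajectory) -> bool:
    # Variant that tests done_reached against max_step_reached.
    return t.done_reached and not t.max_step_reached


# One row per (done_reached, max_step_reached) combination.
rows = [
    (d, m, old_flag(FakeTrajectory(d, m)), new_flag(FakeTrajectory(d, m)))
    for d, m in product([False, True], repeat=2)
]

for done, max_step, old, new in rows:
    print(f"done={done!s:5} max_step={max_step!s:5} old={old!s:5} new={new}")
```

`x and not x` is False for every boolean `x`, so the first variant always passes False, whatever the trajectory's state. The second variant is True only when the episode ended without the step limit being the cause.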