
Fix non-lstm PPO

/develop/critic-op-lstm-currentmem
Ervin Teng 3 years ago
Current commit
97842f81
1 file changed, with 2 insertions and 1 deletion
  1. ml-agents/mlagents/trainers/ppo/trainer.py (3 lines changed)

     trajectory.next_obs,
     trajectory.done_reached and not trajectory.interrupted,
 )
-agent_buffer_trajectory[BufferKey.CRITIC_MEMORY].set(value_memories)
+if value_memories is not None:
+    agent_buffer_trajectory[BufferKey.CRITIC_MEMORY].set(value_memories)
 for name, v in value_estimates.items():
     agent_buffer_trajectory[RewardSignalUtil.value_estimates_key(name)].extend(
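
The `if value_memories is not None:` guard in this change can be illustrated with a small standalone sketch. It assumes that, without an LSTM, no critic memories are produced and the value is `None`. `Field` and `store_critic_memory` are hypothetical stand-ins written for this example, not the ML-Agents classes, and the slice-assignment behaviour of `Field.set` is an assumption about how such a buffer field might work.

```python
from typing import Dict, List, Optional


class Field(list):
    """Hypothetical stand-in for a buffer field; not the ML-Agents class."""

    def set(self, data: List[float]) -> None:
        # Slice assignment needs an iterable, so passing None raises TypeError.
        self[:] = data


def store_critic_memory(
    buffer: Dict[str, Field], value_memories: Optional[List[float]]
) -> None:
    # Guarded write, mirroring the check added in this commit.
    if value_memories is not None:
        buffer["critic_memory"].set(value_memories)


buffer = {"critic_memory": Field()}

# Non-recurrent case (assumed to yield None): the field is left untouched.
store_critic_memory(buffer, None)
assert buffer["critic_memory"] == []

# Recurrent case: the memories are stored as before.
store_critic_memory(buffer, [0.1, 0.2])
assert buffer["critic_memory"] == [0.1, 0.2]
```

In this sketch, calling `Field().set(None)` directly raises `TypeError`, which is the kind of failure an unguarded write could hit when no memories exist.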
