use target net

/develop/coma-noact
Andrew Cohen, 4 years ago
Current commit
6b2a6c5f
1 file changed, with 1 insertion and 1 deletion
ml-agents/mlagents/trainers/ppo/optimizer_torch.py (2 lines changed)


 self.optimizer.step()
 ModelUtils.soft_update(
-    self.policy.actor_critic.critic, self.policy.actor_critic.target, 1.0
+    self.policy.actor_critic.critic, self.policy.actor_critic.target, 0.005
 )
 update_stats = {
     # NOTE: abs() is not technically correct, but matches the behavior in TensorFlow.
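
The change above lowers the last argument of `ModelUtils.soft_update` from 1.0 to 0.005. Assuming that argument is the usual Polyak averaging coefficient (tau), the sketch below shows the effect on plain Python lists: tau = 1.0 copies the critic into the target on every step, while tau = 0.005 lets the target trail the critic slowly. `soft_update_sketch` is a hypothetical stand-in written for illustration, not the ML-Agents implementation.

```python
from typing import List


def soft_update_sketch(source: List[float], target: List[float], tau: float) -> None:
    """Hypothetical helper: blend source into target in place.

    Assumed rule: target = (1 - tau) * target + tau * source.
    """
    for i, (s, t) in enumerate(zip(source, target)):
        target[i] = (1.0 - tau) * t + tau * s


critic = [1.0, 2.0]       # stand-in for the critic's parameters
hard_target = [0.0, 0.0]  # updated with tau = 1.0 (old value in the diff)
soft_target = [0.0, 0.0]  # updated with tau = 0.005 (new value in the diff)

soft_update_sketch(critic, hard_target, 1.0)    # target becomes a copy of the critic
soft_update_sketch(critic, soft_target, 0.005)  # target moves 0.5% of the way to the critic
```

With tau = 1.0 the target is never distinct from the critic, which would explain the commit message "use target net": only a small tau gives a separate, slowly moving target.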
