Don't flatten when there are multiple continuous actions

/develop/add-fire/continuous
Ervin Teng, 4 years ago
Current commit
f8b40b9b
1 file changed, with 1 insertion and 1 deletion
  1. 2
      ml-agents/mlagents/trainers/ppo/optimizer_torch.py

2
ml-agents/mlagents/trainers/ppo/optimizer_torch.py


             torch.clamp(r_theta, 1.0 - decay_epsilon, 1.0 + decay_epsilon) * advantage
         )
         policy_loss = -1 * ModelUtils.masked_mean(
-            torch.min(p_opt_a, p_opt_b).flatten(), loss_masks
+            torch.min(p_opt_a, p_opt_b), loss_masks
         )
         return policy_loss
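
One plausible reading of this change, sketched below under stated assumptions: with several continuous actions the clipped surrogate has shape (batch, num_actions), so flattening it produces batch * num_actions entries that no longer line up with a per-step mask of length batch. The helper `masked_mean_sketch` is a hypothetical stand-in written for this illustration, not the actual `ModelUtils.masked_mean`, and the tensor shapes and values are invented for the example.

```python
import torch


def masked_mean_sketch(tensor: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for ModelUtils.masked_mean (an assumption, not the real code).

    tensor: shape (batch,) or (batch, num_actions)
    masks:  shape (batch,), 1.0 for steps to keep and 0.0 for steps to ignore
    """
    if tensor.dim() == 1:
        tensor = tensor.unsqueeze(-1)
    # (batch, 1) broadcasts across the action axis of (batch, num_actions).
    m = masks.float().unsqueeze(-1)
    weighted_sum = (tensor * m).sum()
    # Count every kept element; clamp so an all-zero mask does not divide by zero.
    kept = torch.clamp(m.expand_as(tensor).sum(), min=1.0)
    return weighted_sum / kept


batch, num_actions = 4, 3
# Stand-in for torch.min(p_opt_a, p_opt_b) with three continuous actions.
surrogate = torch.arange(batch * num_actions, dtype=torch.float32).reshape(
    batch, num_actions
)
loss_masks = torch.tensor([1.0, 1.0, 0.0, 1.0])

# Old behaviour: flatten() yields 12 entries, which cannot pair with 4 mask values.
flat = surrogate.flatten()
flat_shape = tuple(flat.shape)
try:
    flat * loss_masks
    flatten_broadcasts = True
except RuntimeError:
    flatten_broadcasts = False

# New behaviour: keep the (batch, num_actions) shape and mask per step.
policy_loss = -1 * masked_mean_sketch(surrogate, loss_masks)
```

With a single continuous action the flattened tensor has exactly batch entries, which would explain why the mismatch only shows up once there are multiple actions.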
