Use correct dimensions of gradient

/develop/gail-norm
Ervin Teng 4 years ago
Current commit
362f2ec0
1 file changed, 8 insertions and 4 deletions
  1. ml-agents/mlagents/trainers/torch/components/reward_providers/gail_reward_provider.py (12 lines changed)



 use_vail_noise = True
 z_mu = self._z_mu_layer(hidden)
 hidden = torch.normal(z_mu, self._z_sigma * use_vail_noise)
-estimate = self._estimator(hidden).squeeze(1).sum()
-gradient = torch.autograd.grad(estimate, encoder_input, create_graph=True)[0]
-# print(torch.sum(gradient ** 2, dim=1))
+estimate = self._estimator(hidden).squeeze(1)
+gradient = torch.autograd.grad(
+    estimate,
+    encoder_input,
+    grad_outputs=torch.ones(estimate.shape),
+    create_graph=True,
+)[0]
-safe_norm = (torch.sum(gradient ** 2, dim=1) + self.EPSILON).sqrt()
+safe_norm = (torch.sum(torch.pow(gradient, 2), dim=1) + self.EPSILON).sqrt()
 gradient_mag = torch.mean((safe_norm - 1) ** 2)
 return gradient_mag
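
For readers who want to try the new form of the gradient computation outside the trainer, here is a minimal standalone sketch of the same penalty. It is an illustration under stated assumptions, not the ml-agents implementation: `gradient_penalty`, `toy_estimator`, and the module-level `EPSILON` value are hypothetical stand-ins, the VAIL noise step is omitted, and `torch.ones_like` is used in place of `torch.ones(estimate.shape)` so that `grad_outputs` lands on the same device and dtype as `estimate`.

```python
import torch
from torch import nn

EPSILON = 1e-7  # hypothetical stand-in for self.EPSILON in the provider


def gradient_penalty(estimator: nn.Module, encoder_input: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of the per-sample input-gradient norm from 1."""
    # Work on a leaf copy that tracks gradients with respect to the input.
    encoder_input = encoder_input.detach().clone().requires_grad_(True)
    estimate = estimator(encoder_input).squeeze(1)  # shape: (batch,)
    # A non-scalar output needs grad_outputs; a tensor of ones weights
    # every sample's estimate equally.
    gradient = torch.autograd.grad(
        estimate,
        encoder_input,
        grad_outputs=torch.ones_like(estimate),
        create_graph=True,  # keep the graph so the penalty itself is differentiable
    )[0]  # shape: (batch, input_dim)
    # EPSILON keeps sqrt differentiable when a gradient row is all zeros.
    safe_norm = (torch.sum(torch.pow(gradient, 2), dim=1) + EPSILON).sqrt()
    return torch.mean((safe_norm - 1) ** 2)


torch.manual_seed(0)
# Toy estimator: 4 input features -> 1 scalar estimate per sample.
toy_estimator = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
batch = torch.randn(5, 4)
penalty = gradient_penalty(toy_estimator, batch)
penalty.backward()  # works because create_graph=True was set above
```

The norm is taken along `dim=1`, so each sample in the batch contributes its own gradient magnitude before the mean is computed.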