
Use resamp policy for SAC

/develop/nopreviousactions
Ervin Teng 4 years ago
Commit
b21b3d5c
1 file changed, with 1 insertion and 0 deletions
  1. ml-agents/mlagents/trainers/sac/trainer.py (1 line changed)

1
ml-agents/mlagents/trainers/sac/trainer.py


    self.is_training,
    self.load,
    tanh_squash=True,
    resample=True,
)
for _reward_signal in policy.reward_signals.keys():
    self.collected_rewards[_reward_signal] = defaultdict(lambda: 0)
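
The hunk above does not say what `tanh_squash=True` and `resample=True` do inside the policy. Assuming they refer to the common SAC practice of drawing actions by reparameterized sampling from a Gaussian and squashing them through `tanh`, a minimal standard-library sketch might look like the following. This is not ML-Agents code: `sample_squashed_action` and its parameters are hypothetical names used only for illustration.

```python
import math
import random


def sample_squashed_action(mean, log_std, rng):
    """Draw one tanh-squashed Gaussian action and its log-probability.

    Hypothetical helper for illustration only; not part of ML-Agents.
    `mean` and `log_std` are per-dimension lists of floats.
    """
    action = []
    log_prob = 0.0
    for mu, ls in zip(mean, log_std):
        std = math.exp(ls)
        eps = rng.gauss(0.0, 1.0)  # noise that does not depend on mu or std
        u = mu + std * eps         # reparameterized pre-squash sample
        a = math.tanh(u)           # squash into (-1, 1)
        # Gaussian log-density of u, written in terms of eps.
        log_prob += -0.5 * eps * eps - ls - 0.5 * math.log(2.0 * math.pi)
        # Change-of-variables correction for the tanh squashing.
        log_prob -= math.log(1.0 - a * a + 1e-6)
        action.append(a)
    return action, log_prob


# Usage: a two-dimensional action with a fixed seed for repeatability.
rng = random.Random(0)
action, log_prob = sample_squashed_action([0.0, 0.5], [-1.0, 0.0], rng)
```

Writing the sample as `mu + std * eps` keeps the randomness in `eps`, so in an autodiff framework gradients can flow through `mu` and `std`; the `tanh` correction term keeps the log-probability consistent with the bounded action.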
