Remove outdated comment

/develop/removeactionholder-onehot
Ervin Teng, 5 years ago
Current commit
23088088
1 file changed, with 1 insertion and 1 deletion
  1. ml-agents/mlagents/trainers/sac/optimizer.py (2 changes)

 else:
     self.output_pre = self.policy_network.output_pre
-# Don't use value estimate during inference. TODO: Check why PPO uses value_estimate in inference.
+# Don't use value estimate during inference.
 self.value = tf.identity(
     self.policy_network.value, name="value_estimate_unused"
 )
