
Remove epsilon from everywhere

/develop/nopreviousactions
Ervin Teng, 5 years ago
Current commit
a5caf4d6
2 changed files with 1 addition and 10 deletions
  1. ml-agents/mlagents/trainers/components/bc/module.py (6 changes)
  2. ml-agents/mlagents/trainers/sac/optimizer.py (5 changes)

ml-agents/mlagents/trainers/components/bc/module.py (6 changes: 1 addition, 5 deletions)


     self.policy.sequence_length: self.policy.sequence_length,
 }
 feed_dict[self.model.action_in_expert] = mini_batch_demo["actions"]
-if self.policy.brain.vector_action_space_type == "continuous":
-    feed_dict[self.policy.epsilon] = np.random.normal(
-        size=(1, self.policy.act_size[0])
-    )
-else:
+if not self.policy.use_continuous_act:
     feed_dict[self.policy.action_masks] = np.ones(
         (
             self.n_sequences * self.policy.sequence_length,
ml-agents/mlagents/trainers/sac/optimizer.py (5 changes: 5 deletions)


     self.dones_holder = tf.placeholder(
         shape=[None], dtype=tf.float32, name="dones_holder"
     )
-    # This is just a dummy to get BC to work. PPO has this but SAC doesn't.
-    # TODO: Proper input and output specs for models
-    self.epsilon = tf.placeholder(
-        shape=[None, self.policy.act_size[0]], dtype=tf.float32, name="epsilon"
-    )
     if self.policy.use_recurrent:
         self.memory_in = self.policy_network.memory_in
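The deleted placeholder was, per its own comment, only a dummy so BC could feed it; the value fed was a standard-normal sample per continuous action dimension (the feed removed from bc/module.py above). A numpy-only sketch of that now-removed feed, with `act_size` as a hypothetical stand-in for `self.policy.act_size`:

```python
import numpy as np

act_size = [2]  # hypothetical stand-in for self.policy.act_size

# The value the BC module used to feed into the now-deleted
# `epsilon` placeholder: one standard-normal sample per action dimension.
epsilon = np.random.normal(size=(1, act_size[0]))
print(epsilon.shape)  # (1, 2)
```

Dropping both the placeholder and its feed removes dead graph inputs rather than changing any training behavior, since SAC itself never consumed `epsilon`.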
