-

/exp-diverse-behavior
vincentpierre 4 years ago
Current commit
8450b154
3 files changed, 8 insertions and 7 deletions
  1. config/ppo/GridWorld.yaml (2 changed lines)
  2. config/sac/Walker.yaml (6 changed lines)
  3. ml-agents/mlagents/trainers/sac/optimizer_torch.py (7 changed lines)

2
config/ppo/GridWorld.yaml


 normalize: false
 hidden_units: 128
 num_layers: 1
-vis_encode_type: simple
+vis_encode_type: fully_connected
 reward_signals:
 extrinsic:
 gamma: 0.9

6
config/sac/Walker.yaml


 reward_signal_steps_per_update: 30.0
 network_settings:
 normalize: true
-hidden_units: 512
-num_layers: 4
+hidden_units: 256
+num_layers: 3
 vis_encode_type: simple
 goal_conditioning_type: none
 reward_signals:

 keep_checkpoints: 5
-max_steps: 150000000
+max_steps: 15000000
 time_horizon: 1000
 summary_freq: 30000

7
ml-agents/mlagents/trainers/sac/optimizer_torch.py


 # gradient_penalty_weight = 10.0
 z_size = 128
 alpha = 0.0005
-mutual_information = 10#0.5
+mutual_information = 100#0.5
 EPSILON = 1e-7
 initial_beta = 0.0

-print("VARIATIONAL : Settings : strength:", self.STRENGTH, " use_actions:", self._use_actions, " mutual_information : ", self.mutual_information)
+sigma_start = 0.5
+print("VARIATIONAL : Settings : strength:", self.STRENGTH, " use_actions:", self._use_actions, " mutual_information : ", self.mutual_information, "Sigma_Start : ", sigma_start)
 # state_encoder_settings = settings
 state_encoder_settings = NetworkSettings(normalize=True, num_layers=1)
 if state_encoder_settings.memory is not None:

 self._encoder = NetworkBody(new_spec, state_encoder_settings)
 self._z_sigma = torch.nn.Parameter(
-torch.ones((self.z_size), dtype=torch.float), requires_grad=True
+sigma_start * torch.ones((self.z_size), dtype=torch.float), requires_grad=True
 )
 # self._z_mu_layer = linear_layer(
 # state_encoder_settings.hidden_units,
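
The last hunk scales the initial value of `_z_sigma` by `sigma_start`. A minimal standalone sketch of that initialization, assuming only PyTorch; `make_z_sigma` is a hypothetical helper name used for illustration and does not appear in the commit:

```python
import torch


def make_z_sigma(z_size: int, sigma_start: float) -> torch.nn.Parameter:
    """Hypothetical helper: learnable vector of length z_size, every entry set to sigma_start."""
    return torch.nn.Parameter(
        sigma_start * torch.ones(z_size, dtype=torch.float), requires_grad=True
    )


# Values taken from the diff: z_size = 128, sigma_start = 0.5.
z_sigma = make_z_sigma(128, 0.5)
```

With `sigma_start = 1.0` the sketch reproduces the removed line, which initialized every entry to one.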
