
Improved SAC hyperparameters for Crawler, Walker (#2635)

* Tweak SAC hyperparams

* Make network bigger

* Properly report entropy

* Revert "Properly report entropy"

This reverts commit 383a8d8f1d60ee0ef4ffd87a59aa08d974616d4e.
/develop-gpu-test
GitHub, 5 years ago
Current commit
aa861bef
1 file changed, with 12 insertions and 4 deletions
  1. config/sac_trainer_config.yaml (16 lines changed)

normalize: true
time_horizon: 1000
batch_size: 256
- train_interval: 3
+ train_interval: 2
buffer_size: 500000
buffer_init_steps: 2000
max_steps: 5e5

hidden_units: 512
reward_signals:
extrinsic:
strength: 1.0
gamma: 0.995
CrawlerDynamicLearning:
normalize: true

summary_freq: 3000
- train_interval: 3
+ train_interval: 2
reward_signals:
extrinsic:
strength: 1.0
gamma: 0.995
WalkerLearning:
normalize: true

max_steps: 2e6
summary_freq: 3000
- num_layers: 3
- train_interval: 3
+ num_layers: 4
+ train_interval: 2
hidden_units: 512
reward_signals:
extrinsic:
