yanchaosun
4 years ago
Current commit
d7402406
8 files changed, with 191 insertions and 0 deletions
ml-agents/mlagents/trainers/ppo_transfer/optimizer.py (+4)
ml-agents/mlagents/trainers/settings.py (+1)
config/ppo_transfer/CrawlerStatic128.yaml (+32)
config/ppo_transfer/CrawlerStatic128_256.yaml (+32)
config/ppo_transfer/CrawlerStatic256.yaml (+32)
config/ppo_transfer/CrawlerStatic256ppo.yaml (+26)
config/ppo_transfer/CrawlerStatic32_256.yaml (+32)
config/ppo_transfer/CrawlerStatic64_256.yaml (+32)

config/ppo_transfer/CrawlerStatic128.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo_transfer
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      encoder_layers: 2
      policy_layers: 2
      value_layers: 2
      feature_size: 128
      reuse_encoder: true
      in_epoch_alter: true
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true

config/ppo_transfer/CrawlerStatic128_256.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo_transfer
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      encoder_layers: 2
      policy_layers: 2
      value_layers: 2
      feature_size: 128
      reuse_encoder: true
      in_epoch_alter: true
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true

config/ppo_transfer/CrawlerStatic256.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo_transfer
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      encoder_layers: 2
      policy_layers: 2
      value_layers: 2
      feature_size: 256
      reuse_encoder: true
      in_epoch_alter: true
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true

config/ppo_transfer/CrawlerStatic256ppo.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true

config/ppo_transfer/CrawlerStatic32_256.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo_transfer
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      encoder_layers: 2
      policy_layers: 2
      value_layers: 2
      feature_size: 32
      reuse_encoder: true
      in_epoch_alter: true
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true

config/ppo_transfer/CrawlerStatic64_256.yaml

behaviors:
  CrawlerStatic:
    trainer_type: ppo_transfer
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      encoder_layers: 2
      policy_layers: 2
      value_layers: 2
      feature_size: 64
      reuse_encoder: true
      in_epoch_alter: true
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    threaded: true
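
The six configs in this commit appear to differ only in `trainer_type`, `feature_size`, and `hidden_units`; every other value is the same. As a rough illustration, the Python sketch below rebuilds each variant from a shared base. It is not part of the commit. `build_config`, `VARIANTS`, and the two constant dictionaries are hypothetical names, and the nesting of the keys follows the usual ML-Agents trainer config layout, which is an assumption here.

```python
from copy import deepcopy

# Hyperparameters shared by all six configs (values copied from the listings).
BASE_HYPERPARAMETERS = {
    "batch_size": 2024,
    "buffer_size": 20240,
    "learning_rate": 0.0003,
    "beta": 0.005,
    "epsilon": 0.2,
    "lambd": 0.95,
    "num_epoch": 3,
    "learning_rate_schedule": "linear",
}

# Extra hyperparameters that appear only in the ppo_transfer configs.
TRANSFER_HYPERPARAMETERS = {
    "encoder_layers": 2,
    "policy_layers": 2,
    "value_layers": 2,
    "reuse_encoder": True,
    "in_epoch_alter": True,
}


def build_config(hidden_units, feature_size=None):
    """Return one config as a nested dict.

    Hypothetical helper: feature_size=None yields the plain ppo baseline,
    any integer yields a ppo_transfer config.
    """
    hyperparameters = deepcopy(BASE_HYPERPARAMETERS)
    if feature_size is not None:
        hyperparameters.update(TRANSFER_HYPERPARAMETERS)
        hyperparameters["feature_size"] = feature_size
    return {
        "behaviors": {
            "CrawlerStatic": {
                "trainer_type": "ppo" if feature_size is None else "ppo_transfer",
                "hyperparameters": hyperparameters,
                "network_settings": {
                    "normalize": True,
                    "hidden_units": hidden_units,
                    "num_layers": 3,
                    "vis_encode_type": "simple",
                },
                "reward_signals": {"extrinsic": {"gamma": 0.995, "strength": 1.0}},
                "keep_checkpoints": 5,
                "max_steps": 10000000,
                "time_horizon": 1000,
                "summary_freq": 30000,
                "threaded": True,
            }
        }
    }


# (feature_size, hidden_units) for each listing, in the order shown above.
VARIANTS = [(128, 512), (128, 256), (256, 512), (None, 256), (32, 256), (64, 256)]
CONFIGS = [build_config(hidden, feature) for feature, hidden in VARIANTS]
```

Writing the variants this way makes it easier to see that the commit is a sweep over encoder feature size and hidden width, with one plain `ppo` run as a baseline.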