
push block

/develop/bisim-sac-transfer
yanchaosun 4 years ago
Current commit
db30f918
5 files changed, with 2067 insertions and 0 deletions
  1. Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockAreaMore.prefab (1001 lines changed)
  2. Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockAreaMore.prefab.meta (10 lines changed)
  3. Project/Assets/ML-Agents/Examples/PushBlock/Scenes/PushBlockMore.unity (1001 lines changed)
  4. Project/Assets/ML-Agents/Examples/PushBlock/Scenes/PushBlockMore.unity.meta (9 lines changed)
  5. config/sac_transfer/PushBlock.yaml (46 lines changed)

1001
Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockAreaMore.prefab
File diff suppressed because it is too large to display.

10
Project/Assets/ML-Agents/Examples/PushBlock/Prefabs/PushBlockAreaMore.prefab.meta


fileFormatVersion: 2
guid: da337f7fc698849e6aeafa8eb5b12a68
timeCreated: 1515023875
licenseType: Free
NativeFormatImporter:
  externalObjects: {}
  mainObjectFileID: 100100000
  userData:
  assetBundleName:
  assetBundleVariant:

1001
Project/Assets/ML-Agents/Examples/PushBlock/Scenes/PushBlockMore.unity
File diff suppressed because it is too large to display.

9
Project/Assets/ML-Agents/Examples/PushBlock/Scenes/PushBlockMore.unity.meta


fileFormatVersion: 2
guid: 209085e3ed5324cea9b8624358607d13
timeCreated: 1506808980
licenseType: Pro
DefaultImporter:
  externalObjects: {}
  userData:
  assetBundleName:
  assetBundleVariant:

46
config/sac_transfer/PushBlock.yaml


behaviors:
  PushBlock:
    trainer_type: sac_transfer
    hyperparameters:
      learning_rate: 0.0003
      learning_rate_schedule: constant
      batch_size: 128
      buffer_size: 50000
      buffer_init_steps: 0
      tau: 0.005
      steps_per_update: 10.0
      save_replay_buffer: false
      init_entcoef: 0.05
      reward_signal_steps_per_update: 10.0
      encoder_layers: 2
      policy_layers: 2
      forward_layers: 0
      value_layers: 2
      action_layers: 2
      feature_size: 128
      action_feature_size: 64
      separate_policy_train: true
      separate_policy_net: true
      separate_model_train: true
      reuse_encoder: true
      in_epoch_alter: false
      in_batch_alter: true
      use_op_buffer: false
      use_var_predict: true
      with_prior: false
      predict_return: true
      use_bisim: false
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 2000000
    time_horizon: 64
    summary_freq: 100000
    threaded: true
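
A few of the values above (tau, gamma, steps_per_update, max_steps) have well-known meanings in the standard ML-Agents SAC trainer. The sketch below shows them in plain Python. It assumes the sac_transfer trainer on this branch keeps the standard SAC semantics, which this diff does not show. The function names are hypothetical and are not part of ML-Agents.

```python
# Values copied from config/sac_transfer/PushBlock.yaml.
TAU = 0.005
GAMMA = 0.99
MAX_STEPS = 2_000_000
STEPS_PER_UPDATE = 10.0


def soft_update(target, source, tau=TAU):
    """Polyak averaging: move each target weight a fraction tau toward the source."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]


def discounted_return(rewards, gamma=GAMMA):
    """Sum of rewards weighted by gamma**t, accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def approx_policy_updates(max_steps=MAX_STEPS, steps_per_update=STEPS_PER_UPDATE):
    """Rough update count implied by the step budget.

    Assumption: one update per steps_per_update agent steps on average;
    warm-up steps and other trainer details are ignored.
    """
    return int(max_steps / steps_per_update)


if __name__ == "__main__":
    print(soft_update([0.0, 1.0], [1.0, 1.0]))
    print(discounted_return([1.0, 1.0, 1.0]))
    print(approx_policy_updates())
```

With tau this small the target network trails the online network slowly, and with steps_per_update at 10.0 the 2,000,000-step budget corresponds to roughly 200,000 updates under the assumption stated in the code.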