Self play hyperparameter improvements (#4063)

/MLA-1734-demo-provider
GitHub 4 years ago
Current commit: 91f199cd
5 files changed, with 8 insertions and 8 deletions
  1. config/ppo/SoccerTwos.yaml (2 lines changed)
  2. config/ppo/StrikersVsGoalie.yaml (4 lines changed)
  3. config/ppo/Tennis.yaml (6 lines changed)
  4. docs/Training-ML-Agents.md (2 lines changed)
  5. ml-agents/mlagents/trainers/settings.py (2 lines changed)

config/ppo/SoccerTwos.yaml (2 lines changed)

     self_play:
       save_steps: 50000
       team_change: 200000
-      swap_steps: 50000
+      swap_steps: 2000
       window: 10
       play_against_latest_model_ratio: 0.5
       initial_elo: 1200.0

config/ppo/StrikersVsGoalie.yaml (4 lines changed)

     self_play:
       save_steps: 50000
       team_change: 200000
-      swap_steps: 25000
+      swap_steps: 1000
       window: 10
       play_against_latest_model_ratio: 0.5
       initial_elo: 1200.0

     self_play:
       save_steps: 50000
       team_change: 200000
-      swap_steps: 100000
+      swap_steps: 4000
       window: 10
       play_against_latest_model_ratio: 0.5
       initial_elo: 1200.0

config/ppo/Tennis.yaml (6 lines changed)

   Tennis:
     trainer_type: ppo
     hyperparameters:
-      batch_size: 1024
-      buffer_size: 10240
+      batch_size: 2048
+      buffer_size: 20480
       learning_rate: 0.0003
       beta: 0.005
       epsilon: 0.2

     self_play:
       save_steps: 50000
       team_change: 100000
-      swap_steps: 50000
+      swap_steps: 2000
       window: 10
       play_against_latest_model_ratio: 0.5
       initial_elo: 1200.0
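
To make the size of the `swap_steps` change in the three `config/ppo` files easier to compare, the short Python sketch below computes the old-to-new ratio for each hunk. The numbers are copied from the hunks above; the dictionary keys and the helper name `reduction_factors` are illustrative only, and the two StrikersVsGoalie entries are labeled by position because the behavior names are not visible in the diff.

```python
# (old, new) swap_steps values copied from the config/ppo hunks in this commit.
# Keys are illustrative labels, not identifiers from the repository.
swap_steps_changes = {
    "SoccerTwos.yaml": (50000, 2000),
    "StrikersVsGoalie.yaml (first hunk)": (25000, 1000),
    "StrikersVsGoalie.yaml (second hunk)": (100000, 4000),
    "Tennis.yaml": (50000, 2000),
}


def reduction_factors(changes):
    """Return old / new for each entry as a float."""
    return {name: old / new for name, (old, new) in changes.items()}


factors = reduction_factors(swap_steps_changes)
for name, factor in factors.items():
    print(f"{name}: swap_steps reduced {factor:g}x")
```

Each of the four config values is divided by the same factor, so the relative ordering of `swap_steps` between behaviors is preserved.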

docs/Training-ML-Agents.md (2 lines changed)

       window: 10
       play_against_latest_model_ratio: 0.5
       save_steps: 50000
-      swap_steps: 50000
+      swap_steps: 2000
       team_change: 100000
 ```

ml-agents/mlagents/trainers/settings.py (2 lines changed)

         # Assign team_change to about 4x save_steps
         return self.save_steps * 5
-    swap_steps: int = 10000
+    swap_steps: int = 2000
     window: int = 10
     play_against_latest_model_ratio: float = 0.5
     initial_elo: float = 1200.0
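
As a rough illustration of how the defaults in this hunk fit together, here is a minimal sketch using the standard-library `dataclasses` module. It is not the code from `settings.py`: the class name `SelfPlaySketch`, the use of `None` as a "not set" marker, and the `save_steps` default of 20000 are assumptions made for the example, since the diff does not show them. The sketch follows the code line (`save_steps * 5`), although the comment in the hunk says "about 4x".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelfPlaySketch:
    """Hypothetical stand-in for the self-play settings touched by this commit."""

    save_steps: int = 20000  # assumption: this default is not shown in the diff
    team_change: Optional[int] = None  # None here means "derive from save_steps"
    swap_steps: int = 2000  # new default in this commit (previously 10000)
    window: int = 10
    play_against_latest_model_ratio: float = 0.5
    initial_elo: float = 1200.0

    def __post_init__(self) -> None:
        if self.team_change is None:
            # Mirrors `return self.save_steps * 5` from the hunk above.
            self.team_change = self.save_steps * 5


settings = SelfPlaySketch(save_steps=50000)
print(settings.team_change, settings.swap_steps)
```

An explicitly supplied `team_change`, as in the YAML configs in this commit, takes precedence over the derived value.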
