
Merge branch 'self-play-mutex' into soccer-2v1

/asymm-envs
Andrew Cohen, 5 years ago
Current commit
ddba3aa7
1 changed file with 1 addition and 1 deletion
  1. docs/Training-Self-Play.md (2 changes)

docs/Training-Self-Play.md
  In a proper training run, the ELO of the agent should steadily increase. The absolute value of the ELO is less important than the change in ELO over training iterations.
  Note, this implementation will support any number of teams but ELO is only applicable to games with two teams. It is ongoing work to implement
- a reliable metric for measuring progress in these scenarios. These scenarios can still train, though as of now, reward and qualitative observations
+ a reliable metric for measuring progress in scenarios with three or more teams. These scenarios can still train, though as of now, reward and qualitative observations
  are the only metric by which we can judge performance.
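The ELO tracking described in the changed passage follows the standard two-player Elo scheme, which is inherently limited to two teams, hence the clarification in this commit. A minimal sketch of that update rule (this is a generic illustration with hypothetical function names, not the ML-Agents implementation):

```python
def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Standard two-player Elo update; returns the new (rating_a, rating_b).

    score_a is 1.0 if agent A won, 0.5 for a draw, 0.0 if A lost.
    Generic sketch for illustration, not the ML-Agents code.
    """
    # Expected score of A from the logistic Elo curve.
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    # Zero-sum update: A gains exactly what B loses.
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Evenly matched agents: a win moves the winner up by k/2.
new_a, new_b = elo_update(1200.0, 1200.0, 1.0)  # -> (1216.0, 1184.0)
```

Because the expected-score formula compares exactly two ratings and the update is zero-sum between those two, the metric has no direct extension to three or more teams, which is why the docs fall back to reward and qualitative observation there.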