
Updating curriculum learning docs.

- Docs updated for multi-brain curriculums.
Deric Pang, 6 years ago
1 file changed, with 39 insertions and 12 deletions


accomplish the task. From there, we can slowly add to the difficulty of the task by
increasing the size of the wall, until the agent can complete the initially
near-impossible task of scaling the wall. We are including just such an environment with
the ML-Agents toolkit 0.2, called __Wall Jump__.

## How-To
Each Brain in an environment can have a corresponding curriculum. These
curriculums are held in what we call a metacurriculum. A metacurriculum allows
different Brains to follow different curriculums within the same environment.
### Specifying a Metacurriculum
We first create a folder inside `python/curricula/` for the environment we want
to use curriculum learning with. For example, if we were creating a metacurriculum
for Wall Jump, we would create the folder `python/curricula/wall-jump/`. We will place
our curriculums inside this folder.
### Specifying a Curriculum
parameters of the environment will vary. In the case of the Wall Jump environment, what
than adjusting it by hand, we will create a simple JSON file which describes the
structure of the curriculum. Within it, we can specify which points in the training process
Below is an example curriculum for the BigWallBrain in the Wall Jump environment.

    "parameters" :
    {
        "big_wall_min_height" : [0.0, 4.0, 6.0, 8.0],
        "big_wall_max_height" : [4.0, 7.0, 8.0, 8.0]
    }

* If `true`, weighting will be 0.75 (new) 0.25 (old).
* `parameters` (dictionary of key:string, value:float array) - Corresponds to academy reset parameters to control. Length of each array
should be one greater than number of thresholds.
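
The length rule above can be checked before training. The following is a minimal sketch, not part of the ML-Agents toolkit: the helper names `check_curriculum` and `lesson_parameters` are hypothetical, and the `thresholds` key name and its values are assumed for illustration.

```python
import json

# Hypothetical curriculum text. The "thresholds" key and its values are
# assumptions for illustration; the parameter arrays mirror the example above.
CURRICULUM_JSON = """
{
    "thresholds": [0.1, 0.3, 0.5],
    "parameters": {
        "big_wall_min_height": [0.0, 4.0, 6.0, 8.0],
        "big_wall_max_height": [4.0, 7.0, 8.0, 8.0]
    }
}
"""


def check_curriculum(curriculum):
    """Return names of parameters whose array length is not len(thresholds) + 1."""
    expected = len(curriculum["thresholds"]) + 1
    return [
        name
        for name, values in curriculum["parameters"].items()
        if len(values) != expected
    ]


def lesson_parameters(curriculum, lesson):
    """Pick each reset parameter's value for a zero-based lesson index."""
    return {
        name: values[lesson]
        for name, values in curriculum["parameters"].items()
    }


curriculum = json.loads(CURRICULUM_JSON)
bad = check_curriculum(curriculum)          # empty list when every array has the right length
first = lesson_parameters(curriculum, 0)    # values used before any threshold is passed
last = lesson_parameters(curriculum, 3)     # values used after the final threshold
```

With three thresholds there are four lessons, so each array needs four entries; a shorter array would show up by name in `bad`.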
Once our curriculum is defined, we have to use the reset parameters we defined and modify
the environment from the agent's `AgentReset()` function. See
for an example.
We will save this file into our metacurriculum folder with the name of its
corresponding Brain. For example, in the Wall Jump environment, there are
two Brains, BigWallBrain and SmallWallBrain. If we want to define a
curriculum for the BigWallBrain, we will save `BigWallBrain.json` into
### Training with a Curriculum
Once we have specified our metacurriculum and curriculums, we can launch `learn.py` using the `--curriculum`
flag to point to the metacurriculum folder, and PPO will train using curriculum learning. For example,
to train agents in the Wall Jump environment with curriculum learning, we can run
`python learn.py --curriculum=curricula/wall-jump/ --run-id=wall-jump-curriculum --train`.
We can then keep track of each Brain's current lesson and progress via TensorBoard.
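
To illustrate the one-file-per-Brain convention, here is a minimal sketch of how a metacurriculum folder could be read into a mapping from Brain name to curriculum. It is an assumption-laden illustration, not the loader that `learn.py` uses: `load_metacurriculum` is a hypothetical helper, and a temporary folder stands in for `python/curricula/wall-jump/`.

```python
import json
import tempfile
from pathlib import Path


def load_metacurriculum(folder):
    """Map each Brain name (the JSON file name without extension) to its curriculum."""
    return {
        path.stem: json.loads(path.read_text())
        for path in sorted(Path(folder).glob("*.json"))
    }


# Demo: a throwaway folder stands in for the real metacurriculum folder.
with tempfile.TemporaryDirectory() as tmp:
    example = {"parameters": {"big_wall_min_height": [0.0, 4.0, 6.0, 8.0]}}
    (Path(tmp) / "BigWallBrain.json").write_text(json.dumps(example))
    metacurriculum = load_metacurriculum(tmp)

brains = list(metacurriculum)  # Brain names found in the folder
```

Keying the mapping on the file name is what would let a Brain without a JSON file simply train without a curriculum in this sketch.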