
Documentation tweaks and updates (#1479)

* Add blurb about using the --load flag in the intro guide, and typo fix.

* Add section in tutorial to create multiple area learning environment.

* Add mention of Done() method in agent design
Branch: develop-generalizationTraining-TrainerController
Commit: 5a29fd25 (GitHub, 6 years ago)
4 files changed, 56 insertions(+), 8 deletions(-)
  1. docs/Getting-Started-with-Balance-Ball.md (4 changes)
  2. docs/Learning-Environment-Create-New.md (41 changes)
  3. docs/Learning-Environment-Design-Agents.md (4 changes)
  4. docs/Learning-Environment-Design.md (15 changes)

docs/Getting-Started-with-Balance-Ball.md (4 changes)


follow the instructions in
[Using an Executable](Learning-Environment-Executable.md).
**Note**: Re-running this command will start training from scratch again. To resume
a previous training run, append the `--load` flag and give the same `--run-id` as the
run you want to resume.
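For example, a first run and a later resume of that same run might look like this (the config path and run ID here are illustrative, not prescribed by this guide):

```sh
# Start a fresh training run:
mlagents-learn config/trainer_config.yaml --run-id=RollerBall-1 --train

# Resume that run later, instead of starting over from scratch:
mlagents-learn config/trainer_config.yaml --run-id=RollerBall-1 --train --load
```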
### Observing Training Progress
Once you start training using `mlagents-learn` in the way described in the

docs/Learning-Environment-Create-New.md (41 changes)


The default settings for the Academy properties are also fine for this
environment, so we don't need to change anything for the RollerAcademy component
in the Inspector window. You may not have the RollerBrain in the Broadcast Hub yet;
more on that later.
![The Academy properties](images/mlagents-NewTutAcademy.png)

you pass to the `mlagents-learn` command for each training run. If you use
the same id value, the statistics for multiple runs are combined and become
difficult to interpret.
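For instance (run IDs illustrative), two separate experiments should each get their own ID so their statistics stay separate:

```sh
mlagents-learn config/trainer_config.yaml --run-id=RollerBall-1 --train
mlagents-learn config/trainer_config.yaml --run-id=RollerBall-2 --train
```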
## Optional: Multiple Training Areas within the Same Scene
Many of the [example environments](Learning-Environment-Examples.md) instantiate multiple
copies of the training area in the scene. This generally speeds up training by allowing the
environment to gather many experiences in parallel. You can achieve this simply by
instantiating many Agents that share the same Brain. Use the following steps to
parallelize your RollerBall environment.
### Instantiating Multiple Training Areas
1. Right-click in the scene Hierarchy window and create a new empty GameObject.
Name it TrainingArea.
2. Reset the TrainingArea’s Transform so that it is at (0,0,0) with Rotation (0,0,0)
and Scale (1,1,1).
3. Drag the Floor, Target, and RollerAgent GameObjects in the Hierarchy into the
TrainingArea GameObject.
4. Drag the TrainingArea GameObject, along with its attached GameObjects, into your
Assets browser, turning it into a prefab.
5. You can now instantiate copies of the TrainingArea prefab. Drag them into your scene,
positioning them so that they do not overlap.
### Editing the Scripts
You will notice that in the previous section, we wrote our scripts assuming that our
TrainingArea was at (0,0,0), performing checks such as `this.transform.position.y < 0`
to determine whether our agent has fallen off the platform. We will need to change
this if we are to use multiple TrainingAreas throughout the scene.
A quick way to adapt the current code is to use `localPosition` rather than `position`, so
that positions are measured relative to the TrainingArea prefab's origin rather than in
global coordinates (a sketch follows the steps below).
1. Replace all references to `this.transform.position` in RollerAgent.cs with `this.transform.localPosition`.
2. Replace all references to `Target.position` in RollerAgent.cs with `Target.localPosition`.
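As a minimal sketch, here is how the reset logic might look after these replacements, assuming a RollerAgent along the lines of the one built earlier in this tutorial (field names and values follow that example; the `MLAgents` namespace matches toolkit versions of this era, so adjust to your version):

```csharp
using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    public Transform Target;
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void AgentReset()
    {
        if (this.transform.localPosition.y < 0)
        {
            // The agent fell off its platform: zero its momentum and
            // put it back at the center of its own TrainingArea.
            rBody.angularVelocity = Vector3.zero;
            rBody.velocity = Vector3.zero;
            this.transform.localPosition = new Vector3(0, 0.5f, 0);
        }

        // Move the target to a new spot, relative to this TrainingArea.
        Target.localPosition = new Vector3(Random.value * 8 - 4,
                                           0.5f,
                                           Random.value * 8 - 4);
    }
}
```

Because every position is now local to the TrainingArea, each prefab copy behaves identically no matter where it sits in the scene.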
This is only one way to achieve this objective. Refer to the
[example environments](Learning-Environment-Examples.md) for other ways we can achieve relative positioning.
## Review: Scene Layout

docs/Learning-Environment-Design-Agents.md (4 changes)


The `Ball3DAgent` also assigns a negative penalty when the ball falls off the
platform.
Note that all of these environments make use of the `Done()` method, which manually
terminates an episode when a termination condition is reached. This can be
called independently of the `Max Step` property.
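As an illustrative sketch (not taken verbatim from these environments), an agent might detect a terminal condition and call `Done()` inside its action handler; the `AgentAction` signature shown matches toolkit versions of this era:

```csharp
public override void AgentAction(float[] vectorAction, string textAction)
{
    // ... apply the actions and assign step rewards here ...

    // Manually end the episode when a termination condition is reached.
    // This works whether or not Max Step is set.
    if (transform.localPosition.y < 0)
    {
        AddReward(-1.0f);  // e.g. penalize falling off the platform
        Done();
    }
}
```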
## Agent Properties
![Agent Inspector](images/agent.png)

docs/Learning-Environment-Design.md (15 changes)


You must also determine how an Agent finishes its task or times out. You can
manually set an Agent to done in your `AgentAction()` function when the Agent
has finished (or irrevocably failed) its task by calling the `Done()` function.
You can also set the Agent's `Max Steps` property to a positive value and the
Agent will consider itself done after it has taken that many steps. When the
Academy reaches its own `Max Steps` count, it starts the next episode. If you
set an Agent's `ResetOnDone` property to true, then the Agent can attempt its
task several times in one episode. (Use the `Agent.AgentReset()` function to
prepare the Agent to start again.)
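To make that lifecycle concrete, here is a minimal hypothetical sketch (the class and its fields are invented for illustration); `Max Steps` and `Reset On Done` themselves are set on the Agent in the Inspector rather than in code:

```csharp
using UnityEngine;
using MLAgents;

// Hypothetical agent: a ball that must stay on a platform.
public class PlatformAgent : Agent
{
    Vector3 startPosition;

    void Start()
    {
        startPosition = transform.localPosition;
    }

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // ... apply actions and assign rewards ...

        // Mark the attempt finished when the agent falls off.
        if (transform.localPosition.y < 0)
        {
            Done();
        }
    }

    public override void AgentReset()
    {
        // With Reset On Done checked in the Inspector, this runs after each
        // Done() call, so the agent can start another attempt within the
        // same Academy episode.
        transform.localPosition = startPosition;
    }
}
```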
about programming your own Agents.
## Environments
