
documentation touchups (#4099)

* doc updates

getting started page now uses consistent run-id

re-order create-new docs to have less back/forth between unity and text editor

* add link explaining decisions where we tell the reader to modify its parameter
/MLA-1734-demo-provider
GitHub 5 years ago
Current commit
626581a0
2 files changed, 31 insertions and 29 deletions
  1. docs/Getting-Started.md (2 lines changed)
  2. docs/Learning-Environment-Create-New.md (58 lines changed)

2
docs/Getting-Started.md


run the same command again, appending the `--resume` flag:
```sh
- mlagents-learn config/ppo/3DBall.yaml --run-id=firstRun --resume
+ mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --resume
```
Your trained model will be at `results/<run-identifier>/<behavior_name>.nn` where
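
As a minimal sketch of how the path template above is filled in, the snippet below builds the model path from the run-id used in the updated command. The behavior name `3DBall` is an assumption for illustration; the actual value comes from the Behavior Parameters of the Agent in your scene.

```sh
# Assumed values for illustration only: the run-id from the command above
# and a behavior name of "3DBall" (check the Behavior Parameters in your scene).
run_id="first3DBallRun"
behavior_name="3DBall"

# Fill in the results/<run-identifier>/<behavior_name>.nn template.
model_path="results/${run_id}/${behavior_name}.nn"
echo "${model_path}"
```

Keeping the same `--run-id` across the initial run and the `--resume` run is what makes both commands read from and write to the same results directory.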

58
docs/Learning-Environment-Create-New.md


Vector3 controlSignal = Vector3.zero;
controlSignal.x = action[0];
controlSignal.z = action[1];
- rBody.AddForce(controlSignal * speed);
+ rBody.AddForce(controlSignal * forceMultiplier);
```
#### Rewards

`OnActionReceived()` function looks like:
```csharp
- public float speed = 10;
+ public float forceMultiplier = 10;
public override void OnActionReceived(float[] vectorAction)
{
// Actions, size = 2

- rBody.AddForce(controlSignal * speed);
+ rBody.AddForce(controlSignal * forceMultiplier);
// Rewards
float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);

}
```
- Note the `speed` class variable is defined before the function. Since `speed` is
+ Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
- ## Final Editor Setup
- Now, that all the GameObjects and ML-Agent components are in place, it is time
- to connect everything together in the Unity Editor. This involves changing some
- of the Agent Component's properties so that they are compatible with our Agent
- code.
- 1. Select the **RollerAgent** GameObject to show its properties in the Inspector
- window.
- 1. Add the `Decision Requester` script with the Add Component button from the
- RollerAgent Inspector.
- 1. Change **Decision Period** to `10`.
- 1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
- Target field.
- 1. Add the `Behavior Parameters` script with the Add Component button from the
- RollerAgent Inspector.
- 1. Modify the Behavior Parameters of the Agent :
- - `Behavior Name` to _RollerBall_
- - `Vector Observation` > `Space Size` = 8
- - `Vector Action` > `Space Type` = **Continuous**
- - `Vector Action` > `Space Size` = 2
- Now you are ready to test the environment before training.
## Testing the Environment
It is always a good idea to first test your environment by controlling the Agent

Console window and that the Agent resets when it reaches its target or falls
from the platform.
+ ## Final Editor Setup
+ Now, that all the GameObjects and ML-Agent components are in place, it is time
+ to connect everything together in the Unity Editor. This involves changing some
+ of the Agent Component's properties so that they are compatible with our Agent
+ code.
+ 1. Select the **RollerAgent** GameObject to show its properties in the Inspector
+ window.
+ 1. Add the `Decision Requester` script with the Add Component button from the
+ RollerAgent Inspector.
+ 1. Change **Decision Period** to `10`. For more information on decisions, see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions)
+ 1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
+ Target field.
+ 1. Add the `Behavior Parameters` script with the Add Component button from the
+ RollerAgent Inspector.
+ 1. Modify the Behavior Parameters of the Agent :
+ - `Behavior Name` to _RollerBall_
+ - `Vector Observation` > `Space Size` = 8
+ - `Vector Action` > `Space Type` = **Continuous**
+ - `Vector Action` > `Space Size` = 2
+ Now you are ready to test the environment before training.
## Training the Environment
The process is the same as described in the

time_horizon: 64
summary_freq: 10000
```
Hyperparameters are explained in [the training configuration file documentation](Training-Configuration-File.md)
Since this example creates a very simple training environment with only a few
inputs and outputs, using small batch and buffer sizes speeds up the training
