
[Documentation for in Editor Training] (#773)

* [Documentation for in Editor Training]

* [Addressed the comments]

* [Addressing unofficial comments]

* [Addressed comments]

* [Addressed more comments]
GitHub 6 years ago
8 files changed, including 156 insertions and 92 deletions
  1. 84
  2. 24
  3. 2
  4. 1
  5. 13
  6. 6
  7. 14
  8. 104


![Running a pretrained model](images/running-a-pretrained-model.gif)
## Building an Example Environment
## Using the Basics Jupyter Notebook
The first step is to open the Unity scene containing the 3D Balance Ball
The `python/Basics` [Jupyter notebook](Background-Jupyter.md) contains a
simple walkthrough of the functionality of the Python
API. It can also serve as a simple test that your environment is configured
correctly. Within `Basics`, be sure to set `env_name` to the name of the
Unity executable if you want to [use an executable](Learning-Environment-Executable.md) or to `None` if you want to interact with the current scene in the Unity Editor.
1. Launch Unity.
2. On the Projects dialog, choose the **Open** option at the top of the window.
3. Using the file dialog that opens, locate the `unity-environment` folder
within the ML-Agents project and click **Open**.
4. In the **Project** window, navigate to the folder
5. Double-click the `3DBall` file to load the scene containing the Balance
Ball environment.
![3DBall Scene](images/mlagents-Open3DBall.png)
More information and documentation is provided in the
[Python API](Python-API.md) page.
## Training the Brain with Reinforcement Learning
### Setting the Brain to External
Since we are going to build this environment to conduct training, we need to
set the brain used by the agents to **External**. This allows the agents to
communicate with the external training process when making their decisions.

![Set Brain to External](images/mlagents-SetExternalBrain.png)
Next, we want to set up the scene to play correctly when the training process
launches our environment executable. This means:
* The environment application runs in the background
* No dialogs require interaction
* The correct scene loads automatically
1. Open Player Settings (menu: **Edit** > **Project Settings** > **Player**).
2. Under **Resolution and Presentation**:
- Ensure that **Run in Background** is checked.
- Ensure that **Display Resolution Dialog** is set to Disabled.
3. Open the Build Settings window (menu: **File** > **Build Settings**).
4. Choose your target platform.
- (optional) Select “Development Build” to
[log debug messages](https://docs.unity3d.com/Manual/LogFiles.html).
5. If any scenes are shown in the **Scenes in Build** list, make sure that
the 3DBall Scene is the only one checked. (If the list is empty, then only the
current scene is included in the build.)
6. Click **Build**:
- In the File dialog, navigate to the `python` folder in your ML-Agents
- Assign a file name and click **Save**.
- (For Windows) With Unity 2018.1, you are asked to select a folder instead of a file name. Create a subfolder within the `python` folder and select that folder to build. In the following steps you will refer to this subfolder's name as `env_name`.
![Build Window](images/mlagents-BuildWindow.png)
Now that we have a Unity executable containing the simulation environment, we
can perform the training. You can ensure that your environment and the Python
API work as expected by using the `python/Basics`
[Jupyter notebook](Background-Jupyter.md) introduced in the next section.
## Using the Basics Jupyter Notebook
The `python/Basics` [Jupyter notebook](Background-Jupyter.md) contains a
simple walkthrough of the functionality of the Python
API. It can also serve as a simple test that your environment is configured
correctly. Within `Basics`, be sure to set `env_name` to the name of the
Unity executable you built earlier.
More information and documentation is provided in the
[Python API](Python-API.md) page.
## Training the Brain with Reinforcement Learning
### Training the environment
3. Change to the python directory.
4. Run `python3 learn.py <env_name> --run-id=<run-identifier> --train`
3. Change to the `python` directory.
4. Run `python3 learn.py --run-id=<run-identifier> --train`
- `<env_name>` is the name and path to the executable you exported from Unity (without extension)
5. When the message _"Ready to connect with the Editor"_ is displayed on the screen, you can press the :arrow_forward: button in Unity to start training in the Editor.
For example, if you are training with a 3DBall executable you exported to the ml-agents/python directory, run:
python3 learn.py 3DBall --run-id=firstRun --train
**Note**: Alternatively, you can use an executable rather than the Editor to perform training. Please refer to [this page](Learning-Environment-Executable.md) for instructions on how to build and use an executable.
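The Editor and executable forms of the command above differ only in whether `<env_name>` is passed. A minimal sketch of assembling the argument list, assuming a hypothetical helper named `build_learn_command` (it is not part of ML-Agents) and using only the flags shown in this guide:

```python
def build_learn_command(run_id, env_name=None, train=True):
    """Return the argument list for learn.py.

    Hypothetical helper for illustration; not part of ML-Agents.
    """
    command = ["python3", "learn.py"]
    if env_name is not None:
        # Executable training: pass the executable name (without extension).
        command.append(env_name)
    # Without env_name, learn.py waits for the Editor to connect.
    command.append("--run-id=" + run_id)
    if train:
        command.append("--train")
    return command


editor_command = build_learn_command("firstRun")
executable_command = build_learn_command("firstRun", env_name="3DBall")
print(" ".join(editor_command))
print(" ".join(executable_command))
```

The resulting list could be handed to `subprocess.run` from the `python` directory, which avoids shell quoting issues when the executable path contains spaces.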
![Training command example](images/training-command-example.png)

![Training running](images/training-running.png)
You can press Ctrl+C to stop the training, and your trained model will be at `ml-agents/python/models/<run-identifier>/<env_name>_<run-identifier>.bytes`, which corresponds to your model's latest checkpoint. You can now embed this trained model into your internal brain by following the steps below, which are similar to the steps described [above](#play-an-example-environment-using-pretrained-model).
### After training
You can press Ctrl+C to stop the training, and your trained model will be at `ml-agents/python/models/<run-identifier>/editor_<academy_name>_<run-identifier>.bytes`, where `<academy_name>` is the name of the Academy GameObject in the current scene. This file corresponds to your model's latest checkpoint. You can now embed this trained model into your internal brain by following the steps below, which are similar to the steps described [above](#play-an-example-environment-using-pretrained-model).
1. Move your model file into

5. Drag the `<env_name>_<run-identifier>.bytes` file from the Project window of the Editor
to the **Graph Model** placeholder in the **Ball3DBrain** inspector window.
6. Press the Play button at the top of the editor.
6. Press the :arrow_forward: button at the top of the Editor.
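The model file name depends on whether training ran against an executable or in the Editor. The sketch below only restates the two path patterns given in this guide; the helper name `trained_model_path` and the `models_root` default are assumptions made for illustration, not ML-Agents code.

```python
import posixpath


def trained_model_path(run_id, env_name=None, academy_name=None,
                       models_root="ml-agents/python/models"):
    """Return the expected .bytes path for a finished run (hypothetical helper)."""
    if env_name is not None:
        # Executable training: <env_name>_<run-identifier>.bytes
        stem = "{}_{}".format(env_name, run_id)
    elif academy_name is not None:
        # Editor training: editor_<academy_name>_<run-identifier>.bytes
        stem = "editor_{}_{}".format(academy_name, run_id)
    else:
        raise ValueError("pass either env_name or academy_name")
    return posixpath.join(models_root, run_id, stem + ".bytes")


executable_model = trained_model_path("firstRun", env_name="3DBall")
editor_model = trained_model_path("firstRun", academy_name="Ball3DAcademy")
print(executable_model)
print(editor_model)
```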
## Next Steps


drops the ball so that it will reset with a new ball for the next simulation
## Building the Environment
To build the 3D Balance Ball environment, follow the steps in the
[Building an Environment](Basic-Guide.md#building-an-example-environment) section
of the Basic Guide page.
Now that we have a Unity executable containing the simulation environment, we
can perform the training.
Now that we have an environment, we can perform the training.
### Training with PPO

To train the agents within the Ball Balance environment, we will be using the Python
package. We have provided a convenient Python wrapper script called `learn.py`, which accepts arguments used to configure both the training and inference phases.
We will pass to this script the path of the environment executable that we just built. (Optionally) We can
use `run_id` to identify the experiment and create a folder where the model and summary statistics are stored. When
using TensorBoard to observe the training statistics, it helps to set this to a sequential value
We can use `run_id` to identify the experiment and create a folder where the model and summary statistics are stored. When using TensorBoard to observe the training statistics, it helps to set this to a sequential value
for each training run. In other words, "BalanceBall1" for the first run,
"BalanceBall2" for the second, and so on. If you don't, the summaries for
every training run are saved to the same directory and will all be included

python3 learn.py <env_name> --run-id=<run-identifier> --train
python3 learn.py --run-id=<run-identifier> --train
When the message _"Ready to connect with the Editor"_ is displayed on the screen, you can press the :arrow_forward: button in Unity to start training in the Editor.
The `--train` flag tells ML-Agents to run in training mode. `env_name` should be the name of the Unity executable that was just created.
The `--train` flag tells ML-Agents to run in training mode.
**Note**: You can train using an executable rather than the Editor. To do so, follow the instructions in
[Using an Executable](Learning-Environment-Executable.md).
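The sequential run identifiers suggested above ("BalanceBall1", "BalanceBall2", and so on) can be generated instead of tracked by hand. This is a rough sketch under one assumption: each previous run left a folder named after its run identifier inside a models directory. The helper `next_run_id` is hypothetical and not part of ML-Agents.

```python
import os
import re
import tempfile


def next_run_id(prefix, models_dir):
    """Return prefix + (highest existing numeric suffix + 1).

    Hypothetical helper; assumes one sub-folder per previous run id.
    """
    pattern = re.compile(r"^" + re.escape(prefix) + r"(\d+)$")
    highest = 0
    if os.path.isdir(models_dir):
        for name in os.listdir(models_dir):
            match = pattern.match(name)
            if match:
                highest = max(highest, int(match.group(1)))
    return "{}{}".format(prefix, highest + 1)


# Demonstration against a temporary directory standing in for the models folder.
with tempfile.TemporaryDirectory() as models_dir:
    first = next_run_id("BalanceBall", models_dir)
    os.mkdir(os.path.join(models_dir, "BalanceBall1"))
    os.mkdir(os.path.join(models_dir, "BalanceBall2"))
    third = next_run_id("BalanceBall", models_dir)

print(first, third)
```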
### Observing Training Progress


* `worker_id` indicates which port to use for communication with the environment. For use in parallel training regimes such as A3C.
* `seed` indicates the seed to use when generating random numbers during the training process. In environments which do not involve physics calculations, setting the seed enables reproducible experimentation by ensuring that the environment and trainers utilize the same random seed.
If you want to interact directly with the Editor, you need to use `file_name=None`, then press the :arrow_forward: button in the Editor when the message _"Ready to connect with the Editor"_ is displayed on the screen.
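To make the choice between the Editor and an executable explicit, the arguments can be collected in one place before the environment is created. The sketch below is illustrative: `unity_environment_kwargs` is a hypothetical helper, and the default values of `0` for `worker_id` and `seed` are assumptions rather than documented defaults.

```python
def unity_environment_kwargs(env_name=None, worker_id=0, seed=0):
    """Collect keyword arguments for UnityEnvironment (hypothetical helper)."""
    return {
        # None means: connect to the scene currently open in the Editor.
        "file_name": env_name,
        # Selects the port used to talk to the environment.
        "worker_id": worker_id,
        # Seed for random number generation during training.
        "seed": seed,
    }


editor_kwargs = unity_environment_kwargs()
executable_kwargs = unity_environment_kwargs(env_name="3DBall", worker_id=1, seed=42)
print(editor_kwargs)
print(executable_kwargs)

# Usage, assuming the unityagents package is installed:
#   from unityagents import UnityEnvironment
#   env = UnityEnvironment(**editor_kwargs)  # then press Play in the Editor
#   env.close()
```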
## Interacting with a Unity Environment
A BrainInfo object contains the following fields:


* [Brains](Learning-Environment-Design-Brains.md): [Player](Learning-Environment-Design-Player-Brains.md), [Heuristic](Learning-Environment-Design-Heuristic-Brains.md), [Internal & External](Learning-Environment-Design-External-Internal-Brains.md)
* [Learning Environment Best Practices](Learning-Environment-Best-Practices.md)
* [Using the Monitor](Feature-Monitor.md)
* [Using an Executable Environment](Learning-Environment-Executable.md)
* [TensorFlowSharp in Unity (Experimental)](Using-TensorFlow-Sharp-in-Unity.md)
## Training


2. Set the "Teacher" brain to Player mode, and properly configure the inputs to map to the corresponding actions. **Ensure that "Broadcast" is checked within the Brain inspector window.**
3. Set the "Student" brain to External mode.
4. Link the brains to the desired agents (one agent as the teacher and at least one agent as a student).
5. Build the Unity executable for your desired platform.
6. In `trainer_config.yaml`, add an entry for the "Student" brain. Set the `trainer` parameter of this entry to `imitation`, and the `brain_to_imitate` parameter to the name of the teacher brain: "Teacher". Additionally, set `batches_per_epoch`, which controls how much training to do each moment. Increase the `max_steps` option if you'd like to keep training the agents for a longer period of time.
7. Launch the training process with `python3 python/learn.py <env_name> --train --slow`, where `<env_name>` is the path to your built Unity executable.
8. From the Unity window, control the agent with the Teacher brain by providing "teacher demonstrations" of the behavior you would like to see.
9. Watch as the agent(s) with the student brain attached begin to behave similarly to the demonstrations.
10. Once the Student agents are exhibiting the desired behavior, end the training process with `Ctrl+C` from the command line.
11. Move the resulting `*.bytes` file into the `TFModels` subdirectory of the Assets folder (or a subdirectory within Assets of your choosing), and use it with an `Internal` brain.
5. In `trainer_config.yaml`, add an entry for the "Student" brain. Set the `trainer` parameter of this entry to `imitation`, and the `brain_to_imitate` parameter to the name of the teacher brain: "Teacher". Additionally, set `batches_per_epoch`, which controls how much training to do each moment. Increase the `max_steps` option if you'd like to keep training the agents for a longer period of time.
6. Launch the training process with `python3 python/learn.py --train --slow`, and press the :arrow_forward: button in Unity when the message _"Ready to connect with the Editor"_ is displayed on the screen.
7. From the Unity window, control the agent with the Teacher brain by providing "teacher demonstrations" of the behavior you would like to see.
8. Watch as the agent(s) with the student brain attached begin to behave similarly to the demonstrations.
9. Once the Student agents are exhibiting the desired behavior, end the training process with `Ctrl+C` from the command line.
10. Move the resulting `*.bytes` file into the `TFModels` subdirectory of the Assets folder (or a subdirectory within Assets of your choosing), and use it with an `Internal` brain.
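The `trainer_config.yaml` entry described in the steps above can be sketched as a small data structure. Only the parameter names (`trainer`, `brain_to_imitate`, `batches_per_epoch`, `max_steps`) come from this guide; the numeric values, the helper names, and the simple renderer are placeholders chosen for illustration, so check the rendered text against your own config file before using it.

```python
def imitation_entry(student="Student", teacher="Teacher",
                    batches_per_epoch=5, max_steps=10000):
    """Build the config entry for the student brain (hypothetical helper).

    batches_per_epoch and max_steps are placeholder values, not recommendations.
    """
    return {
        student: {
            "trainer": "imitation",
            "brain_to_imitate": teacher,
            "batches_per_epoch": batches_per_epoch,
            "max_steps": max_steps,
        }
    }


def render_entry(entry):
    """Render the entry as indented 'key: value' text in a YAML-like layout."""
    lines = []
    for brain, params in entry.items():
        lines.append(brain + ":")
        for key, value in params.items():
            lines.append("    {}: {}".format(key, value))
    return "\n".join(lines)


rendered = render_entry(imitation_entry())
print(rendered)
```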
### BC Teacher Helper


python3 learn.py <env_name> --run-id=<run-identifier> --train
where `<env_name>` is the name (including path) of your Unity executable containing the agents to be trained and `<run-identifier>` is an optional identifier you can use to identify the results of individual training runs.
* `<env_name>` __(Optional)__ is the name (including path) of your Unity executable containing the agents to be trained. If `<env_name>` is not passed, the training will happen in the Editor. Press the :arrow_forward: button in Unity when the message _"Ready to connect with the Editor"_ is displayed on the screen.
* `<run-identifier>` is an optional identifier you can use to identify the results of individual training runs.
1. Build the project, making sure that you only include the training scene.
1. [Build the project](Learning-Environment-Executable.md), making sure that you only include the training scene.
2. Open a terminal or console window.
3. Navigate to the ml-agents `python` folder.
4. Run the following to launch the training process using the path to the Unity environment you built in step 1:


Using Docker for ML-Agents involves three steps: building the Unity environment with specific flags, building a Docker container and, finally, running the container. If you are not familiar with building a Unity environment for ML-Agents, please read through our [Getting Started with the 3D Balance Ball Example](Getting-Started-with-Balance-Ball.md) guide first.
### Build the Environment
### Build the Environment (Optional)
_If you want to use the Editor to perform training, you can skip this step._
Since Docker typically runs a container sharing a (Linux) kernel with the host machine, the
Unity environment **has** to be built for the **Linux platform**. When building a Unity environment, please select the following options from the Build Settings window:

### Build the Docker Container
First, make sure the Docker engine is running on your machine. Then build the Docker container by calling the following command at the top-level of the repository:
**Note**: If you modify hyperparameters in `trainer_config.yaml`, you will have to build a new Docker container before running.
-p 5005:5005 \
<image-name>:latest <environment-name> \
--docker-target-name=unity-volume \
--train \

Notes on argument values:
- `<container-name>` is used to identify the container (in case you want to interrupt and terminate it). This is optional and Docker will generate a random name if this is not set. _Note that this must be unique for every run of a Docker image._
- `<image-name>` and `<environment-name>`: References the image and environment names, respectively.
- `<image-name>` references the image name used when building the container.
- `<environment-name>` __(Optional)__: If you are training with a Linux executable, this is the name of the executable. If you are training in the Editor, do not pass an `<environment-name>` argument and press the :arrow_forward: button in Unity when the message _"Ready to connect with the Editor"_ is displayed on the screen.
For the `3DBall` environment, for example, this would be:
To train with a `3DBall` environment executable, the command would be:
-p 5005:5005 \
balance.ball.v0.1:latest 3DBall \
--docker-target-name=unity-volume \
--train \


# Using an Environment Executable
This section will help you create and use built environments rather than the Editor to interact with an environment. Using an executable has some advantages over using the Editor:
* You can exchange executables with other people without having to share your entire repository.
* You can put your executable on a remote machine for faster training.
* You can use `Headless` mode for faster training.
* You can keep using the Unity Editor for other tasks while the agents are training.
## Building the 3DBall environment
The first step is to open the Unity scene containing the 3D Balance Ball
1. Launch Unity.
2. On the Projects dialog, choose the **Open** option at the top of the window.
3. Using the file dialog that opens, locate the `unity-environment` folder
within the ML-Agents project and click **Open**.
4. In the **Project** window, navigate to the folder
5. Double-click the `3DBall` file to load the scene containing the Balance
Ball environment.
![3DBall Scene](images/mlagents-Open3DBall.png)
Make sure the Brains in the scene have the right type. For example, if you want to be able to control your agents from Python, you will need to set the corresponding brain to **External**.
1. In the **Scene** window, click the triangle icon next to the Ball3DAcademy
2. Select its child object **Ball3DBrain**.
3. In the Inspector window, set **Brain Type** to **External**.
![Set Brain to External](images/mlagents-SetExternalBrain.png)
Next, we want to set up the scene to play correctly when the training process
launches our environment executable. This means:
* The environment application runs in the background
* No dialogs require interaction
* The correct scene loads automatically
1. Open Player Settings (menu: **Edit** > **Project Settings** > **Player**).
2. Under **Resolution and Presentation**:
- Ensure that **Run in Background** is checked.
- Ensure that **Display Resolution Dialog** is set to Disabled.
3. Open the Build Settings window (menu: **File** > **Build Settings**).
4. Choose your target platform.
- (optional) Select “Development Build” to
[log debug messages](https://docs.unity3d.com/Manual/LogFiles.html).
5. If any scenes are shown in the **Scenes in Build** list, make sure that
the 3DBall Scene is the only one checked. (If the list is empty, then only the
current scene is included in the build.)
6. Click **Build**:
- In the File dialog, navigate to the `python` folder in your ML-Agents
- Assign a file name and click **Save**.
- (For Windows) With Unity 2018.1, you are asked to select a folder instead of a file name. Create a subfolder within the `python` folder and select that folder to build. In the following steps you will refer to this subfolder's name as `env_name`.
![Build Window](images/mlagents-BuildWindow.png)
Now that we have a Unity executable containing the simulation environment, we
can interact with it.
## Interacting with the Environment
If you want to use the [Python API](Python-API.md) to interact with your executable, you can pass the name of the executable as the `file_name` argument of `UnityEnvironment`. For instance:
from unityagents import UnityEnvironment
env = UnityEnvironment(file_name=<env_name>)
## Training the Environment
1. Open a command or terminal window.
2. Navigate to the folder where you installed ML-Agents.
3. Change to the python directory.
4. Run `python3 learn.py <env_name> --run-id=<run-identifier> --train`
- `<env_name>` is the name and path to the executable you exported from Unity (without extension)
- `<run-identifier>` is a string used to separate the results of different training runs
- `--train` tells `learn.py` to run a training session (rather than inference)
For example, if you are training with a 3DBall executable you exported to the ml-agents/python directory, run:
python3 learn.py 3DBall --run-id=firstRun --train
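Because `<env_name>` is given without its extension, it can help to normalize a built file name before passing it on. The sketch below is illustrative only: `executable_env_name` is a hypothetical helper, and the list of extensions is an assumption about common Unity build outputs, not something taken from ML-Agents.

```python
# Assumed extensions of Unity builds on macOS, Windows and Linux.
KNOWN_EXTENSIONS = (".app", ".exe", ".x86_64", ".x86")


def executable_env_name(path):
    """Strip a known build extension from an executable path (hypothetical helper)."""
    for extension in KNOWN_EXTENSIONS:
        if path.endswith(extension):
            return path[: -len(extension)]
    # No known extension: assume the name is already in the expected form.
    return path


print(executable_env_name("3DBall.exe"))
print(executable_env_name("envs/3DBall.x86_64"))
print(executable_env_name("3DBall"))
```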
![Training command example](images/training-command-example.png)
**Note**: If you're using Anaconda, don't forget to activate the ml-agents environment first.
If `learn.py` runs correctly and starts training, you should see something like this:
![Training running](images/training-running.png)
You can press Ctrl+C to stop the training, and your trained model will be at `ml-agents/python/models/<run-identifier>/<env_name>_<run-identifier>.bytes`, which corresponds to your model's latest checkpoint. You can now embed this trained model into your internal brain by following the steps below:
1. Move your model file into
2. Open the Unity Editor, and select the **3DBall** scene as described above.
3. Select the **Ball3DBrain** object from the Scene hierarchy.
4. Change the **Type of Brain** to **Internal**.
5. Drag the `<env_name>_<run-identifier>.bytes` file from the Project window of the Editor
to the **Graph Model** placeholder in the **Ball3DBrain** inspector window.
6. Press the Play button at the top of the editor.