
Merge pull request #2515 from Unity-Technologies/ab/docs-updates

Changes to documentation
/develop-gpu-test
GitHub, 5 years ago
Current commit: b444c1a5
10 files changed, with 158 additions and 597 deletions
  1. docs/Installation.md (17 lines changed)
  2. docs/ML-Agents-Overview.md (16 lines changed)
  3. docs/Readme.md (25 lines changed)
  4. docs/Training-ML-Agents.md (107 lines changed)
  5. docs/Training-on-Amazon-Web-Service.md (3 lines changed)
  6. docs/Training-on-Microsoft-Azure.md (11 lines changed)
  7. docs/Using-Tensorboard.md (6 lines changed)
  8. docs/Using-Virtual-Environment.md (53 lines changed)
  9. docs/Using-Docker.md (166 lines changed)
  10. docs/Installation-Windows.md (351 lines changed)

docs/Installation.md (17 lines changed)


## Windows Users
For Windows, we have created a [detailed guide](Installation-Windows.md) to
setting up your environment. For Mac and Linux, continue with this guide.
## Environment Setup
We now support a single mechanism for installing ML-Agents on Mac/Windows/Linux using Virtual
Environments. For more information on Virtual Environments and installation instructions,
follow this [guide](Using-Virtual-Environment.md).
### Clone the ML-Agents Toolkit Repository

Running pip with the `-e` flag will let you make changes to the Python files directly and have those
reflected when you run `mlagents-learn`. It is important to install these packages in this order as the
`mlagents` package depends on `mlagents_envs`, and installing it in the other
order will download `mlagents_envs` from PyPi.
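For example, a minimal sketch of that install order, assuming you start from the root of the
cloned repository:
```sh
# Install the packages in editable mode, mlagents_envs first so that the
# subsequent mlagents install does not pull mlagents_envs from PyPi.
cd ml-agents-envs
pip3 install -e .
cd ../ml-agents
pip3 install -e .
```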
## Docker-based Installation
If you'd like to use Docker for ML-Agents, please follow
[this guide](Using-Docker.md).
## Next Steps

docs/ML-Agents-Overview.md (16 lines changed)


[Training Generalized Reinforcement Learning Agents](Training-Generalized-Reinforcement-Learning-Agents.md)
to learn more about this feature.
- **Docker Set-up (Experimental)** - To facilitate setting up ML-Agents without
installing Python or TensorFlow directly, we provide a
[guide](Using-Docker.md) on how to create and run a Docker container.
- **Broadcasting** - As discussed earlier, a Learning Brain sends the
observations for all its Agents to the Python API when dragged into the
Academy's `Broadcast Hub` with the `Control` checkbox checked. This is helpful
for training and later inference. Broadcasting is a feature which can be
enabled for all types of Brains (Player, Learning, Heuristic) where the Agent
observations and actions are also sent to the Python API (despite the fact
that the Agent is **not** controlled by the Python API). This feature is
leveraged by Imitation Learning, where the observations and actions for a
Player Brain are used to learn the policies of an agent through demonstration.
However, this could also be helpful for the Heuristic and Learning Brains,
particularly when debugging agent behaviors. You can learn more about using
the broadcasting feature
[here](Learning-Environment-Design-Brains.md#using-the-broadcast-feature).
- **Cloud Training on AWS** - To facilitate using the ML-Agents toolkit on
Amazon Web Services (AWS) machines, we provide a

docs/Readme.md (25 lines changed)


* [Installation](Installation.md)
* [Background: Jupyter Notebooks](Background-Jupyter.md)
* [Docker Set-up](Using-Docker.md)
* [Using Virtual Environment](Using-Virtual-Environment.md)
* [Basic Guide](Basic-Guide.md)
## Getting Started

[Heuristic](Learning-Environment-Design-Heuristic-Brains.md),
[Learning](Learning-Environment-Design-Learning-Brains.md)
* [Learning Environment Best Practices](Learning-Environment-Best-Practices.md)
### Advanced Usage
* [Using the Monitor](Feature-Monitor.md)
* [Using the Video Recorder](https://github.com/Unity-Technologies/video-recorder)
* [Using an Executable Environment](Learning-Environment-Executable.md)
* [Creating Custom Protobuf Messages](Creating-Custom-Protobuf-Messages.md)
* [Using TensorBoard to Observe Training](Using-Tensorboard.md)
* [Training Using Concurrent Unity Instances](Training-Using-Concurrent-Unity-Instances.md)
### Advanced Training Methods
### Cloud Training (Deprecated)
Here are the cloud training set-up guides for Azure and AWS. We no longer use them ourselves and
so they may not work correctly. We've decided to keep them up just in case they are helpful to
you.
## Inference

docs/Training-ML-Agents.md (107 lines changed)


using TensorBoard during or after training by running the following command:
```sh
tensorboard --logdir=summaries --port 6006
```
**Note:** The default port TensorBoard uses is 6006. If there is an existing session
running on port 6006, a new session can be launched on an open port using the `--port`
option.
When training is finished, you can find the saved model in the `models` folder
under the assigned run-id — in the cats example, the path to the model would be

the oldest checkpoint is deleted when saving a new checkpoint. Defaults to 5.
* `--lesson=<n>`: Specify which lesson to start with when performing curriculum
training. Defaults to 0.
* `--load`: If set, the training code loads an already trained model to
initialize the neural network before training. The learning code looks for the
model in `models/<run-id>/` (which is also where it saves models at the end of
training). When not set (the default), the neural network weights are randomly
initialized and an existing model is not loaded.
* `--num-envs=<n>`: Specifies the number of concurrent Unity environment instances to
collect experiences from when training. Defaults to 1.
* `--run-id=<path>`: Specifies an identifier for each training run. This
identifier is used to name the subdirectories in which the trained model and
summary statistics are saved as well as the saved model itself. The default id

training. Defaults to 50000.
* `--seed=<n>`: Specifies a number to use as a seed for the random number
generator used by the training code.
* `--env-args=<string>`: Specify arguments for the executable environment. Be aware that
the standalone build will also process these as
[Unity Command Line Arguments](https://docs.unity3d.com/Manual/CommandLineArguments.html).
You should choose different argument names if you want to create environment-specific arguments.
All arguments after this flag will be passed to the executable. For example, setting
`mlagents-learn config/trainer_config.yaml --env-args --num-orcs 42` would result in
` --num-orcs 42` passed to the executable.
* `--base-port`: Specifies the starting port. Each concurrent Unity environment instance
will get assigned a port sequentially, starting from the `base-port`. Each instance
will use the port `(base_port + worker_id)`, where the `worker_id` is sequential IDs
given to each instance from 0 to `num_envs - 1`. Default is 5005.
* `--slow`: Specify this option to run the Unity environment at normal, game
speed. The `--slow` mode uses the **Time Scale** and **Target Frame Rate**
specified in the Academy's **Inference Configuration**. By default, training

* `--train`: Specifies whether to train the model or only run in inference mode.
When training, **always** use the `--train` option.
* `--docker-target-name=<dt>`: The Docker Volume on which to store curriculum,
executable and model files. See [Using Docker](Using-Docker.md).
* `--no-graphics`: Specify this option to run the Unity executable in
`-batchmode` without initializing the graphics driver. Use this only if your
training doesn't involve visual observations (reading from Pixels). See

* `--multi-gpu`: Setting this flag enables the use of multiple GPUs (if available) during training.
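As an illustration of how several of these options combine, a hypothetical invocation might look
like this (the executable path `builds/3DBall` and the run-id are made-up examples):
```sh
# Train against 3 concurrent copies of a hypothetical Linux/macOS build,
# using ports 5005, 5006 and 5007 (base_port + worker_id), and resuming from
# any model previously saved under models/3dball_run/.
mlagents-learn config/trainer_config.yaml \
  --env=builds/3DBall \
  --num-envs=3 \
  --base-port=5005 \
  --run-id=3dball_run \
  --load \
  --train
```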
`config/gail_config.yaml` and `config/offline_bc_config.yaml` specify the training method,
the hyperparameters, and a few additional values to use when training with Proximal Policy
Optimization (PPO), Soft Actor-Critic (SAC), GAIL (Generative Adversarial Imitation Learning)
with PPO, and online and offline Behavioral Cloning (BC)/Imitation. These files are divided
into sections. The **default** section defines the default values for all the available
settings. You can also add new sections to override these defaults to train specific Brains.
Name each of these override sections after the GameObject containing the Brain component that
should use these settings. (This GameObject will be a child of the Academy in your scene.)
Sections for the example environments are included in the provided config file.
| **Setting** | **Description** | **Applies To Trainer\*** |
| :-- | :-- | :-- |
| batch_size | The number of experiences in each iteration of gradient descent. | PPO, SAC, BC |
| buffer_size | The number of experiences to collect before updating the policy model. In SAC, the max size of the experience buffer. | PPO, SAC |
| buffer_init_steps | The number of experiences to collect into the buffer before updating the policy model. | SAC |
| hidden_units | The number of units in the hidden layers of the neural network. | PPO, SAC, BC |
| init_entcoef | How much the agent should explore in the beginning of training. | SAC |
| learning_rate | The initial learning rate for gradient descent. | PPO, SAC, BC |
| max_steps | The maximum number of simulation steps to run during a training session. | PPO, SAC, BC |
| memory_size | The size of the memory an agent must keep. Used for training with a recurrent neural network. See [Using Recurrent Neural Networks](Feature-Memory.md). | PPO, SAC, BC |
| normalize | Whether to automatically normalize observations. | PPO, SAC |
| num_layers | The number of hidden layers in the neural network. | PPO, SAC, BC |
| pretraining | Use demonstrations to bootstrap the policy neural network. See [Pretraining Using Demonstrations](Training-PPO.md#optional-pretraining-using-demonstrations). | PPO, SAC |
| reward_signals | The reward signals used to train the policy. Enable Curiosity and GAIL here. See [Reward Signals](Reward-Signals.md) for configuration options. | PPO, SAC, BC |
| save_replay_buffer | Saves the replay buffer when exiting training, and loads it on resume. | SAC |
| sequence_length | Defines how long the sequences of experiences must be while training. Only used for training with a recurrent neural network. See [Using Recurrent Neural Networks](Feature-Memory.md). | PPO, SAC, BC |
| summary_freq | How often, in steps, to save training statistics. This determines the number of data points shown by TensorBoard. | PPO, SAC, BC |
| tau | How aggressively to update the target network used for bootstrapping value estimation in SAC. | SAC |
| time_horizon | How many steps of experience to collect per-agent before adding it to the experience buffer. | PPO, SAC, (online)BC |
| trainer | The type of training to perform: "ppo", "sac", "offline_bc" or "online_bc". | PPO, SAC, BC |
| train_interval | How often to update the agent. | SAC |
| num_update | Number of mini-batches to update the agent with during each update. | SAC |
| use_recurrent | Train using a recurrent neural network. See [Using Recurrent Neural Networks](Feature-Memory.md). | PPO, SAC, BC |
\*PPO = Proximal Policy Optimization, SAC = Soft Actor-Critic, BC = Behavioral Cloning (Imitation)
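To make the default/override structure concrete, here is a minimal, illustrative sketch of such a
config file, written out from the shell to match the other command examples. The section name
`3DBallLearning` and all values are examples, not the shipped defaults.
```sh
# Illustrative only: the Brain section name and every value are examples,
# not the defaults that ship with ML-Agents.
cat > example_trainer_config.yaml <<'EOF'
default:
    trainer: ppo
    batch_size: 1024
    buffer_size: 10240
    hidden_units: 128
    learning_rate: 3.0e-4
    max_steps: 5.0e4
    normalize: false
    num_layers: 2
    summary_freq: 1000
    time_horizon: 64

3DBallLearning:
    normalize: true
    batch_size: 64
    time_horizon: 1000
EOF
```
An override section only needs to list the settings it changes; anything not listed falls back to
the values in **default**.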

docs/Training-on-Amazon-Web-Service.md (3 lines changed)


# Training on Amazon Web Service
Note: We no longer use this guide ourselves and so it may not work correctly. We've
decided to keep it up just in case it is helpful to you.
This page contains instructions for setting up an EC2 instance on Amazon Web
Service for training ML-Agents environments.

docs/Training-on-Microsoft-Azure.md (11 lines changed)


# Training on Microsoft Azure (works with ML-Agents toolkit v0.3)
Note: We no longer use this guide ourselves and so it may not work correctly. We've
decided to keep it up just in case it is helpful to you.
This page contains instructions for setting up training on Microsoft Azure
through either
[Azure Container Instances](https://azure.microsoft.com/services/container-instances/)

[Azure Container Instances](https://azure.microsoft.com/services/container-instances/)
allow you to spin up a container, on demand, that will run your training and
then be shut down. This ensures you aren't leaving a billable VM running when
it isn't needed. You can read more about
[The ML-Agents toolkit support for Docker containers here](Using-Docker.md).
Using ACI enables you to offload training of your models without needing to
install Python and TensorFlow on your own computer. You can find instructions,
including a pre-deployed image in DockerHub for you to use, available
[here](https://github.com/druttka/unity-ml-on-azure).

docs/Using-Tensorboard.md (6 lines changed)


3. From the command line run:
```sh
tensorboard --logdir=summaries --port=6006
```
**Note:** The default port TensorBoard uses is 6006. If there is an existing session
running on port 6006, a new session can be launched on an open port using the `--port`
option.
**Note:** If you don't assign a `run-id` identifier, `mlagents-learn` uses the
default string, "ppo". All the statistics will be saved to the same sub-folder

docs/Using-Virtual-Environment.md (53 lines changed)


# Using Virtual Environment
## What is a Virtual Environment?
A Virtual Environment is a self-contained directory tree that contains a Python installation
for a particular version of Python, plus a number of additional packages. To learn more about
Virtual Environments, see [here](https://docs.python.org/3/library/venv.html).
## Why should I use a Virtual Environment?
A Virtual Environment keeps all dependencies for the Python project separate from dependencies
of other projects. This has a few advantages:
1. It makes dependency management for the project easy.
1. It enables using and testing different library versions by quickly
spinning up a new environment and verifying the compatibility of the code with the
different versions.
Requirement - Python 3.6 must be installed on the machine you would like
to run ML-Agents on (either local laptop/desktop or remote server). Python 3.6 can be
installed from [here](https://www.python.org/downloads/).
## Installing Pip (Required)
1. Download the `get-pip.py` file using the command `curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`
1. Run the following command: `python3 get-pip.py`
1. Check pip version using `pip3 -V`
Note (for Ubuntu users): If the `ModuleNotFoundError: No module named 'distutils.util'` error is encountered, then
python3-distutils needs to be installed. Install python3-distutils using `sudo apt-get install python3-distutils`
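For convenience, the same pip installation steps as a single block:
```sh
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py   # download the installer
python3 get-pip.py                                        # install pip
pip3 -V                                                   # verify the installed version
```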
## Mac OS X Setup
1. Create a folder where the virtual environments will reside `$ mkdir ~/python-envs`
1. To create a new environment named `sample-env` execute `$ python3 -m venv ~/python-envs/sample-env`
1. To activate the environment execute `$ source ~/python-envs/sample-env/bin/activate`
1. Verify pip version is the same as in the __Installing Pip__ section. In case it is not the latest, upgrade to
the latest pip version using `pip3 install --upgrade pip`
1. Install ML-Agents package using `$ pip3 install mlagents`
1. To deactivate the environment execute `$ deactivate`
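Taken together, the macOS/Linux steps above look roughly like this (the folder `~/python-envs`
and the name `sample-env` are just the examples used above):
```sh
mkdir -p ~/python-envs                        # folder to hold virtual environments
python3 -m venv ~/python-envs/sample-env      # create the environment
source ~/python-envs/sample-env/bin/activate  # activate it
pip3 install --upgrade pip                    # make sure pip is up to date
pip3 install mlagents                         # install the ML-Agents Python package
deactivate                                    # leave the environment when done
```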
## Ubuntu Setup
1. Install the python3-venv package using `$ sudo apt-get install python3-venv`
1. Follow the steps in the Mac OS X installation.
## Windows Setup
1. Create a folder where the virtual environments will reside `$ md python-envs`
1. To create a new environment named `sample-env` execute `$ python3 -m venv python-envs\sample-env`
1. To activate the environment execute `$ python-envs\sample-env\Scripts\activate`
1. Verify pip version is the same as in the __Installing Pip__ section. In case it is not the latest, upgrade to
the latest pip version using `pip3 install --upgrade pip`
1. Install ML-Agents package using `$ pip3 install mlagents`
1. To deactivate the environment execute `$ deactivate`

docs/Using-Docker.md (166 lines changed)


# Using Docker For ML-Agents
We currently offer a solution for Windows and Mac users who would like to do
training or inference using Docker. This option may be appealing to those who
would like to avoid installing Python and TensorFlow themselves. The current
setup forces both TensorFlow and Unity to _only_ rely on the CPU for
computations. Consequently, our Docker simulation does not use a GPU and uses
[`Xvfb`](https://en.wikipedia.org/wiki/Xvfb) to do visual rendering. `Xvfb` is a
utility that enables `ML-Agents` (or any other application) to do rendering
virtually, i.e. it does not assume that the machine running `ML-Agents` has a GPU
or a display attached to it. This means that rich environments which involve
agents using camera-based visual observations might be slower.
## Requirements
- Unity _Linux Build Support_ Component
- [Docker](https://www.docker.com)
## Setup
- [Download](https://unity3d.com/get-unity/download) the Unity Installer and add
the _Linux Build Support_ Component
- [Download](https://www.docker.com/community-edition#/download) and install
Docker if you don't have it setup on your machine.
- Since Docker runs a container in an environment that is isolated from the host
machine, a mounted directory in your host machine is used to share data, e.g.
the trainer configuration file, Unity executable, curriculum files and
TensorFlow graph. For convenience, we created an empty `unity-volume`
directory at the root of the repository for this purpose, but feel free to use
any other directory. The remainder of this guide assumes that the
`unity-volume` directory is the one used.
## Usage
Using Docker for ML-Agents involves three steps: building the Unity environment
with specific flags, building a Docker container and, finally, running the
container. If you are not familiar with building a Unity environment for
ML-Agents, please read through our [Getting Started with the 3D Balance Ball
Example](Getting-Started-with-Balance-Ball.md) guide first.
### Build the Environment (Optional)
_If you want to use the Editor to perform training, you can skip this step._
Since Docker typically runs a container sharing a (linux) kernel with the host
machine, the Unity environment **has** to be built for the **linux platform**.
When building a Unity environment, please select the following options from the
Build Settings window:
- Set the _Target Platform_ to `Linux`
- Set the _Architecture_ to `x86_64`
- If the environment does not contain visual observations, you can select the
`headless` option here.
Then click `Build`, pick an environment name (e.g. `3DBall`) and set the output
directory to `unity-volume`. After building, ensure that the file
`<environment-name>.x86_64` and subdirectory `<environment-name>_Data/` are
created under `unity-volume`.
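A quick way to sanity-check the build output from the host (assuming the environment was named
`3DBall`):
```sh
# Both the executable and its data directory should be in the mounted folder.
ls unity-volume
# expected to include: 3DBall.x86_64  3DBall_Data/
```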
![Build Settings For Docker](images/docker_build_settings.png)
### Build the Docker Container
First, make sure the Docker engine is running on your machine. Then build the
Docker container by calling the following command at the top-level of the
repository:
```sh
docker build -t <image-name> .
```
Replace `<image-name>` with a name for the Docker image, e.g.
`balance.ball.v0.1`.
### Run the Docker Container
Run the Docker container by calling the following command at the top-level of
the repository:
```sh
docker run -it --name <container-name> \
--mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
-p 5005:5005 \
-p 6006:6006 \
<image-name>:latest \
--docker-target-name=unity-volume \
<trainer-config-file> \
--env=<environment-name> \
--train \
--run-id=<run-id>
```
Notes on argument values:
- `<container-name>` is used to identify the container (in case you want to
interrupt and terminate it). This is optional and Docker will generate a
random name if this is not set. _Note that this must be unique for every run
of a Docker image._
- `<image-name>` references the image name used when building the container.
- `<environment-name>` __(Optional)__: If you are training with a linux
executable, this is the name of the executable. If you are training in the
Editor, do not pass a `<environment-name>` argument and press the
:arrow_forward: button in Unity when the message _"Start training by pressing
the Play button in the Unity Editor"_ is displayed on the screen.
- `source`: Reference to the path in your host OS where you will store the Unity
executable.
- `target`: Tells Docker to mount the `source` path as a disk with this name.
- `docker-target-name`: Tells the ML-Agents Python package what the name of the
disk where it can read the Unity executable and store the graph. **This should
therefore be identical to `target`.**
- `trainer-config-file`, `train`, `run-id`: ML-Agents arguments passed to
`mlagents-learn`. `trainer-config-file` is the filename of the trainer config
file, `train` trains the algorithm, and `run-id` is used to tag each
experiment with a unique identifier. We recommend placing the trainer-config
file inside `unity-volume` so that the container has access to the file.
To train with a `3DBall` environment executable, the command would be:
```sh
docker run -it --name 3DBallContainer.first.trial \
--mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
-p 5005:5005 \
-p 6006:6006 \
balance.ball.v0.1:latest 3DBall \
--docker-target-name=unity-volume \
trainer_config.yaml \
--env=3DBall \
--train \
--run-id=3dball_first_trial
```
For more detail on Docker mounts, check out
[these](https://docs.docker.com/storage/bind-mounts/) docs from Docker.
**NOTE** If you are training using docker for environments that use visual observations, you may need to increase the default memory that Docker allocates for the container. For example, see [here](https://docs.docker.com/docker-for-mac/#advanced) for instructions for Docker for Mac.
### Running Tensorboard
You can run Tensorboard to monitor your training instance on http://localhost:6006:
```sh
docker exec -it <container-name> tensorboard --logdir=/unity-volume/summaries --host=0.0.0.0
```
With our previous 3DBall example, this command would look like this:
```sh
docker exec -it 3DBallContainer.first.trial tensorboard --logdir=/unity-volume/summaries --host=0.0.0.0
```
For more details on Tensorboard, check out the documentation about [Using Tensorboard](Using-Tensorboard.md).
### Stopping Container and Saving State
If you are satisfied with the training progress, you can stop the Docker
container while saving state by either using `Ctrl+C` or `⌘+C` (Mac) or by using
the following command:
```sh
docker kill --signal=SIGINT <container-name>
```
`<container-name>` is the name of the container specified in the earlier `docker
run` command. If you didn't specify one, you can find the randomly generated
identifier by running `docker container ls`.

docs/Installation-Windows.md (351 lines changed)


# Installing ML-Agents Toolkit for Windows
The ML-Agents toolkit supports Windows 10. While it might be possible to run the
ML-Agents toolkit using other versions of Windows, it has not been tested on
other versions. Furthermore, the ML-Agents toolkit has not been tested on a
Windows VM such as Bootcamp or Parallels.
To use the ML-Agents toolkit, you install Python and the required Python
packages as outlined below. This guide also covers how to set up GPU-based training
(for advanced users). GPU-based training is not currently required for the
ML-Agents toolkit. However, training on a GPU might be required by future
versions and features.
## Step 1: Install Python via Anaconda
[Download](https://www.anaconda.com/download/#windows) and install Anaconda for
Windows. By using Anaconda, you can manage separate environments for different
distributions of Python. Python 3.6.1 or higher is required as we no longer support
Python 2. In this guide, we are using Python version 3.6 and Anaconda version
5.1
([64-bit](https://repo.continuum.io/archive/Anaconda3-5.1.0-Windows-x86_64.exe)
or [32-bit](https://repo.continuum.io/archive/Anaconda3-5.1.0-Windows-x86.exe)
direct links).
<p align="center">
<img src="images/anaconda_install.PNG"
alt="Anaconda Install"
width="500" border="10" />
</p>
We recommend the default _advanced installation options_. However, select the
options appropriate for your specific situation.
<p align="center">
<img src="images/anaconda_default.PNG" alt="Anaconda Install" width="500" border="10" />
</p>
After installation, you must open __Anaconda Navigator__ to finish the setup.
From the Windows search bar, type _anaconda navigator_. You can close Anaconda
Navigator after it opens.
If environment variables were not created, you will see the error "conda is not
recognized as an internal or external command" when you type `conda` into the
command line. To solve this you will need to set the environment variable
correctly.
Type `environment variables` in the search bar (this can be reached by hitting
the Windows key or the bottom left Windows button). You should see an option
called __Edit the system environment variables__.
<p align="center">
<img src="images/edit_env_var.png"
alt="edit env variables"
width="250" border="10" />
</p>
From here, click the __Environment Variables__ button. Double click "Path" under
__System variables__ to edit the "Path" variable, then click __New__ to add the
following new paths.
```console
%UserProfile%\Anaconda3\Scripts
%UserProfile%\Anaconda3\Scripts\conda.exe
%UserProfile%\Anaconda3
%UserProfile%\Anaconda3\python.exe
```
## Step 2: Setup and Activate a New Conda Environment
You will create a new [Conda environment](https://conda.io/docs/) to be used
with the ML-Agents toolkit. This means that all the packages that you install
are localized to just this environment. It will not affect any other
installation of Python or other environments. Whenever you want to run
ML-Agents, you will need to activate this Conda environment.
To create a new Conda environment, open a new Anaconda Prompt (_Anaconda Prompt_
in the search bar) and type in the following command:
```sh
conda create -n ml-agents python=3.6
```
You may be asked to install new packages. Type `y` and press enter _(make sure
you are connected to the Internet)_. You must install these required packages.
The new Conda environment is called ml-agents and uses Python version 3.6.
<p align="center">
<img src="images/conda_new.PNG" alt="Anaconda Install" width="500" border="10" />
</p>
To use this environment, you must activate it. _(To use this environment in the
future, you can run the same command)_. In the same Anaconda Prompt, type in the
following command:
```sh
activate ml-agents
```
You should see `(ml-agents)` prepended on the last line.
Next, install `tensorflow`. Install this package using `pip`, which is a
package management system used to install Python packages. The latest versions of
TensorFlow won't work, so you will need to make sure that you install version
1.7.1. In the same Anaconda Prompt, type in the following command _(make sure
you are connected to the Internet)_:
```sh
pip install tensorflow==1.7.1
```
## Step 3: Install Required Python Packages
The ML-Agents toolkit depends on a number of Python packages. Use `pip` to
install these Python dependencies.
If you haven't already, clone the ML-Agents Toolkit Github repository to your
local computer. You can do this using Git ([download
here](https://git-scm.com/download/win)) and running the following commands in
an Anaconda Prompt _(if you open a new prompt, be sure to activate the ml-agents
Conda environment by typing `activate ml-agents`)_:
```sh
git clone https://github.com/Unity-Technologies/ml-agents.git
```
If you don't want to use Git, you can always directly download all the files
[here](https://github.com/Unity-Technologies/ml-agents/archive/master.zip).
The `UnitySDK` subdirectory contains the Unity Assets to add to your projects.
It also contains many [example environments](Learning-Environment-Examples.md)
to help you get started.
The `ml-agents` subdirectory contains a Python package which provides deep reinforcement
learning trainers to use with Unity environments.
The `ml-agents-envs` subdirectory contains a Python API to interface with Unity, which
the `ml-agents` package depends on.
The `gym-unity` subdirectory contains a package to interface with OpenAI Gym.
Keep in mind where the files were downloaded, as you will need the
trainer config files in this directory when running `mlagents-learn`.
Make sure you are connected to the Internet and then type in the Anaconda
Prompt:
```console
pip install mlagents
```
This will complete the installation of all the required Python packages to run
the ML-Agents toolkit.
Sometimes on Windows, when you use pip to install certain Python packages, pip can get stuck trying to read the package cache. If you see this, you can try:
```console
pip install mlagents --no-cache-dir
```
The `--no-cache-dir` option tells pip to disable the cache.
### Installing for Development
If you intend to make modifications to `ml-agents` or `ml-agents-envs`, you should install
the packages from the cloned repo rather than from PyPi. To do this, you will need to install
`ml-agents` and `ml-agents-envs` separately.
In our example, the files are located in `C:\Downloads`. After you have either
cloned or downloaded the files, from the Anaconda Prompt, change to the ml-agents
subdirectory inside the ml-agents directory:
```console
cd C:\Downloads\ml-agents
```
From the repo's main directory, now run:
```console
cd ml-agents-envs
pip install -e .
cd ..
cd ml-agents
pip install -e .
```
Running pip with the `-e` flag will let you make changes to the Python files directly and have those
reflected when you run `mlagents-learn`. It is important to install these packages in this order as the
`mlagents` package depends on `mlagents_envs`, and installing it in the other
order will download `mlagents_envs` from PyPi.
## (Optional) Step 4: GPU Training using The ML-Agents Toolkit
A GPU is not required for the ML-Agents toolkit and currently won't speed up PPO
training by much (though future features may benefit from a GPU). This is a guide
for advanced users who want to train using GPUs.
Additionally, you will need to check if your GPU is CUDA compatible. Please
check Nvidia's page [here](https://developer.nvidia.com/cuda-gpus).
Currently for the ML-Agents toolkit, only CUDA v9.0 and cuDNN v7.0.5 are supported.
### Install Nvidia CUDA toolkit
[Download](https://developer.nvidia.com/cuda-toolkit-archive) and install the
CUDA toolkit 9.0 from Nvidia's archive. The toolkit includes GPU-accelerated
libraries, debugging and optimization tools, a C/C++ (Visual Studio 2017)
compiler, and a runtime library, and is needed to run the ML-Agents toolkit. In
this guide, we are using version
[9.0.176](https://developer.nvidia.com/compute/cuda/9.0/Prod/network_installers/cuda_9.0.176_win10_network-exe).
Before installing, please make sure you __close any running instances of Unity
or Visual Studio__.
Run the installer and select the Express option. Note the directory where you
installed the CUDA toolkit. In this guide, we installed it in the directory
`C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0`.
### Install Nvidia cuDNN library
[Download](https://developer.nvidia.com/cudnn) and install the cuDNN library
from Nvidia. cuDNN is a GPU-accelerated library of primitives for deep neural
networks. Before you can download, you will need to sign up for free to the
Nvidia Developer Program.
<p align="center">
<img src="images/cuDNN_membership_required.png"
alt="cuDNN membership required"
width="500" border="10" />
</p>
Once you've signed up, go back to the cuDNN
[downloads page](https://developer.nvidia.com/cudnn).
You may or may not be asked to fill out a short survey. When you get to the list of
cuDNN releases, __make sure you are downloading the right version for the CUDA
toolkit you installed in Step 1.__ In this guide, we are using version 7.0.5 for
CUDA toolkit version 9.0
([direct link](https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.5/prod/9.0_20171129/cudnn-9.0-windows10-x64-v7)).
After you have downloaded the cuDNN files, you will need to extract the files
into the CUDA toolkit directory. In the cuDNN zip file, there are three folders
called `bin`, `include`, and `lib`.
<p align="center">
<img src="images/cudnn_zip_files.PNG"
alt="cuDNN zip files"
width="500" border="10" />
</p>
Copy these three folders into the CUDA toolkit directory. The CUDA toolkit
directory is located at
`C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0`
<p align="center">
<img src="images/cuda_toolkit_directory.PNG"
alt="cuda toolkit directory"
width="500" border="10" />
</p>
### Set Environment Variables
You will need to add one environment variable and two path variables.
To set the environment variable, type `environment variables` in the search bar
(this can be reached by hitting the Windows key or the bottom left Windows
button). You should see an option called __Edit the system environment
variables__.
<p align="center">
<img src="images/edit_env_var.png"
alt="edit env variables"
width="250" border="10" />
</p>
From here, click the __Environment Variables__ button. Click __New__ to add a
new system variable _(make sure you do this under __System variables__ and not
User variables)_.
<p align="center">
<img src="images/new_system_variable.PNG"
alt="new system variable"
width="500" border="10" />
</p>
For __Variable Name__, enter `CUDA_HOME`. For the variable value, put the
directory location for the CUDA toolkit. In this guide, the directory location
is `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0`. Press __OK__ once.
<p align="center">
<img src="images/system_variable_name_value.PNG"
alt="system variable names and values"
width="500" border="10" />
</p>
To set the two path variables, inside the same __Environment Variables__ window
and under the second box called __System Variables__, find a variable called
`Path` and click __Edit__. You will add two directories to the list. For this
guide, the two entries would look like:
```console
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\CUPTI\libx64
```
Make sure to replace the relevant directory locations with the ones where you
installed the CUDA toolkit. _Please note that case sensitivity matters_.
<p align="center">
<img src="images/path_variables.PNG"
alt="Path variables"
width="500" border="10" />
</p>
### Install TensorFlow GPU
Next, install `tensorflow-gpu` using `pip`. You'll need version 1.7.1. In an
Anaconda Prompt with the Conda environment ml-agents activated, type in the
following command to uninstall TensorFlow for CPU and install TensorFlow
for GPU _(make sure you are connected to the Internet)_:
```sh
pip uninstall tensorflow
pip install tensorflow-gpu==1.7.1
```
Lastly, you should test to see if everything installed properly and that
TensorFlow can identify your GPU. In the same Anaconda Prompt, open Python
in the Prompt by calling:
```sh
python
```
And then type the following commands:
```python
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
```
You should see something similar to:
```console
Found device 0 with properties ...
```
## Acknowledgments
We would like to thank
[Jason Weimann](https://unity3d.college/2017/10/25/machine-learning-in-unity3d-setting-up-the-environment-tensorflow-for-agentml-on-windows-10/)
and
[Nitish S. Mutha](http://blog.nitishmutha.com/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html)
for writing the original articles which were used to create this guide.