浏览代码

updated ml-agents to ml-agents toolkit where appropriate

/hotfix-v0.9.2a
unityjeffrey 6 年前
当前提交
6ed6b8d6
共有 31 个文件被更改,包括 101 次插入101 次删除
  1. 6
      CONTRIBUTING.md
  2. 12
      README.md
  3. 2
      docs/API-Reference.md
  4. 10
      docs/Background-Machine-Learning.md
  5. 10
      docs/Background-TensorFlow.md
  6. 2
      docs/Background-Unity.md
  7. 10
      docs/Basic-Guide.md
  8. 4
      docs/FAQ.md
  9. 18
      docs/Getting-Started-with-Balance-Ball.md
  10. 2
      docs/Glossary.md
  11. 24
      docs/Installation-Windows.md
  12. 8
      docs/Installation.md
  13. 2
      docs/Learning-Environment-Create-New.md
  14. 2
      docs/Learning-Environment-Design-Agents.md
  15. 2
      docs/Learning-Environment-Design-Brains.md
  16. 6
      docs/Learning-Environment-Design.md
  17. 2
      docs/Limitations.md
  18. 38
      docs/ML-Agents-Overview.md
  19. 8
      docs/Migrating.md
  20. 2
      docs/Python-API.md
  21. 2
      docs/Readme.md
  22. 2
      docs/Training-Curriculum-Learning.md
  23. 4
      docs/Training-ML-Agents.md
  24. 4
      docs/Training-on-Amazon-Web-Service.md
  25. 2
      docs/Training-on-Microsoft-Azure-Custom-Instance.md
  26. 4
      docs/Training-on-Microsoft-Azure.md
  27. 2
      docs/Using-TensorFlow-Sharp-in-Unity.md
  28. 4
      docs/Using-Tensorboard.md
  29. 2
      docs/dox-ml-agents.conf
  30. 4
      python/Basics.ipynb
  31. 2
      unity-environment/Assets/ML-Agents/Scripts/Academy.cs

6
CONTRIBUTING.md


# Contribution Guidelines
Thank you for your interest in contributing to ML-Agents! We are incredibly
excited to see how members of our community will use and extend ML-Agents.
Thank you for your interest in contributing to the ML-Agents toolkit! We are incredibly
excited to see how members of our community will use and extend the ML-Agents toolkit.
To facilitate your contributions, we've outlined a brief set of guidelines
to ensure that your extensions can be easily integrated.

as we expect all our contributors to follow it.
Second, before starting on a project that you intend to contribute
to ML-Agents (whether environments or modifications to the codebase),
to the ML-Agents toolkit (whether environments or modifications to the codebase),
we **strongly** recommend posting on our
[Issues page](https://github.com/Unity-Technologies/ml-agents/issues) and
briefly outlining the changes you plan to make. This will enable us to provide

12
README.md


These trained agents can be used for multiple purposes, including
controlling NPC behavior (in a variety of settings such as multi-agent and
adversarial), automated testing of game builds and evaluating different game
design decisions pre-release. ML-Agents is mutually beneficial for both game
design decisions pre-release. The ML-Agents toolkit is mutually beneficial for both game
developers and AI researchers as it provides a central platform where advances
in AI can be evaluated on Unity’s rich environments and then made accessible
to the wider research and game developer communities.

* For more information, in addition to installation and usage
instructions, see our [documentation home](docs/Readme.md).
* If you have
used a version of ML-Agents prior to v0.4, we strongly recommend
used a version of the ML-Agents toolkit prior to v0.4, we strongly recommend
our [guide on migrating from earlier versions](docs/Migrating.md).
## References

## Community and Feedback
ML-Agents is an open-source project and we encourage and welcome contributions.
The ML-Agents toolkit is an open-source project and we encourage and welcome contributions.
If you wish to contribute, be sure to review our
[contribution guidelines](CONTRIBUTING.md) and
[code of conduct](CODE_OF_CONDUCT.md).

* Join our
[Unity Machine Learning Channel](https://connect.unity.com/messages/c/035fba4f88400000)
to connect with others using ML-Agents and Unity developers enthusiastic
to connect with others using the ML-Agents toolkit and Unity developers enthusiastic
regarding ML-Agents (and, more broadly, machine learning in games).
* If you run into any problems using ML-Agents,
regarding the ML-Agents toolkit (and, more broadly, machine learning in games).
* If you run into any problems using the ML-Agents toolkit,
[submit an issue](https://github.com/Unity-Technologies/ml-agents/issues) and
make sure to include as much detail as possible.

2
docs/API-Reference.md


doxygen dox-ml-agents.conf
`dox-ml-agents.conf` is a Doxygen configuration file for ML-Agents
`dox-ml-agents.conf` is a Doxygen configuration file for the ML-Agents toolkit
that includes the classes that have been properly formatted.
The generated HTML files will be placed
in the `html/` subdirectory. Open `index.html` within that subdirectory to

10
docs/Background-Machine-Learning.md


# Background: Machine Learning
Given that a number of users of ML-Agents might not have a formal machine
Given that a number of users of the ML-Agents toolkit might not have a formal machine
understanding of ML-Agents. However, We will not attempt to provide a thorough
understanding of the ML-Agents toolkit. However, We will not attempt to provide a thorough
treatment of machine learning as there are fantastic resources online.
Machine learning, a branch of artificial intelligence, focuses on learning

several iterations to achieve good performance.
We now switch to reinforcement learning, the third class of
machine learning algorithms, and arguably the one most relevant for ML-Agents.
machine learning algorithms, and arguably the one most relevant for the ML-Agents toolkit.
## Reinforcement Learning

robot, with its own observations about the environment, its own set of actions
and a specific objective. Thus it is natural to explore how we can
train behaviors within Unity using reinforcement learning. This is precisely
what ML-Agents offers. The video linked below includes a reinforcement
learning demo showcasing training character behaviors using ML-Agents.
what the ML-Agents toolkit offers. The video linked below includes a reinforcement
learning demo showcasing training character behaviors using the ML-Agents toolkit.
<p align="center">
<a href="http://www.youtube.com/watch?feature=player_embedded&v=fiQsmdwEGT8" target="_blank">

10
docs/Background-TensorFlow.md


As discussed in our
[machine learning background page](Background-Machine-Learning.md), many of the
algorithms we provide in ML-Agents leverage some form of deep learning.
algorithms we provide in the ML-Agents toolkit leverage some form of deep learning.
produced by ML-Agents are (currently) in a format only understood by
produced by the ML-Agents toolkit are (currently) in a format only understood by
to TensorFlow-related tools that we leverage within ML-Agents.
to TensorFlow-related tools that we leverage within the ML-Agents toolkit.
## TensorFlow

GPUs in a desktop, server, or mobile device. Within ML-Agents, when you
GPUs in a desktop, server, or mobile device. Within the ML-Agents toolkit, when you
train the behavior of an Agent, the output is a TensorFlow model (.bytes)
file that you can then embed within an Internal Brain. Unless you implement
a new algorithm, the use of TensorFlow is mostly abstracted away and behind

TensorFlowSharp. We provide an additional in-depth overview of how to
leverage [TensorFlowSharp within Unity](Using-TensorFlow-Sharp-in-Unity.md)
which will become more relevant once you install and start training
behaviors within ML-Agents. Given the reliance on TensorFlowSharp, the
behaviors within the ML-Agents toolkit. Given the reliance on TensorFlowSharp, the
Internal Brain is currently marked as experimental.

2
docs/Background-Unity.md


[Tutorials page](https://unity3d.com/learn/tutorials). The
[Roll-a-ball tutorial](https://unity3d.com/learn/tutorials/s/roll-ball-tutorial)
is a fantastic resource to learn all the basic concepts of Unity to get started
with ML-Agents:
with the ML-Agents toolkit:
* [Editor](https://docs.unity3d.com/Manual/UsingTheEditor.html)
* [Interface](https://docs.unity3d.com/Manual/LearningtheInterface.html)
* [Scene](https://docs.unity3d.com/Manual/CreatingScenes.html)

10
docs/Basic-Guide.md


If you are not familiar with the [Unity Engine](https://unity3d.com/unity),
we highly recommend the [Roll-a-ball tutorial](https://unity3d.com/learn/tutorials/s/roll-ball-tutorial) to learn all the basic concepts of Unity.
## Setting up ML-Agents within Unity
## Setting up the ML-Agents Toolkit within Unity
In order to use ML-Agents within Unity, you need to change some Unity settings first. Also [TensorFlowSharp plugin](https://s3.amazonaws.com/unity-ml-agents/0.4/TFSharpPlugin.unitypackage) is needed for you to use pretrained model within Unity, which is based on the [TensorFlowSharp repo](https://github.com/migueldeicaza/TensorFlowSharp).
In order to use the ML-Agents toolkit within Unity, you need to change some Unity settings first. Also [TensorFlowSharp plugin](https://s3.amazonaws.com/unity-ml-agents/0.4/TFSharpPlugin.unitypackage) is needed for you to use pretrained model within Unity, which is based on the [TensorFlowSharp repo](https://github.com/migueldeicaza/TensorFlowSharp).
3. Using the file dialog that opens, locate the `unity-environment` folder within the ML-Agents project and click **Open**.
3. Using the file dialog that opens, locate the `unity-environment` folder within the the ML-Agents toolkit project and click **Open**.
4. Go to **Edit** > **Project Settings** > **Player**
5. For **each** of the platforms you target
(**PC, Mac and Linux Standalone**, **iOS** or **Android**):

### Training the environment
1. Open a command or terminal window.
2. Nagivate to the folder where you installed ML-Agents.
2. Nagivate to the folder where you installed the ML-Agents toolkit.
3. Change to the `python` directory.
4. Run `python3 learn.py --run-id=<run-identifier> --train`
Where:

## Next Steps
* For more information on ML-Agents, in addition to helpful background, check out the [ML-Agents Overview](ML-Agents-Overview.md) page.
* For more information on the ML-Agents toolkit, in addition to helpful background, check out the [ML-Agents Toolkit Overview](ML-Agents-Overview.md) page.
* For a more detailed walk-through of our 3D Balance Ball environment, check out the [Getting Started](Getting-Started-with-Balance-Ball.md) page.
* For a "Hello World" introduction to creating your own learning environment, check out the [Making a New Learning Environment](Learning-Environment-Create-New.md) page.
* For a series of Youtube video tutorials, checkout the [Machine Learning Agents PlayList](https://www.youtube.com/playlist?list=PLX2vGYjWbI0R08eWQkO7nQkGiicHAX7IX) page.

4
docs/FAQ.md


error CS1061: Type `System.Text.StringBuilder' does not contain a definition for `Clear' and no extension method `Clear' of type `System.Text.StringBuilder' could be found. Are you missing an assembly reference?
```
This is because .NET 3.5 doesn't support method Clear() for StringBuilder, refer to [Setting Up ML-Agents Within Unity](Installation.md#setting-up-ml-agent-within-unity) for solution.
This is because .NET 3.5 doesn't support method Clear() for StringBuilder, refer to [Setting Up The ML-Agents Toolkit Within Unity](Installation.md#setting-up-ml-agent-within-unity) for solution.
### TensorFlowSharp flag not turned on.

You need to install and enable the TensorFlowSharp plugin in order to use the internal brain.
```
This error message occurs because the TensorFlowSharp plugin won't be usage without the ENABLE_TENSORFLOW flag, refer to [Setting Up ML-Agents Within Unity](Installation.md#setting-up-ml-agent-within-unity) for solution.
This error message occurs because the TensorFlowSharp plugin won't be usage without the ENABLE_TENSORFLOW flag, refer to [Setting Up The ML-Agents Toolkit Within Unity](Installation.md#setting-up-ml-agent-within-unity) for solution.
### Tensorflow epsilon placeholder error

18
docs/Getting-Started-with-Balance-Ball.md


# Getting Started with the 3D Balance Ball Environment
This tutorial walks through the end-to-end process of opening an ML-Agents
This tutorial walks through the end-to-end process of opening a ML-Agents toolkit
ML-Agents includes a number of [example environments](Learning-Environment-Examples.md)
which you can examine to help understand the different ways in which ML-Agents
The ML-Agents toolkit includes a number of [example environments](Learning-Environment-Examples.md)
which you can examine to help understand the different ways in which the ML-Agents toolkit
can be used. These environments can also serve as templates for new
environments or as ways to test new ML algorithms. After reading this tutorial,
you should be able to explore and build the example environments.

## Installation
In order to install and set up ML-Agents, the Python dependencies and Unity,
In order to install and set up the ML-Agents toolkit, the Python dependencies and Unity,
see the [installation instructions](Installation.md).
## Understanding a Unity Environment (3D Balance Ball)

**Vector Observation Space**
Before making a decision, an agent collects its observation about its state
in the world. ML-Agents classifies vector observations into two types:
in the world. The ML-Agents toolkit classifies vector observations into two types:
**Continuous** and **Discrete**. The **Continuous** vector observation space
collects observations in a vector of floating point numbers. The **Discrete**
vector observation space is an index into a table of states. Most of the example

**Vector Action Space**
An agent is given instructions from the brain in the form of *actions*. Like
states, ML-Agents classifies actions into two types: the **Continuous**
states, ML-Agents toolkit classifies actions into two types: the **Continuous**
vector action space is a vector of numbers that can vary continuously. What
each element of the vector means is defined by the agent logic (the PPO
training process just learns what values are better given particular state

Reinforcement Learning algorithm called Proximal Policy Optimization (PPO).
This is a method that has been shown to be safe, efficient, and more general
purpose than many other RL algorithms, as such we have chosen it as the
example algorithm for use with ML-Agents. For more information on PPO,
example algorithm for use with ML-Agents toolkit. For more information on PPO,
OpenAI has a recent [blog post](https://blog.openai.com/openai-baselines-ppo/)
explaining it.

**Note**: If you're using Anaconda, don't forget to activate the ml-agents environment first.
The `--train` flag tells ML-Agents to run in training mode.
The `--train` flag tells the ML-Agents toolkit to run in training mode.
**Note**: You can train using an executable rather than the Editor. To do so, follow the intructions in
[Using an Execuatble](Learning-Environment-Executable.md).

default. In order to enable it, you must follow these steps. Please note that
the `Internal` Brain mode will only be available once completing these steps.
To set up the TensorFlowSharp Support, follow [Setting up ML-Agents within Unity](Basic-Guide.md#setting-up-ml-agents-within-unity) section.
To set up the TensorFlowSharp Support, follow [Setting up ML-Agents Toolkit within Unity](Basic-Guide.md#setting-up-ml-agents-within-unity) section.
of the Basic Guide page.
### Embedding the trained model into Unity

2
docs/Glossary.md


# ML-Agents Glossary
# ML-Agents Toolkit Glossary
* **Academy** - Unity Component which controls timing, reset, and
training/inference settings of the environment.

24
docs/Installation-Windows.md


# Installing ML-Agents for Windows
# Installing ML-Agents Toolkit for Windows
ML-Agents supports Windows 10. While it might be possible to run ML-Agents using other versions of Windows, it has not been tested on other versions. Furthermore, ML-Agents has not been tested on a Windows VM such as Bootcamp or Parallels.
The ML-Agents toolkit supports Windows 10. While it might be possible to run the ML-Agents toolkit using other versions of Windows, it has not been tested on other versions. Furthermore, the ML-Agents toolkit has not been tested on a Windows VM such as Bootcamp or Parallels.
To use ML-Agents, you install Python and the required Python packages as outlined below. This guide also covers how set up GPU-based training (for advanced users). GPU-based training is not required for the v0.4 release of ML-Agents. However, training on a GPU might be required by future versions and features.
To use the ML-Agents toolkit, you install Python and the required Python packages as outlined below. This guide also covers how set up GPU-based training (for advanced users). GPU-based training is not required for the v0.4 release of the ML-Agents toolkit. However, training on a GPU might be required by future versions and features.
## Step 1: Install Python via Anaconda

## Step 2: Setup and Activate a New Conda Environment
You will create a new [Conda environment](https://conda.io/docs/) to be used with ML-Agents. This means that all the packages that you install are localized to just this environment. It will not affect any other installation of Python or other environments. Whenever you want to run ML-Agents, you will need activate this Conda environment.
You will create a new [Conda environment](https://conda.io/docs/) to be used with the ML-Agents toolkit. This means that all the packages that you install are localized to just this environment. It will not affect any other installation of Python or other environments. Whenever you want to run ML-Agents, you will need activate this Conda environment.
To create a new Conda environment, open a new Anaconda Prompt (_Anaconda Prompt_ in the search bar) and type in the following command:

## Step 3: Install Required Python Packages
ML-Agents depends on a number of Python packages. Use `pip` to install these Python dependencies.
The ML-Agents toolkit depends on a number of Python packages. Use `pip` to install these Python dependencies.
If you haven't already, clone the ML-Agents Github repository to your local computer. You can do this using Git ([download here](https://git-scm.com/download/win)) and running the following commands in an Anaconda Prompt _(if you open a new prompt, be sure to activate the ml-agents Conda environment by typing `activate ml-agents`)_:
If you haven't already, clone the ML-Agents Toolkit Github repository to your local computer. You can do this using Git ([download here](https://git-scm.com/download/win)) and running the following commands in an Anaconda Prompt _(if you open a new prompt, be sure to activate the ml-agents Conda environment by typing `activate ml-agents`)_:
```
git clone https://github.com/Unity-Technologies/ml-agents.git

In our example, the files are located in `C:\Downloads`. After you have either cloned or downloaded the files, from the Anaconda Prompt, change to the python directory inside the ML-agents directory:
In our example, the files are located in `C:\Downloads`. After you have either cloned or downloaded the files, from the Anaconda Prompt, change to the python directory inside the ml-agents directory:
```
cd C:\Downloads\ml-agents\python

```
This will complete the installation of all the required Python packages to run ML-Agents.
This will complete the installation of all the required Python packages to run the ML-Agents toolkit.
## (Optional) Step 4: GPU Training using ML-Agents
## (Optional) Step 4: GPU Training using The ML-Agents Toolkit
GPU is not required for ML-Agents and won't speed up the PPO algorithm a lot during training(but something in the future will benefit from GPU). This is a guide for advanced users who want to train using GPUs. Additionally, you will need to check if your GPU is CUDA compatible. Please check Nvidia's page [here](https://developer.nvidia.com/cuda-gpus).
GPU is not required for the ML-Agents toolkit and won't speed up the PPO algorithm a lot during training(but something in the future will benefit from GPU). This is a guide for advanced users who want to train using GPUs. Additionally, you will need to check if your GPU is CUDA compatible. Please check Nvidia's page [here](https://developer.nvidia.com/cuda-gpus).
As of ML-Agents v0.4, only CUDA v9.0 and cuDNN v7.0.5 is supported.
As of the ML-Agents toolkit v0.4, only CUDA v9.0 and cuDNN v7.0.5 is supported.
[Download](https://developer.nvidia.com/cuda-toolkit-archive) and install the CUDA toolkit 9.0 from Nvidia's archive. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ (Step Visual Studio 2017) compiler and a runtime library and is needed to run ML-Agents. In this guide, we are using version 9.0.176 (https://developer.nvidia.com/compute/cuda/9.0/Prod/network_installers/cuda_9.0.176_win10_network-exe)).
[Download](https://developer.nvidia.com/cuda-toolkit-archive) and install the CUDA toolkit 9.0 from Nvidia's archive. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ (Step Visual Studio 2017) compiler and a runtime library and is needed to run the ML-Agents toolkit. In this guide, we are using version 9.0.176 (https://developer.nvidia.com/compute/cuda/9.0/Prod/network_installers/cuda_9.0.176_win10_network-exe)).
Before installing, please make sure you __close any running instances of Unity or Visual Studio__.

8
docs/Installation.md


width="500" border="10" />
</p>
## Clone the ml-agents Repository
## Clone the Ml-Agents Toolkit Repository
Once installed, you will want to clone the ML-Agents GitHub repository.
Once installed, you will want to clone the ML-Agents Toolkit GitHub repository.
git clone https://github.com/Unity-Technologies/ml-agents.git

## Install Python (with Dependencies)
In order to use ML-Agents, you need Python 3.5 or 3.6 along with
In order to use ML-Agents toolkit, you need Python 3.5 or 3.6 along with
the dependencies listed in the [requirements file](../python/requirements.txt).
Some of the primary dependencies include:
- [TensorFlow](Background-TensorFlow.md)

## Next Steps
The [Basic Guide](Basic-Guide.md) page contains several short
tutorials on setting up ML-Agents within Unity, running a pre-trained model, in
tutorials on setting up the ML-Agents toolkit within Unity, running a pre-trained model, in
addition to building and training environments.
## Help

2
docs/Learning-Environment-Create-New.md


## Overview
Using ML-Agents in a Unity project involves the following basic steps:
Using the ML-Agents toolkit in a Unity project involves the following basic steps:
1. Create an environment for your agents to live in. An environment can range from a simple physical simulation containing a few objects to an entire game or ecosystem.
2. Implement an Academy subclass and add it to a GameObject in the Unity scene containing the environment. This GameObject will serve as the parent for any Brain objects in the scene. Your Academy class can implement a few optional methods to update the scene independently of any agents. For example, you can add, move, or delete agents and other entities in the environment.

2
docs/Learning-Environment-Design-Agents.md


### Discrete Vector Observation Space: Table Lookup
You can use the discrete vector observation space when an agent only has a limited number of possible states and those states can be enumerated by a single number. For instance, the [Basic example environment](Learning-Environment-Examples.md#basic) in ML-Agents defines an agent with a discrete vector observation space. The states of this agent are the integer steps between two linear goals. In the Basic example, the agent learns to move to the goal that provides the greatest reward.
You can use the discrete vector observation space when an agent only has a limited number of possible states and those states can be enumerated by a single number. For instance, the [Basic example environment](Learning-Environment-Examples.md#basic) in the ML-Agents toolkit defines an agent with a discrete vector observation space. The states of this agent are the integer steps between two linear goals. In the Basic example, the agent learns to move to the goal that provides the greatest reward.
More generally, the discrete vector observation identifier could be an index into a table of the possible states. However, tables quickly become unwieldy as the environment becomes more complex. For example, even a simple game like [tic-tac-toe has 765 possible states](https://en.wikipedia.org/wiki/Game_complexity) (far more if you don't reduce the number of observations by combining those that are rotations or reflections of each other).

2
docs/Learning-Environment-Design-Brains.md


The Brain encapsulates the decision making process. Brain objects must be children of the Academy in the Unity scene hierarchy. Every Agent must be assigned a Brain, but you can use the same Brain with more than one Agent. You can also create several Brains, attach each of the Brain to one or more than one Agent.
Use the Brain class directly, rather than a subclass. Brain behavior is determined by the **Brain Type**. ML-Agents defines four Brain Types:
Use the Brain class directly, rather than a subclass. Brain behavior is determined by the **Brain Type**. The ML-Agents toolkit defines four Brain Types:
* [External](Learning-Environment-Design-External-Internal-Brains.md) — The **External** and **Internal** types typically work together; set **External** when training your agents. You can also use the **External** brain to communicate with a Python script via the Python `UnityEnvironment` class included in the Python portion of the ML-Agents SDK.
* [Internal](Learning-Environment-Design-External-Internal-Brains.md) – Set **Internal** to make use of a trained model.

6
docs/Learning-Environment-Design.md


Reinforcement learning is an artificial intelligence technique that trains _agents_ to perform tasks by rewarding desirable behavior. During reinforcement learning, an agent explores its environment, observes the state of things, and, based on those observations, takes an action. If the action leads to a better state, the agent receives a positive reward. If it leads to a less desirable state, then the agent receives no reward or a negative reward (punishment). As the agent learns during training, it optimizes its decision making so that it receives the maximum reward over time.
ML-Agents uses a reinforcement learning technique called [Proximal Policy Optimization (PPO)](https://blog.openai.com/openai-baselines-ppo/). PPO uses a neural network to approximate the ideal function that maps an agent's observations to the best action an agent can take in a given state. The ML-Agents PPO algorithm is implemented in TensorFlow and runs in a separate Python process (communicating with the running Unity application over a socket).
The ML-Agents toolkit uses a reinforcement learning technique called [Proximal Policy Optimization (PPO)](https://blog.openai.com/openai-baselines-ppo/). PPO uses a neural network to approximate the ideal function that maps an agent's observations to the best action an agent can take in a given state. The ML-Agents PPO algorithm is implemented in TensorFlow and runs in a separate Python process (communicating with the running Unity application over a socket).
**Note:** if you aren't studying machine and reinforcement learning as a subject and just want to train agents to accomplish tasks, you can treat PPO training as a _black box_. There are a few training-related parameters to adjust inside Unity as well as on the Python training side, but you do not need in-depth knowledge of the algorithm itself to successfully create and train agents. Step-by-step procedures for running the training process are provided in the [Training section](Training-ML-Agents.md).

## Organizing the Unity Scene
To train and use ML-Agents in a Unity scene, the scene must contain a single Academy subclass along with as many Brain objects and Agent subclasses as you need. Any Brain instances in the scene must be attached to GameObjects that are children of the Academy in the Unity Scene Hierarchy. Agent instances should be attached to the GameObject representing that agent.
To train and use the ML-Agents toolkit in a Unity scene, the scene must contain a single Academy subclass along with as many Brain objects and Agent subclasses as you need. Any Brain instances in the scene must be attached to GameObjects that are children of the Academy in the Unity Scene Hierarchy. Agent instances should be attached to the GameObject representing that agent.
![Scene Hierarchy](images/scene-hierarchy.png)

## Environments
An _environment_ in ML-Agents can be any scene built in Unity. The Unity scene provides the environment in which agents observe, act, and learn. How you set up the Unity scene to serve as a learning environment really depends on your goal. You may be trying to solve a specific reinforcement learning problem of limited scope, in which case you can use the same scene for both training and for testing trained agents. Or, you may be training agents to operate in a complex game or simulation. In this case, it might be more efficient and practical to create a purpose-built training scene.
An _environment_ in the ML-Agents toolkit can be any scene built in Unity. The Unity scene provides the environment in which agents observe, act, and learn. How you set up the Unity scene to serve as a learning environment really depends on your goal. You may be trying to solve a specific reinforcement learning problem of limited scope, in which case you can use the same scene for both training and for testing trained agents. Or, you may be training agents to operate in a complex game or simulation. In this case, it might be more efficient and practical to create a purpose-built training scene.
Both training and testing (or normal game) scenes must contain an Academy object to control the agent decision making process. The Academy defines several properties that can be set differently for a training scene versus a regular scene. The Academy's **Configuration** properties control rendering and time scale. You can set the **Training Configuration** to minimize the time Unity spends rendering graphics in order to speed up training. You may need to adjust the other functional, Academy settings as well. For example, `Max Steps` should be as short as possible for training — just long enough for the agent to accomplish its task, with some extra time for "wandering" while it learns. In regular scenes, you often do not want the Academy to reset the scene at all; if so, `Max Steps` should be set to zero.

2
docs/Limitations.md


As of version 0.3, we no longer support Python 2.
### Tensorflow support
Currently Ml-Agents uses TensorFlow 1.7.1 due to the version of the TensorFlowSharp plugin we are using.
Currently the Ml-Agents toolkit uses TensorFlow 1.7.1 due to the version of the TensorFlowSharp plugin we are using.

38
docs/ML-Agents-Overview.md


# ML-Agents Overview
# ML-Agents Toolkit Overview
**Unity Machine Learning Agents** (ML-Agents) is an open-source Unity plugin
**The Unity Machine Learning Agents Toolkit** (ML-Agents Toolkit) is an open-source Unity plugin
that enables games and simulations to serve as environments for training
intelligent agents. Agents can be trained using reinforcement learning,
imitation learning, neuroevolution, or other machine learning methods through

These trained agents can be used for multiple purposes, including
controlling NPC behavior (in a variety of settings such as multi-agent and
adversarial), automated testing of game builds and evaluating different game
design decisions pre-release. ML-Agents is mutually beneficial for both game
design decisions pre-release. The ML-Agents toolkit is mutually beneficial for both game
developers and AI researchers as it provides a central platform where advances
in AI can be evaluated on Unity’s rich environments and then made accessible
to the wider research and game developer communities.

To make your transition to ML-Agents easier, we provide several background
To make your transition to the ML-Agents toolkit easier, we provide several background
pages that include overviews and helpful resources on the
[Unity Engine](Background-Unity.md),
[machine learning](Background-Machine-Learning.md) and

The remainder of this page contains a deep dive into ML-Agents, its key
components, different training modes and scenarios. By the end of it, you
should have a good sense of _what_ ML-Agents allows you to do. The subsequent
should have a good sense of _what_ the ML-Agents toolkit allows you to do. The subsequent
documentation pages provide examples of _how_ to use ML-Agents.
## Running Example: Training NPC Behaviors

**training phase**, while playing the game with an NPC that is using its
learned policy is called the **inference phase**.
ML-Agents provides all the necessary tools for using Unity as the simulation
The ML-Agents toolkit provides all the necessary tools for using Unity as the simulation
In the next few sections, we discuss how ML-Agents achieves this and what
In the next few sections, we discuss how the ML-Agents toolkit achieves this and what
ML-Agents is a Unity plugin that contains three high-level components:
The ML-Agents toolkit is a Unity plugin that contains three high-level components:
* **Learning Environment** - which contains the Unity scene and all the game
characters.
* **Python API** - which contains all the machine learning algorithms that are

border="10" />
</p>
_Example block diagram of ML-Agents for our sample game._
_Example block diagram of ML-Agents toolkit for our sample game._
We have yet to discuss how ML-Agents trains behaviors, and what role the
We have yet to discuss how the ML-Agents toolkit trains behaviors, and what role the
Python API and External Communicator play. Before we dive into those details,
let's summarize the earlier components. Each character is attached to an Agent,
and each Agent is linked to a Brain. The Brain receives observations and

### Built-in Training and Inference
As mentioned previously, ML-Agents ships with several implementations of
As mentioned previously, the ML-Agents toolkit ships with several implementations of
state-of-the-art algorithms for training intelligent agents. In this mode, the
Brain type is set to External during training and Internal during inference.
More specifically, during training, all the medics in the scene send their

In the previous mode, the External Brain type was used for training
to generate a TensorFlow model that the Internal Brain type can understand
and use. However, any user of ML-Agents can leverage their own algorithms
and use. However, any user of the ML-Agents toolkit can leverage their own algorithms
for both training and inference. In this case, the Brain type would be set
to External for both training and inferences phases and the behaviors of
all the Agents in the scene will be controlled within Python.

one that is successively improved as the environment gradually increases in
complexity. In our example, we can imagine first training the medic when each
team only contains one player, and then iteratively increasing the number of
players (i.e. the environment complexity). ML-Agents supports setting
players (i.e. the environment complexity). The ML-Agents toolkit supports setting
custom environment parameters within the Academy. This allows
elements of the environment related to difficulty or complexity to be
dynamically adjusted based on training progress.

## Additional Features
Beyond the flexible training scenarios available, ML-Agents includes
Beyond the flexible training scenarios available, the ML-Agents toolkit includes
* **On Demand Decision Making** - With ML-Agents it is possible to have agents
* **On Demand Decision Making** - With the ML-Agents toolkit it is possible to have agents
request decisions only when needed as opposed to requesting decisions at
every step of the environment. This enables training of turn based games,
games where agents

[here](Feature-Monitor.md).
* **Complex Visual Observations** - Unlike other platforms, where the agent’s
observation might be limited to a single vector or image, ML-Agents allows
observation might be limited to a single vector or image, the ML-Agents toolkit allows
multiple cameras to be used for observations per agent. This enables agents to
learn to integrate information from multiple visual streams. This can be
helpful in several scenarios such as training a self-driving car which requires

[guide](Using-Docker.md) on how
to create and run a Docker container.
* **Cloud Training on AWS** - To facilitate using ML-Agents on
* **Cloud Training on AWS** - To facilitate using the ML-Agents toolkit on
* **Cloud Training on Microsoft Azure** - To facilitate using ML-Agents on
* **Cloud Training on Microsoft Azure** - To facilitate using the ML-Agents toolkit on
Azure machines, we provide a
[guide](Training-on-Microsoft-Azure.md)
on how to set-up virtual machine instances in addition to a pre-configured data science image.

To briefly summarize: ML-Agents enables games and simulations built in Unity
To briefly summarize: The ML-Agents toolkit enables games and simulations built in Unity
to serve as the platform for training intelligent agents. It is designed
to enable a large variety of training modes and scenarios and comes packed
with several features to enable researchers and developers to leverage

8
docs/Migrating.md


# Migrating from ML-Agents v0.3 to ML-Agents v0.4
# Migrating from the ML-Agents toolkit v0.3 to the ML-Agents toolkit v0.4
## Unity API
* `using MLAgents;` needs to be added in all of the C# scripts that use ML-Agents.

# Migrating from ML-Agents v0.2 to ML-Agents v0.3
# Migrating from the ML-Agents toolkit v0.2 to the ML-Agents toolkit v0.3
There are a large number of new features and improvements in ML-Agents v0.3 which change both the training process and Unity API in ways which will cause incompatibilities with environments made using older versions. This page is designed to highlight those changes for users familiar with v0.1 or v0.2 in order to ensure a smooth transition.
There are a large number of new features and improvements in the ML-Agents toolkit v0.3 which change both the training process and Unity API in ways which will cause incompatibilities with environments made using older versions. This page is designed to highlight those changes for users familiar with v0.1 or v0.2 in order to ensure a smooth transition.
* ML-Agents is no longer compatible with Python 2.
* The ML-Agents toolkit is no longer compatible with Python 2.
## Python Training
* The training script `ppo.py` and `PPO.ipynb` Python notebook have been replaced with a single `learn.py` script as the launching point for training with ML-Agents. For more information on using `learn.py`, see [here]().

2
docs/Python-API.md


# Python API
ML-Agents provides a Python API for controlling the agent simulation loop of a environment or game built with Unity. This API is used by the ML-Agent training algorithms (run with `learn.py`), but you can also write your Python programs using this API.
The ML-Agents toolkit provides a Python API for controlling the agent simulation loop of a environment or game built with Unity. This API is used by the ML-Agent training algorithms (run with `learn.py`), but you can also write your Python programs using this API.
The key objects in the Python API include:

2
docs/Readme.md


* [Basic Guide](Basic-Guide.md)
## Getting Started
* [ML-Agents Overview](ML-Agents-Overview.md)
* [ML-Agents Toolkit Overview](ML-Agents-Overview.md)
* [Background: Unity](Background-Unity.md)
* [Background: Machine Learning](Background-Machine-Learning.md)
* [Background: TensorFlow](Background-TensorFlow.md)

2
docs/Training-Curriculum-Learning.md


accomplish the task. From there, we can slowly add to the difficulty of the task by
increasing the size of the wall, until the agent can complete the initially
near-impossible task of scaling the wall. We are including just such an environment with
ML-Agents 0.2, called Wall Jump.
the ML-Agents toolkit 0.2, called Wall Jump.
![Wall](images/curriculum.png)

4
docs/Training-ML-Agents.md


# Training ML-Agents
ML-Agents conducts training using an external Python training process. During training, this external process communicates with the Academy object in the Unity scene to generate a block of agent experiences. These experiences become the training set for a neural network used to optimize the agent's policy (which is essentially a mathematical function mapping observations to actions). In reinforcement learning, the neural network optimizes the policy by maximizing the expected rewards. In imitation learning, the neural network optimizes the policy to achieve the smallest difference between the actions chosen by the agent trainee and the actions chosen by the expert in the same situation.
The ML-Agents toolkit conducts training using an external Python training process. During training, this external process communicates with the Academy object in the Unity scene to generate a block of agent experiences. These experiences become the training set for a neural network used to optimize the agent's policy (which is essentially a mathematical function mapping observations to actions). In reinforcement learning, the neural network optimizes the policy by maximizing the expected rewards. In imitation learning, the neural network optimizes the policy to achieve the smallest difference between the actions chosen by the agent trainee and the actions chosen by the expert in the same situation.
For a broader overview of reinforcement learning, imitation learning and the ML-Agents training process, see [ML-Agents Overview](ML-Agents-Overview.md).
For a broader overview of reinforcement learning, imitation learning and the ML-Agents training process, see [ML-Agents Toolkit Overview](ML-Agents-Overview.md).
## Training with learn.py

4
docs/Training-on-Amazon-Web-Service.md


export DISPLAY=:0
```
## Configuring your own Instance
## Configuring your own instance
### Installing ML-Agents on the instance
### Installing the ML-Agents toolkit on the instance
After launching your EC2 instance using the ami and ssh into it:

2
docs/Training-on-Microsoft-Azure-Custom-Instance.md


# Setting up a Custom Instance on Microsoft Azure for Training (works with ML-Agents v0.3)
# Setting up a Custom Instance on Microsoft Azure for Training (works with the ML-Agents toolkit v0.3)
This page contains instructions for setting up a custom Virtual Machine on Microsoft Azure so you can running ML-Agents training in the cloud.

4
docs/Training-on-Microsoft-Azure.md


# Training on Microsoft Azure (works with ML-Agents v0.3)
# Training on Microsoft Azure (works with ML-Agents toolkit v0.3)
This page contains instructions for setting up training on Microsoft Azure through either [Azure Container Instances](https://azure.microsoft.com/services/container-instances/) or Virtual Machines. Non "headless" training has not yet been tested to verify support.

## Running on Azure Container Instances
[Azure Container Instances](https://azure.microsoft.com/services/container-instances/) allow you to spin up a container, on demand, that will run your training and then be shut down. This ensures you aren't leaving a billable VM running when it isn't needed. You can read more about [ML-Agents support for Docker containers here](Using-Docker.md). Using ACI enables you to offload training of your models without needing to install Python and Tensorflow on your own computer. You can find [instructions, including a pre-deployed image in DockerHub for you to use, available here](https://github.com/druttka/unity-ml-on-azure).
[Azure Container Instances](https://azure.microsoft.com/services/container-instances/) allow you to spin up a container, on demand, that will run your training and then be shut down. This ensures you aren't leaving a billable VM running when it isn't needed. You can read more about [The ML-Agents toolkit support for Docker containers here](Using-Docker.md). Using ACI enables you to offload training of your models without needing to install Python and Tensorflow on your own computer. You can find [instructions, including a pre-deployed image in DockerHub for you to use, available here](https://github.com/druttka/unity-ml-on-azure).

2
docs/Using-TensorFlow-Sharp-in-Unity.md


# Using TensorFlowSharp in Unity (Experimental)
ML-Agents allows you to use pre-trained [TensorFlow graphs](https://www.tensorflow.org/programmers_guide/graphs) inside your Unity games. This support is possible thanks to [the TensorFlowSharp project](https://github.com/migueldeicaza/TensorFlowSharp). The primary purpose for this support is to use the TensorFlow models produced by the ML-Agents own training programs, but a side benefit is that you can use any TensorFlow model.
The ML-Agents toolkit allows you to use pre-trained [TensorFlow graphs](https://www.tensorflow.org/programmers_guide/graphs) inside your Unity games. This support is possible thanks to [the TensorFlowSharp project](https://github.com/migueldeicaza/TensorFlowSharp). The primary purpose for this support is to use the TensorFlow models produced by the ML-Agents toolkit's own training programs, but a side benefit is that you can use any TensorFlow model.
_Notice: This feature is still experimental. While it is possible to embed trained models into Unity games, Unity Technologies does not officially support this use-case for production games at this time. As such, no guarantees are provided regarding the quality of experience. If you encounter issues regarding battery life, or general performance (especially on mobile), please let us know._

4
docs/Using-Tensorboard.md


# Using TensorBoard to Observe Training
ML-Agents saves statistics during learning session that you can view with a TensorFlow utility named, [TensorBoard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard).
The ML-Agents toolkit saves statistics during learning session that you can view with a TensorFlow utility named, [TensorBoard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard).
The `learn.py` program saves training statistics to a folder named `summaries`, organized by the `run-id` value you assign to a training session.

When you run the training program, `learn.py`, you can use the `--save-freq` option to specify how frequently to save the statistics.
## ML-Agents training statistics
## The ML-Agents toolkit training statistics
The ML-agents training program saves the following statistics:

2
docs/dox-ml-agents.conf


# title of most generated pages and in a few other places.
# The default value is: My Project.
PROJECT_NAME = "ML-Agents"
PROJECT_NAME = "ML-Agents Toolkit"
# The PROJECT_NUMBER tag can be used to enter a project or revision number. This
# could be handy for archiving the generated documentation or if some version

4
python/Basics.ipynb


"source": [
"### 2. Load dependencies\n",
"\n",
"The following loads the necessary dependencies and checks the Python version (at runtime). ML-Agents (v0.3 onwards) requires Python 3."
"The following loads the necessary dependencies and checks the Python version (at runtime). ML-Agents Toolkit (v0.3 onwards) requires Python 3."
]
},
{

"\n",
"# check Python version\n",
"if (sys.version_info[0] < 3):\n",
" raise Exception(\"ERROR: ML-Agents (v0.3 onwards) requires Python 3\")"
" raise Exception(\"ERROR: ML-Agents Toolkit (v0.3 onwards) requires Python 3\")"
]
},
{

2
unity-environment/Assets/ML-Agents/Scripts/Academy.cs


/**
* Welcome to Unity Machine Learning Agents (ML-Agents).
*
* ML-Agents contains five entities: Academy, Brain, Agent, Communicator and
* The ML-Agents toolkit contains five entities: Academy, Brain, Agent, Communicator and
* Python API. The academy, and all its brains and connected agents live within
* a learning environment (herin called Environment), while the communicator
* manages the communication between the learning environment and the Python

正在加载...
取消
保存