to connect with others using the ML-Agents toolkit and Unity developers
enthusiastic about machine learning. We use that channel to surface updates
regarding the ML-Agents toolkit (and, more broadly, machine learning in
games).
* If you run into any problems using the ML-Agents toolkit,
[submit an issue](https://github.com/Unity-Technologies/ml-agents/issues) and
make sure to include as much detail as possible.
Your opinion matters a great deal to us. Only by hearing your thoughts on the Unity ML-Agents Toolkit can we continue to improve and grow. Please take a few minutes to [let us know about it](https://github.com/Unity-Technologies/ml-agents/issues/1454).
We've included pre-trained models for the 3D Ball example.
1. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Scenes` folder and open the `3DBall` scene file.
2. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Prefabs` folder.
Expand `Game` and click on the `Platform` prefab. You should see the `Platform` prefab in the **Inspector** window.
**Note**: The platforms in the `3DBall` scene were created using the `Platform` prefab. Instead of updating all 12 platforms individually, you can update the `Platform` prefab instead.
![Platform Prefab](images/platform_prefab.png)
3. In the **Project** window, drag the **3DBallLearning** Brain located in
`Assets/ML-Agents/Examples/3DBall/Brains` into the `Brain` property under `Ball 3D Agent (Script)` component in the **Inspector** window.
4. You should notice that each `Platform` under each `Game` in the **Hierarchy** window now contains **3DBallLearning** as `Brain`. __Note__: You can modify multiple game objects in a scene by selecting them all at once.
5. In the **Project** window, click on the **3DBallLearning** Brain located in
`Assets/ML-Agents/Examples/3DBall/Brains`. You should see the properties in the **Inspector** window.
6. In the **Project** window, open the `Assets/ML-Agents/Examples/3DBall/TFModels` folder.
7. Drag the `3DBallLearning` model file from the `Assets/ML-Agents/Examples/3DBall/TFModels`
folder to the **Model** field of the **3DBallLearning** Brain in the **Inspector** window. __Note__: All of the Brains should now have `3DBallLearning` as the TensorFlow model in the `Model` property.
8. Click the **Play** button and you will see the platforms balance the balls using the pre-trained model.
![Running a pretrained model](images/running-a-pretrained-model.gif)
## Using the Basics Jupyter Notebook
## Training the Brain with Reinforcement Learning
### Setting up the environment for training
1. Each platform agent needs an assigned `Learning Brain`. In this example, each platform agent was created using a prefab. To update all of the brains in each platform agent at once, you only need to update the platform agent prefab. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Prefabs` folder. Expand `Game` and click on the `Platform` prefab. You should see the `Platform` prefab in the **Inspector** window. In the **Project** window, drag the **3DBallLearning** Brain located in `Assets/ML-Agents/Examples/3DBall/Brains` into the `Brain` property under `Ball 3D Agent (Script)` component in the **Inspector** window.
**Note**: The Unity prefab system will modify all instances of the agent properties in your scene. If the agent does not synchronize automatically with the prefab, you can hit the Revert button at the top of the **Inspector** window.
2. In the **Hierarchy** window, select `Ball3DAcademy`.
3. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Brains` folder and drag the **3DBallLearning** Brain to the `Brains` property under `Broadcast Hub` in the `Ball3DAcademy` object in the **Inspector** window. In order to train, make sure the `Control` checkbox is selected.
**Note:** Assigning a Brain to an agent (dragging a Brain into the `Brain` property of the agent) means that the Brain will be making decisions for that agent. Dragging a Brain into the Broadcast Hub means that the Brain will be exposed to the Python process, and checking `Control` means that, in addition to being exposed, the Brain will be controlled by the Python process (which is required for training).
![Set Brain to External](images/mlagents-SetBrainToTrain.png)
### Training the environment
Once the message _"Start training by pressing the Play button in the Unity Editor"_ is displayed on the screen, you can press the :arrow_forward: button
in Unity to start training in the Editor.
**Note**: Alternatively, you can use an executable rather than the Editor to
perform training. Please refer to [this
page](Learning-Environment-Executable.md) for instructions on how to build and
use an executable.
### After training
You can press Ctrl+C to stop the training, and your trained model will be at
`models/<run-identifier>/<brain_name>.bytes` where
`<brain_name>` is the name of the Brain corresponding to the model.
(**Note:** There is a known bug on Windows that causes the saving of the model to
fail when you terminate the training early; it's recommended to wait until Step has reached the `max_steps` value set in your trainer configuration.)
In order to install and set up the ML-Agents toolkit, the Python dependencies
and Unity, see the [installation instructions](Installation.md).
## Understanding the Unity Environment (3D Balance Ball)
An agent is an autonomous actor that observes and interacts with an
_environment_. In the context of Unity, an environment is a scene containing an
Academy and one or more Brain and Agent objects, and, of course, the other
entities that an agent interacts with.
The Academy object for the scene is placed on the Ball3DAcademy GameObject. When
you look at an Academy component in the inspector, you can see several
properties that control how the environment works.
The **Broadcast Hub** keeps track of which Brains will send data during training.
If a Brain is added to the hub, the data from this Brain will be sent to the external training
process. If the `Control` checkbox is checked, the training process will also be able to
control and train the agents linked to the Brain.

The **Training Configuration** and **Inference Configuration** properties
set the graphics and timescale properties for the Unity application.
Typically, you would set a low graphics quality and a timescale greater than `1.0` for the **Training
Configuration** and a high graphics quality and a timescale of `1.0` for the
**Inference Configuration**.
**Note:** if you want to observe the environment during training, you can adjust the
**Training Configuration** settings to use a larger window and a timescale closer to 1:1.
Another aspect of an environment is the Academy implementation. Since
the base Academy class is abstract, you must always define a subclass. There are
three functions you can implement, though they are all optional:
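In the version of the toolkit described here (v0.6), those hooks are `InitializeAcademy()`, `AcademyReset()`, and `AcademyStep()`. A minimal sketch of a subclass that overrides all three might look like this (the class name and comments are illustrative, not taken from the 3D Balance Ball example):

```csharp
using MLAgents;

// Illustrative Academy subclass (hypothetical name, not from the example project).
public class ExampleAcademy : Academy
{
    // Called once when the environment launches; set up anything the scene needs.
    public override void InitializeAcademy() { }

    // Called at the start of each training episode; reposition or respawn scene objects here.
    public override void AcademyReset() { }

    // Called every simulation step, before the Agents take their actions.
    public override void AcademyStep() { }
}
```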
### Brain
As of v0.6, a Brain is a Unity asset and exists within the `UnitySDK` folder. These brains (ex. **3DBallLearning.asset**) are loaded into each Agent object (ex. **Ball3DAgents**). A Brain doesn't store any information about an Agent, it just
returns the chosen action to the Agent. All Agents can share the same
Brain, but act independently. The Brain settings tell you quite a bit about how
an Agent works.
You can create new Brain assets by selecting `Assets ->
Create -> ML-Agents -> Brain`. There are 3 types of Brains.
The **Learning Brain** is a Brain that uses a trained neural network to make decisions.
When the `Control` box is checked in the Brains property under the **Broadcast Hub** in the Academy, the external process that is training the neural network will take over decision making for the agents
and ultimately generate a trained neural network. You can also use the
**Learning Brain** with a pre-trained model.
The **Heuristic** Brain allows you to hand-code the Agent's logic by extending
the Decision class. Finally, the **Player** Brain lets you map keyboard keys to Agent actions, which
can be useful when testing your agents and environment. You can also implement your own type of Brain.
In this tutorial, you will use the **Learning Brain** for training.
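For reference, if you later want to experiment with a **Heuristic** Brain, the hand-coded logic lives in a small `Decision` subclass along these lines. This is only a sketch: the class name is made up, and the exact `Decide`/`MakeMemory` signatures are assumptions based on the v0.6 `Decision` class, so verify them against the version of the toolkit you have installed.

```csharp
using System.Collections.Generic;
using UnityEngine;
using MLAgents;

// Hypothetical hand-coded policy for a Heuristic Brain (signatures assumed from v0.6).
public class ExampleHeuristicDecision : Decision
{
    public override float[] Decide(
        List<float> vectorObs, List<Texture2D> visualObs,
        float reward, bool done, List<float> memory)
    {
        // Trivial rule: choose an action based on the sign of the first observation.
        float first = vectorObs.Count > 0 ? vectorObs[0] : 0f;
        return new[] { first > 0f ? 1f : -1f };
    }

    public override List<float> MakeMemory(
        List<float> vectorObs, List<Texture2D> visualObs,
        float reward, bool done, List<float> memory)
    {
        // No recurrent memory is needed for a stateless rule.
        return new List<float>();
    }
}
```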
#### Vector Observation Space
To train the agents within the Ball Balance environment, we will be using the
Python package. We have provided a convenient command called `mlagents-learn`
which accepts arguments used to configure both training and inference phases.
We can use `run_id` to identify the experiment and create a folder where the
model and summary statistics are stored.
Once the training process completes and the model has been saved (denoted by the `Saved Model` message), you can add the model to the Unity project and use it with Agents having a **Learning Brain**.
__Note:__ Do not just close the Unity Window once the `Saved Model` message appears.
Either wait for the training process to close the window or press Ctrl+C at the
command-line prompt. If you close the window manually, the `.bytes` file
containing the trained model will not be exported into the ML-Agents folder.
### Setting up TensorFlowSharp
Because TensorFlowSharp support is still experimental, it is disabled by
default. Please note that the `Learning` Brain inference can only be used with
TensorFlowSharp.
To set up TensorFlowSharp support, follow the [Setting up ML-Agents Toolkit
within Unity](Basic-Guide.md#setting-up-ml-agents-within-unity) section of the Basic Guide page.
1. Create an environment for your agents to live in. An environment can range
from a simple physical simulation containing a few objects to an entire game
or ecosystem.
2. Implement an Academy subclass and place it on a GameObject in the Unity scene
   containing the environment. Your Academy class can implement a few optional
   methods to update the scene independently of any agents. For example, you can
   add, move, or delete agents and other entities in the environment.
3. Create one or more Brain assets by clicking **Assets** > **Create** >
**ML-Agents** > **Brain**, and naming them appropriately.
4. Implement your Agent subclasses. An Agent subclass defines the code an Agent
   uses to observe its environment, to carry out assigned actions, and to
   calculate the rewards used for reinforcement training. You can also implement
   optional methods to reset the Agent when it has finished or failed its task
   (a rough skeleton of such a subclass appears after this list).
5. Add your Agent subclasses to appropriate GameObjects, typically the object
   in the scene that represents the Agent in the simulation. Each Agent object
   must be assigned a Brain object.
6. If training, check the `Control` checkbox in the Broadcast Hub of the Academy and
   [run the training process](Training-ML-Agents.md).
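As a rough illustration of step 4, an Agent subclass typically overrides `CollectObservations()`, `AgentAction()`, and `AgentReset()`. The skeleton below is a hypothetical sketch, not code from the toolkit's samples; the `target` field, movement logic, and reward rule are placeholders you would replace with your own:

```csharp
using MLAgents;
using UnityEngine;

// Hypothetical Agent skeleton; the target field and reward logic are placeholders.
public class ExampleAgent : Agent
{
    public Transform target;   // e.g. the goal the agent should reach

    // Reset the Agent after it finishes or fails its task.
    public override void AgentReset()
    {
        transform.position = Vector3.zero;
    }

    // Build the vector observation the Brain will receive.
    public override void CollectObservations()
    {
        AddVectorObs(target.position);      // 3 floats
        AddVectorObs(transform.position);   // 3 floats
    }

    // Apply the Brain's action and assign rewards.
    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Interpret two continuous actions as movement on the x/z plane.
        transform.Translate(vectorAction[0] * Time.deltaTime, 0f, vectorAction[1] * Time.deltaTime);

        if (Vector3.Distance(transform.position, target.position) < 1.5f)
        {
            AddReward(1.0f);
            Done();
        }
    }
}
```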
**Note:** If you are unfamiliar with Unity, refer to
[Learning the interface](https://docs.unity3d.com/Manual/LearningtheInterface.html)
in the Unity Manual if you need help with this tutorial.
The first task to accomplish is simply creating a new Unity project and
importing the ML-Agents assets into it:
1. Launch the Unity Editor and create a new project named "RollerBall".
2. Make sure that the Scripting Runtime Version for the project is set to use
**.NET 4.x Equivalent** (This is an experimental option in Unity 2017,
but is the default as of 2018.3.)
3. In a file system window, navigate to the folder containing your cloned
ML-Agents repository.
4. Drag the `ML-Agents` and `Gizmos` folders from `UnitySDK/Assets` to the Unity
Editor Project window.
Your Unity **Project** window should contain the following assets:
Next, we will create a very simple scene to act as our ML-Agents environment.
The "physical" components of the environment include a Plane to act as the floor
for the Agent to move around on, a Cube to act as the goal or target for the
agent to seek, and a Sphere to represent the Agent itself.
### Create the Floor Plane
1. Right click in Hierarchy window, select 3D Object > Plane.
2. Name the GameObject "Floor"
3. Select the Floor Plane to view its properties in the Inspector window.
4. Set Transform to Position = (0,0,0), Rotation = (0,0,0), Scale = (1,1,1).
5. On the Plane's Mesh Renderer, expand the Materials property and change the
default-material to *LightGridFloorSquare* (or any suitable material of your choice).
(To set a new material, click the small circle icon next to the current material
name. This opens the **Object Picker** dialog so that you can choose a
different material from the list of all materials currently in the project.)
![The Floor in the Inspector window](images/mlagents-NewTutFloor.png)
1. Right click in Hierarchy window, select 3D Object > Cube.
2. Name the GameObject "Target"
3. Select the Target Cube to view its properties in the Inspector window.
4. Set Transform to Position = (3,0.5,3), Rotation = (0,0,0), Scale = (1,1,1).
5. On the Cube's Mesh Renderer, expand the Materials property and change the
default-material to *Block*.
![The Target Cube in the Inspector window](images/mlagents-NewTutBlock.png)
1. Right click in Hierarchy window, select 3D Object > Sphere.
2. Name the GameObject "RollerAgent"
3. Select the RollerAgent Sphere to view its properties in the Inspector window.
4. Set Transform to Position = (0,0.5,0), Rotation = (0,0,0), Scale = (1,1,1).
5. On the Sphere's Mesh Renderer, expand the Materials property and change the
default-material to *CheckerSquare*.
6. Click **Add Component**.
7. Add the Physics/Rigidbody component to the Sphere.
![The Agent GameObject in the Inspector window](images/mlagents-NewTutSphere.png)
Next, edit the new `RollerAcademy` script:
1. In the Unity Project window, double-click the `RollerAcademy` script to open
it in your code editor. (By default new scripts are placed directly in the
**Assets** folder.)
2. In the code editor, add the statement, `using MLAgents;`.
3. Change the base class from `MonoBehaviour` to `Academy`.
4. Delete the `Start()` and `Update()` methods that were added by default.
In such a basic scene, we don't need the Academy to initialize, reset, or
otherwise control any objects in the environment, so we have the simplest
possible Academy implementation.
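Following the edits above, the whole file reduces to an empty subclass, roughly:

```csharp
using MLAgents;

// No overrides are needed: the base Academy behaviour is enough for this scene.
public class RollerAcademy : Academy { }
```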
![The Academy properties](images/mlagents-NewTutAcademy.png)
## Add Brain Assets
The type of the Brain (Learning, Heuristic or Player) determines how the Brain makes decisions.
1. Go to **Assets** > **Create** > **ML-Agents** and select the type of Brain asset
you want to create. For this tutorial, create a **Learning Brain** and
a **Player Brain**.
![Creating a Brain Asset](images/mlagents-NewTutBrain.png)
| | **Element 0–N** | The mapping of keys to action values. |
| | **Key** | The key on the keyboard. |
| | **Branch Index** | The element of the Agent's action vector to set when this key is pressed. The index value cannot exceed the size of the Action Space (minus 1, since it is an array index). |
| | **Value** | The value to send to the Agent as its action when the mapped key is pressed. Cannot exceed the max value for the associated branch (minus 1, since it is an array index). Note that if no key is pressed for that branch, the default action will be 0. |
For more information about the Unity input system, see
[Input](https://docs.unity3d.com/ScriptReference/Input.html).
Consider the example of training a medic NPC. Instead of indirectly training a medic with the help
of a reward function, we can give the medic real world examples of observations
from the game and actions from a game controller to guide the medic's behavior.
Imitation Learning uses pairs of observations and actions from a demonstration to learn a policy.
3. Build the scene, assigning the agent a Learning Brain, and set the Brain to Control in the Broadcast Hub. For more information on Brains, see [here](Learning-Environment-Design-Brains.md).
4. Open the `config/offline_bc_config.yaml` file.
5. Modify the `demo_path` parameter in the file to reference the path to the demonstration file recorded in step 2. In our case this is: `./UnitySDK/Assets/Demonstrations/AgentRecording.demo`
6. Launch `mlagents-learn`, providing `./config/offline_bc_config.yaml` as the config parameter, and include the `--run-id` and `--train` options as usual. Provide your environment as the `--env` parameter if it has been compiled as a standalone, or omit it to train in the Editor.
7. (Optional) Observe training performance using TensorBoard.
This will use the demonstration file to train a neural network driven agent to directly imitate the actions provided in the demonstration. The environment will launch and be used for evaluating the agent's performance during training.
8. Watch as the agent(s) with the student Brain attached begin to behave
similarly to the demonstrations.
9. Once the Student Agents are exhibiting the desired behavior, end the training
process with `CTRL+C` from the command line.
10. Move the resulting `*.bytes` file into the `TFModels` subdirectory of the
Assets folder (or a subdirectory within Assets of your choosing), and use it
with a **Learning** Brain.
Your opinion matters a great deal to us. Only by hearing your thoughts on the Unity ML-Agents Toolkit can we continue to improve and grow. Please take a few minutes to let us know about it.
[Fill out the survey](https://goo.gl/forms/qFMYSYr5TlINvG6f1)