Merge pull request #1083 from Unity-Technologies/develop-flat-code-restructure

ML-Agents Code Restructure

GitHub · 6 years ago · current commit 3900ed66

170 changed files, with 1720 insertions and 1506 deletions
 49  .gitignore
 10  Dockerfile
 25  docs/API-Reference.md
 15  docs/Background-Jupyter.md
146  docs/Basic-Guide.md
113  docs/FAQ.md
  2  docs/Feature-Memory.md
380  docs/Getting-Started-with-Balance-Ball.md
 68  docs/Glossary.md
  2  docs/Installation-Windows.md
 57  docs/Installation.md
 12  docs/Learning-Environment-Create-New.md
  2  docs/Learning-Environment-Design-External-Internal-Brains.md
  2  docs/Learning-Environment-Examples.md
 28  docs/Learning-Environment-Executable.md
685  docs/ML-Agents-Overview.md
138  docs/Python-API.md
 81  docs/Readme.md
 23  docs/Training-Curriculum-Learning.md
  4  docs/Training-Imitation-Learning.md
 34  docs/Training-ML-Agents.md
  2  docs/Training-on-Amazon-Web-Service.md
  6  docs/Training-on-Microsoft-Azure.md
115  docs/Using-Docker.md
  6  docs/Using-Tensorboard.md
  8  docs/dox-ml-agents.conf
 16  gym-unity/Readme.md
  2  gym-unity/gym_unity/envs/unity_env.py
  2  gym-unity/setup.py
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternalGrpc.cs.meta
 10  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternalGrpc.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternal.cs.meta
 19  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternal.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlOutput.cs.meta
 27  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlOutput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInput.cs.meta
 41  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationOutput.cs.meta
 31  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationOutput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationInput.cs.meta
 14  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationInput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityOutput.cs.meta
 29  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityOutput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityMessage.cs.meta
 32  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityMessage.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityInput.cs.meta
 27  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityInput.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/SpaceTypeProto.cs.meta
 17  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/SpaceTypeProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/ResolutionProto.cs.meta
 15  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/ResolutionProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/Header.cs.meta
 14  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/Header.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/EnvironmentParametersProto.cs.meta
 21  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/EnvironmentParametersProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/EngineConfigurationProto.cs.meta
 19  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/EngineConfigurationProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/CommandProto.cs.meta
 14  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/CommandProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainTypeProto.cs.meta
 18  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainTypeProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainParametersProto.cs.meta
 36  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainParametersProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/AgentInfoProto.cs.meta
 24  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/AgentInfoProto.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/AgentActionProto.cs.meta
 16  MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects/AgentActionProto.cs
  0  MLAgentsSDK/Assets/ML-Agents/Scripts/ActionMasker.cs
  2  MLAgentsSDK/Assets/ML-Agents/Scripts/Academy.cs
  4  ml-agents/tests/trainers/test_curriculum.py
 36  ml-agents/tests/trainers/test_trainer_controller.py
  5  ml-agents/tests/mock_communicator.py
 10  ml-agents/mlagents/envs/communicator_objects/unity_to_external_pb2_grpc.py
 18  ml-agents/mlagents/envs/communicator_objects/unity_to_external_pb2.py
 54  ml-agents/mlagents/envs/communicator_objects/unity_rl_output_pb2.py
 66  ml-agents/mlagents/envs/communicator_objects/unity_rl_input_pb2.py
 39  ml-agents/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.py
 21  ml-agents/mlagents/envs/communicator_objects/unity_rl_initialization_input_pb2.py
 33  ml-agents/mlagents/envs/communicator_objects/unity_output_pb2.py
 39  ml-agents/mlagents/envs/communicator_objects/unity_message_pb2.py
 33  ml-agents/mlagents/envs/communicator_objects/unity_input_pb2.py
 25  ml-agents/mlagents/envs/communicator_objects/space_type_proto_pb2.py
 25  ml-agents/mlagents/envs/communicator_objects/resolution_proto_pb2.py
 23  ml-agents/mlagents/envs/communicator_objects/header_pb2.py
 36  ml-agents/mlagents/envs/communicator_objects/environment_parameters_proto_pb2.py
 31  ml-agents/mlagents/envs/communicator_objects/engine_configuration_proto_pb2.py
 23  ml-agents/mlagents/envs/communicator_objects/command_proto_pb2.py
 29  ml-agents/mlagents/envs/communicator_objects/brain_type_proto_pb2.py
 49  ml-agents/mlagents/envs/communicator_objects/brain_parameters_proto_pb2.py
 41  ml-agents/mlagents/envs/communicator_objects/agent_info_proto_pb2.py
 27  ml-agents/mlagents/envs/communicator_objects/agent_action_proto_pb2.py
  4  ml-agents/mlagents/envs/socket_communicator.py
  6  ml-agents/mlagents/envs/rpc_communicator.py
  4  ml-agents/mlagents/envs/exception.py
  4  ml-agents/mlagents/envs/environment.py
  4  ml-agents/mlagents/envs/communicator.py
  0  ml-agents/mlagents/envs/brain.py
  3  ml-agents/mlagents/envs/__init__.py
 21  notebooks/getting-started.ipynb
 16  ml-agents/mlagents/trainers/trainer_controller.py
# API Reference

Our developer-facing C# classes (Academy, Agent, Decision and Monitor) have been
documented to be compatible with
[Doxygen](http://www.stack.nl/~dimitri/doxygen/) for auto-generating HTML
documentation.

To generate the API reference,
[download Doxygen](http://www.stack.nl/~dimitri/doxygen/download.html) and run
the following command within the `docs/` directory:
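(The command below is reconstructed from the `docs/dox-ml-agents.conf`
configuration file included in this changeset.)

```shell
doxygen dox-ml-agents.conf
```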
`dox-ml-agents.conf` is a Doxygen configuration file for the ML-Agents toolkit
that includes the classes that have been properly formatted. The generated HTML
files will be placed in the `html/` subdirectory. Open `index.html` within that
subdirectory to navigate to the API reference home. Note that `html/` is already
included in the repository's `.gitignore` file.

In the near future, we aim to expand our documentation to include all the Unity
C# classes and Python API.
# Background: Jupyter

[Jupyter](https://jupyter.org) is a fantastic tool for writing code with
embedded visualizations. We provide one such notebook,
`notebooks/getting-started.ipynb`, for testing the Python control interface to a
Unity build. This notebook is introduced in the [Getting Started with the 3D
Balance Ball Environment](Getting-Started-with-Balance-Ball.md) tutorial. For a
walkthrough of how to use Jupyter, see Running the Jupyter Notebook in the
_Jupyter/IPython Quick Start Guide_. To launch Jupyter, run in the command line:
```shell
jupyter notebook
```
Then navigate to `localhost:8888` to access your notebooks.
# Frequently Asked Questions

## Scripting Runtime Environment not setup correctly

If you haven't switched your scripting runtime version from .NET 3.5 to .NET 4.6
or .NET 4.x, you will see an error message. This is because .NET 3.5 doesn't
support the `Clear()` method for `StringBuilder`; refer to [Setting Up The
ML-Agents Toolkit Within Unity](Installation.md#setting-up-ml-agent-within-unity)
for the solution.
## TensorFlowSharp flag not turned on

If you have already imported the TensorFlowSharp plugin, but haven't set the
ENABLE_TENSORFLOW flag for your scripting define symbols, you will see the
following error message:

```
You need to install and enable the TensorFlowSharp plugin in order to use the internal brain.
```

This error message occurs because the TensorFlowSharp plugin won't be usable
without the ENABLE_TENSORFLOW flag. Refer to [Setting Up The ML-Agents Toolkit
Within Unity](Installation.md#setting-up-ml-agent-within-unity) for the solution.
## Tensorflow epsilon placeholder error

If you have a graph placeholder set in the internal Brain inspector that is not
present in the TensorFlow graph, you will see an error like this:

```
UnityAgentsException: One of the Tensorflow placeholder could not be found. In brain <some_brain_name>, there are no FloatingPoint placeholder named <some_placeholder_name>.
```

Solution: Go to all of your Brain objects, find `Graph placeholders`, and change
its `size` to 0 to remove the `epsilon` placeholder.

Similarly, if you have a graph scope set in the internal Brain inspector that is
not correctly set, you will see a similar error.

Solution: Make sure your Graph Scope field matches the corresponding Brain
object name in your Hierarchy window when there are multiple Brains.
## Environment Permission Error

If you directly import your Unity environment without building it in the
editor, you might need to give it additional permissions to execute it. On
macOS, run:

```shell
chmod -R 755 *.app
```

Or on Linux:

```shell
chmod -R 755 *.x86_64
```

On Windows, you can find
## Environment Connection Timeout

If you are able to launch the environment from `UnityEnvironment` but then
receive a timeout error, there may be a number of possible causes.

* _Cause_: There may be no Brains in your environment which are set to
  `External`. In this case, the environment will not attempt to communicate
  with Python. _Solution_: Set the Brain(s) you wish to externally control
  through the Python API to `External` from the Unity Editor, and rebuild the
  environment.
* _Cause_: On OSX, the firewall may be preventing communication with the
  environment. _Solution_: Add the built environment binary to the list of
  exceptions on the firewall by following
  [instructions](https://support.apple.com/en-us/HT201642).
* _Cause_: An error happened in the Unity Environment preventing communication.
  _Solution_: Look into the
  [log files](https://docs.unity3d.com/Manual/LogFiles.html) generated by the
  Unity Environment to figure out what error happened.
## Communication port {} still in use

If you receive an exception `"Couldn't launch new environment because
communication port {} is still in use. "`, you can change the worker number in
the Python script when calling

```python
UnityEnvironment(file_name=filename, worker_id=X)
```
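For instance, a minimal sketch, assuming a built `3DBall` binary and the
`mlagents.envs` import path used elsewhere in these docs, that avoids the port
collision by giving each concurrent environment its own worker id:

```python
from mlagents.envs import UnityEnvironment

# Each worker_id maps to a distinct communication port, so two
# environments running at the same time must not share an id.
env_a = UnityEnvironment(file_name="3DBall", worker_id=0)
env_b = UnityEnvironment(file_name="3DBall", worker_id=1)

env_a.close()
env_b.close()
```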
## Mean reward : nan

If you receive a message `Mean reward : nan` when attempting to train a model
using PPO, this is due to the episodes of the learning environment not
terminating. In order to address this, set `Max Steps` for either the Academy or
Agents within the Scene Inspector to a value greater than 0. Alternatively, it
is possible to manually set `done` conditions for episodes from within scripts
for custom episode-terminating events.
# ML-Agents Toolkit Glossary

* **Academy** - Unity Component which controls timing, reset, and
  training/inference settings of the environment.
* **Action** - The carrying-out of a decision on the part of an agent within the
  environment.
* **Agent** - Unity Component which produces observations and takes actions in
  the environment. An Agent's actions are determined by decisions produced by a
  linked Brain.
* **Brain** - Unity Component which makes decisions for the agents linked to it.
* **Decision** - The specification produced by a Brain for an action to be
  carried out given an observation.
* **Editor** - The Unity Editor, which may include any pane (e.g. Hierarchy,
  Scene, Inspector).
* **Environment** - The Unity scene which contains Agents, Academy, and Brains.
* **FixedUpdate** - Unity method called each time the game engine is stepped.
  ML-Agents logic should be placed here.
* **Frame** - An instance of rendering the main camera for the display.
  Corresponds to each `Update` call of the game engine.
* **Observation** - Partial information describing the state of the environment
  available to a given agent. (e.g. Vector, Visual, Text)
* **Policy** - Function for producing decisions from observations.
* **Reward** - Signal provided at every step used to indicate desirability of an
  agent’s action within the current state of the environment.
* **State** - The underlying properties of the environment (including all agents
  within it) at a given time.
* **Step** - Corresponds to each `FixedUpdate` call of the game engine. Is the
  smallest atomic change to the state possible.
* **Update** - Unity function called each time a frame is rendered. ML-Agents
  logic should not be placed here.
* **External Coordinator** - ML-Agents class responsible for communication with
  outside processes (in this case, the Python API).
* **Trainer** - Python class which is responsible for training a given external
  Brain. Contains a TensorFlow graph which makes decisions for the external
  Brain.
# Python API

The ML-Agents toolkit provides a Python API for controlling the agent simulation
loop of an environment or game built with Unity. This API is used by the
ML-Agents training algorithms (run with `mlagents-learn`), but you can also
write your own Python programs using this API.
- **UnityEnvironment** — the main interface between the Unity application and
  your code. Use UnityEnvironment to start and control a simulation or training
  session.
- **BrainInfo** — contains all the data from agents in the simulation, such as
  observations and rewards.
- **BrainParameters** — describes the data elements in a BrainInfo object. For
  example, provides the array length of an observation in BrainInfo.

These classes are all defined in the `ml-agents/mlagents/envs` folder of the
ML-Agents SDK.
To communicate with an agent in a Unity environment from a Python program, the
agent must either use an **External** brain or use a brain that is broadcasting
(has its **Broadcast** property set to true). Your code is expected to return
actions for agents with external brains, but can only observe broadcasting
brains (the information you receive for an agent is the same in both cases). See
[Using the Broadcast
Feature](Learning-Environment-Design-Brains.md#using-the-broadcast-feature).
For a simple example of using the Python API to interact with a Unity
environment, see the Basic [Jupyter](Background-Jupyter.md) notebook
(`notebooks/getting-started.ipynb`), which opens an environment, runs a few
simulation steps taking random actions, and closes the environment.
_Notice: Currently communication between Unity and Python takes place over an
open socket without authentication. As such, please make sure that the network
where training takes place is secure. This will be addressed in a future
release._
Python-side communication happens through `UnityEnvironment` which is located in
`ml-agents/mlagents/envs`. To load a Unity environment from a built binary file,
put the file in the same directory as `envs`. For example, if the filename of
your Unity environment is 3DBall.app, in python, run:

```python
from mlagents.envs import UnityEnvironment

# Constructor arguments are described below.
env = UnityEnvironment(file_name="3DBall", worker_id=0, seed=1)
```
- `file_name` is the name of the environment binary (located in the root
  directory of the python project).
- `worker_id` indicates which port to use for communication with the
  environment. For use in parallel training regimes such as A3C.
- `seed` indicates the seed to use when generating random numbers during the
  training process. In environments which do not involve physics calculations,
  setting the seed enables reproducible experimentation by ensuring that the
  environment and trainers utilize the same random seed.
If you want to directly interact with the Editor, you need to use
`file_name=None`, then press the :arrow_forward: button in the Editor when the
message _"Start training by pressing the Play button in the Unity Editor"_ is
displayed on the screen.
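For instance, a minimal sketch of connecting to the Editor rather than a binary:

```python
from mlagents.envs import UnityEnvironment

# With file_name=None, no binary is launched; press Play in the Unity
# Editor when the start-training message appears.
env = UnityEnvironment(file_name=None)
```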
A **BrainInfo** object contains the following fields:

- **`visual_observations`** : A list of 4 dimensional numpy arrays. Matrix n of
  the list corresponds to the n<sup>th</sup> observation of the brain.
- **`vector_observations`** : A two dimensional numpy array of dimension `(batch
  size, vector observation size)`.
- **`text_observations`** : A list of strings corresponding to the agents' text
  observations.
- **`memories`** : A two dimensional numpy array of dimension `(batch size,
  memory size)` which corresponds to the memories sent at the previous step.
- **`rewards`** : A list as long as the number of agents using the brain
  containing the rewards they each obtained at the previous step.
- **`local_done`** : A list as long as the number of agents using the brain
  containing `done` flags (whether or not the agent is done).
- **`max_reached`** : A list as long as the number of agents using the brain
  containing true if the agents reached their max steps.
- **`agents`** : A list of the unique ids of the agents using the brain.
- **`previous_actions`** : A two dimensional numpy array of dimension `(batch
  size, vector action size)` if the vector action space is continuous and
  `(batch size, number of branches)` if the vector action space is discrete.
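As an illustrative sketch of reading these fields (the brain name
`Ball3DBrain` is hypothetical; substitute your own):

```python
from mlagents.envs import UnityEnvironment

env = UnityEnvironment(file_name="3DBall")
info = env.reset(train_mode=True)["Ball3DBrain"]  # a BrainInfo object

# Each field is indexed per agent using this brain.
for agent_id, reward, done in zip(info.agents, info.rewards, info.local_done):
    print(agent_id, reward, done)

env.close()
```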
Once loaded, your UnityEnvironment object, which is referenced by a variable
named `env` in this example, can be used in the following way:

- **Print : `print(str(env))`**
  Prints all parameters relevant to the loaded environment and the external
  brains.
- **Reset : `env.reset(train_mode=True, config=None)`**
  Sends a reset signal to the environment and provides a dictionary mapping
  brain names to BrainInfo objects.
  - `train_mode` indicates whether to run the environment in train (`True`) or
    test (`False`) mode.
  - `config` is an optional dictionary of configuration flags specific to the
    environment. For generic environments, `config` can be ignored. `config` is
    a dictionary of strings to floats where the keys are the names of the
    `resetParameters` and the values are their corresponding float values.
    Define the reset parameters on the [Academy
    Inspector](Learning-Environment-Design-Academy.md#academy-properties) window
    in the Unity Editor.
- **Step : `env.step(action, memory=None, text_action=None)`**
  Sends a step signal to the environment using the actions. For each brain:
  - `action` can be one dimensional arrays or two dimensional arrays if you have
    multiple agents per brain.
  - `memory` is an optional input that can be used to send a list of floats per
    agent to be retrieved at the next step.
  - `text_action` is an optional input that can be used to send a single string
    per agent.
  For example, to access the BrainInfo belonging to a brain called
  'brain_name', and the BrainInfo field 'vector_observations':

  ```python
  info = env.step()
  brain_info = info['brain_name']
  observations = brain_info.vector_observations
  ```
  Note that if you have more than one external brain in the environment, you
  must provide dictionaries from brain names to arrays for `action`, `memory`
  and `value`. For example: If you have two external brains named `brain1` and
  `brain2` each with one agent taking two continuous actions, then you can
  have:
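  A minimal sketch of such an action dictionary (the values are illustrative):

  ```python
  action = {'brain1': [1.0, 2.0], 'brain2': [3.0, 4.0]}
  env.step(action)
  ```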
  Returns a dictionary mapping brain names to BrainInfo objects.

- **Close : `env.close()`**
  Sends a shutdown signal to the environment and closes the communication
  socket.
# Unity ML-Agents Toolkit Documentation

## Installation & Set-up

* [Installation](Installation.md)
* [Background: Jupyter Notebooks](Background-Jupyter.md)
* [Docker Set-up](Using-Docker.md)
* [Basic Guide](Basic-Guide.md)

* [ML-Agents Toolkit Overview](ML-Agents-Overview.md)
* [Background: Unity](Background-Unity.md)
* [Background: Machine Learning](Background-Machine-Learning.md)
* [Background: TensorFlow](Background-TensorFlow.md)
* [Getting Started with the 3D Balance Ball Environment](Getting-Started-with-Balance-Ball.md)
* [Example Environments](Learning-Environment-Examples.md)

* [Making a New Learning Environment](Learning-Environment-Create-New.md)
* [Designing a Learning Environment](Learning-Environment-Design.md)
* [Agents](Learning-Environment-Design-Agents.md)
* [Academy](Learning-Environment-Design-Academy.md)
* [Brains](Learning-Environment-Design-Brains.md):
  [Player](Learning-Environment-Design-Player-Brains.md),
  [Heuristic](Learning-Environment-Design-Heuristic-Brains.md),
  [Internal & External](Learning-Environment-Design-External-Internal-Brains.md)
* [Learning Environment Best Practices](Learning-Environment-Best-Practices.md)
* [Using the Monitor](Feature-Monitor.md)
* [Using an Executable Environment](Learning-Environment-Executable.md)
* [TensorFlowSharp in Unity (Experimental)](Using-TensorFlow-Sharp-in-Unity.md)

* [Training ML-Agents](Training-ML-Agents.md)
* [Training with Proximal Policy Optimization](Training-PPO.md)
* [Training with Curriculum Learning](Training-Curriculum-Learning.md)
* [Training with Imitation Learning](Training-Imitation-Learning.md)
* [Training with LSTM](Feature-Memory.md)
* [Training on the Cloud with Amazon Web Services](Training-on-Amazon-Web-Service.md)
* [Training on the Cloud with Microsoft Azure](Training-on-Microsoft-Azure.md)
* [Using TensorBoard to Observe Training](Using-Tensorboard.md)

* [Migrating from earlier versions of ML-Agents](Migrating.md)
* [Frequently Asked Questions](FAQ.md)
* [ML-Agents Glossary](Glossary.md)
* [Limitations](Limitations.md)

* [API Reference](API-Reference.md)
* [How to use the Python API](Python-API.md)
* [Wrapping Learning Environment as a Gym](../gym-unity/Readme.md)
# Using Docker For ML-Agents

We currently offer a solution for Windows and Mac users who would like to do
training or inference using Docker. This option may be appealing to those who
would like to avoid installing Python and TensorFlow themselves. The current
setup forces both TensorFlow and Unity to _only_ rely on the CPU for
computations. Consequently, our Docker simulation does not use a GPU and uses
[`Xvfb`](https://en.wikipedia.org/wiki/Xvfb) to do visual rendering. `Xvfb` is a
utility that enables `ML-Agents` (or any other application) to do rendering
virtually, i.e. it does not assume that the machine running `ML-Agents` has a
GPU or a display attached to it. This means that rich environments which involve
agents using camera-based visual observations might be slower.
## Requirements

- [Download](https://unity3d.com/get-unity/download) the Unity Installer and add
  the _Linux Build Support_ Component.
- [Download](https://www.docker.com/community-edition#/download) and install
  Docker if you don't have it setup on your machine.
- Since Docker runs a container in an environment that is isolated from the host
  machine, a mounted directory in your host machine is used to share data, e.g.
  the Unity executable, curriculum files and TensorFlow graph. For convenience,
  we created an empty `unity-volume` directory at the root of the repository for
  this purpose, but feel free to use any other directory. The remainder of this
  guide assumes that the `unity-volume` directory is the one used.
Using Docker for ML-Agents involves three steps: building the Unity environment
with specific flags, building a Docker container and, finally, running the
container. If you are not familiar with building a Unity environment for
ML-Agents, please read through our [Getting Started with the 3D Balance Ball
Example](Getting-Started-with-Balance-Ball.md) guide first.
Since Docker typically runs a container sharing a (linux) kernel with the host
machine, the Unity environment **has** to be built for the **linux platform**.
When building a Unity environment, please select the following options from the
Build Settings window:
- Set the _Target Platform_ to `Linux`.
- Set the _Architecture_ to `x86_64`.
- If the environment does not contain visual observations, you can select the
  `headless` option here.
Then click `Build`, pick an environment name (e.g. `3DBall`) and set the output
directory to `unity-volume`. After building, ensure that the file
`<environment-name>.x86_64` and subdirectory `<environment-name>_Data/` are
created under `unity-volume`.
First, make sure the Docker engine is running on your machine. Then build the
Docker container by calling the following command at the top-level of the
repository:

```shell
docker build -t <image-name> .
```
Replace `<image-name>` with a name for the Docker image, e.g.
`balance.ball.v0.1`.

**Note:** if you modify hyperparameters in `trainer_config.yaml`, you will have
to build a new Docker container before running.
Run the Docker container by calling the following command at the top-level of
the repository:

```shell
docker run --name <container-name> \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           <image-name>:latest \
           <trainer-config-path> \
           --env=<environment-name> \
           --docker-target-name=unity-volume \
           --train \
           --run-id=<run-id>
```
- `<container-name>` is used to identify the container (in case you want to
  interrupt and terminate it). This is optional and Docker will generate a
  random name if this is not set. _Note that this must be unique for every run
  of a Docker image._
- `<environment-name>` __(Optional)__: If you are training with a linux
  executable, this is the name of the executable. If you are training in the
  Editor, do not pass a `<environment-name>` argument and press the
  :arrow_forward: button in Unity when the message _"Start training by pressing
  the Play button in the Unity Editor"_ is displayed on the screen.
- `source`: Reference to the path in your host OS where you will store the Unity
  executable.
- `target`: Tells Docker to mount the `source` path as a disk with this name.
- `docker-target-name`: Tells the ML-Agents Python package the name of the disk
  where it can read the Unity executable and store the graph. **This should
  therefore be identical to `target`.**
- `trainer-config-path`, `train`, `run-id`: ML-Agents arguments passed to
  `mlagents-learn`. `trainer-config-path` is the filepath of the trainer config
  file, `train` trains the algorithm, and `run-id` is used to tag each
  experiment with a unique identifier. We recommend placing the trainer-config
  file inside `unity-volume` so that the container has access to the file.
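For example, an illustrative invocation using the example names from this guide
(the container name and run-id are placeholders, and `trainer_config.yaml` is
assumed to have been copied into `unity-volume`):

```shell
docker run --name balance.ball.first.trial \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           balance.ball.v0.1:latest \
           unity-volume/trainer_config.yaml \
           --env=3DBall \
           --docker-target-name=unity-volume \
           --train \
           --run-id=first-trial
```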
For more detail on Docker mounts, check out
[these](https://docs.docker.com/storage/bind-mounts/) docs from Docker.
If you are satisfied with the training progress, you can stop the Docker
container while saving state by either using `Ctrl+C` or `⌘+C` (Mac) or by using
the following command:

```shell
docker kill --signal=SIGINT <container-name>
```
`<container-name>` is the name of the container specified in the earlier
`docker run` command. If you didn't specify one, you can find the randomly
generated identifier by running `docker container ls`.
ml-agents/mlagents/envs/__init__.py:

```python
from .environment import *
from .brain import *
from .exception import *
```
Some files were not shown because too many files changed in this diff.