GitHub
6 years ago
Current commit
25495874
313 files changed, with 6,778 insertions and 3,623 deletions
.gitignore | 58
CODE_OF_CONDUCT.md | 5
CONTRIBUTING.md | 60
Dockerfile | 11
LICENSE | 201
README.md | 106
docs/API-Reference.md | 29
docs/Background-Jupyter.md | 15
docs/Background-Machine-Learning.md | 301
docs/Background-TensorFlow.md | 74
docs/Background-Unity.md | 12
docs/Basic-Guide.md | 242
docs/FAQ.md | 136
docs/Feature-Memory.md | 57
docs/Feature-Monitor.md | 50
docs/Getting-Started-with-Balance-Ball.md | 393
docs/Glossary.md | 68
docs/Installation-Windows.md | 251
docs/Installation.md | 90
docs/Learning-Environment-Best-Practices.md | 64
docs/Learning-Environment-Create-New.md | 363
docs/Learning-Environment-Design-Academy.md | 55
docs/Learning-Environment-Design-Agents.md | 469
docs/Learning-Environment-Design-Brains.md | 106
docs/Learning-Environment-Design-External-Internal-Brains.md | 118
docs/Learning-Environment-Design-Heuristic-Brains.md | 34
docs/Learning-Environment-Design-Player-Brains.md | 47
docs/Learning-Environment-Design.md | 203
docs/Learning-Environment-Examples.md | 381
docs/Learning-Environment-Executable.md | 219
docs/Limitations.md | 27
docs/ML-Agents-Overview.md | 703
docs/Migrating.md | 135
docs/Python-API.md | 160
docs/Readme.md | 80
docs/Training-Curriculum-Learning.md | 143
docs/Training-Imitation-Learning.md | 78
docs/Training-ML-Agents.md | 228
docs/Training-PPO.md | 218
docs/Training-on-Amazon-Web-Service.md | 118
docs/Training-on-Microsoft-Azure-Custom-Instance.md | 112
docs/Training-on-Microsoft-Azure.md | 107
docs/Using-Docker.md | 117
docs/Using-TensorFlow-Sharp-in-Unity.md | 179
docs/Using-Tensorboard.md | 79
docs/dox-ml-agents.conf | 8
docs/images/banner.png | 611
docs/images/player_brain.png | 129
docs/images/scene-hierarchy.png | 79
docs/images/unity-logo-rgb.png | 309
docs/localized/zh-CN/README.md | 5
docs/localized/zh-CN/docs/Getting-Started-with-Balance-Ball.md | 26
docs/localized/zh-CN/docs/Installation.md | 4
docs/localized/zh-CN/docs/Learning-Environment-Create-New.md | 4
docs/localized/zh-CN/docs/Learning-Environment-Design.md | 14
docs/localized/zh-CN/docs/Learning-Environment-Examples.md | 42
docs/localized/zh-CN/docs/ML-Agents-Overview.md | 2
notebooks/getting-started.ipynb | 42
ml-agents/mlagents/envs/communicator_objects/unity_to_external_pb2_grpc.py | 10
ml-agents/mlagents/envs/communicator_objects/unity_to_external_pb2.py | 18
ml-agents/mlagents/envs/communicator_objects/unity_rl_output_pb2.py | 54
ml-agents/mlagents/envs/communicator_objects/unity_rl_input_pb2.py | 66
ml-agents/mlagents/envs/communicator_objects/unity_rl_initialization_output_pb2.py | 39
ml-agents/mlagents/envs/communicator_objects/unity_rl_initialization_input_pb2.py | 21
ml-agents/mlagents/envs/communicator_objects/unity_output_pb2.py | 40
ml-agents/mlagents/envs/communicator_objects/unity_message_pb2.py | 39
ml-agents/mlagents/envs/communicator_objects/unity_input_pb2.py | 40
ml-agents/mlagents/envs/communicator_objects/space_type_proto_pb2.py | 25
ml-agents/mlagents/envs/communicator_objects/resolution_proto_pb2.py | 25
ml-agents/mlagents/envs/communicator_objects/header_pb2.py | 23
ml-agents/mlagents/envs/communicator_objects/environment_parameters_proto_pb2.py | 36
ml-agents/mlagents/envs/communicator_objects/engine_configuration_proto_pb2.py | 31
ml-agents/mlagents/envs/communicator_objects/command_proto_pb2.py | 23
ml-agents/mlagents/envs/communicator_objects/brain_type_proto_pb2.py | 29
ml-agents/mlagents/envs/communicator_objects/brain_parameters_proto_pb2.py | 69
ml-agents/mlagents/envs/communicator_objects/agent_info_proto_pb2.py | 46
ml-agents/mlagents/envs/communicator_objects/agent_action_proto_pb2.py | 32
config/curricula/wall-jump/BigWallBrain.json | 5
config/curricula/test/TestBrain.json | 2
ml-agents/tests/mock_communicator.py | 28
config/trainer_config.yaml | 63
ml-agents/mlagents/envs/socket_communicator.py | 4
ml-agents/mlagents/envs/rpc_communicator.py | 6
ml-agents/mlagents/envs/exception.py | 4
ml-agents/mlagents/envs/environment.py | 98
ml-agents/mlagents/envs/communicator.py | 7
ml-agents/mlagents/envs/brain.py | 32
ml-agents/mlagents/envs/__init__.py | 1
ml-agents/mlagents/trainers/curriculum.py | 148
ml-agents/mlagents/trainers/trainer_controller.py | 348
ml-agents/mlagents/trainers/trainer.py | 54
ml-agents/mlagents/trainers/ppo/trainer.py | 315
ml-agents/mlagents/trainers/ppo/models.py | 100
ml-agents/mlagents/trainers/models.py | 275
ml-agents/mlagents/trainers/buffer.py | 15
ml-agents/mlagents/trainers/bc/trainer.py | 161
ml-agents/mlagents/trainers/bc/models.py | 83
ml-agents/mlagents/trainers/bc/__init__.py | 1
ml-agents/mlagents/trainers/__init__.py | 6
ml-agents/requirements.txt | 2
# Contribution Guidelines

Thank you for your interest in contributing to the ML-Agents toolkit! We are
incredibly excited to see how members of our community will use and extend the
ML-Agents toolkit. To facilitate your contributions, we've outlined a brief set
of guidelines to ensure that your extensions can be easily integrated.

## Communication

First, please read through our [code of conduct](CODE_OF_CONDUCT.md), as we
expect all our contributors to follow it.

Second, before starting on a project that you intend to contribute to the
ML-Agents toolkit (whether environments or modifications to the codebase), we
**strongly** recommend posting on our
[Issues page](https://github.com/Unity-Technologies/ml-agents/issues)
and briefly outlining the changes you plan to make. This will enable us to
provide some context that may be helpful for you. This could range from advice
and feedback on how to best implement your changes to reasons for not making
them.

## Git Branches

Starting with v0.3, we adopted the
Consequently, the `master` branch corresponds to the latest release of

* Corresponding changes to documentation, unit tests and sample environments (if
  applicable)

## Environments

We are also actively open to adding community-contributed environments as
examples, as long as they are small, simple, demonstrate a unique feature of
the platform, and provide a unique non-trivial challenge to modern
PR explaining the nature of the environment and task.

## Style Guide

When performing changes to the codebase, ensure that you follow the style guide
of the file you're modifying. For Python, we follow
[PEP 8](https://www.python.org/dev/peps/pep-0008/).
For C#, we will soon be adding a formal style guide for our repository.
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and
distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright
owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities
that control, are controlled by, or are under common control with that entity.
For the purposes of this definition, "control" means (i) the power, direct or
indirect, to cause the direction or management of such entity, whether by
contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising
permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including
but not limited to software source code, documentation source, and
configuration files.

"Object" form shall mean any form resulting from mechanical transformation or
translation of a Source form, including but not limited to compiled object
code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form,
made available under the License, as indicated by a copyright notice that is
included in or attached to the work (an example is provided in the Appendix
below).

"Derivative Works" shall mean any work, whether in Source or Object form, that
is based on (or derived from) the Work and for which the editorial revisions,
annotations, elaborations, or other modifications represent, as a whole, an
original work of authorship. For the purposes of this License, Derivative Works
shall not include works that remain separable from, or merely link (or bind by
name) to the interfaces of, the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including the original
version of the Work and any modifications or additions to that Work or
Derivative Works thereof, that is intentionally submitted to Licensor for
inclusion in the Work by the copyright owner or by an individual or Legal
Entity authorized to submit on behalf of the copyright owner. For the purposes
of this definition, "submitted" means any form of electronic, verbal, or
written communication sent to the Licensor or its representatives, including
but not limited to communication on electronic mailing lists, source code
control systems, and issue tracking systems that are managed by, or on behalf
of, the Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise designated in
writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity on behalf
of whom a Contribution has been received by Licensor and subsequently
incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of this
License, each Contributor hereby grants to You a perpetual, worldwide,
non-exclusive, no-charge, royalty-free, irrevocable copyright license to
reproduce, prepare Derivative Works of, publicly display, publicly perform,
sublicense, and distribute the Work and such Derivative Works in Source or
Object form.

3. Grant of Patent License. Subject to the terms and conditions of this
License, each Contributor hereby grants to You a perpetual, worldwide,
non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this
section) patent license to make, have made, use, offer to sell, sell, import,
and otherwise transfer the Work, where such license applies only to those
patent claims licensable by such Contributor that are necessarily infringed by
their Contribution(s) alone or by combination of their Contribution(s) with the
Work to which such Contribution(s) was submitted. If You institute patent
litigation against any entity (including a cross-claim or counterclaim in a
lawsuit) alleging that the Work or a Contribution incorporated within the Work
constitutes direct or contributory patent infringement, then any patent
licenses granted to You under this License for that Work shall terminate as of
the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the Work or
Derivative Works thereof in any medium, with or without modifications, and in
Source or Object form, provided that You meet the following conditions:

(a) You must give any other recipients of the Work or Derivative Works a copy
of this License; and

(b) You must cause any modified files to carry prominent notices stating that
You changed the files; and

(c) You must retain, in the Source form of any Derivative Works that You
distribute, all copyright, patent, trademark, and attribution notices from the
Source form of the Work, excluding those notices that do not pertain to any
part of the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its distribution, then
any Derivative Works that You distribute must include a readable copy of the
attribution notices contained within such NOTICE file, excluding those notices
that do not pertain to any part of the Derivative Works, in at least one of the
following places: within a NOTICE text file distributed as part of the
Derivative Works; within the Source form or documentation, if provided along
with the Derivative Works; or, within a display generated by the Derivative
Works, if and wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and do not modify the
License. You may add Your own attribution notices within Derivative Works that
You distribute, alongside or as an addendum to the NOTICE text from the Work,
provided that such additional attribution notices cannot be construed as
modifying the License.

You may add Your own copyright statement to Your modifications and may provide
additional or different license terms and conditions for use, reproduction, or
distribution of Your modifications, or for any such Derivative Works as a
whole, provided Your use, reproduction, and distribution of the Work otherwise
complies with the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise, any
Contribution intentionally submitted for inclusion in the Work by You to the
Licensor shall be under the terms and conditions of this License, without any
additional terms or conditions. Notwithstanding the above, nothing herein shall
supersede or modify the terms of any separate license agreement you may have
executed with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade names,
trademarks, service marks, or product names of the Licensor, except as required
for reasonable and customary use in describing the origin of the Work and
reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or agreed to in
writing, Licensor provides the Work (and each Contributor provides its
Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied, including, without limitation, any warranties
or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any risks
associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory, whether in
tort (including negligence), contract, or otherwise, unless required by
applicable law (such as deliberate and grossly negligent acts) or agreed to in
writing, shall any Contributor be liable to You for damages, including any
direct, indirect, special, incidental, or consequential damages of any
character arising as a result of this License or out of the use or inability to
use the Work (including but not limited to damages for loss of goodwill, work
stoppage, computer failure or malfunction, or any and all other commercial
damages or losses), even if such Contributor has been advised of the
possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing the Work or
Derivative Works thereof, You may choose to offer, and charge a fee for,
acceptance of support, warranty, indemnity, or other liability obligations
and/or rights consistent with this License. However, in accepting such
obligations, You may act only on Your own behalf and on Your sole
responsibility, not on behalf of any other Contributor, and only if You agree
to indemnify, defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason of your
accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following boilerplate
notice, with the fields enclosed by brackets "{}" replaced with your own
identifying information. (Don't include the brackets!) The text should be
enclosed in the appropriate comment syntax for the file format. We also
recommend that a file or class name and description of purpose be included on
the same "printed page" as the copyright notice for easier identification
within third-party archives.

Copyright 2017 Unity Technologies

Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of the
License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
# API Reference

Our developer-facing C# classes (Academy, Agent, Decision and Monitor) have been
documented to be compatible with
[Doxygen](http://www.stack.nl/~dimitri/doxygen/) for auto-generating HTML

To generate the API reference,
[download Doxygen](http://www.stack.nl/~dimitri/doxygen/download.html)
and run the following command within the `docs/` directory:

```sh
doxygen dox-ml-agents.conf
```

that includes the classes that have been properly formatted. The generated HTML
files will be placed in the `html/` subdirectory. Open `index.html` within that
subdirectory to navigate to the API reference home. Note that `html/` is already
included in the repository's `.gitignore` file.

In the near future, we aim to expand our documentation to include all the Unity
C# classes and Python API.
# Background: Jupyter

[Jupyter](https://jupyter.org) is a fantastic tool for writing code with
embedded visualizations. We provide one such notebook,
`notebooks/getting-started.ipynb`, for testing the Python control interface to a
Unity build. This notebook is introduced in the

in the _Jupyter/IPython Quick Start Guide_. To launch Jupyter, run in the
command line:

```sh
jupyter notebook
```

Then navigate to `localhost:8888` to access your notebooks.
# Frequently Asked Questions

## Scripting Runtime Environment not setup correctly

If you haven't switched your scripting runtime version from .NET 3.5 to .NET 4.6
or .NET 4.x, you will see an error message; this is because .NET 3.5 doesn't
support the `Clear()` method for `StringBuilder`. Refer to
[Setting Up The ML-Agents Toolkit Within Unity](Installation.md#setting-up-ml-agent-within-unity)
for a solution.

## TensorFlowSharp flag not turned on

If you have already imported the TensorFlowSharp plugin, but haven't set the
ENABLE_TENSORFLOW flag for your scripting define symbols, you will see the
following error message:

```console
You need to install and enable the TensorFlowSharp plugin in order to use the Internal Brain.
```

This error message occurs because the TensorFlowSharp plugin won't be usable
without the ENABLE_TENSORFLOW flag; refer to
[Setting Up The ML-Agents Toolkit Within Unity](Installation.md#setting-up-ml-agent-within-unity)
for a solution.
## Instance of CoreBrainInternal couldn't be created

If you try to use ML-Agents in Unity versions 2017.1 - 2017.3, you might
encounter an error that looks like this:

```console
Instance of CoreBrainInternal couldn't be created. The the script
class needs to derive from ScriptableObject.
UnityEngine.ScriptableObject:CreateInstance(String)
```

You can fix the error by removing `CoreBrain` from CoreBrainInternal.cs:16,
clicking on your Brain GameObject to let the scene recompile all the changed
C# scripts, then adding the `CoreBrain` back. Make sure your Brain is in
Internal mode, your TensorFlowSharp plugin is imported and the
ENABLE_TENSORFLOW flag is set. This fix is only valid locally and unstable.

## Tensorflow epsilon placeholder error

If you have a graph placeholder set in the Internal Brain inspector that is not
present in the TensorFlow graph, you will see an error like this:

```console
UnityAgentsException: One of the TensorFlow placeholder could not be found. In brain <some_brain_name>, there are no FloatingPoint placeholder named <some_placeholder_name>.
```

Solution: Go to all of your Brain objects, find `Graph placeholders` and change
its `size` to 0 to remove the `epsilon` placeholder.

Similarly, if you have a graph scope set in the Internal Brain inspector that is
not correctly set, you will see a similar error. Solution: Make sure your Graph
Scope field matches the corresponding Brain object name in your Hierarchy
Inspector when there are multiple Brains.
## Environment Permission Error

If you directly import your Unity environment without building it in the
editor, you might need to give it additional permissions to execute it. On
macOS, run:

```sh
chmod -R 755 *.app
```

On Linux:

```sh
chmod -R 755 *.x86_64
```

On Windows, you can find
## Environment Connection Timeout

If you are able to launch the environment from `UnityEnvironment` but then
receive a timeout error, there may be a number of possible causes.

* _Cause_: There may be no Brains in your environment which are set to
  `External`. In this case, the environment will not attempt to communicate
  with Python. _Solution_: Set the Brain(s) you wish to externally control
  through the Python API to `External` from the Unity Editor, and rebuild the
  environment.
* _Cause_: On OSX, the firewall may be preventing communication with the
  environment. _Solution_: Add the built environment binary to the list of
  exceptions on the firewall by following
  [instructions](https://support.apple.com/en-us/HT201642).
* _Cause_: An error happened in the Unity Environment preventing communication.
  _Solution_: Look into the
  [log files](https://docs.unity3d.com/Manual/LogFiles.html) generated by the
  Unity Environment to figure out what error happened.
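When diagnosing a timeout, it can help to retry the launch a couple of times before working through the causes above, since a slow first start can look identical to a real failure. A minimal, hypothetical helper (the `launch_with_retry` name and retry policy are our own, not part of the toolkit):

```python
def launch_with_retry(factory, attempts=3):
    """Call a zero-argument environment factory, retrying on failure.

    Re-raises the last exception if every attempt fails, so the real
    cause (e.g. a timeout exception from mlagents.envs) still surfaces.
    """
    last_err = None
    for _ in range(attempts):
        try:
            return factory()
        except Exception as err:  # noqa: BLE001 - surface the last error below
            last_err = err
    raise last_err

# Usage (hypothetical; UnityEnvironment/file_name come from your own script):
# env = launch_with_retry(lambda: UnityEnvironment(file_name=filename))
```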
## Communication port {} still in use

If you receive an exception `"Couldn't launch new environment because
communication port {} is still in use. "`, you can change the worker number in
the Python script when calling:

```python
UnityEnvironment(file_name=filename, worker_id=X)
```
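If you would rather probe for a free port than pick `worker_id` by hand, a small sketch like the following can help. It assumes the toolkit's convention of deriving the communication port from a fixed base port plus the worker number; the base port of 5005 used here is an assumption, so verify it against your version's `UnityEnvironment` defaults:

```python
import socket

BASE_PORT = 5005  # assumed default base port; check your UnityEnvironment version

def find_free_worker_id(max_workers=10):
    """Return the first worker_id whose derived port can still be bound."""
    for worker_id in range(max_workers):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("localhost", BASE_PORT + worker_id))
            except OSError:
                continue  # port already in use; try the next worker_id
            return worker_id
    raise RuntimeError("no free communication port found")

# Usage (hypothetical):
# env = UnityEnvironment(file_name=filename, worker_id=find_free_worker_id())
```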
## Mean reward : nan

If you receive a message `Mean reward : nan` when attempting to train a model
using PPO, this is due to the episodes of the Learning Environment not
terminating. In order to address this, set `Max Steps` for either the Academy or
Agents within the Scene Inspector to a value greater than 0. Alternatively, it
is possible to manually set `done` conditions for episodes from within scripts
for custom episode-terminating events.
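The `nan` is simply the average of an empty set: if no episode ever reaches a `done` state, there are no completed-episode rewards to average. A tiny illustrative sketch of that behavior (our own function, not the trainer's actual code):

```python
def mean_episode_reward(completed_rewards):
    """Average rewards over *completed* episodes.

    An empty input yields nan, which is what the trainer prints when
    no episode ever terminates.
    """
    if not completed_rewards:
        return float("nan")
    return sum(completed_rewards) / len(completed_rewards)

print(mean_episode_reward([]))          # nan
print(mean_episode_reward([1.0, 3.0]))  # 2.0
```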
|||
# Getting Started with the 3D Balance Ball Environment |
|||
|
|||
This tutorial walks through the end-to-end process of opening a ML-Agents toolkit |
|||
example environment in Unity, building the Unity executable, training an agent |
|||
in it, and finally embedding the trained model into the Unity environment. |
|||
This tutorial walks through the end-to-end process of opening a ML-Agents |
|||
toolkit example environment in Unity, building the Unity executable, training an |
|||
Agent in it, and finally embedding the trained model into the Unity environment. |
|||
The ML-Agents toolkit includes a number of [example
environments](Learning-Environment-Examples.md) which you can examine to help
understand the different ways in which the ML-Agents toolkit can be used. These
environments can also serve as templates for new environments or as ways to test
new ML algorithms. After reading this tutorial, you should be able to explore
and build the example environments.

This walk-through uses the **3D Balance Ball** environment. 3D Balance Ball
contains a number of platforms and balls (which are all copies of each other).
Each platform tries to keep its ball from falling by rotating either
horizontally or vertically. In this environment, a platform is an **Agent** that
receives a reward for every step that it balances the ball. An agent is also
penalized with a negative reward for dropping the ball. The goal of the training
process is to have the platforms learn to never drop the ball.

To install and set up the ML-Agents toolkit, the Python dependencies, and
Unity, see the [installation instructions](Installation.md).

An agent is an autonomous actor that observes and interacts with an
_environment_. In the context of Unity, an environment is a scene containing an
Academy and one or more Brain and Agent objects, and, of course, the other
entities that an agent interacts with.

**Note:** In Unity, the base object of everything in a scene is the
_GameObject_. The GameObject is essentially a container for everything else,
including behaviors, graphics, physics, etc. To see the components that make up
a GameObject, select the GameObject in the Scene window, and open the Inspector
window. The Inspector shows every component on a GameObject.

The first thing you may notice after opening the 3D Balance Ball scene is that
it contains not one, but several platforms. Each platform in the scene is an
independent agent, but they all share the same Brain. 3D Balance Ball does this

The Academy object for the scene is placed on the Ball3DAcademy GameObject. When
you look at an Academy component in the inspector, you can see several
properties that control how the environment works. For example, the **Training**
and **Inference Configuration** properties set the graphics and timescale
properties for the Unity application. The Academy uses the **Training
Configuration** during training and the **Inference Configuration** when not
training. (*Inference* means that the Agent is using a trained model or
heuristics or direct control — in other words, whenever **not** training.)
Typically, you set low graphics quality and a high time scale for the **Training
Configuration** and a high graphics quality and the timescale to `1.0` for the
**Inference Configuration**.

**Note:** If you want to observe the environment during training, you can adjust
the **Inference Configuration** settings to use a larger window and a timescale
closer to 1:1. Be sure to set these parameters back when training in earnest;
otherwise, training can take a very long time.

Another aspect of an environment to look at is the Academy implementation. Since
the base Academy class is abstract, you must always define a subclass. There are
three functions you can implement, though they are all optional:

* Academy.AcademyStep() — Called at every simulation step before
  Agent.AgentAction() (and after the Agents collect their observations).
* Academy.AcademyReset() — Called when the Academy starts or restarts the
  simulation (including the first time).

The 3D Balance Ball environment does not use these functions — each Agent resets
itself when needed — but many environments do use these functions to control the
environment around the Agents.

The Ball3DBrain GameObject in the scene, which contains a Brain component, is a
child of the Academy object. (All Brain objects in a scene must be children of
the Academy.) All the Agents in the 3D Balance Ball environment use the same
Brain instance. A Brain doesn't store any information about an Agent, it just
routes the Agent's collected observations to the decision making process and
returns the chosen action to the Agent. Thus, all Agents can share the same
Brain, but act independently. The Brain settings tell you quite a bit about how
an Agent works.

The **Brain Type** determines how an Agent makes its decisions. The **External**
and **Internal** types work together — use **External** when training your
Agents; use **Internal** when using the trained model. The **Heuristic** Brain
allows you to hand-code the Agent's logic by extending the Decision class.
Finally, the **Player** Brain lets you map keyboard commands to actions, which
can be useful when testing your Agents and environment. If none of these types
of Brains do what you need, you can implement your own CoreBrain to create your
own type.

In this tutorial, you will set the **Brain Type** to **External** for training;

#### Vector Observation Space

Before making a decision, an agent collects its observation about its state in
the world. The vector observation is a vector of floating point numbers which
contains relevant information for the agent to make decisions.

The Brain instance used in the 3D Balance Ball example uses the **Continuous**
vector observation space with a **State Size** of 8. This means that the feature
vector containing the Agent's observations contains eight elements: the `x` and
`z` components of the platform's rotation and the `x`, `y`, and `z` components
of the ball's relative position and velocity. (The observation values are
defined in the Agent's `CollectObservations()` function.)

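The layout of that eight-element feature vector can be sketched in plain Python. The variable names below are illustrative only — the real values are gathered in the Agent's `CollectObservations()` method in C#:

```python
# Illustrative sketch of the 8-element observation vector described above.
# Field names and values are hypothetical, not the ML-Agents API.
platform_rotation = {"x": 0.1, "z": -0.05}               # 2 elements
ball_relative_position = {"x": 0.0, "y": 1.2, "z": 0.3}  # 3 elements
ball_velocity = {"x": 0.0, "y": -0.5, "z": 0.0}          # 3 elements

observation = [
    platform_rotation["x"], platform_rotation["z"],
    ball_relative_position["x"],
    ball_relative_position["y"],
    ball_relative_position["z"],
    ball_velocity["x"], ball_velocity["y"], ball_velocity["z"],
]

# The length matches the Brain's State Size of 8.
assert len(observation) == 8
```
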
#### Vector Action Space

An Agent is given instructions from the Brain in the form of *actions*. The
ML-Agents toolkit classifies actions into two types: the **Continuous** vector
action space is a vector of numbers that can vary continuously. What each
element of the vector means is defined by the Agent logic (the PPO training
process just learns what values are better given particular state observations
based on the rewards received when it tries different values). For example, an
element might represent a force or torque applied to a `Rigidbody` in the Agent.
The **Discrete** action vector space defines its actions as tables. An action
given to the Agent is an array of indices into tables.

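The difference between the two action spaces can be sketched in plain Python. The table entries and values below are made up for illustration — they are not what 3D Balance Ball actually uses:

```python
# Continuous action space: a vector of floats whose meaning is defined by the
# Agent logic, e.g. torques to apply around the platform's x and z axes.
continuous_action = [0.25, -0.6]

# Discrete action space: each action is an index into a predefined table.
# (Hypothetical table; the real meanings come from the Agent implementation.)
action_table = [
    "rotate_x_positive",
    "rotate_x_negative",
    "rotate_z_positive",
    "rotate_z_negative",
]
discrete_action = 2                      # an index into the table
chosen = action_table[discrete_action]   # "rotate_z_positive"
```
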
space. You can try training with both settings to observe whether there is a
difference. (Set the `Vector Action Space Size` to 4 when using the discrete

The Agent is the actor that observes and takes actions in the environment. In
the 3D Balance Ball environment, the Agent components are placed on the twelve
Platform GameObjects. The base Agent object has a few properties that affect its
behavior:

* **Brain** — Every Agent must have a Brain. The Brain determines how an Agent
  makes decisions. All the Agents in the 3D Balance Ball scene share the same
  Brain.
* **Visual Observations** — Defines any Camera objects used by the Agent to
  observe its environment. 3D Balance Ball does not use camera observations.
* **Max Step** — Defines how many simulation steps can occur before the Agent
  decides it is done. In 3D Balance Ball, an Agent restarts after 5000 steps.
* **Reset On Done** — Defines whether an Agent starts over when it is finished.
  3D Balance Ball sets this true so that the Agent restarts after reaching the
  **Max Step** count or after dropping the ball.

Perhaps the more interesting aspect of an Agent is the Agent subclass
implementation. When you create an Agent, you must extend the base Agent class.

* Agent.AgentReset() — Called when the Agent resets, including at the beginning
  of a session. The Ball3DAgent class uses the reset function to reset the