Merge pull request #4109 from Unity-Technologies/release_3_merge_master

Release 3 merge to master
/MLA-1734-demo-provider
GitHub, 5 years ago
Current commit
fefbc038
27 files changed, with 6436 insertions and 12115 deletions
  1. 495
      Project/Assets/ML-Agents/Examples/3DBall/TFModels/3DBall.nn
  2. 586
      Project/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallHard.nn
  3. 13
      Project/Assets/ML-Agents/Examples/Basic/TFModels/Basic.nn
  4. 149
      Project/Assets/ML-Agents/Examples/Bouncer/TFModels/Bouncer.nn
  5. 1001
      Project/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerDynamic.nn
  6. 1001
      Project/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerStatic.nn
  7. 682
      Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.nn
  8. 1001
      Project/Assets/ML-Agents/Examples/GridWorld/TFModels/GridWorld.nn
  9. 999
      Project/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.nn
  10. 1001
      Project/Assets/ML-Agents/Examples/PushBlock/TFModels/PushBlock.nn
  11. 1001
      Project/Assets/ML-Agents/Examples/Pyramids/TFModels/Pyramids.nn
  12. 564
      Project/Assets/ML-Agents/Examples/Reacher/TFModels/Reacher.nn
  13. 1001
      Project/Assets/ML-Agents/Examples/Soccer/TFModels/Goalie.nn
  14. 1001
      Project/Assets/ML-Agents/Examples/Soccer/TFModels/SoccerTwos.nn
  15. 1001
      Project/Assets/ML-Agents/Examples/Soccer/TFModels/Striker.nn
  16. 1001
      Project/Assets/ML-Agents/Examples/Tennis/TFModels/Tennis.nn
  17. 1001
      Project/Assets/ML-Agents/Examples/Walker/TFModels/WalkerDynamic.nn
  18. 1001
      Project/Assets/ML-Agents/Examples/Walker/TFModels/WalkerStatic.nn
  19. 1001
      Project/Assets/ML-Agents/Examples/WallJump/TFModels/BigWallJump.nn
  20. 1001
      Project/Assets/ML-Agents/Examples/WallJump/TFModels/SmallWallJump.nn
  21. 1001
      Project/Assets/ML-Agents/Examples/Worm/TFModels/WormDynamic.nn
  22. 1001
      Project/Assets/ML-Agents/Examples/Worm/TFModels/WormStatic.nn
  23. 5
      README.md
  24. 10
      com.unity.ml-agents/CHANGELOG.md
  25. 2
      ml-agents/mlagents/trainers/settings.py
  26. 23
      ml-agents/mlagents/trainers/tests/test_meta_curriculum.py
  27. 8
      utils/make_readme_table.py

495
Project/Assets/ML-Agents/Examples/3DBall/TFModels/3DBall.nn
File diff is too large to display.

586
Project/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallHard.nn
File diff is too large to display.

13
Project/Assets/ML-Agents/Examples/Basic/TFModels/Basic.nn


(Binary model file; contents not shown.)

149
Project/Assets/ML-Agents/Examples/Bouncer/TFModels/Bouncer.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerDynamic.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerStatic.nn
File diff is too large to display.

682
Project/Assets/ML-Agents/Examples/FoodCollector/TFModels/FoodCollector.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/GridWorld/TFModels/GridWorld.nn
File diff is too large to display.

999
Project/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/PushBlock/TFModels/PushBlock.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Pyramids/TFModels/Pyramids.nn
File diff is too large to display.

564
Project/Assets/ML-Agents/Examples/Reacher/TFModels/Reacher.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Soccer/TFModels/Goalie.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Soccer/TFModels/SoccerTwos.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Soccer/TFModels/Striker.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Tennis/TFModels/Tennis.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/WalkerDynamic.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Walker/TFModels/WalkerStatic.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/WallJump/TFModels/BigWallJump.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/WallJump/TFModels/SmallWallJump.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Worm/TFModels/WormDynamic.nn
File diff is too large to display.

1001
Project/Assets/ML-Agents/Examples/Worm/TFModels/WormStatic.nn
File diff is too large to display.

5
README.md


 | **Version** | **Release Date** | **Source** | **Documentation** | **Download** |
 |:-------:|:------:|:-------------:|:-------:|:------------:|
 | **master (unstable)** | -- | [source](https://github.com/Unity-Technologies/ml-agents/tree/master) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/master/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/master.zip) |
-| **Release 2** | **May 20, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/release_2)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/release_2/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/release_2.zip)** |
-| **Release 1** | April 30, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/release_1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/release_1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/release_1.zip) |
+| **Release 3** | **June 10, 2020** | **[source](https://github.com/Unity-Technologies/ml-agents/tree/release_3)** | **[docs](https://github.com/Unity-Technologies/ml-agents/tree/release_3_docs/docs/Readme.md)** | **[download](https://github.com/Unity-Technologies/ml-agents/archive/release_3.zip)** |
+| **Release 2** | May 20, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/release_2) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/release_2_docs/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/release_2.zip) |
+| **Release 1** | April 30, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/release_1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/release_1_docs/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/release_1.zip) |
 | **0.15.1** | March 30, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.1.zip) |
 | **0.15.0** | March 18, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.15.0/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.15.0.zip) |
 | **0.14.1** | February 26, 2020 | [source](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1) | [docs](https://github.com/Unity-Technologies/ml-agents/tree/0.14.1/docs/Readme.md) | [download](https://github.com/Unity-Technologies/ml-agents/archive/0.14.1.zip) |

10
com.unity.ml-agents/CHANGELOG.md


 - `max_step` in the `TerminalStep` and `TerminalSteps` objects was renamed `interrupted`.
 - `beta` and `epsilon` in `PPO` are no longer decayed by default but follow the same schedule as learning rate. (#3940)
 - `get_behavior_names()` and `get_behavior_spec()` on UnityEnvironment were replaced by the `behavior_specs` property. (#3946)
-- The first version of the Unity Environment Registry (Experimental) has been released. More information [here](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Unity-Environment-Registry.md)(#3967)
+- The first version of the Unity Environment Registry (Experimental) has been released. More information [here](https://github.com/Unity-Technologies/ml-agents/blob/release_3_docs/docs/Unity-Environment-Registry.md)(#3967)
 - `use_visual` and `allow_multiple_visual_obs` in the `UnityToGymWrapper` constructor
 were replaced by `allow_multiple_obs` which allows one or more visual observations and
 vector observations to be used simultaneously. (#3981) Thank you @shakenes !

 - The format for trainer configuration has changed, and the "default" behavior has been deprecated.
-See the [Migration Guide](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Migrating.md) for more details. (#3936)
+See the [Migration Guide](https://github.com/Unity-Technologies/ml-agents/blob/release_3_docs/docs/Migrating.md) for more details. (#3936)
 - Training artifacts (trained models, summaries) are now found in the `results/`
 directory. (#3829)
 - When using Curriculum, the current lesson will resume if training is quit and resumed. As such,

 - Introduced the `SideChannelManager` to register, unregister and access side
 channels. (#3807)
 - `Academy.FloatProperties` was replaced by `Academy.EnvironmentParameters`.
-See the [Migration Guide](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Migrating.md)
+See the [Migration Guide](https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Migrating.md)
 for more details on upgrading. (#3807)
 - `SideChannel.OnMessageReceived` is now a protected method (was public)
 - SideChannel IncomingMessages methods now take an optional default argument,

 `--load`. (#3705)
 - The Jupyter notebooks have been removed from the repository. (#3704)
 - The multi-agent gym option was removed from the gym wrapper. For multi-agent
-scenarios, use the [Low Level Python API](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Python-API.md). (#3681)
+scenarios, use the [Low Level Python API](https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Python-API.md). (#3681)
-[Low Level Python API](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Python-API.md)
+[Low Level Python API](https://github.com/Unity-Technologies/ml-agents/blob/release_1_docs/docs/Python-API.md)
 documentation for more information. If you use `mlagents-learn` for training, this should be a
 transparent change. (#3681)
 - Added ability to start training (initialize model weights) from a previous run

2
ml-agents/mlagents/trainers/settings.py


         REWARD: str = "reward"
     measure: str = attr.ib(default=MeasureType.REWARD)
-    thresholds: List[int] = attr.ib(factory=list)
+    thresholds: List[float] = attr.ib(factory=list)
     min_lesson_length: int = 0
     signal_smoothing: bool = True
     parameters: Dict[str, List[float]] = attr.ib(kw_only=True)
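
The switch from `List[int]` to `List[float]` matters if the configuration loader coerces each list element to its annotated type. The sketch below illustrates the effect with a hypothetical `coerce_list` helper. It is not the cattr implementation, only an illustration that assumes element-wise coercion by calling the annotated type.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def coerce_list(values: list, element_type: Callable[..., T]) -> List[T]:
    """Hypothetical helper: coerce every element by calling the annotated type."""
    return [element_type(v) for v in values]


thresholds = [0.1, 0.3, 0.5]

# With an int annotation, each fractional threshold is truncated toward zero.
as_int = coerce_list(thresholds, int)

# With a float annotation, the thresholds survive unchanged.
as_float = coerce_list(thresholds, float)

print(as_int)    # [0, 0, 0]
print(as_float)  # [0.1, 0.3, 0.5]
```

Under that assumption, a curriculum with fractional thresholds would see every threshold collapse to zero, which is the kind of regression the float annotation avoids.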

23
ml-agents/mlagents/trainers/tests/test_meta_curriculum.py


 import pytest
 from unittest.mock import patch, Mock, call
+import yaml
+import cattr
 from mlagents.trainers.meta_curriculum import MetaCurriculum

 @pytest.fixture
 def reward_buff_sizes():
     return {"Brain1": 7, "Brain2": 8}
+def test_convert_from_dict():
+    config = yaml.safe_load(
+        """
+        measure: progress
+        thresholds: [0.1, 0.3, 0.5]
+        min_lesson_length: 100
+        signal_smoothing: true
+        parameters:
+            param1: [0.0, 4.0, 6.0, 8.0]
+        """
+    )
+    should_be_config = CurriculumSettings(
+        thresholds=[0.1, 0.3, 0.5],
+        min_lesson_length=100,
+        signal_smoothing=True,
+        measure=CurriculumSettings.MeasureType.PROGRESS,
+        parameters={"param1": [0.0, 4.0, 6.0, 8.0]},
+    )
+    assert cattr.structure(config, CurriculumSettings) == should_be_config
 def test_curriculum_config(param_name="test_param1", min_lesson_length=100):

8
utils/make_readme_table.py


 def table_line(display_name, name, date, bold=False):
     bold_str = "**" if bold else ""
-    return f"| **{display_name}** | {bold_str}{date}{bold_str} | {bold_str}[source](https://github.com/Unity-Technologies/ml-agents/tree/{name}){bold_str} | {bold_str}[docs](https://github.com/Unity-Technologies/ml-agents/tree/{name}/docs/Readme.md){bold_str} | {bold_str}[download](https://github.com/Unity-Technologies/ml-agents/archive/{name}.zip){bold_str} |" # noqa
+    # For release_X branches, docs are on a separate tag.
+    if name.startswith("release"):
+        docs_name = name + "_docs"
+    else:
+        docs_name = name
+    return f"| **{display_name}** | {bold_str}{date}{bold_str} | {bold_str}[source](https://github.com/Unity-Technologies/ml-agents/tree/{name}){bold_str} | {bold_str}[docs](https://github.com/Unity-Technologies/ml-agents/tree/{docs_name}/docs/Readme.md){bold_str} | {bold_str}[download](https://github.com/Unity-Technologies/ml-agents/archive/{name}.zip){bold_str} |" # noqa
 class ReleaseInfo(NamedTuple):

     ReleaseInfo.from_simple_tag("0.15.1", "March 30, 2020"),
     ReleaseInfo("release_1", "1.0.0", "0.16.0", "April 30, 2020"),
     ReleaseInfo("release_2", "1.0.2", "0.16.1", "May 20, 2020"),
+    ReleaseInfo("release_3", "1.1.0", "0.17.0", "June 10, 2020"),
 ]
 MAX_DAYS = 150 # do not print releases older than this many days
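
As a rough check of the `docs_name` rule in `table_line`, here is a minimal standalone sketch. It repeats the logic shown in the diff, with the long f-string split across lines for readability. The `base` variable and the two sample calls are illustrative additions, not part of the script.

```python
def table_line(display_name: str, name: str, date: str, bold: bool = False) -> str:
    bold_str = "**" if bold else ""
    # For release_X branches, docs live on a separate "<name>_docs" tag.
    docs_name = name + "_docs" if name.startswith("release") else name
    base = "https://github.com/Unity-Technologies/ml-agents"  # illustrative shorthand
    return (
        f"| **{display_name}** "
        f"| {bold_str}{date}{bold_str} "
        f"| {bold_str}[source]({base}/tree/{name}){bold_str} "
        f"| {bold_str}[docs]({base}/tree/{docs_name}/docs/Readme.md){bold_str} "
        f"| {bold_str}[download]({base}/archive/{name}.zip){bold_str} |"
    )


# A release branch points its docs link at the "_docs" tag.
release_row = table_line("Release 3", "release_3", "June 10, 2020", bold=True)

# A plain version tag keeps the same name for source and docs.
legacy_row = table_line("0.15.1", "0.15.1", "March 30, 2020")

print(release_row)
print(legacy_row)
```

Only the docs link changes for `release_*` names. The source and download links keep the branch name, which matches the updated README rows in this commit.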
