
Merge pull request #2179 from Unity-Technologies/release-v0.8.2

Merge from release 0.8.2 to develop
/develop-generalizationTraining-TrainerController
GitHub, 6 years ago
Current commit: dcef9f69
25 files changed: 5,978 insertions, 9,002 deletions
  1. .gitignore (1 changed line)
  2. UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallHardLearning.nn (1001 changed lines)
  3. UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallLearning.nn (985 changed lines)
  4. UnitySDK/Assets/ML-Agents/Examples/BananaCollectors/TFModels/BananaLearning.nn (646 changed lines)
  5. UnitySDK/Assets/ML-Agents/Examples/Basic/TFModels/BasicLearning.nn (21 changed lines)
  6. UnitySDK/Assets/ML-Agents/Examples/Bouncer/TFModels/BouncerLearning.nn (286 changed lines)
  7. UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerDynamicLearning.nn (1001 changed lines)
  8. UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerStaticLearning.nn (1001 changed lines)
  9. UnitySDK/Assets/ML-Agents/Examples/GridWorld/TFModels/GridWorldLearning.nn (1001 changed lines)
  10. UnitySDK/Assets/ML-Agents/Examples/Hallway/TFModels/HallwayLearning.nn (1001 changed lines)
  11. UnitySDK/Assets/ML-Agents/Examples/PushBlock/TFModels/PushBlockLearning.nn (1001 changed lines)
  12. UnitySDK/Assets/ML-Agents/Examples/Pyramids/TFModels/PyramidsLearning.nn (1001 changed lines)
  13. UnitySDK/Assets/ML-Agents/Examples/Reacher/TFModels/ReacherLearning.nn (1001 changed lines)
  14. UnitySDK/Assets/ML-Agents/Examples/Soccer/TFModels/GoalieLearning.nn (1001 changed lines)
  15. UnitySDK/Assets/ML-Agents/Examples/Soccer/TFModels/StrikerLearning.nn (1001 changed lines)
  16. UnitySDK/Assets/ML-Agents/Examples/Tennis/TFModels/TennisLearning.nn (1001 changed lines)
  17. UnitySDK/Assets/ML-Agents/Examples/WallJump/TFModels/BigWallJumpLearning.nn (1001 changed lines)
  18. UnitySDK/Assets/ML-Agents/Examples/WallJump/TFModels/SmallWallJumpLearning.nn (1001 changed lines)
  19. config/trainer_config.yaml (1 changed line)
  20. docs/Basic-Guide.md (6 changed lines)
  21. docs/Learning-Environment-Create-New.md (4 changed lines)
  22. gym-unity/setup.py (4 changed lines)
  23. ml-agents-envs/setup.py (2 changed lines)
  24. ml-agents/setup.py (11 changed lines)

.gitignore (1 changed line)


*.pyc
*.idea/misc.xml
*.idea/modules.xml
*.idea/
*.iml
*.cache
*/build/

UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallHardLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/3DBallLearning.nn (985 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/BananaCollectors/TFModels/BananaLearning.nn (646 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Basic/TFModels/BasicLearning.nn (21 changed lines) — binary neural-network model; the raw contents are not human-readable and are omitted here.

UnitySDK/Assets/ML-Agents/Examples/Bouncer/TFModels/BouncerLearning.nn (286 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerDynamicLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerStaticLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/GridWorld/TFModels/GridWorldLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Hallway/TFModels/HallwayLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/PushBlock/TFModels/PushBlockLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Pyramids/TFModels/PyramidsLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Reacher/TFModels/ReacherLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Soccer/TFModels/GoalieLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Soccer/TFModels/StrikerLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/Tennis/TFModels/TennisLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/WallJump/TFModels/BigWallJumpLearning.nn (1001 changed lines) — file diff too large to display.

UnitySDK/Assets/ML-Agents/Examples/WallJump/TFModels/SmallWallJumpLearning.nn (1001 changed lines) — file diff too large to display.

config/trainer_config.yaml (1 changed line)


lambd: 0.99
gamma: 0.995
beta: 0.001
use_curiosity: true
3DBallHardLearning:
normalize: true
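
For context, keys like `lambd`, `gamma`, `beta`, and `use_curiosity` in this fragment live inside per-trainer sections of `trainer_config.yaml`, where a brain-specific block such as `3DBallHardLearning:` is merged over the shared `default:` block. The sketch below is a hypothetical minimal layout for illustration only; it is not the line this PR changed, and the values are copied from the fragment above rather than verified defaults:

```yaml
default:              # settings inherited by every brain
    gamma: 0.995      # reward discount factor
    lambd: 0.99       # GAE lambda
    beta: 0.001       # entropy regularization strength
    use_curiosity: true

3DBallHardLearning:   # per-brain section, merged over default
    normalize: true   # normalize vector observations for this brain
```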

docs/Basic-Guide.md (6 changed lines)


   `UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/`.
2. Open the Unity Editor, and select the **3DBall** scene as described above.
3. Select the **3DBallLearning** Learning Brain from the Scene hierarchy.
-5. Drag the `<brain_name>.nn` file from the Project window of
+4. Drag the `<brain_name>.nn` file from the Project window of
-6. Select Ball3DAcademy in the scene and toggle off Control, each platform's brain now regains control.
-7. Press the :arrow_forward: button at the top of the Editor.
+5. Select Ball3DAcademy in the scene and toggle off Control, each platform's brain now regains control.
+6. Press the :arrow_forward: button at the top of the Editor.
## Next Steps

docs/Learning-Environment-Create-New.md (4 changed lines)


## Training the Environment
-Now you can train the Agent. To get ready for training, you must first to change
-the `Brain` of the agent to be the Learning Brain `RollerBallBrain`.
+Now you can train the Agent. To get ready for training, you must first drag the
+`RollerBallBrain` asset to the **RollerAgent** GameObject `Brain` field to change to the learning brain.
Then, select the Academy GameObject and check the `Control` checkbox for
the RollerBallBrain item in the **Broadcast Hub** list. From there, the process is
the same as described in [Training ML-Agents](Training-ML-Agents.md). Note that the

gym-unity/setup.py (4 changed lines)


setup(
name="gym_unity",
-version="0.4.1",
+version="0.4.2",
description="Unity Machine Learning Agents Gym Interface",
license="Apache License 2.0",
author="Unity Technologies",

-install_requires=["gym", "mlagents_envs==0.8.1"],
+install_requires=["gym", "mlagents_envs==0.8.2"],
)

ml-agents-envs/setup.py (2 changed lines)


setup(
name="mlagents_envs",
-version="0.8.1",
+version="0.8.2",
description="Unity Machine Learning Agents Interface",
url="https://github.com/Unity-Technologies/ml-agents",
author="Unity Technologies",

ml-agents/setup.py (11 changed lines)


-from setuptools import setup, find_packages
+from setuptools import setup, find_namespace_packages
from os import path
from io import open

setup(
name="mlagents",
-version="0.8.1",
+version="0.8.2",
description="Unity Machine Learning Agents",
long_description=long_description,
long_description_content_type="text/markdown",

"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3.6",
],
-packages=["mlagents.trainers"], # Required
+# find_namespace_packages will recurse through the directories and find all the packages
+packages=find_namespace_packages(
+    exclude=["*.tests", "*.tests.*", "tests.*", "tests"]
+),
-"mlagents_envs==0.8.1",
+"mlagents_envs==0.8.2",
"tensorflow>=1.7,<1.8",
"Pillow>=4.2.1",
"matplotlib",
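
The most substantive change in this file is the switch from an explicit `packages` list to `find_namespace_packages`. The sketch below shows why that matters, using a throwaway temporary directory (the layout is illustrative, not the real repo): a PEP 420 namespace package has no top-level `__init__.py`, so `find_packages` skips the whole tree, while `find_namespace_packages` still discovers it.

```python
# Why swap find_packages for find_namespace_packages:
# find_packages only yields (and only recurses into) directories that
# contain an __init__.py, so a PEP 420 namespace package is invisible to it.
# The layout below is a throwaway illustration, not the actual repo.
import os
import tempfile

from setuptools import find_namespace_packages, find_packages

with tempfile.TemporaryDirectory() as root:
    # "mlagents" is a namespace package: no __init__.py at the top level.
    trainers = os.path.join(root, "mlagents", "trainers")
    os.makedirs(trainers)
    open(os.path.join(trainers, "__init__.py"), "w").close()

    print(find_packages(where=root))            # [] -- namespace tree skipped
    print(find_namespace_packages(where=root))  # ['mlagents', 'mlagents.trainers']
```

The `exclude=["*.tests", ...]` argument in the diff filters test packages out of the discovered set, the same way `find_packages` would.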

Some files were not shown because too many files changed in this diff.
