
Merge branch 'develop' into hotfix-0.4b

/hotfix-v0.9.2a
GitHub · 7 years ago
Current commit
4e73f770
64 changed files, with 899 additions and 1,109 deletions
  1. docs/Getting-Started-with-Balance-Ball.md (7)
  2. docs/Installation.md (6)
  3. docs/Learning-Environment-Design-Agents.md (27)
  4. docs/Learning-Environment-Design-Brains.md (3)
  5. docs/Learning-Environment-Examples.md (26)
  6. docs/ML-Agents-Overview.md (6)
  7. docs/Python-API.md (2)
  8. docs/Training-Curriculum-Learning.md (2)
  9. python/communicator_objects/agent_action_proto_pb2.py (13)
  10. python/communicator_objects/agent_info_proto_pb2.py (27)
  11. python/communicator_objects/brain_parameters_proto_pb2.py (43)
  12. python/communicator_objects/brain_type_proto_pb2.py (15)
  13. python/communicator_objects/command_proto_pb2.py (13)
  14. python/communicator_objects/engine_configuration_proto_pb2.py (19)
  15. python/communicator_objects/environment_parameters_proto_pb2.py (18)
  16. python/communicator_objects/header_pb2.py (11)
  17. python/communicator_objects/resolution_proto_pb2.py (13)
  18. python/communicator_objects/space_type_proto_pb2.py (11)
  19. python/communicator_objects/unity_input_pb2.py (22)
  20. python/communicator_objects/unity_message_pb2.py (13)
  21. python/communicator_objects/unity_output_pb2.py (22)
  22. python/communicator_objects/unity_rl_initialization_input_pb2.py (9)
  23. python/communicator_objects/unity_rl_initialization_output_pb2.py (17)
  24. python/communicator_objects/unity_rl_input_pb2.py (28)
  25. python/communicator_objects/unity_rl_output_pb2.py (24)
  26. python/requirements.txt (2)
  27. python/tests/mock_communicator.py (1)
  28. python/tests/test_unityagents.py (57)
  29. python/tests/test_unitytrainers.py (61)
  30. python/unityagents/__init__.py (1)
  31. python/unityagents/brain.py (29)
  32. python/unityagents/communicator.py (3)
  33. python/unityagents/environment.py (27)
  34. python/unitytrainers/__init__.py (2)
  35. python/unitytrainers/bc/trainer.py (11)
  36. python/unitytrainers/models.py (66)
  37. python/unitytrainers/ppo/models.py (38)
  38. python/unitytrainers/ppo/trainer.py (25)
  39. python/unitytrainers/trainer_controller.py (21)
  40. unity-environment/Assets/ML-Agents/Editor/BrainEditor.cs (10)
  41. unity-environment/Assets/ML-Agents/Examples/Basic/Scenes/Basic.unity (171)
  42. unity-environment/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs (28)
  43. unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes (165)
  44. unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes.meta (2)
  45. unity-environment/Assets/ML-Agents/Scripts/Agent.cs (78)
  46. unity-environment/Assets/ML-Agents/Scripts/Batcher.cs (2)
  47. unity-environment/Assets/ML-Agents/Scripts/Brain.cs (22)
  48. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainParametersProto.cs (52)
  49. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityInput.cs (4)
  50. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityMessage.cs (6)
  51. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityOutput.cs (4)
  52. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationOutput.cs (2)
  53. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInput.cs (2)
  54. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternal.cs (6)
  55. unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternalGrpc.cs (7)
  56. unity-environment/Assets/ML-Agents/Scripts/CoreBrain.cs (2)
  57. unity-environment/Assets/ML-Agents/Scripts/CoreBrainExternal.cs (13)
  58. unity-environment/Assets/ML-Agents/Scripts/CoreBrainHeuristic.cs (9)
  59. unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs (649)
  60. unity-environment/Assets/ML-Agents/Scripts/RpcCommunicator.cs (2)
  61. unity-environment/Assets/ML-Agents/Scripts/UnityAgentsException.cs (2)
  62. python/unitytrainers/curriculum.py (14)
  63. python/unitytrainers/exception.py (15)
  64. /python/unitytrainers/curriculum.py (0)

docs/Getting-Started-with-Balance-Ball.md (7)


**Vector Observation Space**
Before making a decision, an agent collects its observation about its state
in the world. The ML-Agents toolkit classifies vector observations into two types:
**Continuous** and **Discrete**. The **Continuous** vector observation space
collects observations in a vector of floating point numbers. The **Discrete**
vector observation space is an index into a table of states. Most of the example
environments use a continuous vector observation space.
in the world. The vector observation is a vector of floating point numbers
which contain relevant information for the agent to make decisions.
The Brain instance used in the 3D Balance Ball example uses the **Continuous**
vector observation space with a **State Size** of 8. This means that the
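The Brain in this example therefore expects a vector of 8 floating point values per observation. As a rough illustration (not code from this commit), a `CollectObservations()` implementation producing such an 8-value vector might look like the sketch below; the `ball` reference and the `AddVectorObs(Vector3)` overload are assumptions, and each component could equally be added as an individual float:

```csharp
using UnityEngine;

// Hypothetical ball-balancing agent producing an 8-value vector observation:
// 2 platform rotation values + 3 ball position values + 3 ball velocity values.
public class BalancePlatformAgent : Agent
{
    public GameObject ball;      // assumed: assigned in the Inspector
    private Rigidbody ballRb;

    public override void InitializeAgent()
    {
        ballRb = ball.GetComponent<Rigidbody>();
    }

    public override void CollectObservations()
    {
        AddVectorObs(gameObject.transform.rotation.z);                // platform tilt around z
        AddVectorObs(gameObject.transform.rotation.x);                // platform tilt around x
        AddVectorObs(ball.transform.position - transform.position);   // ball position relative to platform (3 values)
        AddVectorObs(ballRb.velocity);                                // ball velocity (3 values)
    }
}
```

With this layout, **State Size** must be set to 8 on the Brain so that the feature vector and the Brain parameters agree.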

docs/Installation.md (6)


## Install Python (with Dependencies)
In order to use ML-Agents toolkit, you need Python 3.5 or 3.6 along with
In order to use ML-Agents toolkit, you need Python 3.6 along with
**NOTES**
- We do not currently support Python 3.7 or Python 3.5.
- If you are using Anaconda and are having trouble with TensorFlow, please see the following [note](https://www.tensorflow.org/install/install_mac#installing_with_anaconda) on how to install TensorFlow in an Anaconda environment.
### Windows Users

docs/Learning-Environment-Design-Agents.md (27)


## Observations
To make decisions, an agent must observe its environment to determine its current state. A state observation can take the following forms:
To make decisions, an agent must observe its environment in order to infer the state of the world. A state observation can take the following forms:
* **Continuous Vector** — a feature vector consisting of an array of numbers.
* **Discrete Vector** — an index into a state table (typically only useful for the simplest of environments).
* **Vector Observation** — a feature vector consisting of an array of floating point numbers.
When you use the **Continuous** or **Discrete** vector observation space for an agent, implement the `Agent.CollectObservations()` method to create the feature vector or state index. When you use **Visual Observations**, you only need to identify which Unity Camera objects will provide images and the base Agent class handles the rest. You do not need to implement the `CollectObservations()` method when your agent uses visual observations (unless it also uses vector observations).
When you use vector observations for an agent, implement the `Agent.CollectObservations()` method to create the feature vector. When you use **Visual Observations**, you only need to identify which Unity Camera objects will provide images and the base Agent class handles the rest. You do not need to implement the `CollectObservations()` method when your agent uses visual observations (unless it also uses vector observations).
### Continuous Vector Observation Space: Feature Vectors
### Vector Observation Space: Feature Vectors
For agents using a continuous state space, you create a feature vector to represent the agent's observation at each step of the simulation. The Brain class calls the `CollectObservations()` method of each of its agents. Your implementation of this function must call `AddVectorObs` to add vector observations.

When you set up an Agent's brain in the Unity Editor, set the following properties to use a continuous vector observation:
**Space Size** — The state size must match the length of your feature vector.
**Space Type** — Set to **Continuous**.
**Brain Type** — Set to **External** during training; set to **Internal** to use the trained model.
The observation feature vector is a list of floating point numbers, which means you must convert any other data types to a float or a list of floats.
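For instance, here is a minimal sketch (not from this commit; the `target` Transform and `hasKey` flag are hypothetical) of how non-float data is flattened into the feature vector:

```csharp
using UnityEngine;

// Hypothetical agent showing how non-float data is converted into observation floats.
public class ExampleAgent : Agent
{
    public Transform target;   // assumed: assigned in the Inspector
    private bool hasKey;       // assumed game state

    public override void CollectObservations()
    {
        // A Vector3 becomes three floats, one per component.
        Vector3 toTarget = target.position - transform.position;
        AddVectorObs(toTarget.x);
        AddVectorObs(toTarget.y);
        AddVectorObs(toTarget.z);

        // A bool becomes 0.0 or 1.0.
        AddVectorObs(hasKey ? 1.0f : 0.0f);
    }
}
```

The Brain's **Space Size** would then be set to 4 to match the length of this feature vector.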

### Multiple Visual Observations
Camera observations use rendered textures from one or more cameras in a scene. The brain vectorizes the textures into a 3D Tensor which can be fed into a convolutional neural network (CNN). For more information on CNNs, see [this guide](http://cs231n.github.io/convolutional-networks/). You can use camera observations and either continuous feature vector or discrete state observations at the same time.
Camera observations use rendered textures from one or more cameras in a scene. The brain vectorizes the textures into a 3D Tensor which can be fed into a convolutional neural network (CNN). For more information on CNNs, see [this guide](http://cs231n.github.io/convolutional-networks/). You can use camera observations alongside vector observations.
Agents using camera images can capture state of arbitrary complexity and are useful when the state is difficult to describe numerically. However, they are also typically less efficient and slower to train, and sometimes don't succeed at all.

In addition, make sure that the Agent's Brain expects a visual observation. In the Brain inspector, under **Brain Parameters** > **Visual Observations**, specify the number of Cameras the agent is using for its visual observations. For each visual observation, set the width and height of the image (in pixels) and whether or not the observation is color or grayscale (when `Black And White` is checked).
### Discrete Vector Observation Space: Table Lookup
You can use the discrete vector observation space when an agent only has a limited number of possible states and those states can be enumerated by a single number. For instance, the [Basic example environment](Learning-Environment-Examples.md#basic) in the ML-Agents toolkit defines an agent with a discrete vector observation space. The states of this agent are the integer steps between two linear goals. In the Basic example, the agent learns to move to the goal that provides the greatest reward.
More generally, the discrete vector observation identifier could be an index into a table of the possible states. However, tables quickly become unwieldy as the environment becomes more complex. For example, even a simple game like [tic-tac-toe has 765 possible states](https://en.wikipedia.org/wiki/Game_complexity) (far more if you don't reduce the number of observations by combining those that are rotations or reflections of each other).
To implement a discrete state observation, implement the `CollectObservations()` method of your Agent subclass and add a single number representing the state to the observation vector:
```csharp
public override void CollectObservations()
{
    AddVectorObs(stateIndex); // stateIndex is the state identifier
}
```
## Vector Actions
An action is an instruction from the brain that the agent carries out. The action is passed to the agent as a parameter when the Academy invokes the agent's `AgentAction()` function. When you specify that the vector action space is **Continuous**, the action parameter passed to the agent is an array of control signals with length equal to the `Vector Action Space Size` property. When you specify a **Discrete** vector action space type, the action parameter is an array containing only a single value, which is an index into your list or table of commands. In the **Discrete** vector action space type, the `Vector Action Space Size` is the number of elements in your action table. Set the `Vector Action Space Size` and `Vector Action Space Type` properties on the Brain object assigned to the agent (using the Unity Editor Inspector window).
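To make the continuous/discrete distinction concrete, here is a minimal `AgentAction()` sketch (not from this commit; the `rb` Rigidbody, `moveSpeed`, and the command handling are hypothetical):

```csharp
public override void AgentAction(float[] vectorAction, string textAction)
{
    // Continuous space: vectorAction has `Vector Action Space Size` elements,
    // each one a control signal. A size of 2 is assumed here.
    float forceX = Mathf.Clamp(vectorAction[0], -1f, 1f);
    float forceZ = Mathf.Clamp(vectorAction[1], -1f, 1f);
    rb.AddForce(new Vector3(forceX, 0f, forceZ) * moveSpeed);

    // Discrete space (alternative): vectorAction holds a single value that
    // indexes your table of commands, e.g.
    //     int command = (int)vectorAction[0];
    //     if (command == 0) { /* move left */ } else { /* move right */ }
}
```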

docs/Learning-Environment-Design-Brains.md (3)


* `Brain Parameters` - Define vector observations, visual observation, and vector actions for the Brain.
* `Vector Observation`
* `Space Type` - Corresponds to whether the observation vector contains a single integer (Discrete) or a series of real-valued floats (Continuous).
* `Space Size` - Length of vector observation for brain (In _Continuous_ space type). Or number of possible values (in _Discrete_ space type).
* `Space Size` - Length of vector observation for brain.
* `Stacked Vectors` - The number of previous vector observations that will be stacked and used collectively for decision making. The effective size of the vector observation passed to the brain is therefore _Space Size_ x _Stacked Vectors_ (for example, a Space Size of 8 with 3 Stacked Vectors yields a 24-element observation).
* `Visual Observations` - Describes height, width, and whether to grayscale visual observations for the Brain.
* `Vector Action`

docs/Learning-Environment-Examples.md (26)


* +0.1 for arriving at suboptimal state.
* +1.0 for arriving at optimal state.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Discrete) One variable corresponding to current state.
* Vector Observation space: One variable corresponding to current state.
* Vector Action space: (Discrete) Two possible actions (Move left, move right).
* Visual Observations: None.
* Reset Parameters: None

* +0.1 for every step the ball remains on the platform.
* -1.0 if the ball falls from the platform.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 8 variables corresponding to rotation of platform, and position, rotation, and velocity of ball.
* Vector Observation space (Hard Version): (Continuous) 5 variables corresponding to rotation of platform and position and rotation of ball.
* Vector Observation space: 8 variables corresponding to rotation of platform, and position, rotation, and velocity of ball.
* Vector Observation space (Hard Version): 5 variables corresponding to rotation of platform and position and rotation of ball.
* Vector Action space: (Continuous) Size of 2, with one value corresponding to X-rotation, and the other to Z-rotation.
* Visual Observations: None.
* Reset Parameters: None

* +0.1 To agent when hitting ball over net.
* -0.1 To agent who let ball hit their ground, or hit ball out of bounds.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 8 variables corresponding to position and velocity of ball and racket.
* Vector Observation space: 8 variables corresponding to position and velocity of ball and racket.
* Vector Action space: (Continuous) Size of 2, corresponding to movement toward net or away from net, and jumping.
* Visual Observations: None.
* Reset Parameters: One, corresponding to size of ball.

* +1.0 if the agent touches the goal.
* -1.0 if the agent falls off the platform.
* Brains: Two brains, each with the following observation/action space.
* Vector Observation space: (Continuous) 16 variables corresponding to position and velocities of agent, block, and goal, plus the height of the wall.
* Vector Observation space: 16 variables corresponding to position and velocities of agent, block, and goal, plus the height of the wall.
* Vector Action space: (Discrete) Size of 74, corresponding to 14 raycasts each detecting 4 possible objects, plus the global position of the agent and whether or not the agent is grounded.
* Visual Observations: None.
* Reset Parameters: 4, corresponding to the height of the possible walls.

* Agent Reward Function (independent):
* +0.1 Each step agent's hand is in goal location.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 26 variables corresponding to position, rotation, velocity, and angular velocities of the two arm Rigidbodies.
* Vector Observation space: 26 variables corresponding to position, rotation, velocity, and angular velocities of the two arm Rigidbodies.
* Vector Action space: (Continuous) Size of 4, corresponding to torque applicable to two joints.
* Visual Observations: None.
* Reset Parameters: Two, corresponding to goal size, and goal movement speed.

* +0.03 times body velocity in the goal direction.
* +0.01 times body direction alignment with goal direction.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 117 variables corresponding to position, rotation, velocity, and angular velocities of each limb plus the acceleration and angular acceleration of the body.
* Vector Observation space: 117 variables corresponding to position, rotation, velocity, and angular velocities of each limb plus the acceleration and angular acceleration of the body.
* Vector Action space: (Continuous) Size of 20, corresponding to target rotations for joints.
* Visual Observations: None.
* Reset Parameters: None

* +1 for interaction with yellow banana
* -1 for interaction with blue banana.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 53 corresponding to velocity of agent (2), whether agent is frozen and/or shot its laser (2), plus ray-based perception of objects around agent's forward direction (49; 7 raycast angles with 7 measurements for each).
* Vector Observation space: 53 corresponding to velocity of agent (2), whether agent is frozen and/or shot its laser (2), plus ray-based perception of objects around agent's forward direction (49; 7 raycast angles with 7 measurements for each).
* Vector Action space: (Continuous) Size of 3, corresponding to forward movement, y-axis rotation, and whether to use laser to disable other agents.
* Visual Observations (Optional): First-person camera per-agent. Use `VisualBanana` scene.
* Reset Parameters: None.

* -0.1 For moving to incorrect goal.
* -0.0003 Existential penalty.
* Brains: One brain with the following observation/action space:
* Vector Observation space: (Continuous) 30 corresponding to local ray-casts detecting objects, goals, and walls.
* Vector Observation space: 30 corresponding to local ray-casts detecting objects, goals, and walls.
* Vector Action space: (Discrete) 4 corresponding to agent rotation and forward/backward movement.
* Visual Observations (Optional): First-person view for the agent. Use `VisualHallway` scene.
* Reset Parameters: None.

* -1 For bouncing out of bounds.
* -0.05 Times the action squared. Energy expenditure penalty.
* Brains: One brain with the following observation/action space:
* Vector Observation space: (Continuous) 6 corresponding to local position of agent and banana.
* Vector Observation space: 6 corresponding to local position of agent and banana.
* Vector Action space: (Continuous) 3 corresponding to agent force applied for the jump.
* Visual Observations: None.
* Reset Parameters: None.

* +0.1 When ball enters opponents goal.
* +0.001 Existential bonus.
* Brains: Two brains, each with the following observation/action space:
* Vector Observation space: (Continuous) 112 corresponding to local 14 ray casts, each detecting 7 possible object types, along with the object's distance. Perception is in 180 degree view from front of agent.
* Vector Observation space: 112 corresponding to local 14 ray casts, each detecting 7 possible object types, along with the object's distance. Perception is in 180 degree view from front of agent.
* Vector Action space: (Discrete)
* Striker: 6 corresponding to forward, backward, sideways movement, as well as rotation.
* Goalie: 4 corresponding to forward, backward, sideways movement.

* +0.01 times body direction alignment with goal direction.
* -0.01 times head velocity difference from body velocity.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 215 variables corresponding to position, rotation, velocity, and angular velocities of each limb, along with goal direction.
* Vector Observation space: 215 variables corresponding to position, rotation, velocity, and angular velocities of each limb, along with goal direction.
* Vector Action space: (Continuous) Size of 39, corresponding to target rotations applicable to the joints.
* Visual Observations: None.
* Reset Parameters: None.

* Agent Reward Function (independent):
* +2 For moving to golden brick (minus 0.001 per step).
* Brains: One brain with the following observation/action space:
* Vector Observation space: (Continuous) 148 corresponding to local ray-casts detecting switch, bricks, golden brick, and walls, plus variable indicating switch state.
* Vector Observation space: 148 corresponding to local ray-casts detecting switch, bricks, golden brick, and walls, plus variable indicating switch state.
* Vector Action space: (Discrete) 4 corresponding to agent rotation and forward/backward movement.
* Visual Observations (Optional): First-person camera per-agent. Use `VisualPyramids` scene.
* Reset Parameters: None.

docs/ML-Agents-Overview.md (6)


Observations can be numeric and/or visual. Numeric observations measure
attributes of the environment from the point of view of the agent. For
our medic this would be attributes of the battlefield that are visible to it.
Observations can either be _discrete_ or _continuous_ depending on the complexity
of the game and agent. For most interesting environments, an agent will require
several continuous numeric observations, while for simple environments with
a small number of unique configurations, a discrete observation will suffice.
For most interesting environments, an agent will require
several continuous numeric observations.
Visual observations, on the other hand, are images generated from the cameras
attached to the agent and represent what the agent is seeing at that point
in time. It is common to confuse an agent's observation with the environment

docs/Python-API.md (2)


A BrainInfo object contains the following fields:
* **`visual_observations`** : A list of 4 dimensional numpy arrays. Matrix n of the list corresponds to the n<sup>th</sup> observation of the brain.
* **`vector_observations`** : A two dimensional numpy array of dimension `(batch size, vector observation size)` if the vector observation space is continuous and `(batch size, 1)` if the vector observation space is discrete.
* **`vector_observations`** : A two dimensional numpy array of dimension `(batch size, vector observation size)`.
* **`text_observations`** : A list of strings corresponding to the agents' text observations.
* **`memories`** : A two dimensional numpy array of dimension `(batch size, memory size)` which corresponds to the memories sent at the previous step.
* **`rewards`** : A list as long as the number of agents using the brain containing the rewards they each obtained at the previous step.

docs/Training-Curriculum-Learning.md (2)


structure of the curriculum. Within it we can set at what points in the training process
our wall height will change, either based on the percentage of training steps which have
taken place, or on the average reward the agent has received in the recent past.
Once these are in place, we simply launch ppo.py using the `--curriculum-file` flag to
Once these are in place, we simply launch learn.py using the `--curriculum-file` flag to
point to the JSON file, and PPO will train using Curriculum Learning. Of course we can
then keep track of the current lesson and progress via TensorBoard.

python/communicator_objects/agent_action_proto_pb2.py (13)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/agent_action_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n-communicator_objects/agent_action_proto.proto\x12\x14\x63ommunicator_objects\"R\n\x10\x41gentActionProto\x12\x16\n\x0evector_actions\x18\x01 \x03(\x02\x12\x14\n\x0ctext_actions\x18\x02 \x01(\t\x12\x10\n\x08memories\x18\x03 \x03(\x02\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='text_actions', full_name='communicator_objects.AgentActionProto.text_actions', index=1,
number=2, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='memories', full_name='communicator_objects.AgentActionProto.memories', index=2,
number=3, type=2, cpp_type=6, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(AgentActionProto)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/agent_info_proto_pb2.py (27)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/agent_info_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n+communicator_objects/agent_info_proto.proto\x12\x14\x63ommunicator_objects\"\xfd\x01\n\x0e\x41gentInfoProto\x12\"\n\x1astacked_vector_observation\x18\x01 \x03(\x02\x12\x1b\n\x13visual_observations\x18\x02 \x03(\x0c\x12\x18\n\x10text_observation\x18\x03 \x01(\t\x12\x1d\n\x15stored_vector_actions\x18\x04 \x03(\x02\x12\x1b\n\x13stored_text_actions\x18\x05 \x01(\t\x12\x10\n\x08memories\x18\x06 \x03(\x02\x12\x0e\n\x06reward\x18\x07 \x01(\x02\x12\x0c\n\x04\x64one\x18\x08 \x01(\x08\x12\x18\n\x10max_step_reached\x18\t \x01(\x08\x12\n\n\x02id\x18\n \x01(\x05\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='visual_observations', full_name='communicator_objects.AgentInfoProto.visual_observations', index=1,
number=2, type=12, cpp_type=9, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='text_observation', full_name='communicator_objects.AgentInfoProto.text_observation', index=2,
number=3, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='stored_vector_actions', full_name='communicator_objects.AgentInfoProto.stored_vector_actions', index=3,
number=4, type=2, cpp_type=6, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='stored_text_actions', full_name='communicator_objects.AgentInfoProto.stored_text_actions', index=4,
number=5, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='memories', full_name='communicator_objects.AgentInfoProto.memories', index=5,
number=6, type=2, cpp_type=6, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='reward', full_name='communicator_objects.AgentInfoProto.reward', index=6,
number=7, type=2, cpp_type=6, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='done', full_name='communicator_objects.AgentInfoProto.done', index=7,
number=8, type=8, cpp_type=7, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='max_step_reached', full_name='communicator_objects.AgentInfoProto.max_step_reached', index=8,
number=9, type=8, cpp_type=7, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='id', full_name='communicator_objects.AgentInfoProto.id', index=9,
number=10, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(AgentInfoProto)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/brain_parameters_proto_pb2.py (43)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/brain_parameters_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_pb=_b('\n1communicator_objects/brain_parameters_proto.proto\x12\x14\x63ommunicator_objects\x1a+communicator_objects/resolution_proto.proto\x1a+communicator_objects/brain_type_proto.proto\x1a+communicator_objects/space_type_proto.proto\"\xc6\x03\n\x14\x42rainParametersProto\x12\x1f\n\x17vector_observation_size\x18\x01 \x01(\x05\x12\'\n\x1fnum_stacked_vector_observations\x18\x02 \x01(\x05\x12\x1a\n\x12vector_action_size\x18\x03 \x01(\x05\x12\x41\n\x12\x63\x61mera_resolutions\x18\x04 \x03(\x0b\x32%.communicator_objects.ResolutionProto\x12\"\n\x1avector_action_descriptions\x18\x05 \x03(\t\x12\x46\n\x18vector_action_space_type\x18\x06 \x01(\x0e\x32$.communicator_objects.SpaceTypeProto\x12K\n\x1dvector_observation_space_type\x18\x07 \x01(\x0e\x32$.communicator_objects.SpaceTypeProto\x12\x12\n\nbrain_name\x18\x08 \x01(\t\x12\x38\n\nbrain_type\x18\t \x01(\x0e\x32$.communicator_objects.BrainTypeProtoB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n1communicator_objects/brain_parameters_proto.proto\x12\x14\x63ommunicator_objects\x1a+communicator_objects/resolution_proto.proto\x1a+communicator_objects/brain_type_proto.proto\x1a+communicator_objects/space_type_proto.proto\"\xf9\x02\n\x14\x42rainParametersProto\x12\x1f\n\x17vector_observation_size\x18\x01 \x01(\x05\x12\'\n\x1fnum_stacked_vector_observations\x18\x02 \x01(\x05\x12\x1a\n\x12vector_action_size\x18\x03 \x01(\x05\x12\x41\n\x12\x63\x61mera_resolutions\x18\x04 \x03(\x0b\x32%.communicator_objects.ResolutionProto\x12\"\n\x1avector_action_descriptions\x18\x05 \x03(\t\x12\x46\n\x18vector_action_space_type\x18\x06 \x01(\x0e\x32$.communicator_objects.SpaceTypeProto\x12\x12\n\nbrain_name\x18\x07 \x01(\t\x12\x38\n\nbrain_type\x18\x08 \x01(\x0e\x32$.communicator_objects.BrainTypeProtoB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_resolution__proto__pb2.DESCRIPTOR,communicator__objects_dot_brain__type__proto__pb2.DESCRIPTOR,communicator__objects_dot_space__type__proto__pb2.DESCRIPTOR,])

has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='num_stacked_vector_observations', full_name='communicator_objects.BrainParametersProto.num_stacked_vector_observations', index=1,
number=2, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='vector_action_size', full_name='communicator_objects.BrainParametersProto.vector_action_size', index=2,
number=3, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='camera_resolutions', full_name='communicator_objects.BrainParametersProto.camera_resolutions', index=3,
number=4, type=11, cpp_type=10, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='vector_action_descriptions', full_name='communicator_objects.BrainParametersProto.vector_action_descriptions', index=4,
number=5, type=9, cpp_type=9, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='vector_action_space_type', full_name='communicator_objects.BrainParametersProto.vector_action_space_type', index=5,
number=6, type=14, cpp_type=8, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
name='vector_observation_space_type', full_name='communicator_objects.BrainParametersProto.vector_observation_space_type', index=6,
number=7, type=14, cpp_type=8, label=1,
has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='brain_name', full_name='communicator_objects.BrainParametersProto.brain_name', index=7,
number=8, type=9, cpp_type=9, label=1,
name='brain_name', full_name='communicator_objects.BrainParametersProto.brain_name', index=6,
number=7, type=9, cpp_type=9, label=1,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
name='brain_type', full_name='communicator_objects.BrainParametersProto.brain_type', index=8,
number=9, type=14, cpp_type=8, label=1,
name='brain_type', full_name='communicator_objects.BrainParametersProto.brain_type', index=7,
number=8, type=14, cpp_type=8, label=1,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

serialized_end=665,
serialized_end=588,
_BRAINPARAMETERSPROTO.fields_by_name['vector_observation_space_type'].enum_type = communicator__objects_dot_space__type__proto__pb2._SPACETYPEPROTO
_BRAINPARAMETERSPROTO.fields_by_name['brain_type'].enum_type = communicator__objects_dot_brain__type__proto__pb2._BRAINTYPEPROTO
DESCRIPTOR.message_types_by_name['BrainParametersProto'] = _BRAINPARAMETERSPROTO
_sym_db.RegisterFileDescriptor(DESCRIPTOR)

_sym_db.RegisterMessage(BrainParametersProto)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/brain_type_proto_pb2.py (15)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/brain_type_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n+communicator_objects/brain_type_proto.proto\x12\x14\x63ommunicator_objects\x1a+communicator_objects/resolution_proto.proto*G\n\x0e\x42rainTypeProto\x12\n\n\x06Player\x10\x00\x12\r\n\tHeuristic\x10\x01\x12\x0c\n\x08\x45xternal\x10\x02\x12\x0c\n\x08Internal\x10\x03\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_resolution__proto__pb2.DESCRIPTOR,])

values=[
_descriptor.EnumValueDescriptor(
name='Player', index=0, number=0,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
serialized_start=114,
serialized_end=185,
)

_sym_db.RegisterFileDescriptor(DESCRIPTOR)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/command_proto_pb2.py (13)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/command_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n(communicator_objects/command_proto.proto\x12\x14\x63ommunicator_objects*-\n\x0c\x43ommandProto\x12\x08\n\x04STEP\x10\x00\x12\t\n\x05RESET\x10\x01\x12\x08\n\x04QUIT\x10\x02\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

values=[
_descriptor.EnumValueDescriptor(
name='STEP', index=0, number=0,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
serialized_start=66,
serialized_end=111,
)

_sym_db.RegisterFileDescriptor(DESCRIPTOR)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/engine_configuration_proto_pb2.py (19)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/engine_configuration_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n5communicator_objects/engine_configuration_proto.proto\x12\x14\x63ommunicator_objects\"\x95\x01\n\x18\x45ngineConfigurationProto\x12\r\n\x05width\x18\x01 \x01(\x05\x12\x0e\n\x06height\x18\x02 \x01(\x05\x12\x15\n\rquality_level\x18\x03 \x01(\x05\x12\x12\n\ntime_scale\x18\x04 \x01(\x02\x12\x19\n\x11target_frame_rate\x18\x05 \x01(\x05\x12\x14\n\x0cshow_monitor\x18\x06 \x01(\x08\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='height', full_name='communicator_objects.EngineConfigurationProto.height', index=1,
number=2, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='quality_level', full_name='communicator_objects.EngineConfigurationProto.quality_level', index=2,
number=3, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='time_scale', full_name='communicator_objects.EngineConfigurationProto.time_scale', index=3,
number=4, type=2, cpp_type=6, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='target_frame_rate', full_name='communicator_objects.EngineConfigurationProto.target_frame_rate', index=4,
number=5, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='show_monitor', full_name='communicator_objects.EngineConfigurationProto.show_monitor', index=5,
number=6, type=8, cpp_type=7, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(EngineConfigurationProto)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/environment_parameters_proto_pb2.py (18)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/environment_parameters_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n7communicator_objects/environment_parameters_proto.proto\x12\x14\x63ommunicator_objects\"\xb5\x01\n\x1a\x45nvironmentParametersProto\x12_\n\x10\x66loat_parameters\x18\x01 \x03(\x0b\x32\x45.communicator_objects.EnvironmentParametersProto.FloatParametersEntry\x1a\x36\n\x14\x46loatParametersEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\x02:\x02\x38\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='value', full_name='communicator_objects.EnvironmentParametersProto.FloatParametersEntry.value', index=1,
number=2, type=2, cpp_type=6, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=_descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001')),
serialized_options=_b('8\001'),
is_extendable=False,
syntax='proto3',
extension_ranges=[],

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(EnvironmentParametersProto.FloatParametersEntry)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY.has_options = True
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY._options = _descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001'))
DESCRIPTOR._options = None
_ENVIRONMENTPARAMETERSPROTO_FLOATPARAMETERSENTRY._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/header_pb2.py (11)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/header.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n!communicator_objects/header.proto\x12\x14\x63ommunicator_objects\")\n\x06Header\x12\x0e\n\x06status\x18\x01 \x01(\x05\x12\x0f\n\x07message\x18\x02 \x01(\tB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='message', full_name='communicator_objects.Header.message', index=1,
number=2, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(Header)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/resolution_proto_pb2.py (13)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/resolution_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n+communicator_objects/resolution_proto.proto\x12\x14\x63ommunicator_objects\"D\n\x0fResolutionProto\x12\r\n\x05width\x18\x01 \x01(\x05\x12\x0e\n\x06height\x18\x02 \x01(\x05\x12\x12\n\ngray_scale\x18\x03 \x01(\x08\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='height', full_name='communicator_objects.ResolutionProto.height', index=1,
number=2, type=5, cpp_type=1, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='gray_scale', full_name='communicator_objects.ResolutionProto.gray_scale', index=2,
number=3, type=8, cpp_type=7, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(ResolutionProto)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/space_type_proto_pb2.py (11)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/space_type_proto.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n+communicator_objects/space_type_proto.proto\x12\x14\x63ommunicator_objects\x1a+communicator_objects/resolution_proto.proto*.\n\x0eSpaceTypeProto\x12\x0c\n\x08\x64iscrete\x10\x00\x12\x0e\n\ncontinuous\x10\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_resolution__proto__pb2.DESCRIPTOR,])

values=[
_descriptor.EnumValueDescriptor(
name='discrete', index=0, number=0,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
options=None,
serialized_options=None,
serialized_start=114,
serialized_end=160,
)

_sym_db.RegisterFileDescriptor(DESCRIPTOR)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/unity_input_pb2.py (22)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_input.proto',
package='communicator_objects',
syntax='proto3',
serialized_pb=_b('\n&communicator_objects/unity_input.proto\x12\x14\x63ommunicator_objects\x1a)communicator_objects/unity_rl_input.proto\x1a\x38\x63ommunicator_objects/unity_rl_initialization_input.proto\"\xb0\x01\n\nUnityInput\x12\x34\n\x08rl_input\x18\x01 \x01(\x0b\x32\".communicator_objects.UnityRLInput\x12Q\n\x17rl_initialization_input\x18\x02 \x01(\x0b\x32\x30.communicator_objects.UnityRLInitializationInput\x12\x19\n\x11\x63ustom_data_input\x18\x03 \x01(\x05\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n&communicator_objects/unity_input.proto\x12\x14\x63ommunicator_objects\x1a)communicator_objects/unity_rl_input.proto\x1a\x38\x63ommunicator_objects/unity_rl_initialization_input.proto\"\x95\x01\n\nUnityInput\x12\x34\n\x08rl_input\x18\x01 \x01(\x0b\x32\".communicator_objects.UnityRLInput\x12Q\n\x17rl_initialization_input\x18\x02 \x01(\x0b\x32\x30.communicator_objects.UnityRLInitializationInputB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_unity__rl__input__pb2.DESCRIPTOR,communicator__objects_dot_unity__rl__initialization__input__pb2.DESCRIPTOR,])

has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='rl_initialization_input', full_name='communicator_objects.UnityInput.rl_initialization_input', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='custom_data_input', full_name='communicator_objects.UnityInput.custom_data_input', index=2,
number=3, type=5, cpp_type=1, label=1,
has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

serialized_end=342,
serialized_end=315,
)
_UNITYINPUT.fields_by_name['rl_input'].message_type = communicator__objects_dot_unity__rl__input__pb2._UNITYRLINPUT

_sym_db.RegisterMessage(UnityInput)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/unity_message_pb2.py (13)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_message.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n(communicator_objects/unity_message.proto\x12\x14\x63ommunicator_objects\x1a\'communicator_objects/unity_output.proto\x1a&communicator_objects/unity_input.proto\x1a!communicator_objects/header.proto\"\xac\x01\n\x0cUnityMessage\x12,\n\x06header\x18\x01 \x01(\x0b\x32\x1c.communicator_objects.Header\x12\x37\n\x0cunity_output\x18\x02 \x01(\x0b\x32!.communicator_objects.UnityOutput\x12\x35\n\x0bunity_input\x18\x03 \x01(\x0b\x32 .communicator_objects.UnityInputB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_unity__output__pb2.DESCRIPTOR,communicator__objects_dot_unity__input__pb2.DESCRIPTOR,communicator__objects_dot_header__pb2.DESCRIPTOR,])

has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='unity_output', full_name='communicator_objects.UnityMessage.unity_output', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='unity_input', full_name='communicator_objects.UnityMessage.unity_input', index=2,
number=3, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(UnityMessage)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/unity_output_pb2.py (22)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_output.proto',
package='communicator_objects',
syntax='proto3',
serialized_pb=_b('\n\'communicator_objects/unity_output.proto\x12\x14\x63ommunicator_objects\x1a*communicator_objects/unity_rl_output.proto\x1a\x39\x63ommunicator_objects/unity_rl_initialization_output.proto\"\xb6\x01\n\x0bUnityOutput\x12\x36\n\trl_output\x18\x01 \x01(\x0b\x32#.communicator_objects.UnityRLOutput\x12S\n\x18rl_initialization_output\x18\x02 \x01(\x0b\x32\x31.communicator_objects.UnityRLInitializationOutput\x12\x1a\n\x12\x63ustom_data_output\x18\x03 \x01(\tB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n\'communicator_objects/unity_output.proto\x12\x14\x63ommunicator_objects\x1a*communicator_objects/unity_rl_output.proto\x1a\x39\x63ommunicator_objects/unity_rl_initialization_output.proto\"\x9a\x01\n\x0bUnityOutput\x12\x36\n\trl_output\x18\x01 \x01(\x0b\x32#.communicator_objects.UnityRLOutput\x12S\n\x18rl_initialization_output\x18\x02 \x01(\x0b\x32\x31.communicator_objects.UnityRLInitializationOutputB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_unity__rl__output__pb2.DESCRIPTOR,communicator__objects_dot_unity__rl__initialization__output__pb2.DESCRIPTOR,])

has_default_value=False, default_value=None,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='rl_initialization_output', full_name='communicator_objects.UnityOutput.rl_initialization_output', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='custom_data_output', full_name='communicator_objects.UnityOutput.custom_data_output', index=2,
number=3, type=9, cpp_type=9, label=1,
has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

serialized_end=351,
serialized_end=323,
)
_UNITYOUTPUT.fields_by_name['rl_output'].message_type = communicator__objects_dot_unity__rl__output__pb2._UNITYRLOUTPUT

_sym_db.RegisterMessage(UnityOutput)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

python/communicator_objects/unity_rl_initialization_input_pb2.py (9)


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_rl_initialization_input.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n8communicator_objects/unity_rl_initialization_input.proto\x12\x14\x63ommunicator_objects\"*\n\x1aUnityRLInitializationInput\x12\x0c\n\x04seed\x18\x01 \x01(\x05\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
)

has_default_value=False, default_value=0,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(UnityRLInitializationInput)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

17
python/communicator_objects/unity_rl_initialization_output_pb2.py


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_rl_initialization_output.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n9communicator_objects/unity_rl_initialization_output.proto\x12\x14\x63ommunicator_objects\x1a\x31\x63ommunicator_objects/brain_parameters_proto.proto\x1a\x37\x63ommunicator_objects/environment_parameters_proto.proto\"\xe6\x01\n\x1bUnityRLInitializationOutput\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x0f\n\x07version\x18\x02 \x01(\t\x12\x10\n\x08log_path\x18\x03 \x01(\t\x12\x44\n\x10\x62rain_parameters\x18\x05 \x03(\x0b\x32*.communicator_objects.BrainParametersProto\x12P\n\x16\x65nvironment_parameters\x18\x06 \x01(\x0b\x32\x30.communicator_objects.EnvironmentParametersProtoB\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_brain__parameters__proto__pb2.DESCRIPTOR,communicator__objects_dot_environment__parameters__proto__pb2.DESCRIPTOR,])

has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='version', full_name='communicator_objects.UnityRLInitializationOutput.version', index=1,
number=2, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='log_path', full_name='communicator_objects.UnityRLInitializationOutput.log_path', index=2,
number=3, type=9, cpp_type=9, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='brain_parameters', full_name='communicator_objects.UnityRLInitializationOutput.brain_parameters', index=3,
number=5, type=11, cpp_type=10, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='environment_parameters', full_name='communicator_objects.UnityRLInitializationOutput.environment_parameters', index=4,
number=6, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(UnityRLInitializationOutput)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
DESCRIPTOR._options = None
# @@protoc_insertion_point(module_scope)

28
python/communicator_objects/unity_rl_input_pb2.py


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_rl_input.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n)communicator_objects/unity_rl_input.proto\x12\x14\x63ommunicator_objects\x1a-communicator_objects/agent_action_proto.proto\x1a\x37\x63ommunicator_objects/environment_parameters_proto.proto\x1a(communicator_objects/command_proto.proto\"\xb4\x03\n\x0cUnityRLInput\x12K\n\ragent_actions\x18\x01 \x03(\x0b\x32\x34.communicator_objects.UnityRLInput.AgentActionsEntry\x12P\n\x16\x65nvironment_parameters\x18\x02 \x01(\x0b\x32\x30.communicator_objects.EnvironmentParametersProto\x12\x13\n\x0bis_training\x18\x03 \x01(\x08\x12\x33\n\x07\x63ommand\x18\x04 \x01(\x0e\x32\".communicator_objects.CommandProto\x1aM\n\x14ListAgentActionProto\x12\x35\n\x05value\x18\x01 \x03(\x0b\x32&.communicator_objects.AgentActionProto\x1al\n\x11\x41gentActionsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x46\n\x05value\x18\x02 \x01(\x0b\x32\x37.communicator_objects.UnityRLInput.ListAgentActionProto:\x02\x38\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_agent__action__proto__pb2.DESCRIPTOR,communicator__objects_dot_environment__parameters__proto__pb2.DESCRIPTOR,communicator__objects_dot_command__proto__pb2.DESCRIPTOR,])

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='value', full_name='communicator_objects.UnityRLInput.AgentActionsEntry.value', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=_descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001')),
serialized_options=_b('8\001'),
is_extendable=False,
syntax='proto3',
extension_ranges=[],

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='environment_parameters', full_name='communicator_objects.UnityRLInput.environment_parameters', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='is_training', full_name='communicator_objects.UnityRLInput.is_training', index=2,
number=3, type=8, cpp_type=7, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='command', full_name='communicator_objects.UnityRLInput.command', index=3,
number=4, type=14, cpp_type=8, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(UnityRLInput.AgentActionsEntry)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
_UNITYRLINPUT_AGENTACTIONSENTRY.has_options = True
_UNITYRLINPUT_AGENTACTIONSENTRY._options = _descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001'))
DESCRIPTOR._options = None
_UNITYRLINPUT_AGENTACTIONSENTRY._options = None
# @@protoc_insertion_point(module_scope)

24
python/communicator_objects/unity_rl_output_pb2.py


from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
from google.protobuf import descriptor_pb2
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()

name='communicator_objects/unity_rl_output.proto',
package='communicator_objects',
syntax='proto3',
serialized_options=_b('\252\002\034MLAgents.CommunicatorObjects'),
serialized_pb=_b('\n*communicator_objects/unity_rl_output.proto\x12\x14\x63ommunicator_objects\x1a+communicator_objects/agent_info_proto.proto\"\xa3\x02\n\rUnityRLOutput\x12\x13\n\x0bglobal_done\x18\x01 \x01(\x08\x12G\n\nagentInfos\x18\x02 \x03(\x0b\x32\x33.communicator_objects.UnityRLOutput.AgentInfosEntry\x1aI\n\x12ListAgentInfoProto\x12\x33\n\x05value\x18\x01 \x03(\x0b\x32$.communicator_objects.AgentInfoProto\x1ai\n\x0f\x41gentInfosEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x45\n\x05value\x18\x02 \x01(\x0b\x32\x36.communicator_objects.UnityRLOutput.ListAgentInfoProto:\x02\x38\x01\x42\x1f\xaa\x02\x1cMLAgents.CommunicatorObjectsb\x06proto3')
,
dependencies=[communicator__objects_dot_agent__info__proto__pb2.DESCRIPTOR,])

has_default_value=False, default_value=[],
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

has_default_value=False, default_value=_b("").decode('utf-8'),
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='value', full_name='communicator_objects.UnityRLOutput.AgentInfosEntry.value', index=1,
number=2, type=11, cpp_type=10, label=1,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=_descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001')),
serialized_options=_b('8\001'),
is_extendable=False,
syntax='proto3',
extension_ranges=[],

has_default_value=False, default_value=False,
message_type=None, enum_type=None, containing_type=None,
is_extension=False, extension_scope=None,
options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
_descriptor.FieldDescriptor(
name='agentInfos', full_name='communicator_objects.UnityRLOutput.agentInfos', index=1,
number=2, type=11, cpp_type=10, label=3,

options=None, file=DESCRIPTOR),
serialized_options=None, file=DESCRIPTOR),
],
extensions=[
],

options=None,
serialized_options=None,
is_extendable=False,
syntax='proto3',
extension_ranges=[],

_sym_db.RegisterMessage(UnityRLOutput.AgentInfosEntry)
DESCRIPTOR.has_options = True
DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('\252\002\034MLAgents.CommunicatorObjects'))
_UNITYRLOUTPUT_AGENTINFOSENTRY.has_options = True
_UNITYRLOUTPUT_AGENTINFOSENTRY._options = _descriptor._ParseOptions(descriptor_pb2.MessageOptions(), _b('8\001'))
DESCRIPTOR._options = None
_UNITYRLOUTPUT_AGENTINFOSENTRY._options = None
# @@protoc_insertion_point(module_scope)

2
python/requirements.txt


pytest>=3.2.2
docopt
pyyaml
protobuf==3.5.2
protobuf==3.6.0
grpcio==1.11.0
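Note: the `*_pb2.py` modules earlier in this diff were regenerated against the newer protobuf pin above (3.6.0), which is why the descriptors now carry `serialized_options=` instead of the older `options=` / `descriptor_pb2.FileOptions` pattern. A hedged sketch of how such a regeneration is typically run; the command and proto paths below are assumptions for illustration, not something recorded in this commit:

```python
# Sketch only: regenerate the Python protobuf modules after bumping protobuf.
# Assumes grpcio-tools is installed (it bundles a protoc matched to the
# installed protobuf runtime); a standalone protoc 3.6.0 binary works too.
import subprocess

PROTOS = [
    "communicator_objects/unity_output.proto",
    "communicator_objects/unity_rl_input.proto",
    "communicator_objects/unity_rl_output.proto",
]

subprocess.check_call(
    ["python", "-m", "grpc_tools.protoc", "--proto_path=.", "--python_out=."] + PROTOS
)
```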

1
python/tests/mock_communicator.py


camera_resolutions=resolutions,
vector_action_descriptions=["", ""],
vector_action_space_type=int(not self.is_discrete),
vector_observation_space_type=1,
brain_name="RealFakeBrain",
brain_type=2
)

57
python/tests/test_unityagents.py


import json
import unittest.mock as mock
import pytest
import struct

from unityagents import UnityEnvironment, UnityEnvironmentException, UnityActionException, \
BrainInfo, Curriculum
BrainInfo
dummy_curriculum = json.loads('''{
"measure" : "reward",
"thresholds" : [10, 20, 50],
"min_lesson_length" : 3,
"signal_smoothing" : true,
"parameters" :
{
"param1" : [0.7, 0.5, 0.3, 0.1],
"param2" : [100, 50, 20, 15],
"param3" : [0.2, 0.3, 0.7, 0.9]
}
}''')
bad_curriculum = json.loads('''{
"measure" : "reward",
"thresholds" : [10, 20, 50],
"min_lesson_length" : 3,
"signal_smoothing" : false,
"parameters" :
{
"param1" : [0.7, 0.5, 0.3, 0.1],
"param2" : [100, 50, 20],
"param3" : [0.2, 0.3, 0.7, 0.9]
}
}''')
def test_handles_bad_filename():

env.close()
assert not env._loaded
assert comm.has_been_closed
def test_curriculum():
open_name = '%s.open' % __name__
with mock.patch('json.load') as mock_load:
with mock.patch(open_name, create=True) as mock_open:
mock_open.return_value = 0
mock_load.return_value = bad_curriculum
with pytest.raises(UnityEnvironmentException):
Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
mock_load.return_value = dummy_curriculum
with pytest.raises(UnityEnvironmentException):
Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1})
curriculum = Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
assert curriculum.get_lesson_number == 0
curriculum.set_lesson_number(1)
assert curriculum.get_lesson_number == 1
curriculum.increment_lesson(10)
assert curriculum.get_lesson_number == 1
curriculum.increment_lesson(30)
curriculum.increment_lesson(30)
assert curriculum.get_lesson_number == 1
assert curriculum.lesson_length == 3
curriculum.increment_lesson(30)
assert curriculum.get_config() == {'param1': 0.3, 'param2': 20, 'param3': 0.7}
assert curriculum.get_config(0) == {"param1": 0.7, "param2": 100, "param3": 0.2}
assert curriculum.lesson_length == 0
assert curriculum.get_lesson_number == 2
if __name__ == '__main__':

61
python/tests/test_unitytrainers.py


import json
import yaml
import unittest.mock as mock
import pytest

from unitytrainers.models import *
from unitytrainers.ppo.trainer import PPOTrainer
from unitytrainers.bc.trainer import BehavioralCloningTrainer
from unityagents import UnityEnvironmentException
from unitytrainers.curriculum import Curriculum
from unitytrainers.exception import CurriculumError
from unityagents.exception import UnityEnvironmentException
from .mock_communicator import MockCommunicator
dummy_start = '''{

"memorySize": 0,
"cameraResolutions": [],
"vectorActionDescriptions": ["",""],
"vectorActionSpaceType": 1,
"vectorObservationSpaceType": 1
"vectorActionSpaceType": 1
}]
}'''.encode()

memory_size: 8
''')
dummy_curriculum = json.loads('''{
"measure" : "reward",
"thresholds" : [10, 20, 50],
"min_lesson_length" : 3,
"signal_smoothing" : true,
"parameters" :
{
"param1" : [0.7, 0.5, 0.3, 0.1],
"param2" : [100, 50, 20, 15],
"param3" : [0.2, 0.3, 0.7, 0.9]
}
}''')
bad_curriculum = json.loads('''{
"measure" : "reward",
"thresholds" : [10, 20, 50],
"min_lesson_length" : 3,
"signal_smoothing" : false,
"parameters" :
{
"param1" : [0.7, 0.5, 0.3, 0.1],
"param2" : [100, 50, 20],
"param3" : [0.2, 0.3, 0.7, 0.9]
}
}''')
@mock.patch('unityagents.UnityEnvironment.executable_launcher')
@mock.patch('unityagents.UnityEnvironment.get_communicator')

batch_size=None, training_length=2)
assert len(b.update_buffer['action']) == 10
assert np.array(b.update_buffer['action']).shape == (10, 2, 2)
def test_curriculum():
open_name = '%s.open' % __name__
with mock.patch('json.load') as mock_load:
with mock.patch(open_name, create=True) as mock_open:
mock_open.return_value = 0
mock_load.return_value = bad_curriculum
with pytest.raises(CurriculumError):
Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
mock_load.return_value = dummy_curriculum
with pytest.raises(CurriculumError):
Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1})
curriculum = Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
assert curriculum.get_lesson_number == 0
curriculum.set_lesson_number(1)
assert curriculum.get_lesson_number == 1
curriculum.increment_lesson(10)
assert curriculum.get_lesson_number == 1
curriculum.increment_lesson(30)
curriculum.increment_lesson(30)
assert curriculum.get_lesson_number == 1
assert curriculum.lesson_length == 3
curriculum.increment_lesson(30)
assert curriculum.get_config() == {'param1': 0.3, 'param2': 20, 'param3': 0.7}
assert curriculum.get_config(0) == {"param1": 0.7, "param2": 100, "param3": 0.2}
assert curriculum.lesson_length == 0
assert curriculum.get_lesson_number == 2
if __name__ == '__main__':
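The curriculum tests now live next to the relocated `Curriculum` class: in `unitytrainers` the same malformed-curriculum cases raise `CurriculumError` (from `unitytrainers.exception`) instead of `UnityEnvironmentException`. A minimal sketch of the new import surface and error handling, based on the imports and calls in the test above; the JSON file name is a placeholder:

```python
# Minimal sketch of the relocated curriculum API (file path is illustrative).
from unitytrainers.curriculum import Curriculum
from unitytrainers.exception import CurriculumError

try:
    # The default reset parameters must contain every key listed under
    # "parameters" in the curriculum JSON; otherwise CurriculumError is raised.
    curriculum = Curriculum("curricula/wall.json",
                            {"param1": 1, "param2": 1, "param3": 1})
except CurriculumError as err:
    print("Invalid curriculum file:", err)
```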

1
python/unityagents/__init__.py


from .environment import *
from .brain import *
from .exception import *
from .curriculum import *

29
python/unityagents/brain.py


self.vector_action_space_size = brain_param["vectorActionSize"]
self.vector_action_descriptions = brain_param["vectorActionDescriptions"]
self.vector_action_space_type = ["discrete", "continuous"][brain_param["vectorActionSpaceType"]]
self.vector_observation_space_type = ["discrete", "continuous"][brain_param["vectorObservationSpaceType"]]
return '''Unity brain name: {0}
Number of Visual Observations (per agent): {1}
Vector Observation space type: {2}
Vector Observation space size (per agent): {3}
Number of stacked Vector Observation: {4}
Vector Action space type: {5}
Vector Action space size (per agent): {6}
Vector Action descriptions: {7}'''.format(self.brain_name,
str(self.number_visual_observations),
self.vector_observation_space_type,
str(self.vector_observation_space_size),
str(self.num_stacked_vector_observations),
self.vector_action_space_type,
str(self.vector_action_space_size),
', '.join(self.vector_action_descriptions))
return '''Unity brain name: {}
Number of Visual Observations (per agent): {}
Vector Observation space size (per agent): {}
Number of stacked Vector Observation: {}
Vector Action space type: {}
Vector Action space size (per agent): {}
Vector Action descriptions: {}'''.format(self.brain_name,
str(self.number_visual_observations),
str(self.vector_observation_space_size),
str(self.num_stacked_vector_observations),
self.vector_action_space_type,
str(self.vector_action_space_size),
', '.join(self.vector_action_descriptions))

3
python/unityagents/communicator.py


class Communicator(object):
def __init__(self, worker_id=0,
base_port=5005):
def __init__(self, worker_id=0, base_port=5005):
"""
Python side of the communication. Must be used in pair with the right Unity Communicator equivalent.

27
python/unityagents/environment.py


from .brain import BrainInfo, BrainParameters, AllBrainInfo
from .exception import UnityEnvironmentException, UnityActionException, UnityTimeOutException
from .curriculum import Curriculum
from communicator_objects import UnityRLInput, UnityRLOutput, AgentActionProto,\
EnvironmentParametersProto, UnityRLInitializationInput, UnityRLInitializationOutput,\

class UnityEnvironment(object):
def __init__(self, file_name=None, worker_id=0,
base_port=5005, curriculum=None,
seed=0, docker_training=False, no_graphics=False):
base_port=5005, seed=0,
docker_training=False, no_graphics=False):
"""
Starts a new unity environment and establishes a connection with the environment.
Notice: Currently communication between Unity and Python takes place over an open socket without authentication.

"cameraResolutions": resolution,
"vectorActionSize": brain_param.vector_action_size,
"vectorActionDescriptions": brain_param.vector_action_descriptions,
"vectorActionSpaceType": brain_param.vector_action_space_type,
"vectorObservationSpaceType": brain_param.vector_observation_space_type
"vectorActionSpaceType": brain_param.vector_action_space_type
})
if brain_param.brain_type == 2:
self._external_brain_names += [brain_param.brain_name]

self._curriculum = Curriculum(curriculum, self._resetParameters)
@property
def curriculum(self):
return self._curriculum
@property
def logfile_path(self):

# return SocketCommunicator(worker_id, base_port)
def __str__(self):
_new_reset_param = self._curriculum.get_config()
for k in _new_reset_param:
self._resetParameters[k] = _new_reset_param[k]
Lesson number : {3}
Reset Parameters :\n\t\t{4}'''.format(self._academy_name, str(self._num_brains),
str(self._num_external_brains), self._curriculum.get_lesson_number,
"\n\t\t".join([str(k) + " -> " + str(self._resetParameters[k])
Reset Parameters :\n\t\t{3}'''.format(self._academy_name, str(self._num_brains),
str(self._num_external_brains),
"\n\t\t".join([str(k) + " -> " + str(self._resetParameters[k])
def reset(self, train_mode=True, config=None, lesson=None) -> AllBrainInfo:
def reset(self, config=None, train_mode=True) -> AllBrainInfo:
config = self._curriculum.get_config(lesson)
config = self._resetParameters
elif config != {}:
logger.info("\nAcademy Reset with parameters : \t{0}"
.format(', '.join([str(x) + ' -> ' + str(config[x]) for x in config])))
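With curriculum handling removed from `UnityEnvironment`, the constructor no longer accepts a `curriculum` argument and `reset` now takes the reset configuration directly as `config`. A minimal sketch of the new calling pattern, mirroring how `trainer_controller.py` uses it later in this diff; the environment name and curriculum path are placeholders:

```python
# Sketch of the post-change API: the environment knows nothing about
# curricula, so callers pass reset parameters explicitly.
from unityagents import UnityEnvironment
from unitytrainers.curriculum import Curriculum

env = UnityEnvironment(file_name="3DBall", worker_id=0, seed=1)   # no curriculum kwarg
curriculum = Curriculum("curricula/3dball.json", env._resetParameters)

# reset(config=..., train_mode=...) replaces reset(train_mode, config, lesson)
info = env.reset(config=curriculum.get_config(), train_mode=True)
```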

2
python/unitytrainers/__init__.py


from .buffer import *
from .curriculum import *
from .models import *
from .trainer_controller import *
from .bc.models import *

from .exception import *

11
python/unitytrainers/bc/trainer.py


self.training_buffer = Buffer()
self.is_continuous_action = (env.brains[brain_name].vector_action_space_type == "continuous")
self.is_continuous_observation = (env.brains[brain_name].vector_observation_space_type == "continuous")
if self.use_visual_observations:
logger.info('Cannot use observations with imitation learning')
self.use_vector_observations = (env.brains[brain_name].vector_observation_space_size > 0)
self.summary_path = trainer_parameters['summary_path']
if not os.path.exists(self.summary_path):

else:
feed_dict[self.model.true_action] = np.array(_buffer['actions'][start:end]).reshape([-1])
if self.use_vector_observations:
if not self.is_continuous_observation:
feed_dict[self.model.vector_in] = np.array(_buffer['vector_observations'][start:end])\
.reshape([-1, self.brain.num_stacked_vector_observations])
else:
feed_dict[self.model.vector_in] = np.array(_buffer['vector_observations'][start:end])\
.reshape([-1, self.brain.vector_observation_space_size * self.brain.num_stacked_vector_observations])
feed_dict[self.model.vector_in] = np.array(_buffer['vector_observations'][start:end])\
.reshape([-1, self.brain.vector_observation_space_size * self.brain.num_stacked_vector_observations])
if self.use_visual_observations:
for i, _ in enumerate(self.model.visual_in):
_obs = np.array(_buffer['visual_observations%d' % i][start:end])

66
python/unitytrainers/models.py


:param o_size: Size of stacked vector observation.
:return:
"""
if self.brain.vector_observation_space_type == "continuous":
self.vector_in = tf.placeholder(shape=[None, self.o_size], dtype=tf.float32, name=name)
if self.normalize:
self.running_mean = tf.get_variable("running_mean", [self.o_size], trainable=False, dtype=tf.float32,
initializer=tf.zeros_initializer())
self.running_variance = tf.get_variable("running_variance", [self.o_size], trainable=False,
dtype=tf.float32, initializer=tf.ones_initializer())
self.update_mean, self.update_variance = self.create_normalizer_update(self.vector_in)
self.vector_in = tf.placeholder(shape=[None, self.o_size], dtype=tf.float32, name=name)
if self.normalize:
self.running_mean = tf.get_variable("running_mean", [self.o_size], trainable=False, dtype=tf.float32,
initializer=tf.zeros_initializer())
self.running_variance = tf.get_variable("running_variance", [self.o_size], trainable=False,
dtype=tf.float32, initializer=tf.ones_initializer())
self.update_mean, self.update_variance = self.create_normalizer_update(self.vector_in)
self.normalized_state = tf.clip_by_value((self.vector_in - self.running_mean) / tf.sqrt(
self.running_variance / (tf.cast(self.global_step, tf.float32) + 1)), -5, 5,
name="normalized_state")
return self.normalized_state
else:
return self.vector_in
self.normalized_state = tf.clip_by_value((self.vector_in - self.running_mean) / tf.sqrt(
self.running_variance / (tf.cast(self.global_step, tf.float32) + 1)), -5, 5,
name="normalized_state")
return self.normalized_state
self.vector_in = tf.placeholder(shape=[None, 1], dtype=tf.int32, name='vector_observation')
return self.vector_in
def create_normalizer_update(self, vector_input):

return update_mean, update_variance
@staticmethod
def create_continuous_observation_encoder(observation_input, h_size, activation, num_layers, scope, reuse):
def create_vector_observation_encoder(observation_input, h_size, activation, num_layers, scope, reuse):
"""
Builds a set of hidden state encoders.
:param reuse: Whether to re-use the weights within the same scope.

hidden = c_layers.flatten(conv2)
with tf.variable_scope(scope+'/'+'flat_encoding'):
hidden_flat = self.create_continuous_observation_encoder(hidden, h_size, activation,
num_layers, scope, reuse)
hidden_flat = self.create_vector_observation_encoder(hidden, h_size, activation,
num_layers, scope, reuse)
@staticmethod
def create_discrete_observation_encoder(observation_input, s_size, h_size, activation,
num_layers, scope, reuse):
"""
Builds a set of hidden state encoders from discrete state input.
:param reuse: Whether to re-use the weights within the same scope.
:param scope: The scope of the graph within which to create the ops.
:param observation_input: Discrete observation.
:param s_size: state input size (discrete).
:param h_size: Hidden layer size.
:param activation: What type of activation function to use for layers.
:param num_layers: number of hidden layers to create.
:return: List of hidden layer tensors.
"""
with tf.variable_scope(scope):
vector_in = tf.reshape(observation_input, [-1])
state_onehot = tf.one_hot(vector_in, s_size)
hidden = state_onehot
for i in range(num_layers):
hidden = tf.layers.dense(hidden, h_size, use_bias=False, activation=activation,
reuse=reuse, name="hidden_{}".format(i))
return hidden
def create_observation_streams(self, num_streams, h_size, num_layers):
"""
Creates encoding stream for observations.

visual_encoders.append(encoded_visual)
hidden_visual = tf.concat(visual_encoders, axis=1)
if brain.vector_observation_space_size > 0:
if brain.vector_observation_space_type == "continuous":
hidden_state = self.create_continuous_observation_encoder(vector_observation_input,
h_size, activation_fn, num_layers,
"main_graph_{}".format(i), False)
else:
hidden_state = self.create_discrete_observation_encoder(vector_observation_input, self.o_size,
h_size, activation_fn, num_layers,
"main_graph_{}".format(i), False)
hidden_state = self.create_vector_observation_encoder(vector_observation_input,
h_size, activation_fn, num_layers,
"main_graph_{}".format(i), False)
if hidden_state is not None and hidden_visual is not None:
final_hidden = tf.concat([hidden_visual, hidden_state], axis=1)
elif hidden_state is None and hidden_visual is not None:
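`create_continuous_observation_encoder` is renamed to `create_vector_observation_encoder` and the discrete-observation path is dropped, so every vector observation now goes through the same dense encoder. A TF 1.x-style sketch of what such an encoder does; layer widths, activation, and placeholder shape below are illustrative, not copied from the repository:

```python
import tensorflow as tf

def create_vector_observation_encoder(observation_input, h_size, activation,
                                      num_layers, scope, reuse):
    """Stack num_layers dense layers of width h_size over the observation."""
    with tf.variable_scope(scope):
        hidden = observation_input
        for i in range(num_layers):
            hidden = tf.layers.dense(hidden, h_size, activation=activation,
                                     reuse=reuse, name="hidden_{}".format(i))
        return hidden

# Example: encode a batch of 6-dimensional stacked vector observations.
vector_in = tf.placeholder(tf.float32, shape=[None, 6], name="vector_observation")
encoded = create_vector_observation_encoder(vector_in, h_size=64,
                                            activation=tf.nn.elu,
                                            num_layers=2,
                                            scope="main_graph_0", reuse=False)
```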

38
python/unitytrainers/ppo/models.py


encoded_next_state_list.append(hidden_next_visual)
if self.o_size > 0:
if self.brain.vector_observation_space_type == "continuous":
# Create input op for next (t+1) vector observation.
self.next_vector_in = tf.placeholder(shape=[None, self.o_size], dtype=tf.float32,
name='next_vector_observation')
encoded_vector_obs = self.create_continuous_observation_encoder(self.vector_in,
self.curiosity_enc_size,
self.swish, 2, "vector_obs_encoder",
False)
encoded_next_vector_obs = self.create_continuous_observation_encoder(self.next_vector_in,
self.curiosity_enc_size,
self.swish, 2,
"vector_obs_encoder",
True)
else:
self.next_vector_in = tf.placeholder(shape=[None, 1], dtype=tf.int32,
name='next_vector_observation')
# Create input op for next (t+1) vector observation.
self.next_vector_in = tf.placeholder(shape=[None, self.o_size], dtype=tf.float32,
name='next_vector_observation')
encoded_vector_obs = self.create_discrete_observation_encoder(self.vector_in, self.o_size,
self.curiosity_enc_size,
self.swish, 2, "vector_obs_encoder",
False)
encoded_next_vector_obs = self.create_discrete_observation_encoder(self.next_vector_in, self.o_size,
self.curiosity_enc_size,
self.swish, 2, "vector_obs_encoder",
True)
encoded_vector_obs = self.create_vector_observation_encoder(self.vector_in,
self.curiosity_enc_size,
self.swish, 2, "vector_obs_encoder",
False)
encoded_next_vector_obs = self.create_vector_observation_encoder(self.next_vector_in,
self.curiosity_enc_size,
self.swish, 2,
"vector_obs_encoder",
True)
encoded_state_list.append(encoded_vector_obs)
encoded_next_state_list.append(encoded_next_vector_obs)
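For the curiosity module, the current and next vector observations are now both passed through `create_vector_observation_encoder` under the same scope, with `reuse=True` on the second call so the two encodings share weights. A small self-contained sketch of that weight-sharing pattern; the layer sizes are illustrative stand-ins for `self.curiosity_enc_size` and `self.swish`:

```python
import tensorflow as tf

def encode(obs, scope, reuse):
    # reuse=True binds to the variables created by the reuse=False call
    # instead of creating a second set of weights.
    with tf.variable_scope(scope):
        hidden = tf.layers.dense(obs, 128, activation=tf.nn.elu,
                                 reuse=reuse, name="hidden_0")
        return tf.layers.dense(hidden, 128, activation=tf.nn.elu,
                               reuse=reuse, name="hidden_1")

vector_in = tf.placeholder(tf.float32, [None, 6], name="vector_observation")
next_vector_in = tf.placeholder(tf.float32, [None, 6], name="next_vector_observation")

encoded_state = encode(vector_in, "vector_obs_encoder", reuse=False)
encoded_next_state = encode(next_vector_in, "vector_obs_encoder", reuse=True)  # shared weights
```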

25
python/unitytrainers/ppo/trainer.py


self.cumulative_rewards = {}
self.episode_steps = {}
self.is_continuous_action = (env.brains[brain_name].vector_action_space_type == "continuous")
self.is_continuous_observation = (env.brains[brain_name].vector_observation_space_type == "continuous")
self.use_visual_obs = (env.brains[brain_name].number_visual_observations > 0)
self.use_vector_obs = (env.brains[brain_name].vector_observation_space_size > 0)
self.summary_path = trainer_parameters['summary_path']

self.inference_run_list.append(self.model.output_pre)
if self.use_recurrent:
self.inference_run_list.extend([self.model.memory_out])
if (self.is_training and self.is_continuous_observation and
self.use_vector_obs and self.trainer_parameters['normalize']):
if self.is_training and self.use_vector_obs and self.trainer_parameters['normalize']:
self.inference_run_list.extend([self.model.update_mean, self.model.update_variance])
def __str__(self):

if self.use_recurrent:
feed_dict[self.model.prev_action] = np.array(buffer['prev_action'][start:end]).flatten()
if self.use_vector_obs:
if self.is_continuous_observation:
total_observation_length = self.brain.vector_observation_space_size * \
self.brain.num_stacked_vector_observations
feed_dict[self.model.vector_in] = np.array(buffer['vector_obs'][start:end]).reshape(
[-1, total_observation_length])
if self.use_curiosity:
feed_dict[self.model.next_vector_in] = np.array(buffer['next_vector_in'][start:end]) \
.reshape([-1, total_observation_length])
else:
feed_dict[self.model.vector_in] = np.array(buffer['vector_obs'][start:end]).reshape(
[-1, self.brain.num_stacked_vector_observations])
if self.use_curiosity:
feed_dict[self.model.next_vector_in] = np.array(buffer['next_vector_in'][start:end]) \
.reshape([-1, self.brain.num_stacked_vector_observations])
total_observation_length = self.brain.vector_observation_space_size * \
self.brain.num_stacked_vector_observations
feed_dict[self.model.vector_in] = np.array(buffer['vector_obs'][start:end]).reshape(
[-1, total_observation_length])
if self.use_curiosity:
feed_dict[self.model.next_vector_in] = np.array(buffer['next_vector_in'][start:end]) \
.reshape([-1, total_observation_length])
if self.use_visual_obs:
for i, _ in enumerate(self.model.visual_in):
_obs = np.array(buffer['visual_obs%d' % i][start:end])
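With the discrete/continuous observation split removed, the trainer always flattens the buffered vector observations to `vector_observation_space_size * num_stacked_vector_observations` per agent-step. A small worked sketch of that reshape with illustrative sizes:

```python
# Worked sketch of the feed_dict reshape above: 3 observation values per step,
# stacked over 2 steps, gives 6 values per agent-step.
import numpy as np

vector_observation_space_size = 3
num_stacked_vector_observations = 2
total_observation_length = (vector_observation_space_size
                            * num_stacked_vector_observations)

# A mini-batch slice from the buffer: 4 agent-steps of stacked observations.
batch = np.arange(4 * total_observation_length, dtype=np.float32)
vector_obs = batch.reshape([-1, total_observation_length])
assert vector_obs.shape == (4, 6)
```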

21
python/unitytrainers/trainer_controller.py


from tensorflow.python.tools import freeze_graph
from unitytrainers.ppo.trainer import PPOTrainer
from unitytrainers.bc.trainer import BehavioralCloningTrainer
from unitytrainers import Curriculum
from unityagents import UnityEnvironment, UnityEnvironmentException

np.random.seed(self.seed)
tf.set_random_seed(self.seed)
self.env = UnityEnvironment(file_name=env_path, worker_id=self.worker_id,
curriculum=self.curriculum_file, seed=self.seed,
docker_training=self.docker_training,
seed=self.seed, docker_training=self.docker_training,
self.curriculum = Curriculum(curriculum_file, self.env._resetParameters)
if self.env.curriculum.measure_type == "progress":
if self.curriculum.measure_type == "progress":
elif self.env.curriculum.measure_type == "reward":
elif self.curriculum.measure_type == "reward":
for brain_name in self.env.external_brain_names:
progress += self.trainers[brain_name].get_last_reward
return progress

.format(model_path))
def start_learning(self):
self.env.curriculum.set_lesson_number(self.lesson)
self.curriculum.set_lesson_number(self.lesson)
trainer_config = self._load_config()
self._create_model_path(self.model_path)

else:
sess.run(init)
global_step = 0 # This is only for saving the model
self.env.curriculum.increment_lesson(self._get_progress())
curr_info = self.env.reset(train_mode=self.fast_simulation)
self.curriculum.increment_lesson(self._get_progress())
curr_info = self.env.reset(config=self.curriculum.get_config(), train_mode=self.fast_simulation)
if self.train_model:
for brain_name, trainer in self.trainers.items():
trainer.write_tensorboard_text('Hyperparameters', trainer.parameters)

self.env.curriculum.increment_lesson(self._get_progress())
curr_info = self.env.reset(train_mode=self.fast_simulation)
self.curriculum.increment_lesson(self._get_progress())
curr_info = self.env.reset(config=self.curriculum.get_config(), train_mode=self.fast_simulation)
for brain_name, trainer in self.trainers.items():
trainer.end_episode()
# Decide and take an action

# Perform gradient descent with experience buffer
trainer.update_model()
# Write training statistics to Tensorboard.
trainer.write_summary(self.env.curriculum.lesson_number)
trainer.write_summary(self.curriculum.lesson_number)
if self.train_model and trainer.get_step <= trainer.get_max_steps:
trainer.increment_step_and_update_last_reward()
if self.train_model:
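`TrainerController` now owns the curriculum directly rather than reaching through `self.env.curriculum`: each episode it advances the lesson from training progress and hands the resulting config to `env.reset`. A minimal sketch of that loop step; `env`, `curriculum`, `progress`, and `fast_simulation` stand in for the corresponding members of `TrainerController`:

```python
# Minimal sketch of the curriculum handling in the new training loop.
def reset_with_curriculum(env, curriculum, progress, fast_simulation):
    curriculum.increment_lesson(progress)             # maybe advance the lesson
    return env.reset(config=curriculum.get_config(),  # lesson-specific reset params
                     train_mode=fast_simulation)
```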

10
unity-environment/Assets/ML-Agents/Editor/BrainEditor.cs


EditorGUILayout.LabelField("Vector Observation");
EditorGUI.indentLevel++;
SerializedProperty bpVectorObsType =
serializedBrain.FindProperty("brainParameters.vectorObservationSpaceType");
EditorGUILayout.PropertyField(bpVectorObsType, new GUIContent("Space Type",
"Corresponds to whether state " +
"vector contains a single integer (Discrete) " +
"or a series of real-valued floats (Continuous)."));
SerializedProperty bpVectorObsSize =
serializedBrain.FindProperty("brainParameters.vectorObservationSize");
EditorGUILayout.PropertyField(bpVectorObsSize, new GUIContent("Space Size",

EditorGUI.indentLevel = indentLevel;
SerializedProperty bt = serializedBrain.FindProperty("brainType");
EditorGUILayout.PropertyField(bt);
if (bt.enumValueIndex < 0)
{

171
unity-environment/Assets/ML-Agents/Examples/Basic/Scenes/Basic.unity


--- !u!104 &2
RenderSettings:
m_ObjectHideFlags: 0
serializedVersion: 8
serializedVersion: 9
m_Fog: 0
m_FogColor: {r: 0.5, g: 0.5, b: 0.5, a: 1}
m_FogMode: 3

m_CustomReflection: {fileID: 0}
m_Sun: {fileID: 0}
m_IndirectSpecularColor: {r: 0, g: 0, b: 0, a: 1}
m_UseRadianceAmbientProbe: 0
--- !u!157 &3
LightmapSettings:
m_ObjectHideFlags: 0

m_EnableBakedLightmaps: 1
m_EnableRealtimeLightmaps: 1
m_LightmapEditorSettings:
serializedVersion: 9
serializedVersion: 10
m_TextureWidth: 1024
m_TextureHeight: 1024
m_AtlasSize: 1024
m_AO: 0
m_AOMaxDistance: 1
m_CompAOExponent: 1

m_PVRFilteringAtrousPositionSigmaDirect: 0.5
m_PVRFilteringAtrousPositionSigmaIndirect: 2
m_PVRFilteringAtrousPositionSigmaAO: 1
m_ShowResolutionOverlay: 1
m_LightingDataAsset: {fileID: 0}
m_UseShadowmask: 1
--- !u!196 &4

debug:
m_Flags: 0
m_NavMeshData: {fileID: 0}
--- !u!114 &35309571
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 35813a1be64e144f887d7d5f15b963fa, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
brain: {fileID: 846768605}
--- !u!1 &282272644
GameObject:
m_ObjectHideFlags: 0

- component: {fileID: 282272645}
- component: {fileID: 282272649}
m_Layer: 0
m_Name: Agent
m_Name: BasicAgent
m_TagString: Untagged
m_Icon: {fileID: 0}
m_NavMeshLayer: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 260483cdfc6b14e26823a02f23bd8baa, type: 2}
m_StaticBatchInfo:

onDemandDecision: 1
numberOfActionsBetweenDecisions: 1
timeBetweenDecisionsAtInference: 0.15
position: 0
smallGoalPosition: -3
largeGoalPosition: 7
minPosition: -10
maxPosition: 10
--- !u!114 &339558607
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 8b23992c8eb17439887f5e944bf04a40, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
graphModel: {fileID: 4900000, guid: 8786b6500d406497c959f24c2a8b59ac, type: 3}
graphScope:
graphPlaceholders: []
BatchSizePlaceholderName: batch_size
VectorObservationPlacholderName: vector_observation
RecurrentInPlaceholderName: recurrent_in
RecurrentOutPlaceholderName: recurrent_out
VisualObservationPlaceholderName: []
ActionPlaceholderName: action
PreviousActionPlaceholderName: prev_action
brain: {fileID: 846768605}
--- !u!114 &703554261
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 943466ab374444748a364f9d6c3e2fe2, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
brain: {fileID: 0}
--- !u!1 &762086410
GameObject:
m_ObjectHideFlags: 0

-
-
vectorActionSpaceType: 0
vectorObservationSpaceType: 0
- {fileID: 1458832067}
- {fileID: 1183791066}
- {fileID: 1066285776}
- {fileID: 977008778}
instanceID: 21298
--- !u!114 &977008778
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 8b23992c8eb17439887f5e944bf04a40, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
graphModel: {fileID: 4900000, guid: 8786b6500d406497c959f24c2a8b59ac, type: 3}
graphScope:
graphPlaceholders: []
BatchSizePlaceholderName: batch_size
VectorObservationPlacholderName: vector_observation
RecurrentInPlaceholderName: recurrent_in
RecurrentOutPlaceholderName: recurrent_out
VisualObservationPlaceholderName: []
ActionPlaceholderName: action
PreviousActionPlaceholderName: prev_action
brain: {fileID: 846768605}
- {fileID: 1962229171}
- {fileID: 703554261}
- {fileID: 35309571}
- {fileID: 339558607}
instanceID: 14244
--- !u!1 &984725368
GameObject:
m_ObjectHideFlags: 0

- component: {fileID: 984725370}
- component: {fileID: 984725369}
m_Layer: 0
m_Name: largeGoal
m_Name: LargeGoal
m_TagString: Untagged
m_Icon: {fileID: 0}
m_NavMeshLayer: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 624b24bbec31f44babfb57ef2dfbc537, type: 2}
m_StaticBatchInfo:

m_Father: {fileID: 0}
m_RootOrder: 4
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!114 &1066285776
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 35813a1be64e144f887d7d5f15b963fa, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
brain: {fileID: 846768605}
--- !u!1 &1178588871
GameObject:
m_ObjectHideFlags: 0

- component: {fileID: 1178588873}
- component: {fileID: 1178588872}
m_Layer: 0
m_Name: smallGoal
m_Name: SmallGoal
m_TagString: Untagged
m_Icon: {fileID: 0}
m_NavMeshLayer: 0

m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 624b24bbec31f44babfb57ef2dfbc537, type: 2}
m_StaticBatchInfo:

m_Father: {fileID: 0}
m_RootOrder: 5
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!114 &1183791066
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 943466ab374444748a364f9d6c3e2fe2, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
brain: {fileID: 0}
--- !u!114 &1458832067
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 41e9bda8f3cf1492fa74926a530f6f70, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
continuousPlayerActions: []
discretePlayerActions:
- key: 97
value: 0
- key: 100
value: 1
defaultAction: -1
brain: {fileID: 846768605}
--- !u!1 &1574236047
GameObject:
m_ObjectHideFlags: 0

m_Father: {fileID: 0}
m_RootOrder: 0
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!114 &1962229171
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 41e9bda8f3cf1492fa74926a530f6f70, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
keyContinuousPlayerActions: []
axisContinuousPlayerActions: []
discretePlayerActions:
- key: 97
value: 0
- key: 100
value: 1
defaultAction: -1
brain: {fileID: 846768605}

28
unity-environment/Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs


private BasicAcademy academy;
public float timeBetweenDecisionsAtInference;
private float timeSinceDecision;
public int position;
public int smallGoalPosition;
public int largeGoalPosition;
int position;
int smallGoalPosition;
int largeGoalPosition;
public int minPosition;
public int maxPosition;
int minPosition;
int maxPosition;
public override void InitializeAgent()
{

public override void CollectObservations()
{
AddVectorObs(position);
AddVectorObs(position, 20);
}
public override void AgentAction(float[] vectorAction, string textAction)

if (position < minPosition) { position = minPosition; }
if (position > maxPosition) { position = maxPosition; }
gameObject.transform.position = new Vector3(position, 0f, 0f);
gameObject.transform.position = new Vector3(position - 10f, 0f, 0f);
AddReward(-0.01f);

public override void AgentReset()
{
position = 0;
minPosition = -10;
maxPosition = 10;
smallGoalPosition = -3;
largeGoalPosition = 7;
smallGoal.transform.position = new Vector3(smallGoalPosition, 0f, 0f);
largeGoal.transform.position = new Vector3(largeGoalPosition, 0f, 0f);
position = 10;
minPosition = 0;
maxPosition = 20;
smallGoalPosition = 7;
largeGoalPosition = 17;
smallGoal.transform.position = new Vector3(smallGoalPosition - 10f, 0f, 0f);
largeGoal.transform.position = new Vector3(largeGoalPosition - 10f, 0f, 0f);
}
public override void AgentOnDone()

165
unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes


(Binary hunk omitted: Basic.bytes is a serialized TensorFlow graph and its byte content does not render meaningfully as text. The legible fragments show graph nodes such as vector_observation, OneHotEncoding/one_hot, main_graph_0/hidden_0, the dense layers, action_probs, multinomial/Multinomial, and value_estimate.)

2
unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes.meta


fileFormatVersion: 2
guid: 8786b6500d406497c959f24c2a8b59ac
timeCreated: 1523662030
licenseType: Free
TextScriptImporter:
externalObjects: {}
userData:

78
unity-environment/Assets/ML-Agents/Scripts/Agent.cs


action.textActions = "";
info.memories = new List<float>();
action.memories = new List<float>();
if (param.vectorObservationSpaceType == SpaceType.continuous)
{
info.vectorObservation =
new List<float>(param.vectorObservationSize);
info.stackedVectorObservation =
new List<float>(param.vectorObservationSize
* brain.brainParameters.numStackedVectorObservations);
info.stackedVectorObservation.AddRange(
new float[param.vectorObservationSize
* param.numStackedVectorObservations]);
}
else
{
info.vectorObservation = new List<float>(1);
info.stackedVectorObservation =
new List<float>(param.numStackedVectorObservations);
info.stackedVectorObservation.AddRange(
new float[param.numStackedVectorObservations]);
}
info.vectorObservation =
new List<float>(param.vectorObservationSize);
info.stackedVectorObservation =
new List<float>(param.vectorObservationSize
* brain.brainParameters.numStackedVectorObservations);
info.stackedVectorObservation.AddRange(
new float[param.vectorObservationSize
* param.numStackedVectorObservations]);
info.visualObservations = new List<Texture2D>();
}

CollectObservations();
BrainParameters param = brain.brainParameters;
if (param.vectorObservationSpaceType == SpaceType.continuous)
if (info.vectorObservation.Count != param.vectorObservationSize)
if (info.vectorObservation.Count != param.vectorObservationSize)
{
throw new UnityAgentsException(string.Format(
"Vector Observation size mismatch between continuous " +
"agent {0} and brain {1}. " +
"Was Expecting {2} but received {3}. ",
gameObject.name, brain.gameObject.name,
brain.brainParameters.vectorObservationSize,
info.vectorObservation.Count));
}
info.stackedVectorObservation.RemoveRange(
0, param.vectorObservationSize);
info.stackedVectorObservation.AddRange(info.vectorObservation);
throw new UnityAgentsException(string.Format(
"Vector Observation size mismatch between continuous " +
"agent {0} and brain {1}. " +
"Was Expecting {2} but received {3}. ",
gameObject.name, brain.gameObject.name,
brain.brainParameters.vectorObservationSize,
info.vectorObservation.Count));
else
{
if (info.vectorObservation.Count != 1)
{
throw new UnityAgentsException(string.Format(
"Vector Observation size mismatch between discrete agent" +
" {0} and brain {1}. Was Expecting {2} but received {3}. ",
gameObject.name, brain.gameObject.name,
1, info.vectorObservation.Count));
}
info.stackedVectorObservation.RemoveRange(0, 1);
info.stackedVectorObservation.AddRange(info.vectorObservation);
}
info.stackedVectorObservation.RemoveRange(
0, param.vectorObservationSize);
info.stackedVectorObservation.AddRange(info.vectorObservation);
info.visualObservations.Clear();
if (param.cameraResolutions.Length > agentParameters.agentCameras.Count)

/// - <see cref="AddVectorObs(float[])"/>
/// - <see cref="AddVectorObs(List{float})"/>
/// - <see cref="AddVectorObs(Quaternion)"/>
/// - <see cref="AddVectorObs(bool)"/>
/// - <see cref="AddVectorObs(int, int)"/>
/// Depending on your environment, any combination of these helpers can
/// be used. They just need to be used in the exact same order each time
/// this method is called and the resulting size of the vector observation

/// <param name="observation">Observation.</param>
protected void AddVectorObs(int observation)
{
info.vectorObservation.Add((float) observation);
info.vectorObservation.Add(observation);
}
/// <summary>

protected void AddVectorObs(bool observation)
{
info.vectorObservation.Add(observation ? 1f : 0f);
}
protected void AddVectorObs(int observation, int range)
{
float[] oneHotVector = new float[range];
oneHotVector[observation] = 1;
info.vectorObservation.AddRange(oneHotVector);
}
/// <summary>

/// The agent must set maxStepReached.</param>
/// <param name="academyDone">If set to <c>true</c>
/// The agent must set done.</param>
/// <param name="academyStepCounter">Number of current steps in episode</param>
void SetStatus(bool academyMaxStep, bool academyDone, int academyStepCounter)
{
if (academyDone)
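The new `AddVectorObs(int observation, int range)` overload writes a one-hot vector of length `range`, which is what BasicAgent now calls earlier in this diff (`AddVectorObs(position, 20)`) instead of adding the raw position as a single float. The agent code itself is C#; the Python sketch below is only a conceptual illustration of the same encoding:

```python
# One-hot encoding equivalent to AddVectorObs(position, 20): a length-20
# vector with a single 1.0 at index `position` (BasicAgent now keeps the
# position in a non-negative range so it can serve as an index).
import numpy as np

def one_hot_observation(observation: int, range_: int) -> np.ndarray:
    vec = np.zeros(range_, dtype=np.float32)
    vec[observation] = 1.0
    return vec

print(one_hot_observation(7, 20))  # 1.0 at index 7, zeros elsewhere
```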

2
unity-environment/Assets/ML-Agents/Scripts/Batcher.cs


VectorActionSize = brainParameters.vectorActionSize,
VectorActionSpaceType =
(CommunicatorObjects.SpaceTypeProto)brainParameters.vectorActionSpaceType,
VectorObservationSpaceType =
(CommunicatorObjects.SpaceTypeProto)brainParameters.vectorObservationSpaceType,
BrainName = name,
BrainType = type
};

22
unity-environment/Assets/ML-Agents/Scripts/Brain.cs


public SpaceType vectorActionSpaceType = SpaceType.discrete;
/**< \brief Defines if the action is discrete or continuous */
public SpaceType vectorObservationSpaceType = SpaceType.continuous;
/**< \brief Defines if the state is discrete or continuous */
}
[HelpURL("https://github.com/Unity-Technologies/ml-agents/blob/master/" +

*/
public class Brain : MonoBehaviour
{
private bool isInitialized = false;
private bool isInitialized;
private Dictionary<Agent, AgentInfo> agentInfos =
new Dictionary<Agent, AgentInfo>(1024);

public BrainType brainType;
//[HideInInspector]
///**< \brief Keeps track of the agents which subscribe to this brain*/
/// Keeps track of the agents which subscribe to this brain*/
// public Dictionary<int, Agent> agents = new Dictionary<int, Agent>();
[SerializeField] ScriptableObject[] CoreBrains;

{
CoreBrains[(int) bt] =
ScriptableObject.CreateInstance(
"CoreBrain" + bt.ToString());
"CoreBrain" + bt);
CoreBrains[(int) bt] =
ScriptableObject.Instantiate(CoreBrains[(int) bt]);
CoreBrains[(int) bt] = Instantiate(CoreBrains[(int) bt]);
}
}

if (!gameObject.activeSelf)
{
throw new UnityAgentsException(
string.Format("Agent {0} tried to request an action " +
"from brain {1} but it is not active.",
agent.gameObject.name, gameObject.name));
$"Agent {agent.gameObject.name} tried to request an action " +
$"from brain {gameObject.name} but it is not active.");
string.Format("Agent {0} tried to request an action " +
"from brain {1} but it was not initialized.",
agent.gameObject.name, gameObject.name));
$"Agent {agent.gameObject.name} tried to request an action " +
$"from brain {gameObject.name} but it was not initialized.");
}
else
{

52
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/BrainParametersProto.cs


"LnByb3RvEhRjb21tdW5pY2F0b3Jfb2JqZWN0cxorY29tbXVuaWNhdG9yX29i",
"amVjdHMvcmVzb2x1dGlvbl9wcm90by5wcm90bxorY29tbXVuaWNhdG9yX29i",
"amVjdHMvYnJhaW5fdHlwZV9wcm90by5wcm90bxorY29tbXVuaWNhdG9yX29i",
"amVjdHMvc3BhY2VfdHlwZV9wcm90by5wcm90byLGAwoUQnJhaW5QYXJhbWV0",
"amVjdHMvc3BhY2VfdHlwZV9wcm90by5wcm90byL5AgoUQnJhaW5QYXJhbWV0",
"ZXJzUHJvdG8SHwoXdmVjdG9yX29ic2VydmF0aW9uX3NpemUYASABKAUSJwof",
"bnVtX3N0YWNrZWRfdmVjdG9yX29ic2VydmF0aW9ucxgCIAEoBRIaChJ2ZWN0",
"b3JfYWN0aW9uX3NpemUYAyABKAUSQQoSY2FtZXJhX3Jlc29sdXRpb25zGAQg",

"LlNwYWNlVHlwZVByb3RvEksKHXZlY3Rvcl9vYnNlcnZhdGlvbl9zcGFjZV90",
"eXBlGAcgASgOMiQuY29tbXVuaWNhdG9yX29iamVjdHMuU3BhY2VUeXBlUHJv",
"dG8SEgoKYnJhaW5fbmFtZRgIIAEoCRI4CgpicmFpbl90eXBlGAkgASgOMiQu",
"Y29tbXVuaWNhdG9yX29iamVjdHMuQnJhaW5UeXBlUHJvdG9CH6oCHE1MQWdl",
"bnRzLkNvbW11bmljYXRvck9iamVjdHNiBnByb3RvMw=="));
"LlNwYWNlVHlwZVByb3RvEhIKCmJyYWluX25hbWUYByABKAkSOAoKYnJhaW5f",
"dHlwZRgIIAEoDjIkLmNvbW11bmljYXRvcl9vYmplY3RzLkJyYWluVHlwZVBy",
"b3RvQh+qAhxNTEFnZW50cy5Db21tdW5pY2F0b3JPYmplY3RzYgZwcm90bzM="));
new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.BrainParametersProto), global::MLAgents.CommunicatorObjects.BrainParametersProto.Parser, new[]{ "VectorObservationSize", "NumStackedVectorObservations", "VectorActionSize", "CameraResolutions", "VectorActionDescriptions", "VectorActionSpaceType", "VectorObservationSpaceType", "BrainName", "BrainType" }, null, null, null)
new pbr::GeneratedClrTypeInfo(typeof(global::MLAgents.CommunicatorObjects.BrainParametersProto), global::MLAgents.CommunicatorObjects.BrainParametersProto.Parser, new[]{ "VectorObservationSize", "NumStackedVectorObservations", "VectorActionSize", "CameraResolutions", "VectorActionDescriptions", "VectorActionSpaceType", "BrainName", "BrainType" }, null, null, null)
}));
}
#endregion

cameraResolutions_ = other.cameraResolutions_.Clone();
vectorActionDescriptions_ = other.vectorActionDescriptions_.Clone();
vectorActionSpaceType_ = other.vectorActionSpaceType_;
vectorObservationSpaceType_ = other.vectorObservationSpaceType_;
brainName_ = other.brainName_;
brainType_ = other.brainType_;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);

}
}
/// <summary>Field number for the "vector_observation_space_type" field.</summary>
public const int VectorObservationSpaceTypeFieldNumber = 7;
private global::MLAgents.CommunicatorObjects.SpaceTypeProto vectorObservationSpaceType_ = 0;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public global::MLAgents.CommunicatorObjects.SpaceTypeProto VectorObservationSpaceType {
get { return vectorObservationSpaceType_; }
set {
vectorObservationSpaceType_ = value;
}
}
public const int BrainNameFieldNumber = 8;
public const int BrainNameFieldNumber = 7;
private string brainName_ = "";
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public string BrainName {

}
/// <summary>Field number for the "brain_type" field.</summary>
public const int BrainTypeFieldNumber = 9;
public const int BrainTypeFieldNumber = 8;
private global::MLAgents.CommunicatorObjects.BrainTypeProto brainType_ = 0;
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public global::MLAgents.CommunicatorObjects.BrainTypeProto BrainType {

if(!cameraResolutions_.Equals(other.cameraResolutions_)) return false;
if(!vectorActionDescriptions_.Equals(other.vectorActionDescriptions_)) return false;
if (VectorActionSpaceType != other.VectorActionSpaceType) return false;
if (VectorObservationSpaceType != other.VectorObservationSpaceType) return false;
if (BrainName != other.BrainName) return false;
if (BrainType != other.BrainType) return false;
return Equals(_unknownFields, other._unknownFields);

hash ^= cameraResolutions_.GetHashCode();
hash ^= vectorActionDescriptions_.GetHashCode();
if (VectorActionSpaceType != 0) hash ^= VectorActionSpaceType.GetHashCode();
if (VectorObservationSpaceType != 0) hash ^= VectorObservationSpaceType.GetHashCode();
if (BrainName.Length != 0) hash ^= BrainName.GetHashCode();
if (BrainType != 0) hash ^= BrainType.GetHashCode();
if (_unknownFields != null) {

output.WriteRawTag(48);
output.WriteEnum((int) VectorActionSpaceType);
}
if (VectorObservationSpaceType != 0) {
output.WriteRawTag(56);
output.WriteEnum((int) VectorObservationSpaceType);
}
output.WriteRawTag(66);
output.WriteRawTag(58);
output.WriteRawTag(72);
output.WriteRawTag(64);
output.WriteEnum((int) BrainType);
}
if (_unknownFields != null) {

if (VectorActionSpaceType != 0) {
size += 1 + pb::CodedOutputStream.ComputeEnumSize((int) VectorActionSpaceType);
}
if (VectorObservationSpaceType != 0) {
size += 1 + pb::CodedOutputStream.ComputeEnumSize((int) VectorObservationSpaceType);
}
if (BrainName.Length != 0) {
size += 1 + pb::CodedOutputStream.ComputeStringSize(BrainName);
}

vectorActionDescriptions_.Add(other.vectorActionDescriptions_);
if (other.VectorActionSpaceType != 0) {
VectorActionSpaceType = other.VectorActionSpaceType;
}
if (other.VectorObservationSpaceType != 0) {
VectorObservationSpaceType = other.VectorObservationSpaceType;
}
if (other.BrainName.Length != 0) {
BrainName = other.BrainName;

vectorActionSpaceType_ = (global::MLAgents.CommunicatorObjects.SpaceTypeProto) input.ReadEnum();
break;
}
case 56: {
vectorObservationSpaceType_ = (global::MLAgents.CommunicatorObjects.SpaceTypeProto) input.ReadEnum();
break;
}
case 66: {
case 58: {
case 72: {
case 64: {
brainType_ = (global::MLAgents.CommunicatorObjects.BrainTypeProto) input.ReadEnum();
break;
}

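The raw-tag changes in this hunk follow directly from the protobuf wire format: a tag byte is (field_number << 3) | wire_type. Dropping the vector_observation_space_type field shifts brain_name from field 8 to 7 (length-delimited tag 66 -> 58) and brain_type from field 9 to 8 (varint tag 72 -> 64). A stand-alone check of those values (plain C#, not part of the generated file):

using System;

static class ProtoTagSketch
{
    // Wire types from the protobuf encoding specification.
    const int Varint = 0;          // enums such as brain_type
    const int LengthDelimited = 2; // strings such as brain_name

    // A protobuf tag packs the field number and the wire type into one varint.
    static int Tag(int fieldNumber, int wireType) => (fieldNumber << 3) | wireType;

    static void Main()
    {
        Console.WriteLine(Tag(8, LengthDelimited)); // 66: old brain_name tag
        Console.WriteLine(Tag(7, LengthDelimited)); // 58: new brain_name tag
        Console.WriteLine(Tag(9, Varint));          // 72: old brain_type tag
        Console.WriteLine(Tag(8, Varint));          // 64: new brain_type tag
    }
}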
4
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityInput.cs


[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public UnityInput(UnityInput other) : this() {
RlInput = other.rlInput_ != null ? other.RlInput.Clone() : null;
RlInitializationInput = other.rlInitializationInput_ != null ? other.RlInitializationInput.Clone() : null;
rlInput_ = other.rlInput_ != null ? other.rlInput_.Clone() : null;
rlInitializationInput_ = other.rlInitializationInput_ != null ? other.rlInitializationInput_.Clone() : null;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}

6
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityMessage.cs


[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public UnityMessage(UnityMessage other) : this() {
Header = other.header_ != null ? other.Header.Clone() : null;
UnityOutput = other.unityOutput_ != null ? other.UnityOutput.Clone() : null;
UnityInput = other.unityInput_ != null ? other.UnityInput.Clone() : null;
header_ = other.header_ != null ? other.header_.Clone() : null;
unityOutput_ = other.unityOutput_ != null ? other.unityOutput_.Clone() : null;
unityInput_ = other.unityInput_ != null ? other.unityInput_.Clone() : null;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}

4
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityOutput.cs


[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public UnityOutput(UnityOutput other) : this() {
RlOutput = other.rlOutput_ != null ? other.RlOutput.Clone() : null;
RlInitializationOutput = other.rlInitializationOutput_ != null ? other.RlInitializationOutput.Clone() : null;
rlOutput_ = other.rlOutput_ != null ? other.rlOutput_.Clone() : null;
rlInitializationOutput_ = other.rlInitializationOutput_ != null ? other.rlInitializationOutput_.Clone() : null;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}

2
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInitializationOutput.cs


version_ = other.version_;
logPath_ = other.logPath_;
brainParameters_ = other.brainParameters_.Clone();
EnvironmentParameters = other.environmentParameters_ != null ? other.EnvironmentParameters.Clone() : null;
environmentParameters_ = other.environmentParameters_ != null ? other.environmentParameters_.Clone() : null;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);
}

2
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityRlInput.cs


[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public UnityRLInput(UnityRLInput other) : this() {
agentActions_ = other.agentActions_.Clone();
EnvironmentParameters = other.environmentParameters_ != null ? other.EnvironmentParameters.Clone() : null;
environmentParameters_ = other.environmentParameters_ != null ? other.environmentParameters_.Clone() : null;
isTraining_ = other.isTraining_;
command_ = other.command_;
_unknownFields = pb::UnknownFieldSet.Clone(other._unknownFields);

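The clone constructors in these generated message classes all switch to the same pattern: nested messages are deep-copied straight into the backing fields, with null sources left as null. A minimal sketch of that pattern using hypothetical message types (RlInputSketch and UnityInputSketch are illustrative, not the generated classes):

using System;

// Hypothetical nested message with a Clone method.
class RlInputSketch
{
    public int Step;
    public RlInputSketch Clone() => new RlInputSketch { Step = Step };
}

// Hypothetical container message mirroring the copy-constructor shape above.
class UnityInputSketch
{
    private RlInputSketch rlInput_;

    public RlInputSketch RlInput => rlInput_;

    public UnityInputSketch() { }

    // Deep-copies the nested message into the backing field; a null source
    // field stays null, the same shape as the generated constructors above.
    public UnityInputSketch(UnityInputSketch other) : this()
    {
        rlInput_ = other.rlInput_ != null ? other.rlInput_.Clone() : null;
    }
}

static class CopyConstructorDemo
{
    static void Main()
    {
        var original = new UnityInputSketch();
        var copy = new UnityInputSketch(original);
        Console.WriteLine(copy.RlInput == null); // True: nothing to clone
    }
}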
6
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternal.cs


// Generated by the protocol buffer compiler. DO NOT EDIT!
// source: communicator_objects/unity_to_external.proto
// <auto-generated>
// Generated by the protocol buffer compiler. DO NOT EDIT!
// source: communicator_objects/unity_to_external.proto
// </auto-generated>
#pragma warning disable 1591, 0612, 3021
#region Designer generated code

7
unity-environment/Assets/ML-Agents/Scripts/CommunicatorObjects/UnityToExternalGrpc.cs


#pragma warning disable 1591
#region Designer generated code
using System;
using System.Threading;
using System.Threading.Tasks;
using grpc = global::Grpc.Core;
namespace MLAgents.CommunicatorObjects {

/// <param name="deadline">An optional deadline for the call. The call will be cancelled if deadline is hit.</param>
/// <param name="cancellationToken">An optional token for canceling the call.</param>
/// <returns>The response received from the server.</returns>
public virtual global::MLAgents.CommunicatorObjects.UnityMessage Exchange(global::MLAgents.CommunicatorObjects.UnityMessage request, grpc::Metadata headers = null, DateTime? deadline = null, CancellationToken cancellationToken = default(CancellationToken))
public virtual global::MLAgents.CommunicatorObjects.UnityMessage Exchange(global::MLAgents.CommunicatorObjects.UnityMessage request, grpc::Metadata headers = null, global::System.DateTime? deadline = null, global::System.Threading.CancellationToken cancellationToken = default(global::System.Threading.CancellationToken))
{
return Exchange(request, new grpc::CallOptions(headers, deadline, cancellationToken));
}

/// <param name="deadline">An optional deadline for the call. The call will be cancelled if deadline is hit.</param>
/// <param name="cancellationToken">An optional token for canceling the call.</param>
/// <returns>The call object.</returns>
public virtual grpc::AsyncUnaryCall<global::MLAgents.CommunicatorObjects.UnityMessage> ExchangeAsync(global::MLAgents.CommunicatorObjects.UnityMessage request, grpc::Metadata headers = null, DateTime? deadline = null, CancellationToken cancellationToken = default(CancellationToken))
public virtual grpc::AsyncUnaryCall<global::MLAgents.CommunicatorObjects.UnityMessage> ExchangeAsync(global::MLAgents.CommunicatorObjects.UnityMessage request, grpc::Metadata headers = null, global::System.DateTime? deadline = null, global::System.Threading.CancellationToken cancellationToken = default(global::System.Threading.CancellationToken))
{
return ExchangeAsync(request, new grpc::CallOptions(headers, deadline, cancellationToken));
}

2
unity-environment/Assets/ML-Agents/Scripts/CoreBrain.cs


void SetBrain(Brain b);
/// Implement this method to initialize CoreBrain
void InitializeCoreBrain(MLAgents.Batcher brainBatcher);
void InitializeCoreBrain(Batcher brainBatcher);
/// Implement this method to define the logic for deciding actions
void DecideAction(Dictionary<Agent, AgentInfo> agentInfo);

13
unity-environment/Assets/ML-Agents/Scripts/CoreBrainExternal.cs


/**< Reference to the brain that uses this CoreBrainExternal */
public Brain brain;
MLAgents.Batcher brainBatcher;
Batcher brainBatcher;
/// Creates the reference to the brain
public void SetBrain(Brain b)

/// Generates the communicator for the Academy if none was present and
/// subscribe to ExternalCommunicator if it was present.
public void InitializeCoreBrain(MLAgents.Batcher brainBatcher)
public void InitializeCoreBrain(Batcher brainBatcher)
throw new UnityAgentsException(string.Format("The brain {0} was set to" +
" External mode" +
" but Unity was unable to read the" +
" arguments passed at launch.",
brain.gameObject.name));
throw new UnityAgentsException($"The brain {brain.gameObject.name} was set to" + " External mode" +
" but Unity was unable to read the" + " arguments passed at launch.");
}
else
{

{
brainBatcher.SendBrainInfo(brain.gameObject.name, agentInfo);
}
return;
}
/// Nothing needs to appear in the inspector

9
unity-environment/Assets/ML-Agents/Scripts/CoreBrainHeuristic.cs


/**< Reference to the brain that uses this CoreBrainHeuristic */
public Brain brain;
MLAgents.Batcher brainBatcher;
Batcher brainBatcher;
/**< Reference to the Decision component used to decide the actions */
public Decision decision;

}
/// Create the reference to decision
public void InitializeCoreBrain(MLAgents.Batcher brainBatcher)
public void InitializeCoreBrain(Batcher brainBatcher)
{
decision = brain.gameObject.GetComponent<Decision>();

/// Uses the Decision Component to decide the action to take
public void DecideAction(Dictionary<Agent, AgentInfo> agentInfo)
{
if (brainBatcher != null)
{
brainBatcher.SendBrainInfo(brain.gameObject.name, agentInfo);
}
brainBatcher?.SendBrainInfo(brain.gameObject.name, agentInfo);
if (decision == null)
{

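The DecideAction change above replaces an explicit null check on brainBatcher with the C# 6 null-conditional operator. A small self-contained sketch of the two equivalent forms (BatcherSketch and the dictionary payload are illustrative stand-ins for MLAgents.Batcher and the AgentInfo dictionary):

using System;
using System.Collections.Generic;

// Illustrative stand-in for the batcher; only the call pattern matters here.
class BatcherSketch
{
    public void SendBrainInfo(string brainName, Dictionary<string, float> agentInfo) =>
        Console.WriteLine($"Sent {agentInfo.Count} agent entries for {brainName}");
}

static class NullConditionalSketch
{
    // May legitimately be null, e.g. when broadcasting is disabled.
    static BatcherSketch brainBatcher;

    static void DecideAction(Dictionary<string, float> agentInfo)
    {
        // Old form: explicit null check before the call.
        if (brainBatcher != null)
        {
            brainBatcher.SendBrainInfo("ExampleBrain", agentInfo);
        }

        // New form: the null-conditional operator skips the call when null.
        // (Both forms appear here only for side-by-side comparison.)
        brainBatcher?.SendBrainInfo("ExampleBrain", agentInfo);
    }

    static void Main() => DecideAction(new Dictionary<string, float>());
}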
649
unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs


using System.Collections;
using System.Collections.Generic;
using UnityEngine;
#endif
namespace MLAgents

{
[SerializeField]
[Tooltip("If checked, the brain will broadcast states and actions to Python.")]
[SerializeField] [Tooltip("If checked, the brain will broadcast states and actions to Python.")]
#pragma warning disable
private bool broadcast = true;
#pragma warning restore

{
public enum tensorType
public enum TensorType
{
Integer,
FloatingPoint

public tensorType valueType;
public TensorType valueType;
MLAgents.Batcher brainBatcher;
Batcher brainBatcher;
[Tooltip("This must be the bytes file corresponding to the pretrained TensorFlow graph.")]
/// Modify only in inspector : Reference to the Graph asset

/// Modify only in inspector : Name of the previous action node
public string PreviousActionPlaceholderName = "prev_action";
#if ENABLE_TENSORFLOW
TFGraph graph;
TFSession session;
bool hasRecurrent;
bool hasState;
bool hasBatchSize;
bool hasPrevAction;
float[,] inputState;
int[] inputPrevAction;
List<float[,,,]> observationMatrixList;
float[,] inputOldMemories;
List<Texture2D> texturesHolder;
int memorySize;
TFGraph graph;
TFSession session;
bool hasRecurrent;
bool hasState;
bool hasBatchSize;
bool hasPrevAction;
float[,] inputState;
int[] inputPrevAction;
List<float[,,,]> observationMatrixList;
float[,] inputOldMemories;
List<Texture2D> texturesHolder;
int memorySize;
#endif
/// Reference to the brain that uses this CoreBrainInternal

{
#if ENABLE_TENSORFLOW
#if UNITY_ANDROID
// This needs to be called only once and will raise an exception if
// there are multiple internal brains
// This needs to be called only once and will raise an exception if
// there are multiple internal brains
try{
TensorFlowSharp.Android.NativeBinding.Init();
}

#endif
if ((brainBatcher == null)
|| (!broadcast))
{
this.brainBatcher = null;
}
else
{
this.brainBatcher = brainBatcher;
this.brainBatcher.SubscribeBrain(brain.gameObject.name);
}
if ((brainBatcher == null)
|| (!broadcast))
{
this.brainBatcher = null;
}
else
{
this.brainBatcher = brainBatcher;
this.brainBatcher.SubscribeBrain(brain.gameObject.name);
}
if (graphModel != null)
{
graph = new TFGraph();
graph.Import(graphModel.bytes);
session = new TFSession(graph);
if (graphModel != null)
{
// TODO: Make this a loop over a dynamic set of graph inputs
graph = new TFGraph();
if ((graphScope.Length > 1) && (graphScope[graphScope.Length - 1] != '/'))
{
graphScope = graphScope + '/';
}
graph.Import(graphModel.bytes);
if (graph[graphScope + BatchSizePlaceholderName] != null)
{
hasBatchSize = true;
}
session = new TFSession(graph);
if ((graph[graphScope + RecurrentInPlaceholderName] != null) &&
(graph[graphScope + RecurrentOutPlaceholderName] != null))
{
hasRecurrent = true;
var runner = session.GetRunner();
runner.Fetch(graph[graphScope + "memory_size"][0]);
var networkOutput = runner.Run()[0].GetValue();
memorySize = (int) networkOutput;
}
// TODO: Make this a loop over a dynamic set of graph inputs
if (graph[graphScope + VectorObservationPlacholderName] != null)
{
hasState = true;
}
if ((graphScope.Length > 1) && (graphScope[graphScope.Length - 1] != '/'))
{
graphScope = graphScope + '/';
if (graph[graphScope + PreviousActionPlaceholderName] != null)
{
hasPrevAction = true;
}
if (graph[graphScope + BatchSizePlaceholderName] != null)
{
hasBatchSize = true;
}
if ((graph[graphScope + RecurrentInPlaceholderName] != null) && (graph[graphScope + RecurrentOutPlaceholderName] != null))
{
hasRecurrent = true;
var runner = session.GetRunner();
runner.Fetch(graph[graphScope + "memory_size"][0]);
var networkOutput = runner.Run()[0].GetValue();
memorySize = (int)networkOutput;
}
if (graph[graphScope + VectorObservationPlacholderName] != null)
{
hasState = true;
}
if (graph[graphScope + PreviousActionPlaceholderName] != null)
{
hasPrevAction = true;
}
}
observationMatrixList = new List<float[,,,]>();
texturesHolder = new List<Texture2D>();
observationMatrixList = new List<float[,,,]>();
texturesHolder = new List<Texture2D>();
/// Uses the stored information to run the tensorflow graph and generate

#if ENABLE_TENSORFLOW
if (brainBatcher != null)
{
brainBatcher.SendBrainInfo(brain.gameObject.name, agentInfo);
}
int currentBatchSize = agentInfo.Count();
List<Agent> agentList = agentInfo.Keys.ToList();
if (currentBatchSize == 0)
{
return;
}
if (brainBatcher != null)
{
brainBatcher.SendBrainInfo(brain.gameObject.name, agentInfo);
}
int currentBatchSize = agentInfo.Count();
List<Agent> agentList = agentInfo.Keys.ToList();
if (currentBatchSize == 0)
{
return;
}
// Create the state tensor
if (hasState)
{
int stateLength = 1;
if (brain.brainParameters.vectorObservationSpaceType == SpaceType.continuous)
// Create the state tensor
if (hasState)
int stateLength = 1;
}
inputState =
new float[currentBatchSize, stateLength * brain.brainParameters.numStackedVectorObservations];
inputState =
new float[currentBatchSize, stateLength * brain.brainParameters.numStackedVectorObservations];
var i = 0;
foreach (Agent agent in agentList)
{
List<float> state_list = agentInfo[agent].stackedVectorObservation;
for (int j =
0; j < stateLength * brain.brainParameters.numStackedVectorObservations; j++)
var i = 0;
foreach (Agent agent in agentList)
inputState[i, j] = state_list[j];
List<float> stateList = agentInfo[agent].stackedVectorObservation;
for (int j =
0;
j < stateLength * brain.brainParameters.numStackedVectorObservations;
j++)
{
inputState[i, j] = stateList[j];
}
i++;
i++;
}
// Create the state tensor
if (hasPrevAction)
{
inputPrevAction = new int[currentBatchSize];
var i = 0;
foreach (Agent agent in agentList)
// Create the state tensor
if (hasPrevAction)
float[] action_list = agentInfo[agent].storedVectorActions;
inputPrevAction[i] = Mathf.FloorToInt(action_list[0]);
i++;
inputPrevAction = new int[currentBatchSize];
var i = 0;
foreach (Agent agent in agentList)
{
float[] actionList = agentInfo[agent].storedVectorActions;
inputPrevAction[i] = Mathf.FloorToInt(actionList[0]);
i++;
}
}
observationMatrixList.Clear();
for (int observationIndex =
0; observationIndex < brain.brainParameters.cameraResolutions.Count(); observationIndex++){
texturesHolder.Clear();
foreach (Agent agent in agentList){
texturesHolder.Add(agentInfo[agent].visualObservations[observationIndex]);
}
observationMatrixList.Add(
BatchVisualObservations(texturesHolder, brain.brainParameters.cameraResolutions[observationIndex].blackAndWhite));
}
// Create the recurrent tensor
if (hasRecurrent)
{
// Need to have variable memory size
inputOldMemories = new float[currentBatchSize, memorySize];
var i = 0;
foreach (Agent agent in agentList)
observationMatrixList.Clear();
for (int observationIndex =
0;
observationIndex < brain.brainParameters.cameraResolutions.Length;
observationIndex++)
float[] m = agentInfo[agent].memories.ToArray();
for (int j = 0; j < m.Count(); j++)
texturesHolder.Clear();
foreach (Agent agent in agentList)
inputOldMemories[i, j] = m[j];
texturesHolder.Add(agentInfo[agent].visualObservations[observationIndex]);
i++;
observationMatrixList.Add(
BatchVisualObservations(texturesHolder,
brain.brainParameters.cameraResolutions[observationIndex].blackAndWhite));
}
// Create the recurrent tensor
if (hasRecurrent)
{
// Need to have variable memory size
inputOldMemories = new float[currentBatchSize, memorySize];
var i = 0;
foreach (Agent agent in agentList)
{
float[] m = agentInfo[agent].memories.ToArray();
for (int j = 0; j < m.Length; j++)
{
inputOldMemories[i, j] = m[j];
}
var runner = session.GetRunner();
try
{
runner.Fetch(graph[graphScope + ActionPlaceholderName][0]);
}
catch
{
throw new UnityAgentsException(string.Format(@"The node {0} could not be found. Please make sure the graphScope {1} is correct",
graphScope + ActionPlaceholderName, graphScope));
}
i++;
}
}
if (hasBatchSize)
{
runner.AddInput(graph[graphScope + BatchSizePlaceholderName][0], new int[] { currentBatchSize });
}
foreach (TensorFlowAgentPlaceholder placeholder in graphPlaceholders)
{
var runner = session.GetRunner();
if (placeholder.valueType == TensorFlowAgentPlaceholder.tensorType.FloatingPoint)
{
runner.AddInput(graph[graphScope + placeholder.name][0], new float[] { Random.Range(placeholder.minValue, placeholder.maxValue) });
}
else if (placeholder.valueType == TensorFlowAgentPlaceholder.tensorType.Integer)
{
runner.AddInput(graph[graphScope + placeholder.name][0], new int[] { Random.Range((int)placeholder.minValue, (int)placeholder.maxValue + 1) });
}
runner.Fetch(graph[graphScope + ActionPlaceholderName][0]);
throw new UnityAgentsException(string.Format(@"One of the Tensorflow placeholder cound nout be found.
In brain {0}, there are no {1} placeholder named {2}.",
brain.gameObject.name, placeholder.valueType.ToString(), graphScope + placeholder.name));
throw new UnityAgentsException(string.Format(
@"The node {0} could not be found. Please make sure the graphScope {1} is correct",
graphScope + ActionPlaceholderName, graphScope));
}
// Create the state tensor
if (hasState)
{
if (brain.brainParameters.vectorObservationSpaceType == SpaceType.discrete)
if (hasBatchSize)
var discreteInputState = new int[currentBatchSize, 1];
for (int i = 0; i < currentBatchSize; i++)
runner.AddInput(graph[graphScope + BatchSizePlaceholderName][0], new int[] {currentBatchSize});
}
foreach (TensorFlowAgentPlaceholder placeholder in graphPlaceholders)
{
try
discreteInputState[i, 0] = (int)inputState[i, 0];
if (placeholder.valueType == TensorFlowAgentPlaceholder.TensorType.FloatingPoint)
{
runner.AddInput(graph[graphScope + placeholder.name][0],
new float[] {Random.Range(placeholder.minValue, placeholder.maxValue)});
}
else if (placeholder.valueType == TensorFlowAgentPlaceholder.TensorType.Integer)
{
runner.AddInput(graph[graphScope + placeholder.name][0],
new int[] {Random.Range((int) placeholder.minValue, (int) placeholder.maxValue + 1)});
}
runner.AddInput(graph[graphScope + VectorObservationPlacholderName][0], discreteInputState);
catch
{
throw new UnityAgentsException(string.Format(
@"One of the TensorFlow placeholders could not be found.
In brain {0}, there is no {1} placeholder named {2}.",
brain.gameObject.name, placeholder.valueType.ToString(), graphScope + placeholder.name));
}
else
// Create the state tensor
if (hasState)
}
// Create the previous action tensor
if (hasPrevAction)
{
runner.AddInput(graph[graphScope + PreviousActionPlaceholderName][0], inputPrevAction);
}
// Create the previous action tensor
if (hasPrevAction)
{
runner.AddInput(graph[graphScope + PreviousActionPlaceholderName][0], inputPrevAction);
}
// Create the observation tensors
for (int obs_number =
0; obs_number < brain.brainParameters.cameraResolutions.Length; obs_number++)
{
runner.AddInput(graph[graphScope + VisualObservationPlaceholderName[obs_number]][0], observationMatrixList[obs_number]);
}
// Create the observation tensors
for (int obsNumber =
0;
obsNumber < brain.brainParameters.cameraResolutions.Length;
obsNumber++)
{
runner.AddInput(graph[graphScope + VisualObservationPlaceholderName[obsNumber]][0],
observationMatrixList[obsNumber]);
}
if (hasRecurrent)
{
runner.AddInput(graph[graphScope + "sequence_length"][0], 1);
runner.AddInput(graph[graphScope + RecurrentInPlaceholderName][0], inputOldMemories);
runner.Fetch(graph[graphScope + RecurrentOutPlaceholderName][0]);
}
if (hasRecurrent)
{
runner.AddInput(graph[graphScope + "sequence_length"][0], 1);
runner.AddInput(graph[graphScope + RecurrentInPlaceholderName][0], inputOldMemories);
runner.Fetch(graph[graphScope + RecurrentOutPlaceholderName][0]);
}
TFTensor[] networkOutput;
try
{
networkOutput = runner.Run();
}
catch (TFException e)
{
string errorMessage = e.Message;
TFTensor[] networkOutput;
errorMessage =
string.Format(@"The tensorflow graph needs an input for {0} of type {1}",
e.Message.Split(new string[] { "Node: " }, 0)[1].Split('=')[0],
e.Message.Split(new string[] { "dtype=" }, 0)[1].Split(',')[0]);
networkOutput = runner.Run();
finally
catch (TFException e)
throw new UnityAgentsException(errorMessage);
string errorMessage = e.Message;
try
{
errorMessage =
$@"The tensorflow graph needs an input for {e.Message.Split(new string[] {"Node: "}, 0)[1].Split('=')[0]} of type {e.Message.Split(new string[] {"dtype="}, 0)[1].Split(',')[0]}";
}
finally
{
throw new UnityAgentsException(errorMessage);
}
}
// Create the recurrent tensor
if (hasRecurrent)
{
float[,] recurrent_tensor = networkOutput[1].GetValue() as float[,];
var i = 0;
foreach (Agent agent in agentList)
// Create the recurrent tensor
if (hasRecurrent)
var m = new float[memorySize];
for (int j = 0; j < memorySize; j++)
float[,] recurrentTensor = networkOutput[1].GetValue() as float[,];
var i = 0;
foreach (Agent agent in agentList)
m[j] = recurrent_tensor[i, j];
var m = new float[memorySize];
for (int j = 0; j < memorySize; j++)
{
m[j] = recurrentTensor[i, j];
}
agent.UpdateMemoriesAction(m.ToList());
i++;
agent.UpdateMemoriesAction(m.ToList());
i++;
}
if (brain.brainParameters.vectorActionSpaceType == SpaceType.continuous)
{
var output = networkOutput[0].GetValue() as float[,];
var i = 0;
foreach (Agent agent in agentList)
if (brain.brainParameters.vectorActionSpaceType == SpaceType.continuous)
var a = new float[brain.brainParameters.vectorActionSize];
for (int j = 0; j < brain.brainParameters.vectorActionSize; j++)
var output = networkOutput[0].GetValue() as float[,];
var i = 0;
foreach (Agent agent in agentList)
a[j] = output[i, j];
var a = new float[brain.brainParameters.vectorActionSize];
for (int j = 0; j < brain.brainParameters.vectorActionSize; j++)
{
a[j] = output[i, j];
}
agent.UpdateVectorAction(a);
i++;
agent.UpdateVectorAction(a);
i++;
}
else if (brain.brainParameters.vectorActionSpaceType == SpaceType.discrete)
{
long[,] output = networkOutput[0].GetValue() as long[,];
var i = 0;
foreach (Agent agent in agentList)
else if (brain.brainParameters.vectorActionSpaceType == SpaceType.discrete)
var a = new float[1] { (float)(output[i, 0]) };
agent.UpdateVectorAction(a);
i++;
long[,] output = networkOutput[0].GetValue() as long[,];
var i = 0;
foreach (Agent agent in agentList)
{
var a = new float[1] {(float) (output[i, 0])};
agent.UpdateVectorAction(a);
i++;
}
}
#else

public void OnInspector()
{
#if ENABLE_TENSORFLOW && UNITY_EDITOR
EditorGUILayout.LabelField("", GUI.skin.horizontalSlider);
broadcast = EditorGUILayout.Toggle(new GUIContent("Broadcast",
"If checked, the brain will broadcast states and actions to Python."), broadcast);
EditorGUILayout.LabelField("", GUI.skin.horizontalSlider);
broadcast = EditorGUILayout.Toggle(new GUIContent("Broadcast",
"If checked, the brain will broadcast states and actions to Python."), broadcast);
var serializedBrain = new SerializedObject(this);
GUILayout.Label("Edit the Tensorflow graph parameters here");
var tfGraphModel = serializedBrain.FindProperty("graphModel");
serializedBrain.Update();
EditorGUILayout.ObjectField(tfGraphModel);
serializedBrain.ApplyModifiedProperties();
var serializedBrain = new SerializedObject(this);
GUILayout.Label("Edit the Tensorflow graph parameters here");
var tfGraphModel = serializedBrain.FindProperty("graphModel");
serializedBrain.Update();
EditorGUILayout.ObjectField(tfGraphModel);
serializedBrain.ApplyModifiedProperties();
if (graphModel == null)
{
EditorGUILayout.HelpBox("Please provide a tensorflow graph as a bytes file.", MessageType.Error);
}
if (graphModel == null)
{
EditorGUILayout.HelpBox("Please provide a tensorflow graph as a bytes file.", MessageType.Error);
}
graphScope =
EditorGUILayout.TextField(new GUIContent("Graph Scope", "If you set a scope while training your tensorflow model, " +
"all your placeholder name will have a prefix. You must specify that prefix here."), graphScope);
graphScope =
EditorGUILayout.TextField(new GUIContent("Graph Scope",
"If you set a scope while training your tensorflow model, " +
"all your placeholder name will have a prefix. You must specify that prefix here."), graphScope);
if (BatchSizePlaceholderName == "")
{
BatchSizePlaceholderName = "batch_size";
}
if (BatchSizePlaceholderName == "")
{
BatchSizePlaceholderName = "batch_size";
}
BatchSizePlaceholderName =
EditorGUILayout.TextField(new GUIContent("Batch Size Node Name", "If the batch size is one of " +
"the inputs of your graph, you must specify the name if the placeholder here."), BatchSizePlaceholderName);
if (VectorObservationPlacholderName == "")
{
VectorObservationPlacholderName = "state";
}
VectorObservationPlacholderName =
EditorGUILayout.TextField(new GUIContent("Vector Observation Node Name", "If your graph uses the state as an input, " +
"you must specify the name if the placeholder here."), VectorObservationPlacholderName);
if (RecurrentInPlaceholderName == "")
{
RecurrentInPlaceholderName = "recurrent_in";
}
RecurrentInPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Recurrent Input Node Name", "If your graph uses a " +
"recurrent input / memory as input and outputs new recurrent input / memory, " +
"you must specify the name if the input placeholder here."), RecurrentInPlaceholderName);
if (RecurrentOutPlaceholderName == "")
{
RecurrentOutPlaceholderName = "recurrent_out";
}
RecurrentOutPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Recurrent Output Node Name", " If your graph uses a " +
"recurrent input / memory as input and outputs new recurrent input / memory, you must specify the name if " +
"the output placeholder here."), RecurrentOutPlaceholderName);
BatchSizePlaceholderName =
EditorGUILayout.TextField(new GUIContent("Batch Size Node Name", "If the batch size is one of " +
"the inputs of your graph, you must specify the name if the placeholder here."),
BatchSizePlaceholderName);
if (VectorObservationPlacholderName == "")
{
VectorObservationPlacholderName = "state";
}
if (brain.brainParameters.cameraResolutions != null)
{
if (brain.brainParameters.cameraResolutions.Count() > 0)
VectorObservationPlacholderName =
EditorGUILayout.TextField(new GUIContent("Vector Observation Node Name",
"If your graph uses the state as an input, " +
"you must specify the name if the placeholder here."), VectorObservationPlacholderName);
if (RecurrentInPlaceholderName == "")
if (VisualObservationPlaceholderName == null)
{
VisualObservationPlaceholderName =
new string[brain.brainParameters.cameraResolutions.Count()];
}
if (VisualObservationPlaceholderName.Count() != brain.brainParameters.cameraResolutions.Count())
{
VisualObservationPlaceholderName =
new string[brain.brainParameters.cameraResolutions.Count()];
}
for (int obs_number =
0; obs_number < brain.brainParameters.cameraResolutions.Count(); obs_number++)
RecurrentInPlaceholderName = "recurrent_in";
}
RecurrentInPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Recurrent Input Node Name", "If your graph uses a " +
"recurrent input / memory as input and outputs new recurrent input / memory, " +
"you must specify the name if the input placeholder here."),
RecurrentInPlaceholderName);
if (RecurrentOutPlaceholderName == "")
{
RecurrentOutPlaceholderName = "recurrent_out";
}
RecurrentOutPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Recurrent Output Node Name", " If your graph uses a " +
"recurrent input / memory as input and outputs new recurrent input / memory, you must specify the name if " +
"the output placeholder here."),
RecurrentOutPlaceholderName);
if (brain.brainParameters.cameraResolutions != null)
{
if (brain.brainParameters.cameraResolutions.Count() > 0)
if ((VisualObservationPlaceholderName[obs_number] == "") || (VisualObservationPlaceholderName[obs_number] == null))
if (VisualObservationPlaceholderName == null)
VisualObservationPlaceholderName =
new string[brain.brainParameters.cameraResolutions.Count()];
}
VisualObservationPlaceholderName[obs_number] =
"visual_observation_" + obs_number;
if (VisualObservationPlaceholderName.Count() != brain.brainParameters.cameraResolutions.Count())
{
VisualObservationPlaceholderName =
new string[brain.brainParameters.cameraResolutions.Count()];
for (int obs_number =
0;
obs_number < brain.brainParameters.cameraResolutions.Count();
obs_number++)
{
if ((VisualObservationPlaceholderName[obs_number] == "") ||
(VisualObservationPlaceholderName[obs_number] == null))
{
VisualObservationPlaceholderName[obs_number] =
"visual_observation_" + obs_number;
}
}
var opn = serializedBrain.FindProperty("VisualObservationPlaceholderName");
serializedBrain.Update();
EditorGUILayout.PropertyField(opn, true);
serializedBrain.ApplyModifiedProperties();
var opn = serializedBrain.FindProperty("VisualObservationPlaceholderName");
serializedBrain.Update();
EditorGUILayout.PropertyField(opn, true);
serializedBrain.ApplyModifiedProperties();
}
if (ActionPlaceholderName == "")
{
ActionPlaceholderName = "action";
}
ActionPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Action Node Name", "Specify the name of the " +
"placeholder corresponding to the actions of the brain in your graph. If the action space type is " +
"continuous, the output must be a one dimensional tensor of float of length Action Space Size, " +
"if the action space type is discrete, the output must be a one dimensional tensor of int " +
"of length 1."), ActionPlaceholderName);
if (ActionPlaceholderName == "")
{
ActionPlaceholderName = "action";
}
ActionPlaceholderName =
EditorGUILayout.TextField(new GUIContent("Action Node Name", "Specify the name of the " +
"placeholder corresponding to the actions of the brain in your graph. If the action space type is " +
"continuous, the output must be a one dimensional tensor of float of length Action Space Size, " +
"if the action space type is discrete, the output must be a one dimensional tensor of int " +
"of length 1."), ActionPlaceholderName);
var tfPlaceholders = serializedBrain.FindProperty("graphPlaceholders");
serializedBrain.Update();
EditorGUILayout.PropertyField(tfPlaceholders, true);
serializedBrain.ApplyModifiedProperties();
var tfPlaceholders = serializedBrain.FindProperty("graphPlaceholders");
serializedBrain.Update();
EditorGUILayout.PropertyField(tfPlaceholders, true);
serializedBrain.ApplyModifiedProperties();
#endif
#if !ENABLE_TENSORFLOW && UNITY_EDITOR
EditorGUILayout.HelpBox(

System.Buffer.BlockCopy(resultTemp, 0, result, 0, batchSize * hwp * sizeof(float));
return result;
}
}
}

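One detail preserved through the reshuffled initialization code above is how the TensorFlow graph scope is normalized: a scope longer than one character gets a trailing '/' appended so that scope + placeholder name forms the full node path. A stand-alone sketch of that check (the helper name is illustrative, not from the source):

using System;

static class GraphScopeSketch
{
    // Mirrors the check in the code above: scopes longer than one character
    // get a trailing '/' when they do not already end with one.
    static string NormalizeScope(string graphScope)
    {
        if ((graphScope.Length > 1) && (graphScope[graphScope.Length - 1] != '/'))
        {
            graphScope = graphScope + '/';
        }
        return graphScope;
    }

    static void Main()
    {
        Console.WriteLine(NormalizeScope("first_brain"));  // first_brain/
        Console.WriteLine(NormalizeScope("first_brain/")); // unchanged
        Console.WriteLine(NormalizeScope(""));             // empty scope left as-is
    }
}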
2
unity-environment/Assets/ML-Agents/Scripts/RpcCommunicator.cs


/// <param name="communicatorParameters">Communicator parameters.</param>
public RPCCommunicator(CommunicatorParameters communicatorParameters)
{
this.m_communicatorParameters = communicatorParameters;
m_communicatorParameters = communicatorParameters;
}
/// <summary>

2
unity-environment/Assets/ML-Agents/Scripts/UnityAgentsException.cs


namespace MLAgents
{
[System.Serializable]
[Serializable]
/// Contains exceptions specific to ML-Agents.
public class UnityAgentsException : System.Exception
{

14
python/unitytrainers/curriculum.py


import json
from .exception import UnityEnvironmentException
from .exception import CurriculumError
import logging

with open(location) as data_file:
self.data = json.load(data_file)
except IOError:
raise UnityEnvironmentException(
raise CurriculumError(
raise UnityEnvironmentException("There was an error decoding {}".format(location))
raise CurriculumError("There was an error decoding {}".format(location))
raise UnityEnvironmentException("{0} does not contain a "
raise CurriculumError("{0} does not contain a "
"{1} field.".format(location, key))
parameters = self.data['parameters']
self.measure_type = self.data['measure']

raise UnityEnvironmentException(
raise CurriculumError(
raise UnityEnvironmentException(
raise CurriculumError(
"The parameter {0} in Curriculum {1} must have {2} values "
"but {3} were found".format(key, location,
self.max_lesson_number + 1, len(parameters[key])))

def increment_lesson(self, progress):
"""
Increments the lesson number depending on the progree given.
Increments the lesson number depending on the progress given.
:param progress: Measure of progress (either reward or percentage steps completed).
"""
if self.data is None or progress is None:

15
python/unitytrainers/exception.py


"""
Contains exceptions for the unitytrainers package.
"""
class TrainerError(Exception):
"""
Any error related to the trainers in the ML-Agents Toolkit.
"""
pass
class CurriculumError(TrainerError):
"""
Any error related to training with a curriculum.
"""
pass

/python/unityagents/curriculum.py → /python/unitytrainers/curriculum.py
