
Merge pull request #367 from Unity-Technologies/feature/LSTM2

Hallway & LSTM Improvements
/develop-generalizationTraining-TrainerController
GitHub, 6 years ago
Current commit
2bba53b8
52 files changed: 6,426 additions and 61 deletions.
  1. docs/Learning-Environment-Examples.md (75)
  2. python/trainer_config.yaml (73)
  3. python/unitytrainers/models.py (35)
  4. python/unitytrainers/ppo/models.py (27)
  5. python/unitytrainers/ppo/trainer.py (25)
  6. python/unitytrainers/trainer_controller.py (3)
  7. unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs (30)
  8. docs/images/basic.png (58)
  9. docs/images/hallway.png (1001)
  10. unity-environment/Assets/ML-Agents/Examples/Hallway.meta (8)
  11. unity-environment/Assets/ML-Agents/Examples/Hallway/Material.meta (8)
  12. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/FailGround.mat (76)
  13. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/FailGround.mat.meta (10)
  14. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Goal.mat (76)
  15. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Goal.mat.meta (10)
  16. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Ground.mat (76)
  17. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Ground.mat.meta (8)
  18. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Orange.mat (76)
  19. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Orange.mat.meta (8)
  20. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/PrototypeCheckerAlbedo.png (9)
  21. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/PrototypeCheckerAlbedo.png.meta (77)
  22. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Red.mat (76)
  23. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Red.mat.meta (8)
  24. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/SuccessGround.mat (76)
  25. unity-environment/Assets/ML-Agents/Examples/Hallway/Material/SuccessGround.mat.meta (10)
  26. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs.meta (8)
  27. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/HallwayArea.prefab (1001)
  28. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/HallwayArea.prefab.meta (8)
  29. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/orangeBlock.prefab (114)
  30. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/orangeBlock.prefab.meta (8)
  31. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/violetBlock.prefab (114)
  32. unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/violetBlock.prefab.meta (8)
  33. unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes.meta (8)
  34. unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes/Hallway.unity (1001)
  35. unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes/Hallway.unity.meta (7)
  36. unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts.meta (8)
  37. unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels.meta (8)
  38. unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.bytes (1001)
  39. unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.bytes.meta (7)
  40. unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/hallway-1.bytes (1001)
  41. unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/hallway-1.bytes.meta (7)
  42. unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAcademy.cs.meta (11)
  43. unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs.meta (11)
  44. unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAcademy.cs (22)
  45. unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs (195)
  46. /docs/images/banana.png (0)

75
docs/Learning-Environment-Examples.md


## Basic
![Basic](images/basic.png)
* Set-up: A linear movement task where the agent must move left or right to rewarding states.
* Goal: Move to the most rewarding state.
* Agents: The environment contains one agent linked to a single brain.

* Brains: One brain with the following state/action space.
* Brains: One brain with the following observation/action space.
* Observations: 0
* Visual Observations: 0
* Reset Parameters: None
## 3DBall

* Agent Reward Function:
* +0.1 for every step the ball remains on the platform.
* -1.0 if the ball falls from the platform.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 8 variables corresponding to rotation of platform, and position, rotation, and velocity of ball.
* State space (Hard Version): (Continuous) 5 variables corresponding to rotation of platform and position and rotation of ball.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 8 variables corresponding to rotation of platform, and position, rotation, and velocity of ball.
* Vector Observation space (Hard Version): (Continuous) 5 variables corresponding to rotation of platform and position and rotation of ball.
* Observations: 0
* Visual Observations: 0
* Reset Parameters: None
## GridWorld

* -0.01 for every step.
* +1.0 if the agent navigates to the goal position of the grid (episode ends).
* -1.0 if the agent navigates to an obstacle (episode ends).
* Brains: One brain with the following state/action space.
* State space: None
* Brains: One brain with the following observation/action space.
* Vector Observation space: None
* Observations: One corresponding to top-down view of GridWorld.
* Visual Observations: One corresponding to top-down view of GridWorld.
* Reset Parameters: Three, corresponding to grid size, number of obstacles, and number of goals.

* Agent Reward Function (independent):
* +0.1 To agent when hitting ball over net.
* -0.1 To agent who let ball hit their ground, or hit ball out of bounds.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 8 variables corresponding to position and velocity of ball and racket.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 8 variables corresponding to position and velocity of ball and racket.
* Observations: None
* Visual Observations: None
* Reset Parameters: One, corresponding to size of ball.
## Area

* -0.01 for every step.
* +1.0 if the block touches the goal.
* -1.0 if the agent falls off the platform.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 15 variables corresponding to position and velocities of agent, block, and goal.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 15 variables corresponding to position and velocities of agent, block, and goal.
* Observations: None.
* Visual Observations: None.
* Reset Parameters: One, corresponding to number of steps in training. Used to adjust size of elements for Curriculum Learning.
### Wall Area

* -0.01 for every step.
* +1.0 if the agent touches the goal.
* -1.0 if the agent falls off the platform.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 16 variables corresponding to position and velocities of agent, block, and goal, plus the height of the wall.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 16 variables corresponding to position and velocities of agent, block, and goal, plus the height of the wall.
* Observations: None.
* Visual Observations: None.
* Reset Parameters: One, corresponding to number of steps in training. Used to adjust size of the wall for Curriculum Learning.
## Reacher

* Agents: The environment contains 32 agents linked to a single brain.
* Agent Reward Function (independent):
* +0.1 Each step agent's hand is in goal location.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 26 variables corresponding to position, rotation, velocity, and angular velocities of the two arm rigidbodies.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 26 variables corresponding to position, rotation, velocity, and angular velocities of the two arm rigidbodies.
* Observations: None
* Visual Observations: None
* Reset Parameters: Two, corresponding to goal size, and goal movement speed.
## Crawler

* -0.01 times the action squared
* -0.05 times y position change
* -0.05 times velocity in the z direction
* Brains: One brain with the following state/action space.
* State space: (Continuous) 117 variables corresponding to position, rotation, velocity, and angular velocities of each limb plus the acceleration and angular acceleration of the body.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 117 variables corresponding to position, rotation, velocity, and angular velocities of each limb plus the acceleration and angular acceleration of the body.
* Observations: None
* Visual Observations: None
![Banana](../images/banana.png)
![Banana](images/banana.png)
* Set-up: A multi-agent environment where agents compete to collect bananas.
* Goal: The agents must learn to move to as many yellow bananas as possible while avoiding red bananas.

* -1 for interaction with red banana.
* Brains: One brain with the following state/action space.
* State space: (Continuous) 51 corresponding to velocity of agent, plus ray-based perception of objects around agent's forward direction.
* Brains: One brain with the following observation/action space.
* Vector Observation space: (Continuous) 51 corresponding to velocity of agent, plus ray-based perception of objects around agent's forward direction.
* Observations (Optional): First-person view for each agent.
* Visual Observations (Optional): First-person view for each agent.
* Reset Parameters: None
## Hallway
![Hallway](images/hallway.png)
* Set-up: Environment where the agent needs to find information in a room, remember it, and use it to move to the correct goal.
* Goal: Move to the goal which corresponds to the color of the block in the room.
* Agents: The environment contains one agent linked to a single brain.
* Agent Reward Function (independent):
* +1 For moving to correct goal.
* -0.1 For moving to incorrect goal.
* -0.0003 Existential penalty.
* Brains: One brain with the following observation/action space:
* Vector Observation space: (Continuous) 30 corresponding to local ray-casts detecting objects, goals, and walls.
* Action space: (Discrete) 4 corresponding to agent rotation and forward/backward movement.
* Visual Observations (Optional): First-person view for the agent.
* Reset Parameters: None
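
The 30 entries in the vector observation follow from the ray perception used by HallwayAgent.cs later in this diff: five ray angles, and for each ray a one-hot slot per detectable tag plus a miss flag and a normalized hit distance. A minimal Python sketch of that size arithmetic (the helper name and layout description are illustrative, not part of the environment code):

# Sketch: how the Hallway agent's 30-entry vector observation is sized.
# Mirrors HallwayAgent.RayPerception: per ray, one slot per detectable tag,
# one "nothing hit" flag, and one normalized hit distance.
ray_angles = [0.0, 45.0, 90.0, 135.0, 180.0]
detectable_tags = ["goal", "orangeBlock", "redBlock", "wall"]

def observation_size(num_rays, num_tags):
    # each ray encodes: one-hot over tags + miss flag + normalized distance
    return num_rays * (num_tags + 2)

print(observation_size(len(ray_angles), len(detectable_tags)))  # -> 30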

73
python/trainer_config.yaml


default:
trainer: ppo
batch_size: 32
batch_size: 1024
buffer_size: 512
buffer_size: 10240
epsilon: 0.2
gamma: 0.99
hidden_units: 128

normalize: true
normalize: false
num_epoch: 5
num_layers: 2
time_horizon: 64

Ball3DBrain:
summary_freq: 1000
normalize: true
BrainWallJumpCC:
max_steps: 2.0e5
num_layers: 2
beta: 5.0e-4
hidden_units: 256
use_recurrent: true
sequence_length: 32
time_horizon: 32
batch_size: 32
buffer_size: 320
Ball3DHardBrain:
max_steps: 5.0e4
num_layers: 2
batch_size: 1000
buffer_size: 10000
num_epoch: 3
beta: 5.0e-4
max_steps: 1.0e4
use_recurrent: true
sequence_length: 8
time_horizon: 8
batch_size: 32
buffer_size: 320
HallwayBrainDC:
num_layers: 3
hidden_units: 256
beta: 1.0e-2
gamma: 0.99
num_epoch: 3
buffer_size: 512
batch_size: 64
max_steps: 5.0e5
summary_freq: 1000
time_horizon: 64
BrainWallJumpDC:
use_recurrent: true
sequence_length: 64
num_layers: 2
hidden_units: 128
beta: 1.0e-2
gamma: 0.99
num_epoch: 3
buffer_size: 16
batch_size: 2
max_steps: 5.0e5
summary_freq: 1000
time_horizon: 64
HallwayBrainDCLSTM:
use_recurrent: true
sequence_length: 64
num_layers: 2
hidden_units: 128
beta: 1.0e-2
gamma: 0.99
num_epoch: 3
buffer_size: 16
batch_size: 2
max_steps: 5.0e5
summary_freq: 1000
time_horizon: 64
GridWorldBrain:
batch_size: 32
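
For a rough sense of how these values interact, the update loop in ppo/trainer.py further down runs num_epoch passes over the shuffled buffer and takes len(buffer) // batch_size minibatches per pass. A small Python sketch under the HallwayBrainDCLSTM settings above (assuming buffer_size and batch_size count entries of the same unit):

# Rough sketch: gradient updates performed each time the buffer fills,
# mirroring the num_epoch / batch_size loop in ppo/trainer.py.
def updates_per_buffer(buffer_size, batch_size, num_epoch):
    return num_epoch * (buffer_size // batch_size)

# HallwayBrainDCLSTM settings from trainer_config.yaml above
print(updates_per_buffer(buffer_size=16, batch_size=2, num_epoch=3))  # -> 24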

35
python/unitytrainers/models.py


return global_step, increment_step
@staticmethod
def swish(input_activation):
"""Swish activation function. For more info: https://arxiv.org/abs/1710.05941"""
return tf.multiply(input_activation, tf.nn.sigmoid(input_activation))
@staticmethod
def create_visual_input(o_size_h, o_size_w, bw):
if bw:
c_channels = 1

hidden = tf.layers.dense(hidden, h_size, use_bias=False, activation=activation)
return hidden
def create_new_obs(self, num_streams, h_size, num_layers, activation_fn):
def create_new_obs(self, num_streams, h_size, num_layers):
if brain.action_space_type == "continuous":
activation_fn = tf.nn.tanh
else:
activation_fn = self.swish
self.observation_in = []
for i in range(brain.number_observations):

initial_state=lstm_state_in,
time_major=False,
dtype=tf.float32)
hidden_streams = self.create_new_obs(num_streams, h_size, num_layers, tf.nn.elu)
hidden_streams = self.create_new_obs(num_streams, h_size, num_layers)
self.prev_action = tf.placeholder(shape=[None], dtype=tf.int32, name='prev_action')
self.prev_action_oh = c_layers.one_hot_encoding(self.prev_action, self.a_size)
hidden = tf.concat([hidden, self.prev_action_oh], axis=1)
self.memory_in = tf.placeholder(shape=[None, self.m_size], dtype=tf.float32, name='recurrent_in')
hidden, self.memory_out = self.create_recurrent_encoder(hidden, self.memory_in)
self.memory_out = tf.identity(self.memory_out, name='recurrent_out')
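
create_recurrent_encoder itself is not part of this hunk; purely as an illustration of the recurrent_in/recurrent_out plumbing, a minimal TF 1.x sketch of a single-layer LSTM encoder that consumes and returns a flat memory tensor could look like the following (the names, the single-cell layout, and the plain integer sequence_length are assumptions, not the trainer's exact implementation):

import tensorflow as tf

def recurrent_encoder_sketch(hidden, memory_in, sequence_length, name='lstm'):
    # Split the flat memory tensor into LSTM cell state and hidden state halves.
    m_size = memory_in.get_shape().as_list()[1]
    half = m_size // 2
    cell = tf.contrib.rnn.BasicLSTMCell(half)
    state_in = tf.contrib.rnn.LSTMStateTuple(memory_in[:, :half], memory_in[:, half:])
    # Fold the flat [batch * sequence, features] input back into sequences.
    feat = hidden.get_shape().as_list()[1]
    seq_input = tf.reshape(hidden, [-1, sequence_length, feat])
    with tf.variable_scope(name):
        output, state_out = tf.nn.dynamic_rnn(cell, seq_input,
                                              initial_state=state_in, dtype=tf.float32)
    # Flatten back to per-step features and re-pack the memory for the next step.
    output = tf.reshape(output, [-1, half])
    memory_out = tf.concat([state_out.c, state_out.h], axis=1)
    return output, memory_out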

self.value = tf.layers.dense(hidden, 1, activation=None)
self.value = tf.identity(self.value, name="value_estimate")
self.action_holder = tf.placeholder(shape=[None], dtype=tf.int32, name="action_input")
self.action_holder = tf.placeholder(shape=[None], dtype=tf.int32)
self.selected_actions = c_layers.one_hot_encoding(self.action_holder, self.a_size)
self.all_old_probs = tf.placeholder(shape=[None, self.a_size], dtype=tf.float32, name='old_probabilities')

def create_cc_actor_critic(self, h_size, num_layers):
num_streams = 2
hidden_streams = self.create_new_obs(num_streams, h_size, num_layers, tf.nn.tanh)
hidden_streams = self.create_new_obs(num_streams, h_size, num_layers)
if self.use_recurrent:
self.memory_in = tf.placeholder(shape=[None, self.m_size], dtype=tf.float32, name='recurrent_in')

hidden_policy = hidden_streams[0]
hidden_value = hidden_streams[1]
else:
hidden_policy = hidden_streams[0]
hidden_value = hidden_streams[1]
self.mu = tf.layers.dense(hidden_policy, self.a_size, activation=None, use_bias=False,
kernel_initializer=c_layers.variance_scaling_initializer(factor=0.01))

a = tf.exp(-1 * tf.pow(tf.stop_gradient(self.output) - self.mu, 2) / (2 * self.sigma_sq))
b = 1 / tf.sqrt(2 * self.sigma_sq * np.pi)
self.all_probs = tf.multiply(a, b, name="action_probs")
self.probs = tf.identity(self.all_probs)
self.probs = tf.reduce_prod(self.all_probs, axis=1)
self.old_probs = tf.identity(self.all_old_probs)
self.old_probs = tf.reduce_prod(self.all_old_probs, axis=1)
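
The continuous-control branch above evaluates a diagonal Gaussian density per action dimension and takes the product across dimensions to get the joint action probability. The same computation in plain NumPy, as a sanity-check sketch (variable names are illustrative):

import numpy as np

def gaussian_action_prob(actions, mu, sigma_sq):
    # Per-dimension normal density, as in create_cc_actor_critic above.
    a = np.exp(-np.square(actions - mu) / (2.0 * sigma_sq))
    b = 1.0 / np.sqrt(2.0 * sigma_sq * np.pi)
    per_dim = a * b
    # Joint probability of the action vector: product over action dimensions.
    return per_dim.prod(axis=1)

actions = np.array([[0.1, -0.3]])
mu = np.zeros((1, 2))
sigma_sq = np.ones((1, 2))
print(gaussian_action_prob(actions, mu, sigma_sq))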

27
python/unitytrainers/ppo/models.py


self.last_reward, self.new_reward, self.update_reward = self.create_reward_encoder()
if brain.action_space_type == "continuous":
self.create_cc_actor_critic(h_size, num_layers)
self.entropy = tf.ones_like(tf.reshape(self.value, [-1])) * self.entropy
else:
self.create_dc_actor_critic(h_size, num_layers)
self.create_ppo_optimizer(self.probs, self.old_probs, self.value,

"""
self.returns_holder = tf.placeholder(shape=[None], dtype=tf.float32, name='discounted_rewards')
self.advantage = tf.placeholder(shape=[None, 1], dtype=tf.float32, name='advantages')
self.advantage = tf.placeholder(shape=[None], dtype=tf.float32, name='advantages')
self.old_value = tf.placeholder(shape=[None], dtype=tf.float32, name='old_value_estimates')
self.mask_input = tf.placeholder(shape=[None], dtype=tf.float32, name='masks')
r_theta = probs / (old_probs + 1e-10)
p_opt_a = r_theta * self.advantage
p_opt_b = tf.clip_by_value(r_theta, 1 - decay_epsilon, 1 + decay_epsilon) * self.advantage
self.policy_loss = -tf.reduce_mean(tf.minimum(p_opt_a, p_opt_b))
self.value_loss = tf.reduce_mean(tf.squared_difference(self.returns_holder, tf.reduce_sum(value, axis=1)))
self.loss = self.policy_loss + 0.5 * self.value_loss - decay_beta * tf.reduce_mean(entropy)
self.mask = tf.equal(self.mask_input, 1.0)
clipped_value_estimate = self.old_value + tf.clip_by_value(tf.reduce_sum(value, axis=1) - self.old_value,
- decay_epsilon, decay_epsilon)
v_opt_a = tf.squared_difference(self.returns_holder, tf.reduce_sum(value, axis=1))
v_opt_b = tf.squared_difference(self.returns_holder, clipped_value_estimate)
self.value_loss = tf.reduce_mean(tf.boolean_mask(tf.maximum(v_opt_a, v_opt_b), self.mask))
self.r_theta = probs / (old_probs + 1e-10)
self.p_opt_a = self.r_theta * self.advantage
self.p_opt_b = tf.clip_by_value(self.r_theta, 1.0 - decay_epsilon, 1.0 + decay_epsilon) * self.advantage
self.policy_loss = -tf.reduce_mean(tf.boolean_mask(tf.minimum(self.p_opt_a, self.p_opt_b), self.mask))
self.loss = self.policy_loss + 0.5 * self.value_loss - decay_beta * tf.reduce_mean(
tf.boolean_mask(entropy, self.mask))
self.update_batch = optimizer.minimize(self.loss)
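
The reworked optimizer masks out padded timesteps and clips both the value update and the policy ratio. A compact NumPy sketch of the same two losses, assuming returns, value estimates, probabilities, advantages, and masks have already been computed (all names illustrative):

import numpy as np

def ppo_losses(returns, values, old_values, probs, old_probs,
               advantages, masks, epsilon=0.2):
    masks = masks.astype(bool)
    # Clipped value loss, mirroring v_opt_a / v_opt_b above.
    clipped_values = old_values + np.clip(values - old_values, -epsilon, epsilon)
    v_a = np.square(returns - values)
    v_b = np.square(returns - clipped_values)
    value_loss = np.maximum(v_a, v_b)[masks].mean()
    # Clipped surrogate policy loss, mirroring p_opt_a / p_opt_b above.
    r_theta = probs / (old_probs + 1e-10)
    p_a = r_theta * advantages
    p_b = np.clip(r_theta, 1.0 - epsilon, 1.0 + epsilon) * advantages
    policy_loss = -np.minimum(p_a, p_b)[masks].mean()
    return value_loss, policy_loss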

25
python/unitytrainers/ppo/trainer.py


self.model.learning_rate]
if self.is_continuous:
run_list.append(self.model.epsilon)
elif self.use_recurrent:
feed_dict[self.model.prev_action] = np.reshape(info.previous_actions, [-1])
if self.use_observations:
for i, _ in enumerate(curr_brain_info.observations):
feed_dict[self.model.observation_in[i]] = curr_brain_info.observations[i]

if self.is_continuous:
self.training_buffer[agent_id]['epsilons'].append(epsi[idx])
self.training_buffer[agent_id]['actions'].append(actions[idx])
self.training_buffer[agent_id]['prev_action'].append(info.previous_actions[idx])
self.training_buffer[agent_id]['masks'].append(1.0)
self.training_buffer[agent_id]['rewards'].append(next_info.rewards[next_idx])
self.training_buffer[agent_id]['action_probs'].append(a_dist[idx])
self.training_buffer[agent_id]['value_estimates'].append(value[idx][0])

feed_dict[self.model.state_in] = info.states
if self.use_recurrent:
feed_dict[self.model.memory_in] = info.memories
if not self.is_continuous:
feed_dict[self.model.prev_action] = np.reshape(info.previous_actions, [-1])
self.training_buffer[agent_id]['advantages'].set(
get_gae(
rewards=self.training_buffer[agent_id]['rewards'].get_batch(),

total_v, total_p = 0, 0
advantages = self.training_buffer.update_buffer['advantages'].get_batch()
self.training_buffer.update_buffer['advantages'].set(
(advantages - advantages.mean()) / advantages.std())
(advantages - advantages.mean()) / advantages.std() + 1e-10)
for k in range(num_epoch):
self.training_buffer.update_buffer.shuffle()
for l in range(len(self.training_buffer.update_buffer['actions']) // batch_size):

feed_dict = {self.model.batch_size: batch_size,
self.model.sequence_length: self.sequence_length,
self.model.mask_input: np.array(_buffer['masks'][start:end]).reshape(
[-1]),
self.model.advantage: np.array(_buffer['advantages'][start:end]).reshape([-1, 1]),
self.model.old_value: np.array(_buffer['value_estimates'][start:end]).reshape([-1]),
self.model.advantage: np.array(_buffer['advantages'][start:end]).reshape([-1]),
self.model.all_old_probs: np.array(
_buffer['action_probs'][start:end]).reshape([-1, self.brain.action_space_size])}
if self.is_continuous:

feed_dict[self.model.action_holder] = np.array(
_buffer['actions'][start:end]).reshape([-1])
if self.use_recurrent:
feed_dict[self.model.prev_action] = np.array(
_buffer['prev_action'][start:end]).reshape([-1])
if self.use_states:
if self.brain.state_space_type == "continuous":
feed_dict[self.model.state_in] = np.array(

feed_dict[self.model.observation_in[i]] = _obs.reshape([-1, _w, _h, _c])
# Memories are zeros
if self.use_recurrent:
feed_dict[self.model.memory_in] = np.zeros([batch_size, self.m_size])
v_loss, p_loss, _ = self.sess.run([self.model.value_loss, self.model.policy_loss,
self.model.update_batch], feed_dict=feed_dict)
# feed_dict[self.model.memory_in] = np.zeros([batch_size, self.m_size])
feed_dict[self.model.memory_in] = np.array(_buffer['memory'][start:end])[:, 0, :]
v_loss, p_loss, _ = self.sess.run(
[self.model.value_loss, self.model.policy_loss,
self.model.update_batch], feed_dict=feed_dict)
total_v += v_loss
total_p += p_loss
self.stats['value_loss'].append(total_v)
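
The get_gae call above converts the buffered rewards and value estimates into advantages; a standard GAE(λ) recursion in NumPy is sketched below as a reference for what such a helper typically computes (the signature is illustrative, not the trainer's exact function):

import numpy as np

def get_gae_sketch(rewards, value_estimates, value_next=0.0, gamma=0.99, lambd=0.95):
    # Temporal-difference errors: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    values = np.append(value_estimates, value_next)
    deltas = rewards + gamma * values[1:] - values[:-1]
    # Backward recursion: A_t = delta_t + gamma * lambda * A_{t+1}
    advantages = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lambd * running
        advantages[t] = running
    return advantages

rewards = np.array([0.0, 0.0, 1.0])
values = np.array([0.2, 0.4, 0.6])
print(get_gae_sketch(rewards, values))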

3
python/unitytrainers/trainer_controller.py


elif not self.trainers[brain_name].parameters["use_recurrent"]:
nodes += [scope + x for x in ["action", "value_estimate", "action_probs"]]
else:
nodes += [scope + x for x in ["action", "value_estimate", "action_probs", "recurrent_out"]]
node_list = ["action", "value_estimate", "action_probs", "recurrent_out"]
nodes += [scope + x for x in node_list]
if len(scopes) > 1:
self.logger.info("List of available scopes :")
for scope in scopes:

30
unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs


public string[] ObservationPlaceholderName;
/// Modify only in inspector : Name of the action node
public string ActionPlaceholderName = "action";
public string PreviousActionPlaceholderName = "prev_action";
#if ENABLE_TENSORFLOW
TFGraph graph;
TFSession session;

bool hasValue;
bool hasPrevAction;
int[] inputPrevAction;
List<float[,,,]> observationMatrixList;
float[,] inputOldMemories;
#endif

session = new TFSession(graph);
// TODO: Make this a loop over a dynamic set of graph inputs
if ((graphScope.Length > 1) && (graphScope[graphScope.Length - 1] != '/'))
{
graphScope = graphScope + '/';

if ((graph[graphScope + RecurrentInPlaceholderName] != null) && (graph[graphScope + RecurrentOutPlaceholderName] != null))
{
hasRecurrent = true;
}
if (graph[graphScope + StatePlacholderName] != null)
{

{
hasValue = true;
}
if (graph[graphScope + PreviousActionPlaceholderName] != null)
{
hasPrevAction = true;
}
}
#endif
}

List<float> state_list = states[k];
for (int j = 0; j < stateLength * brain.brainParameters.stackedStates; j++)
{
inputState[i, j] = state_list[j];
}
i++;

// Create the state tensor
if (hasPrevAction)
{
Dictionary<int, float[]> prevActions = brain.CollectActions();
inputPrevAction = new int[currentBatchSize];
var i = 0;
foreach (int k in agentKeys)
{
float[] action_list = prevActions[k];
inputPrevAction[i] = Mathf.FloorToInt(action_list[0]);
i++;
}
}
// Create the observation tensors
observationMatrixList = brain.GetObservationMatrixList(agentKeys);

{
runner.AddInput(graph[graphScope + StatePlacholderName][0], inputState);
}
}
// Create the previous action tensor
if (hasPrevAction)
{
runner.AddInput(graph[graphScope + PreviousActionPlaceholderName][0], inputPrevAction);
}
// Create the observation tensors

58
docs/images/basic.png
Diff too large to display.

1001
docs/images/hallway.png
Diff too large to display.

8
unity-environment/Assets/ML-Agents/Examples/Hallway.meta


fileFormatVersion: 2
guid: 595153db86e9a4835bdcd85155b706ab
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Material.meta


fileFormatVersion: 2
guid: 937f21b4457694172922753c3e1da0e3
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/FailGround.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: FailGround
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords: _EMISSION
m_LightmapFlags: 1
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 0, g: 0, b: 0, a: 1}
- _EmissionColor: {r: 1, g: 0.49264705, b: 0.49264705, a: 1}

10
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/FailGround.mat.meta


fileFormatVersion: 2
guid: afec5566ab8f64f288135f321697aff1
timeCreated: 1511408979
licenseType: Free
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 2100000
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Goal.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: Goal
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords: _EMISSION
m_LightmapFlags: 1
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 2800000, guid: 4e8dc69015456454b9e28364eb70ea8e, type: 3}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 2800000, guid: 4e8dc69015456454b9e28364eb70ea8e, type: 3}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 0, g: 0, b: 0, a: 1}
- _EmissionColor: {r: 1, g: 1, b: 1, a: 1}

10
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Goal.mat.meta


fileFormatVersion: 2
guid: 3891f240163794c6083c01266cd38351
timeCreated: 1506189863
licenseType: Pro
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 2100000
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Ground.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: Ground
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords:
m_LightmapFlags: 4
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0.5
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 1, g: 1, b: 1, a: 1}
- _EmissionColor: {r: 0, g: 0, b: 0, a: 1}

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Ground.mat.meta


fileFormatVersion: 2
guid: 8c1052704d10a471aae9b294328689cd
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 2100000
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Orange.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: Orange
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords: _EMISSION
m_LightmapFlags: 1
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 0, g: 0, b: 0, a: 1}
- _EmissionColor: {r: 0.849, g: 0.544531, b: 0, a: 1}

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Orange.mat.meta


fileFormatVersion: 2
guid: fd56382c5d1a049b6a909c184a4c6d42
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 0
userData:
assetBundleName:
assetBundleVariant:

9
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/PrototypeCheckerAlbedo.png

Width: 1024  |  Height: 1024  |  Size: 6.2 KiB

77
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/PrototypeCheckerAlbedo.png.meta


fileFormatVersion: 2
guid: 4e8dc69015456454b9e28364eb70ea8e
timeCreated: 1511408038
licenseType: Free
TextureImporter:
fileIDToRecycleName: {}
externalObjects: {}
serializedVersion: 4
mipmaps:
mipMapMode: 0
enableMipMap: 1
sRGBTexture: 1
linearTexture: 0
fadeOut: 0
borderMipMap: 0
mipMapsPreserveCoverage: 0
alphaTestReferenceValue: 0.5
mipMapFadeDistanceStart: 1
mipMapFadeDistanceEnd: 3
bumpmap:
convertToNormalMap: 0
externalNormalMap: 0
heightScale: 0.25
normalMapFilter: 0
isReadable: 0
grayScaleToAlpha: 0
generateCubemap: 6
cubemapConvolution: 0
seamlessCubemap: 0
textureFormat: 1
maxTextureSize: 2048
textureSettings:
serializedVersion: 2
filterMode: -1
aniso: -1
mipBias: -1
wrapU: -1
wrapV: -1
wrapW: -1
nPOTScale: 1
lightmap: 0
compressionQuality: 50
spriteMode: 0
spriteExtrude: 1
spriteMeshType: 1
alignment: 0
spritePivot: {x: 0.5, y: 0.5}
spriteBorder: {x: 0, y: 0, z: 0, w: 0}
spritePixelsToUnits: 100
alphaUsage: 1
alphaIsTransparency: 0
spriteTessellationDetail: -1
textureType: 0
textureShape: 1
maxTextureSizeSet: 0
compressionQualitySet: 0
textureFormatSet: 0
platformSettings:
- buildTarget: DefaultTexturePlatform
maxTextureSize: 2048
resizeAlgorithm: 0
textureFormat: -1
textureCompression: 1
compressionQuality: 50
crunchedCompression: 0
allowsAlphaSplitting: 0
overridden: 0
androidETC2FallbackOverride: 0
spriteSheet:
serializedVersion: 2
sprites: []
outline: []
physicsShape: []
spritePackingTag:
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Red.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: Red
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords: _EMISSION
m_LightmapFlags: 1
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 0, g: 0, b: 0, a: 1}
- _EmissionColor: {r: 0.75686276, g: 0.1254902, b: 0.20392157, a: 1}

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/Red.mat.meta


fileFormatVersion: 2
guid: 14324848a330c4217817001de3315e6e
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 2100000
userData:
assetBundleName:
assetBundleVariant:

76
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/SuccessGround.mat


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!21 &2100000
Material:
serializedVersion: 6
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_Name: SuccessGround
m_Shader: {fileID: 46, guid: 0000000000000000f000000000000000, type: 0}
m_ShaderKeywords: _EMISSION
m_LightmapFlags: 1
m_EnableInstancingVariants: 0
m_DoubleSidedGI: 0
m_CustomRenderQueue: -1
stringTagMap: {}
disabledShaderPasses: []
m_SavedProperties:
serializedVersion: 3
m_TexEnvs:
- _BumpMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailAlbedoMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailMask:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _DetailNormalMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _EmissionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MainTex:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _MetallicGlossMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _OcclusionMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
- _ParallaxMap:
m_Texture: {fileID: 0}
m_Scale: {x: 1, y: 1}
m_Offset: {x: 0, y: 0}
m_Floats:
- _BumpScale: 1
- _Cutoff: 0.5
- _DetailNormalMapScale: 1
- _DstBlend: 0
- _GlossMapScale: 1
- _Glossiness: 0
- _GlossyReflections: 1
- _Metallic: 0
- _Mode: 0
- _OcclusionStrength: 1
- _Parallax: 0.02
- _SmoothnessTextureChannel: 0
- _SpecularHighlights: 1
- _SrcBlend: 1
- _UVSec: 0
- _ZWrite: 1
m_Colors:
- _Color: {r: 0, g: 0, b: 0, a: 1}
- _EmissionColor: {r: 0.47861382, g: 0.7205882, b: 0.31260812, a: 1}

10
unity-environment/Assets/ML-Agents/Examples/Hallway/Material/SuccessGround.mat.meta


fileFormatVersion: 2
guid: b3effc31ad8e442268e6b1f77b06fc4d
timeCreated: 1511408979
licenseType: Free
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 2100000
userData:
assetBundleName:
assetBundleVariant:

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs.meta


fileFormatVersion: 2
guid: b84a628ee2ba849289722480171596cc
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

1001
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/HallwayArea.prefab
Diff too large to display.

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/HallwayArea.prefab.meta


fileFormatVersion: 2
guid: f3a451555dc514f46a69319857762eda
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 100100000
userData:
assetBundleName:
assetBundleVariant:

114
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/orangeBlock.prefab


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!1001 &100100000
Prefab:
m_ObjectHideFlags: 1
serializedVersion: 2
m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications: []
m_RemovedComponents: []
m_ParentPrefab: {fileID: 0}
m_RootGameObject: {fileID: 1519958416335468}
m_IsPrefabParent: 1
--- !u!1 &1519958416335468
GameObject:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
serializedVersion: 5
m_Component:
- component: {fileID: 4835573359625274}
- component: {fileID: 33314145500572160}
- component: {fileID: 65658361496764116}
- component: {fileID: 23329157610650020}
- component: {fileID: 54473675077019716}
m_Layer: 0
m_Name: orangeBlock
m_TagString: block
m_Icon: {fileID: 0}
m_NavMeshLayer: 0
m_StaticEditorFlags: 0
m_IsActive: 1
--- !u!4 &4835573359625274
Transform:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1519958416335468}
m_LocalRotation: {x: -0, y: -0, z: -0, w: 1}
m_LocalPosition: {x: 0, y: 2, z: 2}
m_LocalScale: {x: 1, y: 1, z: 1}
m_Children: []
m_Father: {fileID: 0}
m_RootOrder: 0
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!23 &23329157610650020
MeshRenderer:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1519958416335468}
m_Enabled: 1
m_CastShadows: 1
m_ReceiveShadows: 1
m_DynamicOccludee: 1
m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: fd56382c5d1a049b6a909c184a4c6d42, type: 2}
m_StaticBatchInfo:
firstSubMesh: 0
subMeshCount: 0
m_StaticBatchRoot: {fileID: 0}
m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0
m_StitchLightmapSeams: 0
m_SelectedEditorRenderState: 3
m_MinimumChartSize: 4
m_AutoUVMaxDistance: 0.5
m_AutoUVMaxAngle: 89
m_LightmapParameters: {fileID: 0}
m_SortingLayerID: 0
m_SortingLayer: 0
m_SortingOrder: 0
--- !u!33 &33314145500572160
MeshFilter:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1519958416335468}
m_Mesh: {fileID: 10202, guid: 0000000000000000e000000000000000, type: 0}
--- !u!54 &54473675077019716
Rigidbody:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1519958416335468}
serializedVersion: 2
m_Mass: 25
m_Drag: 1
m_AngularDrag: 0.05
m_UseGravity: 1
m_IsKinematic: 0
m_Interpolate: 0
m_Constraints: 80
m_CollisionDetection: 0
--- !u!65 &65658361496764116
BoxCollider:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1519958416335468}
m_Material: {fileID: 0}
m_IsTrigger: 0
m_Enabled: 1
serializedVersion: 2
m_Size: {x: 1, y: 1, z: 1}
m_Center: {x: 0, y: 0, z: 0}

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/orangeBlock.prefab.meta


fileFormatVersion: 2
guid: 833420600d9884ec0ad46d9a3388b6fd
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 100100000
userData:
assetBundleName:
assetBundleVariant:

114
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/violetBlock.prefab


%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!1001 &100100000
Prefab:
m_ObjectHideFlags: 1
serializedVersion: 2
m_Modification:
m_TransformParent: {fileID: 0}
m_Modifications: []
m_RemovedComponents: []
m_ParentPrefab: {fileID: 0}
m_RootGameObject: {fileID: 1656546804574634}
m_IsPrefabParent: 1
--- !u!1 &1656546804574634
GameObject:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
serializedVersion: 5
m_Component:
- component: {fileID: 4560479494974998}
- component: {fileID: 33309654106178894}
- component: {fileID: 65061098387865406}
- component: {fileID: 23942543559044578}
- component: {fileID: 54767578272624966}
m_Layer: 0
m_Name: violetBlock
m_TagString: block
m_Icon: {fileID: 0}
m_NavMeshLayer: 0
m_StaticEditorFlags: 0
m_IsActive: 1
--- !u!4 &4560479494974998
Transform:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1656546804574634}
m_LocalRotation: {x: -0, y: -0, z: -0, w: 1}
m_LocalPosition: {x: 0, y: 2, z: 2}
m_LocalScale: {x: 1, y: 1, z: 1}
m_Children: []
m_Father: {fileID: 0}
m_RootOrder: 0
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!23 &23942543559044578
MeshRenderer:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1656546804574634}
m_Enabled: 1
m_CastShadows: 1
m_ReceiveShadows: 1
m_DynamicOccludee: 1
m_MotionVectors: 1
m_LightProbeUsage: 1
m_ReflectionProbeUsage: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 14324848a330c4217817001de3315e6e, type: 2}
m_StaticBatchInfo:
firstSubMesh: 0
subMeshCount: 0
m_StaticBatchRoot: {fileID: 0}
m_ProbeAnchor: {fileID: 0}
m_LightProbeVolumeOverride: {fileID: 0}
m_ScaleInLightmap: 1
m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0
m_StitchLightmapSeams: 0
m_SelectedEditorRenderState: 3
m_MinimumChartSize: 4
m_AutoUVMaxDistance: 0.5
m_AutoUVMaxAngle: 89
m_LightmapParameters: {fileID: 0}
m_SortingLayerID: 0
m_SortingLayer: 0
m_SortingOrder: 0
--- !u!33 &33309654106178894
MeshFilter:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1656546804574634}
m_Mesh: {fileID: 10202, guid: 0000000000000000e000000000000000, type: 0}
--- !u!54 &54767578272624966
Rigidbody:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1656546804574634}
serializedVersion: 2
m_Mass: 25
m_Drag: 1
m_AngularDrag: 0.05
m_UseGravity: 1
m_IsKinematic: 0
m_Interpolate: 0
m_Constraints: 80
m_CollisionDetection: 0
--- !u!65 &65061098387865406
BoxCollider:
m_ObjectHideFlags: 1
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 100100000}
m_GameObject: {fileID: 1656546804574634}
m_Material: {fileID: 13400000, guid: 8c6374adc4d814c2eb5ecdfe810d813b, type: 2}
m_IsTrigger: 0
m_Enabled: 1
serializedVersion: 2
m_Size: {x: 1, y: 1, z: 1}
m_Center: {x: 0, y: 0, z: 0}

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Prefabs/violetBlock.prefab.meta


fileFormatVersion: 2
guid: 48e08825f37b1428b9794b433ed29a1d
NativeFormatImporter:
externalObjects: {}
mainObjectFileID: 100100000
userData:
assetBundleName:
assetBundleVariant:

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes.meta


fileFormatVersion: 2
guid: 6898a815458594782920cc6ce36e5a99
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

1001
unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes/Hallway.unity
Diff too large to display.

7
unity-environment/Assets/ML-Agents/Examples/Hallway/Scenes/Hallway.unity.meta


fileFormatVersion: 2
guid: d6d6a33ed0e18459a8d61817d600978a
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

8
unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts.meta


fileFormatVersion: 2
guid: c85133268635d484daae3cbfb68cc34d
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

8
unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels.meta


fileFormatVersion: 2
guid: c630ad4900acd4468a81be1cd65bb706
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

1001
unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.bytes
Diff too large to display.

7
unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/Hallway.bytes.meta


fileFormatVersion: 2
guid: 8aa65be485d2a408291f32ff56a9074e
TextScriptImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

1001
unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/hallway-1.bytes
Diff too large to display.

7
unity-environment/Assets/ML-Agents/Examples/Hallway/TFModels/hallway-1.bytes.meta


fileFormatVersion: 2
guid: 1b1bcb1bc80e84f0192f732cb0223525
TextScriptImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

11
unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAcademy.cs.meta


fileFormatVersion: 2
guid: 40db664a3061b46a0a0628f90b2264f7
MonoImporter:
externalObjects: {}
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

11
unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs.meta


fileFormatVersion: 2
guid: b446afae240924105b36d07e8d17a608
MonoImporter:
externalObjects: {}
serializedVersion: 2
defaultReferences: []
executionOrder: 0
icon: {instanceID: 0}
userData:
assetBundleName:
assetBundleVariant:

22
unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAcademy.cs


using System.Collections;
using System.Collections.Generic;
using UnityEngine;
public class HallwayAcademy : Academy {
public float agentRunSpeed;
public float agentRotationSpeed;
public Material goalScoredMaterial; //when a goal is scored the ground will use this material for a few seconds.
public Material failMaterial; //when fail, the ground will use this material for a few seconds.
public float gravityMultiplier; //use ~3 to make things less floaty
public override void InitializeAcademy()
{
Physics.gravity *= gravityMultiplier;
}
public override void AcademyReset()
{
}
}

195
unity-environment/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs


//Put this script on your blue cube.
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
public class HallwayAgent : Agent
{
public GameObject ground; //ground game object. we will use the area bounds to spawn the blocks
public GameObject area;
public GameObject goalA;
public GameObject goalB;
public GameObject orangeBlock; //the orange block we are going to be pushing
public GameObject violetBlock;
Rigidbody shortBlockRB; //cached on initialization
Rigidbody agentRB; //cached on initialization
Material groundMaterial; //cached on Awake()
Renderer groundRenderer;
HallwayAcademy academy;
int selection;
public override void InitializeAgent()
{
base.InitializeAgent();
academy = FindObjectOfType<HallwayAcademy>();
brain = FindObjectOfType<Brain>(); //only one brain in the scene so this should find our brain. BRAAAINS.
agentRB = GetComponent<Rigidbody>(); //cache the agent rigidbody
groundRenderer = ground.GetComponent<Renderer>(); //get the ground renderer so we can change the material when a goal is scored
groundMaterial = groundRenderer.material; //starting material
}
public List<float> RayPerception(List<float> state, float rayDistance,
float[] rayAngles, string[] detectableObjects, float height)
{
foreach (float angle in rayAngles)
{
float noise = 0f;
float noisyAngle = angle + Random.Range(-noise, noise);
Vector3 position = transform.TransformDirection(GiveCatersian(rayDistance, noisyAngle));
position.y = height;
Debug.DrawRay(transform.position, position, Color.red, 0.1f, true);
RaycastHit hit;
float[] subList = new float[detectableObjects.Length + 2];
if (Physics.SphereCast(transform.position, 1.0f, position, out hit, rayDistance))
{
for (int i = 0; i < detectableObjects.Length; i++)
{
if (hit.collider.gameObject.CompareTag(detectableObjects[i]))
{
subList[i] = 1;
subList[detectableObjects.Length + 1] = hit.distance / rayDistance;
break;
}
}
}
else
{
subList[detectableObjects.Length] = 1f;
}
state.AddRange(new List<float>(subList));
}
return state;
}
public Vector3 GiveCatersian(float radius, float angle)
{
float x = radius * Mathf.Cos(DegreeToRadian(angle));
float z = radius * Mathf.Sin(DegreeToRadian(angle));
return new Vector3(x, 1f, z);
}
public float DegreeToRadian(float degree)
{
return degree * Mathf.PI / 180f;
}
public override List<float> CollectState()
{
float rayDistance = 8.5f;
float[] rayAngles = { 0f, 45f, 90f, 135f, 180f };
string[] detectableObjects = { "goal", "orangeBlock", "redBlock", "wall" };
state = RayPerception(state, rayDistance, rayAngles, detectableObjects, 0f);
return state;
}
//swap ground material, wait time seconds, then swap back to the regular ground material.
IEnumerator GoalScoredSwapGroundMaterial(Material mat, float time)
{
groundRenderer.material = mat;
yield return new WaitForSeconds(time); // wait for the given number of seconds
groundRenderer.material = groundMaterial;
}
public void MoveAgent(float[] act)
{
Vector3 dirToGo = Vector3.zero;
Vector3 rotateDir = Vector3.zero;
//If we're using Continuous control you will need to change the Action
if (brain.brainParameters.actionSpaceType == StateType.continuous)
{
dirToGo = transform.forward * Mathf.Clamp(act[0], -1f, 1f);
rotateDir = transform.up * Mathf.Clamp(act[1], -1f, 1f);
}
else
{
int action = Mathf.FloorToInt(act[0]);
if (action == 0)
{
dirToGo = transform.forward * 1f;
}
else if (action == 1)
{
dirToGo = transform.forward * -1f;
}
else if (action == 2)
{
rotateDir = transform.up * 1f;
}
else if (action == 3)
{
rotateDir = transform.up * -1f;
}
}
transform.Rotate(rotateDir, Time.deltaTime * 100f);
agentRB.AddForce(dirToGo * academy.agentRunSpeed, ForceMode.VelocityChange); //GO
}
public override void AgentStep(float[] act)
{
reward -= 0.0003f;
MoveAgent(act); //perform agent actions
bool fail = false; // did the agent or block get pushed off the edge?
if (!Physics.Raycast(agentRB.position, Vector3.down, 20)) //if the agent has gone over the edge, we done.
{
fail = true; //fell off bro
reward -= 1f; // BAD AGENT
//transform.position = GetRandomSpawnPos(agentSpawnAreaBounds, agentSpawnArea);
done = true; //if we mark an agent as done it will be reset automatically. AgentReset() will be called.
}
if (fail)
{
StartCoroutine(GoalScoredSwapGroundMaterial(academy.failMaterial, .5f)); //swap ground material to indicate fail
}
}
// detect when we touch the goal
void OnCollisionEnter(Collision col)
{
if (col.gameObject.CompareTag("goal")) //touched goal
{
if ((selection == 0 && col.gameObject.name == "GoalA") || (selection == 1 && col.gameObject.name == "GoalB"))
{
reward += 1f; // +1 for reaching the correct goal
StartCoroutine(GoalScoredSwapGroundMaterial(academy.goalScoredMaterial, 2)); //swap ground material for a bit to indicate we scored.
}
else
{
reward -= 0.1f; // -0.1 for reaching the wrong goal
StartCoroutine(GoalScoredSwapGroundMaterial(academy.failMaterial, .5f)); //swap ground material to indicate fail
}
done = true; //if we mark an agent as done it will be reset automatically. AgentReset() will be called.
}
}
//In the editor, if "Reset On Done" is checked then AgentReset() will be called automatically anytime we mark done = true in an agent script.
public override void AgentReset()
{
selection = Random.Range(0, 2);
if (selection == 0)
{
orangeBlock.transform.position = new Vector3(0f + Random.Range(-3f, 3f), 2f, -15f + Random.Range(-5f, 5f)) + ground.transform.position;
violetBlock.transform.position = new Vector3(0f, -1000f, -15f + Random.Range(-5f, 5f)) + ground.transform.position;
}
else
{
orangeBlock.transform.position = new Vector3(0f, -1000f, -15f + Random.Range(-5f, 5f)) + ground.transform.position;
violetBlock.transform.position = new Vector3(0f, 2f, -15f + Random.Range(-5f, 5f)) + ground.transform.position;
}
transform.position = new Vector3(0f+ Random.Range(-3f, 3f), 1f, 0f + Random.Range(-5f, 5f)) + ground.transform.position;
transform.rotation = Quaternion.Euler(0f, Random.Range(0f, 360f), 0f);
agentRB.velocity *= 0f;
}
}

/images/banana.png → /docs/images/banana.png
