Merge remote-tracking branch 'origin/master' into develop-sidechannel-usability

5 years ago · 7f2e815a
--- a/com.unity.ml-agents/CHANGELOG.md
+++ b/com.unity.ml-agents/CHANGELOG.md
 - `DecisionRequester` has been made internal (you can still use the DecisionRequesterComponent from the inspector). `RepeatAction` was renamed `TakeActionsBetweenDecisions` for clarity. (#3555)
 - The `IFloatProperties` interface has been removed.
 - Fix #3579.
+ - Fixed an issue when using GAIL with fewer than `batch_size` demonstrations. (#3591)

 ## [0.14.1-preview] - 2020-02-25

--- a/docs/Python-API.md
+++ b/docs/Python-API.md
 allows you to interact directly with a Unity Environment (`mlagents_envs`) and
 an entry point to train (`mlagents-learn`) which allows you to train agents in
 Unity Environments using our implementations of reinforcement learning or
-imitation learning.
+imitation learning. This document describes how to use the `mlagents_envs` API.
+For information on using `mlagents-learn`, see [here](Training-ML-Agents.md).
-You can use the Python Low Level API to interact directly with your learning
-environment, and use it to develop new learning algorithms.
+The Python Low Level API can be used to interact directly with your Unity learning environment.
+As such, it can serve as the basis for developing and evaluating new learning algorithms.

 ## mlagents_envs

 Python-side communication happens through `UnityEnvironment` which is located in
 [`environment.py`](../ml-agents-envs/mlagents_envs/environment.py). To load
 a Unity environment from a built binary file, put the file in the same directory
-as `envs`. For example, if the filename of your Unity environment is 3DBall.app, in python, run:
+as `envs`. For example, if the filename of your Unity environment is `3DBall`, in Python, run:

 ```python
 from mlagents_envs.environment import UnityEnvironment
 `discrete_action_branches = (3,2,)`)


-### Modifying the environment from Python
-The Environment can be modified by using side channels to send data to the
-environment. When creating the environment, pass a list of side channels as
-`side_channels` argument to the constructor.
+### Communicating additional information with the Environment
+In addition to the means of communicating between Unity and Python described above,
+we also provide methods for sharing agent-agnostic information. These
+additional methods are referred to as side channels. ML-Agents includes two ready-made
+side channels, described below. It is also possible to create custom side channels to
+communicate any additional data between a Unity environment and Python. Instructions for
+creating custom side channels can be found [here](Custom-SideChannels.md).
+
+Side channels exist as separate classes which are instantiated and then passed as a list to the `side_channels` argument of the constructor of the `UnityEnvironment` class.
+
+```python
+channel = MyChannel()
+
+env = UnityEnvironment(side_channels = [channel])
+```
-__Note__ : A side channel will only send/receive messages when `env.step` is
+__Note__ : A side channel will only send/receive messages when `env.step()` or `env.reset()` is
-An `EngineConfiguration` will allow you to modify the time scale and graphics quality of the Unity engine.
+The `EngineConfiguration` side channel allows you to modify the time-scale, resolution, and graphics quality of the environment. This can be useful for adjusting the environment to perform better during training, or be more interpretable during inference.
+
- * `set_configuration_parameters` with arguments
-   * width: Defines the width of the display. Default 80.
-   * height: Defines the height of the display. Default 80.
-   * quality_level: Defines the quality level of the simulation. Default 1.
-   * time_scale: Defines the multiplier for the deltatime in the simulation. If set to a higher value, time will pass faster in the simulation but the physics might break. Default 20.
-   *  target_frame_rate: Instructs simulation to try to render at a specified frame rate. Default -1.
+ * `set_configuration_parameters` which takes the following arguments:
+   * `width`: Defines the width of the display. Default 80.
+   * `height`: Defines the height of the display. Default 80.
+   * `quality_level`: Defines the quality level of the simulation. Default 1.
+   * `time_scale`: Defines the multiplier for the deltatime in the simulation. If set to a higher value, time will pass faster in the simulation but the physics may perform unpredictably. Default 20.
+   *  `target_frame_rate`: Instructs simulation to try to render at a specified frame rate. Default -1.
-For example :
+For example, the following code would adjust the time-scale of the simulation to be 2x realtime.
+
 ```python
 from mlagents_envs.environment import UnityEnvironment
 from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
 ```

 #### FloatPropertiesChannel
-A `FloatPropertiesChannel` will allow you to get and set float properties
-in the environment. You can call get_property and set_property on the
-side channel to read and write properties.
+The `FloatPropertiesChannel` will allow you to get and set pre-defined numerical values in the environment. This can be useful for adjusting environment-specific settings, or for reading non-agent related information from the environment. You can call `get_property` and `set_property` on the side channel to read and write properties.
+
 `FloatPropertiesChannel` has three methods:

 * `set_property` Sets a property in the Unity Environment.
 channel.set_property("parameter_1", 2.0)

 i = env.reset()
+
+readout_value = channel.get_property("parameter_2")
 ...
 ```

 float property1 = sharedProperties.GetPropertyWithDefault("parameter_1", 0.0f);
 ```

-#### [Advanced] Create your own SideChannel
-
-You can create your own `SideChannel` in C# and Python and use it to communicate data between the two.
-
-##### Unity side
-The side channel will have to implement the `SideChannel` abstract class and the following method.
-
- * `OnMessageReceived(byte[] data)` : You must implement this method to specify what the side channel will be doing
- with the data received from Python. The data is a `byte[]` argument.
-
-The side channel must also assign a `ChannelId` property in the constructor. The `ChannelId` is a Guid
-(or UUID in Python) used to uniquely identify a side channel. This Guid must be the same on C# and Python.
-There can only be one side channel of a certain id during communication.
-
-To send a byte array from C# to Python, call the `base.QueueMessageToSend(data)` method inside the side channel.
-The `data` argument must be a `byte[]`.
-
-To register a side channel on the Unity side, call `Academy.Instance.RegisterSideChannel` with the side channel
-as only argument.
-
-##### Python side
-The side channel will have to implement the `SideChannel` abstract class. You must implement :
-
- * `on_message_received(self, data: bytes) -> None` : You must implement this method to specify what the
- side channel will be doing with the data received from Unity. The data is a `byte[]` argument.
-
-The side channel must also assign a `channel_id` property in the constructor. The `channel_id` is a UUID
-(referred in C# as Guid) used to uniquely identify a side channel. This number must be the same on C# and
-Python. There can only be one side channel of a certain id during communication.
-
-To assign the `channel_id` call the abstract class constructor with the appropriate `channel_id` as follows:
-
-```python
-super().__init__(my_channel_id)
-```
-
-To send a byte array from Python to C#, call the `super().queue_message_to_send(bytes_data)` method inside the
-side channel. The `bytes_data` argument must be a `bytes` object.
-
-To register a side channel on the Python side, pass the side channel as argument when creating the
-`UnityEnvironment` object. One of the arguments of the constructor (`side_channels`) is a list of side channels.
-
-##### Example implementation
-
-Here is a simple implementation of a Side Channel that will exchange strings between C# and Python
-(encoded as ascii).
-
-One the C# side :
-Here is an implementation of a `StringLogSideChannel` that will listed to the `UnityEngine.Debug.LogError` calls in
-the game :
-
-```csharp
-using UnityEngine;
-using MLAgents;
-using System.Text;
-using System;
-
-public class StringLogSideChannel : SideChannel
-{
-    public StringLogSideChannel()
-    {
-        ChannelId = new Guid("621f0a70-4f87-11ea-a6bf-784f4387d1f7");
-    }
-
-    public override void OnMessageReceived(byte[] data)
-    {
-        var receivedString = Encoding.ASCII.GetString(data);
-        Debug.Log("From Python : " + receivedString);
-    }
-
-    public void SendDebugStatementToPython(string logString, string stackTrace, LogType type)
-    {
-        if (type == LogType.Error)
-        {
-            var stringToSend = type.ToString() + ": " + logString + "\n" + stackTrace;
-            var encodedString = Encoding.ASCII.GetBytes(stringToSend);
-            base.QueueMessageToSend(encodedString);
-        }
-    }
-}
-```
-
-We also need to register this side channel to the Academy and to the `Application.logMessageReceived` events,
-so we write a simple MonoBehavior for this. (Do not forget to attach it to a GameObject in the scene).
-
-```csharp
-using UnityEngine;
-using MLAgents;
-
-
-public class RegisterStringLogSideChannel : MonoBehaviour
-{
-
-    StringLogSideChannel stringChannel;
-    public void Awake()
-    {
-        // We create the Side Channel
-        stringChannel = new StringLogSideChannel();
-
-        // When a Debug.Log message is created, we send it to the stringChannel
-        Application.logMessageReceived += stringChannel.SendDebugStatementToPython;
-
-        // Just in case the Academy has not yet initialized
-        Academy.Instance.RegisterSideChannel(stringChannel);
-    }
-
-    public void OnDestroy()
-    {
-        // De-register the Debug.Log callback
-        Application.logMessageReceived -= stringChannel.SendDebugStatementToPython;
-        if (Academy.IsInitialized){
-            Academy.Instance.UnregisterSideChannel(stringChannel);
-        }
-    }
-
-    public void Update()
-    {
-        // Optional : If the space bar is pressed, raise an error !
-        if (Input.GetKeyDown(KeyCode.Space))
-        {
-            Debug.LogError("This is a fake error. Space bar was pressed in Unity.");
-        }
-    }
-}
-```
-
-And here is the script on the Python side. This script creates a new Side channel type (`StringLogChannel`) and
-launches a `UnityEnvironment` with that side channel.
-
-```python
-
-from mlagents_envs.environment import UnityEnvironment
-from mlagents_envs.side_channel.side_channel import SideChannel
-import numpy as np
-
-
-# Create the StringLogChannel class
-class StringLogChannel(SideChannel):
-
-    def __init__(self) -> None:
-        super().__init__(uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7"))
+#### Custom side channels
-    def on_message_received(self, data: bytes) -> None:
-        """
-        Note :We must implement this method of the SideChannel interface to
-        receive messages from Unity
-        """
-        # We simply print the data received interpreted as ascii
-        print(data.decode("ascii"))
-
-    def send_string(self, data: str) -> None:
-        # Convert the string to ascii
-        bytes_data = data.encode("ascii")
-        # We call this method to queue the data we want to send
-        super().queue_message_to_send(bytes_data)
-
-# Create the channel
-string_log = StringLogChannel()
-
-# We start the communication with the Unity Editor and pass the string_log side channel as input
-env = UnityEnvironment(base_port=UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels=[string_log])
-env.reset()
-string_log.send_string("The environment was reset")
-
-group_name = env.get_agent_groups()[0]  # Get the first group_name
-for i in range(1000):
-    step_data = env.get_step_result(group_name)
-    n_agents = step_data.n_agents()  # Get the number of agents
-    # We send data to Unity : A string with the number of Agent at each
-    string_log.send_string(
-        "Step " + str(i) + " occurred with " + str(n_agents) + " agents."
-    )
-    env.step()  # Move the simulation forward
-
-env.close()
-```
-
-Now, if you run this script and press `Play` the Unity Editor when prompted, The console in the Unity Editor will
-display a message at every Python step. Additionally, if you press the Space Bar in the Unity Engine, a message will
-appear in the terminal.
+For information on how to make custom side channels for sending additional data types, see the documentation [here](Custom-SideChannels.md).
--- a/docs/Readme.md
+++ b/docs/Readme.md
  * [Using the Monitor](Feature-Monitor.md)
  * [Using the Video Recorder](https://github.com/Unity-Technologies/video-recorder)
  * [Using an Executable Environment](Learning-Environment-Executable.md)
+  * [Creating Custom Side Channels](Custom-SideChannels.md)

 ## Training

--- a/ml-agents/mlagents/trainers/components/reward_signals/__init__.py
+++ b/ml-agents/mlagents/trainers/components/reward_signals/__init__.py

 from mlagents.trainers.exception import UnityTrainerException
 from mlagents.trainers.policy.tf_policy import TFPolicy
+from mlagents.trainers.buffer import AgentBuffer

 logger = logging.getLogger("mlagents.trainers")

        self.strength = strength
        self.stats_name_to_update_name: Dict[str, str] = {}

-    def evaluate_batch(self, mini_batch: Dict[str, np.array]) -> RewardSignalResult:
+    def evaluate_batch(self, mini_batch: AgentBuffer) -> RewardSignalResult:
        """
        Evaluates the reward for the data present in the Dict mini_batch. Use this when evaluating a reward
        function drawn straight from a Buffer.
        )

    def prepare_update(
-        self, policy: TFPolicy, mini_batch: Dict[str, np.ndarray], num_sequences: int
+        self, policy: TFPolicy, mini_batch: AgentBuffer, num_sequences: int
    ) -> Dict[tf.Tensor, Any]:
        """
        If the reward signal has an internal model (e.g. GAIL or Curiosity), get the feed_dict
--- a/ml-agents/mlagents/trainers/components/reward_signals/curiosity/signal.py
+++ b/ml-agents/mlagents/trainers/components/reward_signals/curiosity/signal.py
 from mlagents.trainers.components.reward_signals import RewardSignal, RewardSignalResult
 from mlagents.trainers.components.reward_signals.curiosity.model import CuriosityModel
 from mlagents.trainers.policy.tf_policy import TFPolicy
+from mlagents.trainers.buffer import AgentBuffer


 class CuriosityRewardSignal(RewardSignal):
        }
        self.has_updated = False

-    def evaluate_batch(self, mini_batch: Dict[str, np.array]) -> RewardSignalResult:
+    def evaluate_batch(self, mini_batch: AgentBuffer) -> RewardSignalResult:
        feed_dict: Dict[tf.Tensor, Any] = {
            self.policy.batch_size_ph: len(mini_batch["actions"]),
            self.policy.sequence_length_ph: self.policy.sequence_length,
        super().check_config(config_dict, param_keys)

    def prepare_update(
-        self, policy: TFPolicy, mini_batch: Dict[str, np.ndarray], num_sequences: int
+        self, policy: TFPolicy, mini_batch: AgentBuffer, num_sequences: int
    ) -> Dict[tf.Tensor, Any]:
        """
        Prepare for update and get feed_dict.
--- a/ml-agents/mlagents/trainers/components/reward_signals/extrinsic/signal.py
+++ b/ml-agents/mlagents/trainers/components/reward_signals/extrinsic/signal.py
 import numpy as np

 from mlagents.trainers.components.reward_signals import RewardSignal, RewardSignalResult
+from mlagents.trainers.buffer import AgentBuffer


 class ExtrinsicRewardSignal(RewardSignal):
        param_keys = ["strength", "gamma"]
        super().check_config(config_dict, param_keys)

-    def evaluate_batch(self, mini_batch: Dict[str, np.array]) -> RewardSignalResult:
+    def evaluate_batch(self, mini_batch: AgentBuffer) -> RewardSignalResult:
        env_rews = np.array(mini_batch["environment_rewards"], dtype=np.float32)
        return RewardSignalResult(self.strength * env_rews, env_rews)
--- a/ml-agents/mlagents/trainers/components/reward_signals/gail/signal.py
+++ b/ml-agents/mlagents/trainers/components/reward_signals/gail/signal.py
 from mlagents.trainers.policy.tf_policy import TFPolicy
 from .model import GAILModel
 from mlagents.trainers.demo_loader import demo_to_buffer
+from mlagents.trainers.buffer import AgentBuffer


 class GAILRewardSignal(RewardSignal):
            "Policy/GAIL Expert Estimate": "gail_expert_estimate",
        }

-    def evaluate_batch(self, mini_batch: Dict[str, np.array]) -> RewardSignalResult:
+    def evaluate_batch(self, mini_batch: AgentBuffer) -> RewardSignalResult:
        feed_dict: Dict[tf.Tensor, Any] = {
            self.policy.batch_size_ph: len(mini_batch["actions"]),
            self.policy.sequence_length_ph: self.policy.sequence_length,
        super().check_config(config_dict, param_keys)

    def prepare_update(
-        self, policy: TFPolicy, mini_batch: Dict[str, np.ndarray], num_sequences: int
+        self, policy: TFPolicy, mini_batch: AgentBuffer, num_sequences: int
    ) -> Dict[tf.Tensor, Any]:
        """
        Prepare inputs for update. .
        """
-        max_num_experiences = min(
-            len(mini_batch["actions"]), self.demonstration_buffer.num_experiences
-        )
-        # If num_sequences is less, we need to shorten the input batch.
-        for key, element in mini_batch.items():
-            mini_batch[key] = element[:max_num_experiences]
-
-        # Get batch from demo buffer
+        # Get batch from demo buffer. Even if demo buffer is smaller, we sample with replacement
-            len(mini_batch["actions"]), 1
+            mini_batch.num_experiences, 1
        )

        feed_dict: Dict[tf.Tensor, Any] = {
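
Reviewer note: the hunk above drops the truncation of `mini_batch` and relies on sampling the demonstration buffer with replacement. The following is a minimal, standalone sketch of that idea only. It does not use `AgentBuffer`; the `sample_mini_batch` function, the plain-dict buffer layout, and the `rng` parameter are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def sample_mini_batch(
    demo_buffer: Dict[str, List[float]],
    batch_size: int,
    rng: Optional[random.Random] = None,
) -> Dict[str, List[float]]:
    """Draw batch_size aligned rows from demo_buffer, with replacement.

    Hypothetical helper: a plain dict of equal-length lists stands in for
    the real demonstration buffer.
    """
    rng = rng or random.Random()
    num_experiences = len(next(iter(demo_buffer.values())))
    # Indices may repeat, so the batch can be larger than the buffer itself.
    indices = [rng.randrange(num_experiences) for _ in range(batch_size)]
    # The same indices are used for every key so that rows stay aligned.
    return {key: [values[i] for i in indices] for key, values in demo_buffer.items()}


# Three demonstrations, but a batch of eight is requested.
demo = {"actions": [0.1, 0.2, 0.3], "rewards": [1.0, 0.0, 1.0]}
batch = sample_mini_batch(demo, batch_size=8, rng=random.Random(0))
print(len(batch["actions"]))
```

Because indices are drawn independently, the returned batch always has `batch_size` rows, so the policy mini-batch no longer needs to be shortened to match a small demonstration set.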
--- a/ml-agents/mlagents/trainers/sac/optimizer.py
+++ b/ml-agents/mlagents/trainers/sac/optimizer.py
        return update_stats

    def update_reward_signals(
-        self, reward_signal_minibatches: Mapping[str, Dict], num_sequences: int
+        self, reward_signal_minibatches: Mapping[str, AgentBuffer], num_sequences: int
    ) -> Dict[str, float]:
        """
        Only update the reward signals.
        feed_dict: Dict[tf.Tensor, Any],
        update_dict: Dict[str, tf.Tensor],
        stats_needed: Dict[str, str],
-        reward_signal_minibatches: Mapping[str, Dict],
+        reward_signal_minibatches: Mapping[str, AgentBuffer],
        num_sequences: int,
    ) -> None:
        """
--- a/ml-agents/mlagents/trainers/tests/test_ppo.py
+++ b/ml-agents/mlagents/trainers/tests/test_ppo.py
 from mlagents.trainers.tests import mock_brain as mb
 from mlagents.trainers.tests.mock_brain import make_brain_parameters
 from mlagents.trainers.tests.test_trajectory import make_fake_trajectory
+from mlagents.trainers.tests.test_reward_signals import (  # noqa: F401; pylint: disable=unused-variable
+    curiosity_dummy_config,
+    gail_dummy_config,
+)


@pytest.fixture
    optimizer.update(
        update_buffer,
        num_sequences=update_buffer.num_experiences // dummy_config["sequence_length"],
+    )
+
+
+@pytest.mark.parametrize("discrete", [True, False], ids=["discrete", "continuous"])
+@pytest.mark.parametrize("visual", [True, False], ids=["visual", "vector"])
+@pytest.mark.parametrize("rnn", [True, False], ids=["rnn", "no_rnn"])
+# We need to test this separately from test_reward_signals.py to ensure no interactions
+def test_ppo_optimizer_update_curiosity(
+    curiosity_dummy_config, dummy_config, rnn, visual, discrete  # noqa: F811
+):
+    # Test evaluate
+    tf.reset_default_graph()
+    dummy_config["reward_signals"].update(curiosity_dummy_config)
+    optimizer = _create_ppo_optimizer_ops_mock(
+        dummy_config, use_rnn=rnn, use_discrete=discrete, use_visual=visual
+    )
+    # Test update
+    update_buffer = mb.simulate_rollout(BUFFER_INIT_SAMPLES, optimizer.policy.brain)
+    # Mock out reward signal eval
+    update_buffer["advantages"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_returns"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_value_estimates"] = update_buffer["environment_rewards"]
+    update_buffer["curiosity_returns"] = update_buffer["environment_rewards"]
+    update_buffer["curiosity_value_estimates"] = update_buffer["environment_rewards"]
+    optimizer.update(
+        update_buffer,
+        num_sequences=update_buffer.num_experiences // optimizer.policy.sequence_length,
+    )
+
+
+# We need to test this separately from test_reward_signals.py to ensure no interactions
+def test_ppo_optimizer_update_gail(gail_dummy_config, dummy_config):  # noqa: F811
+    # Test evaluate
+    tf.reset_default_graph()
+    dummy_config["reward_signals"].update(gail_dummy_config)
+    optimizer = _create_ppo_optimizer_ops_mock(
+        dummy_config, use_rnn=False, use_discrete=False, use_visual=False
+    )
+    # Test update
+    update_buffer = mb.simulate_rollout(BUFFER_INIT_SAMPLES, optimizer.policy.brain)
+    # Mock out reward signal eval
+    update_buffer["advantages"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_returns"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_value_estimates"] = update_buffer["environment_rewards"]
+    update_buffer["gail_returns"] = update_buffer["environment_rewards"]
+    update_buffer["gail_value_estimates"] = update_buffer["environment_rewards"]
+    optimizer.update(
+        update_buffer,
+        num_sequences=update_buffer.num_experiences // optimizer.policy.sequence_length,
+    )
+
+    # Check if buffer size is too big
+    update_buffer = mb.simulate_rollout(3000, optimizer.policy.brain)
+    # Mock out reward signal eval
+    update_buffer["advantages"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_returns"] = update_buffer["environment_rewards"]
+    update_buffer["extrinsic_value_estimates"] = update_buffer["environment_rewards"]
+    update_buffer["gail_returns"] = update_buffer["environment_rewards"]
+    update_buffer["gail_value_estimates"] = update_buffer["environment_rewards"]
+    optimizer.update(
+        update_buffer,
+        num_sequences=update_buffer.num_experiences // optimizer.policy.sequence_length,
    )


--- a/docs/Custom-SideChannels.md
+++ b/docs/Custom-SideChannels.md
+# Custom Side Channels
+
+You can create your own side channel in C# and Python and use it to communicate
+custom data structures between the two. This can be useful for situations in
+which the data to be sent is too complex or structured for the built-in
+`FloatPropertiesChannel`, or is not related to any specific agent, and therefore
+inappropriate as an agent observation.
+
+## Overview
+
+In order to use a side channel, it must be implemented as both Unity and Python classes.
+
+### Unity side
+The side channel will have to implement the `SideChannel` abstract class and the following method.
+
+ * `OnMessageReceived(byte[] data)` : You must implement this method to specify what the side channel will be doing
+ with the data received from Python. The data is a `byte[]` argument.
+
+The side channel must also assign a `ChannelId` property in the constructor. The `ChannelId` is a Guid
+(or UUID in Python) used to uniquely identify a side channel. This Guid must be the same on C# and Python.
+There can only be one side channel of a certain id during communication.
+
+To send a byte array from C# to Python, call the `base.QueueMessageToSend(data)` method inside the side channel.
+The `data` argument must be a `byte[]`.
+
+To register a side channel on the Unity side, call `Academy.Instance.RegisterSideChannel` with the side channel
+as the only argument.
+
+### Python side
+The side channel will have to implement the `SideChannel` abstract class. You must implement :
+
+ * `on_message_received(self, data: bytes) -> None` : You must implement this method to specify what the
+ side channel will be doing with the data received from Unity. The data is a `bytes` argument.
+
+The side channel must also assign a `channel_id` property in the constructor. The `channel_id` is a UUID
+(referred to in C# as a Guid) used to uniquely identify a side channel. This UUID must be the same on C# and
+Python. There can only be one side channel of a certain id during communication.
+
+To assign the `channel_id` call the abstract class constructor with the appropriate `channel_id` as follows:
+
+```python
+super().__init__(my_channel_id)
+```
+
+To send a byte array from Python to C#, call the `super().queue_message_to_send(bytes_data)` method inside the
+side channel. The `bytes_data` argument must be a `bytes` object.
+
+To register a side channel on the Python side, pass the side channel as an argument when creating the
+`UnityEnvironment` object. One of the arguments of the constructor (`side_channels`) is a list of side channels.
+
+## Example implementation
+
+Below is a simple implementation of a side channel that will exchange ASCII-encoded
+strings between a Unity environment and Python.
+
+### Example Unity C# code
+
+The first step is to create the `StringLogSideChannel` class within the Unity project.
+Here is an implementation of a `StringLogSideChannel` that will listen for messages
+from Python and print them to the Unity debug log, as well as send error messages
+from Unity to Python.
+
+```csharp
+using UnityEngine;
+using MLAgents;
+using MLAgents.SideChannels;
+using System.Text;
+using System;
+
+public class StringLogSideChannel : SideChannel
+{
+    public StringLogSideChannel()
+    {
+        ChannelId = new Guid("621f0a70-4f87-11ea-a6bf-784f4387d1f7");
+    }
+
+    public override void OnMessageReceived(byte[] data)
+    {
+        var receivedString = Encoding.ASCII.GetString(data);
+        Debug.Log("From Python : " + receivedString);
+    }
+
+    public void SendDebugStatementToPython(string logString, string stackTrace, LogType type)
+    {
+        if (type == LogType.Error)
+        {
+            var stringToSend = type.ToString() + ": " + logString + "\n" + stackTrace;
+            var encodedString = Encoding.ASCII.GetBytes(stringToSend);
+            base.QueueMessageToSend(encodedString);
+        }
+    }
+}
+```
+
+Once we have defined our custom side channel class, we need to ensure that it is
+instantiated and registered. This can typically be done wherever the logic of
+the side channel makes sense to be associated, for example on a MonoBehaviour
+object that might need to access data from the side channel. Here we show a
+simple MonoBehaviour object which instantiates and registers the new side
+channel. If you have not done it already, make sure that the MonoBehaviour
+which registers the side channel is attached to a GameObject which will
+be live in your Unity scene.
+
+```csharp
+using UnityEngine;
+using MLAgents;
+
+
+public class RegisterStringLogSideChannel : MonoBehaviour
+{
+
+    StringLogSideChannel stringChannel;
+    public void Awake()
+    {
+        // We create the Side Channel
+        stringChannel = new StringLogSideChannel();
+
+        // When a Debug.Log message is created, we send it to the stringChannel
+        Application.logMessageReceived += stringChannel.SendDebugStatementToPython;
+
+        // The channel must be registered with the Academy
+        Academy.Instance.RegisterSideChannel(stringChannel);
+    }
+
+    public void OnDestroy()
+    {
+        // De-register the Debug.Log callback
+        Application.logMessageReceived -= stringChannel.SendDebugStatementToPython;
+        if (Academy.IsInitialized){
+            Academy.Instance.UnregisterSideChannel(stringChannel);
+        }
+    }
+
+    public void Update()
+    {
+        // Optional : If the space bar is pressed, raise an error !
+        if (Input.GetKeyDown(KeyCode.Space))
+        {
+            Debug.LogError("This is a fake error. Space bar was pressed in Unity.");
+        }
+    }
+}
+```
+
+### Example Python code
+
+Now that we have created the necessary Unity C# classes, we can create their Python counterparts.
+
+```python
+from mlagents_envs.environment import UnityEnvironment
+from mlagents_envs.side_channel.side_channel import SideChannel
+import numpy as np
+import uuid
+
+
+# Create the StringLogChannel class
+class StringLogChannel(SideChannel):
+
+    def __init__(self) -> None:
+        super().__init__(uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7"))
+
+    def on_message_received(self, data: bytes) -> None:
+        """
+        Note: We must implement this method of the SideChannel interface to
+        receive messages from Unity
+        """
+        # We simply print the data received interpreted as ascii
+        print(data.decode("ascii"))
+
+    def send_string(self, data: str) -> None:
+        # Convert the string to ascii
+        bytes_data = data.encode("ascii")
+        # We call this method to queue the data we want to send
+        super().queue_message_to_send(bytes_data)
+```
+
+
+We can then instantiate the new side channel,
+launch a `UnityEnvironment` with that side channel active, and send a series of
+messages to the Unity environment from Python using it.
+
+```python
+# Create the channel
+string_log = StringLogChannel()
+
+# We start the communication with the Unity Editor and pass the string_log side channel as input
+env = UnityEnvironment(base_port=UnityEnvironment.DEFAULT_EDITOR_PORT, side_channels=[string_log])
+env.reset()
+string_log.send_string("The environment was reset")
+
+group_name = env.get_agent_groups()[0]  # Get the first group_name
+for i in range(1000):
+    step_data = env.get_step_result(group_name)
+    n_agents = step_data.n_agents()  # Get the number of agents
+    # We send data to Unity: a string with the number of agents at each step
+    string_log.send_string(
+        "Step " + str(i) + " occurred with " + str(n_agents) + " agents."
+    )
+    env.step()  # Move the simulation forward
+
+env.close()
+```
+
+Now, if you run this script and press `Play` in the Unity Editor when prompted,
+the console in the Unity Editor will display a message at every Python step.
+Additionally, if you press the Space Bar in the Unity Engine, a message will
+appear in the terminal.
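
Reviewer note: the new document requires the same id on both sides and sends ASCII-encoded bytes. The sketch below shows only that contract and runs without Unity or `mlagents_envs`. `LoopbackSideChannel` and `deliver` are hypothetical stand-ins written for this note, not part of the ML-Agents API; the method names simply mirror the ones used in the documentation.

```python
import uuid
from typing import List


class LoopbackSideChannel:
    """Hypothetical stand-in for a side channel; not the mlagents_envs class."""

    def __init__(self, channel_id: uuid.UUID) -> None:
        self.channel_id = channel_id
        self.outgoing: List[bytes] = []
        self.received: List[str] = []

    def queue_message_to_send(self, data: bytes) -> None:
        # Messages are only queued here; delivery happens later.
        self.outgoing.append(data)

    def on_message_received(self, data: bytes) -> None:
        # Interpret the incoming bytes as ASCII text.
        self.received.append(data.decode("ascii"))

    def send_string(self, text: str) -> None:
        self.queue_message_to_send(text.encode("ascii"))


def deliver(sender: LoopbackSideChannel, receiver: LoopbackSideChannel) -> None:
    """Hypothetical transport: flush queued messages between matching channels."""
    if sender.channel_id != receiver.channel_id:
        raise ValueError("channel ids do not match")
    while sender.outgoing:
        receiver.on_message_received(sender.outgoing.pop(0))


CHANNEL_ID = uuid.UUID("621f0a70-4f87-11ea-a6bf-784f4387d1f7")
python_side = LoopbackSideChannel(CHANNEL_ID)
unity_side = LoopbackSideChannel(CHANNEL_ID)

python_side.send_string("The environment was reset")
deliver(python_side, unity_side)
print(unity_side.received)
```

Note that `str.encode("ascii")` raises `UnicodeEncodeError` for non-ASCII characters, so a channel that must carry arbitrary text would need a different encoding agreed on by both sides.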