
[refactor] Move output artifacts to a single results/ folder (#3829)

/whitepaper-experiments
GitHub, 4 years ago
Current commit: 232519e4
33 files changed, with 192 insertions and 172 deletions
  1. .gitignore (4)
  2. com.unity.ml-agents/CHANGELOG.md (3)
  3. docs/Getting-Started.md (5)
  4. docs/Learning-Environment-Executable.md (5)
  5. docs/Migrating.md (2)
  6. docs/Training-ML-Agents.md (7)
  7. docs/Training-PPO.md (2)
  8. docs/Training-SAC.md (2)
  9. docs/Using-Tensorboard.md (2)
  10. ml-agents-envs/mlagents_envs/environment.py (30)
  11. ml-agents-envs/mlagents_envs/tests/test_envs.py (12)
  12. ml-agents/mlagents/trainers/learn.py (60)
  13. ml-agents/mlagents/trainers/policy/tf_policy.py (5)
  14. ml-agents/mlagents/trainers/ppo/trainer.py (3)
  15. ml-agents/mlagents/trainers/sac/trainer.py (7)
  16. ml-agents/mlagents/trainers/tests/test_barracuda_converter.py (6)
  17. ml-agents/mlagents/trainers/tests/test_bcmodule.py (2)
  18. ml-agents/mlagents/trainers/tests/test_ghost.py (9)
  19. ml-agents/mlagents/trainers/tests/test_learn.py (17)
  20. ml-agents/mlagents/trainers/tests/test_nn_policy.py (8)
  21. ml-agents/mlagents/trainers/tests/test_policy.py (2)
  22. ml-agents/mlagents/trainers/tests/test_ppo.py (9)
  23. ml-agents/mlagents/trainers/tests/test_reward_signals.py (2)
  24. ml-agents/mlagents/trainers/tests/test_rl_trainer.py (2)
  25. ml-agents/mlagents/trainers/tests/test_sac.py (14)
  26. ml-agents/mlagents/trainers/tests/test_simple_rl.py (6)
  27. ml-agents/mlagents/trainers/tests/test_trainer_controller.py (6)
  28. ml-agents/mlagents/trainers/tests/test_trainer_util.py (60)
  29. ml-agents/mlagents/trainers/trainer/trainer.py (3)
  30. ml-agents/mlagents/trainers/trainer_controller.py (20)
  31. ml-agents/mlagents/trainers/trainer_util.py (30)
  32. ml-agents/tests/yamato/scripts/run_llapi.py (17)
  33. ml-agents/tests/yamato/training_int_tests.py (2)

.gitignore (4 changes)

-# Tensorflow Model Info
+# Output Artifacts (Legacy)
+# Output Artifacts
+/results
 # Training environments
 /envs

com.unity.ml-agents/CHANGELOG.md (3 changes)

 instead of "camelCase"; for example, `Agent.maxStep` was renamed to
 `Agent.MaxStep`. For a full list of changes, see the pull request. (#3828)
 - Update Barracuda to 0.7.0-preview which has breaking namespace and assembly name changes.
+- Training artifacts (trained models, summaries) are now found in the `results/`
+  directory. (#3829)
 ### Minor Changes
 - The maximum compatible version of tensorflow was changed to allow tensorflow 2.1 and 2.2. This
   will allow use with python 3.8 using tensorflow 2.2.0rc3.
 - `UnityRLCapabilities` was added to help inform users when RL features are mismatched between C# and Python packages. (#3831)
 - Unity Player logs are now written out to the results directory. (#3877)
 ### Bug Fixes

docs/Getting-Started.md (5 changes)

 sequence_length: 64
 summary_freq: 1000
 use_recurrent: False
-summary_path: ./summaries/first3DBallRun
-model_path: ./models/first3DBallRun/3DBallLearning
+output_path: ./results/first3DBallRun/3DBallLearning
 INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training.
 INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 2000. Mean Reward: 1.319. Std of Reward: 0.693. Training.
 INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 3000. Mean Reward: 1.804. Std of Reward: 1.056. Training.

 mlagents-learn config/trainer_config.yaml --run-id=first3DBallRun --resume
 ```
-Your trained model will be at `models/<run-identifier>/<behavior_name>.nn` where
+Your trained model will be at `results/<run-identifier>/<behavior_name>.nn` where
 `<behavior_name>` is the name of the `Behavior Name` of the agents corresponding
 to the model. This file corresponds to your model's latest checkpoint. You can
 now embed this trained model into your Agents by following the steps below,
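To make the new location concrete, here is a small helper for locating the exported model; this is not part of the toolkit, just a sketch built from the path pattern in the docs text above:

```python
import os

def trained_model_path(run_id: str, behavior_name: str) -> str:
    # Pattern from the updated docs: results/<run-identifier>/<behavior_name>.nn
    path = os.path.join("results", run_id, f"{behavior_name}.nn")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No trained model found at {path}")
    return path

# e.g. results/first3DBallRun/3DBallLearning.nn
print(trained_model_path("first3DBallRun", "3DBallLearning"))
```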

docs/Learning-Environment-Executable.md (5 changes)

 sequence_length: 64
 summary_freq: 1000
 use_recurrent: False
-summary_path: ./summaries/first-run-0
-model_path: ./models/first-run-0/Ball3DLearning
+output_path: ./results/first-run-0/Ball3DLearning
 INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training.
 INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 2000. Mean Reward: 1.319. Std of Reward: 0.693. Training.
 INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 3000. Mean Reward: 1.804. Std of Reward: 1.056. Training.
 ```
 You can press Ctrl+C to stop the training, and your trained model will be at
-`models/<run-identifier>/<behavior_name>.nn`, which corresponds to your model's
+`results/<run-identifier>/<behavior_name>.nn`, which corresponds to your model's
 latest checkpoint. (**Note:** There is a known bug on Windows that causes the
 saving of the model to fail when you early terminate the training, it's
 recommended to wait until Step has reached the max_steps parameter you set in

docs/Migrating.md (2 changes)

 instead of "camelCase"; for example, `Agent.maxStep` was renamed to
 `Agent.MaxStep`. For a full list of changes, see the pull request. (#3828)
 - `WriteAdapter` was renamed to `ObservationWriter`. (#3834)
+- Training artifacts (trained models, summaries) are now found under `results/`
+  instead of `summaries/` and `models/`.
 ### Steps to Migrate

docs/Training-ML-Agents.md (7 changes)

 Regardless of which training methods, configurations or hyperparameters you
 provide, the training process will always generate three artifacts:
-1. Summaries (under the `summaries/` folder): these are training metrics that
+1. Summaries (under the `results/<run-identifier>/<behavior-name>` folder):
+   these are training metrics that
-1. Models (under the `models/` folder): these contain the model checkpoints that
+1. Models (under the `results/<run-identifier>/` folder): these contain the model checkpoints that
-1. Timers file (also under the `summaries/` folder): this contains aggregated
+1. Timers file (also under the `results/<run-identifier>` folder): this contains aggregated
 metrics on your training process, including time spent on specific code
 blocks. See [Profiling in Python](Profiling-Python.md) for more information
 on the timers generated.
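Taken together, the documentation and learn.py changes in this commit imply the layout sketched below; the helper is illustrative only (file names other than `configuration.yaml` and `timers.json` depend on the run) and is not part of the toolkit:

```python
import os

def results_layout(run_id: str, behavior_name: str) -> dict:
    # Everything now lives under a single results/<run-identifier>/ root.
    root = os.path.join("results", run_id)
    return {
        "model": os.path.join(root, f"{behavior_name}.nn"),       # was models/<run-id>/
        "summaries": os.path.join(root, behavior_name),           # was summaries/
        "timers": os.path.join(root, "run_logs", "timers.json"),  # was summaries/<run-id>_timers.json
        "config": os.path.join(root, "configuration.yaml"),       # new in this change
    }

print(results_layout("first3DBallRun", "3DBallLearning"))
```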

docs/Training-PPO.md (2 changes)

 `init_path` can be specified to initialize your model from a previous run before starting.
 Note that the prior run should have used the same trainer configurations as the current run,
 and have been saved with the same version of ML-Agents. You should provide the full path
-to the folder where the checkpoints were saved, e.g. `./models/{run-id}/{behavior_name}`.
+to the folder where the checkpoints were saved, e.g. `./results/{run-id}/{behavior_name}`.
 This option is provided in case you want to initialize different behaviors from different runs;
 in most cases, it is sufficient to use the `--initialize-from` CLI parameter to initialize

docs/Training-SAC.md (2 changes)

 `init_path` can be specified to initialize your model from a previous run before starting.
 Note that the prior run should have used the same trainer configurations as the current run,
 and have been saved with the same version of ML-Agents. You should provide the full path
-to the folder where the checkpoints were saved, e.g. `./models/{run-id}/{behavior_name}`.
+to the folder where the checkpoints were saved, e.g. `./results/{run-id}/{behavior_name}`.
 This option is provided in case you want to initialize different behaviors from different runs;
 in most cases, it is sufficient to use the `--initialize-from` CLI parameter to initialize

docs/Using-Tensorboard.md (2 changes)

 1. Open a terminal or console window:
 1. Navigate to the directory where the ML-Agents Toolkit is installed.
-1. From the command line run: `tensorboard --logdir=summaries --port=6006`
+1. From the command line run: `tensorboard --logdir=results --port=6006`
 1. Open a browser window and navigate to
    [localhost:6006](http://localhost:6006).

ml-agents-envs/mlagents_envs/environment.py (30 changes)

     seed: int = 0,
     no_graphics: bool = False,
     timeout_wait: int = 60,
-    args: Optional[List[str]] = None,
+    additional_args: Optional[List[str]] = None,
+    log_folder: Optional[str] = None,
 ):
     """
     Starts a new unity environment and establishes a connection with the environment.

     :int timeout_wait: Time (in seconds) to wait for connection from environment.
     :list args: Additional Unity command line arguments
     :list side_channels: Additional side channel for no-rl communication with Unity
+    :str log_folder: Optional folder to write the Unity Player log file into. Requires absolute path.

-    args = args or []
+    self.additional_args = additional_args or []
+    self.no_graphics = no_graphics
     # If base port is not specified, use BASE_ENVIRONMENT_PORT if we have
     # an environment, otherwise DEFAULT_EDITOR_PORT
     if base_port is None:

         )
     )
     self.side_channels[_sc.channel_id] = _sc
+    self.log_folder = log_folder
     # If the environment name is None, a new environment will not be launched
     # and the communicator will directly try to connect to an existing unity environment.

         "the worker-id must be 0 in order to connect with the Editor."
     )
     if file_name is not None:
-        self.executable_launcher(file_name, no_graphics, args)
+        self.executable_launcher(file_name, no_graphics, additional_args)
     else:
         logger.info(
             f"Listening on port {self.port}. "

     launch_string = candidates[0]
     return launch_string

+def executable_args(self) -> List[str]:
+    args: List[str] = []
+    if self.no_graphics:
+        args += ["-nographics", "-batchmode"]
+    args += [UnityEnvironment.PORT_COMMAND_LINE_ARG, str(self.port)]
+    if self.log_folder:
+        log_file_path = os.path.join(
+            self.log_folder, f"Player-{self.worker_id}.log"
+        )
+        args += ["-logFile", log_file_path]
+    # Add in arguments passed explicitly by the user.
+    args += self.additional_args
+    return args

 def executable_launcher(self, file_name, no_graphics, args):
     launch_string = self.validate_environment_path(file_name)
     if launch_string is None:

     else:
         logger.debug("This is the launch string {}".format(launch_string))
     # Launch Unity environment
-    subprocess_args = [launch_string]
-    if no_graphics:
-        subprocess_args += ["-nographics", "-batchmode"]
-    subprocess_args += [UnityEnvironment.PORT_COMMAND_LINE_ARG, str(self.port)]
-    subprocess_args += args
+    subprocess_args = [launch_string] + self.executable_args()
     try:
         self.proc1 = subprocess.Popen(
             subprocess_args,
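The new `executable_args()` consolidates what used to be assembled inline in `executable_launcher`. Below is a standalone sketch of the same pattern, for illustration only; `PORT_ARG` is an assumed stand-in for `UnityEnvironment.PORT_COMMAND_LINE_ARG`, and the function is not the toolkit's API:

```python
import os
from typing import List, Optional

PORT_ARG = "--mlagents-port"  # assumed value of UnityEnvironment.PORT_COMMAND_LINE_ARG

def build_unity_args(
    port: int,
    worker_id: int,
    no_graphics: bool = False,
    log_folder: Optional[str] = None,
    additional_args: Optional[List[str]] = None,
) -> List[str]:
    # Mirrors executable_args(): engine flags, then the port, then the
    # per-worker Player log location, then any user-supplied arguments.
    args: List[str] = []
    if no_graphics:
        args += ["-nographics", "-batchmode"]
    args += [PORT_ARG, str(port)]
    if log_folder:
        args += ["-logFile", os.path.join(log_folder, f"Player-{worker_id}.log")]
    args += additional_args or []
    return args

# ['-nographics', '-batchmode', '--mlagents-port', '5005', '-logFile',
#  '/abs/results/run-id/run_logs/Player-0.log']
print(build_unity_args(5005, 0, no_graphics=True, log_folder="/abs/results/run-id/run_logs"))
```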

ml-agents-envs/mlagents_envs/tests/test_envs.py (12 changes)

+@mock.patch("mlagents_envs.environment.UnityEnvironment.executable_launcher")
+@mock.patch("mlagents_envs.environment.UnityEnvironment.get_communicator")
+def test_log_file_path_is_set(mock_communicator, mock_launcher):
+    mock_communicator.return_value = MockCommunicator()
+    env = UnityEnvironment(
+        file_name="myfile", worker_id=0, log_folder="./some-log-folder-path"
+    )
+    args = env.executable_args()
+    log_file_index = args.index("-logFile")
+    assert args[log_file_index + 1] == "./some-log-folder-path/Player-0.log"

 @mock.patch("mlagents_envs.environment.UnityEnvironment.executable_launcher")
 @mock.patch("mlagents_envs.environment.UnityEnvironment.get_communicator")
 def test_reset(mock_communicator, mock_launcher):
     mock_communicator.return_value = MockCommunicator(
         discrete_action=False, visual_inputs=0

ml-agents/mlagents/trainers/learn.py (60 changes)

 # # Unity ML-Agents Toolkit
 import argparse
+import yaml
 import os
 import numpy as np

 :param run_options: Command line arguments for training.
 """
 with hierarchical_timer("run_training.setup"):
-    model_path = f"./models/{options.run_id}"
+    base_path = "results"
+    write_path = os.path.join(base_path, options.run_id)
-        f"./models/{options.initialize_from}" if options.initialize_from else None
+        os.path.join(base_path, options.initialize_from) if options.initialize_from else None
-    summaries_dir = "./summaries"
+    run_logs_dir = os.path.join(write_path, "run_logs")
+    # Check if directory exists
+    handle_existing_directories(
+        write_path, options.resume, options.force, maybe_init_path
+    )
+    # Make run logs directory
+    os.makedirs(run_logs_dir, exist_ok=True)

-        summaries_dir,
+        write_path,
-    handle_existing_directories(
-        model_path, summaries_dir, options.resume, options.force, maybe_init_path
-    )
-    tb_writer = TensorboardWriter(summaries_dir, clear_past_data=not options.resume)
+    tb_writer = TensorboardWriter(write_path, clear_past_data=not options.resume)
     gauge_write = GaugeWriter()
     console_writer = ConsoleWriter()
     StatsReporter.add_writer(tb_writer)

     if options.env_path is None:
         port = UnityEnvironment.DEFAULT_EDITOR_PORT
     env_factory = create_environment_factory(
-        options.env_path, options.no_graphics, run_seed, port, options.env_args
+        options.env_path,
+        options.no_graphics,
+        run_seed,
+        port,
+        options.env_args,
+        os.path.abspath(run_logs_dir),  # Unity environment requires absolute path
     )
     engine_config = EngineConfig(
         width=options.width,

     )
     trainer_factory = TrainerFactory(
         options.trainer_config,
-        summaries_dir,
-        model_path,
+        write_path,
         options.keep_checkpoints,
         not options.inference,
         options.resume,

     # Create controller and begin training.
     tc = TrainerController(
         trainer_factory,
-        model_path,
-        summaries_dir,
+        write_path,
         options.run_id,
         options.save_freq,
         maybe_meta_curriculum,

         tc.start_learning(env_manager)
     finally:
         env_manager.close()
-        write_timing_tree(summaries_dir, options.run_id)
+        write_run_options(write_path, options)
+        write_timing_tree(run_logs_dir)

+def write_run_options(output_dir: str, run_options: RunOptions) -> None:
+    run_options_path = os.path.join(output_dir, "configuration.yaml")
+    try:
+        with open(run_options_path, "w") as f:
+            try:
+                yaml.dump(dict(run_options._asdict()), f, sort_keys=False)
+            except TypeError:  # Older versions of pyyaml don't support sort_keys
+                yaml.dump(dict(run_options._asdict()), f)
+    except FileNotFoundError:
+        logger.warning(
+            f"Unable to save configuration to {run_options_path}. Make sure the directory exists"
+        )

-def write_timing_tree(summaries_dir: str, run_id: str) -> None:
-    timing_path = f"{summaries_dir}/{run_id}_timers.json"
+def write_timing_tree(output_dir: str) -> None:
+    timing_path = os.path.join(output_dir, "timers.json")
     try:
         with open(timing_path, "w") as f:
             json.dump(get_timer_tree(), f, indent=4)

     seed: int,
     start_port: int,
     env_args: Optional[List[str]],
+    log_folder: str,
 ) -> Callable[[int, List[SideChannel]], BaseEnv]:
     if env_path is not None:
         launch_string = UnityEnvironment.validate_environment_path(env_path)

         seed=env_seed,
         no_graphics=no_graphics,
         base_port=start_port,
-        args=env_args,
+        additional_args=env_args,
+        log_folder=log_folder,
     )
     return create_unity_environment
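For orientation, this is how the two new helpers would be invoked, assuming they are in scope (learn.py defines them at module level); the `RunOptions` stand-in below is invented for the sketch, so only the path behavior is meaningful:

```python
import os
from typing import NamedTuple

class RunOptions(NamedTuple):
    # Hypothetical stand-in for learn.py's argparse-backed RunOptions;
    # only the fields this sketch touches are modeled.
    run_id: str = "ppo"
    resume: bool = False
    force: bool = False

options = RunOptions()
write_path = os.path.join("results", options.run_id)  # results/ppo
run_logs_dir = os.path.join(write_path, "run_logs")   # results/ppo/run_logs
os.makedirs(run_logs_dir, exist_ok=True)

write_run_options(write_path, options)  # -> results/ppo/configuration.yaml
write_timing_tree(run_logs_dir)         # -> results/ppo/run_logs/timers.json
```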

ml-agents/mlagents/trainers/policy/tf_policy.py (5 changes)

 from typing import Any, Dict, List, Optional
 import abc
+import os
 import numpy as np
 from mlagents.tf_utils import tf
 from mlagents import tf_utils

 self.use_continuous_act = brain.vector_action_space_type == "continuous"
 if self.use_continuous_act:
     self.num_branches = self.brain.vector_action_space_size[0]
-self.model_path = trainer_parameters["model_path"]
+self.model_path = trainer_parameters["output_path"]
 self.initialize_path = trainer_parameters.get("init_path", None)
 self.keep_checkpoints = trainer_parameters.get("keep_checkpoints", 5)
 self.graph = tf.Graph()

 :return:
 """
 with self.graph.as_default():
-    last_checkpoint = self.model_path + "/model-" + str(steps) + ".ckpt"
+    last_checkpoint = os.path.join(self.model_path, f"model-{steps}.ckpt")
     self.saver.save(self.sess, last_checkpoint)
     tf.train.write_graph(
         self.graph, self.model_path, "raw_graph_def.pb", as_text=False
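The switch from string concatenation to `os.path.join` keeps checkpoint paths platform-correct; a quick illustration with invented values:

```python
import os

model_path = os.path.join("results", "first3DBallRun", "3DBallLearning")
steps = 2000
# The old code hard-coded "/" separators; os.path.join uses the OS separator.
print(os.path.join(model_path, f"model-{steps}.ckpt"))
# results/first3DBallRun/3DBallLearning/model-2000.ckpt  (on POSIX)
```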

ml-agents/mlagents/trainers/ppo/trainer.py (3 changes)

     "sequence_length",
     "summary_freq",
     "use_recurrent",
-    "summary_path",
-    "model_path",
+    "output_path",
     "reward_signals",
 ]
 self._check_param_keys()

ml-agents/mlagents/trainers/sac/trainer.py (7 changes)

     "summary_freq",
     "tau",
     "use_recurrent",
-    "summary_path",
-    "model_path",
+    "output_path",
     "reward_signals",
 ]

 Save the training buffer's update buffer to a pickle file.
 """
 filename = os.path.join(
-    self.trainer_parameters["model_path"], "last_replay_buffer.hdf5"
+    self.trainer_parameters["output_path"], "last_replay_buffer.hdf5"
 )
 logger.info("Saving Experience Replay Buffer to {}".format(filename))
 with open(filename, "wb") as file_object:

 Loads the last saved replay buffer from a file.
 """
 filename = os.path.join(
-    self.trainer_parameters["model_path"], "last_replay_buffer.hdf5"
+    self.trainer_parameters["output_path"], "last_replay_buffer.hdf5"
 )
 logger.info("Loading Experience Replay Buffer from {}".format(filename))
 with open(filename, "rb+") as file_object:
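Only the buffer file's location changes here; the trainer hands the open file to the buffer's own serializer. The pickle-based sketch below is a stand-in to show the round trip and file placement, not the toolkit's actual serialization:

```python
import os
import pickle

def save_replay_buffer(output_path: str, buffer: object) -> None:
    # Same location the SAC trainer now uses: <output_path>/last_replay_buffer.hdf5
    filename = os.path.join(output_path, "last_replay_buffer.hdf5")
    with open(filename, "wb") as file_object:
        pickle.dump(buffer, file_object)

def load_replay_buffer(output_path: str) -> object:
    filename = os.path.join(output_path, "last_replay_buffer.hdf5")
    with open(filename, "rb") as file_object:
        return pickle.load(file_object)
```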

ml-agents/mlagents/trainers/tests/test_barracuda_converter.py (6 changes)

 memory_size: 8
 curiosity_strength: 0.0
 curiosity_enc_size: 1
-summary_path: test
-model_path: test
+output_path: test
 reward_signals:
   extrinsic:
     strength: 1.0

 @pytest.mark.parametrize("rnn", [True, False], ids=["rnn", "no_rnn"])
 def test_policy_conversion(dummy_config, tmpdir, rnn, visual, discrete):
     tf.reset_default_graph()
-    dummy_config["summary_path"] = str(tmpdir)
-    dummy_config["model_path"] = os.path.join(tmpdir, "test")
+    dummy_config["output_path"] = os.path.join(tmpdir, "test")
     policy = create_policy_mock(
         dummy_config, use_rnn=rnn, use_discrete=discrete, use_visual=visual
     )

ml-agents/mlagents/trainers/tests/test_bcmodule.py (2 changes)

 def create_bc_module(mock_brain, trainer_config, use_rnn, demo_file, tanhresample):
     # model_path = env.external_brain_names[0]
-    trainer_config["model_path"] = "testpath"
+    trainer_config["output_path"] = "testpath"
     trainer_config["keep_checkpoints"] = 3
     trainer_config["use_recurrent"] = use_rnn
     trainer_config["behavioral_cloning"]["demo_path"] = (

ml-agents/mlagents/trainers/tests/test_ghost.py (9 changes)

 memory_size: 8
 curiosity_strength: 0.0
 curiosity_enc_size: 1
-summary_path: test
-model_path: test
+output_path: test
 reward_signals:
   extrinsic:
     strength: 1.0

 vector_action_descriptions=[],
 vector_action_space_type=0,
 )
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 ppo_trainer = PPOTrainer(brain_name, 0, dummy_config, True, False, 0, "0")
 controller = GhostController(100)
 trainer = GhostTrainer(

 vector_action_descriptions=[],
 vector_action_space_type=0,
 )
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 ppo_trainer = PPOTrainer(brain_name, 0, dummy_config, True, False, 0, "0")
 controller = GhostController(100)
 trainer = GhostTrainer(

ml-agents/mlagents/trainers/tests/test_learn.py (17 changes)

 return parse_command_line(args)

+@patch("mlagents.trainers.learn.write_timing_tree")
+@patch("mlagents.trainers.learn.write_run_options")
 @patch("mlagents.trainers.learn.handle_existing_directories")
 @patch("mlagents.trainers.learn.TrainerFactory")
 @patch("mlagents.trainers.learn.SamplerManager")

     sampler_manager_mock,
     trainer_factory_mock,
     handle_dir_mock,
+    write_run_options_mock,
+    write_timing_tree_mock,
 ):
     mock_env = MagicMock()
     mock_env.external_brain_names = []

     mock_init = MagicMock(return_value=None)
     with patch.object(TrainerController, "__init__", mock_init):
         with patch.object(TrainerController, "start_learning", MagicMock()):
-            learn.run_training(0, basic_options())
+            options = basic_options()
+            learn.run_training(0, options)
-                "./models/ppo",
-                "./summaries",
+                "results/ppo",
                 "ppo",
                 50000,
                 None,

                 None,
             )
-            handle_dir_mock.assert_called_once_with(
-                "./models/ppo", "./summaries", False, False, None
-            )
+            handle_dir_mock.assert_called_once_with("results/ppo", False, False, None)
+            write_timing_tree_mock.assert_called_once_with("results/ppo/run_logs")
+            write_run_options_mock.assert_called_once_with("results/ppo", options)
     StatsReporter.writers.clear()  # make sure there aren't any writers as added by learn.py

     seed=None,
     start_port=8000,
     env_args=None,
+    log_folder="results/log_folder",
 )

ml-agents/mlagents/trainers/tests/test_nn_policy.py (8 changes)

 memory_size: 8
 curiosity_strength: 0.0
 curiosity_enc_size: 1
-summary_path: test
-model_path: test
+output_path: test
 reward_signals:
   extrinsic:
     strength: 1.0

 path1 = os.path.join(tmp_path, "runid1")
 path2 = os.path.join(tmp_path, "runid2")
 trainer_params = dummy_config
-trainer_params["model_path"] = path1
+trainer_params["output_path"] = path1
 policy = create_policy_mock(trainer_params)
 policy.initialize_or_load()
 policy.save_model(2000)

 vector_action_descriptions=[],
 vector_action_space_type=0,
 )
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 time_horizon = 6
 trajectory = make_fake_trajectory(

ml-agents/mlagents/trainers/tests/test_policy.py (2 changes)

 def basic_params():
-    return {"use_recurrent": False, "model_path": "my/path"}
+    return {"use_recurrent": False, "output_path": "my/path"}

 class FakePolicy(TFPolicy):

ml-agents/mlagents/trainers/tests/test_ppo.py (9 changes)

 memory_size: 10
 curiosity_strength: 0.0
 curiosity_enc_size: 1
-summary_path: test
-model_path: test
+output_path: test
 reward_signals:
   extrinsic:
     strength: 1.0

 vector_action_descriptions=[],
 vector_action_space_type=0,
 )
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 trainer = PPOTrainer(brain_params, 0, dummy_config, True, False, 0, "0")
 policy = trainer.create_policy(brain_params.brain_name, brain_params)
 trainer.add_policy(brain_params.brain_name, policy)

 mock_optimizer.reward_signals = {}
 ppo_optimizer.return_value = mock_optimizer
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 trainer = PPOTrainer(brain_params, 0, dummy_config, True, False, 0, "0")
 policy = mock.Mock(spec=NNPolicy)
 policy.get_current_step.return_value = 2000

ml-agents/mlagents/trainers/tests/test_reward_signals.py (2 changes)

 )
 trainer_parameters = trainer_config
 model_path = "testpath"
-trainer_parameters["model_path"] = model_path
+trainer_parameters["output_path"] = model_path
 trainer_parameters["keep_checkpoints"] = 3
 trainer_parameters["reward_signals"].update(reward_signal_config)
 trainer_parameters["use_recurrent"] = use_rnn

ml-agents/mlagents/trainers/tests/test_rl_trainer.py (2 changes)

 def dummy_config():
     return yaml.safe_load(
         """
-        summary_path: "test/"
+        output_path: "test/"
         summary_freq: 1000
         max_steps: 100
         reward_signals:

ml-agents/mlagents/trainers/tests/test_sac.py (14 changes)

 trainer_parameters = dummy_config
 model_path = "testmodel"
-trainer_parameters["model_path"] = model_path
+trainer_parameters["output_path"] = model_path
 trainer_parameters["keep_checkpoints"] = 3
 trainer_parameters["use_recurrent"] = use_rnn
 policy = NNPolicy(

 discrete_action_space=DISCRETE_ACTION_SPACE,
 )
 trainer_params = dummy_config
-trainer_params["summary_path"] = str(tmpdir)
-trainer_params["model_path"] = str(tmpdir)
+trainer_params["output_path"] = str(tmpdir)
 trainer_params["save_replay_buffer"] = True
 trainer = SACTrainer(mock_brain.brain_name, 1, trainer_params, True, False, 0, 0)
 policy = trainer.create_policy(mock_brain.brain_name, mock_brain)

 mock_optimizer.reward_signals = {}
 sac_optimizer.return_value = mock_optimizer
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 trainer = SACTrainer(brain_params, 0, dummy_config, True, False, 0, "0")
 policy = mock.Mock(spec=NNPolicy)
 policy.get_current_step.return_value = 2000

 brain_params = make_brain_parameters(
     discrete_action=False, visual_inputs=0, vec_obs_size=6
 )
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 dummy_config["steps_per_update"] = 20
 trainer = SACTrainer(brain_params, 0, dummy_config, True, False, 0, "0")
 policy = trainer.create_policy(brain_params.brain_name, brain_params)

 dummy_config["sequence_length"] = 64
 dummy_config["batch_size"] = 32
 dummy_config["use_recurrent"] = True
-dummy_config["summary_path"] = "./summaries/test_trainer_summary"
-dummy_config["model_path"] = "./models/test_trainer_models/TestModel"
+dummy_config["output_path"] = "./results/test_trainer_models/TestModel"
 with pytest.raises(UnityTrainerException):
     _ = SACTrainer(brain_params, 0, dummy_config, True, False, 0, "0")

ml-agents/mlagents/trainers/tests/test_simple_rl.py (6 changes)

 env_manager = SimpleEnvManager(env, EnvironmentParametersChannel())
 trainer_factory = TrainerFactory(
     trainer_config=trainer_config,
-    summaries_dir=dir,
-    model_path=dir,
+    output_path=dir,
     keep_checkpoints=1,
     train_model=True,
     load_model=False,

 tc = TrainerController(
     trainer_factory=trainer_factory,
-    summaries_dir=dir,
-    model_path=dir,
+    output_path=dir,
     run_id=run_id,
     meta_curriculum=meta_curriculum,
     train=True,

ml-agents/mlagents/trainers/tests/test_trainer_controller.py (6 changes)

 trainer_factory_mock.ghost_controller = GhostController()
 return TrainerController(
     trainer_factory=trainer_factory_mock,
-    model_path="test_model_path",
-    summaries_dir="test_summaries_dir",
+    output_path="test_model_path",
     run_id="test_run_id",
     save_freq=100,
     meta_curriculum=None,

 trainer_factory_mock.ghost_controller = GhostController()
 TrainerController(
     trainer_factory=trainer_factory_mock,
-    model_path="",
-    summaries_dir="",
+    output_path="",
     run_id="1",
     save_freq=1,
     meta_curriculum=None,

ml-agents/mlagents/trainers/tests/test_trainer_util.py (60 changes)

 def test_initialize_trainer_parameters_override_defaults(
     BrainParametersMock, dummy_config_with_override
 ):
-    summaries_dir = "test_dir"
-    model_path = "model_dir"
+    output_path = "model_dir"
     keep_checkpoints = 1
     train_model = True
     load_model = False

     base_config = dummy_config_with_override
     expected_config = base_config["default"]
-    expected_config["summary_path"] = f"{run_id}_testbrain"
-    expected_config["model_path"] = model_path + "/testbrain"
+    expected_config["output_path"] = output_path + "/testbrain"
     expected_config["keep_checkpoints"] = keep_checkpoints
     # Override value from specific brain config

     with patch.object(PPOTrainer, "__init__", mock_constructor):
         trainer_factory = trainer_util.TrainerFactory(
             trainer_config=base_config,
-            summaries_dir=summaries_dir,
-            model_path=model_path,
+            output_path=output_path,
             keep_checkpoints=keep_checkpoints,
             train_model=train_model,
             load_model=load_model,

 brain_params_mock = BrainParametersMock()
 BrainParametersMock.return_value.brain_name = "testbrain"
 external_brains = {"testbrain": BrainParametersMock()}
-summaries_dir = "test_dir"
-model_path = "model_dir"
+output_path = "results_dir"
 keep_checkpoints = 1
 train_model = True
 load_model = False

 base_config = dummy_config
 expected_config = base_config["default"]
-expected_config["summary_path"] = f"{run_id}_testbrain"
-expected_config["model_path"] = model_path + "/testbrain"
+expected_config["output_path"] = output_path + "/testbrain"
 expected_config["keep_checkpoints"] = keep_checkpoints
 def mock_constructor(

 with patch.object(PPOTrainer, "__init__", mock_constructor):
     trainer_factory = trainer_util.TrainerFactory(
         trainer_config=base_config,
-        summaries_dir=summaries_dir,
-        model_path=model_path,
+        output_path=output_path,
         keep_checkpoints=keep_checkpoints,
         train_model=train_model,
         load_model=load_model,

 def test_initialize_invalid_trainer_raises_exception(
     BrainParametersMock, dummy_bad_config
 ):
-    summaries_dir = "test_dir"
-    model_path = "model_dir"
+    output_path = "results_dir"
     keep_checkpoints = 1
     train_model = True
     load_model = False

 with pytest.raises(TrainerConfigError):
     trainer_factory = trainer_util.TrainerFactory(
         trainer_config=bad_config,
-        summaries_dir=summaries_dir,
-        model_path=model_path,
+        output_path=output_path,
         keep_checkpoints=keep_checkpoints,
         train_model=train_model,
         load_model=load_model,

 with pytest.raises(TrainerConfigError):
     trainer_factory = trainer_util.TrainerFactory(
         trainer_config=bad_config,
-        summaries_dir=summaries_dir,
-        model_path=model_path,
+        output_path=output_path,
         keep_checkpoints=keep_checkpoints,
         train_model=train_model,
         load_model=load_model,

 with pytest.raises(UnityTrainerException):
     trainer_factory = trainer_util.TrainerFactory(
         trainer_config=bad_config,
-        summaries_dir=summaries_dir,
-        model_path=model_path,
+        output_path=output_path,
         keep_checkpoints=keep_checkpoints,
         train_model=train_model,
         load_model=load_model,

 trainer_factory = trainer_util.TrainerFactory(
     trainer_config=no_default_config,
-    summaries_dir="test_dir",
-    model_path="model_dir",
+    output_path="output_path",
     keep_checkpoints=1,
     train_model=True,
     load_model=False,

 trainer_factory = trainer_util.TrainerFactory(
     trainer_config=bad_config,
-    summaries_dir="test_dir",
-    model_path="model_dir",
+    output_path="output_path",
     keep_checkpoints=1,
     train_model=True,
     load_model=False,

 def test_existing_directories(tmp_path):
-    model_path = os.path.join(tmp_path, "runid")
-    # Unused summary path
-    summary_path = os.path.join(tmp_path, "runid")
+    output_path = os.path.join(tmp_path, "runid")
-    trainer_util.handle_existing_directories(model_path, summary_path, False, False)
+    trainer_util.handle_existing_directories(output_path, False, False)
-    trainer_util.handle_existing_directories(model_path, summary_path, True, False)
+    trainer_util.handle_existing_directories(output_path, True, False)
-    os.mkdir(model_path)
+    os.mkdir(output_path)
-    trainer_util.handle_existing_directories(model_path, summary_path, False, False)
+    trainer_util.handle_existing_directories(output_path, False, False)
-    trainer_util.handle_existing_directories(model_path, summary_path, True, False)
+    trainer_util.handle_existing_directories(output_path, True, False)
-    trainer_util.handle_existing_directories(model_path, summary_path, False, True)
+    trainer_util.handle_existing_directories(output_path, False, True)
-    trainer_util.handle_existing_directories(
-        model_path, summary_path, False, True, init_path
-    )
+    trainer_util.handle_existing_directories(output_path, False, True, init_path)
-    trainer_util.handle_existing_directories(
-        model_path, summary_path, False, True, init_path
-    )
+    trainer_util.handle_existing_directories(output_path, False, True, init_path)

ml-agents/mlagents/trainers/trainer/trainer.py (3 changes)

 self.brain_name = brain_name
 self.run_id = run_id
 self.trainer_parameters = trainer_parameters
-self.summary_path = trainer_parameters["summary_path"]
-self._stats_reporter = StatsReporter(self.summary_path)
+self._stats_reporter = StatsReporter(brain_name)
 self.is_training = training
 self._reward_buffer: Deque[float] = deque(maxlen=reward_buff_cap)
 self.policy_queues: List[AgentManagerQueue[Policy]] = []

ml-agents/mlagents/trainers/trainer_controller.py (20 changes)

 def __init__(
     self,
     trainer_factory: TrainerFactory,
-    model_path: str,
-    summaries_dir: str,
+    output_path: str,
     run_id: str,
     save_freq: int,
     meta_curriculum: Optional[MetaCurriculum],

     resampling_interval: Optional[int],
 ):
     """
-    :param model_path: Path to save the model.
+    :param output_path: Path to save the model.
-    :param summaries_dir: Folder to save training summaries.
     :param run_id: The sub-directory name for model and summary statistics
     :param save_freq: Frequency at which to save model

 self.trainers: Dict[str, Trainer] = {}
 self.brain_name_to_identifier: Dict[str, Set] = defaultdict(set)
 self.trainer_factory = trainer_factory
-self.model_path = model_path
-self.summaries_dir = summaries_dir
+self.output_path = output_path
 self.logger = get_logger(__name__)
 self.run_id = run_id
 self.save_freq = save_freq

 self.trainers[brain_name].export_model(name_behavior_id)

 @staticmethod
-def _create_model_path(model_path):
-    if not os.path.exists(model_path):
-        os.makedirs(model_path)
+def _create_output_path(output_path):
+    if not os.path.exists(output_path):
+        os.makedirs(output_path)
-            "The folder {} containing the "
+            f"The folder {output_path} containing the "
-            "permissions are set correctly.".format(model_path)
+            "permissions are set correctly."
         )
 @timed

 @timed
 def start_learning(self, env_manager: EnvManager) -> None:
-    self._create_model_path(self.model_path)
+    self._create_output_path(self.output_path)
     tf.reset_default_graph()
     global_step = 0
     last_brain_behavior_ids: Set[str] = set()

ml-agents/mlagents/trainers/trainer_util.py (30 changes)

 def __init__(
     self,
     trainer_config: Any,
-    summaries_dir: str,
-    model_path: str,
+    output_path: str,
     keep_checkpoints: int,
     train_model: bool,
     load_model: bool,

     multi_gpu: bool = False,
 ):
     self.trainer_config = trainer_config
-    self.summaries_dir = summaries_dir
-    self.model_path = model_path
+    self.output_path = output_path
     self.init_path = init_path
     self.keep_checkpoints = keep_checkpoints
     self.train_model = train_model

     return initialize_trainer(
         self.trainer_config,
         brain_name,
-        self.summaries_dir,
-        self.model_path,
+        self.output_path,
         self.keep_checkpoints,
         self.train_model,
         self.load_model,

 def initialize_trainer(
     trainer_config: Any,
     brain_name: str,
-    summaries_dir: str,
-    model_path: str,
+    output_path: str,
     keep_checkpoints: int,
     train_model: bool,
     load_model: bool,

     :param trainer_config: Original trainer configuration loaded from YAML
     :param brain_name: Name of the brain to be associated with trainer
-    :param summaries_dir: Directory to store trainer summary statistics
-    :param model_path: Path to save the model
+    :param output_path: Path to save the model and summary statistics
     :param keep_checkpoints: How many model checkpoints to keep
     :param train_model: Whether to train the model (vs. run inference)
     :param load_model: Whether to load the model or randomly initialize

     )
     trainer_parameters = trainer_config.get("default", {}).copy()
-    trainer_parameters["summary_path"] = str(run_id) + "_" + brain_name
-    trainer_parameters["model_path"] = "{basedir}/{name}".format(
-        basedir=model_path, name=brain_name
-    )
+    trainer_parameters["output_path"] = os.path.join(output_path, brain_name)
-        trainer_parameters["init_path"] = "{basedir}/{name}".format(
-            basedir=init_path, name=brain_name
-        )
+        trainer_parameters["init_path"] = os.path.join(init_path, brain_name)
     trainer_parameters["keep_checkpoints"] = keep_checkpoints
     if brain_name in trainer_config:
         _brain_key: Any = brain_name

 def handle_existing_directories(
-    model_path: str, summary_path: str, resume: bool, force: bool, init_path: str = None
+    output_path: str, resume: bool, force: bool, init_path: str = None
 ) -> None:
     """
     Validates that if the run_id model exists, we do not overwrite it unless --force is specified.

     :param force: Whether or not the --force flag was passed.
     """
-    model_path_exists = os.path.isdir(model_path)
+    output_path_exists = os.path.isdir(output_path)
-    if model_path_exists:
+    if output_path_exists:
         if not resume and not force:
             raise UnityTrainerException(
                 "Previous data from this run ID was found. "

ml-agents/tests/yamato/scripts/run_llapi.py (17 changes)

 file_name=env_name,
 side_channels=[engine_configuration_channel],
 no_graphics=True,
-args=["-logFile", "-"],
+additional_args=["-logFile", "-"],
 )
 try:

 """
 try:
     env1 = UnityEnvironment(
-        file_name=env_name, base_port=5006, no_graphics=True, args=["-logFile", "-"]
+        file_name=env_name,
+        base_port=5006,
+        no_graphics=True,
+        additional_args=["-logFile", "-"],
-        file_name=env_name, base_port=5006, no_graphics=True, args=["-logFile", "-"]
+        file_name=env_name,
+        base_port=5006,
+        no_graphics=True,
+        additional_args=["-logFile", "-"],
-        file_name=env_name, base_port=5007, no_graphics=True, args=["-logFile", "-"]
+        file_name=env_name,
+        base_port=5007,
+        no_graphics=True,
+        additional_args=["-logFile", "-"],
     )
     env2.reset()
 finally:

ml-agents/tests/yamato/training_int_tests.py (2 changes)

 print(
     f"Running training with python={python_version or latest} and c#={csharp_version or latest}"
 )
-nn_file_expected = f"./models/{run_id}/3DBall.nn"
+nn_file_expected = f"./results/{run_id}/3DBall.nn"
 if os.path.exists(nn_file_expected):
     # Should never happen - make sure nothing leftover from an old test.
     print("Artifacts from previous build found!")
