To train the agents within the Ball Balance environment, we will be using the Python
package. We have provided a convenient Python wrapper script called `learn.py` which accepts arguments used to configure both the training and inference phases.
We will pass to this script the path of the environment executable that we just built. Optionally, we can also pass a run identifier used to label the results of individual training runs.
4. Link the brains to the desired agents (one agent as the teacher and at least one agent as a student).
5. Build the Unity executable for your desired platform.
6. In `trainer_config.yaml`, add an entry for the "Student" brain. Set the `trainer` parameter of this entry to `imitation`, and the `brain_to_imitate` parameter to the name of the teacher brain: "Teacher". Additionally, set `batches_per_epoch`, which controls how many batches of training are performed per update. Increase the `max_steps` option if you'd like to keep training the agents for a longer period of time.
7. Launch the training process with `python3 python/learn.py <env_name> --train --slow`, where `<env_name>` is the path to your built Unity executable.
8. From the Unity window, control the agent with the Teacher brain by providing "teacher demonstrations" of the behavior you would like to see.
9. Watch as the agent(s) with the student brain attached begin to behave similarly to the demonstrations.
10. Once the Student agents are exhibiting the desired behavior, end the training process with `CTRL+C` from the command line.
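The `trainer_config.yaml` entry described in step 6 might look like the following sketch; the `batches_per_epoch` and `max_steps` values are illustrative assumptions and should be tuned for your environment:

```yaml
Student:
  trainer: imitation
  brain_to_imitate: Teacher
  batches_per_epoch: 5     # illustrative value; controls training per update
  max_steps: 10000         # increase to keep training for longer
```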
where `<env_file_path>` is the path to your Unity executable containing the agents to be trained and `<run-identifier>` is an optional identifier you can use to identify the results of individual training runs.
3. Navigate to the ml-agents `python` folder.
4. Run the following to launch the training process using the path to the Unity environment you built in step 1:
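Based on the placeholders described above, the launch command likely takes the following form (exact flags may vary between ML-Agents versions):

```
python3 learn.py <env_file_path> --run-id=<run-identifier> --train
```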
During a training session, the training program prints out and saves updates at regular intervals (specified by the `summary_freq` option). The saved statistics are grouped by the `run-id` value so you should assign a unique id to each training run if you plan to view the statistics. You can view these statistics using TensorBoard during or after training by running the following command (from the ML-Agents python directory):
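Assuming the statistics are written to the default summaries folder, the TensorBoard invocation would be along these lines (the `--logdir` path is an assumption; point it at wherever your summaries are saved):

```
tensorboard --logdir=summaries
```

TensorBoard then serves the dashboards locally, by default at `localhost:6006` in your browser.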