The basic command for training is:
```sh
mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier>
```
where

environment you built in step 1:
```sh
mlagents-learn config/trainer_config.yaml --env=../../projects/Cats/CatsOnBicycles.app --run-id=cob_1
```
During a training session, the training program prints out and saves updates at

`models/cob_1/CatsOnBicycles_cob_1.nn`.

While this example used the default training hyperparameters, you can edit the
[trainer_config.yaml file](#training-config-file) with a text editor to set

To interrupt training and save the current progress, press Ctrl+C once and wait for the
model to be saved.
### Loading an Existing Model

If you've quit training early using Ctrl+C, you can resume the training run by running
`mlagents-learn` again, specifying the same `<run-identifier>` and appending the `--resume` flag
to the command.

You can also use this mode to run inference with an already-trained model in Python.
Append both the `--resume` and `--inference` flags to do this. Note that if you want to run
inference in Unity, you should use the
[Unity Inference Engine](Getting-started#Running-a-pre-trained-model).

If a model has already been trained under the specified `<run-identifier>` and `--resume` is not
specified, training will not start. Use `--force` to make ML-Agents
overwrite the existing data.
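
The three cases above can be sketched as follows, reusing the `cob_1` run identifier and the `CatsOnBicycles.app` path from the earlier example. This is a dry run under stated assumptions: the variable names (`CONFIG`, `ENV_PATH`, `RUN_ID`, `BASE_CMD`, and the `*_CMD` variables) are hypothetical, and the commands are printed rather than executed so the sketch can be run without ML-Agents installed.

```sh
# Hypothetical variable names; the values reuse the earlier cob_1 example.
CONFIG=config/trainer_config.yaml
ENV_PATH=../../projects/Cats/CatsOnBicycles.app
RUN_ID=cob_1

BASE_CMD="mlagents-learn $CONFIG --env=$ENV_PATH --run-id=$RUN_ID"

# Resume an interrupted run under the same run identifier.
RESUME_CMD="$BASE_CMD --resume"

# Run inference in Python with the already-trained model.
INFERENCE_CMD="$BASE_CMD --resume --inference"

# Start over and overwrite the data saved under this run identifier.
FORCE_CMD="$BASE_CMD --force"

# Dry run: print the commands instead of launching them.
printf '%s\n' "$RESUME_CMD" "$INFERENCE_CMD" "$FORCE_CMD"
```

To launch one of these for real, replace the `printf` line with the chosen variable on its own, for example `$RESUME_CMD`. This assumes the paths contain no spaces.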
### Command Line Training Options

In addition to passing the path of the Unity executable containing your training

  training. Defaults to 0.
* `--num-envs=<n>`: Specifies the number of concurrent Unity environment instances to
  collect experiences from when training. Defaults to 1.
* `--run-id=<run-identifier>`: Specifies an identifier for each training run. This
  identifier is used to name the subdirectories in which the trained model and
  summary statistics are saved, as well as the saved model itself. The default id
  is "ppo". If you use TensorBoard to view the training statistics, always set a

  will use the port `(base_port + worker_id)`, where `worker_id` is a sequential ID
  given to each instance, from 0 to `num_envs - 1`. Default is 5005. __Note:__ When
  training using the Editor rather than an executable, the base port will be ignored.
* `--inference`: Specifies whether to only run in inference mode. Omit to train the model.
  To load an existing model, specify a run-id and combine with `--resume`.
* `--resume`: If set, the training code loads an already trained model to

  training). This option only works when the models exist and have the same behavior names
  as the current agents in your scene.
* `--force`: Attempting to train a model with a run-id that has been used before will
  throw an error. Use `--force` to force-overwrite this run-id's summary and model data.
* `--no-graphics`: Specify this option to run the Unity executable in
  `-batchmode` without initializing the graphics driver. Use this only if your
  training doesn't involve visual observations (reading from pixels). See