
Fixing learn.py, trainer_controller.py, and Docker (#1164)

* Fixing learn.py, trainer_controller.py, and Docker

- learn.py has been moved under trainers.
    - this was a two line change
- learn.py will no longer be run as a main method
- docopt arguments are strings by default. learn.py now uses
  this assumption to correctly parse arguments (see the sketch
  after this commit message).
- trainer_controller.py now considers the Docker volume when
  accepting a trainer config file path.
- the Docker container now uses mlagents-learn.

* Removing extraneous unity-volume ref.
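
The docopt behavior called out above is the key detail: docopt returns every option value as a string, so a `[default: None]` arrives as the literal string `'None'` rather than Python's `None`, and numeric options need an explicit cast. A minimal standalone sketch of that behavior (the `demo` usage string below is made up for illustration and is not part of this commit):

```python
from docopt import docopt

_USAGE = '''
Usage:
  demo <trainer-config-path> [options]

Options:
  --env=<file>    Name of the Unity executable [default: None].
  --seed=<n>      Random seed used for training [default: -1].
'''

# Parse a fake command line: only the positional argument is supplied.
options = docopt(_USAGE, argv=['config/trainer_config.yaml'])

assert options['<trainer-config-path>'] == 'config/trainer_config.yaml'
assert options['--env'] == 'None'   # the default is the *string* 'None'

# This is the convention the new learn.py relies on:
env_path = options['--env'] if options['--env'] != 'None' else None
seed = int(options['--seed'])       # numeric options must be cast explicitly
```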
/develop-generalizationTraining-TrainerController
GitHub · 6 years ago
Current commit: a6f45b76
7 files changed, with 129 insertions and 137 deletions
  1. Dockerfile (7 changed lines)
  2. docs/Training-ML-Agents.md (2 changed lines)
  3. docs/Using-Docker.md (18 changed lines)
  4. ml-agents/mlagents/trainers/trainer_controller.py (9 changed lines)
  5. ml-agents/setup.py (2 changed lines)
  6. ml-agents/mlagents/trainers/learn.py (110 changed lines)
  7. ml-agents/mlagents/learn.py (118 changed lines)

Dockerfile (7 changed lines)


# xvfb is used to do CPU based rendering of Unity
RUN apt-get install -y xvfb
COPY ml-agents/requirements.txt .
RUN pip install --trusted-host pypi.python.org -r requirements.txt
COPY README.md .
COPY ml-agents /ml-agents
WORKDIR /ml-agents
RUN pip install .

- ENTRYPOINT ["python", "mlagents/learn.py"]
+ ENTRYPOINT ["mlagents-learn"]

docs/Training-ML-Agents.md (2 changed lines)


Use the command `mlagents-learn` to train your agents. This command is installed
with the `mlagents` package and its implementation can be found at
- `ml-agents/learn.py`. The [configuration file](#training-config-file),
+ `ml-agents/mlagents/trainers/learn.py`. The [configuration file](#training-config-file),
`config/trainer_config.yaml` specifies the hyperparameters used during training.
You can edit this file with a text editor to add a specific configuration for
each brain.

docs/Using-Docker.md (18 changed lines)


- Since Docker runs a container in an environment that is isolated from the host
machine, a mounted directory in your host machine is used to share data, e.g.
- the Unity executable, curriculum files and TensorFlow graph. For convenience,
- we created an empty `unity-volume` directory at the root of the repository for
- this purpose, but feel free to use any other directory. The remainder of this
- guide assumes that the `unity-volume` directory is the one used.
+ the trainer configuration file, Unity executable, curriculum files and
+ TensorFlow graph. For convenience, we created an empty `unity-volume`
+ directory at the root of the repository for this purpose, but feel free to use
+ any other directory. The remainder of this guide assumes that the
+ `unity-volume` directory is the one used.
## Usage

-p 5005:5005 \
<image-name>:latest \
--docker-target-name=unity-volume \
- <trainer-config-path> \
+ <trainer-config-file> \
--env=<environment-name> \
--train \
--run-id=<run-id>

- `docker-target-name`: Tells the ML-Agents Python package what the name of the
disk where it can read the Unity executable and store the graph. **This should
therefore be identical to `target`.**
- - `trainer-config-path`, `train`, `run-id`: ML-Agents arguments passed to
-   `mlagents-learn`. `trainer-config-path` is the filepath of the trainer config
+ - `trainer-config-file`, `train`, `run-id`: ML-Agents arguments passed to
+   `mlagents-learn`. `trainer-config-file` is the filename of the trainer config
file, `train` trains the algorithm, and `run-id` is used to tag each
experiment with a unique identifier. We recommend placing the trainer-config
file inside `unity-volume` so that the container has access to the file.

-p 5005:5005 \
balance.ball.v0.1:latest 3DBall \
--docker-target-name=unity-volume \
- <trainer-config-path> \
+ trainer_config.yaml \
--env=3DBall \
--train \
--run-id=3dball_first_trial
```

ml-agents/mlagents/trainers/trainer_controller.py (9 changed lines)


        :param no_graphics: Whether to run the Unity simulator in no-graphics
                            mode.
        """
        self.trainer_config_path = trainer_config_path
        if env_path is not None:
            # Strip out executable extensions if passed
            env_path = (env_path.strip()
                        .replace('.x86', ''))
        # Recognize and use docker volume if one is passed as an argument
-       if docker_target_name == '':
+       if not docker_target_name:
            self.trainer_config_path = trainer_config_path
            self.trainer_config_path = \
                '/{docker_target_name}/{trainer_config_path}'.format(
                    docker_target_name=docker_target_name,
                    trainer_config_path = trainer_config_path)
            self.model_path = '/{docker_target_name}/models/{run_id}'.format(
                docker_target_name=docker_target_name,
                run_id=run_id)
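
Since the hunk above adds the Docker-volume handling for the trainer config path described in the commit message, here is a compact sketch of that logic in isolation. `resolve_trainer_config_path` is a hypothetical helper for illustration only (not a function in the repository), and it assumes the plain assignment belongs to the non-Docker branch while the volume-prefixed path belongs to the Docker branch:

```python
def resolve_trainer_config_path(trainer_config_path, docker_target_name=None):
    """Sketch: pick the config path depending on whether training runs in Docker."""
    if not docker_target_name:
        # Local training: use the path exactly as given on the command line.
        return trainer_config_path
    # Docker training: the config file is expected inside the mounted volume,
    # so the path is re-rooted as /<volume>/<file>.
    return '/{docker_target_name}/{trainer_config_path}'.format(
        docker_target_name=docker_target_name,
        trainer_config_path=trainer_config_path)


# e.g. resolve_trainer_config_path('trainer_config.yaml', 'unity-volume')
#      -> '/unity-volume/trainer_config.yaml'
```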

ml-agents/setup.py (2 changed lines)


    entry_points={
        'console_scripts': [
-           'mlagents-learn=mlagents.learn:main',
+           'mlagents-learn=mlagents.trainers.learn:main',
        ],
    },
)
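
This `console_scripts` change is also what makes the Dockerfile's new `ENTRYPOINT ["mlagents-learn"]` work: when `pip install .` runs, setuptools generates an `mlagents-learn` executable that imports `mlagents.trainers.learn` and calls its `main()`. Roughly, the generated wrapper behaves like the following illustrative script (an approximation, not a file in the repository):

```python
#!/usr/bin/env python
# Approximate behaviour of the script setuptools generates for the
# 'mlagents-learn=mlagents.trainers.learn:main' entry point.
import sys

from mlagents.trainers.learn import main

if __name__ == '__main__':
    sys.exit(main())
```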

ml-agents/mlagents/trainers/learn.py (110 changed lines)


# # Unity ML-Agents Toolkit
import logging
import os
import multiprocessing
import numpy as np
from docopt import docopt
from .trainer_controller import TrainerController
from .exception import TrainerError


def run_training(sub_id, run_seed, run_options):
    """
    Launches training session.
    :param sub_id: Unique id for training session.
    :param run_seed: Random seed used for training.
    :param run_options: Command line arguments for training.
    """
    # Docker Parameters
    docker_target_name = (run_options['--docker-target-name']
                          if run_options['--docker-target-name'] != 'None' else None)

    # General parameters
    env_path = (run_options['--env']
                if run_options['--env'] != 'None' else None)
    run_id = run_options['--run-id']
    load_model = run_options['--load']
    train_model = run_options['--train']
    save_freq = int(run_options['--save-freq'])
    keep_checkpoints = int(run_options['--keep-checkpoints'])
    worker_id = int(run_options['--worker-id'])
    curriculum_file = (run_options['--curriculum']
                       if run_options['--curriculum'] != 'None' else None)
    lesson = int(run_options['--lesson'])
    fast_simulation = not bool(run_options['--slow'])
    no_graphics = run_options['--no-graphics']
    trainer_config_path = run_options['<trainer-config-path>']

    # Create controller and begin training.
    tc = TrainerController(env_path, run_id + '-' + str(sub_id),
                           save_freq, curriculum_file, fast_simulation,
                           load_model, train_model, worker_id + sub_id,
                           keep_checkpoints, lesson, run_seed,
                           docker_target_name, trainer_config_path, no_graphics)
    tc.start_learning()


def main():
    try:
        print('''
            (Unity ML-Agents ASCII-art banner)
        ''')
    except:
        print('\n\n\tUnity Technologies\n')

    logger = logging.getLogger('mlagents.trainers')

    _USAGE = '''
    Usage:
      mlagents-learn <trainer-config-path> [options]
      mlagents-learn --help

    Options:
      --env=<file>               Name of the Unity executable [default: None].
      --curriculum=<directory>   Curriculum json directory for environment [default: None].
      --keep-checkpoints=<n>     How many model checkpoints to keep [default: 5].
      --lesson=<n>               Start learning from this lesson [default: 0].
      --load                     Whether to load the model or randomly initialize [default: False].
      --run-id=<path>            The directory name for model and summary statistics [default: ppo].
      --num-runs=<n>             Number of concurrent training sessions [default: 1].
      --save-freq=<n>            Frequency at which to save model [default: 50000].
      --seed=<n>                 Random seed used for training [default: -1].
      --slow                     Whether to run the game at training speed [default: False].
      --train                    Whether to train model, or only run inference [default: False].
      --worker-id=<n>            Number to add to communication port (5005) [default: 0].
      --docker-target-name=<dt>  Docker volume to store training-specific files [default: None].
      --no-graphics              Whether to run the environment in no-graphics mode [default: False].
    '''

    options = docopt(_USAGE)
    logger.info(options)

    num_runs = int(options['--num-runs'])
    seed = int(options['--seed'])

    if options['--env'] == 'None' and num_runs > 1:
        raise TrainerError('It is not possible to launch more than one concurrent training session '
                           'when training from the editor.')

    jobs = []
    run_seed = seed

    for i in range(num_runs):
        if seed == -1:
            run_seed = np.random.randint(0, 10000)
        p = multiprocessing.Process(target=run_training, args=(i, run_seed, options))
        jobs.append(p)
        p.start()
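
One detail of `main()` that is easy to miss: when `--num-runs` is greater than one, each spawned `run_training` process gets its own run id suffix (`run_id + '-' + str(sub_id)`) and its own worker id (`worker_id + sub_id`), which keeps model directories and communication ports from colliding. A small standalone illustration (not part of the diff; the port arithmetic follows the `--worker-id` help text above):

```python
# Each concurrent session gets a unique run id and a unique port offset.
base_run_id = 'ppo'    # --run-id
base_worker_id = 0     # --worker-id
num_runs = 3           # --num-runs

for sub_id in range(num_runs):
    run_id = base_run_id + '-' + str(sub_id)   # 'ppo-0', 'ppo-1', 'ppo-2'
    worker_id = base_worker_id + sub_id        # 0, 1, 2
    port = 5005 + worker_id                    # 5005, 5006, 5007
    print(run_id, worker_id, port)
```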

ml-agents/mlagents/learn.py (118 changed lines)


# # Unity ML-Agents Toolkit
import logging
import os
import multiprocessing
import numpy as np
from docopt import docopt
from mlagents.trainers.trainer_controller import TrainerController
from mlagents.trainers.exception import TrainerError


def run_training(sub_id, run_seed, run_options):
    """
    Launches training session.
    :param sub_id: Unique id for training session.
    :param run_seed: Random seed used for training.
    :param run_options: Command line arguments for training.
    """
    # Docker Parameters
    if run_options['--docker-target-name'] == 'Empty':
        docker_target_name = ''
    else:
        docker_target_name = run_options['--docker-target-name']

    # General parameters
    env_path = run_options['--env']
    if env_path == 'None':
        env_path = None
    run_id = run_options['--run-id']
    load_model = run_options['--load']
    train_model = run_options['--train']
    save_freq = int(run_options['--save-freq'])
    keep_checkpoints = int(run_options['--keep-checkpoints'])
    worker_id = int(run_options['--worker-id'])
    curriculum_file = str(run_options['--curriculum'])
    if curriculum_file == 'None':
        curriculum_file = None
    lesson = int(run_options['--lesson'])
    fast_simulation = not bool(run_options['--slow'])
    no_graphics = run_options['--no-graphics']
    trainer_config_path = run_options['<trainer-config-path>']

    # Create controller and begin training.
    tc = TrainerController(env_path, run_id + '-' + str(sub_id),
                           save_freq, curriculum_file, fast_simulation,
                           load_model, train_model, worker_id + sub_id,
                           keep_checkpoints, lesson, run_seed,
                           docker_target_name, trainer_config_path, no_graphics)
    tc.start_learning()


def main():
    try:
        print('''
            (Unity ML-Agents ASCII-art banner)
        ''')
    except:
        print('\n\n\tUnity Technologies\n')

    logger = logging.getLogger('mlagents.learn')

    _USAGE = '''
    Usage:
      mlagents-learn <trainer-config-path> [options]
      mlagents-learn --help

    Options:
      --env=<file>               Name of the Unity executable [default: None].
      --curriculum=<directory>   Curriculum json directory for environment [default: None].
      --keep-checkpoints=<n>     How many model checkpoints to keep [default: 5].
      --lesson=<n>               Start learning from this lesson [default: 0].
      --load                     Whether to load the model or randomly initialize [default: False].
      --run-id=<path>            The directory name for model and summary statistics [default: ppo].
      --num-runs=<n>             Number of concurrent training sessions [default: 1].
      --save-freq=<n>            Frequency at which to save model [default: 50000].
      --seed=<n>                 Random seed used for training [default: -1].
      --slow                     Whether to run the game at training speed [default: False].
      --train                    Whether to train model, or only run inference [default: False].
      --worker-id=<n>            Number to add to communication port (5005) [default: 0].
      --docker-target-name=<dt>  Docker volume to store training-specific files [default: Empty].
      --no-graphics              Whether to run the environment in no-graphics mode [default: False].
    '''

    options = docopt(_USAGE)
    logger.info(options)

    num_runs = int(options['--num-runs'])
    seed = int(options['--seed'])

    if options['--env'] == 'None' and num_runs > 1:
        raise TrainerError('It is not possible to launch more than one concurrent training session '
                           'when training from the editor.')

    jobs = []
    run_seed = seed

    for i in range(num_runs):
        if seed == -1:
            run_seed = np.random.randint(0, 10000)
        p = multiprocessing.Process(target=run_training, args=(i, run_seed, options))
        jobs.append(p)
        p.start()


if __name__ == '__main__':
    main()