
Addressing feedback from offline meeting.

- python directory has been removed.
- config directory created.
- mlagents-learn now takes --env as an optional arg.
/develop-generalizationTraining-TrainerController
Deric Pang, 6 years ago
Current commit
20dd50c4
25 files changed, with 71 insertions and 62 deletions
  1. .gitignore (2)
  2. Dockerfile (4)
  3. docs/Basic-Guide.md (4)
  4. docs/Feature-Memory.md (2)
  5. docs/Getting-Started-with-Balance-Ball.md (2)
  6. docs/Installation-Windows.md (2)
  7. docs/Installation.md (4)
  8. docs/Learning-Environment-Executable.md (9)
  9. docs/Python-API.md (4)
  10. docs/Readme.md (2)
  11. docs/Training-Curriculum-Learning.md (14)
  12. docs/Training-Imitation-Learning.md (4)
  13. docs/Training-ML-Agents.md (13)
  14. docs/Training-on-Amazon-Web-Service.md (2)
  15. docs/Training-on-Microsoft-Azure.md (6)
  16. docs/Using-Docker.md (7)
  17. protobuf-definitions/make.bat (2)
  18. ml-agents/setup.py (2)
  19. ml-agents/mlagents/learn.py (48)
  20. /config/curricula (0, renamed)
  21. /config/trainer_config.yaml (0, renamed)
  22. /gym-unity (0, renamed)
  23. /ml-agents (0, renamed)

.gitignore (2)


# Tensorflow Model Info
/models
/summaries
python/models
python/summaries
# Environment logfile
*MLAgentsSDK.log

Dockerfile (4)


RUN apt-get install -y xvfb
COPY python/ml-agents/requirements.txt .
COPY ml-agents/requirements.txt .
COPY python/ml-agents /ml-agents
COPY ml-agents /ml-agents
WORKDIR /ml-agents
RUN pip install .

docs/Basic-Guide.md (4)


**Plugins** > **Computer**.
**Note**: If you don't see anything under **Assets**, drag the
`ml-agents/MLAgentsSDK/Assets/ML-Agents` folder under **Assets** within
`MLAgentsSDK/Assets/ML-Agents` folder under **Assets** within
the Project window.
![Imported TensorFlowsharp](images/imported-tensorflowsharp.png)

Where:
- `<trainer-config-path>` is the relative or absolute filepath of the
trainer configuration. The defaults used by environments in the ML-Agents
SDK can be found in `trainer_config.yaml`.
SDK can be found in `config/trainer_config.yaml`.
- `<run-identifier>` is a string used to separate the results of different
training runs
- And the `--train` tells `mlagents-learn` to run a training session (rather

docs/Feature-Memory.md (2)


track of what is important to remember with [LSTM](https://en.wikipedia.org/wiki/Long_short-term_memory).
## How to use
When configuring the trainer parameters in the `trainer_config.yaml`
When configuring the trainer parameters in the `config/trainer_config.yaml`
file, add the following parameters to the Brain you want to use.
```json

docs/Getting-Started-with-Balance-Ball.md (2)


To summarize, go to your command line, enter the `ml-agents` directory and type:
```shell
mlagents-learn trainer_config.yaml --run-id=<run-identifier> --train
mlagents-learn config/trainer_config.yaml --run-id=<run-identifier> --train
```
When the message _"Start training by pressing the Play button in the Unity

docs/Installation-Windows.md (2)


In our example, the files are located in `C:\Downloads`. After you have either cloned or downloaded the files, from the Anaconda Prompt, change to the python directory inside the ml-agents directory:
```
cd C:\Downloads\ml-agents\python\mlagents
cd C:\Downloads\ml-agents\ml-agents
```
Make sure you are connected to the internet and then type in the Anaconda Prompt:

docs/Installation.md (4)


## Install Python (with Dependencies)
In order to use ML-Agents toolkit, you need Python 3.6 along with
the dependencies listed in the [requirements file](../python/ml-agents/requirements.txt).
the dependencies listed in the [requirements file](../ml-agents/requirements.txt).
Some of the primary dependencies include:
- [TensorFlow](Background-TensorFlow.md)

[instructions](https://packaging.python.org/guides/installing-using-linux-tools/#installing-pip-setuptools-wheel-with-linux-package-managers)
on installing it.
To install dependencies, enter the `python/ml-agents/` directory and run from
To install dependencies, enter the `ml-agents/` directory and run from
the command line:
pip install -r requirements.txt

docs/Learning-Environment-Executable.md (9)


the 3DBall Scene is the only one checked. (If the list is empty, then only the
current scene is included in the build).
6. Click **Build**:
- In the File dialog, navigate to the `python` folder in your ML-Agents
directory.
- In the File dialog, navigate to your ML-Agents directory.
- (For Windows) With Unity 2018.1, it will ask you to select a folder instead of a file name. Create a subfolder within the `python` folder and select that folder to build. In the following steps you will refer to this subfolder's name as `env_name`.
- (For Windows) With Unity 2018.1, it will ask you to select a folder instead of a file name. Create a subfolder within the ML-Agents folder and select that folder to build. In the following steps you will refer to this subfolder's name as `env_name`.
![Build Window](images/mlagents-BuildWindow.png)

1. Open a command or terminal window.
2. Navigate to the folder where you installed ML-Agents.
3. Change to the python directory.
4. Run `mlagents-learn <trainer-config-file> <env_name> --run-id=<run-identifier> --train`
4. Run `mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier> --train`
Where:
- `<trainer-config-file>` is the filepath of the trainer configuration yaml.
- `<env_name>` is the name and path to the executable you exported from Unity (without extension)

For example, if you are training with a 3DBall executable you exported to the ml-agents/python directory, run:
```shell
mlagents-learn 3DBall --run-id=firstRun --train
mlagents-learn config/trainer_config.yaml --env=3DBall --run-id=firstRun --train
```
![Training command example](images/training-command-example.png)

docs/Python-API.md (4)


- **BrainParameters** — describes the data elements in a BrainInfo object. For
example, provides the array length of an observation in BrainInfo.
These classes are all defined in the `python/ml-agents/mlagents/envs` folder of
These classes are all defined in the `ml-agents/mlagents/envs` folder of
the ML-Agents SDK.
To communicate with an agent in a Unity environment from a Python program, the

## Loading a Unity Environment
Python-side communication happens through `UnityEnvironment` which is located in
`python/ml-agents/mlagents/envs`. To load a Unity environment from a built binary
`ml-agents/mlagents/envs`. To load a Unity environment from a built binary
file, put the file in the same directory as `envs`. For example, if the filename
of your Unity environment is 3DBall.app, in python, run:
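
For reference, a minimal loading sketch consistent with the relocated package path; the `file_name` argument follows the surrounding text, while `train_mode` and the return shape are assumptions that may differ between releases:
```python
# Minimal sketch, assuming UnityEnvironment keeps its file_name argument.
# Place the built 3DBall binary next to the envs package (extension omitted).
from mlagents.envs import UnityEnvironment

env = UnityEnvironment(file_name="3DBall")
brain_infos = env.reset(train_mode=True)  # assumed: dict of brain name -> BrainInfo
env.close()
```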

docs/Readme.md (2)


* [API Reference](API-Reference.md)
* [How to use the Python API](Python-API.md)
* [Wrapping Learning Environment as a Gym](../python/gym-unity/Readme.md)
* [Wrapping Learning Environment as a Gym](../gym-unity/Readme.md)

docs/Training-Curriculum-Learning.md (14)


### Specifying a Metacurriculum
We first create a folder inside `curricula/` for the environment we want
We first create a folder inside `config/curricula/` for the environment we want
`curricula/wall-jump/`. We will place our curriculums inside this folder.
`config/curricula/wall-jump/`. We will place our curriculums inside this folder.
### Specifying a Curriculum

`mlagents-learn` using the `--curriculum` flag to point to the metacurriculum
folder and PPO will train using Curriculum Learning. For example, to train
agents in the Wall Jump environment with curriculum learning, we can run
`mlagents-learn --curriculum=curricula/wall-jump/ --run-id=wall-jump-curriculum
--train`. We can then keep track of the current lessons and progress via
TensorBoard.
```shell
mlagents-learn config/trainer_config.yaml --curriculum=curricula/wall-jump/ --run-id=wall-jump-curriculum --train
```
We can then keep track of the current lessons and progress via TensorBoard.
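
For orientation, a hedged sketch of reading one curriculum file from the relocated folder; the file name and any keys inside it are hypothetical placeholders, not taken from the repository:
```python
# Hypothetical sketch: load a curriculum JSON from the relocated
# config/curricula/ tree. The file name below is a placeholder.
import json

with open('config/curricula/wall-jump/SomeBrain.json') as f:
    curriculum = json.load(f)

# Which keys a curriculum defines depends on the ML-Agents version.
print(sorted(curriculum.keys()))
```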

docs/Training-Imitation-Learning.md (4)


2. Set the "Teacher" brain to Player mode, and properly configure the inputs to map to the corresponding actions. **Ensure that "Broadcast" is checked within the Brain inspector window.**
3. Set the "Student" brain to External mode.
4. Link the brains to the desired agents (one agent as the teacher and at least one agent as a student).
5. In `trainer_config.yaml`, add an entry for the "Student" brain. Set the `trainer` parameter of this entry to `imitation`, and the `brain_to_imitate` parameter to the name of the teacher brain: "Teacher". Additionally, set `batches_per_epoch`, which controls how much training to do each moment. Increase the `max_steps` option if you'd like to keep training the agents for a longer period of time.
6. Launch the training process with `mlagents-learn --train --slow`, and press the :arrow_forward: button in Unity when the message _"Start training by pressing the Play button in the Unity Editor"_ is displayed on the screen
5. In `config/trainer_config.yaml`, add an entry for the "Student" brain. Set the `trainer` parameter of this entry to `imitation`, and the `brain_to_imitate` parameter to the name of the teacher brain: "Teacher". Additionally, set `batches_per_epoch`, which controls how much training to do each moment. Increase the `max_steps` option if you'd like to keep training the agents for a longer period of time.
6. Launch the training process with `mlagents-learn config/trainer_config.yaml --train --slow`, and press the :arrow_forward: button in Unity when the message _"Start training by pressing the Play button in the Unity Editor"_ is displayed on the screen
7. From the Unity window, control the agent with the Teacher brain by providing "teacher demonstrations" of the behavior you would like to see.
8. Watch as the agent(s) with the student brain attached begin to behave similarly to the demonstrations.
9. Once the Student agents are exhibiting the desired behavior, end the training process with `CTRL+C` from the command line.

docs/Training-ML-Agents.md (13)


The output of the training process is a model file containing the optimized policy. This model file is a TensorFlow data graph containing the mathematical operations and the optimized weights selected during the training process. You can use the generated model file with the Internal Brain type in your Unity project to decide the best course of action for an agent.
Use the command `mlagents-learn` to train your agents. This command is installed with the `mlagents` package
and its implementation can be found at `python/ml-agents/learn.py`. The [configuration file](#training-config-file), `trainer_config.yaml` specifies the hyperparameters used during training. You can edit this file with a text editor to add a specific configuration for each brain.
and its implementation can be found at `ml-agents/learn.py`. The [configuration file](#training-config-file), `config/trainer_config.yaml` specifies the hyperparameters used during training. You can edit this file with a text editor to add a specific configuration for each brain.
For a broader overview of reinforcement learning, imitation learning and the ML-Agents training process, see [ML-Agents Toolkit Overview](ML-Agents-Overview.md).

Run `mlagents-learn` from the command line to launch the training process. Use the command line patterns and the `trainer_config.yaml` file to control training options.
Run `mlagents-learn` from the command line to launch the training process. Use the command line patterns and the `config/trainer_config.yaml` file to control training options.
mlagents-learn <trainer-config-file> <env_name> --run-id=<run-identifier> --train
mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier> --train
```
where

3. Navigate to the ml-agents `python` folder.
4. Run the following to launch the training process using the path to the Unity environment you built in step 1:
mlagents-learn ../../projects/Cats/CatsOnBicycles.app --run-id=cob_1 --train
mlagents-learn config/trainer_config.yaml --env=../../projects/Cats/CatsOnBicycles.app --run-id=cob_1 --train
During a training session, the training program prints out and saves updates at regular intervals (specified by the `summary_freq` option). The saved statistics are grouped by the `run-id` value so you should assign a unique id to each training run if you plan to view the statistics. You can view these statistics using TensorBoard during or after training by running the following command (from the ML-Agents python directory):

In addition to passing the path of the Unity executable containing your training environment, you can set the following command line options when invoking `mlagents-learn`:
* `--env=<env>` - Specify an executable environment to train.
* `--curriculum=<file>` – Specify a curriculum JSON file for defining the lessons for curriculum training. See [Curriculum Training](Training-Curriculum-Learning.md) for more information.
* `--keep-checkpoints=<n>` – Specify the maximum number of model checkpoints to keep. Checkpoints are saved after the number of steps specified by the `save-freq` option. Once the maximum number of checkpoints has been reached, the oldest checkpoint is deleted when saving a new checkpoint. Defaults to 5.
* `--lesson=<n>` – Specify which lesson to start with when performing curriculum training. Defaults to 0.

### Training config file
The training config file, `trainer_config.yaml` specifies the training method, the hyperparameters, and a few additional values to use during training. The file is divided into sections. The **default** section defines the default values for all the available settings. You can also add new sections to override these defaults to train specific Brains. Name each of these override sections after the GameObject containing the Brain component that should use these settings. (This GameObject will be a child of the Academy in your scene.) Sections for the example environments are included in the provided config file.
The training config file, `config/trainer_config.yaml` specifies the training method, the hyperparameters, and a few additional values to use during training. The file is divided into sections. The **default** section defines the default values for all the available settings. You can also add new sections to override these defaults to train specific Brains. Name each of these override sections after the GameObject containing the Brain component that should use these settings. (This GameObject will be a child of the Academy in your scene.) Sections for the example environments are included in the provided config file.
| **Setting** | **Description** | **Applies To Trainer** |
| :-- | :-- | :-- |

* [Training with Curriculum Learning](Training-Curriculum-Learning.md)
* [Training with Imitation Learning](Training-Imitation-Learning.md)
You can also compare the [example environments](Learning-Environment-Examples.md) to the corresponding sections of the `trainer-config.yaml` file for each example to see how the hyperparameters and other configuration variables have been changed from the defaults.
You can also compare the [example environments](Learning-Environment-Examples.md) to the corresponding sections of the `config/trainer_config.yaml` file for each example to see how the hyperparameters and other configuration variables have been changed from the defaults.
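
To make the override behavior concrete, here is a hedged sketch of how a per-Brain section could be layered over the **default** section of `config/trainer_config.yaml`; the brain name and the resolution logic are hypothetical, not the shipped trainer code:
```python
# Hypothetical sketch: resolve one Brain's settings by layering its
# named section over the 'default' section.
import yaml

with open('config/trainer_config.yaml') as f:
    config = yaml.safe_load(f)

brain_name = 'Ball3DBrain'  # hypothetical Brain GameObject name
settings = dict(config.get('default', {}))
settings.update(config.get(brain_name, {}))  # per-Brain values win
print(settings)
```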

docs/Training-on-Amazon-Web-Service.md (2)


3. Select Linux as the Target Platform, and x86_64 as the target architecture.
4. Check Headless Mode (If you haven't setup the X Server).
5. Click Build to build the Unity environment executable.
6. Upload the executable to your EC2 instance within `ml-agents/python` folder.
6. Upload the executable to your EC2 instance within `ml-agents` folder.
7. Test the instance setup from Python using:
```python

docs/Training-on-Microsoft-Azure.md (6)


## Installing ML-Agents
2. [Move](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/copy-files-to-linux-vm-using-scp) the `python` sub-folder of this ml-agents repo to the remote Azure instance, and set it as the working directory.
2. [Move](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/copy-files-to-linux-vm-using-scp) the `ml-agents` sub-folder of this ml-agents repo to the remote Azure instance, and set it as the working directory.
2. Install the required packages with `pip3 install .`.
## Testing

To run your training on the VM:
1. [Move](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/copy-files-to-linux-vm-using-scp) your built Unity application to your Virtual Machine.
2. Set the `python` sub-folder of the ml-agents repo to your working directory.
2. Set the `ml-agents` sub-folder of the ml-agents repo to your working directory.
python3 learn.py <your_app> --run-id=<run_id> --train
mlagents-learn <trainer_config> --env=<your_app> --run-id=<run_id> --train
```
Where `<your_app>` is the path to your app (e.g. `~/unity-volume/3DBallHeadless`) and `<run_id>` is an identifier for your training run.

docs/Using-Docker.md (7)


docker run --name <container-name> \
--mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
-p 5005:5005 \
<image-name>:latest <environment-name> \
<image-name>:latest \
--env=<environment-name> \
--train \
--run-id=<run-id>
```

random name if this is not set. _Note that this must be unique for every run
of a Docker image._
- `<image-name>` references the image name used when building the container.
- `<environemnt-name>` __(Optional)__: If you are training with a linux
- `<environment-name>` __(Optional)__: If you are training with a linux
Editor, do not pass a `<environemnt-name>` argument and press the
Editor, do not pass a `<environment-name>` argument and press the
:arrow_forward: button in Unity when the message _"Start training by pressing
the Play button in the Unity Editor"_ is displayed on the screen.
- `source`: Reference to the path in your host OS where you will store the Unity

protobuf-definitions/make.bat (2)


SRC_DIR=proto/mlagents/envs/communicator_objects
DST_DIR_C=../MLAgentsSDK/Assets/ML-Agents/Scripts/CommunicatorObjects
DST_DIR_P=../python/ml-agents
DST_DIR_P=../ml-agents
PROTO_PATH=proto
PYTHON_PACKAGE=mlagents/envs/communicator_objects

ml-agents/setup.py (2)


here = path.abspath(path.dirname(__file__))
# Get the long description from the README file
with open(path.join(here, '../../README.md'), encoding='utf-8') as f:
with open(path.join(here, '../README.md'), encoding='utf-8') as f:
long_description = f.read()
setup(

ml-agents/mlagents/learn.py (48)


docker_target_name = run_options['--docker-target-name']
# General parameters
env_path = run_options['<env>']
env_path = run_options['--env']
if env_path == 'None':
env_path = None
run_id = run_options['--run-id']
load_model = run_options['--load']
train_model = run_options['--train']

curriculum_file = str(run_options['--curriculum'])
if curriculum_file == "None":
if curriculum_file == 'None':
curriculum_file = None
lesson = int(run_options['--lesson'])
fast_simulation = not bool(run_options['--slow'])

def main():
print('''
try:
    print('''
    (Unity ML-Agents ASCII-art logo banner, unchanged by this commit)
    ''')
# TODO: figure this out
except:
    print('UNITY!!!')
logger = logging.getLogger("mlagents.learn")
logger = logging.getLogger('mlagents.learn')
learn (<trainer-config-path>) [<env>] [options]
learn [options]
learn <trainer-config-path> [options]
--env=<file> Name of the Unity executable [default: None].
--curriculum=<file> Curriculum json file for environment [default: None].
--keep-checkpoints=<n> How many model checkpoints to keep [default: 5].
--lesson=<n> Start learning from this lesson [default: 0].

num_runs = int(options['--num-runs'])
seed = int(options['--seed'])
if options['<env>'] is None and num_runs > 1:
if options['--env'] == 'None' and num_runs > 1:
raise TrainerError('It is not possible to launch more than one concurrent training session '
'when training from the editor.')
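
The `[default: None]` entries in the usage text explain the string comparisons above: docopt substitutes the literal default text, so an unset `--env` arrives as the string `'None'`, not Python's `None`. A minimal standalone sketch (a hypothetical script, not the shipped code):
```python
# Hypothetical sketch: why learn.py compares options against the
# string 'None' rather than Python's None.
from docopt import docopt

_USAGE = '''
Usage:
  learn [options]
  learn <trainer-config-path> [options]

Options:
  --env=<file>  Name of the Unity executable [default: None].
'''

options = docopt(_USAGE, argv=[])  # simulate running with no arguments
env_path = options['--env']        # docopt yields the string 'None'
if env_path == 'None':             # same normalization learn.py performs
    env_path = None
assert env_path is None
```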

/curricula → /config/curricula

/trainer_config.yaml → /config/trainer_config.yaml

/python/gym-unity → /gym-unity

/python/ml-agents → /ml-agents
