### Example - DQN Baseline
In order to train an agent to play the `GridWorld` environment using the
Baselines DQN algorithm, you first need to install the baselines package using
pip:
```
pip install git+https://github.com/openai/baselines
```
Next, create a file called `train_unity.py`. Then create an `/envs/` directory
and build the GridWorld environment to that directory. For more information on
building Unity environments, see
[here](../docs/Learning-Environment-Executable.md). Add the following code to
the `train_unity.py` file:
```python
import gym

from baselines import deepq
from gym_unity.envs import UnityEnv


def main():
    # Wrap the built GridWorld binary (worker id 0) as a gym environment
    # that returns visual observations.
    env = UnityEnv("./envs/GridWorld", 0, use_visual=True)
    act = deepq.learn(
        env,
        "mlp",
        total_timesteps=100000,
        print_freq=10,
    )
    print("Saving model to unity_model.pkl")
    act.save("unity_model.pkl")


if __name__ == '__main__':
    main()
```

To start the training process, run the following from the directory containing
`train_unity.py`:

```sh
python -m train_unity
```

### Other Algorithms

Other algorithms in the Baselines repository can be run using scripts similar
to the examples from the baselines package. In most cases, the primary changes
needed to use a Unity environment are to import `UnityEnv`, and to replace the
environment creation code, typically `gym.make()`, with a call to
`UnityEnv(env_path)`, passing the environment binary path.
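
For example, a script that creates its environment with `gym.make()` can be
pointed at a Unity binary instead (a minimal sketch; the `./envs/GridWorld`
path and worker id `0` mirror the DQN example above):

```python
from gym_unity.envs import UnityEnv

# Before: env = gym.make("CartPole-v1")
# After: wrap a built Unity binary instead (worker id 0, visual observations)
env = UnityEnv("./envs/GridWorld", 0, use_visual=True)
```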
Some algorithms will make use of `make_env()` or `make_mujoco_env()`
functions. You can define a similar function for Unity environments. An
example of such a method using the PPO2 baseline:

```python
from gym_unity.envs import UnityEnv
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from baselines.bench import Monitor
from baselines import logger
import baselines.ppo2.ppo2 as ppo2

import os

try:
    from mpi4py import MPI
except ImportError:
    MPI = None


def make_unity_env(env_directory, num_env, visual, start_index=0):
    """
    Create a wrapped, monitored Unity environment.
    """
    def make_env(rank, use_visual=True):  # pylint: disable=C0111
        def _thunk():
            env = UnityEnv(env_directory, rank, use_visual=use_visual)
            env = Monitor(env, logger.get_dir() and os.path.join(logger.get_dir(), str(rank)))
            return env
        return _thunk
    if visual:
        # Visual observations: run several environment workers in parallel.
        return SubprocVecEnv([make_env(i + start_index) for i in range(num_env)])
    else:
        # Vector observations: a single monitored environment is sufficient.
        rank = MPI.COMM_WORLD.Get_rank() if MPI else 0
        return DummyVecEnv([make_env(rank, use_visual=False)])


def main():
    env = make_unity_env('./envs/GridWorld', 4, True)
    ppo2.learn(
        network="mlp",
        env=env,
        total_timesteps=100000,
        lr=1e-3,
    )


if __name__ == '__main__':
    main()
```
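
As with the DQN example, save this script to a file (for instance
`train_unity_ppo.py`; the name is just an illustrative choice) and launch
training from the directory containing it:

```sh
python -m train_unity_ppo
```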