* Remove env creation logic from TrainerController
Currently TrainerController includes logic related to creating the
UnityEnvironment, which causes poor separation of concerns between
the learn.py application script, TrainerController and UnityEnvironment:
* TrainerController must know about the proper way to instantiate the
UnityEnvironment, which may differ from application to application.
This also makes mocking or subclassing UnityEnvironment more
difficult.
* learn.py passes many arguments to TrainerController only so that they
can be forwarded to UnityEnvironment.
This change moves environment construction logic into learn.py, as part
of the larger refactor to separate trainer logic from actor/environment logic.
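The intended shape can be sketched roughly as follows. This is a minimal illustration, not the actual code in this change: StubEnvironment is a hypothetical stand-in for UnityEnvironment, and the constructor arguments, the `steps` parameter, and the "my_env" name are assumptions made only for the example. The point is that the application script builds the environment and the controller simply receives it.

```python
class StubEnvironment:
    """Hypothetical stand-in for UnityEnvironment; not the real class."""

    def __init__(self, file_name, worker_id=0):
        self.file_name = file_name
        self.worker_id = worker_id
        self.step_count = 0
        self.closed = False

    def reset(self):
        self.step_count = 0

    def step(self):
        self.step_count += 1

    def close(self):
        self.closed = True


class TrainerController:
    """Simplified controller: it receives a ready-made environment
    instead of knowing how to construct one."""

    def __init__(self, env, run_id):
        self.env = env
        self.run_id = run_id

    def start_learning(self, steps):
        self.env.reset()
        try:
            for _ in range(steps):
                self.env.step()
        finally:
            # Always release the environment, even if training fails.
            self.env.close()


def main():
    # The application script (learn.py in this change) owns construction.
    env = StubEnvironment("my_env", worker_id=1)
    controller = TrainerController(env, run_id="demo")
    controller.start_learning(steps=3)
    return env


if __name__ == "__main__":
    finished = main()
    print(finished.step_count, finished.closed)
```

With this arrangement a test can hand the controller a mock or subclass without touching the controller itself.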
* Add option to set gym visual observation to uint8
* Add option to flatten branched discrete actions
* Add game_over variable to gym wrapper
* Add guide on using Dopamine with the gym wrapper, with comparisons against Baselines and PPO
* Add blurb about using the --load flag in the intro guide, and fix a typo.
* Add section to the tutorial on creating a multiple-area learning environment.
* Add mention of Done() method in agent design
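
For the option to flatten branched discrete actions listed above, one common approach is to enumerate every combination of branch choices and map a single flat index to each. The sketch below shows that idea only; the function name and the lookup-table layout are assumptions for illustration and may not match what the gym wrapper actually does.

```python
import itertools


def build_action_lookup(branch_sizes):
    """Map each flat action index to one choice per branch.

    branch_sizes: number of discrete choices in each branch, e.g. [3, 2].
    """
    ranges = [range(size) for size in branch_sizes]
    # itertools.product varies the last branch fastest.
    return {
        index: list(combo)
        for index, combo in enumerate(itertools.product(*ranges))
    }


if __name__ == "__main__":
    lookup = build_action_lookup([3, 2])
    # A flat Discrete(6) action can now be translated back to branches.
    print(len(lookup), lookup[5])
```

A flat space like this lets algorithms that expect a single discrete action drive an environment with several branches, at the cost of an action count that grows as the product of the branch sizes.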