* Move 'take_action' into Policy class
This refactor is part of Actor-Trainer separation. Since policies
will be distributed across actors in separate processes which share
a single trainer, taking an action should be the responsibility of
the policy.
This commit also makes a few smaller changes:
* Combines `take_action` logic between trainers, making it more
generic
* Adds an `ActionInfo` data class to be more explicit about the
data returned by the policy. For now it is used only by
TrainerController and the policy.
* Moves trainer stats logic out of `take_action` and into
`add_experiences`
* Renames `take_action` to `get_action`
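
As a rough illustration of the `ActionInfo` idea described above, the sketch below shows a policy returning one explicit data object from `get_action`. The field names (`action`, `value`, `outputs`) and the `DummyPolicy` class are assumptions made for this example, not the actual definitions in the codebase.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ActionInfo:
    """Hypothetical container for what a policy returns (fields are assumed)."""
    action: List[int]                          # one chosen action per agent
    value: Optional[List[float]] = None        # optional value estimates
    outputs: Optional[Dict[str, Any]] = None   # optional raw policy outputs


class DummyPolicy:
    """Stand-in policy for illustration; a real policy would run a model."""

    def get_action(self, observations: List[List[float]]) -> ActionInfo:
        # Always pick action 0 for every agent in the batch.
        actions = [0 for _ in observations]
        return ActionInfo(action=actions, outputs={"action": actions})


info = DummyPolicy().get_action([[0.1, 0.2], [0.3, 0.4]])
print(info.action)
```

Because the caller receives a named structure rather than a bare tuple, code such as TrainerController can read `info.action` without knowing which trainer produced it.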
Removing this function breaks some tests, and the only way around
this at this time is a bigger refactor or hacky fixes to tests.
For now, I'd suggest we just revert this small part of the change
and keep a refactor in mind for the future.
* Ticked API:
- Ticked API for pypi for mlagents
- Ticked API for pypi for unity-gym
- Ticked Communication number for API
- Ticked Model Loader number for API
* Ticked the API for the pytest
* Added the pypiwin32 package
* Fixed the break on Mac; partially fixed pytest for versions above 4
* Added something for Windows to help people who are stuck
* Resolved the comment
* Switched default Mac GFX API to Metal
* Added Barracuda pre-0.1.5
* Added basic integration with Barracuda Inference Engine
* Use predefined outputs the same way as for TF engine
* Fixed discrete action + LSTM support
* Switch Unity Mac Editor to Metal GFX API
* Fixed null model handling
* All examples converted to support Barracuda
* Added model conversion from TensorFlow to Barracuda
- Copied the barracuda.py file to ml-agents/mlagents/trainers
- Copied the tensorflow_to_barracuda.py file to ml-agents/mlagents/trainers
- Modified the tensorflow_to_barracuda.py file so it can be called from mlagents
- Modified ml-agents/mlagents/trainers/policy.py to convert the TF models to Barracuda-compatible .bytes files
* Added missing iOS BLAS plugin
* Added forgotten prefab changes
* Removed GLCore GFX backend for Mac, because it doesn't support Compute shaders
* Exposed GPU support for LearningBrain inference
...
* Remove env creation logic from TrainerController
Currently TrainerController includes logic related to creating the
UnityEnvironment, which causes poor separation of concerns between
the learn.py application script, TrainerController and UnityEnvironment:
* TrainerController must know about the proper way to instantiate the
UnityEnvironment, which may differ from application to application.
This also makes mocking or subclassing UnityEnvironment more
difficult.
* Many arguments are passed by learn.py to TrainerController and passed
along to UnityEnvironment.
This change moves environment construction logic into learn.py, as part
of the greater refactor to separate trainer logic from actor / environment.
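
The separation described above can be sketched as simple constructor injection: the application script builds the environment and hands it to the controller. This is a minimal sketch under stated assumptions. `FakeEnvironment`, the simplified `TrainerController`, and `main` are hypothetical stand-ins, not the real classes or the real learn.py.

```python
class FakeEnvironment:
    """Hypothetical stand-in for UnityEnvironment, used only for illustration."""

    def __init__(self, file_name, worker_id=0):
        self.file_name = file_name
        self.worker_id = worker_id
        self.closed = False

    def close(self):
        self.closed = True


class TrainerController:
    """Simplified controller that receives a ready-made environment."""

    def __init__(self, env):
        # The controller no longer knows how the environment was constructed.
        self.env = env

    def start_learning(self):
        try:
            return f"training against {self.env.file_name}"
        finally:
            # Always release the environment, even if training fails.
            self.env.close()


def main(env_path, worker_id=0):
    # Plays the role of the application script: it owns env construction.
    env = FakeEnvironment(env_path, worker_id=worker_id)
    controller = TrainerController(env)
    return controller.start_learning(), env


message, env = main("3DBall")
print(message)
```

With this shape, a test can pass any object exposing the same methods, which is the mocking benefit the commit message mentions.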
* Add option to set gym visual observation to uint8
* Add option to flatten branched discrete actions
* Add game_over variable to gym wrapper
* Add guide on how to use Dopamine with the gym wrapper and comparisons with Baselines and PPO
* Add blurb about using the --load flag in the intro guide, and fix a typo
* Add section to the tutorial on creating a multiple-area learning environment
* Add mention of Done() method in agent design
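
One plausible way to flatten branched discrete actions, as in the gym wrapper option listed above, is to enumerate every combination of branch choices and map each to a single integer. The sketch below assumes that approach. The helper name `build_action_lookup` is hypothetical, and the wrapper's actual implementation may differ.

```python
import itertools
from typing import Dict, List, Tuple


def build_action_lookup(branch_sizes: List[int]) -> Dict[int, Tuple[int, ...]]:
    """Map each flat action index to one choice per branch (illustrative helper)."""
    # Cartesian product of all per-branch choices, in lexicographic order.
    combos = itertools.product(*(range(n) for n in branch_sizes))
    return {index: combo for index, combo in enumerate(combos)}


# Two branches: the first has 3 choices, the second has 2.
lookup = build_action_lookup([3, 2])
print(len(lookup))   # 6
print(lookup[4])     # (2, 0)
```

A flat space of this kind lets algorithms that expect a single discrete action work with branched actions, at the cost of a table that grows with the product of the branch sizes.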