* More efficiently allocate memory when sending states
* Code clean-up
* Additional changes
* More GC reduction
* Remove state list initialization from example environments
* Use the built-in JSON tool to serialize the state message
* Remove commented code
* Use the more efficient CompareTag instead of string tag comparison (see the sketch after this list)
* Place comments before the code they describe
* Use type inference where appropriate
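For context, the CompareTag item refers to Unity's allocation-free tag check; a minimal sketch of the pattern, with a hypothetical class and tag name:

```csharp
using UnityEngine;

// Hypothetical handler illustrating the CompareTag pattern.
public class GoalDetector : MonoBehaviour
{
    void OnTriggerEnter(Collider other)
    {
        // `other.tag == "goal"` allocates a managed string on every call;
        // CompareTag performs the same check without that allocation,
        // which reduces GC pressure in per-frame physics callbacks.
        if (other.CompareTag("goal"))
        {
            Debug.Log("Agent reached the goal.");
        }
    }
}
```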
* `learn.py` is now the main script for training brains.
* Simultaneous multi-brain training is now possible.
* `ghost-trainer` allows for proper training in adversarial scenarios.
* `imitation-trainer` provides a basic implementation of real-time behavioral cloning.
* All trainer hyperparameters now exist in `.yaml` files (see the example config after this list).
* `PPO.ipynb` removed.
* LSTM model added.
* More dynamic buffer class to handle a greater variety of scenarios.
* Add support for stacking the past n states so the network can learn temporal dependencies (see the stacking sketch after this list).
* Add Banana Collector environment for demonstrating partially observable multi-agent environments.
* Add 3DBall Hard, which lacks velocity information in its state representation; used as a test for the LSTM and state-stacking features.
* Rework Tennis environment to be continuous control and trainable in 100k steps.
* Fix Basic environment to properly reflect the number of states.
* Fix discrete states when using stacked states.
* Add trained model for Basic environment.
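To illustrate the `.yaml` hyperparameter change, here is a rough sketch of what a trainer configuration might look like; the section and key names below are illustrative, not the canonical schema:

```yaml
# Hypothetical trainer config; keys and values are illustrative.
default:
  trainer: ppo
  batch_size: 1024
  buffer_size: 10240
  learning_rate: 3.0e-4
  max_steps: 5.0e4

Ball3DBrain:            # per-brain overrides enable multi-brain training
  max_steps: 1.0e5
  use_recurrent: true   # opt in to the LSTM model
  sequence_length: 32
```

Per-brain sections like this are what make simultaneous multi-brain training configurable from a single file.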
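The state-stacking item can be pictured as a fixed-length history buffer whose contents are concatenated into one input vector. A self-contained sketch of the idea; the class name and flat layout are assumptions, not the actual implementation:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stacker: keeps the past n observation vectors and
// concatenates them so the network can learn temporal dependencies.
public class ObservationStacker
{
    readonly Queue<float[]> history = new Queue<float[]>();

    public ObservationStacker(int numStacked, int obsSize)
    {
        // Pre-fill with zero vectors so the stacked size is constant
        // from the very first step onward.
        for (int i = 0; i < numStacked; i++)
            history.Enqueue(new float[obsSize]);
    }

    public float[] Push(float[] observation)
    {
        history.Dequeue();               // drop the oldest state
        history.Enqueue(observation);    // append the newest state
        // Flatten oldest-to-newest into a single input vector.
        return history.SelectMany(v => v).ToArray();
    }
}
```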
* On Demand Decision: use RequestDecision and RequestAction (see the sketch after this list)
* New Agent Inspector: use it to set On Demand Decision
* New BrainParameters interface
* LSTM memory size is now set in Python
* New C# API
* Semantic Changes
* Replaced RunMDP
* New Bouncer Environment to test On Demand Decision
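A rough sketch of how on-demand decisions are driven from agent code; the event logic is hypothetical, and the class assumes the ML-Agents Agent base class:

```csharp
using UnityEngine;

// Hypothetical agent using On Demand Decision (enabled in the Agent Inspector).
public class BouncerAgent : Agent
{
    bool touchedGround;   // hypothetical game-state flag

    void FixedUpdate()
    {
        if (touchedGround)
        {
            // Ask the brain for a fresh decision only at this event,
            // instead of on every academy step.
            RequestDecision();
            touchedGround = false;
        }
        else
        {
            // Between decisions, repeat the last received action.
            RequestAction();
        }
    }

    void OnCollisionEnter(Collision collision)
    {
        touchedGround = true;   // hypothetical trigger for a new decision
    }
}
```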
* Add config for crawler, and change crawler scene
* Changed number of crawlers in scene to 12
* Changed Max-steps for crawlers to 5000
* Newer hyperparameters and newly trained crawler model
* Clean up crawler code, and improve efficiency
Added several class- and method-level comments that are compatible with Doxygen for auto-generation of documentation, along with some stylistic and minor code changes (summarized below).
Stylistic changes:
- Modified comments to /// style instead of /** */ (see the sketch after this section)
- Removed unnecessary imports
- Limited code to 80 characters per line
Code changes:
- Change SetTextObs to accept a string, not an object
- Renamed all methods that have "state" semantics to "info" semantics
- Renamed _InitializeAgent to OnEnableHelper
- Removed _DisableAgent, folding it into OnDisable
- Renamed StoredVectorActions to storedVectorActions, and similarly for StoredTextActions
- Changed internal methods to protected, since that's the desired behavior
- Renamed _info to info and _action to action, since they're already private
These refactorings affected CoreBrainInternal, ExternalCommunicator, and MLAgentsEditModeTest.
Performed minor improvements to Ball3DAgent and AgentEditor (re...
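For reference, the /// style above is the XML-documentation form that Doxygen also parses. A small sketch using the SetTextObs change from the list; the enclosing class and method body are illustrative:

```csharp
public partial class Agent
{
    string textObservation;   // illustrative backing field

    /// <summary>
    /// Sets the text observation sent with the agent's info.
    /// </summary>
    /// <param name="textObservation">The observation, now passed as a
    /// string rather than an object (per the signature change above).</param>
    public void SetTextObs(string textObservation)
    {
        this.textObservation = textObservation;
    }
}
```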
* Fixes internal brain for Banana Imitation.
* Fixes Discrete Control training for Imitation Learning.
* Fixes Visual Observations in internal brain with non-square inputs.
* [CoreBrain] Bug fix in the internal brain: discrete vector observations did not have the right size
* [Docs] Removed all references to the unitypackages other than the TensorFlowSharp.unitypackage
* [Basic] Updated the bytes file of Basic
* [Docs] Addressed comments
* [Docs] Re-addressed the comments
* [Bug Fix] Scaling the visual input between 0 and 1 (see the sketch after this list)
* [Comments] Added comments to the BatchVisualObservations method of the CoreInternalBrain.
* [Renaming] Renamed BlackAndWhite to blackAndWhite
* [containers] Enables container support for scenes that use visual observations
* [Initial Commit] Works only with simple balance ball
* [Optimization] Store the academy in the brainBatcher as a temporary measure
* [Modifications] Made it work from the editor as a prototype
* Made the socket communicator and reimplemented all functionality
* [Meta files] Removed stray .meta files
* [Comments] Removed dead code
* [Comments] Added some descriptions
* [Bug Fix] Multi brain scenario
* Improved the AgentInfo converter
* [Optimization] Remove VectorObs since StackedVectorObs is present in the AgentInfo protobuf object
* [Timeout] Implemented a timeout for the rpc communicator in Unity
* [Libraries] Added the C# Protobuf and Grpc libraries
* [Requirements] Added protobuf 3.5.2 to the requirements
* [Code Formatting] Removed dead code and split some lines
...
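The visual-input scaling fix boils down to normalizing 8-bit pixel channels into floats in [0, 1]. A minimal sketch of the idea; the helper name and flat RGB layout are assumptions, not the actual BatchVisualObservations code:

```csharp
using UnityEngine;

public static class VisualObservationUtil
{
    /// Converts a texture to a flat RGB float array, scaling each
    /// 8-bit channel from [0, 255] down to [0, 1].
    public static float[] ToNormalizedFloats(Texture2D texture)
    {
        Color32[] pixels = texture.GetPixels32();
        var result = new float[pixels.Length * 3];
        for (int i = 0; i < pixels.Length; i++)
        {
            result[3 * i]     = pixels[i].r / 255.0f;
            result[3 * i + 1] = pixels[i].g / 255.0f;
            result[3 * i + 2] = pixels[i].b / 255.0f;
        }
        return result;
    }
}
```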
* [Initial Commit] Modified the model.py file and the ppo/trainer.py file to use masked actions
* Preliminary modifications to the python side of the code to enable action masking
* Preliminary modifications to the C# side of the code to enable action masking
* Preliminary modifications to the communication side of the code to enable action masking
* Implemented action masking for BC (note: the teacher's actions are not masked)
* Added more error messages for action masking
* Fixed pytests
* Added Documentation
* Addressed comment
* Addressed comments on docs
* Addressed second comment on docs
* Addressed comments for the python side of the code
* Created the action masker and associated unit tests (see the sketch after this list)
* Addressed comments on the C# side
* Addressed the comment regarding action_masking_name
* Addressed the comments
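To make the masking entries concrete, here is a hypothetical, much-simplified action masker for discrete branches; the class, method names, and storage below are illustrative, not the actual API:

```csharp
using System;
using System.Collections.Generic;

/// Hypothetical, simplified action masker for discrete action branches.
public class ActionMasker
{
    readonly int[] branchSizes;   // number of actions per branch
    readonly bool[][] masked;     // true = action is unavailable

    public ActionMasker(int[] branchSizes)
    {
        this.branchSizes = branchSizes;
        masked = new bool[branchSizes.Length][];
        for (int b = 0; b < branchSizes.Length; b++)
            masked[b] = new bool[branchSizes[b]];
    }

    /// Marks the given action indices of a branch as unavailable,
    /// raising an error for out-of-range indices (one of the places
    /// where the extra error messages mentioned above would live).
    public void SetActionMask(int branch, IEnumerable<int> actionIndices)
    {
        foreach (int a in actionIndices)
        {
            if (a < 0 || a >= branchSizes[branch])
                throw new ArgumentOutOfRangeException(nameof(actionIndices));
            masked[branch][a] = true;
        }
    }

    /// Returns the mask for one branch; a trainer would use it to zero
    /// the probability of masked actions before sampling.
    public bool[] GetMask(int branch) => masked[branch];
}
```

As noted in the list, teacher actions in BC bypass this mask; only the student's sampled actions are constrained.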