Added a Handshake method for the communicator
The academy will try to handshake regardless of the brains present
Player and Heuristic brains will send their information through the communicator but will not receive commands
* added broadcast to the player and heuristic brains.
This allows the Python API to record the actions taken along with the states and rewards
* removed the broadcast checkbox
* bug fix: the environment only requests actions from external brains when they are unique
* added a warning in case no brains are set to external
* fixed the instantiation of coreBrains
* fixed the conversion of actions to arrays in the BrainInfo received from step
* default discrete action is now 0
* bug fix for the discrete broadcast action (the action size should be one in Agents.cs)
* modified Tennis so that the default action is no action
* modified TemplateDecision.cs to ensure non-null values are returned from Decide() and MakeMemory() (see the sketch below)
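A minimal sketch of the non-null rule, assuming the Decision interface of this era; the exact Decide/MakeMemory signatures here are an assumption and may differ between releases:

```csharp
using System.Collections.Generic;
using UnityEngine;

public class TemplateDecision : MonoBehaviour, Decision
{
    public float[] Decide(List<float> state, List<Camera> observation,
                          float reward, bool done, float[] memory)
    {
        // Return an empty array rather than null so the conversion of
        // actions to arrays in BrainInfo cannot fail downstream.
        return new float[0];
    }

    public float[] MakeMemory(List<float> state, List<Camera> observation,
                              float reward, bool done, float[] memory)
    {
        // Same rule for memories: non-null, possibly empty.
        return new float[0];
    }
}
```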
* minor fixes
* need to convert the s...
* More efficiently allocate memory when sending states
* Code clean-up
* Additional changes
* More GC reduction
* Remove state list initialization from example environments
* Use built-in json tool to serialize state message
* Remove commented code
* Use the more efficient CompareTag (see the example after this list)
* Moved comments before the code they describe
* Use type inference where appropriate
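The CompareTag change above is a standard Unity allocation fix: reading `gameObject.tag` allocates a new managed string on every access, while `CompareTag` does the comparison natively. A small illustrative example (the tag name and class are made up):

```csharp
using UnityEngine;

public class GoalDetector : MonoBehaviour
{
    void OnTriggerEnter(Collider other)
    {
        // Allocation-free, and logs an error if the tag is misspelled:
        if (other.CompareTag("goal"))
        {
            Debug.Log("Reached the goal");
        }
        // Avoided: `other.tag == "goal"` allocates a string every call,
        // which adds GC pressure in per-step code paths.
    }
}
```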
This should help developers figure out faster whether errors are happening on the Unity side.
Digging through Player.log or using a development build could be replaced by this feature.
* `learn.py` is now main script for training brains.
* Simultaneous multi-brain training is now possible.
* `ghost-trainer` allows for proper training in adversarial scenarios.
* `imitation-trainer` provides a basic implementation of real-time behavioral cloning.
* All trainer hyperparameters now exist in `.yaml` files.
* `PPO.ipynb` removed.
* LSTM model added.
* More dynamic buffer class to handle greater variety of scenarios.
* Add ability to seed learning (numpy, tensorflow, and Unity) with `--seed` flag.
* Add `maxStepReached` flag to Agents and Academy.
* Change the way value bootstrapping works in PPO to take advantage of timeouts (see the sketch after this list).
* Default size of GridWorld changed to 5x5 in order to validate the bootstrapping changes.
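The bootstrapping change itself lives in the Python PPO trainer, but the rule is simple enough to sketch; this C# rendering is purely illustrative (all names are made up): an episode cut off by `maxStepReached` is a timeout, not a true terminal state, so the return target still bootstraps from the next state's value estimate.

```csharp
// Illustrative only; the real logic is in the Python trainer.
public static class ValueTarget
{
    public static float Compute(float reward, float gamma, bool done,
                                bool maxStepReached, float nextValueEstimate)
    {
        // A true terminal ends the episode for environment reasons;
        // a timeout merely stops collecting, so future value still exists.
        bool trueTerminal = done && !maxStepReached;
        return trueTerminal
            ? reward                               // no value beyond a real terminal
            : reward + gamma * nextValueEstimate;  // bootstrap across timeouts
    }
}
```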
* On Demand Decision: use RequestDecision and RequestAction (see the example after this list)
* New Agent Inspector: use it to set On Demand Decision
* New BrainParameters interface
* LSTM memory size is now set in Python
* New C# API
* Semantic Changes
* Replaced RunMDP
* New Bouncer Environment to test On Demand Decision
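A minimal sketch of On Demand Decision in an Agent subclass; RequestDecision and RequestAction are named in the notes above, while the class name and the choice of callbacks here are illustrative:

```csharp
using UnityEngine;

public class BouncerLikeAgent : Agent
{
    void OnCollisionEnter(Collision collision)
    {
        // Ask the brain for a fresh decision only when something
        // meaningful happens, instead of on every academy step.
        RequestDecision();
    }

    void FixedUpdate()
    {
        // Between decisions, keep re-applying the last decided action.
        RequestAction();
    }
}
```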
* [Previous Text Actions] Renamed `previous_action` to `previous_vector_action`
Added `previous_text_action` to the BrainInfo
* [Semantics] Carried the changes to the semantics of `previous_vector_action` over to the trainers
Added several class- and method-level comments that are compatible with Doxygen for auto-generation of documentation, along with some stylistic and minor code changes (summarized below).
Stylistic changes:
- Modified comments to the /// style instead of /** */ (see the example after this list)
- Removed unnecessary imports
- Removed unnecessary “private” declarations
- Limited code to 80 characters per line
- Re-organized variables to group those that are visible in Inspector (they are now at the top)
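For instance, a documentation block in the /// style is picked up both by Doxygen and by the IDE; the class and method below are illustrative, not the actual Academy.cs code:

```csharp
public class EnvironmentConfigurationExample
{
    /// <summary>
    /// Applies the given quality level to the engine.
    /// </summary>
    /// <param name="qualityLevel">Index into Unity's quality settings.</param>
    public void ConfigureEnvironment(int qualityLevel)
    {
        UnityEngine.QualitySettings.SetQualityLevel(qualityLevel, true);
    }
}
```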
Code changes:
- Renamed ScreenConfiguration to EnvironmentConfiguration (variable only used within Academy.cs, thus no other files needed modification)
- Renamed ConfigureEngine to ConfigureEnvironment and created a ConfigureEnvironmentHelper method
- Renamed _isCurrentlyInference to modeSwitched to signify when the engine config needs to be changed
- Added isCommunicatorOn flag to be explicit about the existence of a communicator
- Made isInference private which requ...
Added several class- and method-level comments that are compatible with Doxygen for auto-generation of documentation, along with some stylistic and minor code changes (summarized below).
Stylistic changes:
- Modified comments to /// style instead of /** */
- Removed unnecessary imports
- Limited code to 80 characters per line
Code changes:
- Changed SetTextObs to accept a string, not an object (see the sketch after this list)
- Renamed all methods that have “state” semantics to “info” semantics
- Renamed _InitializeAgent as OnEnableHelper
- Removed _DisableAgent, foldered into OnDisable
- Renamed StoredVectorActions to storedVectorActions, similarly for StoredTextActions
- Changed internal methods to protected since that's the desired behavior
- Renamed _info to info and _action to action since they’re already private
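A sketch of the string-based SetTextObs call, assuming an Agent subclass; the name of the observation-collection callback changed across releases, so the override below is an assumption:

```csharp
public class TextObservingAgent : Agent
{
    public override void CollectObservations()
    {
        // SetTextObs now takes a string rather than a generic object.
        SetTextObs("agent at " + transform.position);
    }
}
```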
These refactorings impacted CoreBrainInternal, ExternalCommunicator, and MLAgentsEditModeTest.
Performed minor improvements to Ball3DAgent and AgentEditor (re...