* made BrainParameters a class to set default values
Modified the error message if the state is discrete
* Add discrete state support to PPO and provide discrete state example environment
* Add flexibility to continuous control as well
* Finish PPO flexible model generation implementation
* Fix formatting
* Support color observations
* Add best practices document
* bug fix for non square observations
* Update Readme.md
* Remove scipy dependency
* Add installation doc
* added broadcast to the player and heuristic brain.
Allows the python API to record actions taken along with the states and rewards
* removed the broadcast checkbox
Added a Handshake method for the communicator
The academy will try to handshake regardless of the brains present
Player and Heuristic brains will send their information through the communicator but will not receive commands
* bug fix : The environment only requests actions from external brains when unique
* added a warning in case no brains are set to external
* fix on the instantiation of coreBrains,
fix on the conversion of actions to arrays in the BrainInfo received from step
* default discrete action is now 0
bug fix for discrete broadcast action (the action size should be one in Agents.cs)
modified Tennis so that the default action is no action
modified the TemplateDecision.cs to ensure non-null values are sent from Decide() and MakeMemory()
* minor fixes
* need to convert the s...
* More efficiently allocate memory when sending states
* Code clean-up
* Additional changes
* More GC reduction
* Remove state list initialization from example environments
* Use built-in json tool to serialize state message
* Remove commented code
* Use more efficient CompareTag (see the sketch after this list)
* Comments before code
* Use type inference where appropriate
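As an illustration of the CompareTag item above, a minimal sketch (not the actual diff; the tag name is hypothetical):

```csharp
using UnityEngine;

public class CollisionHandler : MonoBehaviour
{
    void OnCollisionEnter(Collision collision)
    {
        // Before: collision.gameObject.tag == "agent" allocated a managed string per call.
        // CompareTag avoids that allocation and also validates the tag name.
        if (collision.gameObject.CompareTag("agent"))
        {
            // handle the collision with an agent
        }
    }
}
```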
This should help developers figure out faster if errors are happening on the Unity side.
Looking into the Player.log or using a development build could be replaced by this feature.
Greatly simplified GridWorld code. It now also only uses a visual observation rather than state vector in order to demonstrate learning purely from a visual input.
* `learn.py` is now main script for training brains.
* Simultaneous multi-brain training is now possible.
* `ghost-trainer` allows for proper training in adversarial scenarios.
* `imitation-trainer` provides a basic implementation of real-time behavioral cloning.
* All trainer hyperparameters now exist in `.yaml` files.
* `PPO.ipynb` removed.
* LSTM model added.
* More dynamic buffer class to handle greater variety of scenarios.
* Add support for stacking past n states to allow network to learn temporal dependencies.
* Add Banana Collector environment for demonstrating partially observable multi-agent environments.
* Add 3DBall Hard which lacks velocity information in state representation. Used as test for LSTM and state-stacking features.
* Rework Tennis environment to be continuous control and trainable in 100k steps.
* Add ability to seed learning (numpy, tensorflow, and Unity) with `--seed` flag.
* Add `maxStepReached` flag to Agents and Academy.
* Change way value bootstrapping works in PPO to take advantage of timeouts.
* Default size of GridWorld changed to 5x5 in order to validate bootstrapping changes.
* Implement behavioral cloning for cc/dc, fc/rnn, state/observations.
* Re-organize folder structure in anticipation of unitytrainers as a package.
* Create demo environment BananaImitation to validate behavioral cloning.
* Fixes #336
* Reorganized python tests into a separate folder, and made individual test files for different (sub)modules.
* Add tests for trainer_controller, PPO, and behavioral cloning. More to come soon.
* Minor bug fixes discovered while writing tests.
* Reworked GridWorld to reset much faster.
* Cleaned ObservationToTex and reworked GetObservationMatrixList to be 3x faster.
* Fix Basic environment to properly reflect number of states.
* Fix discrete states when using stacked states.
* Add trained model for Basic environment.
* On Demand Decision : Use RequestDecision and RequestAction (see the sketch after this list)
* New Agent Inspector : Use it to set On Demand Decision
* New BrainParameters interface
* LSTM memory size is now set in python
* New C# API
* Semantic Changes
* Replaced RunMDP
* New Bouncer Environment to test On Demand Decision
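A minimal sketch of On Demand Decision usage, assuming the Agent API described above (RequestDecision / RequestAction); the class and condition names are hypothetical:

```csharp
using UnityEngine;
using MLAgents; // namespace of later SDK versions; adjust or remove for your version

public class OnDemandAgent : Agent
{
    void FixedUpdate()
    {
        if (ReadyForNewDecision())
        {
            // Ask the brain for a fresh decision only when this agent needs one.
            RequestDecision();
        }
        else
        {
            // Keep acting on the last decision without requesting a new one.
            RequestAction();
        }
    }

    // Hypothetical condition, e.g. "the previous action finished executing".
    bool ReadyForNewDecision() { return true; }
}
```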
* [Previous Text Actions] Renamed previous_action to previous_vector_action
added previous_text_action to the BrainInfo
* [Semantics] Carried the modifications to the semantics of previous_vector_action to the trainers
* Add config for crawler, and change crawler scene
* Changed number of crawlers in scene to 12
* Changed Max-steps for crawlers to 5000
* Newer hyperparameters and newly trained crawler model
* Clean up crawler code, and improve efficiency
* [timeBetweenDecisions] Reimplementation of waitTime for GridWorld and Basic
* [EnvironmentModification] Changed the gridworld TimeBetweenDecisionAtInference
* [AddVectorObs] Made it possible to call AddVectorObs with int, Vector2, Vector3, List<float> and float[] (see the sketch after this list).
* [Comments] Made the comment clearer after overloading
* [Fix] Use AddRange instead of Add when adding lists or floatarrays
* [Fix] Fix the unit tests in C# since the academy now resets in the first fixed update and not the awake method in inference
* [Fix] Adding comments to the unit tests
* [Comments] Improving the comments
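A short sketch of the new AddVectorObs overloads listed above (hypothetical agent; the overload set follows this change's description):

```csharp
using System.Collections.Generic;
using UnityEngine;
using MLAgents; // namespace of later SDK versions; adjust or remove for your version

public class ExampleObservationsAgent : Agent
{
    public override void CollectObservations()
    {
        AddVectorObs(1.5f);                        // float
        AddVectorObs(3);                           // int
        AddVectorObs(new Vector2(0.1f, 0.2f));     // Vector2 -> 2 floats
        AddVectorObs(transform.localPosition);     // Vector3 -> 3 floats
        AddVectorObs(new List<float> { 0f, 1f });  // List<float> (appended via AddRange)
        AddVectorObs(new[] { 0.3f, 0.4f });        // float[]   (appended via AddRange)
    }
}
```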
Added several class and method-level comments that are compatible with Doxygen for auto-generation of documentation, in addition to some stylistic and minor code changes (summarized below).
Stylistic changes:
- Modified comments to /// style instead of /** */
- Removed unnecessary imports
- Removed unnecessary “private” declarations
- Limited code to 80 characters per line
- Re-organized variables to group those that are visible in Inspector (they are now at the top)
Code changes:
- Renamed ScreenConfiguration to EnvironmentConfiguration (variable only used within Academy.cs, thus no other files needed modification)
- Renamed ConfigureEngine to ConfigureEnvironment and created a ConfigureEnvironmentHelper method
- Renamed _isCurrentlyInference to modeSwitched to signify when the engine config needs to be changed
- Added isCommunicatorOn flag to be explicit about the existence of a communicator
- Made isInference private which requ...
* [AddVectorObs] Converted the Examples to use the new AddVectorObs
* [AddVectorObs] Converted the Reacher to use the new AddVectorObs
* [Improvement] One liner for adding the rotation
* [New Bouncer] Revamped the Bouncer to be in 3D
* [Bouncer Configuration file] Added the BouncerBrain configuration
* [Documentation] Added the Bouncer to the documentation page
* [Fixes] Fixed lines too long and the documentation typo
* Slight adjustments to bouncer environment
* Don't default to internal brain on bouncer
Added several class and method-level comments that are compatible with Doxygen for auto-generation of documentation, in addition to some stylistic and minor code changes (summarized below).
Stylistic changes:
- Modified comments to /// style instead of /** */
- Removed unnecessary imports
- Limited code to 80 characters per line
Code changes:
- Change SetTextObs to accept a string, not an object
- Renamed all methods that have “state” semantics to “info” semantics
- Renamed _InitializeAgent as OnEnableHelper
- Removed _DisableAgent, foldered into OnDisable
- Renamed StoredVectorActions to storedVectorActions, similarly for StoredTextActions
- Changed internal methods to protected since that's the desired behavior
- Renamed _info to info and _action to action since they’re already private
These refactorings had impacts on CoreBrainInternal, ExternalCommunicator, MLAgentsEditModeTest.
Performed minor improvements to Ball3DAgent and AgentEditor (re...
- Incorporated feedback provided offline
- Fixed capitalizations of Agent/agent
- Re-organized trainers and features sections (renamed files accordingly)
- Change Agent Editor (code) to ODD feature
- Added a summary and next steps section
This completes adding API doc to all developer-facing classes: Academy, Agent and Monitor.
- Minor refactoring of variable names for spelling correctness:
- isInstanciated —> isInstantiated
- InstanciateCanvas() —> InstantiateCanvas()
- Removed unused imports.
- Removing private declarations.
- Added API doc
- Fixed semantics of state / observation to vector and visual observations
- Updated decision scripts for sample environments accordingly
RayPerception moved to a component that is now used by Banana, Soccer, Hallway, and Push Block.
Converted Push Block to use RayPerception for local perception and retrained model.
Re-worked Hallway to be more extensible.
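For context, a hedged sketch of an agent consuming the RayPerception component described above; the Perceive signature shown here is illustrative and may differ from the actual component:

```csharp
using UnityEngine;
using MLAgents; // namespace of later SDK versions; adjust or remove for your version

public class RaySensingAgent : Agent
{
    public RayPerception rayPerception;  // the shared perception component described above
    readonly float[] rayAngles = { 0f, 45f, 90f, 135f, 180f };
    readonly string[] detectableObjects = { "wall", "goal" };

    public override void CollectObservations()
    {
        // Cast rays at the given angles and append the hit information as floats.
        // Parameter names and order are illustrative only.
        AddVectorObs(rayPerception.Perceive(20f, rayAngles, detectableObjects, 0f, 0f));
    }
}
```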
Updated the Reacher's vector observation space size from 24 to 26; also, in Internal brain mode, changed the vector observation node name from "state" to "vector_observation"
* Minor changes to ensure a common visual language.
* Agents are blue (or additionally red in competitive scenarios).
* Interactable objects are orange.
* Goals are green when they are objects, and checkerboard-patterned when they are places.
* Not everything perfectly follows this, but things are mostly consistent now.
* Renamed "Banana" folder to "BananaCollectors"
* Ensured all brains were set to "Player"
* Moved non-shared assets out of the "SharedAssets" folder.
* Fixes internal brain for Banana Imitation.
* Fixes Discrete Control training for Imitation Learning.
* Fixes Visual Observations in internal brain with non-square inputs.
Fixes the following issues:
* Missing component reference in BananaRL environment.
* Neural Network for multiple visual observations was not properly generated.
* Episode time-out value estimate bootstrapping used incorrect observation as input.
* [CoreBrain] Bug fix in the internal brain
Discrete vector observations did not have the right size
* [Docs] Removed all references to the unitypackages other than the TensorFlowSharp.unitypackage
* [Basic] Updated the bytes file of basic
* [Docs] Addressed comments
* [Docs] Re-addressed the comments
* [Bug Fix] Scaling the visual input between 0 and 1
* [Comments] Added comments to the BatchVisualObservations method of the CoreInternalBrain.
* [Renaming] Renamed BlackAndWhite to blackAndWhite
This PR makes the following changes:
* Moves clipping of continuous control model into model itself. Output is now always [-1, 1].
* Internal model values are now clipped between [-3, 3] before being rescaled to [-1, 1] for output. This improves training performance by providing a wider range of values within which the pdf of the Gaussian can fall. Output of [-1, 1] is used to be more environment-creator friendly.
* Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance.
* Introduce ScaleAction() function within Python to easily rescale values from [-1, 1] to an arbitrary range (see the sketch after this list).
* Re-train all CC models using improved algorithm. All performance levels are equal or improved. In the case of Crawler, improvement is drastic.
* Update documentation appropriately.
* Made miscellaneous minor code style and optimization improvements within environments.
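The ScaleAction() rescaling mentioned above amounts to a linear map from [-1, 1] to [min, max]; shown here as a C# sketch for illustration only (the actual function lives on the Python side):

```csharp
// Illustrative arithmetic only; not the Python implementation.
static float ScaleAction(float value, float min, float max)
{
    // value in [-1, 1]  ->  result in [min, max]
    return min + (value + 1f) * 0.5f * (max - min);
}
```

For example, a model output of 0 maps to the midpoint of the target range.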
* [Refactor] Fixed line indentation
* Removed the library Newtonsoft.Json from the monitor
* Replaced calls to JSON conversion with manual conversion
* [Modified] The Monitor now has multiple Log methods that take different object types
* [containers] Enables container support for scenes that use visual observations
* [Initial Commit] Works only with simple balance ball
* [Optimization] Store the academy in the brainBatcher as a temporary measure
* [Modifications] Made it work from the editor as a prototype
* [Made socket communicator and reimplemented all functionalities]
* [Forgotten file] removed .meta file
* [Forgot the meta file]
* [Metafile] deleted metafile
* [Comments] Removed dead code
* [Comments] Added some descriptions
* [Bug Fix] Multi brain scenario
* [improved AgentInfo converter]
* [Optimization] Remove VectorObs since StackedVectorObs is present in the AgentInfo protobuf object
* [Timeout] Implemented a timeout for the rpc communicator in Unity
* [Libraries] Added the C# Protobuf and Grpc libraries
* [Requirements] Added protobuf 3.5.2 to the requirements
* [Code Formatting] Removed dead code and split some lines
...
* Adds implementation of Curiosity-driven Exploration by Self-supervised Prediction (https://arxiv.org/abs/1705.05363) to PPO trainer.
* To enable, set use_curiosity flag to true in hyperparameter file.
* Includes refactor of unitytrainers model code to accommodate new feature.
* Adds new Pyramids environment (w/ documentation). Environment contains sparse reward, and can only be solved using PPO+Curiosity.
* Revamps agent code for walker and crawler environments to use shared JointDriveController system.
* Crawler has been reworked to be very cute.
* Crawler & Walker environments have been reworked to be visually consistent.
* Added Dynamic Crawler scene.
* All scenes re-trained and new models added.
* Documentation changes.
* Added missing declaration to docs sample code.
* Added pretrained model as default graph in Internal brain of Tennis scene
* Disabled PlayerBrain in Tennis by default.
* Removed accidental config.
- The environment did not respond to reset parameter values
defined by a curriculum until the second reset. This was
because on the first reset, the reset parameters were not
updated by the Academy.
* [Initial Commit]
Modified the model.py file and the ppo/trainer.py file to use masked actions
* Preliminary modifications to the python side of the code to enable action masking
* Preliminary modifications to the C# side of the code to enable action masking
* Preliminary modifications to the communication side of the code to enable action masking
* Implemented action masking for BC
Note : The actions of the teacher are not masked
* More error messages for the action masking
* fix pytests
* Added Documentation
* Address comment
* Addressed Comments on docs
* Addressed second comment on docs
* Addressed comments for the python side of the code
* Created the action masker and associated unit tests
* Addressed comments on the C# side
* Addressed the comment regarding action_masking_name
* Addressed the comments
* GridWorld now uses action masking (see the sketch after this list)
* Addressed the comments
* addressed comments
* Added checkbox to turn action masking on/off (#1146)
* Added checkbox to turn action masking on/off
* Fix to handle the no-action option
* Added comment to GridWorld mentioning the use of action masking. (#1153)
* updated the Pyramids model
* updated the pyramids model, changed the max_steps to reflect the new max steps required to achieve ~1.8 cumulative reward
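A hedged sketch of the action-masking usage referenced above, assuming the SetActionMask API added in this change (branch and index values are hypothetical):

```csharp
using MLAgents; // namespace of later SDK versions; adjust or remove for your version

public class MaskingAgent : Agent
{
    const int MoveBranch = 0;
    const int MoveLeft = 1;

    public override void CollectObservations()
    {
        // Forbid the "move left" action for the next decision when the agent
        // is already against the left wall (the condition is a placeholder).
        if (IsAgainstLeftWall())
        {
            SetActionMask(MoveBranch, new[] { MoveLeft });
        }
    }

    bool IsAgainstLeftWall() { return false; }
}
```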
* Initial Commit
Ported most functionalities, still need to :
- Documentation
- Add Comments
- Custom drawer for BrainParameters
- Fix the UnitTests
- Review Functionalities
* Added Custom Drawer for the Brain Parameters
* Improvements to the HubDrawer
* Modified the Brain Editors
* Minor bug fixes and UI changes
* Modified the Help Boxes of the Drawers
* Modified Brain class, renamed Initialize and made DecideAction virtual
* Fix the UnityTests
* Simpler Brain creation menu
* Renamed Internal Brain to Learning Brain
* modified the parameters to remove reference to External or Internal in the Protobuf objects
* Updated the protobuf generated files
* Fix the Pytests
* Removed the graph scope from the Learning Brain
* cleaner logic than try catch
* Removed the isExternal field of the brain and put the isTraining logic into LearningBrain and Training Hub
* Modified how the Brain finds the A...
* New brains for Pyramid scene
* Add reacher brains
* New brains for Soccer agents
* New Tennis Brains
* Set prefabs correctly
* New brains for bouncer
* New Dynamic Crawler Brains
* Initial Commit
* attempt at refactor
* Put all static methods into the CoreInternalBrain
* improvements
* more testing
* modifications
* renamed epsilon
* misc
* Now supports discrete actions
* added discrete support and RNN and visual. Left to do is refactor and save variables into models
* code cleaning
* made a tensor generator and applier
* fix on the models.py file
* Moved the Checks to a different Class
* Added some unit tests
* BugFix
* Need to generate the output tensors as well as inputs before executing the graph
* Made NodeNames static and created a new namespace
* Added comments to the TensorAppliers
* Started adding comments on the TensorGenerators code
* Added comments for the Tensor Generator
* Moving the helper classes into a separate folder
* Added initial comments to the TensorChecks
* Renamed NodeNames -> TensorNames
* Removing warnings in tests
* Now using Aut...
* Adding model for 3D Balance Ball.
* Adding LearningBrain to BroadCast Hub.
* Removed CrawlerPlayer Brain
* Renamed CrawlerLearning —> CrawlerStaticLearning
* Update Hallway models
* Attaching model to brain for Hallway
* Attaching model to 3DBall Brain.
* Updated CrawlerLearning —> CrawlerStaticLearning on trainer config.
* Adding Reacher model
* Remove model specification in Hallway Brain asset
* Removing model specification from 3Dball scene
* Adding crawler model file
* Specifying learning brain as default for crawler
* Switched default Mac GFX API to Metal
* Added Barracuda pre-0.1.5
* Added basic integration with Barracuda Inference Engine
* Use predefined outputs the same way as for TF engine
* Fixed discrete action + LSTM support
* Switch Unity Mac Editor to Metal GFX API
* Fixed null model handling
* All examples converted to support Barracuda
* Added model conversion from Tensorflow to Barracuda
copied the barracuda.py file to ml-agents/mlagents/trainers
copied the tensorflow_to_barracuda.py file to ml-agents/mlagents/trainers
modified the tensorflow_to_barracuda.py file so it could be called from mlagents
modified ml-agents/mlagents/trainers/policy.py to convert the tf models to barracuda compatible .bytes file
* Added missing iOS BLAS plugin
* Added forgotten prefab changes
* Removed GLCore GFX backend for Mac, because it doesn't support Compute shaders
* Exposed GPU support for LearningBrain inference
...
* Ticked API :
- Ticked API for pypi for mlagents
- Ticked API for pypi for unity-gym
- Ticked Communication number for API
- Ticked Model Loader number for API
* Ticked the API for the pytest
* Fix for Brains not reinitialising when the scene is reloaded.
This was a bug caused by the conversion of Brains over to ScriptableObjects. ScriptableObjects persist in memory between scene changes, which means that after a scene change the Brains would still be initialised and the agentInfos list would contain invalid references to the Agents from the previous scene.
The fix is to have the Academy notify the Brains when it is destroyed. This allows the Brains to clean themselves up and transition back to an uninitialised state. After the new scene is loaded, the Brain's LazyInitialise will reconnect the Brain to the new Academy as expected.
* Fix typos
* Use abstract class for rayperception
* Created RayPerception2D. (#1721)
* Incorporate RayPerception2D
* Fix typo
* Make abstract class
* Add tests
* Garbage collection optimisations:
- Changed a few IEnumerable instances to IReadOnlyList. This avoids some unnecessary GC allocs that cast the Lists to IEnumerables.
- Moved cdf allocation outside of the loop to avoid unnecessary GC allocation.
- Changed GeneratorImpl to use plain float and int arrays instead of Array during generation. This avoids SetValue performing boxing on the arrays, which eliminates an awful lot of GC allocs.
* Convert InferenceBrain to use IReadOnlyList to avoid garbage creation.
* Added RenderTexture support for visual observations
* Cleaned up new ObservationToTexture function
* Added check for the width/height of the RenderTexture
* Added check to hide HelpBox unless both cameras and RenderTextures are used
* Added documentation for Visual Observations using RenderTextures
* Added GridWorldRenderTexture Example scene
* Adjusted image size of doc images
* Added GridWorld example reference
* Fixed missing reference in the GridWorldRenderTexture scene and resaved the agent prefab
* Fix prefab instantiation and render timing in GridWorldRenderTexture
* Added screenshot and reworded documentation
* Unchecked control box
* Rename renderTexture
* Make RenderTexture scene default for GridWorld
Co-authored-by: Mads Johansen <pyjamads@gmail.com>
- Ticked API for pypi for mlagents
- Ticked API for pypi for mlagents_envs
- Ticked Communication number for API
- Ticked API for unity-gym
* Ticked the API for the pytest
When using the SubprocessUnityEnvironment, parallel writes are
made to UnitySDK.log. This causes file access violation issues
in Windows/C#. This change modifies the access and sharing mode
for our writes to UnitySDK.log to fix the issue.
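A hedged sketch of the kind of change described above (path and helper name are illustrative): open the log with access/share modes that tolerate concurrent writers, so parallel environment processes don't hit sharing violations on Windows.

```csharp
using System.IO;

static class SharedLog
{
    // Illustrative helper: append to UnitySDK.log while allowing other
    // processes to read and write the same file concurrently.
    public static StreamWriter Open(string path)
    {
        var stream = new FileStream(
            path,
            FileMode.Append,
            FileAccess.Write,
            FileShare.ReadWrite);
        return new StreamWriter(stream);
    }
}
```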
* Added the builder script
* Removed the menu item
* Changed the brainToControl to public
* Added the scene for switching
* Modified according to the comments
* Removed the Builder and BuilderUtils script, made all of the logic into the Startup.cs
* Switched back to the previous way using PreExport method
* Added the return at the EOF.
* Resolved the codacy comments.
* Removed one empty line
* Resolved the 2 round comments
* Sanitize demo filenames so that they can't be too long, overflow the header, and corrupt demo files
* Fix issue where 1st demo of each episode is always recorded as 0 action
Bringing a bucket of temp memory allocation optimizations:
* switched to Barracuda-backed tensors across the board, which helps leverage allocators and reuse of the internal buffers
* added Barracuda 0.2.4 release, which brings another set of temp memory allocation fixes
* Removed obsolete 'TestDstWrongShape' test as it does not reflect how Barracuda tensors work
* Added proper test cleanup, to avoid warning messages from finalizer thread.
- Fix issue with BC Trainer `increment_steps`.
- Fix issue with Demonstration Recorder and visual observations (memory leak fix was deleting vis obs too early).
- Make Samplers sample from the same random seed every time, so generalization runs are repeatable.
- Fix crash when using GAIL, Curiosity, and visual observations together.
Only cosmetic and readability improvements. No functional changes were intended.
Utilities.cs
- Fixed comments across file
- Made class static
- Removed unnecessary imports
- Removed unused method arguments
- Renamed variables as appropriate to make usage clearer
- In AddRangeNoAlloc, disabled (by comment) Rider’s suggestion to revert to use of built-in Range field (Fixed)
- In TextureToTensorProxy, swapped order of first two arguments to be more in-line with convention of input, output
UtilitiesTests.cs
- Removed unnecessary imports
- Simplified array creation commands
GeneratorImp.cs
- Rider automatically deleted spaces on empty lines
- Changed call to TextureToTensorProxy to mirror new argument ordering
* Clean-up to UnityAgentsException.cs
- Removed unnecessary imports
- Fixed comment warning
- Fixed method header
* Improvements to Startup.cs
- Created const for SCENE_NAME field
- Fixed strin...
* Initial Commit
* Remove the Academy Done flag from the protobuf definitions
* remove global_done in the environment
* Removed irrelevant unitTests
* Remove the max_step from the Academy inspector
* Removed global_done from the python scripts
* Modified and removed some tests
* This actually does not break either curriculum or generalization training
* Replace global_done with reserved.
Addressing Chris Elion's comment regarding the deprecation of the global_done field. We will use a reserved field to make sure the global_done field is not replaced in the future, which would cause errors.
* Removed unused fake brain
* Tested that the first call to step was the same as a reset call
* black formatting
* Added documentation changes
* Editing the migrating doc
* Addressing comments on the Migrating doc
* Addressing comments :
- Removing dead code
- Resolving forgotten merge conflicts
- Editing documentations...
* new env styles rebased on develop
* added new trained models
* renamed food collector platforms
* reduce training timescale on WallJump from 100 to 10
* uncheck academy control on walljump
* new banner image
* rename banner file
* new example env images
* add foodCollector image
* change Banana to FoodCollector and update image
* change bouncer description to include green cube
* update image
* update gridworld image
* cleanup prefab names and tags
* updated soccer env to reference purple agent instead of red
* remove unused mats
* rename files
* remove more unused tags
* update image
* change platform to agent cube
* update text. change platform to agents head
* cleanup
* cleaned up weird unused meta files
* add new wall jump nn files and rename a prefab
* walker change stacked states from 5 to 1
walker collects physics observations so stacked states are not need...
* proof of concept - simple C# hierarchical timers
* fix compile error, add CustomSampler placeholder
* use CustomSampler and Recorder per node
* singleton, add to Batcher
* output timers
* raw counts and times
* curly braces
* timer cleanup
* json serialize timers
* more timer cleanup
* dont accumulate from Recorders
* move Timers to own file
* meta file
* Wait for env process to exit before killing it
* timer cleanup
* docstrings
* undo some accidental changes
* make timers closer to python
* Timer unit test
* getters
* no => for properties
* singleton
* property one-liner
* scientific notation, cleanup TODOs
* reasonable values for root timer
- Push (almost) all references to protobuf objects into the RpcCommunicator.
- Simplify the passing around of Agents and Agent Infos.
- Delete all references to the Batcher.
- Simplify the Environment Step by removing all of the reset and message counting logic.
- Finishes MLA-27 and MLA-28
* Feature Deprecation : Online Behavioral Cloning
In this PR :
- Delete the online_bc_trainer
- Delete the tests for online bc
- delete the configuration file for online bc training
* Deleting the BCTeacherHelper.cs Script
TODO :
- Remove usages in the scene
- Documentation Edits
*DO NOT MERGE*
* IMPORTANT : REMOVED ALL IL SCENES
- Removed all the IL scenes from the Examples folder
* Removed all mentions of online BC training in the Documentation
* Made a note in the Migrating.md doc about the removal of the Online BC feature.
* Modified the Academy UI to remove the control checkbox and replaced it with a train in the editor checkbox
* Removed the Broadcast functionality from the non-Learning brains
* Bug fix
* Note that the scenes are broken since the BroadcastHub has changed
* Modified the LL-API for Python to remove the broadcasting functionality.
* All unit tests are running
* Modified the scen...
* Created the model runner and uses a shared interface with the communicator.
Fixing bugs with deallocation
Removing unnecessary code
Added code comments
Renaming
* Addressing comments
* Modified the constructor of ModelRunner
* Addressing comments
* renaming the _verbose variable
* Addressing comments : Removed the Verbose check in the LearningBrainEditor
* ISensor and SensorBase (see the sketch after this list)
* camera and rendertex first pass
* use isensors for visual obs
* Update gridworld with CameraSensors
* compressed obs for reals
* Remove AgentInfo.visualObservations
* better separation of train and inference sensor calls
* compressed obs proto - need CI to generate code
* int32
* get proto name right
* run protoc locally for new files
* apply generated proto patch (pyi files were weird)
* don't repeat bytes
* hook up compressedobs
* dont send BrainParameters until there's an AgentInfo
* python BrainParameters now needs an AgentInfo to create
* remove last (I hope) dependency on camerares
* remove CameraResolutions and AgentInfo.visual_observations
* update mypy-protobuf version
* cleanup todos
* python cleanup
* more unit test fixes
* more unit test fix
* camera sensors for VisualFood collector, record demo
* SensorCompon...
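A simplified, self-contained illustration of the sensor abstraction introduced above; the interface and method names here are deliberately not the exact SDK API, just the shape of the idea (report an observation shape, write floats on demand):

```csharp
using System;

// Simplified stand-in for the ISensor idea: not the actual SDK interface.
public interface IExampleSensor
{
    int[] GetObservationShape();
    void WriteObservations(float[] output);
    string GetName();
}

public class VelocitySensor : IExampleSensor
{
    readonly Func<float[]> readVelocity; // e.g. () => rigidbody velocity as 3 floats

    public VelocitySensor(Func<float[]> readVelocity) { this.readVelocity = readVelocity; }

    public int[] GetObservationShape() => new[] { 3 };

    public void WriteObservations(float[] output)
    {
        var v = readVelocity();
        Array.Copy(v, output, 3);
    }

    public string GetName() => "VelocitySensor";
}
```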
* 1 to 1 Brain to Agent
This is a work in progress
In this PR :
- Deleted all Brain Objects
- Moved the BrainParameters into the Agent
- Gave the Agent a Heuristic method (see Balance Ball for an example, and the sketch after this list)
- Modified the Communicator and ModelRunner : Put can only take one agent at a time
- Made the IBrain Interface with RequestDecision and DecideAction method
No changes made to Python
[Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)
* Removing editorconfig
* Updating BalanceBall scene
* grammar mistake
* Clearing the Agents of the Model runner
* Added Documentation on IBrain
* Modified comments on GiveModel
* Introduced a factory
* Split Learning Brain in two
* Changes to walljump
* Fixing the Unit tests
* Renaming the Brain to Policy
* Heuristic now has priority over training
* Edited code comments
* Fixing bugs
* Develop one to one scene edits...
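A hedged sketch of the per-Agent Heuristic mentioned above, following the Balance Ball example's pattern (the signature may differ by SDK version):

```csharp
using UnityEngine;
using MLAgents; // namespace of later SDK versions; adjust or remove for your version

public class HeuristicBallAgent : Agent
{
    // Produces actions when no model is attached and no trainer is connected.
    public override float[] Heuristic()
    {
        var action = new float[2];
        action[0] = Input.GetAxis("Horizontal"); // illustrative input mapping
        action[1] = Input.GetAxis("Vertical");
        return action;
    }
}
```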
* Initial commit removing memories from C# and deprecating memory fields in proto
* initial changes to Python
* Adding functionalities
* Fixes
* adding the memories to the dictionary
* Fixing bugs
* tweaks
* Resolving bugs
* Recreating the proto
* Addressing comments
* Passing by reference does not work. Do not merge
* Fixing huge bug in Inference
* Applying patches
* fixing tests
* Addressing comments
* Renaming variable to reflect type
* test
* Update package and communicator versions to 0.11
* Remove pip cache fallback for CircleCI
This change removes the caching fallback in the case where dependencies
change, since it can cause CI failures when we have incompatible
dependencies in the cache.
* Limit Tensorflow version for tests to <2.0
* Use stable bokken image. (#2815)
* build fixes for 2018+ (#2808)
* rename CompressionType enum
* fix standalone build test for 2018+
* Add more editor versions for testing. (#2809)
* class variable for API version, fix env tests (#2817)
* fixed area prefab
agents were pointing to the wrong laser gameObject.
* WIP VectorSensor and StackedSensor
* fix a few dumb mistakes
* more VectorSensor
* remove Update(), add util methods, hook into TensorGenerator
* WriteAdapter to write to tensors and arrays
* write float observations
* used circular buffer for stacked obs
* cleanup
* fix unit tests
* docstrings
* undo accidental checkins
* rider suggestions, add range check
* bounds check before writing
* undo ProjectVersion.txt change
* fix unit tests
* unit test for VectorSensor
* StackingSensor tests
* missing meta file
* missing meta file
* WriteAdapter tests
* Modifying the .proto files
* attempt 1 at refactoring Python
* works for ppo hallway
* changing the documentation
* now works with both sac and ppo both training and inference
* Need to fix the tests
* TODOs :
- Fix the demonstration recorder
- Fix the demonstration loader
- verify the intrinsic reward signals work
- Fix the tests on Python
- Fix the C# tests
* Regenerating the protos
* fix proto typo
* protos and modifying the C# demo recorder
* modified the demo loader
* Demos are loading
* IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT
* Modified all the demo files
* Fixing all the tests
* fixing ci
* addressing comments
* removing reference to memories in the ll-api
* Removed Barracuda as drop-in library, added Barracuda package dependency
* Removed Google Protobuf library as now it comes with Barracuda package
* List<T>.Length seems to be an extension that is not available in the .NET version that ships with Unity 2017; switched to .Count
* [WIP] Side Channel initial layout
* Working prototype for raw bytes
* fixing format mistake
* Added some errors and some unit tests in C#
* Added the side channel for the Engine Configuration. (#2958)
* Added the side channel for the Engine Configuration.
Note that this change does not require modifying a lot of files :
- Adding a sender in Python
- Adding a receiver in C#
- subscribing the receiver to the communicator (a one-liner in the Academy)
- Add the side channel to the Python UnityEnvironment (not represented here)
Adding the side channel to the environment would look like such :
```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.raw_bytes_channel import RawBytesChannel
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
channel0 = RawBytesChannel()
channel1 = EngineConfigurationChannel()
env = UnityEnvironme...
* added team id and identifier concat to behavior parameters
* splitting brain params into brain name and identifiers
* set team id in prefab
* receives brain_name and identifier on python side
* rebased with develop
* Correctly calls concatBehaviorIdentifiers
* trainer_controller expects name_behavior_ids
* add_policy and create_policy separated
* adjusting tests to expect trainer.add_policy to be called
* fixing tests
* fixed naming ...
* pass shape to WriteAdapter
* handle floats on python side
* cleanup
* whitespace
* rename GetFloatObservationShape, support uncompressed in RenderTexture sensor
* numpy float32
* remove unused using
* Float sensor and unit test
* replace asserts with exceptions, docstrings
* initial commit
* Fixed the compilation errors
* fixing the tests
* Addressing the comment about the brain parameters
* Fixing typo
* Made timers more accurate
* addressing comments
* Better memory allocation
* Added some docstrings
* Adding better sensor validation
* Wrapped in #if DEBUG and also wrapped GenerateSensorData in a timer
* Timer changes
* Simplifying the Agent reset logic
- Agents will reset in ResetIfDone immediately after being marked Done
- Agents will always request a decision right after reset
- This change implies that additional messages might be sent to Python
* Fixing the Unit Tests
* Added a note in the Migrating.md document
* Trimming some of the methods of the agent but left SetReward
* Fixing bugs
* modifying the environments
* Reintroducing IsDone and IsMaxStepReached
* Updating the Migrating doc
* more details on the Migration
* Made the Agent reset immediately
* fixing the C# tests
* Fixing the tests still
* Trying with incremental episode ids
* deleting buffer rather than using an empty list
* Addressing the comments
* Forgot to edit the comment on AgentInfo
* Updating the migrating doc
* Fixed an obvious bug
* cleaning after an agent is done in agent processor
* Fixing the pytest errors