* added broadcast to the player and heuristic brain.
Allows the python API to record actions taken along with the states and rewards
* removed the broadcast checkbox
Added a Handshake method for the communicator
The academy will try to handshake regardless of the brains present
Player and Heuristic brains will send their information through the communicator but will not receive commands
* bug fix : The environment only requests actions from external brains when unique
* added warning in case no brains are set to external
* fix on the instantiation of coreBrains,
fix on the conversion of actions to arrays in the BrainInfo received from step
* default discrete action is now 0
bug fix for discrete broadcast action (the action size should be one in Agents.cs)
modified Tennis so that the default action is no action
modified the TemplateDecision.cs to ensure non-null values are sent from Decide() and MakeMemory()
* minor fixes
* need to convert the s...
* More efficiently allocate memory when sending states
* Code clean-up
* Additional changes
* More GC reduction
* Remove state list initialization from example environments
* Use built-in json tool to serialize state message
* Remove commented code
* Use more efficient CompareTag
* Comments before code
* Use type inference where appropriate
* On Demand Decision : Use RequestDecision and RequestAction
* New Agent Inspector : Use it to set On Demand Decision
* New BrainParameters interface
* LSTM memory size is now set in python
* New C# API
* Semantic Changes
* Replaced RunMDP
* New Bouncer Environment to test On Demand Decision
* Add config for crawler, and change crawler scene
* Changed number of crawlers in scene to 12
* Changed Max-steps for crawlers to 5000
* Newer hyperparameters and newly trained crawler model
* Clean up crawler code, and improve efficiency
* [AddVectorObs] Converted the Examples to use the new AddVectorObs
* [AddVectorObs] Converted the Reacher to use the new AddVectorObs
* [Improvement] One liner for adding the rotation
* Minor changes to ensure a common visual language.
* Agents are blue (or additionally red in competitive scenarios).
* Interactable objects are orange.
* Goals are green when objects, and checkerboards when places.
* Not everything perfectly follows this, but things are mostly consistent now.
* Renamed "Banana" folder to "BananaCollectors"
* Ensured all brains were set to "Player"
* Moved non-shared assets out of the "SharedAssets" folder.
This PR makes the following changes:
* Moves clipping of continuous control model into model itself. Output is now always [-1, 1].
* Internal model values are now clipped between [-3, 3] before being rescaled to [-1, 1] for output. This improves training performance by providing a wider range of values within which the pdf of the gaussian can fall. Output of [-1, 1] is used to be more environment-creator friendly.
* Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance.
* Introduce ScaleAction() function within python to easily rescale values from [-1, 1] to arbitrary range.
* Re-train all CC models using improved algorithm. All performance levels are equal or improved. In the case of Crawler, improvement is drastic.
* Update documentation appropriately.
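
The clipping and rescaling described in this PR can be illustrated with a minimal sketch. This is not the code from the PR: the function names `squash_action` and `scale_action`, their signatures, and the choice of a plain linear map are assumptions made for illustration. Only the [-3, 3] clip range, the [-1, 1] output range, and the existence of a `ScaleAction()` helper come from the notes above.

```python
def squash_action(raw, clip=3.0):
    """Clip a raw model value to [-clip, clip], then rescale it to [-1, 1].

    Hypothetical helper: mirrors the "clip to [-3, 3], rescale to [-1, 1]"
    step described in the PR notes.
    """
    clipped = max(-clip, min(clip, raw))
    return clipped / clip


def scale_action(action, low, high):
    """Linearly map an action in [-1, 1] to the range [low, high].

    Hypothetical stand-in for the ScaleAction() helper; the real
    signature is not given in the PR notes.
    """
    return low + (action + 1.0) * 0.5 * (high - low)


# Example: a raw value outside the clip range, mapped to a 0..10 range.
normalized = squash_action(9.0)
scaled = scale_action(normalized, 0.0, 10.0)
```

Under these assumptions the model always emits values in [-1, 1], and each environment chooses its own target range when it consumes the action.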
* Made miscellaneous minor code style and optimization improvements within environments.
* Revamps agent code for walker and crawler environments to use shared JointDriveController system.
* Crawler has been reworked to be very cute.
* Crawler & Walker environments have been reworked to be visually consistent.
* Added Dynamic Crawler scene.
* All scenes re-trained and new models added.
* Documentation changes.
* New brains for Pyramid scene
* Add reacher brains
* New brains for Soccer agents
* New Tennis Brains
* Set prefabs correctly
* New brains for bouncer
* New Dynamic Crawler Brains
* Adding model for 3D Balance Ball.
* Adding LearningBrain to BroadCast Hub.
* Removed CrawlerPlayer Brain
* Renamed CrawlerLearning -> CrawlerStaticLearning
* Update Hallway models
* Attaching model to brain for Hallway
* Attaching model to 3DBall Brain.
* Updated CrawlerLearning -> CrawlerStaticLearning on trainer config.
* Adding Reacher model
* Remove model specification in Hallway Brain asset
* Removing model specification from 3Dball scene
* Adding crawler model file
* Specifying learning brain as default for crawler
* Switched default Mac GFX API to Metal
* Added Barracuda pre-0.1.5
* Added basic integration with Barracuda Inference Engine
* Use predefined outputs the same way as for TF engine
* Fixed discrete action + LSTM support
* Switch Unity Mac Editor to Metal GFX API
* Fixed null model handling
* All examples converted to support Barracuda
* Added model conversion from Tensorflow to Barracuda
copied the barracuda.py file to ml-agents/mlagents/trainers
copied the tensorflow_to_barracuda.py file to ml-agents/mlagents/trainers
modified the tensorflow_to_barracuda.py file so it could be called from mlagents
modified ml-agents/mlagents/trainers/policy.py to convert the tf models to barracuda compatible .bytes file
* Added missing iOS BLAS plugin
* Added forgotten prefab changes
* Removed GLCore GFX backend for Mac, because it doesn't support Compute shaders
* Exposed GPU support for LearningBrain inference
...
* new env styles rebased on develop
* added new trained models
* renamed food collector platforms
* reduce training timescale on WallJump from 100 to 10
* uncheck academy control on walljump
* new banner image
* rename banner file
* new example env images
* add foodCollector image
* change Banana to FoodCollector and update image
* change bouncer description to include green cube
* update image
* update gridworld image
* cleanup prefab names and tags
* updated soccer env to reference purple agent instead of red
* remove unused mats
* rename files
* remove more unused tags
* update image
* change platform to agent cube
* update text. change platform to agents head
* cleanup
* cleaned up weird unused meta files
* add new wall jump nn files and rename a prefab
* walker change stacked states from 5 to 1
walker collects physics observations so stacked states are not need...
* Feature Deprecation : Online Behavioral Cloning
In this PR :
- Delete the online_bc_trainer
- Delete the tests for online bc
- delete the configuration file for online bc training
* Deleting the BCTeacherHelper.cs Script
TODO :
- Remove usages in the scene
- Documentation Edits
*DO NOT MERGE*
* IMPORTANT : REMOVED ALL IL SCENES
- Removed all the IL scenes from the Examples folder
* Removed all mentions of online BC training in the Documentation
* Made a note in the Migrating.md doc about the removal of the Online BC feature.
* Modified the Academy UI to remove the control checkbox and replaced it with a train in the editor checkbox
* Removed the Broadcast functionality from the non-Learning brains
* Bug fix
* Note that the scenes are broken since the BroadcastHub has changed
* Modified the LL-API for Python to remove the broadcasting functionality.
* All unit tests are running
* Modifie...
- Push (almost) all references to protobuf objects into the RpcCommunicator.
- Simplify the passing around of Agents and Agent Infos.
- Delete all references to the Batcher.
- Simplify the Environment Step by removing all of the reset and message counting logic.
- Finishes MLA-27 and MLA-28
* 1 to 1 Brain to Agent
This is a work in progress
In this PR :
- Deleted all Brain Objects
- Moved the BrainParameters into the Agent
- Gave the Agent a Heuristic method (see Balance Ball for example)
- Modified the Communicator and ModelRunner : Put can only take one agent at a time
- Made the IBrain Interface with RequestDecision and DecideAction method
No changes made to Python
[Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)
* Removing editorconfig
* Updating BalanceBall scene
* grammar mistake
* Clearing the Agents of the Model runner
* Added Documentation on IBrain
* Modified comments on GiveModel
* Introduced a factory
* Split Learning Brain in two
* Changes to walljump
* Fixing the Unit tests
* Renaming the Brain to Policy
* Heuristic now has priority over training
* Edited code comments
* Fixing bugs
* Develop one to one scene edits...
* Trimming some of the methods of the agent but left SetReward
* Fixing bugs
* modifying the environments
* Reintroducing IsDone and IsMaxStepReached
* Updating the Migrating doc
* more details on the Migration
* Add the VectorSensor to the CollectObservation call
* Example of API change for BalanceBall
* Modified the Examples
* Changes to the migrating doc
* Editing the docs
* Update docs/Learning-Environment-Design-Agents.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* addressing comments
* Removed the MLAgents.Sensor namespace
* Removing the MLAgents.Sensor namespace from the tests
* Editing the migrating docs
Co-authored-by: Chris Elion <celion@gmail.com>
* [skip ci] Renamed methods in the Agent class
WARNING: when implementing obsolete methods, the user will see the message: Member `old method` overrides obsolete member `old method`. Add the Obsolete attribute to `old method`. The message will not suggest the new method to override.
* [skip ci] Updated the example environment
* [skip ci] Updated migrating and changelog
* [skip ci] Editing the docs
* [skip ci] Missing docs
* :+1
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* [skip ci] documentation changes
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Gett...
* [bug-fix] Increase height of wall in CrawlerStatic (#3650)
* [bug-fix] Improve performance for PPO with continuous actions (#3662)
* Corrected a typo in a name of a function (#3670)
OnEpsiodeBegin was corrected to OnEpisodeBegin in Migrating.md document
* Add Academy.AutomaticSteppingEnabled to migration (#3666)
* Fix editor port in Dockerfile (#3674)
* Hotfix memory leak on Python (#3664)
* Hotfix memory leak on Python
* Fixing
* Fixing a bug in the heuristic policy. A decision should not be requested when the agent is done
* [bug-fix] Make Python able to deal with 0-step episodes (#3671)
* adding some comments
Co-authored-by: Ervin T <ervin@unity3d.com>
* Remove vis_encode_type from list of required (#3677)
* Update changelog (#3678)
* Shorten timeout duration for environment close (#3679)
The timeout duration for closing an environment was set to the
same duration as the timeout when waiting ...
* Bumping version on the release (#3615)
* Update examples project to 2018.4.18f1 (#3618)
From 2018.4.14f1. An internal package dependency was updated as
a side effect.
* Remove dead components from the examples scenes (#3619) (#3624)
* Improve warnings and exception if using unsupported combo
* add meta file
* fix unit test
* enforce onnx conversion (expect tf2 CI to fail) (#3600)
* Update error message
* Updated the release branch docs (#3621)
* Updated the release branch docs
* Edited the README
* make sure top-level timer is closed before writing
* Remove space from Product Name for examples
In #2588 it was suggested that the space in the Product Name for
our example environments causes confusion when using a default build
because of the need to escape the space in the build filename.
This change removes the space from the Product Name in the project's
player settings.
* [bug-fix] Incr...
* about to implement orientation cube
* oCube spawning works. ready to train
* working. about to try com
* ready for training
* add random rot on episode start
* feet now alternate but runs backwards
* still running with right leg in front
* increased joint strength to 40k
* removed texture example
* reduced maxAngVel, enabled enhanced determinism, cont spec
* rebuilt walker ragdoll to scale 1
* rebuilt ragdoll ready
* update walker pair prefab
* fixed bp hierarchy
* added trained model, renamed scene, usecollisioncallbacks
* updated dynamic platforms
* added dynamic walker tf file. max speed 5
* DynamicWalker working. has working nn file
* collect local rotations
* added new dynamic nn file
* hip facing reward
* Create WalkerDynamic.yaml
* fix hip rotation
* about to clean up code
* added dirIndicator and orentCubeGizmo
* clean up
* cleanup
* up...
* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Removed a redundant "to" from a sentence; it was probably a typo in the document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc references with release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
* added Target and OCube controllers. updated crawler envs
* update walker prefab
* add refs to prefab
* Update Crawler.prefab
* update platform, ragdoll, ocube prefabs
* reformat file
* reformat files
* fix behavior name
* add final retrained crawler and walker nn files
* collect hip ocube rot in world space
* update crawler observations and update prefabs
* change to 20M steps
* update crwl prefab to 142 observ
* update obsvs to 241. add expvel reward
* change walkspeed to 3
* add new crawler and walker nn files
* adjust rewards
* enable other pairs
* add RewardManager
* cleanup about to do final training
* cleanup add nn files for increased facing rew reduced height rew
* try no facing rew
* add vel only policy, try dy target
* inc torq on cube
* added dynamic cube nn. gonna try 40M steps
* add 40M step test, more cleanup
* ch...
* Removing some scenes: all the Static and all the non-variable-speed environments. Also removed Bouncer, PushBlock, WallJump and Reacher. Removed a bunch of visual environments as well. Removed 3DBallHard and FoodCollector (kept Visual and Grid FoodCollector)
* readding 3DBallHard
* readding pushblock and walljump
* Removing tennis
* removing mentions of removed environments
* removing unused images
* Renaming Crawler demos
* renaming some demo files
* removing and modifying some config files
* new examples image?
* removing Bouncer from build list
* replacing the Bouncer environment with Match3 for llapi tests
* Typo in yamato test