* Add support for stacking past n states to allow network to learn temporal dependencies.
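A rough illustration of what stacking past states gives the network, independent of the actual implementation (the class name and default stack size here are made up):

```python
from collections import deque

import numpy as np

class StateStacker:
    """Keep the last n vector observations and present them concatenated,
    so a feed-forward network can pick up short-term temporal patterns."""

    def __init__(self, obs_size, n=4):
        # Start from zeros so the stacked vector has a fixed size on step one.
        self.frames = deque([np.zeros(obs_size)] * n, maxlen=n)

    def add(self, obs):
        self.frames.append(np.asarray(obs))
        return np.concatenate(self.frames)
```

With n=4 and an 8-dimensional observation, the network input becomes 32-dimensional.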
* Add Banana Collector environment for demonstrating partially observable multi-agent environments.
* Add 3DBall Hard which lacks velocity information in state representation. Used as test for LSTM and state-stacking features.
* Rework Tennis environment to be continuous control and trainable in 100k steps.
* Add ability to seed learning (numpy, tensorflow, and Unity) with `--seed` flag.
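Seeding has to touch every source of randomness the trainer uses; a minimal sketch of what `--seed` implies on the Python side (the helper name is made up, and `tf.set_random_seed` assumes the TF 1.x of this era):

```python
import random

import numpy as np
import tensorflow as tf

def set_seeds(seed):
    # Python, numpy, and TensorFlow each keep their own RNG state;
    # the Unity side receives the same seed when the environment is launched.
    random.seed(seed)
    np.random.seed(seed)
    tf.set_random_seed(seed)
```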
* Add `maxStepReached` flag to Agents and Academy.
* Change the way value bootstrapping works in PPO to take advantage of timeouts (sketched below).
* Default size of GridWorld changed to 5x5 in order to validate bootstrapping changes.
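The idea behind the timeout-aware bootstrapping, as a sketch (function and argument names are illustrative, not the trainer's actual code):

```python
import numpy as np

def discounted_returns(rewards, dones, timeouts, values, gamma=0.99):
    # rewards, dones, timeouts: length-T arrays for one trajectory.
    # values: length T+1 value estimates; values[T] is the bootstrap value
    # for a trajectory that was cut off mid-episode.
    returns = np.zeros(len(rewards))
    running = values[-1]
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # A true terminal state contributes no future value, but an
            # episode ended by maxStepReached (a timeout) is bootstrapped
            # from the value estimate of the interrupted state.
            running = values[t + 1] if timeouts[t] else 0.0
        returns[t] = rewards[t] + gamma * running
        running = returns[t]
    return returns
```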
* Implement behavioral cloning for continuous and discrete control (cc/dc), fully-connected and recurrent (fc/rnn) networks, and state/observation inputs (see the sketch below).
* Re-organize folder structure in anticipation of unitytrainers as a package.
* Create demo environment BananaImitation to validate behavioral cloning.
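Behavioral cloning here amounts to supervised learning on recorded teacher demonstrations; a minimal TF 1.x-style sketch for the discrete-control, feed-forward case (shapes and names are illustrative):

```python
import tensorflow as tf  # TF 1.x, matching this era of the codebase

obs_size, n_actions = 8, 4  # illustrative dimensions

obs = tf.placeholder(tf.float32, [None, obs_size])
teacher_actions = tf.placeholder(tf.int32, [None])

hidden = tf.layers.dense(obs, 128, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, n_actions)

# Maximize the likelihood of the teacher's actions under the student policy.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=teacher_actions, logits=logits))
train_op = tf.train.AdamOptimizer(3e-4).minimize(loss)
```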
* Fixes #336
* On Demand Decision: use `RequestDecision` and `RequestAction`
* New Agent Inspector: use it to set On Demand Decision
* New BrainParameters interface
* LSTM memory size is now set in Python
* New C# API
* Semantic Changes
* Replaced RunMDP
* New Bouncer Environment to test On Demand Decision
* RayPerception moved to a component that is now used by Banana, Soccer, Hallway, and Push Block.
* Converted Push Block to use RayPerception for local perception and retrained the model.
* Re-worked Hallway to be more extensible.
* Fixes internal brain for Banana Imitation.
* Fixes Discrete Control training for Imitation Learning.
* Fixes Visual Observations in internal brain with non-square inputs.
This PR makes the following changes:
* Moves clipping of the continuous control model into the model itself. Output is now always [-1, 1].
* Internal model values are now clipped between [-3, 3] before being rescaled to [-1, 1] for output.
* This improves training performance by providing a wider range of values within which the pdf of the gaussian can fall. The [-1, 1] output is used to be friendlier to environment creators.
* Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance.
* Introduce a ScaleAction() function in Python to easily rescale values from [-1, 1] to an arbitrary range (see the sketch after this list).
* Re-train all CC models using the improved algorithm. All performance levels are equal or improved; in the case of Crawler, the improvement is drastic.
* Update documentation appropriately.
* Made miscellaneous minor code style and optimization improvements within environments.
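For reference, the clipping and rescaling described above amount to only a few lines; a sketch with `scale_action` standing in for the new `ScaleAction()` helper (its exact signature is assumed):

```python
import numpy as np

def scale_action(action, minimum, maximum):
    # Map an action in [-1, 1] linearly onto [minimum, maximum].
    return 0.5 * (action + 1.0) * (maximum - minimum) + minimum

# Inside the model, pre-output values are clipped to [-3, 3] and divided
# by 3, so the network output always lands in [-1, 1]:
raw = np.array([2.4, -5.0, 0.9])
model_output = np.clip(raw, -3.0, 3.0) / 3.0

# Environment code can then stretch that onto its own action range:
env_action = scale_action(model_output, minimum=-2.0, maximum=2.0)
```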
* Fix typos
* Use abstract class for RayPerception
* Created RayPerception2D. (#1721)
* Incorporate RayPerception2D
* Fix typo
* Make abstract class
* Add tests
* new env styles rebased on develop
* added new trained models
* renamed food collector platforms
* reduce training timescale on WallJump from 100 to 10
* uncheck academy control on walljump
* new banner image
* rename banner file
* new example env images
* add foodCollector image
* change Banana to FoodCollector and update image
* change bouncer description to include green cube
* update image
* update gridworld image
* cleanup prefab names and tags
* updated soccer env to reference purple agent instead of red
* remove unused mats
* rename files
* remove more unused tags
* update image
* change platform to agent cube
* update text; change platform to agent's head
* cleanup
* cleaned up weird unused meta files
* add new wall jump nn files and rename a prefab
* walker change stacked states from 5 to 1
Walker collects physics observations, so stacked states are not needed.
* 1 to 1 Brain to Agent
This is a work in progress
In this PR :
- Deleted all Brain Objects
- Moved the BrainParameters into the Agent
- Gave the Agent a Heuristic method (see Balance Ball for example)
- Modified the Communicator and ModelRunner: Put can only take one agent at a time
- Made the IBrain interface with RequestDecision and DecideAction methods
No changes made to Python
[Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)
* Removing editorconfig
* Updating BalanceBall scene
* grammar mistake
* Clearing the Agents of the Model runner
* Added Documentation on IBrain
* Modified comments on GiveModel
* Introduced a factory
* Split Learning Brain in two
* Changes to walljump
* Fixing the Unit tests
* Renaming the Brain to Policy
* Heuristic now has priority over training
* Edited code comments
* Fixing bugs
* Develop one to one scene edits...
* [WIP] Side Channel initial layout
* Working prototype for raw bytes
* fixing format mistake
* Added some error checks and some unit tests in C#
* Added the side channel for the Engine Configuration. (#2958)
* Added the side channel for the Engine Configuration.
Note that this change does not require modifying a lot of files:
- Adding a sender in Python
- Adding a receiver in C#
- Subscribing the receiver to the communicator (a one-liner in the Academy)
- Adding the side channel to the Python UnityEnvironment (not represented here)
Adding the side channel to the environment would look like this:
```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.raw_bytes_channel import RawBytesChannel
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
channel0 = RawBytesChannel()
channel1 = EngineConfigurationChannel()
# Hand both channels to the environment at construction
# ("MyEnv" is a placeholder file name).
env = UnityEnvironment(file_name="MyEnv", side_channels=[channel0, channel1])
```
* Add the VectorSensor to the CollectObservations call
* Example of API change for BalanceBall
* Modified the Examples
* Changes to the migrating doc
* Editing the docs
* Update docs/Learning-Environment-Design-Agents.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* addressing comments
* Removed the MLAgents.Sensor namespace
* Removing the MLAgents.Sensor namespace from the tests
* Editing the migrating docs
Co-authored-by: Chris Elion <celion@gmail.com>
* [skip ci] Renamed methods in the Agent class
WARNING: when a user's implementation overrides one of the obsolete methods, the compiler message will read "Member `old method` overrides obsolete member `old method`. Add the Obsolete attribute to `old method`." It will not suggest the new method to override.
* [skip ci] Updated the example environment
* [skip ci] Updated migrating and changelog
* [skip ci] Editing the docs
* [skip ci] Missing docs
* :+1:
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* [skip ci] documentation changes
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Gett...
* Deprecating Academy.Instance.FloatProperties
* Made the registered side channels a static property and created the SideChannelUtils class to handle side-channel logic
* Clearing the sending message queue in the Academy when the communicator is not on
* addressing comments
* Make EnvironmentParameters a first-class citizen in the API
Missing: Python counterparts and testing.
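The Python counterpart was still missing at this point; based on the side-channel mechanism above, it would plausibly look like the following (channel and method names reflect how the API later shipped; the file name is a placeholder):

```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.environment_parameters_channel import (
    EnvironmentParametersChannel,
)

channel = EnvironmentParametersChannel()
env = UnityEnvironment(file_name="MyEnv", side_channels=[channel])
# Push a float parameter that C# code reads through
# Academy.Instance.EnvironmentParameters.
channel.set_float_parameter("gravity", -9.81)
```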
* Minor comment fix to Engine Parameters
* A second minor fix.
* Make EngineConfigChannel Internal and add a singleton/sealed accessor
* Make StatsSideChannel Internal and add a singleton/sealed accessor
* Changes to SideChannelUtils
- Disallow two sidechannels of the same type to be added
- Remove GetSideChannels that return a list as that is now unnecessary
- Make most methods (except register/unregister) internal to limit users' impact on the “system-level” side channels
- Add an improved comment to SideChannel.cs
* Added Dispose methods to system-level sidechannel wrappers
- Specifically to StatsRecorder, EnvironmentParameters and EngineParameters.
- Updated Academy.Dispose to take advantage of these.
- Updated Editor tests to cover all three “system-level” side channels.
Kudos to Unit Tests (TestAcade...
* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Removed a redundant "to" from the sentence; it was probably a typo in the document.
* Updated the badge’s link to point to the newest doc version
* Pointed all of the docs to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
VisualFoodCollector is now an example environment that uses a mix of visual and vector observations and can be trained with the default config file.
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>