* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Fixed redundant "to" word from the sentence since it is probably a typo in document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
* Added Reward Providers for Torch
* Use NetworkBody to encode state in the reward providers
* Integrating the reward prodiders with ppo and torch
* work in progress, integration with PPO. Not training properly Pyramids at the moment
* Integration in PPO
* Removing duplicate file
* Gail and Curiosity working
* addressing comments
* Enfore float32 for tests
* enfore np.float32 in buffer
* Move linear encoding to NetworkBody
* moved encoders to processors (#4420)
* fix bad merge
* Get it running
* Replace mentions of visual_encoders
* Remove output_size property
* Fix tests
* Fix some references
* Revert test_simple_rl
* Fix networks test
* Make curiosity test more accomodating
* Rename total_input_size
* [Bug fix] Fix bug in GAIL gradient penalty (#4425) (#4426)
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Up number of steps
* Rename to visual_processors and vector_processors
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Fixes for Torch SAC and tests
* FIx recurrent sac test
* Properly update normalization for SAC-continuous
* Fix issue with log ent coef reporting in SAC Torch
* Add torch_utils
* Use torch from torch_utils
* Add torch to banned modules in CI
* Better import error handling
* Fix flake8 errors
* Address comments
* Move networks to GPU if enabled
* Switch to torch_utils
* More flake8 problems
* Move reward providers to GPU/CPU
* Remove anothere set default tensor
* Fix banned import in test
* Don't run value during inference
* Execute critic with LSTM
* Address comments
* Unformat
* Optimized soft update
* Move soft update to model utils
* Add test for soft update
* Fix of the target entropy for continuous SAC
* Lowering required steps of test and remove unecessary unsqueeze
* Changing the target from -dim(a)^2 to -dim(a) by removing implicit broadcasting