* Normalize observations when adding experiences
This change moves normalization of vector observations into the trainer's
"add_experiences" interface.
Prior to this change, normalization occurred at inference time. This
was somewhat confusing, since executing a forward pass usually shouldn't
have side effects that change training state. Also, in an asynchronous
or distributed setting where we copy the neural network weights from a
trainer to a remote actor / inference worker, we'd end up with training
issues because the weights on the trainer and the workers would differ.
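A minimal Python sketch of the idea; `RunningMeanStd`, `Trainer`, and the `add_experiences` signature here are illustrative stand-ins, not the actual ml-agents API:

```python
import numpy as np

class RunningMeanStd:
    """Running mean/variance over observation batches (illustrative)."""

    def __init__(self, shape):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4  # avoids division by zero before the first update

    def update(self, batch):
        # Combine batch statistics with the running statistics
        # (parallel-variance formula).
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        self.mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta ** 2 * self.count * batch_count / total) / total
        self.count = total


class Trainer:
    def __init__(self, obs_shape):
        self.obs_norm = RunningMeanStd(obs_shape)

    def add_experiences(self, obs_batch):
        # Stats are updated here, on the trainer, rather than as a side
        # effect of a forward pass on an inference worker.
        self.obs_norm.update(obs_batch)
        return (obs_batch - self.obs_norm.mean) / np.sqrt(self.obs_norm.var + 1e-8)
```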
* Initial Commit
* Remove the Academy Done flag from the protobuf definitions
* remove global_done in the environment
* Removed irrelevant unit tests
* Remove the max_step from the Academy inspector
* Removed global_done from the Python scripts
* Modified and removed some tests
* This actually breaks neither curriculum nor generalization training
* Replace global_done with reserved.
Addressing Chris Elion's comment regarding the deprecation of the global_done field. We will use a reserved field to make sure the global_done field number does not get reused in the future, which would cause errors.
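A protobuf sketch of the pattern; the message name and field number are illustrative, not the actual ml-agents definitions:

```protobuf
message AgentInfoProto {
  // ... other fields unchanged ...

  // Previously: bool global_done = 8;
  // Reserving both the number and the name keeps them from being
  // reassigned later, which would break wire compatibility.
  reserved 8;
  reserved "global_done";
}
```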
* Removed unused fake brain
* Tested that the first call to step was the same as a reset call
* black formatting
* Added documentation changes
* Editing the migrating doc
* Addressing comments on the Migrating doc
* Addressing comments:
- Removing dead code
- Resolving forgotten merge conflicts
- Editing documentation
* relax versions, add python 3.7 to CI
* add workflows
* try parameterized CircleCI build, disable slow test
* fix workflow
* fix (?) pyversion
* set job name, fix pip freeze output
* test_requirements.txt
* fix install
* fix paths (again) - should use pushd popd instead
* use pushd and popd
* sort deps, restore unit test, cleanup CI
* relax versions more
* clean up versions in docs
* test older libs for 3.6, newer for 3.7
* pip: progress bar off
* fix gym-unity pip install
* try cat'ing setups for checksum
* don't use fallback (temporarily)
* don't turn off progress bar before upgrading pip
* PR feedback
* add parameter descriptions in CI config
* check using xargs
* fix broken BC link
* install npm, run precommit before unit tests
* try to install npm
* try a node image build
* add workflow
* don't use precommit on node run
* sudo make me a sandwich
* pass config arg
* revert CI order change
* retry precommit
* sudo apt-get
* sudo npm
* make sure fails on bad link
* cleanup and refix link
Only cosmetic and readability improvements. No functional changes were intended.
Utilities.cs
- Fixed comments across file
- Made class static
- Removed unnecessary imports
- Removed unused method arguments
- Renamed variables as appropriate to make usage clearer
- In AddRangeNoAlloc, disabled (via comment) Rider's suggestion to revert to use of the built-in Range field (Fixed)
- In TextureToTensorProxy, swapped the order of the first two arguments to better match the input-then-output convention
UtilitiesTests.cs
- Removed unnecessary imports
- Simplified array creation commands
GeneratorImp.cs
- Rider automatically deleted spaces on empty lines
- Changed call to TextureToTensorProxy to mirror new argument ordering
* Clean-up to UnityAgentsException.cs
- Removed unnecessary imports
- Fixed comment warning
- Fixed method header
* Improvements to Startup.cs
- Created const for SCENE_NAME field
- Fixed strin...
* This addresses #1835. Baselines expects single environments used with its ppo2 algorithm to be wrapped in a DummyVecEnv. The old README did not instruct the reader to do so, and the code failed to run with the latest version of baselines. This imports the correct function from baselines and fixes the make_unity_env function described in the README (see the sketch below).
* added line to gym-unity/README.md to note the version of baselines the examples were tested with
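A sketch of the fix described above; `UnityEnv` and its arguments follow the gym-unity wrapper of that era and may differ in detail:

```python
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from gym_unity.envs import UnityEnv

def make_unity_env(env_directory, num_env, visual, start_index=0):
    """Create a vectorized Unity environment for baselines' ppo2."""
    def make_env(rank):
        def _thunk():
            return UnityEnv(env_directory, rank, use_visual=visual)
        return _thunk
    # ppo2 expects a VecEnv even for a single environment, so always wrap.
    return DummyVecEnv([make_env(i + start_index) for i in range(num_env)])
```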
* Add Soft Actor-Critic model, trainer, and policy and sac_trainer_config.yaml
* Add documentation for SAC and tweak PPO documentation to reference the new pages.
* Add tests for SAC, change simple_rl test to run both PPO and SAC.
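For context, one core piece of SAC is the soft (Polyak) target-network update, θ_target ← τ·θ + (1 − τ)·θ_target. A minimal NumPy sketch, with illustrative names rather than the trainer's actual internals:

```python
import numpy as np

def soft_update(target_params, source_params, tau=0.005):
    """Polyak-average source weights into the target network in place."""
    for target, source in zip(target_params, source_params):
        target[:] = tau * source + (1.0 - tau) * target
```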