* Simplified rewards and observations; determined settings that allow training to complete in a reasonable amount of time.
* Simplified Agent rewards; added a training section that discusses hyperparameters.
* Added note about DecisionFrequency.
* Updated screenshots and made a small clarification in the text.
* Tested and updated using v0.6.
* Update a couple of images, minor text edit.
* Replace with more recent training stats.
* Resolve a couple of minor review comments.
* Increased the recommended batch and buffer size hyperparameter values.
* Fix 2 typos.
* Check that worker port is available in RpcCommunicator
Previously the RpcCommunicator did not check the port or create the
RPC server until `initialize()` was called. Since `initialize()`
requires the environment to be available, a new environment could end
up connecting to an existing RPC server running in another process,
causing both training runs to fail.
As a remedy, this commit moves the server creation into the
RpcCommunicator constructor and adds an explicit socket-binding
check on the requested port.
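A minimal sketch of such a bind check (illustrative only; the function name and error handling here are assumptions, not the actual RpcCommunicator API):

```python
import socket

def check_port(port: int) -> None:
    """Raise if the requested worker port is already bound by another process."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Binding fails immediately if an RPC server from another training
        # run already owns this port, instead of silently connecting to it.
        s.bind(("localhost", port))
    except OSError as exc:
        raise RuntimeError(
            f"Worker port {port} is already in use; is another training run active?"
        ) from exc
    finally:
        s.close()
```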
* Fixes suggested by Codacy
* Update rpc_communicator.py
* Addressing feedback: formatting & consistency
* Documentation Update
* Addressed comments
* New images for the recorder
* Improvements to the docs
* Address the comments
* Core_ML typo
* Updated the links to inference repo
* Put back Inference-Engine.md
* Fix typos: brain
* Re-add deleted file
* Fix typos
* Addressed comments
* Adding model for 3D Balance Ball.
* Adding LearningBrain to Broadcast Hub.
* Removed CrawlerPlayer Brain
* Renamed CrawlerLearning -> CrawlerStaticLearning
* Update Hallway models
* Attaching model to brain for Hallway
* Attaching model to 3DBall Brain.
* Updated CrawlerLearning -> CrawlerStaticLearning in the trainer config.
* Adding Reacher model
* Remove model specification in Hallway Brain asset
* Removing model specification from 3DBall scene
* Adding crawler model file
* Specifying learning brain as default for crawler
The check for whether an agent has fallen off the platform was using a wrong threshold of 1 instead of 0.
This meant that the agent immediately started in a falling state and entered a thrashing cycle of resetting itself.
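A hypothetical sketch of the corrected check (the names `FALL_THRESHOLD` and `agent_has_fallen` are assumptions for illustration; the actual check lives in the Unity agent code):

```python
# Hypothetical illustration; names are assumed, not taken from the project.
FALL_THRESHOLD = 0.0  # the buggy version compared against 1 instead of 0

def agent_has_fallen(agent_y: float) -> bool:
    # With the old threshold of 1, an agent spawning at platform height
    # (y < 1) was immediately flagged as fallen and reset, producing the
    # thrashing cycle described above.
    return agent_y < FALL_THRESHOLD
```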