* init
* Add reward manager and hurryUpReward
* fix hurry reward; add awful first training
* Turn off head height and hurry rew
* changed max speed to 15. added small hh rew
* add NaN check for reward manager. start vel penalty
* add bpVel pen
* add new BPVelPen nn file
* remove outdated nn file
* add randomize speed bool
* try reward product
* change coeff to 1
* try avg vel of all bp for reward
* move outside loop
* try linear inverselerp for vel
* add avg rew matchspeed15 nn file. looks much better
* save scene
* no hand penalty, random walk speed
* fix inverse lerp
* try new reward falloff
* cleanup
* added new nn file. don't allow hand contact
* update obsv
* remove hh rew. add trained no-hh model
* add new nn file
* new curve
* add new models. try no reset
* add hh rew
* clamp hh
* zero rewards if ground contact
* switch to approved with movi...
* Update Using-Tensorboard.md
"--logdir=results" is broken in newer versions of TensorBoard; "--logdir results" without the equals sign works. See https://github.com/tensorflow/tensorboard/issues/686
* Removing equal sign from tensorboard command line params in docs
Co-authored-by: Nancy Iskander <nancyiskanderonline@gmail.com>
* [docs] buffer_size parameter clarification
It was not fully clear that buffer_size behaves differently for PPO and SAC. This docs update clarifies the difference.
* [docs] updated buffer_size parameter clarification
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>