ml-agents

Tree: 45fe60db

Author	SHA1	Message	Commit date
Hunter	45fe60db	fixed broken prefabs	5 years ago
GitHub	3880fd3a	Update development release version to 0.10.0.dev0 (#2443) In order for downstream packages to make use of the latest pre-release features, we can pre-release versions of our packages. For packages ending in `devN` pip will not install that package version by default. This change manually updates our package version to a development version with the idea that we can manually perform development versions with the potential for future automated / nightly dev releases.	5 years ago
GitHub	43696d60	Fix bug in add_rewards_output and add test (#2442)	5 years ago
GitHub	689765d6	Modification of reward signals and rl_trainer for SAC (#2433) * Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo. * Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals. * Moves end_episode to rl_trainer * Fixed bug with BCModule with RNN	5 years ago
Arthur Juliani	fa46be7f	Merge branch 'RunSwimFlyRich-master' into develop	5 years ago
GitHub	4abe89bc	Only call get_action on brains with policies (#2437)	5 years ago
GitHub	bd7eb286	Update reward signals in parallel with policy (#2362)	5 years ago
Jonathan Harper	e333abf8	Fixing compile error Variable "model" is undefined.	5 years ago
GitHub	4472838e	Merge pull request #2421 from Unity-Technologies/hotfix-v0.9.1 Hotfix v0.9.1 - develop	5 years ago
GitHub	7b69bd14	Refactor Trainer and Model (#2360) - Move common functions to trainer.py, model.py from ppo/trainer.py, ppo/policy.py and ppo/model.py - Introduce RLTrainer class and move most of add_experiences and some common reward signal code there. PPO and SAC will inherit from this, not so much BC Trainer. - Add methods to Buffer to enable sampling, truncating, and save/loading. - Add scoping to create encoders in model.py	5 years ago

10 commits (45fe60db-4ef6-4edf-b0df-69a929174271)