* Add Soft Actor-Critic (SAC) model, trainer, and policy, and a sac_trainer_config.yaml (see the config sketch after this list).
* Add documentation for SAC and tweak PPO documentation to reference the new pages.
* Add tests for SAC, change simple_rl test to run both PPO and SAC.
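
Presumably the SAC hyperparameters live in the new sac_trainer_config.yaml. As a rough illustration only, a config along these lines could be generated as below; every key and value shown is an assumption for the sketch, not the shipped defaults.

```python
# Illustrative only: writes a hypothetical SAC trainer config. The keys and
# values below are assumptions for this sketch, not the shipped defaults.
import yaml

sac_config = {
    "default": {
        "trainer": "sac",
        "batch_size": 128,        # samples per gradient update (assumed)
        "buffer_size": 50000,     # replay buffer capacity (assumed)
        "buffer_init_steps": 0,   # steps collected before training starts (assumed)
        "tau": 0.005,             # target-network soft-update rate (assumed)
        "init_entcoef": 1.0,      # initial entropy coefficient (assumed)
        "train_interval": 1,      # environment steps between updates (assumed)
        "learning_rate": 3.0e-4,
        "max_steps": 5.0e4,
        "reward_signals": {"extrinsic": {"strength": 1.0, "gamma": 0.99}},
    }
}

with open("sac_trainer_config.yaml", "w") as f:
    yaml.safe_dump(sac_config, f, default_flow_style=False)
```
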
In order for downstream packages to make use of the latest
pre-release features, we can publish pre-release versions of our packages.
For package versions ending in `devN`, pip will not install that
version by default. This change manually updates our package version
to a development version, with the idea that we can perform development
releases manually for now, with the potential for automated / nightly
dev releases in the future.
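
A minimal sketch of what this looks like in a setup.py, assuming a placeholder package name and version string (the real values are not shown here):

```python
# setup.py (sketch): a PEP 440 development ("devN") version. The package
# name and version number below are placeholders, not the real values.
from setuptools import setup, find_packages

setup(
    name="mlagents",
    version="0.9.0.dev0",  # pre-release: a plain `pip install mlagents` skips it
    packages=find_packages(),
)
```

To pick up a dev release, a consumer has to opt in explicitly, e.g. `pip install --pre mlagents` or `pip install mlagents==0.9.0.dev0`.
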
* Adds evaluate_batch to reward signals, which evaluates on a minibatch rather than on a BrainInfo (see the sketch after this list).
* Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals.
* Moves end_episode to rl_trainer
* Fixes a bug in BCModule when used with an RNN.
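
For context, minibatch-based evaluation means a reward signal scores a whole batch of experiences at once instead of a single BrainInfo step. The sketch below is purely illustrative; the class, method arguments, and field names are assumptions, not the actual reward-signal API.

```python
# Illustrative sketch of minibatch-based reward evaluation; all names assumed.
import numpy as np

class NoveltyLikeSignal:
    """Hypothetical reward signal that scores an entire minibatch in one pass."""

    def __init__(self, strength: float = 1.0):
        self.strength = strength

    def evaluate_batch(self, mini_batch: dict) -> np.ndarray:
        # mini_batch is a dict of parallel arrays (observations, actions, ...)
        # rather than a per-step BrainInfo, so the whole batch is vectorized.
        obs = np.asarray(mini_batch["vector_obs"], dtype=np.float32)
        next_obs = np.asarray(mini_batch["next_vector_obs"], dtype=np.float32)
        # Stand-in "novelty" score: distance between consecutive observations.
        unscaled = np.linalg.norm(next_obs - obs, axis=-1)
        return self.strength * unscaled
```
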
- Move common functions to trainer.py and model.py from ppo/trainer.py, ppo/policy.py, and ppo/model.py
- Introduce RLTrainer class and move most of add_experiences and some common reward
signal code there. PPO and SAC will inherit from this; the BC Trainer largely will not.
- Add methods to Buffer to enable sampling, truncating, and save/loading (see the sketch after this list).
- Add scoping when creating encoders in model.py
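
A rough sketch of what sampling, truncation, and save/load on a buffer could look like; the method names and storage format here are assumptions for illustration, not the actual Buffer API.

```python
# Illustrative buffer sketch; method names and storage format are assumed.
import pickle
import random
from collections import defaultdict

class SimpleBuffer:
    """Stores parallel lists per field, e.g. 'vector_obs', 'actions', 'rewards'."""

    def __init__(self):
        self.fields = defaultdict(list)

    def append(self, experience: dict) -> None:
        # Each experience is a dict mapping field name -> value for one step.
        for key, value in experience.items():
            self.fields[key].append(value)

    def num_experiences(self) -> int:
        return len(next(iter(self.fields.values()), []))

    def sample_mini_batch(self, batch_size: int) -> dict:
        # Sample indices once, then gather every field at those same indices.
        idx = random.sample(range(self.num_experiences()), batch_size)
        return {key: [values[i] for i in idx] for key, values in self.fields.items()}

    def truncate(self, max_length: int) -> None:
        # Drop the oldest experiences so the buffer never exceeds max_length.
        for key, values in self.fields.items():
            self.fields[key] = values[-max_length:]

    def save(self, path: str) -> None:
        with open(path, "wb") as f:
            pickle.dump(dict(self.fields), f)

    def load(self, path: str) -> None:
        with open(path, "rb") as f:
            self.fields = defaultdict(list, pickle.load(f))
```
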