ml-agents

Author	SHA1	Message	Commit date
GitHub	4ac79742	Refactor reward signals into separate class (#2144) * Create new class (RewardSignal) that represents a reward signal. * Add value heads for each reward signal in the PPO model. * Make summaries agnostic to the type of reward signals, and log weighted rewards per reward signal. * Move extrinsic and curiosity rewards into this new structure. * Allow defining multiple reward signals in YAML file. Add documentation for this new structure.	5 years ago
GitHub	b05c9ac1	Add environment manager for parallel environments (#2209) Previously in v0.8 we added parallel environments via the SubprocessUnityEnvironment, which exposed the same abstraction as UnityEnvironment while actually wrapping many parallel environments via subprocesses. Wrapping many environments with the same interface as a single environment had some downsides, however: * Ordering needed to be preserved for agents across different envs, complicating the SubprocessEnvironment logic * Asynchronous environments with steps taken out of sync with the trainer aren't viable with the Environment abstraction This PR introduces a new EnvManager abstraction which exposes a reduced subset of the UnityEnvironment abstraction and a SubprocessEnvManager implementation which replaces the SubprocessUnityEnvironment.	5 years ago
GitHub	d80d5852	add some types to the reward signals (#2215) * WIP add some types to the reward signals * fix next_visual_in * cleanup TODO * fix bad merge	5 years ago
GitHub	9c50abcf	GAIL and Pretraining (#2118) Based on the new reward signals architecture, add BC pretrainer and GAIL for PPO. Main changes: - A new GAILRewardSignal and GAILModel for GAIL/VAIL - A BCModule component (not a reward signal) to do pretraining during RL - Documentation for both of these - Change to Demo Loader that lets you load multiple demo files in a folder - Example Demo files for all of our tested sample environments (for future regression testing)	5 years ago
Chris Elion	5d07ca1f	Merge remote-tracking branch 'origin/develop' into enable-flake8	5 years ago
Chris Elion	dfdf7b83	fix whitespace and line breaks	5 years ago
GitHub	f8041534	Merge pull request #2236 from Unity-Technologies/enable-flake8 Enable flake8	5 years ago
GitHub	6a212f73	Improvements for GAIL (#2296) * Don't 0 value bootstrap for GAIL and Curiosity * Add gradient penalties to GAN to help with stability * Add gail_config.yaml with GAIL examples * Cleaned up trainer_config.yaml and unnecessary gammas * Documentation updates * Code cleanup	5 years ago
GitHub	dd0d2a10	Remove unnecessary feed_dicts for GAIL and Curiosity (#2348)	5 years ago
GitHub	d7ebaae1	Return list instead of np array for make_mini_batch() (#2371) Return list instead of np array for make_mini_batch() to reduce time copying data	5 years ago
GitHub	ab690b93	Fix naming conflict between Curiosity and GAIL (#2406)	5 years ago
GitHub	afb6ede5	Merge pull request #2393 from Unity-Technologies/hotfix-v0.9.0a - Fix issue with BC Trainer `increment_steps`. - Fix issue with Demonstration Recorder and visual observations (memory leak fix was deleting vis obs too early). - Make Samplers sample from the same random seed every time, so generalization runs are repeatable. - Fix crash when using GAIL, Curiosity, and visual observations together.	5 years ago
Ervin Teng	072d2ef8	Merge latest develop	5 years ago
GitHub	4472838e	Merge pull request #2421 from Unity-Technologies/hotfix-v0.9.1 Hotfix v0.9.1 - develop	5 years ago
GitHub	bd7eb286	Update reward signals in parallel with policy (#2362)	5 years ago
GitHub	689765d6	Modification of reward signals and rl_trainer for SAC (#2433) * Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo. * Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals. * Moves end_episode to rl_trainer * Fixed bug with BCModule with RNN	5 years ago
GitHub	0a163871	Merge pull request #2469 from Unity-Technologies/release-0.9.2 Release 0.9.2	5 years ago
GitHub	67d754c5	Fix flake8 import warnings (#2584) We have been ignoring unused imports and star imports via flake8. These are both bad practice and grow over time without automated checking. This commit attempts to fix all existing import errors and add back the corresponding flake8 checks.	5 years ago
GitHub	b2fa2268	Merge pull request #2648 from Unity-Technologies/release-0.10.0 Release 0.10.0	5 years ago
Anupam Bhatnagar	cc208c00	resolving conflicts	5 years ago
Chris Elion	43e23941	rough pass at tf2 support, needs cleanup	5 years ago
Chris Elion	806c77e4	centralize tensorflow imports	5 years ago
GitHub	619465e1	Fix crash when SAC is used with Curiosity and Continuous Actions (#2740) * Add test for curiosity + SAC * Use actions for all curiosity (need to test on PPO) * Fix issue with reward signals updating multiple times * Put curiosity actions in the right placeholder * Test PPO curiosity update	5 years ago
Chris Elion	3d8a70fb	Merge remote-tracking branch 'origin/develop' into try-tf2-support	5 years ago
GitHub	495873e5	Merge pull request #2833 from Unity-Technologies/release-0.11.0 Release 0.11.0	5 years ago
Chris Elion	73a346cb	cleanup	5 years ago
GitHub	f57b7ac6	Allow usage with tensorflow 2.0.0 (via tf.compat.v1) (#2665)	5 years ago
Ervin Teng	987e0e3a	Merge tf2 branch	5 years ago
GitHub	69d1a033	Develop remove past action communication (#2913) * Modifying the .proto files * attempt 1 at refactoring Python * works for ppo hallway * changing the documentation * now works with both sac and ppo both training and inference * Ned to fix the tests * TODOs : - Fix the demonstration recorder - Fix the demonstration loader - verify the intrinsic reward signals work - Fix the tests on Python - Fix the C# tests * Regenerating the protos * fix proto typo * protos and modifying the C# demo recorder * modified the demo loader * Demos are loading * IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT * Modified all the demo files * Fixing all the tests * fixing ci * addressing comments * removing reference to memories in the ll-api	5 years ago
Ervin Teng	54644477	Merge branch 'develop' of github.com:Unity-Technologies/ml-agents into develop-nomaxstep-test	5 years ago
GitHub	d4780a55	Merge pull request #3010 from Unity-Technologies/release-0.12.0-to-master Merge Release 0.12.0 to master	5 years ago
GitHub	36048cb6	Moving Env Manager to Trainers (#3062) The Env Manager is only used by the trainer codebase. The entry point to interact with an environment is UnityEnvironment. * Moving Env Manager to Trainers * fix pylint madness	5 years ago
Ervin Teng	c330f6f6	Merge branch 'master' into develop-agentprocessor	5 years ago
GitHub	f058b18c	Replace BrainInfos with BatchedStepResult (#3207)	5 years ago
Ervin Teng	29f3330f	Merge master into hotfix-0.13.1	5 years ago
Ervin Teng	164732a9	Move optimizer creation to Trainer, fix some of the reward signals	5 years ago
Ervin Teng	abc98c23	Change reward signal creation	5 years ago
Ervin Teng	151e3b1c	Move policy to common location, remove epsilon	5 years ago
Ervin Teng	b61d2fa1	Fix some typing issues with curiosity	5 years ago
Ervin Teng	cadf6603	Fix SAC CC and some reward signal tests	5 years ago
Ervin Teng	5bfc0b87	Update docstring	5 years ago
Ervin Teng	7c0fa1c4	Remove action_holder placeholder	5 years ago
GitHub	c145e75b	Split Policy and Optimizer, common Policy for PPO and SAC (#3345)	5 years ago
Andrew Cohen	5b0aca29	Merge branch 'master' into soccer-fives	5 years ago
Ervin Teng	14f2a7f2	Rename LearningModel to ModelUtils	5 years ago
Ervin Teng	1156b9b3	Merge branch 'develop-splitpolicyoptimizer' into develop-removeactionholder	5 years ago
Ervin Teng	53c25fb1	Move one-hot out of policy and remove selected_actions	5 years ago
Anupam Bhatnagar	e04fcd71	Merge branch 'master' into master-into-release-0.14.1	5 years ago
GitHub	97a1d4b1	[change] Remove the action_holder placeholder from the policy. (#3492)	5 years ago
Andrew Cohen	de73baa9	Merge branch 'master' into soccer-fives	5 years ago
GitHub	e4177de0	[change] Organize trainer files a bit better (#3538)	5 years ago
Andrew Cohen	573b1f6d	Merge branch 'master' into soccer-fives	5 years ago
GitHub	ffd8f855	[bug-fix] Fix crash when demo size is smaller than batch size (#3591)	5 years ago
Chris Elion	7f2e815a	Merge remote-tracking branch 'origin/master' into develop-sidechannel-usability	5 years ago
Chris Elion	fa5e7e6d	Merge remote-tracking branch 'origin/master' into develop-BehaviorParams-public	5 years ago
Andrew Cohen	b1cfa74d	Merge branch 'master' into develop-test-imitation	5 years ago
Andrew Cohen	53bea15c	Merge branch 'master' into soccer-fives	5 years ago
Andrew Cohen	ac261e36	Merge branch 'master' into self-play-mutex	5 years ago
Anupam Bhatnagar	50e52d9c	Merge branch 'master' into distributed-training	5 years ago
Christopher Goy	ba80b292	format files with pre-commit.	4 years ago
GitHub	f7373172	Merge pull request #4385 from Unity-Technologies/release_2_verified-barracuda-1.0.2 update verified brach with barracuda 1.0.2	4 years ago
GitHub	e92b4f88	[refactor] Structure configuration files into classes (#3936)	4 years ago
Arthur Juliani	9724c9ac	Merge master	4 years ago
GitHub	a28e2767	Update add-fire to latest master, including Policy refactor (#4263) * Update Dockerfile * Separate send environment data from reset (#4128) * Fixed a typo on ML-Agents-Overview.md (#4130) Fixed redundant "to" word from the sentence since it is probably a typo in document. * Updated the badge’s link to point to the newest doc version * Replaced all of the doc to release_3_doc * Fix 3DBall and 3DBallHard SAC regressions (#4132) * Move memory validation to settings * Update docs * Add settings test * Update to release_3 in installation.md (#4144) * rename to SideChannelManager +backcompat (#4137) * Remove comment about logo with --help (#4148) * [bugfix] Make FoodCollector heuristic playable (#4147) * Make FoodCollector heuristic playable * Update changelog * script to check for old release links and references (#4153) * Remove package validation suite from Project (#4146) * RayPerceptionSensor: handle empty and invalid tags (#4155...	4 years ago
vincentpierre	599d7e9f	Merging master	4 years ago
GitHub	3bcb029b	[refactor] Remove BrainParameters from Python code (#4138)	4 years ago
GitHub	1f5eb9da	add pyupgrade to pre-commit and run (#4239)	4 years ago
GitHub	129f9ddc	[MLA-427] make pyupgrade convert f-strings too (#4244) * make pyupgrade convert f-strings too	4 years ago
Andrew Cohen	d8c123a0	Merge branch 'master' into sensitivity	4 years ago
GitHub	380fef57	[refactor] Move TF-specific files to tf/ folder (#4266)	4 years ago
Andrew Cohen	06e4356c	Merge branch 'master' into sensitivity	4 years ago
Arthur Juliani	1a123641	Merge remote-tracking branch 'origin/master' into r5-master	4 years ago
HH	8eaddb61	Merge branch 'master' into hh/develop/loco-walker-variable-speed	4 years ago
GitHub	c188781b	[life improvement] Moving Python files around (#4531) * Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder * Addressing comments * Editing the migrating doc * fixing test	4 years ago
Andrew Cohen	e5f14400	Merge branch 'master' into develop-hybrid-actions-singleton	4 years ago
Andrew Cohen	f654df34	fixing tensorflow tests	4 years ago
GitHub	cb8e4d25	Add ActionSpec (#4586) Co-authored-by: Ervin T <ervin@unity3d.com>	4 years ago
Andrew Cohen	9689cf2c	remove _action_ from function names	4 years ago
vincentpierre	a3a9a56b	Merge branch 'exp-multi-head-attention' into exp-bullet-hell	4 years ago
Ruo-Ping Dong	9e08be87	Merge branch 'master' into release_9_branch_merge	4 years ago
Andrew Cohen	97dfa142	fix action_spec refs	4 years ago
GitHub	b853e5ba	Action buffer (#4612) Co-authored-by: Ervin T <ervin@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	990f801a	Develop hybrid action staging (#4702) Co-authored-by: Ervin T <ervin@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com> Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com> Co-authored-by: Chris Elion <chris.elion@unity3d.com>	4 years ago
Andrew Cohen	8172b3d6	test_simple_rl/reward providers pass tf/torch	4 years ago
Andrew Cohen	4ebc6c44	ml-agents-envs pass	4 years ago
Arthur Juliani	0d2f8887	Merge remote-tracking branch 'origin/master' into goal-conditioning # Conflicts: # ml-agents-envs/mlagents_envs/base_env.py # ml-agents-envs/mlagents_envs/rpc_utils.py # ml-agents/mlagents/trainers/tests/mock_brain.py # ml-agents/mlagents/trainers/tests/simple_test_envs.py	4 years ago
Ervin Teng	25dfd883	Merge branch 'master' into develop-centralizedcritic	4 years ago
Andrew Cohen	498b1ee6	Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton	4 years ago
Ruo-Ping Dong	8ed14762	Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp	4 years ago

89 commits (336e1c34-b54d-4486-bdc5-42b46d0b5284)