ml-agents

Author	SHA1	Message	Commit date
GitHub	6a81a2f4	Add Soft Actor-Critic as trainer option (#2341) * Add Soft Actor-Critic model, trainer, and policy and sac_trainer_config.yaml * Add documentation for SAC and tweak PPO documentation to reference the new pages. * Add tests for SAC, change simple_rl test to run both PPO and SAC.	5 years ago
GitHub	832e4a47	Normalize observations when adding experiences (#2556) * Normalize observations when adding experiences This change moves normalization of vector observations into the trainer's "add_experiences" interface. Prior to this change, normalization occurred at inference time. This was somewhat confusing since usually executing a forward pass shouldn't have side-effects which would change the training step. Also, in a asynchronous or distributed setting where we copy the neural network weights from a trainer to a remote actor / inference worker we'd end up with training issues because of the weights being different on the trainer than the workers.	5 years ago
GitHub	67d754c5	Fix flake8 import warnings (#2584) We have been ignoring unused imports and star imports via flake8. These are both bad practice and grow over time without automated checking. This commit attempts to fix all existing import errors and add back the corresponding flake8 checks.	5 years ago
GitHub	cb144f20	small mypy cleanup (#2637) * small mypy cleanup * sac cleanup * types for ppo policy init	5 years ago
GitHub	619465e1	Fix crash when SAC is used with Curiosity and Continuous Actions (#2740) * Add test for curiosity + SAC * Use actions for all curiosity (need to test on PPO) * Fix issue with reward signals updating multiple times * Put curiosity actions in the right placeholder * Test PPO curiosity update	5 years ago
GitHub	4da157fe	more pylint fixes (#2842)	5 years ago
Andrew Cohen	13fe9cf8	Bubbled up indexing of AllBrainInfo to trainer controller from trainers	5 years ago
Andrew Cohen	e96b80db	recieves brain_name and identifier on python side	5 years ago
Ervin Teng	df5ee7bf	Split buffer into two buffers (PPO works)	5 years ago
Ervin Teng	e5459c49	buffer split for SAC	5 years ago
Ervin Teng	fd0647a6	Rename append_update_buffer to append_to_update_buffer	5 years ago
GitHub	213cd68d	Split Buffer into processing and update buffers (#2964) This is the first in a series of PRs that intend to move the agent processing logic (add_experiences and process_experiences) out of the trainer and into a separate class. The plan is to do so in steps: - Split the processing buffers (keeping track of agent trajectories and assembling trajectories) and update buffer (complete trajectories to be used for training) within the Trainer (this PR) - Move the processing buffer and add/process experiences into a separate, outside class - Change the data type of the update buffer to be a Trajectory - Place and read Trajectories from queues, add subscription mechanism for both AgentProcessor and Trainers	5 years ago
Ervin Teng	9c5fdd31	Stats reporting is working	5 years ago
Andrew Cohen	5097bcc0	recieves brain_name and identifier on python side	5 years ago
Ervin Teng	76abf968	Add back max_step logic	5 years ago
Ervin Teng	28eba789	Migrate SAC	5 years ago
Andrew Cohen	8578b0b7	add_policy and create_policy separated	5 years ago
Ervin Teng	f2b3cd7f	Remove dead code	5 years ago
GitHub	36048cb6	Moving Env Manager to Trainers (#3062) The Env Manager is only used by the trainer codebase. The entry point to interact with an environment is UnityEnvironment. * Moving Env Manager to Trainers * fix pylint madness	5 years ago
Ervin Teng	c9116ed2	Move some common logic to buffer class	5 years ago
GitHub	90db165f	Add --namespace-packages to mypy for mlagents (#3075)	5 years ago
Andrew Cohen	614d276f	recieves brain_name and identifier on python side	5 years ago
Chris Elion	fdc810ff	move (first pass)	5 years ago
Ervin Teng	27c2a55b	Lots of test fixes	5 years ago
Ervin Teng	97d66e71	Remove BootstrapExperience	5 years ago
Ervin Teng	324d217b	Move agent_id to Trajectory	5 years ago
Ervin Teng	77ff4822	Add back next_obs	5 years ago
Ervin Teng	2b811fc8	Properly report value estimates and episode length	5 years ago
GitHub	2fd305e7	Move add_experiences out of trainer, add Trajectories (#3067)	5 years ago
Andrew Cohen	de902fbb	passes all pytest and C# tests	5 years ago
GitHub	2ac242f7	Remove TrainerMetrics and add CSVWriter using new StatsWriter API (#3108)	5 years ago
Ervin Teng	fdf9aea7	Make conversion methods part of NamedTuples	5 years ago
Ervin Teng	6242b67d	Add way to check if trajectory is done or max_reached	5 years ago
GitHub	0b5b1b01	Develop magic string + trajectory (#3122) * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * rebased with develop * Correctly calls concatBehaviorIdentifiers * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * rebased with develop * Correctly calls concatBehaviorIdentifiers * trainer_controller expects name_behavior_ids * add_policy and create_policy separated * adjusting tests to expect trainer.add_policy to be called * fixing tests * fixed naming ...	5 years ago
GitHub	c7da0139	Fix mypy errors in trainer code. (#3135)	5 years ago
Andrew Cohen	082789ea	Merge branch 'master' into develop-magic-string	5 years ago
Andrew Cohen	6a4e7cf9	added ppo/sac_policy attributes to keep up with master	5 years ago
Ervin Teng	1bd791e5	Merge branch 'master' into develop-agentprocessor	5 years ago
Andrew Cohen	3e76adbd	fixing more ci tests	5 years ago
GitHub	bec2e8f0	Add Trajectory/Policy Queues, move Trainer logic to advance() (#3113)	5 years ago
Ervin Teng	db743971	Move private methods out of trainer, simplify interface	5 years ago
GitHub	45010af3	Add stats reporter class and re-enable missing stats (#3076)	5 years ago
Ervin Teng	b3a4e641	Remove some vestigial code	5 years ago
Ervin Teng	48793ec1	Fix test	5 years ago
GitHub	5bc7531b	Get step from policy (#3223)	5 years ago
Ervin Teng	29f3330f	Merge master into hotfix-0.13.1	5 years ago
GitHub	329b23e0	Fix extra summary being written when loading from checkpoint (#3272) * Load next summary properly * Add tests for add_policy and get_policy	5 years ago
Ervin Teng	0ef40c08	SAC CC working	5 years ago
Ervin Teng	db249ceb	Merge branch 'master' into develop-splitpolicyoptimizer	5 years ago
Ervin Teng	b21b3d5c	Use resamp policy for SAC	5 years ago
Ervin Teng	edeceefd	Zeroed version of LSTM working for PPO	5 years ago
Ervin Teng	cfc2f455	Fix BC and tests	5 years ago
Ervin Teng	78671383	Move initialization call around	5 years ago
GitHub	dd86e879	Separate out optimizer creation and policy graph creation (#3355)	5 years ago
GitHub	c145e75b	Split Policy and Optimizer, common Policy for PPO and SAC (#3345)	5 years ago
GitHub	e4177de0	[change] Organize trainer files a bit better (#3538)	5 years ago
GitHub	cb153a0f	[change] Change warning language when adversarial scene is used without self-play (#3561)	5 years ago
GitHub	873ba7fd	[bug-fix] Fix stats reporting for reward signals in SAC (#3606)	5 years ago
GitHub	c42a11c3	[change] Throw a proper error when sequence length is greater than batch size. (#3583)	5 years ago
GitHub	ec278616	Hotfixes for Release 0.15.1 (#3698) * [bug-fix] Increase height of wall in CrawlerStatic (#3650) * [bug-fix] Improve performance for PPO with continuous actions (#3662) * Corrected a typo in a name of a function (#3670) OnEpsiodeBegin was corrected to OnEpisodeBegin in Migrating.md document * Add Academy.AutomaticSteppingEnabled to migration (#3666) * Fix editor port in Dockerfile (#3674) * Hotfix memory leak on Python (#3664) * Hotfix memory leak on Python * Fixing * Fixing a bug in the heuristic policy. A decision should not be requested when the agent is done * [bug-fix] Make Python able to deal with 0-step episodes (#3671) * adding some comments Co-authored-by: Ervin T <ervin@unity3d.com> * Remove vis_encode_type from list of required (#3677) * Update changelog (#3678) * Shorten timeout duration for environment close (#3679) The timeout duration for closing an environment was set to the same duration as the timeout when waiting ...	5 years ago
GitHub	6709a9bf	[change] Clean up trainer interface, clean up GhostTrainer stats (#3634)	5 years ago
Andrew Cohen	9f09a65d	team id centric ghost trainer	5 years ago
Ervin Teng	293579dd	Use steps_per_update to determine SAC train interval	5 years ago
Ervin Teng	0fa2f4f7	Don't count buffer_init_steps	5 years ago
Ervin Teng	dbf8f7a5	Fix comment	5 years ago
GitHub	ff32035d	Remove vis_encode_type from list of required (#3677)	5 years ago
Andrew Cohen	4c9ac553	Merge branch 'master' into self-play-mutex	5 years ago
GitHub	4ecd6ad3	Fix how we set logging levels (#3703) * cleanup logging * comments and cleanup * pylint, gym	5 years ago
Andrew Cohen	59b88be6	Merge branch 'master' into self-play-mutex	5 years ago
Ervin Teng	06fa3d39	Merge branch 'master' into develop-sac-apex	5 years ago
Andrew Cohen	3de78baa	wrapped trainer has internal policy ghost	5 years ago
Ervin Teng	b7151b51	Remove num_update as param	5 years ago
Andrew Cohen	3013774b	alternative to internal-policy fix	5 years ago
Ervin Teng	8b52a2d0	Address comments in docs	5 years ago
Ervin Teng	817aab95	Update steps_per_update documentation Add constant Tweak buffer max size	5 years ago
Ervin Teng	f29b17a9	Don't block one policy queue Only put policies when policy is actually updated	5 years ago
Ervin Teng	5e980ec1	Merge branch 'master' into develop-sac-apex	5 years ago
Anupam Bhatnagar	9d7dd3b6	[skip ci] moving step increment to trainer from environment for sac	5 years ago
GitHub	232519e4	[refactor] Move output artifacts to a single results/ folder (#3829)	5 years ago
GitHub	4641038e	Renaming max_step to interrupted in TermialStep(s) (#3908)	5 years ago
Christopher Goy	ba80b292	format files with pre-commit.	4 years ago
GitHub	e92b4f88	[refactor] Structure configuration files into classes (#3936)	5 years ago
GitHub	a7323393	[bug-fix] Fix issue with SAC updating too much on resume (#4038)	5 years ago
GitHub	5cce69ae	add "the the" to precommit spell check (#4059)	5 years ago
GitHub	09853e13	[refactor] Move checkpoint saving into trainer (#4034)	5 years ago
GitHub	a1c63c4b	Release 3 Cherry-pick bug-fixes and doc changes from master (#4102) * [bug-fix] Fix regression in --initialize-from feature (#4086) * Fixed text in GettingStarted page specifying the logdir for tensorboard. Before it was in a directory summaries which no longer existed. Results are now saved to the results dir. (#4085) * [refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087) * Reverting bug introduced in #4071 (#4101) Co-authored-by: Scott <Scott.m.jordan91@gmail.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	5 years ago
Anupam Bhatnagar	4afd8f92	first commit	5 years ago
Anupam Bhatnagar	f7a3c06e	[skip ci] updating sac	5 years ago
Anupam Bhatnagar	a4567f27	[skip ci] restore process trajectory super calls	5 years ago
Anupam Bhatnagar	26dc42e5	[skip ci]	5 years ago
Anupam Bhatnagar	0aedad7c	fixing should_still_train call in rl_trainer.py	5 years ago
Anupam Bhatnagar	392a84f1	[skip ci] fixing property decorator in sac	5 years ago
Arthur Juliani	9724c9ac	Merge master	5 years ago
GitHub	45154f52	Pytorch port of SAC (#4219)	5 years ago
GitHub	a28e2767	Update add-fire to latest master, including Policy refactor (#4263) * Update Dockerfile * Separate send environment data from reset (#4128) * Fixed a typo on ML-Agents-Overview.md (#4130) Fixed redundant "to" word from the sentence since it is probably a typo in document. * Updated the badge’s link to point to the newest doc version * Replaced all of the doc to release_3_doc * Fix 3DBall and 3DBallHard SAC regressions (#4132) * Move memory validation to settings * Update docs * Add settings test * Update to release_3 in installation.md (#4144) * rename to SideChannelManager +backcompat (#4137) * Remove comment about logo with --help (#4148) * [bugfix] Make FoodCollector heuristic playable (#4147) * Make FoodCollector heuristic playable * Update changelog * script to check for old release links and references (#4153) * Remove package validation suite from Project (#4146) * RayPerceptionSensor: handle empty and invalid tags (#4155...	4 years ago
GitHub	93517833	[feature] Fix TF tests, add --torch CLI option, allow run TF without torch installed (#4305)	4 years ago
Ruo-Ping Dong	01e60921	add sac checkpoint	4 years ago
GitHub	7ddfd81f	Added Reward Providers for Torch (#4280) * Added Reward Providers for Torch * Use NetworkBody to encode state in the reward providers * Integrating the reward prodiders with ppo and torch * work in progress, integration with PPO. Not training properly Pyramids at the moment * Integration in PPO * Removing duplicate file * Gail and Curiosity working * addressing comments * Enfore float32 for tests * enfore np.float32 in buffer	4 years ago
Ruo-Ping Dong	71fe4df6	fix formatting and test	4 years ago
Ruo-Ping Dong	09a741c8	small improvement	4 years ago
GitHub	3bcb029b	[refactor] Remove BrainParameters from Python code (#4138)	5 years ago
Ruo-Ping Dong	e06812aa	fix tests	4 years ago
GitHub	84440f05	Convert checkpoints to .NN (#4127) This change adds an export to .nn for each checkpoint generated by RLTrainer and adds a NNCheckpointManager to track the generated checkpoints and final model in training_status.json. Co-authored-by: Jonathan Harper <jharper+moar@unity3d.com>	5 years ago
GitHub	129f9ddc	[MLA-427] make pyupgrade convert f-strings too (#4244) * make pyupgrade convert f-strings too	5 years ago
GitHub	1b098c9a	Refactor TFPolicy and Policy (#4254) * Refactor TFPolicy and Policy	4 years ago
GitHub	beb5aca5	[refactor] Make classes except Optimizer framework agnostic (#4268)	4 years ago
GitHub	3f44a0bc	cleanup around AdamOptimizer (#4333) * cleanup around AdamOptimizer * methods to creat Optimizer instances	4 years ago
Ruo-Ping Dong	d3eb6c46	Merge branch 'develop-add-fire' into develop-add-fire-checkpoint	4 years ago
Ruo-Ping Dong	95858e25	update saver interface and add tests	4 years ago
Ruo-Ping Dong	523248be	update	4 years ago
Ruo-Ping Dong	409a161c	fix bc tests	4 years ago
GitHub	25dc8c3d	Add Saver Class to handle all save/load/checkpoint/export work (#4323)	4 years ago
Ervin Teng	d65a9326	Merge branch 'master' into develop-add-fire-mm3	4 years ago
Ruo-Ping Dong	d57aa9ab	Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint	4 years ago
GitHub	49545ce1	Pytorch ghost trainer (#4370)	4 years ago
Andrew Cohen	e7c9ff35	clean up docstrings create policies	4 years ago
Andrew Cohen	039ae17f	capitalize Tensorflow	4 years ago
Ruo-Ping Dong	c47ffc20	Rename saver	4 years ago
Ruo-Ping Dong	27fb4270	brain_name to behavior_name	4 years ago
Ruo-Ping Dong	ef3be79e	sac	4 years ago
GitHub	beb5eb30	[bug-fix] Fixes for Torch SAC and tests (#4408) * Fixes for Torch SAC and tests * FIx recurrent sac test * Properly update normalization for SAC-continuous * Fix issue with log ent coef reporting in SAC Torch	4 years ago
GitHub	6f534366	Add torch_utils class, auto-detect CUDA availability (#4403) * Add torch_utils * Use torch from torch_utils * Add torch to banned modules in CI * Better import error handling * Fix flake8 errors * Address comments * Move networks to GPU if enabled * Switch to torch_utils * More flake8 problems * Move reward providers to GPU/CPU * Remove anothere set default tensor * Fix banned import in test	4 years ago
GitHub	badca342	Rename NNCheckpoint to ModelCheckpoint as Model can be NN or ONNX (#4540)	4 years ago
GitHub	c188781b	[life improvement] Moving Python files around (#4531) * Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder * Addressing comments * Editing the migrating doc * fixing test	4 years ago
GitHub	a690af74	[refactor] Make PyTorch the default and TensorFlow optional (#4517) * Torch setup.py * Set torch to default * Make torch default in setup.py * Remove indents * Remove other instances of TF being used * Add tensorboard to setup.py * Adding correst setup commands for verifying torch is installed (#4524) * Adding correst setup commands for verifying torch is installed * Editing the test_requirments to add tf and remove torch * Develop torchdefault raise outside setup (#4530) * Torch not imported error to raise at first usage * Torch not imported error to raise at first usage * [refactor] Use PyTorch TensorBoard utils (#4518) * Convert stats writer to use PyTorch TB support * Use common function to print params * Update test * Bump tensorboard to 1.15 to fix the tests * putting tensorboard 1.15.0 as min version requirement Co-authored-by: vincentpierre <vincentpierre@unity3d.com> * [Docs] Initial documentation changes for making...	4 years ago
vincentpierre	b863af57	Removing TensorFlow Trainers	4 years ago
Ervin Teng	6c77ac7a	Update SAC, fix PPO batching	4 years ago
Ervin Teng	1db21cbb	Fix SAC interrupted condition and typing	4 years ago
Ervin Teng	6e6a6b2b	Fix SAC interrupted again	4 years ago
vincentpierre	713e65fb	removing tensorflow testing for pytest and yamato	4 years ago
vincentpierre	2dd34aa5	Formatting	4 years ago
vincentpierre	8f9634c2	Fxing test	4 years ago
vincentpierre	735fcd52	[WIP] Refactor trainers to use list of obs rather than vec and vis obs	4 years ago
GitHub	22658a40	use sensor types to differentiate obs (#4749)	4 years ago
GitHub	64fc7f43	Buffer key enums (#4907)	4 years ago
Ervin Teng	ae7643b8	Proper critic memories for PPO	4 years ago
Ervin Teng	fd3f05b9	Enable GAIL to decay	4 years ago
Ervin Teng	bb452ffd	Fix SAC	4 years ago
GitHub	f16ce486	Update v2-staging from main (March 15) (#5123)	4 years ago
GitHub	62314056	Fix ghost curriculum and make steps private (#5098) * use get step to determine curriculum * add to CHANGELOG * Make step in trainer private (#5099) Co-authored-by: Ervin T <ervin@unity3d.com>	4 years ago
Ervin Teng	d1c24251	[bug-fix] When agent isn't training, don't clear update buffer (#5205) * Don't clear update buffer, but don't append to it either * Update changelog * Address comments * Make experience replay buffer saving more verbose (cherry picked from commit 63e7ad44d96b7663b91f005ca1d88f4f3b11dd2a)	4 years ago
GitHub	2e19759c	Turning some logger.info into logger.debug and remove some logging overhead when not using debug (#5211) * turning some logger.info into logger.debug and remove some logging overhead when not using debug * Addressing comments * Adding to changelog	4 years ago


142 commits (4c19aef2-5d76-41e7-afd7-18dd7fea3725)