ml-agents

Author	SHA1	Message	Commit date
GitHub	7b69bd14	Refactor Trainer and Model (#2360) - Move common functions to trainer.py, model.py from ppo/trainer.py, ppo/policy.py and ppo/model.py - Introduce RLTrainer class and move most of add_experiences and some common reward signal code there. PPO and SAC will inherit from this, not so much BC Trainer. - Add methods to Buffer to enable sampling, truncating, and save/loading. - Add scoping to create encoders in model.py	5 years ago
GitHub	689765d6	Modification of reward signals and rl_trainer for SAC (#2433) * Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo. * Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals. * Moves end_episode to rl_trainer * Fixed bug with BCModule with RNN	5 years ago
GitHub	f628d18b	initialize trainer step count (#2498) (#2505) * initialize trainer step count * remove step init from RLTrainer	5 years ago
Ervin Teng	e0da93d1	Fix bug with construct_curr_info and test	5 years ago
GitHub	25926795	initialize trainer step count (#2498) * initialize trainer step count * remove step init from RLTrainer	5 years ago
Ervin Teng	4cb340b5	Fix crash when next_info is empty and using recurrent	5 years ago
GitHub	b7e12a37	Fix crash in construct_curr_info when next_info doesn't have any agents (#2549) Fixes #1687	5 years ago
GitHub	67d754c5	Fix flake8 import warnings (#2584) We have been ignoring unused imports and star imports via flake8. These are both bad practice and grow over time without automated checking. This commit attempts to fix all existing import errors and add back the corresponding flake8 checks.	5 years ago
GitHub	5d3e05d1	Fix "memory leak" during inference (#2722) * Clear buffer if not training * Add tests	5 years ago
GitHub	0fe5adc2	Develop remove memories (#2795) * Initial commit removing memories from C# and deprecating memory fields in proto * initial changes to Python * Adding functionalities * Fixes * adding the memories to the dictionary * Fixing bugs * tweeks * Resolving bugs * Recreating the proto * Addressing comments * Passing by reference does not work. Do not merge * Fixing huge bug in Inference * Applying patches * fixing tests * Addressing comments * Renaming variable to reflect type * test	5 years ago
GitHub	4da157fe	more pylint fixes (#2842)	5 years ago
GitHub	ccb7eab4	Remove {text,custom} {action,observations} (#2839) * delete text actions and obs * delete custom actions and obs * regenerate protos * cleanup C# * format * fix tests * fix base env signature * doc cleanup	5 years ago
Andrew Cohen	13fe9cf8	Bubbled up indexing of AllBrainInfo to trainer controller from trainers	5 years ago
GitHub	69d1a033	Develop remove past action communication (#2913) * Modifying the .proto files * attempt 1 at refactoring Python * works for ppo hallway * changing the documentation * now works with both sac and ppo both training and inference * Ned to fix the tests * TODOs : - Fix the demonstration recorder - Fix the demonstration loader - verify the intrinsic reward signals work - Fix the tests on Python - Fix the C# tests * Regenerating the protos * fix proto typo * protos and modifying the C# demo recorder * modified the demo loader * Demos are loading * IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT * Modified all the demo files * Fixing all the tests * fixing ci * addressing comments * removing reference to memories in the ll-api	5 years ago
Ervin Teng	df5ee7bf	Split buffer into two buffers (PPO works)	5 years ago
GitHub	652488d9	check for numpy float64 (#2948)	5 years ago
GitHub	213cd68d	Split Buffer into processing and update buffers (#2964) This is the first in a series of PRs that intend to move the agent processing logic (add_experiences and process_experiences) out of the trainer and into a separate class. The plan is to do so in steps: - Split the processing buffers (keeping track of agent trajectories and assembling trajectories) and update buffer (complete trajectories to be used for training) within the Trainer (this PR) - Move the processing buffer and add/process experiences into a separate, outside class - Change the data type of the update buffer to be a Trajectory - Place and read Trajectories from queues, add subscription mechanism for both AgentProcessor and Trainers	5 years ago
Ervin Teng	9e661f0c	Looks like it's training	5 years ago
Ervin Teng	9c5fdd31	Stats reporting is working	5 years ago
Ervin Teng	f94365a2	No longer using ProcessingBuffer for PPO	5 years ago
Ervin Teng	8b3b9e6c	Move trajectory and related functions to trajectory.py	5 years ago
Andrew Cohen	8578b0b7	add_policy and create_policy separated	5 years ago
GitHub	36048cb6	Moving Env Manager to Trainers (#3062) The Env Manager is only used by the trainer codebase. The entry point to interact with an environment is UnityEnvironment. * Moving Env Manager to Trainers * fix pylint madness	5 years ago
GitHub	42bea858	Improve mypy coverage by adding --namespace-packages (#3049)	5 years ago
Ervin Teng	62d609f8	Fix some of the tests	5 years ago
Ervin Teng	27c2a55b	Lots of test fixes	5 years ago
Andrew Cohen	d1edbf43	add_policy and create_policy separated	5 years ago
Ervin Teng	2b811fc8	Properly report value estimates and episode length	5 years ago
GitHub	2fd305e7	Move add_experiences out of trainer, add Trajectories (#3067)	5 years ago
GitHub	0b5b1b01	Develop magic string + trajectory (#3122) * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * rebased with develop * Correctly calls concatBehaviorIdentifiers * added team id and identifier concat to behavior parameters * splitting brain params into brain name and identifiers * set team id in prefab * recieves brain_name and identifier on python side * rebased with develop * Correctly calls concatBehaviorIdentifiers * trainer_controller expects name_behavior_ids * add_policy and create_policy separated * adjusting tests to expect trainer.add_policy to be called * fixing tests * fixed naming ...	5 years ago
Andrew Cohen	082789ea	Merge branch 'master' into develop-magic-string	5 years ago
Ervin Teng	e577d5ea	Fix some mypy issues and remove unused code	5 years ago
GitHub	bec2e8f0	Add Trajectory/Policy Queues, move Trainer logic to advance() (#3113)	5 years ago
Andrew Cohen	fc485077	fixed more ci problems/removed self.policies	5 years ago
Ervin Teng	db743971	Move private methods out of trainer, simplify interface	5 years ago
GitHub	45010af3	Add stats reporter class and re-enable missing stats (#3076)	5 years ago
Ervin Teng	b3a4e641	Remove some vestigial code	5 years ago
Ervin Teng	48793ec1	Fix test	5 years ago
Ervin Teng	cd74e51b	More progress	5 years ago
Ervin Teng	9ad99eb6	Combined model and policy for PPO	5 years ago
Ervin Teng	164732a9	Move optimizer creation to Trainer, fix some of the reward signals	5 years ago
Ervin Teng	cbfbff2c	Split optimizer and TFOptimizer	5 years ago
Ervin Teng	4d94e180	Move optimizer to common folder	5 years ago
GitHub	1f9d04f2	Fix clear update buffer when trainer stops training, add test (#3422) * Fix clear update buffer when trainer stops training, add test * Fix buffer changing types when truncated	5 years ago
Ervin Teng	5ef902bf	Merge branch 'master' into develop-splitpolicyoptimizer	5 years ago
GitHub	e4177de0	[change] Organize trainer files a bit better (#3538)	5 years ago
Anupam Bhatnagar	f4dbedcf	removed extraneous logging imports and loggers	5 years ago
GitHub	6709a9bf	[change] Clean up trainer interface, clean up GhostTrainer stats (#3634)	5 years ago
Ervin Teng	3deb8e30	Make trainer in separate threads	5 years ago
Ervin Teng	93351d30	Fix comments	5 years ago
Ervin Teng	ed06f37c	Ability to disable threading	5 years ago
Ervin Teng	971e4b2d	Don't block when disabling threading	5 years ago
Ervin Teng	f29b17a9	Don't block one policy queue Only put policies when policy is actually updated	5 years ago
Anupam Bhatnagar	ac80ec82	[skip ci] increment steps on training	5 years ago
Anupam Bhatnagar	d49ceecc	[skip ci] moving summary writer to update_policy [skip ci] more fixes [skip ci] tweaking 3dball configs [skip ci] swap summary writer and step increment order	5 years ago
Anupam Bhatnagar	95ba923d	[skip ci] fix first summary statement output	5 years ago
Anupam Bhatnagar	63abbe71	[skip ci] moving summary writer to update_policy	5 years ago
Anupam Bhatnagar	45bac63e	[skip ci] more fixes	5 years ago
Ervin Teng	d1fed8ae	Remove empty_queue interface	5 years ago
Ervin Teng	e90ef688	Revert to get_nowait method in AgentManagerQueue	5 years ago
Anupam Bhatnagar	9d7dd3b6	[skip ci] moving step increment to trainer from environment for sac	5 years ago
Arthur Juliani	7c3bd376	Refactoring policy and optimizer	5 years ago
Ervin Teng	6fa7ad0b	Avoid stall when multiple brains are present	5 years ago
Ervin Teng	744db929	Adjust yield timeout	5 years ago
GitHub	ccd40ce7	[bug-fix] Bugfixes for Threaded Trainers (#3817)	5 years ago
Arthur Juliani	212e2d1d	Merge remote-tracking branch 'origin/master' into develop-add-fire	5 years ago
Arthur Juliani	ca887743	Support tf and pytorch alongside one another	5 years ago
Christopher Goy	ba80b292	format files with pre-commit.	4 years ago
GitHub	e92b4f88	[refactor] Structure configuration files into classes (#3936)	5 years ago
GitHub	09853e13	[refactor] Move checkpoint saving into trainer (#4034)	5 years ago
PSankalp Patro	45c4ea36	Save checkpoint files as .nn files in checkpoint directory	5 years ago
Anupam Bhatnagar	4afd8f92	first commit	4 years ago
Anupam Bhatnagar	0aedad7c	fixing should_still_train call in rl_trainer.py	4 years ago
Arthur Juliani	9724c9ac	Merge master	5 years ago
Anupam Bhatnagar	24d5f881	first commit	5 years ago
GitHub	45154f52	Pytorch port of SAC (#4219)	4 years ago
GitHub	a28e2767	Update add-fire to latest master, including Policy refactor (#4263) * Update Dockerfile * Separate send environment data from reset (#4128) * Fixed a typo on ML-Agents-Overview.md (#4130) Fixed redundant "to" word from the sentence since it is probably a typo in document. * Updated the badge’s link to point to the newest doc version * Replaced all of the doc to release_3_doc * Fix 3DBall and 3DBallHard SAC regressions (#4132) * Move memory validation to settings * Update docs * Add settings test * Update to release_3 in installation.md (#4144) * rename to SideChannelManager +backcompat (#4137) * Remove comment about logo with --help (#4148) * [bugfix] Make FoodCollector heuristic playable (#4147) * Make FoodCollector heuristic playable * Update changelog * script to check for old release links and references (#4153) * Remove package validation suite from Project (#4146) * RayPerceptionSensor: handle empty and invalid tags (#4155...	4 years ago
Ruo-Ping Dong	6feec58a	add Saver class (only TF working)	4 years ago
GitHub	93517833	[feature] Fix TF tests, add --torch CLI option, allow run TF without torch installed (#4305)	4 years ago
Ruo-Ping Dong	6d67f857	move tf and add torch model serialization	4 years ago
Ruo-Ping Dong	bdb2ba93	small improvements	4 years ago
GitHub	7ddfd81f	Added Reward Providers for Torch (#4280) * Added Reward Providers for Torch * Use NetworkBody to encode state in the reward providers * Integrating the reward prodiders with ppo and torch * work in progress, integration with PPO. Not training properly Pyramids at the moment * Integration in PPO * Removing duplicate file * Gail and Curiosity working * addressing comments * Enfore float32 for tests * enfore np.float32 in buffer	4 years ago
Ruo-Ping Dong	3b729a82	small improvements	4 years ago
Ruo-Ping Dong	4e87b422	move checkpoint_path logic to saver	4 years ago
Ruo-Ping Dong	71fe4df6	fix formatting and test	4 years ago
GitHub	0e0daf47	[add-fire] Merge post-0.19.0 master into add-fire (#4328)	4 years ago
Ruo-Ping Dong	b4713baa	small improvements	4 years ago
Ruo-Ping Dong	09a741c8	small improvement	4 years ago
GitHub	84440f05	Convert checkpoints to .NN (#4127) This change adds an export to .nn for each checkpoint generated by RLTrainer and adds a NNCheckpointManager to track the generated checkpoints and final model in training_status.json. Co-authored-by: Jonathan Harper <jharper+moar@unity3d.com>	4 years ago
GitHub	1f5eb9da	add pyupgrade to pre-commit and run (#4239)	4 years ago
GitHub	beb5aca5	[refactor] Make classes except Optimizer framework agnostic (#4268)	4 years ago
GitHub	8128defb	Don't save model twice, copy instead (#4302) * Don't save model twice, copy instead * narrower exception	4 years ago
Ruo-Ping Dong	d3eb6c46	Merge branch 'develop-add-fire' into develop-add-fire-checkpoint	4 years ago
Ruo-Ping Dong	95858e25	update saver interface and add tests	4 years ago
Ervin Teng	0ba67eb6	Fix ONNX import for continuous	4 years ago
GitHub	25dc8c3d	Add Saver Class to handle all save/load/checkpoint/export work (#4323)	4 years ago
Ervin Teng	d65a9326	Merge branch 'master' into develop-add-fire-mm3	4 years ago
GitHub	8985a040	Removing the experiment script from add fire (#4373) * Removing the experiment script * Removing the script	4 years ago
Andrew Cohen	a65d08c7	ghost trainer tests	4 years ago
GitHub	49545ce1	Pytorch ghost trainer (#4370)	4 years ago
Ruo-Ping Dong	c47ffc20	Rename saver	4 years ago
Ruo-Ping Dong	09c22679	fix NNCheckpointManager for Torch	4 years ago
Ruo-Ping Dong	e60c7038	Merge branch 'master' into develop-saver-name	4 years ago
GitHub	6f534366	Add torch_utils class, auto-detect CUDA availability (#4403) * Add torch_utils * Use torch from torch_utils * Add torch to banned modules in CI * Better import error handling * Fix flake8 errors * Address comments * Move networks to GPU if enabled * Switch to torch_utils * More flake8 problems * Move reward providers to GPU/CPU * Remove anothere set default tensor * Fix banned import in test	4 years ago
vincentpierre	6b6d4c38	_	4 years ago
vincentpierre	6cbe892f	_	4 years ago
vincentpierre	8be52c38	-	4 years ago
vincentpierre	c10da7ef	-	4 years ago
vincentpierre	29f08b2e	-	4 years ago
vincentpierre	170f47a5	-	4 years ago
vincentpierre	a8137478	-	4 years ago
vincentpierre	f49aa8c7	-	4 years ago
GitHub	badca342	Rename NNCheckpoint to ModelCheckpoint as Model can be NN or ONNX (#4540)	4 years ago
GitHub	c188781b	[life improvement] Moving Python files around (#4531) * Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder * Addressing comments * Editing the migrating doc * fixing test	4 years ago
GitHub	a690af74	[refactor] Make PyTorch the default and TensorFlow optional (#4517) * Torch setup.py * Set torch to default * Make torch default in setup.py * Remove indents * Remove other instances of TF being used * Add tensorboard to setup.py * Adding correst setup commands for verifying torch is installed (#4524) * Adding correst setup commands for verifying torch is installed * Editing the test_requirments to add tf and remove torch * Develop torchdefault raise outside setup (#4530) * Torch not imported error to raise at first usage * Torch not imported error to raise at first usage * [refactor] Use PyTorch TensorBoard utils (#4518) * Convert stats writer to use PyTorch TB support * Use common function to print params * Update test * Bump tensorboard to 1.15 to fix the tests * putting tensorboard 1.15.0 as min version requirement Co-authored-by: vincentpierre <vincentpierre@unity3d.com> * [Docs] Initial documentation changes for making...	4 years ago
Ervin Teng	3b15cc32	Multiprocessing but Stats are quite broken	4 years ago
vincentpierre	b863af57	Removing TensorFlow Trainers	4 years ago
GitHub	7387a77f	remove pylint (#4836) * remove pylint * remove other pylint disables	4 years ago
Arthur Juliani	9e2f0814	Add histogram aggregation type	4 years ago
GitHub	f16ce486	Update v2-staging from main (March 15) (#5123)	4 years ago
GitHub	62314056	Fix ghost curriculum and make steps private (#5098) * use get step to determine curriculum * add to CHANGELOG * Make step in trainer private (#5099) Co-authored-by: Ervin T <ervin@unity3d.com>	4 years ago
GitHub	63169e2c	[cherry-pick] Fix group rewards for POCA, add warning for non-POCA trainers (#5120) * Fix end episode for POCA, add warning for group reward if not POCA (#5113) * Fix end episode for POCA, add warning for group reward if not POCA * Add missing imports * Use np.any, which is faster	4 years ago
GitHub	8387e252	[release] Fix rl trainer warning (#5144) * Fix rl trainer warning * Fix typo	4 years ago
Ervin Teng	d1c24251	[bug-fix] When agent isn't training, don't clear update buffer (#5205) * Don't clear update buffer, but don't append to it either * Update changelog * Address comments * Make experience replay buffer saving more verbose (cherry picked from commit 63e7ad44d96b7663b91f005ca1d88f4f3b11dd2a)	4 years ago
GitHub	28eb43dd	[bug-fix] Delete .pt checkpoints past keep-checkpoints (#5271) * Manage non-ONNX files with checkpoint manager too * Update tests * Update training status version * Change ticking of status file version	4 years ago
GitHub	ed69fd2b	collecting latest step as a stat (#5264) * collecting latest step as a stat * adding a list of hidden_keys to TB summarywriter to hide unnecessary stats from user * fixing precommit * fixing precommit * formating * defined the property types * moving custom defaults to get_default_stats_writers * new test for TensorboardWriter.hidden_keys * improved testing * explicit None evaluation Co-authored-by: Ervin T. <ervin@unity3d.com> * make hidden_keys optional Co-authored-by: Ervin T. <ervin@unity3d.com> * adding optional argument * lowering the training threshold to 0.8 on test_var_len_obs_and_goal_poca * Update pytest.yml * Do not merge! droping pytest 3.9 job * -add back pytest -format imports and comments * back to default threshold for test_var_len_obs_and_goal_poca Co-authored-by: mahon94 <maryam.honari@unity3d.com> Co-authored-by: Ervin T. <ervin@unity3d.com>	4 years ago


126 commits (83f8d70d-b521-4845-89ab-70af6e70c9d0)