179 commits (da6d25c9-4e76-454f-81d0-1258ba68390b)

Author SHA1 Message Commit date
GitHub e4177de0 [change] Organize trainer files a bit better (#3538) 4 years ago
Andrew Cohen 573b1f6d Merge branch 'master' into soccer-fives 4 years ago
Anupam Bhatnagar 07b15ae7 [skip-ci] small refactors 4 years ago
Andrew Cohen ac261e36 Merge branch 'master' into self-play-mutex 4 years ago
Anupam Bhatnagar 9341f7a2 [skip-ci] small refactors 4 years ago
Anupam Bhatnagar 06a54ae8 step increment moved to _update_policy, fixed exit status issue 4 years ago
Anupam Bhatnagar 5d180caf [skip ci] modify learning rate in horovod optimizer 4 years ago
Anupam Bhatnagar b3c2d431 [skip ci] minor formatting change 4 years ago
Arthur Juliani 6879bae4 Initial optimizer port 4 years ago
Arthur Juliani 7c3bd376 Refactoring policy and optimizer 4 years ago
Arthur Juliani 2e51260a Resolving a few bugs 4 years ago
Arthur Juliani 947f0d32 Slightly closer to running model 4 years ago
Arthur Juliani 3c82bf59 Training runs, but doesn’t actually work 4 years ago
Arthur Juliani 8c6f4696 Fix a couple additional bugs 4 years ago
Arthur Juliani 61d671d8 Add conditional sigma for distribution 4 years ago
Arthur Juliani a5b5b109 Mulkti-discrete now working 4 years ago
Arthur Juliani 5f936990 Visual observations now train as well 4 years ago
Arthur Juliani 1736559f Combine actor and critic classes. Initial export. 4 years ago
Arthur Juliani ca887743 Support tf and pytorch alongside one another 4 years ago
Arthur Juliani be7e55e1 Use LSTM and fix a few merge errors 4 years ago
Arthur Juliani 3eef9d78 Optimize np -> tensor operations 4 years ago
Ervin Teng 72180f9b Experiment with JIT compiler 4 years ago
GitHub e92b4f88 [refactor] Structure configuration files into classes (#3936) 4 years ago
Anupam Bhatnagar 4afd8f92 first commit 4 years ago
Arthur Juliani 9724c9ac Merge master 4 years ago
Anupam Bhatnagar 24d5f881 first commit 4 years ago
GitHub cde8bd29 Convert List[np.ndarray] to np.ndarray before using torch.as_tensor (#4183) 4 years ago
GitHub 05a11c96 Develop add fire exp framework (#4213) 4 years ago
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 4 years ago
GitHub 69579611 [refactor] Refactor Actor and Critic classes (#4287) 4 years ago
Andrew Cohen ccb492dc ignore precommit/first bc commit 4 years ago
Andrew Cohen 84ea84a6 bc loss for both continuous and disc 4 years ago
Andrew Cohen f74d301a Merge branch 'develop-add-fire' into develop-add-fire-bc 4 years ago
Andrew Cohen 22a0cabc changed path to torch bc module 4 years ago
vincentpierre 599d7e9f Merging master 4 years ago
GitHub 7ddfd81f Added Reward Providers for Torch (#4280) 4 years ago
Ruo-Ping Dong 79d89158 Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
Andrew Cohen d8c123a0 Merge branch 'master' into sensitivity 4 years ago
Andrew Cohen 02df39ab ignore precommit 4 years ago
Andrew Cohen fa35292c write hist to tb 4 years ago
GitHub beb5aca5 [refactor] Make classes except Optimizer framework agnostic (#4268) 4 years ago
Andrew Cohen 06e4356c Merge branch 'master' into sensitivity 4 years ago
Arthur Juliani 1a123641 Merge remote-tracking branch 'origin/master' into r5-master 4 years ago
GitHub 3f44a0bc cleanup around AdamOptimizer (#4333) 4 years ago
Andrew Cohen 598826fe Merge branch 'develop-add-fire' into develop-add-fire-bc 4 years ago
Ruo-Ping Dong d3eb6c46 Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
Anupam Bhatnagar a5cc4d03 Merge branch 'master' into global-variables 4 years ago
GitHub 6b255790 Behavioral Cloning Pytorch (#4293) 4 years ago
GitHub f374f87a [add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324) 4 years ago
Andrew Cohen 0a7444f9 revert bc default batch/epoch 4 years ago
HH 8eaddb61 Merge branch 'master' into hh/develop/loco-walker-variable-speed 4 years ago
Ruo-Ping Dong 59cc1a9f Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
Ervin Teng f4da3592 Add memories and sequence length to critic_pass 4 years ago
Ervin Teng 13f15086 Merge branch 'develop-add-fire' into develop-add-fire-amrl 4 years ago
Ervin Teng fa0d3cb6 Fix next_obs in get_trajectory_value_estimates 4 years ago
Ervin Teng d65a9326 Merge branch 'master' into develop-add-fire-mm3 4 years ago
Ruo-Ping Dong d57aa9ab Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint 4 years ago
GitHub bd6bcd2f Merge master and add Saver class for save/load checkpoints 4 years ago
Ervin Teng d218bf4d Merge branch 'develop-add-fire' into develop-add-fire-sac-lst 4 years ago
Ervin Teng 42e25b25 Merge branch 'develop-add-fire' into develop-add-fire-memoryclass 4 years ago
Christopher Goy 5a233353 Merge remote-tracking branch 'origin/master' into release_6-to-master 4 years ago
GitHub 1955af9e [feature] Add experimental PyTorch support (#4335) 4 years ago
vincentpierre 108fac9a Replace torch.detach().cpu().numpy() with a utils method 4 years ago
HH d9962254 Merge branch 'master' into hh/develop/loco-walker-variable-speed 4 years ago
Anupam Bhatnagar f4f1a8d9 merge master into trainer-plugin branch 4 years ago
GitHub 498934f9 Replace torch.detach().cpu().numpy() with a utils method (#4406) 4 years ago
Ruo-Ping Dong fd1dc3a6 Merge branch 'master' into develop-torch-omp 4 years ago
GitHub 4e93cb6e [torch] Restructure PyTorch encoders (#4421) 4 years ago
GitHub 6f534366 Add torch_utils class, auto-detect CUDA availability (#4403) 4 years ago
Andrew Cohen 3997b14b Merge branch 'master' into develop-hybrid-actions 4 years ago
Ervin Teng 3e771cbb Permute visual obs outside of network 4 years ago
Ervin Teng 77c810fb Fix SAC and make utility method 4 years ago
GitHub c188781b [life improvement] Moving Python files around (#4531) 4 years ago
Andrew Cohen e5f14400 Merge branch 'master' into develop-hybrid-actions-singleton 4 years ago
vincentpierre d3d4eb90 Trainer with attention 4 years ago
GitHub b853e5ba Action buffer (#4612) 4 years ago
Ervin Teng 95bdbba3 Less broken PPO 4 years ago
vincentpierre b863af57 Removing TensorFlow Trainers 4 years ago
Ervin Teng 5a5bd515 Fix multiple obs 4 years ago
Ervin Teng fdaa8c3d Merge branch 'develop-unified-obs' into develop-centralizedcritic 4 years ago
GitHub 990f801a Develop hybrid action staging (#4702) 4 years ago
vincentpierre 735fcd52 [WIP] Refactor trainers to use list of obs rather than vec and vis obs 4 years ago
Ervin Teng cb4b7ed3 Some minor tweaks but still broken 4 years ago
Ervin Teng 56dcd75a Get next critic observations into value estimate 4 years ago
Andrew Cohen 4ebc6c44 ml-agents-envs pass 4 years ago
Arthur Juliani 0d2f8887 Merge remote-tracking branch 'origin/master' into goal-conditioning 4 years ago
GitHub cc6b4564 Multi Directional Walker and Initial Hypernetwork (#4740) 4 years ago
Ervin Teng 25dfd883 Merge branch 'master' into develop-centralizedcritic 4 years ago
GitHub 22658a40 use sensor types to differentiate obs (#4749) 4 years ago
Andrew Cohen 3c65b964 fixed recurrent prev_action issue 4 years ago
GitHub 903d3afe Merge pull request #4707 from Unity-Technologies/develop-rm-tf 4 years ago
Andrew Cohen 498b1ee6 Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton 4 years ago
GitHub 29d94c7c Merge pull request #4734 from Unity-Technologies/develop-obs-as-list 4 years ago
Andrew Cohen c0d01baf Merge branch 'master' into merge-release11-master 4 years ago
vincentpierre 44ed3258 Merging master 4 years ago
Andrew Cohen 3457cd3c save only discrete actions as prev 4 years ago
vincentpierre 449712b0 renaming sensor_spec to sensor_specS 4 years ago
Andrew Cohen 35769b53 Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton 4 years ago
Chris Elion 76ebc20c Merge remote-tracking branch 'origin/master' into r12-to-master 4 years ago
GitHub 458fee17 Merge pull request #4763 from Unity-Technologies/develop-att 4 years ago
Ervin Teng 330fc1d0 Merge branch 'master' into develop-centralizedcritic-mm 4 years ago
vincentpierre 519c5f47 merging master 4 years ago
Ervin Teng ad439fb6 Additional changes 4 years ago
Ervin Teng d02a1033 Some more fixes 4 years ago
Ruo-Ping Dong 8ed14762 Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp 4 years ago
GitHub 7387a77f remove pylint (#4836) 4 years ago
Arthur Juliani 0b4b0992 Rename more files 4 years ago
Ervin Teng aba633b2 Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm 4 years ago
Arthur Juliani 0a876b9c Fix typos 4 years ago
Ruo-Ping Dong 180d3e20 Merge branch 'develop-centralizedcritic-mm' into develop-cc-teammanager 4 years ago
HH 0024a286 merge ervin's new stuff 4 years ago
Ervin Teng 9c3da1b6 New buffer layout, TeamObsUtil, pad dead agents 4 years ago
GitHub 67ad9651 Merge pull request #4825 from Unity-Technologies/sensor-types 4 years ago
vincentpierre 8660b1c2 merging master 4 years ago
Ervin Teng 3daa17a9 Merge branch 'develop-centralizedcritic-mm' into develop-zombieteammanager 4 years ago
Ervin Teng 6b8b3db3 Try subtract marginalized value 4 years ago
Ervin Teng 092ea232 Some more progress - still broken 4 years ago
Ervin Teng 457b2630 I think it's running 4 years ago
brccabral 457fb612 Merge branch 'master' of https://github.com/Unity-Technologies/ml-agents 4 years ago
Andrew Cohen 6e1826f8 might be right 4 years ago
Andrew Cohen 1511588d forcing this to work 4 years ago
Andrew Cohen e1fad8a4 buffer error 4 years ago
Andrew Cohen feb38012 add lambda return and target network 4 years ago
Andrew Cohen 5741f8f6 no target net 4 years ago
Andrew Cohen a92baab6 add target network back 4 years ago
Andrew Cohen a4c336c2 value estimator 4 years ago
Andrew Cohen fce842aa adding zombie to coma2 brnch 3 years ago
Andrew Cohen 7f491ae7 cloud run with coma2 of held out zombie test env 3 years ago
Andrew Cohen 9af22d30 use only value funcs 3 years ago
Andrew Cohen 95253b47 ntegrate teammate dones 3 years ago
Andrew Cohen 687f411b try again on cloud 3 years ago
Andrew Cohen f9ff3fef shared baseline and v 3 years ago
Ervin Teng 3283b6a1 Remove Q-net for perf 3 years ago
Ervin Teng b6f88d6d Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager 3 years ago
Andrew Cohen 6bd396ee add critic to optimizer, ppo runs 3 years ago
Andrew Cohen 3aec18a1 fix precommit errors 3 years ago
Andrew Cohen 8efdeeb0 make critic a property 3 years ago
Ervin Teng 0bde7598 Back out trainer changes 3 years ago
Ervin Teng 514873bf Use correct memories (t-1 instead of t) for training 3 years ago
Ervin Teng f3a2a81f Merge branch 'develop-fix-lstms' into develop-gru 3 years ago
Ervin Teng 219e773b Merge branch 'develop-fix-lstms' into develop-critic-op-lstm 3 years ago
Ervin Teng ae7643b8 Proper critic memories for PPO 3 years ago
Ervin Teng 2b0dd850 Still somewhat broken but cleaner 3 years ago
Ervin Teng 64839237 Fix indexing issue 3 years ago
Ervin Teng 21e9785a Fix padding issues 3 years ago
Ervin Teng 8d834f0b Fix more indexing bugs 3 years ago
Ervin Teng 4fc0f93e Code cleanup 3 years ago
Ervin Teng 6a573ebf Code cleanup 3 years ago
Ervin Teng f3cec983 Append the right memories 3 years ago
Ervin Teng a9666a0b Don't pad when not needed 3 years ago
Ervin Teng c2883f5b Pad from back of trajectory 3 years ago
Ervin Teng e46a86ad Merge branch 'master' into develop-superpush-int 3 years ago
HH 15d512f9 Merge branch 'master' into hh/develop/dodgeball 3 years ago
GitHub 338af2ec Move the Critic into the Optimizer (#4939) 3 years ago
HH 4c947151 Merge branch 'main' into hh/develop/dodgeball 3 years ago
Andrew Cohen 4b58527c checkout ppo/optimizer from main 3 years ago
Ervin Teng 61781a1a Merge branch 'main' into develop-agentprocessor-teammanager 3 years ago
GitHub c1d19e89 Fix gpu pytests (#5019) 3 years ago
Arthur Juliani 06c147f8 Merge remote-tracking branch 'origin/main' into goal-conditioning-new 3 years ago
Ervin Teng fd0dd35c Merge branch 'main' into develop-coma2-trainer 3 years ago
Ervin Teng c8137dcd Merge branch 'main' into develop-superpush-int 3 years ago
Andrew Cohen 131fa328 inital evaluate_by_seq, does not run 3 years ago
Andrew Cohen 67beef88 finished evaluate_by_seq, does not run 3 years ago
Andrew Cohen 8f799687 ignoring precommit, grabbing baseline/critic mems from buffer in trainer 3 years ago
GitHub f16ce486 Update v2-staging from main (March 15) (#5123) 3 years ago
Christopher Goy 921ba4f0 Update v2-staging from main (March 15) (#5123) 3 years ago
GitHub ba2af269 [coma2] Make group extrinsic reward part of extrinsic (#5033) 3 years ago
GitHub d24b0966 [bug-fix] Fix memory leak when using LSTMs (#5048) 3 years ago
Christopher Goy ebe45056 Merge branch 'main' into release_14_branch-to-main 3 years ago
Chris Elion 970f1d40 Merge remote-tracking branch 'origin/v2-staging' into MLA-1634-ObservationSpec 3 years ago
Ervin Teng 1f026c70 Merge branch 'main' into develop-superpush-branch-cleanup 3 years ago
Ervin Teng ce872033 Revert "Merge branch 'main' into develop-superpush-branch-cleanup" 3 years ago
GitHub 8f35bdd3 POCA trainer (#5005) 3 years ago
Andrew Cohen 9e77d7e1 Merge branch 'main' into develop-soccer-groupman 3 years ago
Ervin Teng c108da4a [bug-fix] Fix POCA LSTM, pad sequences in the back (#5206) 3 years ago
Ervin Teng d461a66a Fix padding in optimizer value estimate 3 years ago
Ervin Teng 81b74634 Fix additional bugs and POCA 3 years ago
Ervin Teng 9fd4a81e Address comments 3 years ago
GitHub c5589b59 [bug-fix] Fix POCA LSTM, pad sequences in the back (#5206) 3 years ago