96 commits (0968daa8-51d2-433d-a6e4-e3dd0f33392a)

Author SHA1 Message Commit date
Arthur Juliani 6879bae4 Initial optimizer port 5 years ago
Arthur Juliani 7c3bd376 Refactoring policy and optimizer 5 years ago
Arthur Juliani 2e51260a Resolving a few bugs 5 years ago
Arthur Juliani 947f0d32 Slightly closer to running model 5 years ago
Arthur Juliani 3c82bf59 Training runs, but doesn’t actually work 5 years ago
Arthur Juliani 8c6f4696 Fix a couple additional bugs 5 years ago
Arthur Juliani 61d671d8 Add conditional sigma for distribution 5 years ago
Arthur Juliani a5b5b109 Mulkti-discrete now working 5 years ago
Arthur Juliani 5f936990 Visual observations now train as well 5 years ago
Arthur Juliani 1736559f Combine actor and critic classes. Initial export. 5 years ago
Arthur Juliani be7e55e1 Use LSTM and fix a few merge errors 5 years ago
Arthur Juliani 3eef9d78 Optimize np -> tensor operations 5 years ago
Ervin Teng 72180f9b Experiment with JIT compiler 5 years ago
Arthur Juliani 9724c9ac Merge master 5 years ago
GitHub cde8bd29 Convert List[np.ndarray] to np.ndarray before using torch.as_tensor (#4183) 5 years ago
GitHub 05a11c96 Develop add fire exp framework (#4213) 5 years ago
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 4 years ago
GitHub 69579611 [refactor] Refactor Actor and Critic classes (#4287) 4 years ago
Andrew Cohen ccb492dc ignore precommit/first bc commit 4 years ago
Andrew Cohen 84ea84a6 bc loss for both continuous and disc 4 years ago
Andrew Cohen f74d301a Merge branch 'develop-add-fire' into develop-add-fire-bc 4 years ago
Andrew Cohen 22a0cabc changed path to torch bc module 4 years ago
GitHub 7ddfd81f Added Reward Providers for Torch (#4280) 4 years ago
Andrew Cohen 598826fe Merge branch 'develop-add-fire' into develop-add-fire-bc 4 years ago
GitHub 6b255790 Behavioral Cloning Pytorch (#4293) 4 years ago
GitHub f374f87a [add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324) 4 years ago
Ervin Teng f4da3592 Add memories and sequence length to critic_pass 4 years ago
Ervin Teng fa0d3cb6 Fix next_obs in get_trajectory_value_estimates 4 years ago
vincentpierre 108fac9a Replace torch.detach().cpu().numpy() with a utils method 4 years ago
GitHub 4e93cb6e [torch] Restructure PyTorch encoders (#4421) 4 years ago
GitHub 6f534366 Add torch_utils class, auto-detect CUDA availability (#4403) 4 years ago
Ervin Teng 3e771cbb Permute visual obs outside of network 4 years ago
Ervin Teng 77c810fb Fix SAC and make utility method 4 years ago
vincentpierre d3d4eb90 Trainer with attention 4 years ago
Ervin Teng 95bdbba3 Less broken PPO 4 years ago
Ervin Teng 5a5bd515 Fix multiple obs 4 years ago
vincentpierre 735fcd52 [WIP] Refactor trainers to use list of obs rather than vec and vis obs 4 years ago
Ervin Teng cb4b7ed3 Some minor tweaks but still broken 4 years ago
Ervin Teng 56dcd75a Get next critic observations into value estimate 4 years ago
GitHub cc6b4564 Multi Directional Walker and Initial Hypernetwork (#4740) 4 years ago
GitHub 22658a40 use sensor types to differentiate obs (#4749) 4 years ago
vincentpierre 44ed3258 Merging master 4 years ago
vincentpierre 449712b0 renaming sensor_spec to sensor_specS 4 years ago
Ervin Teng 330fc1d0 Merge branch 'master' into develop-centralizedcritic-mm 4 years ago
Ervin Teng ad439fb6 Additional changes 4 years ago
Ervin Teng d02a1033 Some more fixes 4 years ago
GitHub 7387a77f remove pylint (#4836) 4 years ago
Arthur Juliani 0b4b0992 Rename more files 4 years ago
Ervin Teng aba633b2 Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm 4 years ago
Arthur Juliani 0a876b9c Fix typos 4 years ago
Ervin Teng 9c3da1b6 New buffer layout, TeamObsUtil, pad dead agents 4 years ago
GitHub 67ad9651 Merge pull request #4825 from Unity-Technologies/sensor-types 4 years ago
Ervin Teng 6b8b3db3 Try subtract marginalized value 4 years ago
Ervin Teng 092ea232 Some more progress - still broken 4 years ago
Ervin Teng 457b2630 I think it's running 4 years ago
Andrew Cohen 6e1826f8 might be right 4 years ago
Andrew Cohen 1511588d forcing this to work 4 years ago
Andrew Cohen e1fad8a4 buffer error 4 years ago
Andrew Cohen feb38012 add lambda return and target network 4 years ago
Andrew Cohen a4c336c2 value estimator 4 years ago
Andrew Cohen fce842aa adding zombie to coma2 brnch 4 years ago
Andrew Cohen 7f491ae7 cloud run with coma2 of held out zombie test env 4 years ago
Andrew Cohen 9af22d30 use only value funcs 4 years ago
Andrew Cohen 95253b47 ntegrate teammate dones 4 years ago
Andrew Cohen 687f411b try again on cloud 4 years ago
Andrew Cohen f9ff3fef shared baseline and v 4 years ago
Ervin Teng 3283b6a1 Remove Q-net for perf 4 years ago
Ervin Teng b6f88d6d Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager 4 years ago
Andrew Cohen 6bd396ee add critic to optimizer, ppo runs 4 years ago
Andrew Cohen 3aec18a1 fix precommit errors 4 years ago
Andrew Cohen 8efdeeb0 make critic a property 4 years ago
Ervin Teng 514873bf Use correct memories (t-1 instead of t) for training 4 years ago
Ervin Teng 219e773b Merge branch 'develop-fix-lstms' into develop-critic-op-lstm 4 years ago
Ervin Teng ae7643b8 Proper critic memories for PPO 4 years ago
Ervin Teng 2b0dd850 Still somewhat broken but cleaner 4 years ago
Ervin Teng 64839237 Fix indexing issue 4 years ago
Ervin Teng 21e9785a Fix padding issues 4 years ago
Ervin Teng 8d834f0b Fix more indexing bugs 4 years ago
Ervin Teng 4fc0f93e Code cleanup 4 years ago
Ervin Teng 6a573ebf Code cleanup 4 years ago
Ervin Teng f3cec983 Append the right memories 4 years ago
Ervin Teng a9666a0b Don't pad when not needed 4 years ago
Ervin Teng c2883f5b Pad from back of trajectory 4 years ago
Ervin Teng e46a86ad Merge branch 'master' into develop-superpush-int 4 years ago
GitHub 338af2ec Move the Critic into the Optimizer (#4939) 4 years ago
GitHub c1d19e89 Fix gpu pytests (#5019) 4 years ago
Andrew Cohen 131fa328 inital evaluate_by_seq, does not run 4 years ago
Andrew Cohen 67beef88 finished evaluate_by_seq, does not run 4 years ago
Andrew Cohen 8f799687 ignoring precommit, grabbing baseline/critic mems from buffer in trainer 4 years ago
GitHub f16ce486 Update v2-staging from main (March 15) (#5123) 4 years ago
GitHub ba2af269 [coma2] Make group extrinsic reward part of extrinsic (#5033) 4 years ago
GitHub d24b0966 [bug-fix] Fix memory leak when using LSTMs (#5048) 4 years ago
Ervin Teng c108da4a [bug-fix] Fix POCA LSTM, pad sequences in the back (#5206) 4 years ago
Ervin Teng d461a66a Fix padding in optimizer value estimate 4 years ago
Ervin Teng 81b74634 Fix additional bugs and POCA 4 years ago
Ervin Teng 9fd4a81e Address comments 4 years ago