96 commits (c3496649-d1ec-4d1a-9bd6-110b15248589)

Author SHA1 Message Commit date
GitHub 45154f52 Pytorch port of SAC (#4219) 5 years ago
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 5 years ago
GitHub 74c99ec8 [refactor] Refactor normalizers and encoders (#4275) 5 years ago
Ruo-Ping Dong 01e60921 add sac checkpoint 5 years ago
GitHub 3a982317 [add-fire] Add learning rate and beta/epsilon decay to PyTorch (#4318) 5 years ago
GitHub 7ddfd81f Added Reward Providers for Torch (#4280) 5 years ago
Ervin Teng 37f986c8 Running LSTM for SAC 5 years ago
Ervin Teng 8ead82e2 Use correct half of memories 5 years ago
Ruo-Ping Dong 71fe4df6 fix formatting and test 5 years ago
Ruo-Ping Dong d3eb6c46 Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 5 years ago
GitHub f374f87a [add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324) 5 years ago
Ervin Teng eeae6d97 Proper initialization and SAC masking 5 years ago
Ruo-Ping Dong 59cc1a9f Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 5 years ago
GitHub 6de31a03 [add-fire] Fix masked mean for 2d tensors (#4364) 5 years ago
vincentpierre 9f51ab14 Saving the reward providers 4 years ago
vincentpierre 108fac9a Replace torch.detach().cpu().numpy() with a utils method 4 years ago
vincentpierre 31750e97 Using item() in place of to_numpy() 4 years ago
vincentpierre fdd343b2 more use of item() and additional tests 4 years ago
GitHub 498934f9 Replace torch.detach().cpu().numpy() with a utils method (#4406) 4 years ago
GitHub 4e93cb6e [torch] Restructure PyTorch encoders (#4421) 4 years ago
GitHub beb5eb30 [bug-fix] Fixes for Torch SAC and tests (#4408) 4 years ago
GitHub 6f534366 Add torch_utils class, auto-detect CUDA availability (#4403) 4 years ago
Ervin Teng 916eec4b Run backwards() of losses in threads 4 years ago
Ervin Teng 9b797d61 Thread inference and not backprop 4 years ago
Ervin Teng a305a41b Try futures in Optimizer 4 years ago
Ervin Teng 228ea059 Try futures in Optimizer 4 years ago
Ervin Teng 3e771cbb Permute visual obs outside of network 4 years ago
Ervin Teng 77c810fb Fix SAC and make utility method 4 years ago
Ervin Teng 9088c07a Optimized SAC soft update 4 years ago
Ervin Teng 7754ad7b Don't run value during inference 4 years ago
Ervin Teng 5495b2b6 Works with continuous 4 years ago
Ervin Teng 52efe509 Discrete and entrop coeff 4 years ago
Ervin Teng d67b9f95 Remove comment 4 years ago
GitHub 1f179527 Do not keep gradients on the q for the v backup (#4504) 4 years ago
vincentpierre 181bdec0 - 4 years ago
GitHub 4e4ad7b0 Don't run value during policy evaluate, optimized soft update function (#4501) 4 years ago
Ervin Teng f9ff3efe Merge branch 'develop-policyonly' into develop-sac-targetq 4 years ago
GitHub 05fc088d [refactor] Don't compute grad for q2_p in SAC Optimizer (#4509) 4 years ago
Ervin Teng 8dec4771 Add hybrid actions to SAC 4 years ago
GitHub dde34423 [bug-fix] Use proper masking for entropy and policy losses (#4572) 4 years ago
Andrew Cohen 8013e544 ignoring Instance of 'AbstractContextManager' has no 'enter_context' member (no-member) 4 years ago
GitHub cb8e4d25 Add ActionSpec (#4586) 4 years ago
Andrew Cohen 9689cf2c remove *_action_* from function names 4 years ago
Andrew Cohen dc89318d remove ActionType 4 years ago
GitHub b853e5ba Action buffer (#4612) 4 years ago
Ervin Teng 2fc23737 Manchausen RL 4 years ago
GitHub 3c96a3a2 Action Model (#4580) 4 years ago
GitHub 88d3ec3e Merge master into hybrid actions staging branch (#4704) 4 years ago
GitHub 8175d558 [bug-fix] Fix BC module + action clipping (#4667) 4 years ago
Ervin Teng 6c77ac7a Update SAC, fix PPO batching 4 years ago
Andrew Cohen 056630d7 sac continuous and discrete train 4 years ago
GitHub 990f801a Develop hybrid action staging (#4702) 4 years ago
vincentpierre 735fcd52 [WIP] Refactor trainers to use list of obs rather than vec and vis obs 4 years ago
Andrew Cohen 85e4db33 bc tests pass 4 years ago
vincentpierre c1587bce Solving merge conflicts 4 years ago
Ervin Teng 25dfd883 Merge branch 'master' into develop-centralizedcritic 4 years ago
vincentpierre 8cb050ef WIP Made initial changes to enale dimension properties and added attention module 4 years ago
vincentpierre 719c969c addressing comments. ObservationSpec is no longer a list 4 years ago
vincentpierre 4bba4e8e Renaming ObservationSpec to SensorSpec 4 years ago
vincentpierre 44ed3258 Merging master 4 years ago
vincentpierre 449712b0 renaming sensor_spec to sensor_specS 4 years ago
Andrew Cohen 17496265 move AgentAction, ActionLogProbs, and ActionFlattener to separate files 4 years ago
GitHub 7387a77f remove pylint (#4836) 4 years ago
Andrew Cohen 1bc2ff96 add weight decay to trainers 4 years ago
Arthur Juliani 0b4b0992 Rename more files 4 years ago
Arthur Juliani 0a876b9c Fix typos 4 years ago
Andrew Cohen ff324d0c fixed sac recurrent tf simple rl 4 years ago
GitHub 12e1fc28 [feature] Hybrid SAC (#4574) 4 years ago
GitHub 67ad9651 Merge pull request #4825 from Unity-Technologies/sensor-types 4 years ago
vincentpierre 115e944b adding weight decay for experimentation 4 years ago
vincentpierre 9fbc2e0e _ 4 years ago
vincentpierre bf16bad6 _ 4 years ago
GitHub 64fc7f43 Buffer key enums (#4907) 4 years ago
Ervin Teng 1831044a Update SAC to use separate policy 4 years ago
Andrew Cohen 8efdeeb0 make critic a property 4 years ago
Ervin Teng c675393c Move value network for SAC to device 4 years ago
Andrew Cohen c74dca9f add SharedActorCritic 4 years ago
Andrew Cohen 00b891df fix sac shared 4 years ago
Ervin Teng fd3f05b9 Enable GAIL to decay 4 years ago
Ervin Teng bb452ffd Fix SAC 4 years ago
GitHub 338af2ec Move the Critic into the Optimizer (#4939) 4 years ago
GitHub f16ce486 Update v2-staging from main (March 15) (#5123) 4 years ago
Ervin Teng 9e2e2626 [bug-fix] Use correct memories for LSTM SAC (#5228) 4 years ago
vincentpierre bab3ecb7 First version of MEDE, crawler does not seem to work properly, I suspect the actions make it distinguishable to the discriminator but not to the human eye 4 years ago
Andrew Cohen d813bfd5 continuous, crawler integrated, new cube 4 years ago
vincentpierre 8da21669 Adding some changes 4 years ago
Andrew Cohen 3e642140 use discrete div 4 years ago
Andrew Cohen bcee3bf5 no entropy loss 4 years ago
vincentpierre 7c74c967 _ 4 years ago
vincentpierre b4f30613 Adding a variational version 4 years ago
vincentpierre 8450b154 - 4 years ago
vincentpierre 5985959d Got 2 modes on Wlker I think 4 years ago
vincentpierre 4bde393e Got the walker to walk different based on diversity setting 4 years ago
GitHub fc6e8c35 [🐛🔨 ] Fix sac target for continuous actions (#5372) 4 years ago
vincentpierre 8cdbc17f modifying SAC to make \alpha converge faster 4 years ago
vincentpierre 983982ee Removing misleading learning rate 4 years ago