46 commits (cb13a8ca-46cb-4e4f-ba59-dd52e7d06c0f)

Author SHA1 Message Commit date
Andrew Cohen cb13a8ca add type/docstring to slice 4 years ago
Andrew Cohen 0afe5f24 add slice function to agent action 4 years ago
Ervin Teng ac0b56bb Fix pypi issues 4 years ago
GitHub 6ae8ea1e [coma2] Add support for variable length obs in COMA2 (#5038) 4 years ago
GitHub ba2af269 [coma2] Make group extrinsic reward part of extrinsic (#5033) 4 years ago
Andrew Cohen 4c56e6ad lstm runs with coma 4 years ago
Andrew Cohen 81524ee8 lstm almost runs 4 years ago
Andrew Cohen 8f799687 ignoring precommit, grabbing baseline/critic mems from buffer in trainer 4 years ago
Andrew Cohen 67beef88 finished evaluate_by_seq, does not run 4 years ago
Andrew Cohen 131fa328 inital evaluate_by_seq, does not run 4 years ago
Ervin Teng fd0dd35c Merge branch 'main' into develop-coma2-trainer 4 years ago
Andrew Cohen 43955c5b get value estimate test 4 years ago
GitHub c9c7e3d0 Faster NaN masking, fix masking for visual obs (#5015) 4 years ago
Andrew Cohen 8562471e add inital coma optimizer tests 4 years ago
Andrew Cohen e2d46ca0 Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer 4 years ago
Andrew Cohen 5d517c5e clean ups 4 years ago
Ervin Teng bc3d3a95 Fix slicing typing and string printing in AgentBufferField 4 years ago
Andrew Cohen 9060da06 Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer 4 years ago
Andrew Cohen 4b58527c checkout ppo/optimizer from main 4 years ago
Andrew Cohen e37c5a98 Merge branch 'master' into develop-coma2-trainer 4 years ago
GitHub 67e945f0 clean ups (#5003) 4 years ago
Ervin Teng 4da2e22e Fix Team Cumulative Reward 4 years ago
Ervin Teng 4b159789 Add PushBlockCollab config and fix some stuff 4 years ago
Ervin Teng c6904f86 Group reward function 4 years ago
Ervin Teng b3958a8d Buffer fixes 4 years ago
Ervin Teng a4fcbb63 Right loss function for stability, fix some pypi 4 years ago
Ervin Teng 9bc88c41 Running COMA (not sure if learning) 4 years ago
Ervin Teng 08db7c2f Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer-mm 4 years ago
Andrew Cohen 98d647de MultiInputNetBody 4 years ago
Andrew Cohen 418cc778 coma trainer and optimizer 4 years ago
Andrew Cohen 3f7d68b8 fix test policy 4 years ago
Andrew Cohen 00b891df fix sac shared 4 years ago
Andrew Cohen d81d0be3 fix agent processor test 4 years ago
Andrew Cohen 66742dc8 test for SharedActorCritic 4 years ago
Andrew Cohen c74dca9f add SharedActorCritic 4 years ago
Ervin Teng 24ee4bd5 Merge remote-tracking branch 'origin/develop-critic-optimizer' into develop-critic-optimizer 4 years ago
Andrew Cohen 6828713c fix saver test 4 years ago
Andrew Cohen 9b92f5fb remove commented code 4 years ago
Ervin Teng c675393c Move value network for SAC to device 4 years ago
Andrew Cohen 8efdeeb0 make critic a property 4 years ago
Ervin Teng 1831044a Update SAC to use separate policy 4 years ago
Andrew Cohen 543f22bc fix test_networks 4 years ago
Andrew Cohen 3aec18a1 fix precommit errors 4 years ago
Andrew Cohen 6bd396ee add critic to optimizer, ppo runs 4 years ago
Andrew Cohen f73b9dba update policy to not use critic 4 years ago
Andrew Cohen eeabb974 Separate Actor/Critic, remove ActorCritics 4 years ago