Andrew Cohen
f165bfb5
update comment
4 years ago
Andrew Cohen
95f62362
add test
4 years ago
Andrew Cohen
cb13a8ca
add type/docstring to slice
4 years ago
Andrew Cohen
0afe5f24
add slice function to agent action
4 years ago
Ervin Teng
ac0b56bb
Fix pypi issues
4 years ago
GitHub
6ae8ea1e
[coma2] Add support for variable length obs in COMA2 (#5038)
* Make group extrinsic part of extrinsic
* Fix test and init
* Fix tests and bug
* Add baseline loss to TensorBoard
* Add support for variable len obs in COMA2
* Remove weird merge artifact
* Make agent action run
* Fix __getitem__ replace with slice
* Revert "Fix __getitem__ replace with slice"
This reverts commit 87a2c9d9a9342a7d2be4e9f620d1294a5c3bf22c.
* Revert "Make agent action run"
This reverts commit 59531f3746c58d62cf52f58a88e27a3e428e8946.
4 years ago
GitHub
ba2af269
[coma2] Make group extrinsic reward part of extrinsic (#5033)
* Make group extrinsic part of extrinsic
* Fix test and init
* Fix tests and bug
* Add baseline loss to TensorBoard
4 years ago
Andrew Cohen
4c56e6ad
lstm runs with coma
4 years ago
Andrew Cohen
81524ee8
lstm almost runs
4 years ago
Andrew Cohen
8f799687
ignoring precommit, grabbing baseline/critic mems from buffer in trainer
4 years ago
Andrew Cohen
67beef88
finished evaluate_by_seq, does not run
4 years ago
Andrew Cohen
131fa328
initial evaluate_by_seq, does not run
4 years ago
Ervin Teng
fd0dd35c
Merge branch 'main' into develop-coma2-trainer
4 years ago
Andrew Cohen
43955c5b
get value estimate test
4 years ago
GitHub
c9c7e3d0
Faster NaN masking, fix masking for visual obs (#5015)
* Fix get mask from visual obs, large obs perf imp.
* Bug fix
* Fix typo
4 years ago
Andrew Cohen
8562471e
add initial coma optimizer tests
4 years ago
Andrew Cohen
e2d46ca0
Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer
4 years ago
Andrew Cohen
5d517c5e
clean ups
4 years ago
Ervin Teng
bc3d3a95
Fix slicing typing and string printing in AgentBufferField
4 years ago
Andrew Cohen
9060da06
Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer
4 years ago
Andrew Cohen
4b58527c
checkout ppo/optimizer from main
4 years ago
Andrew Cohen
e37c5a98
Merge branch 'master' into develop-coma2-trainer
4 years ago
GitHub
67e945f0
clean ups (#5003)
4 years ago
Ervin Teng
4da2e22e
Fix Team Cumulative Reward
4 years ago
Ervin Teng
4b159789
Add PushBlockCollab config and fix some stuff
4 years ago
Ervin Teng
c6904f86
Group reward function
4 years ago
Ervin Teng
b3958a8d
Buffer fixes
4 years ago
Ervin Teng
a4fcbb63
Right loss function for stability, fix some pypi
4 years ago
Ervin Teng
9bc88c41
Running COMA (not sure if learning)
4 years ago
Ervin Teng
08db7c2f
Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer-mm
4 years ago
Andrew Cohen
98d647de
MultiInputNetBody
4 years ago
Andrew Cohen
418cc778
coma trainer and optimizer
4 years ago
Andrew Cohen
3f7d68b8
fix test policy
4 years ago
Andrew Cohen
00b891df
fix sac shared
4 years ago
Andrew Cohen
d81d0be3
fix agent processor test
4 years ago
Andrew Cohen
66742dc8
test for SharedActorCritic
4 years ago
Andrew Cohen
c74dca9f
add SharedActorCritic
4 years ago
Ervin Teng
24ee4bd5
Merge remote-tracking branch 'origin/develop-critic-optimizer' into develop-critic-optimizer
4 years ago
Andrew Cohen
6828713c
fix saver test
4 years ago
Andrew Cohen
9b92f5fb
remove commented code
4 years ago
Ervin Teng
c675393c
Move value network for SAC to device
4 years ago
Andrew Cohen
8efdeeb0
make critic a property
4 years ago
Ervin Teng
1831044a
Update SAC to use separate policy
4 years ago
Andrew Cohen
543f22bc
fix test_networks
4 years ago
Andrew Cohen
3aec18a1
fix precommit errors
4 years ago
Andrew Cohen
6bd396ee
add critic to optimizer, ppo runs
4 years ago
Andrew Cohen
f73b9dba
update policy to not use critic
4 years ago
Andrew Cohen
eeabb974
Separate Actor/Critic, remove ActorCritics
4 years ago