Andrew Cohen
418cc778
coma trainer and optimizer
4 years ago
Andrew Cohen
98d647de
MultiInputNetBody
4 years ago
Ervin Teng
9bc88c41
Running COMA (not sure if learning)
4 years ago
Ervin Teng
a4fcbb63
Right loss function for stability, fix some pypi
4 years ago
Ervin Teng
b3958a8d
Buffer fixes
4 years ago
Ervin Teng
c6904f86
Group reward function
4 years ago
Ervin Teng
4b159789
Add PushBlockCollab config and fix some stuff
4 years ago
Ervin Teng
4da2e22e
Fix Team Cumulative Reward
4 years ago
GitHub
67e945f0
clean ups (#5003)
4 years ago
Andrew Cohen
5d517c5e
clean ups
4 years ago
Andrew Cohen
131fa328
initial evaluate_by_seq, does not run
4 years ago
Andrew Cohen
67beef88
finished evaluate_by_seq, does not run
4 years ago
Andrew Cohen
8f799687
ignoring precommit, grabbing baseline/critic mems from buffer in trainer
4 years ago
Andrew Cohen
81524ee8
lstm almost runs
4 years ago
Andrew Cohen
4c56e6ad
lstm runs with coma
4 years ago
GitHub
ba2af269
[coma2] Make group extrinsic reward part of extrinsic (#5033)
* Make group extrinsic part of extrinsic
* Fix test and init
* Fix tests and bug
* Add baseline loss to TensorBoard
4 years ago
GitHub
6ae8ea1e
[coma2] Add support for variable length obs in COMA2 (#5038)
* Make group extrinsic part of extrinsic
* Fix test and init
* Fix tests and bug
* Add baseline loss to TensorBoard
* Add support for variable len obs in COMA2
* Remove weird merge artifact
* Make agent action run
* Fix __getitem__ replace with slice
* Revert "Fix __getitem__ replace with slice"
This reverts commit 87a2c9d9a9342a7d2be4e9f620d1294a5c3bf22c.
* Revert "Make agent action run"
This reverts commit 59531f3746c58d62cf52f58a88e27a3e428e8946.
4 years ago
Ervin Teng
ac0b56bb
Fix pypi issues
4 years ago
Andrew Cohen
0afe5f24
add slice function to agent action
4 years ago
GitHub
d2635e58
Action slice (#5047)
* add slice function to agent action
* add type/docstring to slice
* add test
4 years ago
Andrew Cohen
21d7ab85
add torch no_grad to coma LSTM value computation
4 years ago
Ervin Teng
252c1f36
Fix warning message format
4 years ago
Ervin Teng
58122103
Fix warning message formatting again
4 years ago
Ervin Teng
a9fb37aa
Fix reporting of group rewards, CLI print of group
4 years ago
Ervin Teng
0207f95e
Don't delete when agents don't die
4 years ago