Ervin Teng | cbfbff2c | Split optimizer and TFOptimizer | 5 years ago
Ervin Teng | 632ff859 | add init | 5 years ago
Ervin Teng | 31c844e2 | Change memory size definition in docs | 5 years ago
Ervin Teng | ce110201 | Add optional burn-in for SAC as well | 5 years ago
Ervin Teng | f64bdc4b | Fix SAC RNN test | 5 years ago
Ervin Teng | 328476d8 | Move check for creation into nn_policy | 5 years ago
Ervin Teng | 14720e2d | Remove burn-in | 5 years ago
Ervin Teng | 1407db53 | Fix Barracuda export for LSTM | 5 years ago
Ervin Teng | d4ee7346 | Merge commit 'f9c05a61d574305497789b5997f1ae3ea1b1ad3b' into develop-splitpolicyoptimizer | 5 years ago
Ervin Teng | 5f00782b | Clean up some SAC LSTM | 5 years ago
Ervin Teng | 7a401feb | Remove float64 numpy | 5 years ago
Ervin Teng | dcbb90e1 | Fix graph init in ghost trainer | 5 years ago
Ervin Teng | cb2d2526 | Reformat using black | 5 years ago
Ervin Teng | 48b39b80 | Fix ghost trainer and all tests | 5 years ago
Ervin Teng | 1c4f60d4 | remove more PPO tests | 5 years ago
Ervin Teng | d02bfbd4 | Remove PPO policy tests | 5 years ago
Ervin Teng | dc43b0c6 | Add test for NN policy | 5 years ago
Ervin Teng | 2eda5575 | Fix discrete scoping | 5 years ago
Ervin Teng | cdd57468 | Re-fix scoping and add method to get all variables | 5 years ago
Ervin Teng | 1f094da9 | Fix policy's scoping | 5 years ago
GitHub | dd86e879 | Separate out optimizer creation and policy graph creation (#3355) | 5 years ago
Ervin Teng | 85249afc | Fix SAC scoping | 5 years ago
Ervin Teng | aec5fcc0 | Fix policy tests | 5 years ago
Ervin Teng | cadf6603 | Fix SAC CC and some reward signal tests | 5 years ago
Ervin Teng | 78671383 | Move initialization call around | 5 years ago
Ervin Teng | a6e28cf4 | Fix for visual obs | 5 years ago
Ervin Teng | cfc2f455 | Fix BC and tests | 5 years ago
Ervin Teng | 4871f49c | Fix comments for PPO | 5 years ago
Ervin Teng | 7d616651 | Add burn-in for memory PPO | 5 years ago
Ervin Teng | 08cb91de | Remove __init__ for LearningModel static class | 5 years ago
Ervin Teng | ab9b082a | Fix Hallway summary freq | 5 years ago
Ervin Teng | 9b0b2fed | Reduce memory sizes | 5 years ago
Ervin Teng | 5ec49542 | SAC LSTM isn't broken | 5 years ago
Ervin Teng | 7f53bf8b | Cleanup LSTM code | 5 years ago
Ervin Teng | 9b7499a0 | Revert learn.py | 5 years ago
Ervin Teng | edeceefd | Zeroed version of LSTM working for PPO | 5 years ago
Ervin Teng | 8e300036 | Add some typing to optimizer | 5 years ago
Ervin Teng | a5caf4d6 | Remove epsilon from everywhere | 5 years ago
Ervin Teng | 1b6e175c | Fix discrete SAC and clean up policy | 5 years ago
Ervin Teng | 6bbcf2d7 | Add typing to value head creator | 5 years ago
Ervin Teng | b21b3d5c | Use resamp policy for SAC | 5 years ago
Ervin Teng | 28f7608f | Clean up value head creation | 5 years ago
Ervin Teng | db249ceb | Merge branch 'master' into develop-splitpolicyoptimizer | 5 years ago
Ervin Teng | 0ef40c08 | SAC CC working | 5 years ago
Ervin Teng | d9fe2f9c | Unified policy | 5 years ago
Ervin Teng | b61d2fa1 | Fix some typing issues with curiosity | 5 years ago
Ervin Teng | 151e3b1c | Move policy to common location, remove epsilon | 5 years ago
Ervin Teng | abc98c23 | Change reward signal creation | 5 years ago
Ervin Teng | 164732a9 | Move optimizer creation to Trainer, fix some of the reward signals | 5 years ago
Ervin Teng | e912fa47 | Simplify creation of optimizer, breaks multi-GPU | 5 years ago