Ervin Teng
|
151e3b1c
|
Move policy to common location, remove epsilon
|
5 years ago |
Ervin Teng
|
d9fe2f9c
|
Unified policy
|
5 years ago |
Ervin Teng
|
0ef40c08
|
SAC CC working
|
5 years ago |
Ervin Teng
|
1b6e175c
|
Fix discrete SAC and clean up policy
|
5 years ago |
Ervin Teng
|
edeceefd
|
Zeroed version of LSTM working for PPO
|
5 years ago |
Ervin Teng
|
7f53bf8b
|
Cleanup LSTM code
|
5 years ago |
Ervin Teng
|
4871f49c
|
Fix comments for PPO
|
5 years ago |
Ervin Teng
|
cfc2f455
|
Fix BC and tests
|
5 years ago |
Ervin Teng
|
78671383
|
Move initialization call around
|
5 years ago |
Ervin Teng
|
cadf6603
|
Fix SAC CC and some reward signal tests
|
5 years ago |
GitHub
|
dd86e879
|
Separate out optimizer creation and policy graph creation (#3355)
|
5 years ago |
Ervin Teng
|
1f094da9
|
Fix policy's scoping
|
5 years ago |
Ervin Teng
|
cdd57468
|
Re-fix scoping and add method to get all variables
|
5 years ago |
Ervin Teng
|
2eda5575
|
Fix discrete scoping
|
5 years ago |
Ervin Teng
|
1407db53
|
Fix Barracuda export for LSTM
|
5 years ago |
Ervin Teng
|
328476d8
|
Move check for creation into nn_policy
|
5 years ago |
Ervin Teng
|
632ff859
|
add init
|
5 years ago |
Ervin Teng
|
4d94e180
|
Move optimizer to common folder
|
5 years ago |
Ervin Teng
|
d969e013
|
Remove extra tf_optimizer
|
5 years ago |
Ervin Teng
|
7d5c1b0b
|
Add docstring and make some methods private
|
5 years ago |
Ervin Teng
|
00017bab
|
Temporarily remove multi-GPU
|
5 years ago |
Ervin Teng
|
441e6a0c
|
Add typing to optimizer, rename self.tf_optimizer
|
5 years ago |
Ervin Teng
|
ffdc41bb
|
Removed floating constants
|
5 years ago |
Ervin Teng
|
8abd4129
|
Clean up nn_policy
|
5 years ago |
Ervin Teng
|
7c0fa1c4
|
Remove action_holder placeholder
|
5 years ago |
Ervin Teng
|
c9fbb111
|
Fix entropy calculation
|
5 years ago |
Ervin Teng
|
be9d772e
|
Add option to not condition sigma on obs
|
5 years ago |
Ervin Teng
|
0ab7aa58
|
Fix tensor names
|
5 years ago |
Ervin Teng
|
1cfc461a
|
Remove and rename tf_optimizer
|
5 years ago |
Ervin Teng
|
63463bd1
|
Make TF graph seed deterministic
|
5 years ago |
Ervin Teng
|
14f2a7f2
|
Rename LearningModel to ModelUtils
|
5 years ago |
Ervin Teng
|
1156b9b3
|
Merge branch 'develop-splitpolicyoptimizer' into develop-removeactionholder
|
5 years ago |
Ervin Teng
|
d57124b4
|
Merge 'master' into develop-removeactionholder
|
5 years ago |
Ervin Teng
|
d6eb262c
|
Rename resample to reparameterize
|
5 years ago |
Ervin Teng
|
ac583acb
|
Make value estimate method private
|
5 years ago |
Ervin Teng
|
242e2421
|
Move encoder creation to separate function
|
5 years ago |
Ervin Teng
|
53c25fb1
|
Move one-hot out of policy and remove selected_actions
|
5 years ago |
Ervin Teng
|
a73704bc
|
Remove previous action from policy
|
5 years ago |