38 commits (c68b5643-de7d-40c0-992f-b1dc6214a86d)

Author SHA1 Message Commit date
Ervin Teng 151e3b1c Move policy to common location, remove epsilon 5 years ago
Ervin Teng d9fe2f9c Unified policy 5 years ago
Ervin Teng 0ef40c08 SAC CC working 5 years ago
Ervin Teng 1b6e175c Fix discrete SAC and clean up policy 5 years ago
Ervin Teng edeceefd Zeroed version of LSTM working for PPO 5 years ago
Ervin Teng 7f53bf8b Cleanup LSTM code 5 years ago
Ervin Teng 4871f49c Fix comments for PPO 5 years ago
Ervin Teng cfc2f455 Fix BC and tests 5 years ago
Ervin Teng 78671383 Move initialization call around 5 years ago
Ervin Teng cadf6603 Fix SAC CC and some reward signal tests 5 years ago
GitHub dd86e879 Separate out optimizer creation and policy graph creation (#3355) 5 years ago
Ervin Teng 1f094da9 Fix policy's scoping 5 years ago
Ervin Teng cdd57468 Re-fix scoping and add method to get all variables 5 years ago
Ervin Teng 2eda5575 Fix discrete scoping 5 years ago
Ervin Teng 1407db53 Fix Barracuda export for LSTM 5 years ago
Ervin Teng 328476d8 Move check for creation into nn_policy 5 years ago
Ervin Teng 632ff859 add init 5 years ago
Ervin Teng 4d94e180 Move optimizer to common folder 5 years ago
Ervin Teng d969e013 Remove extra tf_optimizer 5 years ago
Ervin Teng 7d5c1b0b Add docstring and make some methods private 5 years ago
Ervin Teng 00017bab Temporarily remove multi-GPU 5 years ago
Ervin Teng 441e6a0c Add typing to optimizer, rename self.tf_optimizer 5 years ago
Ervin Teng ffdc41bb Removed floating constants 5 years ago
Ervin Teng 8abd4129 Clean up nn_policy 5 years ago
Ervin Teng 7c0fa1c4 Remove action_holder placeholder 5 years ago
Ervin Teng c9fbb111 Fix entropy calculation 5 years ago
Ervin Teng be9d772e Add option to not condition sigma on obs 5 years ago
Ervin Teng 0ab7aa58 Fix tensor names 5 years ago
Ervin Teng 1cfc461a Remove and rename tf_optimizer 5 years ago
Ervin Teng 63463bd1 Make TF graph seed deterministic 5 years ago
Ervin Teng 14f2a7f2 Rename LearningModel to ModelUtils 5 years ago
Ervin Teng 1156b9b3 Merge branch 'develop-splitpolicyoptimizer' into develop-removeactionholder 5 years ago
Ervin Teng d57124b4 Merge 'master' into develop-removeactionholder 5 years ago
Ervin Teng d6eb262c Rename resample to reparameterize 5 years ago
Ervin Teng ac583acb Make value estimate method private 5 years ago
Ervin Teng 242e2421 Move encoder creation to separate function 5 years ago
Ervin Teng 53c25fb1 Move one-hot out of policy and remove selected_actions 5 years ago
Ervin Teng a73704bc Remove previous action from policy 5 years ago