20 commits (bcf880aa-ac6e-459c-995b-a548c6b4339f)

Author SHA1 Message Date
Ervin Teng cd74e51b More progress 5 years ago
Ervin Teng 2373cae8 Move methods into common optimizer 5 years ago
Ervin Teng 9ad99eb6 Combined model and policy for PPO 5 years ago
Ervin Teng e912fa47 Simplify creation of optimizer, breaks multi-GPU 5 years ago
Ervin Teng 164732a9 Move optimizer creation to Trainer, fix some of the reward signals 5 years ago
Ervin Teng abc98c23 Change reward signal creation 5 years ago
Ervin Teng 0ef40c08 SAC CC working 5 years ago
Ervin Teng 28f7608f Clean up value head creation 5 years ago
Ervin Teng edeceefd Zeroed version of LSTM working for PPO 5 years ago
Ervin Teng 5ec49542 SAC LSTM isn't broken 5 years ago
Ervin Teng 4871f49c Fix comments for PPO 5 years ago
Ervin Teng cfc2f455 Fix BC and tests 5 years ago
GitHub dd86e879 Separate out optimizer creation and policy graph creation (#3355) 5 years ago
Ervin Teng dcbb90e1 Fix graph init in ghost trainer 5 years ago
Ervin Teng 7a401feb Remove float64 numpy 5 years ago
Ervin Teng 328476d8 Move check for creation into nn_policy 5 years ago
Ervin Teng cbfbff2c Split optimizer and TFOptimizer 5 years ago
Ervin Teng 7d5c1b0b Add docstring and make some methods private 5 years ago
Arthur Juliani ca887743 Support tf and pytorch alongside one another 5 years ago
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 4 years ago