75 commits (7004604d-da76-4cfb-b61a-f28f48336694)

Author SHA1 Message Commit date
Ervin Teng 1b6e175c Fix discrete SAC and clean up policy 5 years ago
Ervin Teng 6bbcf2d7 Add typing to value head creator 5 years ago
Ervin Teng b21b3d5c Use resamp policy for SAC 5 years ago
Ervin Teng 28f7608f Clean up value head creation 5 years ago
Ervin Teng db249ceb Merge branch 'master' into develop-splitpolicyoptimizer 5 years ago
Ervin Teng 0ef40c08 SAC CC working 5 years ago
Ervin Teng d9fe2f9c Unified policy 5 years ago
Ervin Teng b61d2fa1 Fix some typing issues with curiosity 5 years ago
Ervin Teng 151e3b1c Move policy to common location, remove epsilon 5 years ago
Ervin Teng abc98c23 Change reward signal creation 5 years ago
Ervin Teng 164732a9 Move optimizer creation to Trainer, fix some of the reward signals 5 years ago
Ervin Teng e912fa47 Simplify creation of optimizer, breaks multi-GPU 5 years ago
Ervin Teng 6baaf980 Remove PPO model 5 years ago
Ervin Teng 3348bcef Commit init file 5 years ago
Ervin Teng 9ad99eb6 Combined model and policy for PPO 5 years ago
Ervin Teng 2b63415e Clean up policy files 5 years ago
Ervin Teng 17dc17e5 Discrete PPO working 5 years ago
Ervin Teng bc04f9dc Working continuous updates 5 years ago
Ervin Teng 76ad64d7 Some more bugfixes 5 years ago
Ervin Teng 2373cae8 Move methods into common optimizer 5 years ago
Ervin Teng cd74e51b More progress 5 years ago
Ervin Teng 91ffde5f More incremental steps to separation 5 years ago
Ervin Teng 6688453b Move some functionality to optimizer-black 5 years ago
Ervin Teng 2c1ef594 Move some functionality to optimizer-black 5 years ago
Ervin Teng 03c750a7 Move some functionality to optimizer 5 years ago