18 commits (86ff4c4a-2257-4271-9eb7-6299f25e3091)

Author SHA1 Message Date
GitHub c145e75b Split Policy and Optimizer, common Policy for PPO and SAC (#3345) 5 years ago
GitHub 97a1d4b1 [change] Remove the action_holder placeholder from the policy. (#3492) 5 years ago
GitHub 7d954797 [change] Separate action outputs into OutputDistributions object (#3514) 5 years ago
GitHub e4177de0 [change] Organize trainer files a bit better (#3538) 5 years ago
Anupam Bhatnagar f4dbedcf removed extraneous logging imports and loggers 5 years ago
GitHub ffd8f855 [bug-fix] Fix crash when demo size is smaller than batch size (#3591) 5 years ago
GitHub 94de596b [change] Remove concatenate in discrete action probabilities to improve inference performance (#3598) 5 years ago
GitHub ec278616 Hotfixes for Release 0.15.1 (#3698) 5 years ago
GitHub 141831da [bug-fix] Fix entropy computation for GaussianDistribution (#3684) 5 years ago
Andrew Cohen 4a3ad193 Add constant decay to beta and epsilon 5 years ago
Christopher Goy ba80b292 format files with pre-commit. 4 years ago
GitHub e92b4f88 [refactor] Structure configuration files into classes (#3936) 5 years ago
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 4 years ago
Ruo-Ping Dong 01e60921 add sac checkpoint 4 years ago
GitHub 129f9ddc [MLA-427] make pyupgrade convert f-strings too (#4244) 5 years ago
GitHub 1b098c9a Refactor TFPolicy and Policy (#4254) 4 years ago
GitHub b853e5ba Action buffer (#4612) 4 years ago
Andrew Cohen 3c65b964 fixed recurrent prev_action issue 4 years ago