Options:
  --help                 Show this message.
  --batch-size=<n>       How many experiences per gradient descent update step [default: 64].
  --beta=<n>             Strength of entropy regularization [default: 5e-3].
  --beta=<n>             Strength of entropy regularization [default: 2e-3].
  --buffer-size=<n>      How large the experience buffer should be before gradient descent [default: 2048].
  --curriculum=<file>    Curriculum json file for environment [default: None].
  --epsilon=<n>          Acceptable threshold around ratio of old and new policy probabilities [default: 0.2].