|
|
|
|
|
|
|
|
|
|
Options: |
|
|
|
--help  Show this message. |
|
|
|
--curriculum=<file>  Curriculum json file for environment [default: None] |
|
|
|
--batch-size=<n>  How many experiences per gradient descent update step [default: 64]. |
|
|
|
--beta=<n>  Strength of entropy regularization [default: 5e-3]. |
|
|
|
--buffer-size=<n>  How large the experience buffer should be before gradient descent [default: 2048]. |
|
|
|
--curriculum=<file>  Curriculum json file for environment [default: None]. |
|
|
|
--epsilon=<n>  Acceptable threshold around ratio of old and new policy probabilities [default: 0.2]. |
|
|
|
--gamma=<n>  Reward discount rate [default: 0.995]. |
|
|
|
--hidden-units=<n>  Number of units in hidden layer [default: 64]. |
|
|
|
--keep-checkpoints=<n>  How many model checkpoints to keep [default: 5]. |
|
|
|
--lambd=<n>  Lambda parameter for GAE [default: 0.95]. |
|
|
|
--learning-rate=<rate>  Model learning rate [default: 3e-4]. |
|
|
|
--load  Whether to load the model or randomly initialize [default: False]. |
|
|
|
--num-epoch=<n>  Number of gradient descent steps per batch of experiences [default: 5]. |
|
|
|
--load  Whether to load the model or randomly initialize [default: False]. |
|
|
|
--train  Whether to train model, or only run inference [default: False]. |
|
|
|
--summary-freq=<n>  Frequency at which to save training statistics [default: 10000]. |
|
|
|
--gamma=<n>  Reward discount rate [default: 0.995]. |
|
|
|
--lambd=<n>  Lambda parameter for GAE [default: 0.95]. |
|
|
|
--summary-freq=<n>  Frequency at which to save training statistics [default: 10000]. |
|
|
|
--beta=<n>  Strength of entropy regularization [default: 1e-3]. |
|
|
|
--num-epoch=<n>  Number of gradient descent steps per batch of experiences [default: 5]. |
|
|
|
--epsilon=<n>  Acceptable threshold around ratio of old and new policy probabilities [default: 0.2]. |
|
|
|
--buffer-size=<n>  How large the experience buffer should be before gradient descent [default: 2048]. |
|
|
|
--learning-rate=<rate>  Model learning rate [default: 3e-4]. |
|
|
|
--hidden-units=<n>  Number of units in hidden layer [default: 64]. |
|
|
|
--batch-size=<n>  How many experiences per gradient descent update step [default: 64]. |
|
|
|
--keep-checkpoints=<n>  How many model checkpoints to keep [default: 5]. |
|
|
|
--worker-id=<n>  Number to add to communication port (5005). Used for asynchronous agent scenarios [default: 0]. |
|
|
|
--train  Whether to train model, or only run inference [default: False]. |
|
|
|
--worker-id=<n>  Number to add to communication port (5005). Used for multi-environment [default: 0]. |
|
|
|
''' |
|
|
|
|
|
|
|
options = docopt(_USAGE) |
|
|
|
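docopt hands back every `[default: ...]` value as a string, so the dictionary returned by `docopt(_USAGE)` generally needs converting before use. In particular, `[default: None]` arrives as the string `'None'`, not Python's `None`. The sketch below shows one possible conversion for a subset of the options listed above. The helper `typed_options`, the result keys, and the hand-written `example_options` dictionary are hypothetical names used for illustration only; they are not part of the original script, and the dictionary merely stands in for what docopt would return when no flags are passed.

```python
COMMUNICATION_PORT = 5005  # base port named in the --worker-id description


def typed_options(options):
    """Convert docopt-style string values into typed Python values (hypothetical helper)."""
    curriculum = options['--curriculum']
    worker_id = int(options['--worker-id'])
    return {
        'batch_size': int(options['--batch-size']),
        'gamma': float(options['--gamma']),
        'learning_rate': float(options['--learning-rate']),
        # "[default: None]" is delivered as the string 'None', not the None object.
        'curriculum_file': None if curriculum == 'None' else curriculum,
        # Flags without an argument are already booleans.
        'load_model': bool(options['--load']),
        'train_model': bool(options['--train']),
        # The worker id is added to the base communication port.
        'port': COMMUNICATION_PORT + worker_id,
    }


# Hand-written stand-in for docopt(_USAGE) with no flags passed (assumption).
example_options = {
    '--batch-size': '64',
    '--curriculum': 'None',
    '--gamma': '0.995',
    '--learning-rate': '3e-4',
    '--load': False,
    '--train': False,
    '--worker-id': '0',
}

params = typed_options(example_options)
print(params)
```

Doing the conversion in one place keeps the string-versus-number distinction out of the training code, and makes the `'None'` special case explicit instead of leaving it as a truthy string.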