
fix the training doc (#1193)

/develop-generalizationTraining-TrainerController
GitHub · 6 years ago
Current commit
4a881354
2 files changed, 9 insertions(+), 7 deletions(-)
  1. docs/Training-ML-Agents.md (8 changes)
  2. ml-agents/mlagents/trainers/bc/trainer.py (8 changes)

docs/Training-ML-Agents.md (8 changes)


| :-- | :-- | :-- |
| batch_size | The number of experiences in each iteration of gradient descent. | PPO, BC |
| batches_per_epoch | In imitation learning, the number of batches of training examples to collect before training the model. | BC |
- | beta | The strength of entropy regularization. | PPO, BC |
+ | beta | The strength of entropy regularization. | PPO |
- | epsilon | Influences how rapidly the policy can evolve during training. | PPO, BC |
+ | epsilon | Influences how rapidly the policy can evolve during training. | PPO |
| gamma | The reward discount rate for the Generalized Advantage Estimator (GAE). | PPO |
| hidden_units | The number of units in the hidden layers of the neural network. | PPO, BC |
| lambd | The regularization parameter. | PPO |

- | normalize | Whether to automatically normalize observations. | PPO, BC |
- | num_epoch | The number of passes to make through the experience buffer when performing gradient descent optimization. | PPO, BC |
+ | normalize | Whether to automatically normalize observations. | PPO |
+ | num_epoch | The number of passes to make through the experience buffer when performing gradient descent optimization. | PPO |
| num_layers | The number of hidden layers in the neural network. | PPO, BC |
| sequence_length | Defines how long the sequences of experiences must be while training. Only used for training with a recurrent neural network. See [Using Recurrent Neural Networks](Feature-Memory.md). | PPO, BC |
| summary_freq | How often, in steps, to save training statistics. This determines the number of data points shown by TensorBoard. | PPO, BC |
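
The rows above drop BC from beta, epsilon, normalize, and num_epoch, leaving them documented as PPO-only. For orientation, here is a minimal sketch of a BC trainer_parameters mapping whose keys mirror the required-key list in bc/trainer.py below; all values are illustrative assumptions, not defaults from the repository:

# A sketch of a BC trainer configuration, as a Python dict. Keys match
# self.param_keys in bc/trainer.py below; values are assumed for
# illustration only.
trainer_parameters = {
    'brain_to_imitate': 'TeacherBrain',  # assumed brain name
    'batch_size': 64,
    'time_horizon': 64,
    'graph_scope': '',
    'summary_freq': 1000,
    'max_steps': 5.0e4,
    'batches_per_epoch': 10,
    'use_recurrent': False,
    'hidden_units': 128,
    'learning_rate': 3.0e-4,             # newly required by this commit
    'num_layers': 2,
    'sequence_length': 64,
    'memory_size': 256,
}

Note that beta, epsilon, normalize, and num_epoch are deliberately absent: per the table change above, they apply only to PPO.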

ml-agents/mlagents/trainers/bc/trainer.py (8 changes)


"""
super(BehavioralCloningTrainer, self).__init__(sess, brain, trainer_parameters, training, run_id)
self.param_keys = ['brain_to_imitate', 'batch_size', 'time_horizon', 'graph_scope',
'summary_freq', 'max_steps', 'batches_per_epoch', 'use_recurrent', 'hidden_units',
'num_layers', 'sequence_length', 'memory_size']
self.param_keys = ['brain_to_imitate', 'batch_size', 'time_horizon',
'graph_scope', 'summary_freq', 'max_steps',
'batches_per_epoch', 'use_recurrent',
'hidden_units','learning_rate', 'num_layers',
'sequence_length', 'memory_size']
for k in self.param_keys:
if k not in trainer_parameters:
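
The diff context cuts off inside the required-key check, so the body of the if branch is not shown. Below is a minimal, self-contained sketch of the same validation pattern; the UnityTrainerException class and message text here are assumptions for illustration (the repository defines its own exception type):

# Sketch of the required-key validation the loop above performs.
class UnityTrainerException(Exception):
    """Assumed exception type for trainer misconfiguration."""

def validate_params(trainer_parameters, param_keys, brain_name):
    # Fail fast on the first hyperparameter missing from the trainer
    # configuration, naming the affected brain in the error message.
    for k in param_keys:
        if k not in trainer_parameters:
            raise UnityTrainerException(
                "The hyperparameter {0} could not be found for brain {1}."
                .format(k, brain_name))

# Example: 'learning_rate' is now required for BC, so a configuration
# without it is rejected.
try:
    validate_params({'batch_size': 64},
                    ['batch_size', 'learning_rate'],
                    'StudentBrain')
except UnityTrainerException as e:
    print(e)  # -> The hyperparameter learning_rate could not be found ...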
