
[Documentation] SetReward method (#1996)

Added a paragraph in the docs/Learning-Environment-Design-Agents.md document regarding the use of SetReward and how it is different from AddReward
Branch: /develop-generalizationTraining-TrainerController
GitHub · 5 years ago
Commit 0d6a24c5
1 changed file with 7 insertions and 3 deletions
docs/Learning-Environment-Design-Agents.md (7 insertions, 3 deletions)

 Brain to control the Agent while watching how it accumulates rewards.
 Allocate rewards to an Agent by calling the `AddReward()` method in the
-`AgentAction()` function. The reward assigned in any step should be in the range
-[-1,1]. Values outside this range can lead to unstable training. The `reward`
-value is reset to zero at every step.
+`AgentAction()` function. The reward assigned between each decision
+should be in the range [-1,1]. Values outside this range can lead to
+unstable training. The `reward` value is reset to zero when the agent receives a
+new decision. If there are multiple calls to `AddReward()` for a single agent
+decision, the rewards will be summed together to evaluate how good the previous
+decision was. There is a method called `SetReward()` that will override all
+previous rewards given to an agent since the previous decision.
 ### Examples
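
The added paragraph describes the behavior in prose; a minimal C# sketch may make the distinction concrete. This example is not part of the commit: the class name `ExampleRewardAgent`, the `target` field, and the thresholds are hypothetical, and the `AgentAction(float[] vectorAction, string textAction)` override assumes the ML-Agents API of that era (the signature differs in later releases).

```csharp
using UnityEngine;
using MLAgents;

// Hypothetical agent showing how AddReward() accumulates within a single
// decision while SetReward() replaces everything accumulated so far.
public class ExampleRewardAgent : Agent
{
    public Transform target;              // assumed reference set in the Inspector
    const float ReachedDistance = 1.5f;   // illustrative success threshold

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Small time penalty; successive AddReward() calls before the next
        // decision are summed together.
        AddReward(-0.001f);

        if (Vector3.Distance(transform.position, target.position) < ReachedDistance)
        {
            // Success bonus is added on top of the step penalty above.
            AddReward(1.0f);
            Done();
        }
        else if (transform.position.y < 0f)
        {
            // SetReward() overrides any reward given since the previous
            // decision, so the episode ends with exactly -1.
            SetReward(-1.0f);
            Done();
        }
    }
}
```

In line with the documented behavior, `AddReward()` suits shaping signals that should stack between decisions, while `SetReward()` suits terminal outcomes where the final value should not depend on rewards already accumulated since the previous decision.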
