浏览代码

added a sentence in the machine learning background doc (#4043)

/MLA-1734-demo-provider
GitHub 5 年前
当前提交
009dbaf5
共有 1 个文件被更改,包括 1 次插入1 次删除
  1. 2
      docs/Background-Machine-Learning.md

2
docs/Background-Machine-Learning.md


water hose and whether the hose is on or off).
The last remaining piece of the reinforcement learning task is the **reward
signal**. When training a robot to be a mean firefighting machine, we provide it
signal**. The robot is trained to learn a policy that maximizes its overall rewards. When training a robot to be a mean firefighting machine, we provide it
with rewards (positive and negative) indicating how well it is doing on
completing the task. Note that the robot does not _know_ how to put out fires
before it is trained. It learns the objective because it receives a large

正在加载...
取消
保存