
remove Use Heuristic from docs (#3568)

/bug-failed-api-check
GitHub · 5 years ago
Current commit 35df5d95
5 files changed, with 13 insertions and 11 deletions
  1. docs/Getting-Started-with-Balance-Ball.md (2 changes)
  2. docs/Learning-Environment-Best-Practices.md (5 changes)
  3. docs/Learning-Environment-Create-New.md (4 changes)
  4. docs/Learning-Environment-Design-Agents.md (9 changes)
  5. docs/Learning-Environment-Examples.md (4 changes)

docs/Getting-Started-with-Balance-Ball.md (2 changes)


  negative reward for dropping the ball. An Agent is also marked as done when it
  drops the ball so that it will reset with a new ball for the next simulation
  step.
- * agent.Heuristic() - When the `Use Heuristic` checkbox is checked in the Behavior
+ * agent.Heuristic() - When the `Behavior Type` is set to `Heuristic Only` in the Behavior
  Parameters of the Agent, the Agent will use the `Heuristic()` method to generate
  the actions of the Agent. As such, the `Heuristic()` method returns an array of
  floats. In the case of the Ball 3D Agent, the `Heuristic()` method converts the
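The reward-and-reset behavior described at the top of this hunk can be sketched as a hypothetical C# fragment. `SetReward` and `Done` are the episode-control calls in the ML-Agents C# API of this era (later releases rename `Done` to `EndEpisode`); the drop check itself is an illustrative assumption, not the exact Ball 3D implementation:

```csharp
// Inside the agent's step logic: penalize the drop and mark the agent done,
// so it resets with a new ball on the next simulation step.
if (ball.transform.position.y < transform.position.y)  // hypothetical drop check
{
    SetReward(-1f);
    Done();
}
```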

docs/Learning-Environment-Best-Practices.md (5 changes)


  lessons which progressively increase in difficulty are presented to the agent
  ([learn more here](Training-Curriculum-Learning.md)).
  * When possible, it is often helpful to ensure that you can complete the task by
-   using a heuristic to control the agent. To do so, check the `Use Heuristic`
-   checkbox on the Agent and implement the `Heuristic()` method on the Agent.
+   using a heuristic to control the agent. To do so, set the `Behavior Type`
+   to `Heuristic Only` on the Agent's Behavior Parameters, and implement the
+   `Heuristic()` method on the Agent.
  * It is often helpful to make many copies of the agent, and give them the same
  `Behavior Name`. In this way the learning process can get more feedback
  information from all of these agents, which helps it train faster.

docs/Learning-Environment-Create-New.md (4 changes)


  to the values of the "Horizontal" and "Vertical" input axes (which correspond to
  the keyboard arrow keys).
- In order for the Agent to use the Heuristic, you will need to check the `Use Heuristic`
- checkbox in the `Behavior Parameters` of the RollerAgent.
+ In order for the Agent to use the Heuristic, you will need to set the `Behavior Type`
+ to `Heuristic Only` in the `Behavior Parameters` of the RollerAgent.
  Press **Play** to run the scene and use the arrow keys to move the Agent around
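The heuristic this hunk describes, mapping the "Horizontal" and "Vertical" input axes to the agent's two continuous actions, can be sketched as a hypothetical override. In this era of the ML-Agents C# API, `Heuristic()` returns the action array directly (later releases fill an `actionsOut` buffer instead):

```csharp
public override float[] Heuristic()
{
    var action = new float[2];
    action[0] = Input.GetAxis("Horizontal"); // left/right arrow keys
    action[1] = Input.GetAxis("Vertical");   // up/down arrow keys
    return action;
}
```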

docs/Learning-Environment-Design-Agents.md (9 changes)


  The Policy class abstracts out the decision making logic from the Agent itself so
  that you can use the same Policy in multiple Agents. How a Policy makes its
  decisions depends on the kind of Policy it is. You can change the Policy of an
- Agent by changing its `Behavior Parameters`. If you check `Use Heuristic`, the
- Agent will use its `Heuristic()` method to make decisions which can allow you to
- control the Agent manually or write your own Policy. If the Agent has a `Model`
- file, it Policy will use the neural network `Model` to take decisions.
+ Agent by changing its `Behavior Parameters`. If you set `Behavior Type` to
+ `Heuristic Only`, the Agent will use its `Heuristic()` method to make decisions,
+ which allows you to control the Agent manually or write your own Policy. If
+ the Agent has a `Model` file, its Policy will use the neural network `Model` to
+ make decisions.
  ## Decisions

docs/Learning-Environment-Examples.md (4 changes)


  * Goal: The agents must hit the ball so that the opponent cannot hit a valid
  return.
  * Agents: The environment contains two agents with the same Behavior Parameters.
- After training you can check the `Use Heuristic` checkbox on one of the Agents
- to play against your trained model.
+ After training, you can set the `Behavior Type` to `Heuristic Only` on one of the
+ Agent's Behavior Parameters to play against your trained model.
  * Agent Reward Function (independent):
  * +1.0 to the agent that wins the point. An agent wins a point by preventing
  the opponent from hitting a valid return.
